Re: [zfs-discuss] USB WD Passport 500GB zfs mirror bug

2009-09-14 Thread Louis-Frédéric Feuillette
On Sun, 2009-09-13 at 11:01 -0700, Stefan Parvu wrote:
 5. Disconnecting the other disk. Problems occur:
 # zpool status zones
   pool: zones
  state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
         attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
         using 'zpool clear' or replace the device with 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-9P
  scrub: resilver completed after 0h0m with 0 errors on Sun Sep 13 20:58:02 2009
 config:
 
         NAME          STATE     READ WRITE CKSUM
         zones         ONLINE       0     0     0
           mirror      ONLINE       0     0     0
             c7t0d0p0  ONLINE       0   167     0  294K resilvered
             c7t0d0p0  ONLINE       0     0     0  208K resilvered
 
 errors: No known data errors
 
 
 # zpool status zones
   pool: zones
  state: DEGRADED
 status: One or more devices could not be used because the label is missing or
         invalid.  Sufficient replicas exist for the pool to continue
         functioning in a degraded state.
 action: Replace the device using 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-4J
  scrub: resilver completed after 0h0m with 0 errors on Sun Sep 13 20:58:02 2009
 config:
 
         NAME          STATE     READ WRITE CKSUM
         zones         DEGRADED     0     0     0
           mirror      DEGRADED     0     0     0
             c7t0d0p0  ONLINE       0   167     0  294K resilvered
             c7t0d0p0  FAULTED      0   113     0  corrupted data
 
 errors: No known data errors
 
 
 I have disconnected c8t0d0p0, but zfs reports that c7t0d0p0 is
 faulty!?

Both disks read c7t0d0p0, rather than the c7t0d0p0 and c8t0d0p0 you
listed in steps 1-4.  Is that a typo?

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs compression algorithm : jpeg ??

2009-09-04 Thread Louis-Frédéric Feuillette
On Fri, 2009-09-04 at 13:41 -0700, Richard Elling wrote:
 On Sep 4, 2009, at 12:23 PM, Len Zaifman wrote:
 
  We have groups generating terabytes a day of image data  from lab  
  instruments and saving them to an X4500.
 
 Wouldn't it be easier to compress at the application, or between the
 application and the archiving file system?

Preamble:  I am actively doing research into image set compression,
specifically jpeg2000, so this is my point of reference.


I think it would be easier to compress at the application level. I would
suggest taking the image from the source, running lossless jpeg2000
compression on it, and saving the result to an uncompressed ZFS pool.

JPEG2000 uses arithmetic encoding for its final compression step.
Arithmetic encoding generally achieves a higher compression ratio than
gzip-9, lzjb, or the others.  There is an open-source implementation of
jpeg2000 called Jasper [1].  Jasper is the reference implementation for
jpeg2000, meaning (more or less) that other jpeg2000 programs are
expected to verify their output against Jasper's.

Saving the jpeg2000 image to an uncompressed ZFS filesystem will be the
fastest option.  Since jpeg2000 data is already compressed, trying to
compress it again will not yield any reduction in storage space; in fact,
it may _increase_ the amount of data stored on disk.  Good compression
algorithms produce output that is statistically close to random data, so
you can see why running on a compressed pool would be bad for
performance: the pool burns CPU for no space savings.

[1] http://www.ece.uvic.ca/~mdadams/jasper
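
To make the "may _increase_" point concrete, here is a minimal Python
sketch (my own illustration, nothing ZFS-specific): recompressing
high-entropy data gains nothing, and the container overhead makes the
output slightly larger than the input.

  import os
  import zlib

  # 1 MiB of random bytes stands in for already-compressed data.
  data = os.urandom(1 << 20)
  recompressed = zlib.compress(data, 9)

  # Recompression gains nothing; zlib's header and block framing
  # make the output slightly larger than the input.
  print(len(data), "->", len(recompressed))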

On a side note, if you want to know how arithmetic encoding works,
Wikipedia [2] has a really nice explanation.  Suffice it to say that, in
theory (without considering implementation details), arithmetic encoding
can encode _any_ data in data_entropy * num_of_symbols bits, plus the
size of the symbol table.  In practice this bound isn't reached, due to
floating-point overflow and some other issues.

[2] http://en.wikipedia.org/wiki/Arithmetic_coding
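
To get a feel for that bound, here is a small Python sketch (my own
illustration) that computes data_entropy * num_of_symbols for a byte
string, i.e. the theoretical minimum size ignoring the symbol table:

  import math
  from collections import Counter

  # Shannon entropy (bits/symbol) times the number of symbols: the
  # theoretical minimum size, ignoring the symbol-table overhead.
  def entropy_bound_bits(data: bytes) -> float:
      n = len(data)
      counts = Counter(data)
      entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
      return entropy * n

  sample = b"aaaaaaaabbbc"  # a skewed distribution compresses well
  print(entropy_bound_bits(sample) / 8, "bytes vs", len(sample), "raw")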

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Books on File Systems and File System Programming

2009-08-14 Thread Louis-Frédéric Feuillette
On Fri, 2009-08-14 at 12:34 +0200, Joerg Schilling wrote:
 Louis-Frédéric Feuillette jeb...@gmail.com wrote:
 
  I saw this question on another mailing list, and I too would like to
  know. And I have a couple questions of my own.
 
  == Paraphrased from other list ==
  Does anyone have any recommendations for books on File Systems and/or
  File Systems Programming?
  == end ==
 
 Are you interested in how to write a filesystem or in how to write the 
 filesystem/kernel interface part?

I am primarily interested in the theory of how to write a filesystem.
The kernel interface comes later, when I dive into OS-specific details.

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Books on File Systems and File System Programming

2009-08-14 Thread Louis-Frédéric Feuillette
I did see this, thanks.

On Fri, 2009-08-14 at 10:51 -0400, Christine Tran wrote:
 
 
 2009/8/14 Louis-Frédéric Feuillette jeb...@gmail.com
 
 
 I am primarily interested in the theory of how to write a
 filesystem.
 The kernel interface comes later when I dive into a OS
 specific details.
 
 Have you seen this?  
 
 http://www.letterp.com/~dbg/practical-file-system-design.pdf
 
 I found this an excellent read.  The author begins by explaining what
 is expected of an FS, then covers the design choices, some trade-offs,
 and how the design interfaces with the actual hardware.  No
 OS-specific details, no APIs, no performance numbers.  Very solid
 fundamentals.

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Books on File Systems and File System Programming

2009-08-13 Thread Louis-Frédéric Feuillette
I saw this question on another mailing list, and I too would like to
know. And I have a couple questions of my own.

== Paraphrased from other list ==
Does anyone have any recommendations for books on File Systems and/or
File Systems Programming?
== end ==

I have some texts listed below, but are there books/journals/periodicals
that start from the kernel side of open(2), read(2), write(2), etc. and
progress to disk transactions?

With the advent of ZFS and other transaction-based file systems, it
seems to me that the line between file systems and databases is
beginning to blur (if it hasn't already been blurring for some time).
Any pointers of the form take X from here, Y from there, Z from over
yonder, and squish them together like Q are also welcome.

(relevant) Books I have:
Understanding the Linux Kernel ( The chapters about ext2 and VFS )
Systems programming in the UNIX environment
File Structures: An OO approach using C++
Database System concepts (More about SQL and how to implement Joins )

Thanks in advance.

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs fragmentation

2009-08-11 Thread Louis-Frédéric Feuillette
On Tue, 2009-08-11 at 08:04 -0700, Richard Elling wrote:
 On Aug 11, 2009, at 7:39 AM, Ed Spencer wrote:
  I suspect that if we 'rsync' one of these filesystems to a second
  server/pool  that we would also see a performance increase equal to  
  what
  we see on the development server. (I don't know how zfs send a receive
  work so I don't know if it would address this Filesystem Entropy or
  specifically reorganize the files and directories). However, when we
  created a testfs filesystem in the zfs pool on the production server,
  and copied data to it, we saw the same performance as the other
  filesystems, in the same pool.
 
 Directory walkers, like NetBackup or rsync, will not scale well as
 the number of files increases.  It doesn't matter what file system you
 use, the scalability will look more-or-less similar. For millions of  
 files,
 ZFS send/receive works much better.  More details are in my paper.

Is there a link to this paper available?

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lundman home NAS

2009-08-01 Thread Louis-Frédéric Feuillette
On Sat, 2009-08-01 at 22:31 +0900, Jorgen Lundman wrote:
 Some preliminary speed tests, not too bad for a pci32 card.
 
 http://lundman.net/wiki/index.php/Lraid5_iozone

I don't know anything about iozone, so the following may be NULL &&
void.

I find the results suspect.  1.2GB/s read and 500MB/s write are
impressive numbers indeed.  I then looked at the file sizes that iozone
used...  How much memory do you have?  It seems like the test files
would fit comfortably in memory.  I think this test needs to be re-run
with large files (i.e. 2x memory size) for it to give more accurate
data.

Unrelated: what did you use to generate those graphs?  They look good.
Also, do you have a hardware list on your site somewhere that I missed?
I'd like to know more about the hardware.

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSDs get faster and less expensive

2009-07-21 Thread Louis-Frédéric Feuillette
On Tue, 2009-07-21 at 14:45 -0700, Richard Elling wrote: 
 But to put this in perspective, you would have to *delete* 20 GBytes of
 data a day on a ZFS file system for 5 years (according to Intel) to  
 reach the expected endurance.

Forgive my ignorance, but isn't this exactly what an SSD ZIL does?  A
ZIL would need to delete its data when it flushes to disk.  I know this
thread is about consumer SSDs, but are enterprise SSDs that much better
in terms of write cycles (not speed; I know they differ dramatically in
some cases)?

Richard, do you have a blog post about SSDs that I missed in my travels?

-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pegging the system

2009-07-17 Thread Louis-Frédéric Feuillette
On Thu, 2009-07-16 at 10:51 -0700, Jeff Haferman wrote:
 We have an SGE array task that we wish to run with elements 1-7.
 Each task generates output and takes roughly 20 seconds to 4 minutes
 of CPU time.  We're running them on a cluster of about 144 8-core
 nodes, and we've divvied the job up to do about 500 at a time.
 
 So, we have 500 jobs at a time writing to the same ZFS partition.

Sorry, no answers, just some questions that first came to mind.

Where is your bottleneck?  Is it drive I/O or network?

Are all nodes accessing/writing via NFS?  Is this an NFS sync issue?
Might an SSD ZIL help?
-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Single disk parity

2009-07-07 Thread Louis-Frédéric Feuillette
On Tue, 2009-07-07 at 17:42 -0700, Richard Elling wrote:
 Christian Auby wrote:
  ZFS is able to detect corruption thanks to checksumming, but for single 
  drives (regular folk-pcs) it doesn't help much unless it can correct them. 
  I've been searching and can't find anything on the topic, so here goes:
 
  1. Can ZFS do parity data on a single drive? e.g. x% parity for all writes, 
  recover on checksum error.
  2. If not, why not? I imagine it would have been a killer feature.
 
  I guess you could possibly do it by partitioning the single drive and 
  running raidz(2) on the partitions, but that would lose you way more space 
  than e.g. 10%. Also not practical for OS drive.

 
 You are describing the copies parameter.  It really helps to describe
 it in pictures, rather than words.  So I did that.
 http://blogs.sun.com/relling/entry/zfs_copies_and_data_protection
  -- richard

I think one solution to what Christian is asking is copies.  But I think
he is asking whether there is a way to do something like RAID at the
block level, so that your capacity isn't cut in half.  For example,
write 5 blocks to the disk, 4 data and one parity; then if any one of
the blocks gets corrupted or is unreadable, you can reconstruct the
missing block (a toy sketch of this follows below).  In this example
you would only lose 20% of your capacity, not 50%.
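
As a toy illustration (my own Python sketch of plain XOR parity, not
something ZFS implements), reconstructing one lost block out of four
data blocks plus one parity block looks like this:

  import os
  from functools import reduce

  # Byte-wise XOR across equal-sized blocks.
  def xor_blocks(blocks):
      return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

  data = [os.urandom(512) for _ in range(4)]  # 4 data blocks
  parity = xor_blocks(data)                   # 1 parity block: 20% overhead

  lost = 2                                    # say block 2 fails its checksum
  survivors = [b for i, b in enumerate(data) if i != lost]
  rebuilt = xor_blocks(survivors + [parity])  # XOR of the rest recovers it
  assert rebuilt == data[lost]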

I think this option would only really be useful for home users or
simple workstations, and it could have some performance implications.

-Jebnor
-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Ditto blocks on RAID-Z pool.

2009-07-02 Thread Louis-Frédéric Feuillette
Hello all,

If you have copies=2 on a large enough raid-z(2) pool and 2 (or 3)
disks die, is it possible to recover that information despite the pool
being offline?

I don't have this happening to me; it's just a theoretical question.
So, if you can't recover the data, is there any advantage to using ditto
blocks on top of raid-z(2)?

Jebnor

-- 
Louis-Frédéric Feuillette jeb...@gmail.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mobo SATA migration to AOC-SAT2-MV8 SATA card

2009-06-20 Thread Louis-Frédéric Feuillette
A couple questions out of pure curiosity.

Working on the assumption that you are going to be adding more drives
to your server, why not just add the new drives to the Supermicro
controller and keep the existing pool (well, vdev) where it is?

Reading your blog, it seems that you need one (or two, if you are
mirroring) SATA ports for your rpool.  Why not just migrate two drives
to the new controller and leave the others where they are?  OpenSolaris
won't care where the drives are physically connected as long as you
export/import.

-Jebnor

On Fri, 2009-06-19 at 16:21 -0700, Simon Breden wrote:
 Hi,
 
 I'm using 6 SATA ports from the motherboard but I've now run out of SATA 
 ports, and so I'm thinking of adding a Supermicro AOC-SAT2-MV8 8-port SATA 
 controller card.
 
 What is the procedure for migrating the drives to this card?
 Is it a simple case of (1) issuing a 'zpool export pool_name' command, (2) 
 shutdown, (3) insert card and move all SATA cables for drives from mobo to 
 card, (4) boot and issue a 'zpool import pool_name' command ?
 
 Thanks,
 Simon
 
 http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/
-- 
Louis-Frédéric Feuillette jeb...@gmail.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss