Re: [zfs-discuss] zfs-discuss Digest, Vol 59, Issue 13

2010-09-09 Thread Dr. Martin Mundschenk
Am 09.09.2010 um 07:00 schrieb zfs-discuss-requ...@opensolaris.org: What's the write workload like? You could try disabling the ZIL to see if that makes a difference. If it does, the addition of an SSD-based ZIL / slog device would most certainly help. Maybe you could describe the makeup
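
A quick way to run that test (a rough sketch -- "tank" is a hypothetical pool name, and the sync property only exists on newer builds; on older releases the ZIL could only be disabled system-wide via the zil_disable tunable, and either way this is for testing only, never production):

    # disable synchronous write semantics for testing ONLY
    zfs set sync=disabled tank
    # ... rerun the workload and compare ...
    zfs set sync=standard tank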

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Erik Trimble
On 9/8/2010 10:08 PM, Freddie Cash wrote: On Wed, Sep 8, 2010 at 6:27 AM, Edward Ned Harvey sh...@nedharvey.com wrote: Both of the above situations resilver in equal time, unless there is a bus bottleneck. 21 disks in a single raidz3 will resilver just as fast as 7 disks in a raidz1, as long

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Erik Trimble
On 9/9/2010 2:15 AM, taemun wrote: Erik: does that mean that keeping the number of data drives in a raidz(n) to a power of two is better? In the example you gave, you mentioned 14kb being written to each drive. That doesn't sound very efficient to me. (when I say the above, I mean a five
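
Erik's 14kb figure falls straight out of the default 128KB recordsize (a back-of-the-envelope sketch, assuming a 10-disk raidz1, i.e. 9 data disks, versus 8 data disks):

    echo $(( 131072 / 9 ))   # ~14563 bytes per data disk -- not a power of two
    echo $(( 131072 / 8 ))   # 16384 bytes per data disk  -- cleanly aligned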

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Tomas Ögren
On 08 September, 2010 - Fei Xu sent me these 5,9K bytes: I dug deeper into it and might have found some useful information. I attached an X25 SSD for ZIL to see if it helps, but no luck. I ran iostat -xnz for more details and got an interesting result, as below (maybe too long). Some explanation:

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Fei Xu
Service times here are crap. Disks are malfunctioning in some way. If your source disks can take seconds (or 10+ seconds) to reply, then of course your copy will be slow. Disk is probably having a hard time reading the data or something. Yeah, that should not go over 15ms. I just

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread hatish
Very interesting... Well, let's see if we can do the numbers for my setup. From a previous post of mine: This is my exact breakdown (cheap disks on cheap bus :P): PCI-E 8X 4-port eSATA RAID controller. 4 x eSATA-to-5-SATA port multipliers (each connected to an eSATA port on the controller).

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Freddie Cash No, it (21-disk raidz3 vdev) most certainly will not resilver in the same amount of time. In fact, I highly doubt it would resilver at all. My first foray into ZFS resulted

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Erik Trimble
On 9/9/2010 5:49 AM, hatish wrote: Very interesting... Well, let's see if we can do the numbers for my setup. From a previous post of mine: This is my exact breakdown (cheap disks on cheap bus :P): PCI-E 8X 4-port eSATA RAID controller. 4 x eSATA-to-5-SATA port multipliers (each

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Will Murnane
On Thu, Sep 9, 2010 at 09:03, Erik Trimble erik.trim...@oracle.com wrote: Actually, your biggest bottleneck will be the IOPS limits of the drives. A 7200RPM SATA drive tops out at 100 IOPS. Yup. That's it. So, if you need to do 62.5e6 I/Os, and the rebuild drive can do just 100 IOPS, that
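
Taking Erik's figures at face value, the arithmetic is easy to reproduce (a rough sketch; both numbers are his, from this thread):

    echo $(( 62500000 / 100 ))          # 625000 seconds, best case
    echo $(( 62500000 / 100 / 3600 ))   # ~173 hours
    echo $(( 62500000 / 100 / 86400 ))  # ~7 days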

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Erik Trimble the thing that folks tend to forget is that RaidZ is IOPS limited. For the most part, if I want to reconstruct a single slab (stripe) of data, I have to issue a read to EACH
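
That read-every-disk behavior is why small random I/O scales per vdev, not per disk (a rough sketch reusing Erik's ~100 IOPS-per-disk figure from earlier in the thread):

    echo $(( 1 * 100 ))   # one 21-disk raidz3 vdev: ~100 random-read IOPS
    echo $(( 3 * 100 ))   # three 7-disk raidz1 vdevs: ~300 random-read IOPS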

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Edward Ned Harvey
From: Hatish Narotam [mailto:hat...@gmail.com] PCI-E 8X 4-port eSATA RAID controller. 4 x eSATA-to-5-SATA port multipliers (each connected to an eSATA port on the controller). 20 x Samsung 1TB HDDs (each connected to a port multiplier). Assuming your disks can all sustain 500Mbit/sec,
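
The port-multiplier links are the likely choke point (a rough sketch, assuming 3Gbit/s SATA-II eSATA links and Ned's 500Mbit/s-per-disk figure):

    echo $(( 20 * 500 ))   # 10000 Mbit/s demanded by 20 disks
    echo $(( 4 * 3000 ))   # 12000 Mbit/s across 4 eSATA links -- enough in theory
    echo $(( 3000 / 5 ))   # ~600 Mbit/s per disk when 5 disks share one link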

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey The characteristic that *really* makes a big difference is the number of slabs in the pool. i.e. if your filesystem is composed of mostly small files or fragments, versus

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Marty Scholes
Erik wrote: Actually, your biggest bottleneck will be the IOPS limits of the drives. A 7200RPM SATA drive tops out at 100 IOPS. Yup. That's it. So, if you need to do 62.5e6 I/Os, and the rebuild drive can do just 100 IOPS, that means you will finish (best case) in 62.5e4 seconds.

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Ross Walker
On Sep 9, 2010, at 8:27 AM, Fei Xu twinse...@hotmail.com wrote: Service times here are crap. Disks are malfunctioning in some way. If your source disks can take seconds (or 10+ seconds) to reply, then of course your copy will be slow. Disk is probably having a hard time reading the data

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Markus Kovero
On Sep 9, 2010, at 8:27 AM, Fei Xu twinse...@hotmail.com wrote: This might be the dreaded WD TLER issue. Basically, after a bit error the drive keeps retrying the read operation over and over, trying to recover from the read error by itself. With ZFS one really needs to disable this and have
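
On drives that expose SCT error recovery control, the retry timeout can be capped so the drive gives up quickly and lets ZFS repair from redundancy (a sketch, not vendor-specific advice -- scterc support varies by drive and smartmontools version, and the device path is hypothetical):

    # cap read/write error recovery at 7.0 seconds (units of 100ms)
    smartctl -l scterc,70,70 /dev/rdsk/c4t0d0
    smartctl -l scterc /dev/rdsk/c4t0d0      # query the current setting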

Re: [zfs-discuss] zpool create using whole disk - do I add p0? E.g. c4t2d0 or c42d0p0

2010-09-09 Thread Cindy Swearingen
Hi-- It might help to review the disk component terminology description: c#t#d#p# = the fdisk partition on x86 systems, where you can have up to 4 fdisk partitions, such as one for the Solaris OS or a Windows OS. An fdisk partition is the larger container of the disk or disk
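
Which answers the subject line directly (a minimal sketch; "tank" is a hypothetical pool name): give zpool the bare c#t#d# name and it takes the whole disk.

    # no p0 (fdisk partition) or s0 (slice) suffix needed;
    # ZFS writes an EFI label and uses the entire device
    zpool create tank c4t2d0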

[zfs-discuss] NFS performance near zero on a very full pool

2010-09-09 Thread Arne Jansen
Hi, currently I'm trying to debug a very strange phenomenon on a nearly full pool (96%). Here are the symptoms: over NFS, a find on the pool takes a very long time, up to 30s (!) for each file. Locally, the performance is quite normal. What I found out so far: It seems that every nfs write
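
Worth confirming up front just how full the pool really is (a minimal sketch; "tank" is a hypothetical pool name):

    zpool list tank          # the CAP column shows percent used
    zfs list -o space tank   # per-dataset space breakdown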

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Mark Little
On Thu, 9 Sep 2010 14:05:51 +, Markus Kovero markus.kov...@nebula.fi wrote: On Sep 9, 2010, at 8:27 AM, Fei Xu twinse...@hotmail.com wrote: This might be the dreaded WD TLER issue. Basically the drive keeps retrying a read operation over and over after a bit error trying to recover

Re: [zfs-discuss] NFS performance near zero on a very full pool

2010-09-09 Thread Neil Perrin
Arne, NFS often demands its transactions are stable before returning. This forces ZFS to do the system call synchronously. Usually the ZIL (code) allocates and writes a new block in the intent log chain to achieve this. If ever it fails to allocate a block (of the size requested) it is forced

[zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread David Magda
of the agreement are confidential. http://tinyurl.com/39qkzgz http://www.netapp.com/us/company/news/news-rel-20100909-oracle-settlement.html A recap of the history at: http://www.theregister.co.uk/2010/09/09/oracle_netapp_zfs_dismiss/

Re: [zfs-discuss] NFS performance near zero on a very full pool

2010-09-09 Thread Neil Perrin
I should also have mentioned that if the pool has a separate log device then this shouldn't happen. Assuming the slog is big enough, it should have enough blocks to not be forced into using main pool device blocks. Neil. On 09/09/10 10:36, Neil Perrin wrote: Arne, NFS often demands
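
Adding a dedicated log device is a one-liner (a minimal sketch; pool and device names are hypothetical):

    zpool add tank log c5t0d0                  # single slog SSD
    zpool add tank log mirror c5t0d0 c6t0d0    # or mirrored, for safety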

Re: [zfs-discuss] NFS performance near zero on a very full pool

2010-09-09 Thread Arne Jansen
Hi Neil, Neil Perrin wrote: NFS often demands its transactions are stable before returning. This forces ZFS to do the system call synchronously. Usually the ZIL (code) allocates and writes a new block in the intent log chain to achieve this. If ever it fails to allocate a block (of the size

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Richard Elling
Oracle and NetApp seek to have the lawsuits dismissed without prejudice. The terms of the agreement are confidential. http://tinyurl.com/39qkzgz http://www.netapp.com/us/company/news/news-rel-20100909-oracle-settlement.html A recap of the history at: http://www.theregister.co.uk/2010/09

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Erik Trimble
in 2007 between Sun Microsystems and NetApp. Oracle and NetApp seek to have the lawsuits dismissed without prejudice. The terms of the agreement are confidential. http://tinyurl.com/39qkzgz http://www.netapp.com/us/company/news/news-rel-20100909-oracle-settlement.html A recap of the history

Re: [zfs-discuss] NFS performance near zero on a very full pool

2010-09-09 Thread Richard Elling
On Sep 9, 2010, at 10:09 AM, Arne Jansen wrote: Hi Neil, Neil Perrin wrote: NFS often demands its transactions are stable before returning. This forces ZFS to do the system call synchronously. Usually the ZIL (code) allocates and writes a new block in the intent log chain to achieve

Re: [zfs-discuss] NFS performance near zero on a very full pool

2010-09-09 Thread Arne Jansen
Richard Elling wrote: On Sep 9, 2010, at 10:09 AM, Arne Jansen wrote: Hi Neil, Neil Perrin wrote: NFS often demands its transactions are stable before returning. This forces ZFS to do the system call synchronously. Usually the ZIL (code) allocates and writes a new block in the intent log

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Bob Friesenhahn
On Thu, 9 Sep 2010, Erik Trimble wrote: Yes, it's welcome to get it over with. I do get to bitch about one aspect here of the US civil legal system, though. If you've gone so far as to burn our (the public's) time and money to file a lawsuit, you shouldn't be able to seal up the court

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Miles Nordin
ml == Mark Little marklit...@koallo.com writes: ml Just to clarify - do you mean TLER should be off or on? It should be set to ``do not have asvc_t of 11 seconds and 1 io/s''. ...which is not one of the settings of the TLER knob. This isn't a problem with the TLER *setting*. TLER does not

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Miles Nordin
dm == David Magda dma...@ee.ryerson.ca writes: dm http://www.theregister.co.uk/2010/09/09/oracle_netapp_zfs_dismiss/ http://www.groklaw.net/articlebasic.php?story=20050121014650517 says when the MPL was modified to become the CDDL, clauses were removed which would have required Oracle to

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Erik Trimble
On 9/9/2010 11:11 AM, Garrett D'Amore wrote: On Thu, 2010-09-09 at 12:58 -0500, Bob Friesenhahn wrote: On Thu, 9 Sep 2010, Erik Trimble wrote: Yes, it's welcome to get it over with. I do get to bitch about one aspect here of the US civil legal system, though. If you've gone so far as to burn

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Bob Friesenhahn
On Thu, 9 Sep 2010, Garrett D'Amore wrote: True. But, I wonder if the settlement sets a precedent? No precedent has been set. Certainly the lack of a successful lawsuit has *failed* to set any precedent conclusively indicating that NetApp has enforceable patents where ZFS is concerned.

[zfs-discuss] resilver = defrag?

2010-09-09 Thread Orvar Korvar
A) Resilver = Defrag. True/false? B) If I buy larger drives and resilver, does defrag happen? C) Does zfs send | zfs receive mean it will defrag?

[zfs-discuss] How to migrate to 4KB sector drives?

2010-09-09 Thread Orvar Korvar
ZFS does not handle 4K sector drives well; you need to create a new zpool with the 4K property (ashift) set. http://www.solarismen.de/archives/5-Solaris-and-the-new-4K-Sector-Disks-e.g.-WDxxEARS-Part-2.html Are there plans to allow resilver to handle 4K sector drives?
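
At the time this required the patched zpool binary described in the linked article; current OpenZFS exposes it at pool creation (a sketch; pool and device names are hypothetical):

    # force 4KB alignment (2^12) when creating the pool
    zpool create -o ashift=12 tank raidz c0t0d0 c0t1d0 c0t2d0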

Re: [zfs-discuss] resilver = defrag?

2010-09-09 Thread Freddie Cash
On Thu, Sep 9, 2010 at 1:04 PM, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote: A) Resilver = Defrag. True/false? False. Resilver just rebuilds a drive in a vdev based on the redundant data stored on the other drives in the vdev. Similar to how replacing a dead drive works in a hardware
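
For reference, a resilver is what you get when you swap a device (a minimal sketch; pool and device names are hypothetical):

    zpool replace tank c1t4d0 c1t5d0   # rebuild onto the new disk
    zpool status tank                  # watch resilver progress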

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Bob Friesenhahn There should be little doubt that NetApp's goal was to make money by suing Sun. Nexenta does not have enough income/assets to make a risky lawsuit worthwhile. But in all

Re: [zfs-discuss] resilver = defrag?

2010-09-09 Thread Marty Scholes
I am speaking from my own observations and nothing scientific such as reading the code or designing the process. A) Resilver = Defrag. True/false? False B) If I buy larger drives and resilver, does defrag happen? No. The first X sectors of the bigger drive are identical to the smaller

Re: [zfs-discuss] resilver = defrag?

2010-09-09 Thread Freddie Cash
On Thu, Sep 9, 2010 at 1:26 PM, Freddie Cash fjwc...@gmail.com wrote: On Thu, Sep 9, 2010 at 1:04 PM, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote: A) Resilver = Defrag. True/false? False.  Resilver just rebuilds a drive in a vdev based on the redundant data stored on the other drives

Re: [zfs-discuss] NetApp/Oracle-Sun lawsuit done

2010-09-09 Thread Tim Cook
On Thu, Sep 9, 2010 at 2:49 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Thu, 9 Sep 2010, Garrett D'Amore wrote: True. But, I wonder if the settlement sets a precedent? No precedent has been set. Certainly the lack of a successful lawsuit has *failed* to set any

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Haudy Kazemi
Comment at end... Mattias Pantzare wrote: On Wed, Sep 8, 2010 at 15:27, Edward Ned Harvey sh...@nedharvey.com wrote: From: pantz...@gmail.com [mailto:pantz...@gmail.com] On Behalf Of Mattias Pantzare It is about 1 vdev with 12 disks or 2 vdevs with 6 disks. If you have 2 vdevs you have to

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Haudy Kazemi
Erik Trimble wrote: On 9/9/2010 2:15 AM, taemun wrote: Erik: does that mean that keeping the number of data drives in a raidz(n) to a power of two is better? In the example you gave, you mentioned 14kb being written to each drive. That doesn't sound very efficient to me. (when I say the

Re: [zfs-discuss] performance leakage when copy huge data

2010-09-09 Thread Fei Xu
Just to update the status and findings. I've checked the TLER settings and they are off by default. I moved the source pool to another chassis and did the 3.8TB send again. This time, no problems at all! The differences: 1. new chassis; 2. bigger memory, 32GB vs. 12GB; 3. although wdidle time is

Re: [zfs-discuss] resilver = defrag?

2010-09-09 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Orvar Korvar A) Resilver = Defrag. True/false? I think everyone will agree false on this question. However, more detail may be appropriate. See below. B) If I buy larger drives and

Re: [zfs-discuss] Suggested RaidZ configuration...

2010-09-09 Thread Edward Ned Harvey
From: Haudy Kazemi [mailto:kaze0...@umn.edu] There is another optimization in the Best Practices Guide that says the number of devices in a vdev should be (N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 8, i.e. 2^n + P where n is 1, 2, or 3 and P is the RAIDZ level.
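
The rule enumerates to a short list of widths (a minimal sketch in plain sh):

    # vdev widths satisfying (2^n data + P parity), n = 1..3
    for P in 1 2 3; do
      for N in 2 4 8; do
        printf 'raidz%d: %d disks (%d data + %d parity)\n' $P $((N+P)) $N $P
      done
    done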

Re: [zfs-discuss] resilver = defrag?

2010-09-09 Thread Bill Sommerfeld
On 09/09/10 20:08, Edward Ned Harvey wrote: Scores so far: 2 No 1 Yes No. Resilver does not re-layout your data or change what's in the block pointers on disk. If it was fragmented before, it will be fragmented after. C) Does zfs send | zfs receive mean it will defrag?