[zfs-discuss] ARC Tail

2010-04-02 Thread Abdullah Al-Dahlawi
Greetings all. Can anyone help me figure out the size of the ARC tail, i.e., the portion of the ARC that the l2arc feed thread reads from before pages are evicted from the ARC? Is the size of this tail proportional to total ARC size? To L2ARC device size? Is it tunable? Your feedback is highly appreciated.
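For reference, in the OpenSolaris-era ARC code the feed thread's scan depth is governed by a pair of tunables rather than by a fraction of ARC size; a minimal sketch of /etc/system settings follows (the usual defaults are shown, but treat exact names and values as build-dependent assumptions):

  * L2ARC feed tunables -- scan depth from the ARC tail is roughly
  * l2arc_write_max * l2arc_headroom per feed interval
  * 8 MB maximum written to the L2ARC per interval:
  set zfs:l2arc_write_max = 0x800000
  * scan-depth multiplier:
  set zfs:l2arc_headroom = 2
  * feed thread wakeup interval, in seconds:
  set zfs:l2arc_feed_secs = 1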

Re: [zfs-discuss] [install-discuss] Installing Opensolaris without ZFS?

2010-04-02 Thread Erik Trimble
[removing all lists except ZFS-discuss, as this is really pertinent only there] ольга крыжановская wrote: Are there plans to reduce the memory usage of ZFS in the near future? Olga 2010/4/2 Alan Coopersmith alan.coopersm...@oracle.com: ольга крыжановская wrote: Does Opensolaris

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Casper . Dik
On 01/04/2010 20:58, Jeroen Roodhart wrote: I'm happy to see that it is now the default and I hope this will cause the Linux NFS client implementation to be faster for conforming NFS servers. Interesting thing is that apparently defaults on Solaris and Linux are chosen such that one

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Roch
Robert Milkowski writes: On 01/04/2010 20:58, Jeroen Roodhart wrote: I'm happy to see that it is now the default and I hope this will cause the Linux NFS client implementation to be faster for conforming NFS servers. Interesting thing is that apparently defaults on Solaris

Re: [zfs-discuss] how can I remove files when the file system is full?

2010-04-02 Thread Tim Haley
On 04/ 1/10 01:46 PM, Eiji Ota wrote: During the IPS upgrade, the file system got full, and now I cannot do anything to recover it.
# df -kl
Filesystem             1K-blocks     Used Available Use% Mounted on
rpool/ROOT/opensolaris   4976642  4976642         0 100% /
swap

Re: [zfs-discuss] how can I remove files when the file system is full?

2010-04-02 Thread Eiji Ota
Thanks. It worked, but the fs still says it's full. Is this normal, and will I get some space back eventually (if I continue this)?
# cat /dev/null > ./messages.1
# cat /dev/null > ./messages.0
# df -kl
Filesystem             1K-blocks     Used Available Use% Mounted on
rpool/ROOT/opensolaris   4976123  4976123         0 100% /
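A minimal sketch of the technique, assuming the usual /var/adm log names; truncating in place frees blocks without allocating new metadata, which is why it can succeed when rm or an editor fails on a 100%-full ZFS filesystem:

  # Truncate a large file in place:
  cat /dev/null > /var/adm/messages.0
  # The space only comes back once nothing pins the old blocks --
  # check for snapshots still referencing them:
  zfs list -t snapshot -r rpool/ROOT/opensolaris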

Re: [zfs-discuss] how can I remove files when the file system is full?

2010-04-02 Thread Eiji Ota
Thanks, Brandon. Now that the issue has gone away, I was able to recover my host. -Eiji On Thu, Apr 1, 2010 at 1:39 PM, Eiji Ota eiji@oracle.com wrote: Thanks. It worked, but the fs still says it's full. Is this normal, and will I get some space back eventually (if I continue this)? You

Re: [zfs-discuss] how can I remove files when the file system is full?

2010-04-02 Thread Edward Ned Harvey
On opensolaris? Did you try deleting any old BEs? Don't forget to zfs destroy rp...@snapshot In fact, you might start with destroying snapshots ... if there are any occupying space.
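A minimal sketch of that cleanup, with hypothetical snapshot and BE names (the real names come from the zfs list / beadm list output on the affected host):

  # Find what is holding space:
  zfs list -t snapshot -r rpool
  # Destroy a snapshot (name is hypothetical):
  zfs destroy rpool/ROOT/opensolaris@2010-03-15
  # Old boot environments pin snapshots too; on OpenSolaris beadm
  # can list and destroy them:
  beadm list
  beadm destroy opensolaris-1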

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
Seriously, all disks configured WriteThrough (spindle and SSD disks alike) using the dedicated ZIL SSD device, very noticeably faster than enabling the WriteBack. What do you get with both SSD ZIL and WriteBack disks enabled? I mean, if you have both, why not use both? Then both

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
I know it is way after the fact, but I find it best to coerce each drive down to the whole GB boundary using format (create Solaris partition just up to the boundary). Then if you ever get a drive a little smaller it still should fit. It seems like it should be unnecessary. It seems like

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Roch
When we use one vmod, both machines are finished in about 6min45; zilstat maxes out at about 4200 IOPS. Using four vmods it takes about 6min55; zilstat maxes out at 2200 IOPS. Can you try 4 concurrent tars to four different ZFS filesystems (same pool)? -r

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
http://nfs.sourceforge.net/ I think B4 is the answer to Casper's question: We were talking about ZFS, and under what circumstances data is flushed to disk, in what way sync and async writes are handled by the OS, and what happens if you disable ZIL and lose power to your system. We were

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
I am envisioning a database, which issues a small sync write, followed by a larger async write. Since the sync write is small, the OS would prefer to defer the write and aggregate into a larger block. So the possibility of the later async write being committed to disk before the older

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
Hello. I have had this problem this week. Our ZIL SSD died (APT SLC SSD, 16 GB). Because we had no spare drive in stock, we ignored it. Then we decided to update our Nexenta 3 alpha to beta, exported the pool, and made a fresh install to have a clean system, and tried to import the pool. We

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
ZFS recovers to a crash-consistent state, even without the slog, meaning it recovers to some state through which the filesystem passed in the seconds leading up to the crash. This isn't what UFS or XFS do. The on-disk log (slog or otherwise), if I understand right, can actually make the

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
If you have zpool less than version 19 (when ability to remove log device was introduced) and you have a non-mirrored log device that failed, you had better treat the situation as an emergency. Instead, do man zpool and look for zpool remove. If it says supports removing log devices

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Casper . Dik
http://nfs.sourceforge.net/ I think B4 is the answer to Casper's question: We were talking about ZFS, and under what circumstances data is flushed to disk, in what way sync and async writes are handled by the OS, and what happens if you disable ZIL and lose power to your system. We were

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Casper . Dik
So you're saying that while the OS is building txg's to write to disk, the OS will never reorder the sequence in which individual write operations get ordered into the txg's. That is, an application performing a small sync write, followed by a large async write, will never have the second

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
Dude, don't be so arrogant. Acting like you know what I'm talking about better than I do. Face it that you have something to learn here. You may say that, but then you post this: Acknowledged. I read something arrogant, and I replied even more arrogant. That was dumb of me.

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
Only a broken application uses sync writes sometimes, and async writes at other times. Suppose there is a virtual machine, with virtual processes inside it. Some virtual process issues a sync write to the virtual OS, meanwhile another virtual process issues an async write. Then the virtual OS

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Edward Ned Harvey
The purpose of the ZIL is to act like a fast log for synchronous writes. It allows the system to quickly confirm a synchronous write request with the minimum amount of work. Bob and Casper and some others clearly know a lot here. But I'm hearing conflicting information, and don't know what

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Casper . Dik
Questions to answer would be: Is a ZIL log device used only by sync() and fsync() system calls? Is it ever used to accelerate async writes? There are quite a few sync writes, specifically when you mix in the NFS server. Suppose there is an application which sometimes does sync writes,

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Kyle McDonald
On 4/2/2010 8:08 AM, Edward Ned Harvey wrote: I know it is way after the fact, but I find it best to coerce each drive down to the whole GB boundary using format (create Solaris partition just up to the boundary). Then if you ever get a drive a little smaller it still should fit. It

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Mattias Pantzare
On Fri, Apr 2, 2010 at 16:24, Edward Ned Harvey solar...@nedharvey.com wrote: The purpose of the ZIL is to act like a fast log for synchronous writes. It allows the system to quickly confirm a synchronous write request with the minimum amount of work. Bob and Casper and some others clearly

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Bob Friesenhahn
On Fri, 2 Apr 2010, Edward Ned Harvey wrote: So you're saying that while the OS is building txg's to write to disk, the OS will never reorder the sequence in which individual write operations get ordered into the txg's. That is, an application performing a small sync write, followed by a large

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Bob Friesenhahn
On Fri, 2 Apr 2010, Edward Ned Harvey wrote: were taking place at the same time. That is, if two processes both complete a write operation at the same time, one in sync mode and the other in async mode, then it is guaranteed the data on disk will never have the async data committed before the

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Stuart Anderson
On Apr 2, 2010, at 5:08 AM, Edward Ned Harvey wrote: I know it is way after the fact, but I find it best to coerce each drive down to the whole GB boundary using format (create Solaris partition just up to the boundary). Then if you ever get a drive a little smaller it still should fit.

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Ross Walker
On Fri, Apr 2, 2010 at 8:03 AM, Edward Ned Harvey solar...@nedharvey.com wrote: Seriously, all disks configured WriteThrough (spindle and SSD disks alike) using the dedicated ZIL SSD device, very noticeably faster than enabling the WriteBack. What do you get with both SSD ZIL and

[zfs-discuss] dedup and memory/l2arc requirements

2010-04-02 Thread Roy Sigurd Karlsbakk
Hi all. I've been told (on #opensolaris, irc.freenode.net) that opensolaris needs a lot of memory and/or l2arc for dedup to function properly. How much memory or l2arc should I get for a 12TB zpool (8x2TB in RAIDz2), and then, how much for 125TB (after RAIDz2 overhead)? Is there a function into
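As a back-of-the-envelope sketch only -- the ~320 bytes per in-core DDT entry and the 64 KB average block size are commonly quoted rules of thumb, not measured values for this pool:

  # RAM/L2ARC for the DDT ~= (pool bytes / avg block size) * 320
  # 12 TB of unique data at 64 KB average blocks:
  echo $(( 12 * 1024 * 1024 * 1024 * 1024 / 65536 * 320 / 1024 / 1024 / 1024 ))
  # -> ~60 GB, i.e. roughly 5 GB per TB of unique data
  # For an existing pool, the DDT can be simulated instead of guessed
  # (pool name hypothetical):
  zdb -S tank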

Re: [zfs-discuss] [install-discuss] Installing Opensolaris without ZFS?

2010-04-02 Thread Roy Sigurd Karlsbakk
I doubt it. ZFS is meant to be used for large systems, in which memory is not an issue. Best regards, roy -- Roy Sigurd Karlsbakk (+47) 97542685 r...@karlsbakk.net http://blogg.karlsbakk.net/ -- In all pedagogy it is essential that the curriculum be presented intelligibly. That is an elementary

Re: [zfs-discuss] is this pool recoverable?

2010-04-02 Thread Patrick Tiquet
I tried booting with b134 to attempt to recover the pool. I attempted with one disk of the mirror. zpool tells me to use -F for the import; that fails, but then it tells me to use -f, which also fails and tells me to use -F again. Any thoughts?
j...@opensolaris:~# zpool import
  pool: atomfs
    id:
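For reference, a minimal recovery-import sketch on a build that supports it; -F rewinds the pool to its last consistent txg and -f overrides the "in use by another system" check, and the two can be combined (as the later reply in this thread confirms):

  # List importable pools and their state:
  zpool import
  # Rewind and force the import:
  zpool import -F -f atomfs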

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Robert Milkowski
On 02/04/2010 16:04, casper@sun.com wrote: sync() is actually *async* and returning from sync() says nothing about To clarify: in the case of ZFS, sync() is actually synchronous. -- Robert Milkowski http://milek.blogspot.com

Re: [zfs-discuss] is this pool recoverable?

2010-04-02 Thread Bob Friesenhahn
On Fri, 2 Apr 2010, Patrick Tiquet wrote: I tried booting with b134 to attempt to recover the pool. I attempted with one disk of the mirror. Zpool tells me to use -F for import, fails, but then tells me to use -f, which also fails and tells me to use -F again. Any thoughts? It looks like it

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Tirso Alonso
If my new replacement SSD with identical part number and firmware is 0.001 Gb smaller than the original and hence unable to mirror, what's to prevent the same thing from happening to one of my 1TB spindle disk mirrors? There is a standard for sizes that many manufacturers use (IDEMA LBA1-02):
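For reference, a worked sketch of the LBA1-02 rule for 512-byte-sector drives of 50 GB and up (the result for a "1 TB" drive matches the familiar 1,953,525,168-sector count):

  # LBA count = 97,696,368 + 1,953,504 * (advertised GB - 50)
  GB=1000   # a "1 TB" drive
  echo $(( 97696368 + 1953504 * (GB - 50) ))   # -> 1953525168 LBAs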

Re: [zfs-discuss] dedup and memory/l2arc requirements

2010-04-02 Thread Richard Elling
On Apr 1, 2010, at 5:39 PM, Roy Sigurd Karlsbakk wrote: Hi all. I've been told (on #opensolaris, irc.freenode.net) that opensolaris needs a lot of memory and/or l2arc for dedup to function properly. How much memory or l2arc should I get for a 12TB zpool (8x2TB in RAIDz2), and then, how much

Re: [zfs-discuss] is this pool recoverable?

2010-04-02 Thread Patrick Tiquet
Thanks, that worked!! It needed -Ff. The pool has been recovered with minimal loss of data.

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Miles Nordin
enh == Edward Ned Harvey solar...@nedharvey.com writes:
enh If you have zpool less than version 19 (when ability to remove
enh log device was introduced) and you have a non-mirrored log
enh device that failed, you had better treat the situation as an
enh emergency.
Ed the log device

[zfs-discuss] ZFS behavior under limited resources

2010-04-02 Thread Mike Z
I am trying to see how ZFS behaves under resource starvation - corner cases in embedded environments. I see some very strange behavior. Any help/explanation would really be appreciated. My current setup is: OpenSolaris 111b (iSCSI seems to be broken in 132 - unable to get multiple

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Tim Cook
On Fri, Apr 2, 2010 at 10:08 AM, Kyle McDonald kmcdon...@egenera.com wrote: On 4/2/2010 8:08 AM, Edward Ned Harvey wrote: I know it is way after the fact, but I find it best to coerce each drive down to the whole GB boundary using format (create Solaris partition just up to the boundary).

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Eric D. Mudama
On Fri, Apr 2 at 11:14, Tirso Alonso wrote: If my new replacement SSD with identical part number and firmware is 0.001 Gb smaller than the original and hence unable to mirror, what's to prevent the same thing from happening to one of my 1TB spindle disk mirrors? There is a standard for sizes

Re: [zfs-discuss] dedup and memory/l2arc requirements

2010-04-02 Thread Miles Nordin
re == Richard Elling richard.ell...@gmail.com writes:
re # ptime zdb -S zwimming
re Simulated DDT histogram:
re refcnt  blocks  LSIZE  PSIZE  DSIZE  blocks  LSIZE  PSIZE  DSIZE
re Total    2.63M   277G   218G   225G   3.22M   337G   263G   270G
re in-core

Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Edward Ned Harvey
This might be unrelated, but along similar lines ... I've also heard that the risk of unexpected failure of your pool is higher if/when you reach 100% capacity. I've heard that you should always create a small ZFS filesystem within a pool, and give it some reserved space, along with the

Re: [zfs-discuss] RAID-Z with Permanent errors detected in files

2010-04-02 Thread Lutz Schumann
I guess it will then remain a mystery how this happened, since I'm very careful when engaging the commands and I'm sure that I didn't miss the raidz parameter. You can be sure by calling zpool history. Robert
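A one-line sketch, with a hypothetical pool name; zpool history replays every zpool/zfs command ever run against the pool, so it settles exactly this kind of question:

  # -l adds the user and hostname for each record:
  zpool history -l tank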

Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Ian Collins
On 04/ 3/10 10:23 AM, Edward Ned Harvey wrote: Momentarily, I will begin scouring the omniscient interweb for information, but I’d like to know a little bit of what people would say here. The question is to slice, or not to slice, disks before using them in a zpool. Not. One reason to

Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Brandon High
On Fri, Apr 2, 2010 at 2:23 PM, Edward Ned Harvey guacam...@nedharvey.com wrote: There is some question about performance. Is there any additional overhead caused by using a slice instead of the whole physical device? zfs won't enable the write cache when it's not working with whole disks,
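A minimal sketch of the distinction, with hypothetical device names:

  # Whole disk: ZFS writes an EFI label and enables the drive's
  # write cache itself:
  zpool create tank c0t1d0
  # Slice: ZFS cannot assume exclusive ownership of the disk, so it
  # leaves the write cache alone -- this is the potential overhead:
  zpool create tank c0t1d0s0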

Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Brandon High
On Fri, Apr 2, 2010 at 2:29 PM, Edward Ned Harvey solar...@nedharvey.com wrote: I’ve also heard that the risk of unexpected failure of your pool is higher if/when you reach 100% capacity. I’ve heard that you should always create a small ZFS filesystem within a pool, and give it some reserved
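A minimal sketch of that reserved-space trick, with hypothetical names and sizes:

  # Reserve a cushion up front:
  zfs create -o reservation=1G tank/reserve
  # If the pool ever fills to 100%, release the cushion to make room:
  zfs set reservation=none tank/reserve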

Re: [zfs-discuss] dedup and memory/l2arc requirements

2010-04-02 Thread Richard Elling
On Apr 2, 2010, at 2:03 PM, Miles Nordin wrote: re == Richard Elling richard.ell...@gmail.com writes: re # ptime zdb -S zwimming Simulated DDT histogram: re refcnt blocks LSIZE PSIZE DSIZE blocks LSIZE PSIZE DSIZE re Total 2.63M 277G 218G 225G

Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Richard Elling
On Apr 2, 2010, at 2:29 PM, Edward Ned Harvey wrote: I’ve also heard that the risk of unexpected failure of your pool is higher if/when you reach 100% capacity. I’ve heard that you should always create a small ZFS filesystem within a pool, and give it some reserved space, along with the

Re: [zfs-discuss] Sun Flash Accelerator F20 numbers

2010-04-02 Thread Al Hopper
Hi Jeroen, Have you tried the DDRdrive from Christopher George cgeo...@ddrdrive.com? It looks to me like a much better fit for your application than the F20. It would not hurt to check it out. Looks to me like you need a product with low *latency* - and a RAM-based cache would be a much better