Re: [zfs-discuss] zfs backup and restore
A good place to start is: http://www.opensolaris.org/os/community/zfs/ Have a look at: http://www.opensolaris.org/os/community/zfs/docs/ as well as http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide# Create some files, which you can use as disks within ZFS, and demo to your customer precisely what happens on a small scale using snapshots, clones and promotions (see the sketch below). Cheers On 5/25/07, Roshan Perera [EMAIL PROTECTED] wrote: Hi, I believe Solaris 10 update 3 supports zfs backup and restore. How can I upgrade previous versions of Solaris to run zfs backup/restore, and where can I download the relevant versions? Also, I have a customer wanting to know (now I am interested too) in detail how ZFS snapshots and cloning work, mostly to justify/explain the speed of cloning. Thanks Roshan ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
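For the small-scale demo, a pool built on plain files is enough to walk through snapshot, clone and promote; a minimal sketch (file names, sizes and the pool name are arbitrary):

  # mkfile 128m /var/tmp/disk1 /var/tmp/disk2
  # zpool create demopool mirror /var/tmp/disk1 /var/tmp/disk2     (pool on file-backed "disks")
  # zfs create demopool/data
  # cp /etc/hosts /demopool/data/
  # zfs snapshot demopool/data@before                              (point-in-time, read-only copy)
  # zfs clone demopool/data@before demopool/data-clone             (writable filesystem sharing blocks with the snapshot)
  # zfs promote demopool/data-clone                                (clone no longer depends on the origin filesystem)
  # zpool destroy demopool                                         (clean up after the demo)

The speed of cloning follows from copy-on-write: snapshot and clone only record references to existing blocks, and nothing is copied until data actually diverges.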
Re: [zfs-discuss] Preparing to compare Solaris/ZFS and FreeBSD/ZFS performance.
I have just (re)installed FreeBSD amd64 current with gcc 4.2, with src from May 21st, on a dual Dell PE 2850. Does the post-gcc-4.2 current include all your ZFS optimizations? I have commented out INVARIANTS, INVARIANTS_SUPPORT, WITNESS and WITNESS_SKIPSPIN in my kernel and recompiled with CPUTYPE=nocona. A default Solaris install fares better I/O-wise than a default FreeBSD install: writes could pass 100 MB/s on Solaris (zpool iostat 1), while FreeBSD would write 30-40 MB/s. After adding the following to /boot/loader.conf, writes peak at 90-95 MB/s: vm.kmem_size_max=2147483648 vfs.zfs.arc_max=1610612736 Now FreeBSD seems to perform almost as well as Solaris I/O-wise, although I don't have any numbers to justify my statement; for one thing, I did not import postgresql on Solaris. This patch also improves concurrency in VFS: http://people.freebsd.org/~pjd/patches/vfs_shared.patch I applied the patch and it seems to speed up my reads and writes. Watching zpool iostat I saw reads at 155 MB/s and writes at 111 MB/s. But it also seems to introduce some complete stalls accessing the zpool, lasting 10-20 seconds. I apologize for not being very specific, but I only had time to test disk I/O, not to dig into the issues. When you want to operate on mmap(2)ed files, you should disable the ZIL and remount the file systems: # sysctl vfs.zfs.zil_disable=1 # zpool export name # zpool import name Won't disabling the ZIL reduce the chance of a consistent ZFS filesystem if - for some reason - the server did an unplanned reboot? -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.
Depends on the guarantees. Some RAID systems have built-in block checksumming. But we all know that block checksums stored with the blocks do not catch a number of common errors (ghost writes, misdirected writes, misdirected reads). Casper ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Re: Need guidance on RAID 5, ZFS, and RAIDZ on home file server
[EMAIL PROTECTED] wrote: IRIX was much earlier than Solaris; Solaris was pretty late in the 64-bit game with Solaris 7. And Alpha did not have a real 64-bit port, as they implemented ILP64. With ILP64 your application does not really notice that it runs in 64 bits if you only use sizeof(). ILP64? AFAIK, Alpha had int as a 32-bit type and L and P as 64-bit types; even ILP64 would be a proper 64-bit OS, if a tad difficult to port some code to. That's why time_t was a 32-bit value (oops). Oops, you are right :-) Is it possible that I confused this with Linux on Alpha? GCC was not 64-bit clean until GCC 3.x. If you compiled GCC 2.x, you got more than a few warnings for bad printf format strings, and people have been very upset about not being able to use gcc to compile 64-bit SPARC binaries. Jörg -- EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin [EMAIL PROTECTED](uni) [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?
Hi, I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The latter is more important for me. And yes, I'm looking forward to both being integrated with each other. Meanwhile, what is the best way to upgrade a post-b61 system that is booted from ZFS? I'm thinking: 1. Boot from ZFS 2. Use Tim's excellent multiple boot datasets script to create a new cloned ZFS boot environment: http://blogs.sun.com/timf/entry/an_easy_way_to_manage 3. Loopback mount the new OS ISO image 4. Run the installer from the loopbacked ISO image in upgrade mode on the clone 5. Mark the clone to be booted the next time 6. Reboot into the upgraded OS. Questions: - How exactly do I do step 4? Before, luupgrade did everything for me, now what manpage do I need to do this? - Did I forget something above? I'm ok with losing some logfiles and stuff that maybe changed between the clone and the reboot, but is there anything else? - Did someone already blog about this and I haven't noticed yet? Cheers, Constantin -- Constantin GonzalezSun Microsystems GmbH, Germany Platform Technology Group, Global Systems Engineering http://www.sun.de/ Tel.: +49 89/4 60 08-25 91 http://blogs.sun.com/constantin/ Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten Amtsgericht Muenchen: HRB 161028 Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer Vorsitzender des Aufsichtsrates: Martin Haering ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
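For step 3, the loopback mount itself is the easy part; a sketch (the ISO path, lofi device and mount point below are made-up examples, and this does not answer the step-4 question of how to drive the installer in upgrade mode):

  # lofiadm -a /export/isos/sol-nv-b64-x86-dvd.iso     (prints the lofi device, e.g. /dev/lofi/1)
  # mkdir -p /mnt/iso
  # mount -F hsfs -o ro /dev/lofi/1 /mnt/iso
  ... run the upgrade against the cloned boot environment (step 4) ...
  # umount /mnt/iso
  # lofiadm -d /dev/lofi/1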
[zfs-discuss] Error zfs creating new zone
Hello, I got an error when creating a new zone on ZFS. I'm using Solaris 10 11/06 (118855-36); I have several other machines with identical hardware, but only this one shows this behaviour. [EMAIL PROTECTED]:/] # fmdump -V TIME UUID SUNW-MSG-ID May 23 17:39:43.4886 b3c6c2b6-f41a-eede-dce4-dd42e5c5424a ZFS-8000-CS nvlist version: 0 version = 0x0 class = list.suspect uuid = b3c6c2b6-f41a-eede-dce4-dd42e5c5424a code = ZFS-8000-CS diag-time = 1179952783 488616 de = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = fmd authority = (embedded nvlist) nvlist version: 0 version = 0x0 product-id = LX200 chassis-id = COWYRR10RR0007 server-id = server (end authority) mod-name = zfs-diagnosis mod-version = 1.0 (end de) fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = fault.fs.zfs.pool certainty = 0x64 asru = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x9c85a50f25483bc6 (end asru) resource = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x9c85a50f25483bc6 (end resource) (end fault-list[0]) fault-status = 0x1 __ttl = 0x1 __tod = 0x4654a68f 0x1d20ae28 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.
On 25-May-07, at 1:22 AM, Torrey McMahon wrote: Toby Thain wrote: On 22-May-07, at 11:01 AM, Louwtjie Burger wrote: On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote: What if your HW-RAID-controller dies? in say 2 years or more.. What will read your disks as a configured RAID? Do you know how to (re)configure the controller or restore the config without destroying your data? Do you know for sure that a spare-part and firmware will be identical, or at least compatible? How good is your service subscription? Maybe only scrapyards and museums will have what you had. =o Be careful when talking about RAID controllers in general. They are not created equal! ... Hardware raid controllers have done the job for many years ... Not quite the same job as ZFS, which offers integrity guarantees that RAID subsystems cannot. Depend on the guarantees. Some RAID systems have built in block checksumming. Which still isn't the same. Sigh. --T___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.
Toby Thain wrote: On 25-May-07, at 1:22 AM, Torrey McMahon wrote: Toby Thain wrote: On 22-May-07, at 11:01 AM, Louwtjie Burger wrote: On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote: What if your HW-RAID-controller dies? in say 2 years or more.. What will read your disks as a configured RAID? Do you know how to (re)configure the controller or restore the config without destroying your data? Do you know for sure that a spare-part and firmware will be identical, or at least compatible? How good is your service subscription? Maybe only scrapyards and museums will have what you had. =o Be careful when talking about RAID controllers in general. They are not created equal! ... Hardware raid controllers have done the job for many years ... Not quite the same job as ZFS, which offers integrity guarantees that RAID subsystems cannot. Depend on the guarantees. Some RAID systems have built in block checksumming. Which still isn't the same. Sigh. Yep.you get what you pay for. Funny how ZFS is free to purchase isn't it? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?
I'm actually wondering the same thing because I have b62 w/ the ZFS bits; but need the snapshot's -r functionality. Malachi On 5/25/07, Constantin Gonzalez [EMAIL PROTECTED] wrote: Hi, I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The latter is more important for me. And yes, I'm looking forward to both being integrated with each other. Meanwhile, what is the best way to upgrade a post-b61 system that is booted from ZFS? I'm thinking: 1. Boot from ZFS 2. Use Tim's excellent multiple boot datasets script to create a new cloned ZFS boot environment: http://blogs.sun.com/timf/entry/an_easy_way_to_manage 3. Loopback mount the new OS ISO image 4. Run the installer from the loopbacked ISO image in upgrade mode on the clone 5. Mark the clone to be booted the next time 6. Reboot into the upgraded OS. Questions: - How exactly do I do step 4? Before, luupgrade did everything for me, now what manpage do I need to do this? - Did I forget something above? I'm ok with losing some logfiles and stuff that maybe changed between the clone and the reboot, but is there anything else? - Did someone already blog about this and I haven't noticed yet? Cheers, Constantin -- Constantin GonzalezSun Microsystems GmbH, Germany Platform Technology Group, Global Systems Engineering http://www.sun.de/ Tel.: +49 89/4 60 08-25 91 http://blogs.sun.com/constantin/ Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten Amtsgericht Muenchen: HRB 161028 Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer Vorsitzender des Aufsichtsrates: Martin Haering ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?
Hi, Our upgrade story isn't great right now. In the meantime, you might check out Tim Haley's blog entry on using bfu with zfs root. thanks. But doesn't live upgrade just start the installer from the new OS DVD with the right options? Can't I just do that too? Cheers, Constantin http://blogs.sun.com/timh/entry/friday_fun_with_bfu_and lori Constantin Gonzalez wrote: Hi, I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The latter is more important for me. And yes, I'm looking forward to both being integrated with each other. Meanwhile, what is the best way to upgrade a post-b61 system that is booted from ZFS? I'm thinking: 1. Boot from ZFS 2. Use Tim's excellent multiple boot datasets script to create a new cloned ZFS boot environment: http://blogs.sun.com/timf/entry/an_easy_way_to_manage 3. Loopback mount the new OS ISO image 4. Run the installer from the loopbacked ISO image in upgrade mode on the clone 5. Mark the clone to be booted the next time 6. Reboot into the upgraded OS. Questions: - How exactly do I do step 4? Before, luupgrade did everything for me, now what manpage do I need to do this? - Did I forget something above? I'm ok with losing some logfiles and stuff that maybe changed between the clone and the reboot, but is there anything else? - Did someone already blog about this and I haven't noticed yet? Cheers, Constantin -- Constantin GonzalezSun Microsystems GmbH, Germany Platform Technology Group, Global Systems Engineering http://www.sun.de/ Tel.: +49 89/4 60 08-25 91 http://blogs.sun.com/constantin/ Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten Amtsgericht Muenchen: HRB 161028 Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer Vorsitzender des Aufsichtsrates: Martin Haering ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?
Constantin Gonzalez wrote: Hi, Our upgrade story isn't great right now. In the meantime, you might check out Tim Haley's blog entry on using bfu with zfs root. thanks. But doesn't live upgrade just start the installer from the new OS DVD with the right options? Can't I just do that too? I'll look at it and see if I can give you a better recommendation. I don't want to give you bad advice. I might have more information later today. Lori Cheers, Constantin http://blogs.sun.com/timh/entry/friday_fun_with_bfu_and lori Constantin Gonzalez wrote: Hi, I'm a big fan of live upgrade. I'm also a big fan of ZFS boot. The latter is more important for me. And yes, I'm looking forward to both being integrated with each other. Meanwhile, what is the best way to upgrade a post-b61 system that is booted from ZFS? I'm thinking: 1. Boot from ZFS 2. Use Tim's excellent multiple boot datasets script to create a new cloned ZFS boot environment: http://blogs.sun.com/timf/entry/an_easy_way_to_manage 3. Loopback mount the new OS ISO image 4. Run the installer from the loopbacked ISO image in upgrade mode on the clone 5. Mark the clone to be booted the next time 6. Reboot into the upgraded OS. Questions: - How exactly do I do step 4? Before, luupgrade did everything for me, now what manpage do I need to do this? - Did I forget something above? I'm ok with losing some logfiles and stuff that maybe changed between the clone and the reboot, but is there anything else? - Did someone already blog about this and I haven't noticed yet? Cheers, Constantin ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?
Hi Malachi, Malachi de Ælfweald wrote: I'm actually wondering the same thing because I have b62 w/ the ZFS bits; but need the snapshot's -r functionality. You're lucky, it's already there. From my b62 machine's man zfs: zfs snapshot [-r] filesystem@snapname|volume@snapname Creates a snapshot with the given name. See the Snapshots section for details. -r Recursively create snapshots of all descendant datasets. Snapshots are taken atomically, so that all recursive snapshots correspond to the same moment in time. Or did you mean send -r? Best regards, Constantin -- Constantin Gonzalez, Sun Microsystems GmbH, Germany Platform Technology Group, Global Systems Engineering http://www.sun.de/ Tel.: +49 89/4 60 08-25 91 http://blogs.sun.com/constantin/ Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten Amtsgericht Muenchen: HRB 161028 Geschaeftsfuehrer: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer Vorsitzender des Aufsichtsrates: Martin Haering ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
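A concrete example of the recursive form (pool and snapshot names are made up):

  # zfs snapshot -r tank/home@nightly-20070525        (one atomic snapshot of tank/home and every descendant)
  # zfs list -t snapshot | grep nightly-20070525      (verify that all of them were created)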
Re: [zfs-discuss] Re: ZFS over a layered driver interface
Hi Shweta; the first thing is to look for all kernel functions that return that errno (25, I think) during your test: dtrace -n 'fbt:::return/arg1 == 25/{@a[probefunc] = count()}' More verbose, but also useful: dtrace -n 'fbt:::return/arg1 == 25/{@a[stack(20)] = count()}' It's a catch-all, but often points me in the right direction. -r On May 19, 2007 at 00:24, Shweta Krishnan wrote: I explored this a bit and found that the ldi_ioctl in my layered driver does fail, but fails because of an "inappropriate ioctl for device" error, which the underlying ramdisk driver's ioctl returns. So that doesn't seem to be an issue at all (since I know the storage pool creation is successful when I give the ramdisk directly as the target device). However, as I mentioned, even though reads and writes are getting invoked on the ramdisk through my layered driver, the storage pool creation still fails. Surprisingly, the layered driver's routines show no sign of error - the layered device gets closed successfully when the pool creation command returns. It is unclear to me what would be a good way to go about debugging this, since I'm not familiar with dtrace. I shall try and familiarize myself with dtrace, but even then, it seems like there are a large number of functions returning non-zero values, and it's confusing to me where to look for the error. Any pointers would be most welcome!! Thanks, Swetha. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?
On Fri, May 25, 2007 at 12:14:45AM -0400, Torrey McMahon wrote: Albert Chin wrote: On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote: I'm getting really poor write performance with ZFS on a RAID5 volume (5 disks) from a storagetek 6140 array. I've searched the web and these forums and it seems that this zfs_nocacheflush option is the solution, but I'm open to others as well. What type of poor performance? Is it because of ZFS? You can test this by creating a RAID-5 volume on the 6140, creating a UFS file system on it, and then comparing performance with what you get against ZFS. If it's ZFS then you might want to check into modifying the 6540 NVRAM as mentioned in this thread http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html there is a fix that doesn't involve modifying the NVRAM in the works. (I don't have an estimate.) The above URL helps only if you have Santricity. -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?
Im using: zfs set:zil_disable 1 On my se6130 with zfs, accessed by NFS and writing performance almost doubled. Since you have BBC, why not just set that? -Andy On 5/24/07 4:16 PM, Albert Chin [EMAIL PROTECTED] wrote: On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote: I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit and in [b]/etc/system[/b] I put: [b]set zfs:zfs_nocacheflush = 1[/b] And after rebooting, I get the message: [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module[/b] So is this variable not available in the Solaris kernel? I think zfs:zfs_nocacheflush is only available in Nevada. I'm getting really poor write performance with ZFS on a RAID5 volume (5 disks) from a storagetek 6140 array. I've searched the web and these forums and it seems that this zfs_nocacheflush option is the solution, but I'm open to others as well. What type of poor performance? Is it because of ZFS? You can test this by creating a RAID-5 volume on the 6140, creating a UFS file system on it, and then comparing performance with what you get against ZFS. It would also be worthwhile doing something like the following to determine the max throughput the H/W RAID is giving you: # time dd of=raw disk if=/dev/zero bs=1048576 count=1000 For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip. -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
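For reference, zil_disable is a kernel tunable rather than a zfs(1M) subcommand; a hedged sketch of the usual ways to set it on Solaris 10/Nevada of this vintage (read the rest of this thread before using it):

  * in /etc/system (takes effect at the next reboot):
      set zfs:zil_disable = 1
  * or on a running system, via mdb (assumes the zfs module's zil_disable global variable):
      echo "zil_disable/W0t1" | mdb -kw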
[zfs-discuss] Re: ZFS over a layered driver interface
Thanks to everyone for their help! yes dtrace did help and I found that in my layered driver, the prop_op entry point had an error in setting the [Ss]ize dynamic property, and apparently that's what ZFS looks for, not just Nblocks! what took me so long in getting to this error was that the driver was faulting not in the beginning but after some reads and writes (basically when the offset exceeded the size, it gave rise to the EINVAL), and that too within zio_wait(), which confused it with a synchronization problem. With that fixed, the layered driver works fine when I try to create a storage pool with it. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?
On May 22, 2007 at 01:11, Nicolas Williams wrote: On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote: But still, how is tar/SSH any more multi-threaded than tar/NFS? It's not that it is, but that NFS sync semantics and ZFS sync semantics conspire against single-threaded performance. Hi Nic, I don't agree with the blanket statement, so to clarify: there are 2 independent things at play here. a) NFS sync semantics conspire against single-thread performance with any backend filesystem. However, NVRAM normally offers some relief of the issue. b) ZFS sync semantics, along with the storage software plus the imprecise protocol in between, conspire against ZFS performance for some workloads on NVRAM-backed storage, NFS being one of the affected workloads. The conjunction of the 2 causes worse than expected NFS performance over a ZFS backend running __on NVRAM-backed storage__. If you are not considering NVRAM storage, then I know of no ZFS/NFS specific problems. Issue b) is being dealt with by both Solaris and storage vendors (we need a refined protocol); issue a) is not related to ZFS and is rather a fundamental NFS issue. Maybe a future NFS protocol will help. Net net: if one finds a way to 'disable cache flushing' on the storage side, then one reaches the state we'll be in, out of the box, when b) is implemented by Solaris _and_ the storage vendor. At that point, ZFS becomes a fine NFS server not only on JBOD, as it is today, but also on NVRAM-backed storage. It's complex enough, I thought it was worth repeating. -r ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs root: legacy mount or not?
We've been kicking around the question of whether or not zfs root mounts should appear in /etc/vfstab (i.e., be legacy mount) or use the new zfs approach to mounts. Instead of writing up the issues again, here's a blog entry that I just posted on the subject: http://blogs.sun.com/lalt/date/20070525 Weigh in if you care. Lori ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
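To make the trade-off concrete, the two styles look roughly like this for a hypothetical root dataset rootpool/rootfs (a sketch of the mechanics, not the proposal itself):

  Legacy mount:  zfs set mountpoint=legacy rootpool/rootfs
                 plus an /etc/vfstab line:  rootpool/rootfs  -  /  zfs  -  no  -
  ZFS-managed:   zfs set mountpoint=/ rootpool/rootfs
                 with no vfstab entry; the mount is driven entirely by the dataset property.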
Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?
Le 22 mai 07 à 01:21, Albert Chin a écrit : On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote: On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote: But still, how is tar/SSH any more multi-threaded than tar/NFS? It's not that it is, but that NFS sync semantics and ZFS sync semantics conspire against single-threaded performance. What's why we have set zfs:zfs_nocacheflush = 1 in /etc/system. But, that's only helps ZFS. Is there something similar for NFS? With this set, we also reach a state where the NFS/ZFS/NVRAM works as it should. So it should speed things up. The problem is : Once it starts to go in /etc/system it will spread. Customers with no NVRAM storage will use it and some will experience pool corruption. -r -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?
On Fri, May 25, 2007 at 12:01:45PM -0400, Andy Lubel wrote: Im using: zfs set:zil_disable 1 On my se6130 with zfs, accessed by NFS and writing performance almost doubled. Since you have BBC, why not just set that? I don't think it's enough to have BBC to justify zil_disable=1. Besides, I don't know anyone from Sun recommending zil_disable=1. If your storage array has BBC, it doesn't matter. What matters is what happens when ZIL isn't flushed and your file server crashes (ZFS file system is still consistent but you'll lose some info that hasn't been flushed by ZIL). Even having your file server on a UPS won't help here. http://blogs.sun.com/erickustarz/entry/zil_disable discusses some of the issues affecting zil_disable=1. We know we get better performance with zil_disable=1 but we're not taking any chances. -Andy On 5/24/07 4:16 PM, Albert Chin [EMAIL PROTECTED] wrote: On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote: I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit and in [b]/etc/system[/b] I put: [b]set zfs:zfs_nocacheflush = 1[/b] And after rebooting, I get the message: [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module[/b] So is this variable not available in the Solaris kernel? I think zfs:zfs_nocacheflush is only available in Nevada. I'm getting really poor write performance with ZFS on a RAID5 volume (5 disks) from a storagetek 6140 array. I've searched the web and these forums and it seems that this zfs_nocacheflush option is the solution, but I'm open to others as well. What type of poor performance? Is it because of ZFS? You can test this by creating a RAID-5 volume on the 6140, creating a UFS file system on it, and then comparing performance with what you get against ZFS. It would also be worthwhile doing something like the following to determine the max throughput the H/W RAID is giving you: # time dd of=raw disk if=/dev/zero bs=1048576 count=1000 For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip. -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?
Le 22 mai 07 à 03:18, Frank Cusack a écrit : On May 21, 2007 6:30:42 PM -0500 Nicolas Williams [EMAIL PROTECTED] wrote: On Mon, May 21, 2007 at 06:21:40PM -0500, Albert Chin wrote: On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote: On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote: But still, how is tar/SSH any more multi-threaded than tar/NFS? It's not that it is, but that NFS sync semantics and ZFS sync semantics conspire against single-threaded performance. What's why we have set zfs:zfs_nocacheflush = 1 in /etc/system. But, that's only helps ZFS. Is there something similar for NFS? NFS's semantics for open() and friends is that they are synchronous, whereas POSIX's semantics are that they are not. You're paying for a sync() after every open. nocto? I think it's after every client close. But on the server side, there are lots of operations that also requires a commit. So nocto is not the silver bullet. -r ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
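For anyone wanting to experiment with the option mentioned above: nocto is a client-side NFS mount option that relaxes close-to-open consistency checking, e.g. (server name and paths are hypothetical):

  # mount -F nfs -o vers=3,nocto server:/export/build /mnt/build

It only helps when no other client is updating the same files, and as noted it is not a silver bullet for the server-side commits.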
Re: [zfs-discuss] Preparing to compare Solaris/ZFS and FreeBSD/ZFS performance.
Won't disabling ZIL minimize the chance of a consistent zfs-filesystem if - for some reason - the server did an unplanned reboot? The ZIL in ZFS is only used to speed up various workloads; it has nothing to do with file system consistency. ZFS is always consistent on disk whether you use the ZIL or not. But it can cause NFS client corruption, and you no longer get synchronous write semantics (check whether your app depends on that): http://blogs.sun.com/erickustarz/entry/zil_disable I highly recommend *against* setting zil_disable. Disabling the ZIL did improve the postgresql import from 1 h 45 min to 1 h 35 min. I get a 10 min speedup, but as the link points out, disabling has its disadvantages. So I'll revert to the old setting. -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?
On May 22, 2007 at 16:23, Dick Davies wrote: allyourbase Take off every ZIL! http://number9.hellooperator.net/articles/2007/02/12/zil-communication /allyourbase That causes not only NFS client corruption but also database corruption, and corruption of just about anything that carefully manages data. Yes, the zpool will survive, but it may be the only thing that does. So please don't do this. -r On 22/05/07, Albert Chin [EMAIL PROTECTED] wrote: On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote: On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote: But still, how is tar/SSH any more multi-threaded than tar/NFS? It's not that it is, but that NFS sync semantics and ZFS sync semantics conspire against single-threaded performance. That's why we have set zfs:zfs_nocacheflush = 1 in /etc/system. But that only helps ZFS. Is there something similar for NFS? -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Rasputin :: Jack of All Trades - Master of Nuns http://number9.hellooperator.net/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?
It would also be worthwhile doing something like the following to determine the max throughput the H/W RAID is giving you: # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000 For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe. -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Well, the Solaris kernel is telling me that it doesn't understand zfs_nocacheflush, but the array sure is acting like it! I ran the dd example, but increased the count for a longer running time. 5-disk RAID5 with UFS: ~79 MB/s 5-disk RAID5 with ZFS: ~470 MB/s I'm assuming there's some caching going on with ZFS that's really helping out? Also, no Santricity, just Sun's Common Array Manager. Is it possible to use both without completely confusing the array? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?
On Fri, May 25, 2007 at 09:54:04AM -0700, Grant Kelly wrote: It would also be worthwhile doing something like the following to determine the max throughput the H/W RAID is giving you: # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000 For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe. Well, the Solaris kernel is telling me that it doesn't understand zfs_nocacheflush, but the array sure is acting like it! I ran the dd example, but increased the count for a longer running time. I don't think a longer running time is going to give you a more accurate measurement. 5-disk RAID5 with UFS: ~79 MB/s What about against a raw RAID-5 device? 5-disk RAID5 with ZFS: ~470 MB/s I don't think you want to use if=/dev/zero on ZFS. There's probably some optimization going on. Better to use /dev/urandom or concatenate n-many files comprised of random bits. I'm assuming there's some caching going on with ZFS that's really helping out? Yes. Also, no Santricity, just Sun's Common Array Manager. Is it possible to use both without completely confusing the array? I think both are ok. CAM is free. Dunno about Santricity. -- albert chin ([EMAIL PROTECTED]) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
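One way to take /dev/zero out of the picture is to stage a file of random bits once and reuse it for the write test; a sketch (sizes and paths are arbitrary):

  # dd if=/dev/urandom of=/var/tmp/random.dat bs=1048576 count=1000        (build 1 GB of random data once; urandom is slow)
  # time dd if=/var/tmp/random.dat of=/testpool/fs/random.out bs=1048576   (time writes of that data into the filesystem under test)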
Re: [zfs-discuss] ZFS boot: Now, how can I do a pseudo live upgrade?
Malachi de Ælfweald wrote: No, I did mean 'snapshot -r' but I thought someone on the list said that the '-r' wouldn't work until b63... hmmm... 'snapshot -r' is available before b62, however, '-r' may run into a stack overflow (bug 6533813) which is fixed in b63. Lin ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?
On May 25, 2007, at 11:22 AM, Roch Bourbonnais wrote: Le 22 mai 07 à 01:11, Nicolas Williams a écrit : On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote: But still, how is tar/SSH any more multi-threaded than tar/NFS? It's not that it is, but that NFS sync semantics and ZFS sync semantics conspire against single-threaded performance. Hi Nic, I don't agree with the blanket statement. So to clarify. There are 2 independant things at play here. a) NFS sync semantics conspire againts single thread performance with any backend filesystem. However NVRAM normally offers some releaf of the issue. b) ZFS sync semantics along with the Storage Software + imprecise protocol in between, conspire againts ZFS performance of some workloads on NVRAM backed storage. NFS being one of the affected workloads. The conjunction of the 2 causes worst than expected NFS perfomance over ZFS backend running __on NVRAM back storage__. If you are not considering NVRAM storage, then I know of no ZFS/NFS specific problems. Issue b) is being delt with, by both Solaris and Storage Vendors (we need a refined protocol); Issue a) is not related to ZFS and rather fundamental NFS issue. Maybe future NFS protocol will help. Net net; if one finds a way to 'disable cache flushing' on the storage side, then one reaches the state we'll be, out of the box, when b) is implemented by Solaris _and_ Storage vendor. At that point, ZFS becomes a fine NFS server not only on JBOD as it is today , both also on NVRAM backed storage. I will add a third category, response time of individual requests. One can think of the ssh stream of filesystem data as one large remote procedure call that says put this directory tree and contents on the server. The time it takes is essentially the time it takes to transfer the filesystem data. The latency on the very last of the request, amortized across the entire stream is zero. For the NFS client, there is response time injected at each request and the best way to amortize this is through parallelism and that is very difficult for some applications. Add the items in a) and b) and there is a lot to deal with. Not insurmountable but it takes a little more effort to build an effective solution. Spencer ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?
Albert Chin wrote: I don't think you want to if=/dev/zero on ZFS. There's probably some optimization going on. Better to use /dev/urandom or concat n-many files comprised of random bits. Unless you have turned on compression, that is not the case. By default there is no optimization for all zeros. --matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: zfs root: legacy mount or not?
We've been kicking around the question of whether or not zfs root mounts should appear in /etc/vfstab (i.e., be legacy mount) or use the new zfs approach to mounts. Instead of writing up the issues again, here's a blog entry that I just posted on the subject: http://blogs.sun.com/lalt/date/20070525 Weigh in if you care. Interesting. Is there an ARC case that is related to some of these issues? ---Bob This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: zfs root: legacy mount or not?
Bob Palowoda wrote: We've been kicking around the question of whether or not zfs root mounts should appear in /etc/vfstab (i.e., be legacy mount) or use the new zfs approach to mounts. Instead of writing up the issues again, here's a blog entry that I just posted on the subject: http://blogs.sun.com/lalt/date/20070525 Weigh in if you care. Interesting. Is there an ARC case that is related to some of these issues? The ARC case for using zfs as a root file system is PSARC/2006/370, but there isn't much there yet. I'm preparing the documents for the case and this is one of the issues I wanted to get some feedback on from the external community before I make a proposal for what to do. I don't know of any other ARC cases that would be relevant. I'm not sure how old the getvfsent interface is. If that interface got ARC'd, some of the documents for it might be relevant. I'll check it out. Lori ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS boot and SXCE build 64a
Hi Lori, Are there any changes to build 64a that will affect ZFS bootability? Will the conversion script for build 62 still do its magic? Thanks, Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007 http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, May 25, 2007 at 02:50:15PM -0500, Al Hopper wrote: On Fri, 25 May 2007, Lori Alt wrote: We've been kicking around the question of whether or not zfs root mounts should appear in /etc/vfstab (i.e., be legacy mount) or use the new zfs approach to mounts. Instead of writing up the issues again, here's a blog entry that I just posted on the subject: http://blogs.sun.com/lalt/date/20070525 Weigh in if you care. ZFS is a paradigm shift and Nevada has not been released. Therefore I vote for implementing it the ZFS way - going forward. Place the burden on the other developers to fix their bugs. I second Al's point. In fact, I couldn't have said it better myself. :) -brian -- Perl can be fast and elegant as much as J2EE can be fast and elegant. In the hands of a skilled artisan, it can and does happen; it's just that most of the shit out there is built by people who'd be better suited to making sure that my burger is cooked thoroughly. -- Jonathan Patschke ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, 2007-05-25 at 10:20 -0600, Lori Alt wrote: We've been kicking around the question of whether or not zfs root mounts should appear in /etc/vfstab (i.e., be legacy mount) or use the new zfs approach to mounts. Instead of writing up the issues again, here's a blog entry that I just posted on the subject: http://blogs.sun.com/lalt/date/20070525 Weigh in if you care. IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab, but (this is something of a digression based on discussion kicked up by PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be mounted by mountall and mount -a rather than via a special-case invocation of zfs mount at the end of the fs-local method script. in other words: teach mount how to find the list of filesystems in attached pools and mix them in to the dependency graph it builds to mount filesystems in the right order, rather than mounting everything-but-zfs first and then zfs later. - Bill ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS boot and SXCE build 64a
Build 64a has bug 6553537 (zfs root fails to boot from a snv_63+zfsboot-pfinstall netinstall image), for which I don't have a ready workaround. So I recommend waiting for build 65 (which should be out soon, I think). Lori Al Hopper wrote: Hi Lori, Are there any changes to build 64a that will affect ZFS bootability? Will the conversion script for build 62 still do its magic? Thanks, Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007 http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Strange behaviour with sharenfs
Prior to rebooting my system (S10U2) yesterday, I had half a dozen ZFS shares active... Today, now that I look at this, I find that only 1 of them is being exported through NFS. # zfs list -o name,sharenfs NAME SHARENFS biscuit off biscuit/crashes off biscuit/data off biscuit/foo off biscuit/home off biscuit/on10u3 off biscuit/on10u4 on biscuit/onnv yes biscuit/[EMAIL PROTECTED] - biscuit/[EMAIL PROTECTED] - biscuit/[EMAIL PROTECTED] - biscuit/onnv_6538379 on biscuit/onnv_6544307 on biscuit/pfh-clone off biscuit/pfhs10u4 off biscuit/queue_t off biscuit/refactor off biscuit/[EMAIL PROTECTED] - biscuit/s10u4fix on biscuit/stc2-hooks on mintslice ~# showmount -e export list for mintslice: /biscuit/on10u4 (everyone) mintslice ~# Is this a known problem, fixed already? Darren ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
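A quick thing to try while hunting the cause is to ask ZFS to re-share everything whose sharenfs property requests it (dataset names taken from the listing above):

  # zfs share -a
  # showmount -e
  # zfs share biscuit/onnv        (or poke a single dataset)

If the shares come back, the data and properties are intact and the question becomes why the datasets were not shared at boot.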
Re: [zfs-discuss] zfs root: legacy mount or not?
Bill Sommerfeld wrote: On Fri, 2007-05-25 at 10:20 -0600, Lori Alt wrote: We've been kicking around the question of whether or not zfs root mounts should appear in /etc/vfstab (i.e., be legacy mount) or use the new zfs approach to mounts. Instead of writing up the issues again, here's a blog entry that I just posted on the subject: http://blogs.sun.com/lalt/date/20070525 Weigh in if you care. IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab, but (this is something of a digression based on discussion kicked up by PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be mounted by mountall and mount -a rather than via a special-case invocation of zfs mount at the end of the fs-local method script. in other words: teach mount how to find the list of filesystems in attached pools and mix them in to the dependency graph it builds to mount filesystems in the right order, rather than mounting everything-but-zfs first and then zfs later. I agree with this. This seems like a necessary response to both PSARC/2007/297 and also necessary for eliminating legacy mounts for zfs root file systems. The problem of the interaction between legacy and non-legacy mounts will just get worse once we are using non-legacy mounts for the file systems in the BE. Lori ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote: Bill Sommerfeld wrote: IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab, but (this is something of a digression based on discussion kicked up by PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be mounted by mountall and mount -a rather than via a special-case invocation of zfs mount at the end of the fs-local method script. in other words: teach mount how to find the list of filesystems in attached pools and mix them in to the dependency graph it builds to mount filesystems in the right order, rather than mounting everything-but-zfs first and then zfs later. I agree with this. This seems like a necessary response to both PSARC/2007/297 and also necessary for eliminating legacy mounts for zfs root file systems. The problem of the interaction between legacy and non-legacy mounts will just get worse once we are using non-legacy mounts for the file systems in the BE. Could we also look into why system-console insists on waiting for ALL the zfs mounts to be available? Shouldn't the main file system food groups be mounted and then allow console-login (much like single user or safe-mode)? Would help in many cases where an admin needs to work on a system but doesn't need, say 20k users home directories mounted, to do this work. Lori ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Mike Dotson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Re: Need guidance on RAID 5, ZFS, and RAIDZ on home file server
On 5/24/07, Tom Buskey [EMAIL PROTECTED] wrote: Linux and Windows as well as the BSDs) are all relative newcomers to the 64-bit arena. The 2nd non-x86 port of Linux was to the Alpha in 1999 (98?) by Linus no less. In 1994 to be precise. In 1999 Linux 2.2 got released, which supported few more 64 bit platforms. -- Tomasz Torcz [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
Mike Dotson wrote: On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote: Bill Sommerfeld wrote: IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab, but (this is something of a digression based on discussion kicked up by PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be mounted by mountall and mount -a rather than via a special-case invocation of zfs mount at the end of the fs-local method script. in other words: teach mount how to find the list of filesystems in attached pools and mix them in to the dependency graph it builds to mount filesystems in the right order, rather than mounting everything-but-zfs first and then zfs later. I agree with this. This seems like a necessary response to both PSARC/2007/297 and also necessary for eliminating legacy mounts for zfs root file systems. The problem of the interaction between legacy and non-legacy mounts will just get worse once we are using non-legacy mounts for the file systems in the BE. Could we also look into why system-console insists on waiting for ALL the zfs mounts to be available? Shouldn't the main file system food groups be mounted and then allow console-login (much like single user or safe-mode)? Would help in many cases where an admin needs to work on a system but doesn't need, say 20k users home directories mounted, to do this work. So single-user mode is not sufficient for this? Lori ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote: Mike Dotson wrote: On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote: Would help in many cases where an admin needs to work on a system but doesn't need, say 20k users home directories mounted, to do this work. So single-user mode is not sufficient for this? Not all work needs to be done in single user:) And I wouldn't consider a 4+ hour boot time just for mounting file systems a good use of cpu time when an admin could be doing other things - preparation for next patching, configuring changes to webserver, etc. Or just monitoring the status of the file system mounts to give an update to management on how many file systems are mounted and how many are left. Point is, why is console-login dependent on *all* the file systems being mounted in *multiboot*. Does it really need to depend on *all* the file systems being mounted? Lori ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Thanks... Mike Dotson Area System Support Engineer - ACS West Phone: (503) 343-5157 [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote: Mike Dotson wrote: On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote: Would help in many cases where an admin needs to work on a system but doesn't need, say 20k users home directories mounted, to do this work. So single-user mode is not sufficient for this? Not all work needs to be done in single user:) And I wouldn't consider a 4+ hour boot time just for mounting file systems a good use of cpu time when an admin could be doing other things - preparation for next patching, configuring changes to webserver, etc. Or just monitoring the status of the file system mounts to give an update to management on how many file systems are mounted and how many are left. Point is, why is console-login dependent on *all* the file systems being mounted in *multiboot*. Does it really need to depend on *all* the file systems being mounted? Why do we need the filesystems mounted at all, ever, if they are not used? Mounts could be more magic than that. Casper ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, May 25, 2007 at 03:01:20PM -0700, Mike Dotson wrote: On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote: Mike Dotson wrote: On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote: Would help in many cases where an admin needs to work on a system but doesn't need, say 20k users home directories mounted, to do this work. So single-user mode is not sufficient for this? Not all work needs to be done in single user:) And I wouldn't consider a 4+ hour boot time just for mounting file systems a good use of cpu time when an admin could be doing other things - preparation for next patching, configuring changes to webserver, etc. Or just monitoring the status of the file system mounts to give an update to management on how many file systems are mounted and how many are left. Point is, why is console-login dependent on *all* the file systems being mounted in *multiboot*. Does it really need to depend on *all* the file systems being mounted? This has been discussed many times in smf-discuss, for all types of login. Basically, there is no way to say console login for root only. As long as any user can log in, we need to have all the filesystems mounted because we don't know what dependencies there may be. Simply changing the definition of console-login isn't a solution because it breaks existing assumptions and software. A much better option is the 'trigger mount' RFE that would allow ZFS to quickly 'mount' a filesystem but not pull all the necessary data off disk until it's first accessed. - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
Thru a sequence of good intentions, I find myself with a raidz'd pool that has a failed drive that I can't replace. We had a generous department donate a fully configured V440 for use as our departmental server. Of course, I installed SX/b56 on it, created a pool with 3x 148Gb drives and made a dozen filesystems on it. Life was good. ZFS is great! One of the raidz pool drives failed. When I went to replace it, I found that the V440's original 72Gb drives had been upgraded to Dell 148Gb Fujitsu drives, and the Sun versions of those drives (same model number...) had different firmware, and more importantly, FEWER sectors! They were only 147.8 Gb! You know what they say about a free lunch and too good to be true... This meant that the zpool replace of the drive failed because the replacement drive is too small. The question of the moment is: what to do? All I can think of is to attach/create a new pool that has enough space to hold the existing content, copy the content from the old to the new pool, destroy the old pool, recreate the old pool with the (slightly) smaller size, and copy the data back onto the pool. Given that there are a bunch of filesystems in the pool, each with some set of properties ..., what is the easiest way to move the data and metadata back and forth without losing anything, and without having to manually recreate the metainfo/properties? (Adding to the 'shrink' RFE: if I replace a pool drive with a smaller one, and the existing content is small enough to fit on a shrunk/resized pool, the zpool replace command should (after prompting) simply do the work. In this situation, losing less than 10Mb of pool space to get a healthy raidz configuration seems to be an easy tradeoff :-)) TIA, -John ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
Why not simply have a SMF sequence that does early in boot, after / and /usr are mounted: create /etc/nologin (contents=coming up, not ready yet) enable login later in boot, when user filesystems are all mounted: delete /etc/nologin Wouldn't this would give the desired behavior? -John Eric Schrock wrote: This has been discussed many times in smf-discuss, for all types of login. Basically, there is no way to say console login for root only. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
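A rough sketch of what such a pair of method scripts could look like (the service names, manifests and dependencies are entirely hypothetical):

  #!/bin/sh
  # block-logins: runs once / and /usr are mounted
  echo "system coming up, not ready yet" > /etc/nologin

  #!/bin/sh
  # open-logins: depends on filesystem/local, i.e. runs after all user filesystems are mounted
  rm -f /etc/nologin

One caveat: /etc/nologin does not stop root logins on the console, which is arguably the desired behavior here anyway.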
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, 2007-05-25 at 15:19 -0700, Eric Schrock wrote: This has been discussed many times in smf-discuss, for all types of login. Basically, there is no way to say console login for root only. As long as any user can log in, we need to have all the filesystems mounted because we don't know what dependencies there may be. Simply changing the definition of console-login isn't a solution because it breaks existing assumptions and software. devils_advocate So how are you guaranteeing NFS server and automount with autofs are up, running and working for the user for console-login. /devils_advocate I don't buy this argument and you don't have to say console-login for root only you just have to have console-login and the services available are minimal and may not include *all* services much like when a nfs server is down, etc. If the software depends on a file system or all the file systems to be mounted, it adds that as a dependency (filesystem/local). console-login does not require this - only non-root users. (I remember a smf config bug with apache not requiring filesystem/local and failing to start) What software is dependent on console-login? helios(3): svcs -D console-login STATE STIMEFMRI In fact the console-login depends on filesystem/minimal which to me means minimal file systems not all file systems and there is no software dependent on console-login - where's the disconnect? From what I see, problem is auditd is dependent on filesystem/local which is where we possibly have the hangup. A much better option is the 'trigger mount' RFE that would allow ZFS to quickly 'mount' a filesystem but not pull all the necessary data off disk until it's first accessed. Agreed but there's still the issue with console-login being dependent on all file systems instead of minimal file systems. - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock -- Mike Dotson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
I didn't mean to imply that it wasn't technically possible, only that there is no one-size-fits-all solution for OpenSolaris as a whole. Even getting this to work in an easily tunable form is quite tricky, since you must dynamically determine dependencies in the process (filesystem/minimal vs. filesystem/user). If someone wants to pursue this, I would suggest moving the discussion to smf-discuss. - Eric On Fri, May 25, 2007 at 03:32:52PM -0700, John Plocher wrote: Why not simply have an SMF sequence that does the following? Early in boot, after / and /usr are mounted: create /etc/nologin (contents = coming up, not ready yet), then enable login. Later in boot, when the user filesystems are all mounted: delete /etc/nologin. Wouldn't this give the desired behavior? -John Eric Schrock wrote: This has been discussed many times in smf-discuss, for all types of login. Basically, there is no way to say console login for root only. -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
Given that there are a bunch of filesystems in the pool, each with some set of properties ..., what is the easiest way to move the data and metadata back and forth without losing anything, and without having to manually recreate the metainfo/properties? AFAIK, your only choices are: A. Write/find a script to do the appropriate 'zfs send|recv' and 'zfs set' commands. B. Wait for us to implement 6421959 6421958 (zfs send -r / -p). I'm currently working on this, ETA at least a few months. Sorry, --matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
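In the meantime, a rough sketch of option A; the pool names tank and newpool are made up, it assumes no clones or nested surprises, and it ignores properties set on the pool-root filesystem itself:

    #!/bin/ksh
    # Rough sketch of option A: snapshot, send/recv, then re-apply any
    # locally-set properties.  Assumes source pool 'tank' and destination
    # pool 'newpool'; does not handle clones.
    for fs in $(zfs list -H -o name -r tank | tail +2); do
            dest=newpool/${fs#tank/}
            zfs snapshot "$fs@migrate"
            zfs send "$fs@migrate" | zfs recv "$dest"
            # copy across the properties that were set locally on the source
            zfs get -H -s local -o property,value all "$fs" |
            while read prop value; do
                    zfs set "$prop=$value" "$dest"
            done
    done

Running it the other way (newpool back onto the recreated tank) is the same loop with the names swapped.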
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, May 25, 2007 at 03:39:11PM -0700, Mike Dotson wrote: In fact, console-login depends on filesystem/minimal, which to me means minimal file systems, not all file systems, and there is no software dependent on console-login - so where's the disconnect? You're correct - I thought console-login depended on filesystem/local, not filesystem/minimal. ZFS filesystems are not mounted as part of filesystem/minimal, so remind me what the problem is? - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs root: legacy mount or not?
On Fri, 2007-05-25 at 15:46 -0700, Eric Schrock wrote: On Fri, May 25, 2007 at 03:39:11PM -0700, Mike Dotson wrote: In fact, console-login depends on filesystem/minimal, which to me means minimal file systems, not all file systems, and there is no software dependent on console-login - so where's the disconnect? You're correct - I thought console-login depended on filesystem/local, not filesystem/minimal. ZFS filesystems are not mounted as part of filesystem/minimal, so remind me what the problem is? Create 20k zfs file systems and reboot. Console login waits for all the zfs file systems to be mounted (on a fully loaded 880 you're looking at about 4 hours, so have some coffee ready). The *only* place I can see the filesystem/local dependency is in svc:/system/auditd:default; however, on my systems it's disabled. I haven't had a chance to really prune the dependency tree and find the disconnect, but once /, /var, /tmp and /usr are mounted, the conditions for console-login should be met. As you mentioned, the best solution for this number of filesystems in ZFS land is the *automount* fs option, where it mounts the filesystems as needed to reduce the *boot time*. - Eric -- Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock -- Thanks... Mike Dotson Area System Support Engineer - ACS West Phone: (503) 343-5157 [EMAIL PROTECTED] ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
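For anyone who wants to see the effect without a loaded 880, a rough sketch (the pool name is made up, and the count can be scaled down) is to create a pile of filesystems and then time a full remount, which is roughly what boot has to wait on:

    #!/bin/ksh
    # Sketch only: create a few thousand small filesystems in a hypothetical
    # pool called tank, then time the equivalent of the boot-time mount pass.
    i=0
    while [ $i -lt 5000 ]; do
            zfs create tank/stress$i
            i=$((i + 1))
    done
    zfs unmount -a          # unmount everything ZFS currently has mounted
    time zfs mount -a       # this is essentially what boot waits for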
Re: [zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
On 5/25/07, John Plocher [EMAIL PROTECTED] wrote: One of the raidz pool drives failed. When I went to replace it, I found that the V440's original 72Gb drives had been upgraded to Dell 148Gb Fujitsu drives, and the Sun versions of those drives (same model number...) had different firmware, and more importantly, FEWER sectors! They were only 147.8 Gb! You know what they say about a free lunch and too good to be true... What about buying a single larger drive? A 300 GB disk had better have at least 148 GB on it... It's a few hundred bucks extra, granted, but if you have to rent or buy enough space to back everything up it might be a tossup. Will ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
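(If that route is taken, the replace itself is the usual one-liner once the bigger disk is in the slot; the pool and device names below are made up:)

    zpool replace tank c1t3d0 c1t4d0   # swap the failed disk for the new, larger one
    zpool status -v tank               # keep an eye on the resilver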
[zfs-discuss] ZVol Panic on 62
May 25 23:32:59 summer unix: [ID 836849 kern.notice]
May 25 23:32:59 summer panic[cpu1]/thread=1bf2e740:
May 25 23:32:59 summer genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ff00232c3a80 addr=490 occurred in module unix due to a NULL pointer dereference
May 25 23:32:59 summer unix: [ID 10 kern.notice]
May 25 23:32:59 summer unix: [ID 839527 kern.notice] grep:
May 25 23:32:59 summer unix: [ID 753105 kern.notice] #pf Page fault
May 25 23:32:59 summer unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x490
May 25 23:32:59 summer unix: [ID 243837 kern.notice] pid=18425, pc=0xfb83b6bb, sp=0xff00232c3b78, eflags=0x10246
May 25 23:32:59 summer unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
May 25 23:32:59 summer unix: [ID 354241 kern.notice] cr2: 490 cr3: 1fce52000 cr8: c
May 25 23:32:59 summer unix: [ID 592667 kern.notice] rdi: 490 rsi: 0 rdx: 1bf2e740
May 25 23:32:59 summer unix: [ID 592667 kern.notice] rcx: 0 r8: d r9: 62ccc700
May 25 23:32:59 summer unix: [ID 592667 kern.notice] rax: 0 rbx: 0 rbp: ff00232c3bd0
May 25 23:32:59 summer unix: [ID 592667 kern.notice] r10: fc18 r11: 0 r12: 490
May 25 23:32:59 summer unix: [ID 592667 kern.notice] r13: 450 r14: 52e3aac0 r15: 0
May 25 23:32:59 summer unix: [ID 592667 kern.notice] fsb: 0 gsb: fffec3731800 ds: 4b
May 25 23:32:59 summer unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 1c3
May 25 23:33:00 summer unix: [ID 592667 kern.notice] trp: e err: 2 rip: fb83b6bb
May 25 23:33:00 summer unix: [ID 592667 kern.notice] cs: 30 rfl: 10246 rsp: ff00232c3b78
May 25 23:33:00 summer unix: [ID 266532 kern.notice] ss: 38
May 25 23:33:00 summer unix: [ID 10 kern.notice]
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3960 unix:die+c8 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3a70 unix:trap+135b ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3a80 unix:cmntrap+e9 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3bd0 unix:mutex_enter+b ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3c20 zfs:zvol_read+51 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3c50 genunix:cdev_read+3c ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3cd0 specfs:spec_read+276 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3d40 genunix:fop_read+3f ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3e90 genunix:read+288 ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3ec0 genunix:read32+1e ()
May 25 23:33:00 summer genunix: [ID 655072 kern.notice] ff00232c3f10 unix:brand_sys_syscall32+1a3 ()
May 25 23:33:00 summer unix: [ID 10 kern.notice]
May 25 23:33:00 summer genunix: [ID 672855 kern.notice] syncing file systems...
Does anyone have an idea of what bug this might be? Occurred on X86 B62. I'm not seeing any putbacks into 63 or bugs that seem to match. Any insight is appreciated. Cores are available. benr. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
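Since the cores are available, a first pass with mdb usually narrows these down; a sketch, assuming the dump landed in the default savecore directory for this host ('summer' per the log above):

    cd /var/crash/summer        # default savecore location for this hostname
    mdb unix.0 vmcore.0 <<'EOF'
    ::status
    ::panicinfo
    $C
    EOF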
Re: [zfs-discuss] I seem to have backed myself into a corner - how do I migrate filesystems from one pool to another?
On 25-May-07, at 7:28 PM, John Plocher wrote: ... I found that the V440's original 72Gb drives had been upgraded to Dell 148Gb Fujitsu drives, and the Sun versions of those drives (same model number...) had different firmware ... You can't get hold of another one of the same drive? --Toby ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss