[zfs-discuss] Expert hint for replacing 3.5 SATA drive in X4500 with SSD for ZIL
Hi all, We would like to replace one of the 3.5 inch SATA drives in our Thumpers with an SSD (and put the ZIL on that device). We are currently looking into this in a bit more detail and would like to ask for input from people who already have experience with single-level vs. multi-level cell (SLC vs. MLC) SSDs, read- and write-optimized devices (if these really exist), and so on. If possible I would like this discussion to take place on the list, but if people want to suggest brand names/model numbers I'll be happy to accept them off-list as well. Thanks a lot in advance Cheers Carsten ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
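For readers trying this at home, the mechanics of attaching such a device as a separate intent log are simple; a minimal sketch, assuming a pool named 'tank' and hypothetical device names:

    # add a single SSD as a dedicated log (slog) device
    zpool add tank log c6t7d0
    # or mirror two SSDs to protect against slog failure
    zpool add tank log mirror c6t7d0 c6t8d0
    # confirm the log vdev appears
    zpool status tank

The open questions in the post (SLC vs. MLC, write-optimized parts) mostly come down to sustained synchronous-write latency on the chosen device, which is what the slog ends up absorbing.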
Re: [zfs-discuss] j4200 drive carriers
The drives that Sun sells will come with the correct bracket. Ergo, there is no reason to sell the bracket as a separate item unless the customer wishes to place non-Sun disks in them. That represents a service liability for Sun, so they are not inclined to do so. It is really basic business. And think of all the money it costs to stock and distribute that separate part. (And our infrastructure is still expensive; too expensive for a $5 part) Casper ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] j4200 drive carriers
+-- | On 2009-02-02 09:46:49, casper@sun.com wrote: | | And think of all the money it costs to stock and distribute that | separate part. (And our infrastructure is still expensive; too expensive | for a $5 part) Facts on the ground: 541-2123 (X4150, X4450, J7410, T51x0) goes for about $70. 541-0239 (X4100, X4200) goes for about $100. I'm sure it's $5 to somebody, but it isn't your customers. Anyway. This is all about fifteen miles off-topic. -- bda Cyberpunk is dead. Long live cyberpunk. http://mirrorshades.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
Ok thanks for your help guys! :o) One last question: how do I know that the spare sectors are running out? SMART is not available for Solaris, right? Are there any warnings that pop up in ZFS? Will scrubbing reveal that there are errors? How will I know? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
On Mon, Feb 2 at 5:48, Orvar Korvar wrote: Ok thanks for your help guys! :o) One last question, how do I know that the spare sectors are finishing? SMARTS are not available for Solaris, right? Is there any warnings that plop up in ZFS? Will scrubbing reveal that there are errors? How will I know? Short of SMART, I am not sure. If SMART isn't supported, someone should port support for it. -- Eric D. Mudama edmud...@mail.bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Zfs and permissions
Actually, the issue seems to be more than what I described below. I seemingly cannot issue any zfs or zpool commands other than zpool status -x, which reports a 'healthy' status. If I do zpool status, I get the following:

r...@ec1-nas1# zpool status
  pool: nasPool
 state: ONLINE
 scrub: none requested

But then it freezes there. This used to return fairly quickly. Where can I go to see what might be causing this? I see nothing in the message logs. -thx

On 2/2/09 9:57 AM, Matthew Arguin marg...@jpr-inc.com wrote: I am having a problem that I am hoping someone might have some insight in to. I am running a x4500 with solaris 5.10 and a zfs filesystem named nasPool. I am also running NetBackup on the box as well...server and client all in one. I have had this up and running for sometime now and recently ran in to a problem that Netbackup, running as root, cannot seem to write to a directory backup and its subdirectories on the zfs filesystem. The directory backup has ownership of backup:backup and at this point also has perms of 777 (did that while trying to figure out this issue). Netbackup cannot write to those directories any longer. Any insight in to this would be greatly appreciated. Thank you in advance. -- Matthew Arguin Production Support Jackpotrewards, Inc. 275 Grove St Newton, MA 02466 617-795-2850 x 2325 www.jackpotrewards.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Matthew Arguin Production Support Jackpotrewards, Inc. 275 Grove St Newton, MA 02466 617-795-2850 x 2325 www.jackpotrewards.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
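On the 'where can I look' question, a few stock Solaris tools can at least show where a hung zpool command is sitting; this is only a sketch, using the pool from the message above:

    # user-level stack of the stuck command
    pstack `pgrep -x zpool`
    # kernel-side view of pool and vdev state
    echo "::spa -v" | mdb -k
    # watch for a device that is timing out or returning errors underneath the pool
    iostat -xne 1 5

A zpool status that hangs after printing the header is often waiting on I/O to a misbehaving device, which the iostat error columns will usually make visible.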
[zfs-discuss] ZFS core contributor nominations
The time has come to review the current Contributor and Core contributor grants for ZFS. Since all of the ZFS core contributors' grants are set to expire on 02-24-2009 we need to renew the members that are still contributing at core contributor levels. We should also add some new members to both Contributor and Core contributor levels.

First the current list of Core contributors:

  Bill Moore (billm)
  Cindy Swearingen (cindys)
  Lori M. Alt (lalt)
  Mark Shellenbaum (marks)
  Mark Maybee (maybee)
  Matthew A. Ahrens (ahrens)
  Neil V. Perrin (perrin)
  Jeff Bonwick (bonwick)
  Eric Schrock (eschrock)
  Noel Dellofano (ndellofa)
  Eric Kustarz (goo)*
  Georgina A. Chua (chua)*
  Tabriz Holtz (tabriz)*
  Krister Johansen (johansen)*

All of these should be renewed at Core contributor level, except for those with a *. Those with a * are no longer involved with ZFS and we should let their grants expire.

I am nominating the following to be new Core Contributors of ZFS:

  Jonathan W. Adams (jwadams)
  Chris Kirby
  Lin Ling
  Eric C. Taylor (taylor)
  Mark Musante
  Rich Morris
  George Wilson
  Tim Haley
  Brendan Gregg
  Adam Leventhal
  Pawel Jakub Dawidek
  Ricardo Correia

For Contributor I am nominating the following:

  Darren Moffat
  Richard Elling

I am voting +1 for all of these (including myself). Feel free to nominate others for Contributor or Core Contributor. -Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ? Oracle parameters for ZFS : disk_asynch_io filesystemio_options
Could someone help me answer the following question: What is the recommended value for these 2 Oracle parameters when working with ZFS?

  disk_asynch_io = true
  filesystemio_options = setall

or

  disk_asynch_io = false
  filesystemio_options = none

Thanks in advance. MiK. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
Orvar Korvar wrote: Ok. Just to confirm: A modern disk has already some spare capacity which is not normally utilized by ZFS, UFS, etc. If the spare capacity is finished, then the disc should be replaced. Also, if ZFS decides that a block is bad, it can leave it unused. For example, if you have a mirrored pool, it is not required that block N on vdev1 == block N on vdev2. In a sense, this works like block sparing. Regarding SMART, there is much discussion on this in the archives. In a nutshell, there are FMA modules which use SMART data. There are a few tools which allow an administrator to see SMART data. Some question whether SMART data is actually useful. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
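As a concrete example of the 'tools which allow an administrator to see SMART data': the third-party smartmontools package is one option, sketched here under the assumption that it is installed and that the controller/driver combination is supported (device path hypothetical):

    # dump the drive's SMART attributes
    smartctl -a /dev/rdsk/c0t1d0s0

The attribute that answers the original 'how do I know the spare sectors are running out' question is typically Reallocated_Sector_Ct (along with Current_Pending_Sector); a steadily climbing value there means the drive is consuming its spare pool.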
Re: [zfs-discuss] ZFS extended ACL
fm == Fredrich Maney fredrichma...@gmail.com writes:

fm Oddly enough, that seems to be the path was taken by
fm Sun quite some time ago with /usr/bin. Those tools are the
fm standard, default tools on Sun systems for a reason: they are
fm the ones that are maintained and updated with new features

nope. The tools in xpg4 and xpg6 have more features and are more up to date. For example, ls recently got the -% option. This seems to work for /usr/bin/ls, /usr/xpg4/bin/ls, and /usr/xpg6/bin/ls. so, that's good! albeit a little surprising. But if /usr/xpg6/bin/ls came first in PATH, it would make sense to save effort by adding the -% to the newest ls only. Scripts which rely on some bug in an older ls will not know about -% all for viewing ZFS ctime, and will not benefit from it. I imagine some broken scripts hardcoded the full path to /usr/bin/ls, which some from the Solaris tent took to mean that /usr/bin/ls can never do anything more than what those ancient scripts expect. This is bogus! COMPAT environments belong in a zone. Alternatively, just let the script break. And if the GNU tools are default, they should get -% and any other ZFS feature, and get it first. Whatever tools are default should not be left out of the main thrust of development. Certainly Linux gets this right. Even before adding any GNU stuff Solaris got it wrong. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS extended ACL
For example, ls recently got -% option. This seems to work for /usr/bin/ls, /usr/xpg4/bin/ls, and /usr/xpg6/bin/ls. so, that's good! albeit a little surprising. There's only one source file. So if you add an option you'll add it to all of them. But if /usr/xpg6/bin/ls came first in PATH, it would make sense to save effort by adding the -% to the newest ls only. Scripts which rely on some bug in older ls, will not know about -% all for viewing ZFS ctime, and not benefit from it. I imagine some broken scripts hardcoded the full path to /usr/bin/ls which some from the Solaris tent took to mean that /usr/bin/ls can never do anything more than what those ancient scripts expect. This is bogus! COMPAT environments belong in a zone. Alternatively, just let the script break. Adding a new option is fine; changing the output is not. Casper ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
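For anyone who wants to try the option being discussed, a quick sketch; the file name is hypothetical and '-% all' is the form mentioned in the thread:

    # show the extended timestamps with each ls variant
    /usr/bin/ls -% all /tank/somefile
    /usr/xpg4/bin/ls -% all /tank/somefile
    /usr/xpg6/bin/ls -% all /tank/somefile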
Re: [zfs-discuss] write cache and cache flush
gm == Greg Mason gma...@msu.edu writes: g == Gary Mills mi...@cc.umanitoba.ca writes:

gm I know disabling the ZIL is an Extremely Bad Idea,

but maybe you don't care about trashed thunderbird databases. You just don't want to lose the whole pool to ``status: The pool metadata is corrupted and cannot be opened. / action: Destroy the pool and restore from backup.'' I've no answer for that---maybe someone else?

The known problem with ZIL disabling, AIUI, is that it breaks the statelessness of NFS. If the server reboots and the NFS clients do not, then assumptions on which the NFS protocol is built could be broken, and files could get corrupted. Behind this dire warning is an expectation I'm not sure everyone shares: if the NFS server reboots, and the clients do not, then (modulo bugs) no data is lost---once the clients unfreeze, it's like nothing ever happened.

I don't think other file sharing protocols like SMB or AFP attempt to keep that promise, so maybe people are being warned about something most assumed would happen anyway. Will disabling the ZIL make NFS corrupt files worse than SMB or AFP would when the server reboots? Not sure---at least SMB or AFP _should_ give an error to the userland when the server reboots, sort of like NFS 'hard,intr' when you press ^C, so applications using SQLite or Berkeley DB or whatever can catch that error and perform their own user-level recovery, and if they call fsync() and get success they can trust it absolutely no matter server or client reboots, while the ZIL-less NFS problems would probably be more silent, more analogous to the ZFS-iSCSI problems except one layer higher in the stack, so programs think they've written to these .db files but they haven't, and blindly scribble on, not knowing that a batch of writes in the past was silently discarded.

In practice everyone always says to run FileMaker or Mail.app or Thunderbird or anything with database files on ``a local disk'' only, so I think the SMB and AFP error paths are not working right either, and the actual expectation is very low.

g Consider a file server running ZFS that exports a volume with
g Iscsi. Consider also an application server that imports the
g LUN with Iscsi and runs a ZFS filesystem on that LUN.

I was pretty sure there was a bug for the iscsitadm target ignoring SYNCHRONIZE_CACHE, but I cannot find the bug number now and may be wrong. Also there is a separate problem with remote storage and filesystems highly dependent on SYNCHRONIZE_CACHE. Even if not for the bug I can't find, remote storage adds a failure case. Normally you have three main cases to handle:

  SYNCHRONIZE CACHE returns success after some delay

  SYNCHRONIZE CACHE never returns because someone yanked the cord---the whole system goes down. You deal with it at boot, when mounting the filesystem.

  SYNCHRONIZE CACHE never returns because a drive went bad.

iSCSI adds a fourth:

  SYNCHRONIZE CACHE returns success
  SYNCHRONIZE CACHE returns success
  SYNCHRONIZE CACHE returns failure
  SYNCHRONIZE CACHE returns success

I think ZFS probably does not understand this case. The others are easier, because either you have enough raidz/mirror redundancy, or else you are allowed to handle the ``returns failure'' by implicitly unmounting the filesystem and killing everything that held an open file. NFS works around this with the COMMIT op and client-driven replay in v3, or by making everything synchronous in v2.
iSCSI is _not_ v2-like because, even if there is no write caching in the initiator/target (there probably ought to be), if the underlying physical disk in the target has a write cache, the entire target chassis can still reboot and lose the contents of that cache. And I suspect iSCSI is not using NFS-v3-like workarounds right now. I think this hole is probably still open. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
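For completeness, since the thread keeps referring to 'disabling the ZIL' in the abstract: in builds of that era this was done through the zil_disable tunable. A sketch for test rigs only, with all of the caveats discussed above:

    # /etc/system -- takes effect at the next boot (testing only)
    set zfs:zil_disable = 1

    # or flip it on a live kernel with mdb (equally unsupported)
    echo zil_disable/W0t1 | mdb -kw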
Re: [zfs-discuss] 'zfs recv' is very slow
It definitely does. I made some tests today comparing b101 with b105 while doing 'zfs send -R -I A B > /dev/null' with several dozen snapshots between A and B. Well, b105 is almost 5x faster in my case - that's pretty good. -- Robert Milkowski http://milek.blogspot.com -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
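For anyone wanting to reproduce this kind of measurement, a minimal sketch; the pool and snapshot names are made up, and the stream is discarded so that only send-side performance is measured:

    # send everything from snapshot A up to B, including intermediate snapshots
    time zfs send -R -I tank@A tank@B > /dev/null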
Re: [zfs-discuss] Two-level ZFS
On Mon, Feb 2, 2009 at 9:22 PM, Gary Mills mi...@cc.umanitoba.ca wrote: On Sun, Feb 01, 2009 at 11:44:14PM -0500, Jim Dunham wrote: If there are two (or more) instances of ZFS in the end-to-end data path, each instance is responsible for its own redundancy and error recovery. There is no in-band communication between one instance of ZFS and another instances of ZFS located elsewhere in the same end-to-end data path.

I must have been unclear when I stated my question. The configuration, with ZFS on both systems, redundancy only on the file server, and end-to-end error detection and correction, does not exist. What additions to ZFS are required to make this work?

None. It's simply not possible. I believe Jim already stated that, but let me give some additional comments that might be helpful.

(1) ZFS can provide end-to-end protection ONLY if you use it end-to-end. This means:
- no other filesystem on top of it (e.g. do not use UFS on a zvol or something similar)
- no RAID/MIRROR under it (i.e. it must have access to the disk as JBOD)

(2) When (1) is not fulfilled, you get limited protection. For example:
- when using UFS on top of a zvol, or exporting a zvol as iSCSI, ZFS can only provide protection from the zvol downwards. It cannot manage protection for whatever runs on top of it.
- when using ZFS on top of HW/SW RAID or iSCSI, ZFS can provide SOME protection, but if certain errors occur on the HW/SW RAID or iSCSI it MIGHT be unable to recover from them.

Here's a scenario:
(1) file server (or in this case iSCSI server) exports a redundant zvol to the app server
(2) app server uses the iSCSI LUN to create a zpool (this would be a single-vdev pool)
(3) app server has bad memory/mobo
(4) after some writes, app server will show some files have checksum errors
In this scenario, the app server can NOT correct the error (it doesn't have enough redundancy), and the file server can NOT detect the error (because the error is not under its control).

Now consider a second scenario:
(1) file server exports several RAW DISKS to the app server
(2) app server uses the iSCSI LUNs to create a zpool with redundancy (either mirror, raidz, or raidz2)
(3) app server has bad memory/mobo
(4) after some writes, app server will show some files have checksum errors
In this scenario, the app server SHOULD be able to detect and correct the errors properly, but it might be hard to find which one is at fault: app server, file server, or the disks.

Third scenario:
(1) file server exports several RAW DISKS to the app server
(2) app server uses the iSCSI LUNs to create a zpool with redundancy (either mirror, raidz, or raidz2)
(3) file server has a bad disk
(4) after some writes, app server will show some files have checksum errors, or it shows that a disk is bad
In this scenario, the app server SHOULD be able to detect and correct the errors properly, and it should be able to identify which iSCSI LUN (and consequently, which disk on the file server) is broken.

Fourth scenario:
(1) file server exports several redundant zvols to the app server
(2) app server uses the iSCSI LUNs to create a zpool with redundancy (either mirror, raidz, or raidz2)
(3) file server has a bad disk, or app server has memory errors
In this scenario, the app server or the file server SHOULD be able to detect and correct the errors properly, so you get end-to-end protection. Sort of.

The fourth scenario requires redundancy on both the file and app server, while you mentioned that you only want redundancy on the file server while running ZFS on both file and app server. That's why I said it's not possible.

Hope this helps.
Regards, Fajar ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
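To make the 'redundancy at the application server' scenarios concrete, here is a rough sketch of one way such a layout might be built; every name, address, and size is hypothetical, and the old shareiscsi property is used rather than COMSTAR:

    # on the file server: create two plain zvols and export them as iSCSI targets
    zfs create -V 100G tank/lun0
    zfs create -V 100G tank/lun1
    zfs set shareiscsi=on tank/lun0
    zfs set shareiscsi=on tank/lun1

    # on the application server: discover the targets and mirror across the LUNs
    iscsiadm add discovery-address 192.168.10.5
    iscsiadm modify discovery --sendtargets enable
    devfsadm -i iscsi
    zpool create appool mirror c3t600A0B80d0 c3t600A0B81d0   # actual device names will differ

With the mirror built on the application server, that end can both detect and repair checksum errors, which is exactly the distinction the second and third scenarios above describe.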
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
Orvar Korvar wrote: Ok. Just to confirm: A modern disk has already some spare capacity which is not normally utilized by ZFS, UFS, etc. If the spare capacity is finished, then the disc should be replaced. Yup, that is the case. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
Hello Richard, Monday, February 2, 2009, 5:39:34 PM, you wrote: RE Orvar Korvar wrote: Ok. Just to confirm: A modern disk has already some spare capacity which is not normally utilized by ZFS, UFS, etc. If the spare capacity is finished, then the disc should be replaced. RE Also, if ZFS decides that a block is bad, it can leave it unused. RE For example, if you have a mirrored pool, it is not required that RE block N on vdev1 == block N on vdev2. In a sense, this works RE like block sparing. Would ZFS mark such a block permanently as bad? Under what circumstances? -- Best regards, Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] write cache and cache flush
Hello Miles, Monday, February 2, 2009, 7:20:49 PM, you wrote: gm == Greg Mason gma...@msu.edu writes: g == Gary Mills mi...@cc.umanitoba.ca writes: MN gm I know disabling the ZIL is an Extremely Bad Idea, MN but maybe you don't care about trashed thunderbird databases. You MN just don't want to lose the whole pool to ``status: The pool metadata MN is corrupted and cannot be opened. / action: Destroy the pool and MN restore from backup.'' I've no answer for that---maybe someone else? It will not cause the above. Disabling the ZIL has nothing to do with pool consistency. -- Best regards, Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Problem with snapshot
Snapshots are not on a per-pool basis but a per-file-system basis. Thus, when you took a snapshot of testpol, you didn't actually snapshot the pool; rather, you took a snapshot of the top level file system (which has an implicit name matching that of the pool). Thus, you haven't actually affected file systems fs1 or fs2 at all. However, apparently you were able to roll back the file system, which either unmounted or broke the mounts to fs1 and fs2. This probably shouldn't have been allowed. (I wonder what would happen with an explicit non-ZFS mount to a ZFS directory which is removed by a rollback?)

Yes, the ability to take snapshots directly on the pool should not be allowed.

Your fs1 and fs2 file systems still exist, but they're not attached to their old names any more. Maybe they got unmounted. You could probably mount them, either on the fs1 directory and on a new fs2 directory if you create one, or at a different point in your file system hierarchy.

You are right, they got unmounted. zfs get mounted testpol/fs1 - says no. zfs get mounted testpol/fs2 - says no. I understand that the mounted attribute is a read-only property of a ZFS file system. I tried to mount fs1 and fs2, but I was unsuccessful in doing so. Is there a specific way to mount ZFS file systems?

I have observed another strange behavior. In the same way as discussed in my previous post, I created the pool structure. When I roll back the snapshot for the first time, everything seems to work perfectly: I could see that file systems fs1 and fs2 are not affected. However, when I roll back the snapshot for the second time the file systems are unmounted. Any ideas? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
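On the 'is there a specific way to mount ZFS file systems' question, a short sketch using the dataset names from the thread, assuming their mountpoint properties are still at the defaults:

    # check the current state
    zfs get mounted,mountpoint testpol/fs1 testpol/fs2
    # mount the two datasets explicitly
    zfs mount testpol/fs1
    zfs mount testpol/fs2
    # or mount every ZFS file system that is not already mounted
    zfs mount -a

If zfs mount complains that the mountpoint directory is not empty, the directory can be cleaned out or overlaid with zfs mount -O.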
[zfs-discuss] issue with sharesmb and sharenfs properties enabled on the same pool
My system is OpenSolaris 2008.11, updated to dev build 105. I have two pools constructed from iscsi targets with around 5600 file systems in each. I was able to enable NFS sharing and CIFS/SMB sharing on both pools; however, after a reboot the SMB shares come up but the NFS server service does not, and it eventually times out after about 3 hours of trying. Are there any known issues with large numbers of NFS and SMB shares, or should I be filing a bug report? Thanks, Alastair ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
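A few diagnostic steps that might narrow this down, sketched with a hypothetical pool name:

    # ask SMF why the NFS server is not online, and read its log
    svcs -xv svc:/network/nfs/server:default
    tail /var/svc/log/network-nfs-server:default.log
    # count how many file systems are actually flagged for NFS sharing
    zfs get -r -H -o value sharenfs tank | grep -vc '^off$'

With several thousand shares, the service may simply be spending its entire start timeout publishing them one by one, which would look like the behavior described above; the SMF log should show whether that is the case.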
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
On Mon, Feb 2 at 5:05, Orvar Korvar wrote: Ok. Just to confirm: A modern disk has already some spare capacity which is not normally utilized by ZFS, UFS, etc. If the spare capacity is finished, then the disc should be replaced. Actually, the device has spare sectors beyond the reported LBA capacity. These are transparently exchanged with sectors within the LBA capacity when those sectors develop permanent errors. Filesystems often leave reserve areas, but that is unrelated. --eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Bad sectors arises - discs differ in size - trouble?
Ok. Just to confirm: A modern disk has already some spare capacity which is not normally utilized by ZFS, UFS, etc. If the spare capacity is finished, then the disc should be replaced. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] need to add space to zfs pool that's part of SNDR replication
Then what if I ever need to export the pool on the primary server and then import it on the replicated server? Will ZFS know which drives should be part of the stripe even though the device names across servers may not be the same? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] need to add space to zfs pool that's part of SNDR replication
BJ Quinn wrote: Then what if I ever need to export the pool on the primary server and then import it on the replicated server. Will ZFS know which drives should be part of the stripe even though the device names across servers may not be the same? Yes, zpool import will figure it out. See a demo at: http://blogs.sun.com/constantin/entry/csi_munich_how_to_save Jim -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
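For reference, the export/import sequence under discussion looks roughly like this; the pool name is hypothetical:

    # on the primary server
    zpool export tank
    # on the replicated server: list importable pools (devices are rescanned), then import by name
    zpool import
    zpool import tank
    # -f is only needed if the pool was not cleanly exported first
    zpool import -f tank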
Re: [zfs-discuss] Expert hint for replacing 3.5 SATA drive in X4500 with SSD for ZIL
Just a brief addendum: something like this (or a fully DRAM-based device, if available in a 3.5 inch form factor) might also be interesting to test: http://www.platinumhdd.com/ Any thoughts? Cheers Carsten ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS core contributor nominations
+1, Thanks for the nomination, Cindy Mark Shellenbaum wrote: The time has come to review the current Contributor and Core contributor grants for ZFS. Since all of the ZFS core contributors grants are set to expire on 02-24-2009 we need to renew the members that are still contributing at core contributor levels. We should also add some new members to both Contributor and Core contributor levels. First the current list of Core contributors: Bill Moore (billm) Cindy Swearingen (cindys) Lori M. Alt (lalt) Mark Shellenbaum (marks) Mark Maybee (maybee) Matthew A. Ahrens (ahrens) Neil V. Perrin (perrin) Jeff Bonwick (bonwick) Eric Schrock (eschrock) Noel Dellofano (ndellofa) Eric Kustarz (goo)* Georgina A. Chua (chua)* Tabriz Holtz (tabriz)* Krister Johansen (johansen)* All of these should be renewed at Core contributor level, except for those with a *. Those with a * are no longer involved with ZFS and we should let their grants expire. I am nominating the following to be new Core Contributors of ZFS: Jonathan W. Adams (jwadams) Chris Kirby Lin Ling Eric C. Taylor (taylor) Mark Musante Rich Morris George Wilson Tim Haley Brendan Gregg Adam Leventhal Pawel Jakub Dawidek Ricardo Correia For Contributor I am nominating the following: Darren Moffat Richard Elling I am voting +1 for all of these (including myself) Feel free to nominate others for Contributor or Core Contributor. -Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Two-level ZFS
On Mon, Feb 02, 2009 at 09:53:15PM +0700, Fajar A. Nugraha wrote: On Mon, Feb 2, 2009 at 9:22 PM, Gary Mills mi...@cc.umanitoba.ca wrote: On Sun, Feb 01, 2009 at 11:44:14PM -0500, Jim Dunham wrote: If there are two (or more) instances of ZFS in the end-to-end data path, each instance is responsible for its own redundancy and error recovery. There is no in-band communication between one instance of ZFS and another instances of ZFS located elsewhere in the same end-to-end data path.

I must have been unclear when I stated my question. The configuration, with ZFS on both systems, redundancy only on the file server, and end-to-end error detection and correction, does not exist. What additions to ZFS are required to make this work?

None. It's simply not possible.

You're talking about the existing ZFS implementation; I'm not! Is ZFS now frozen in time, with only bugs being fixed? I have difficulty believing that. Putting a wire between two layers of ZFS should indeed be possible. Think about the Amber Road products, from the Fishworks team. They run ZFS and export iSCSI and FC-AL. Redundancy and disk management are already present in these products. Should they be implemented again in each of the servers that import LUNs from these products? I think not.

I believe Jim already stated that, but let me give some additional comments that might be helpful. (1) ZFS can provide end-to-end protection ONLY if you use it end-to-end. This means: - no other filesystem on top of it (e.g. do not use UFS on a zvol or something similar) - no RAID/MIRROR under it (i.e. it must have access to the disk as JBOD)

Exactly! That leads to my question. What information needs to be exchanged between ZFS on the file server and ZFS on the application server so that end-to-end protection can be maintained with redundancy and disk management only on the file server? -- -Gary Mills--Unix Support--U of M Academic Computing and Networking- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Problem with snapshot
If creating snapshots on a top-level file system is allowed, rolling back such a snapshot must take care not to disturb the other file systems that were created under it. -Abishek -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Two-level ZFS
On Sun, Feb 01, 2009 at 11:44:14PM -0500, Jim Dunham wrote: I wrote: I realize that this configuration is not supported. The configuration is supported, but not in the manner mentioned below. If there are two (or more) instances of ZFS in the end-to-end data path, each instance is responsible for its own redundancy and error recovery. There is no in-band communication between one instance of ZFS and another instances of ZFS located elsewhere in the same end-to- end data path. I must have been unclear when I stated my question. The configuration, with ZFS on both systems, redundancy only on the file server, and end-to-end error detection and correction, does not exist. What additions to ZFS are required to make this work? -- -Gary Mills--Unix Support--U of M Academic Computing and Networking- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Zfs and permissions
I am having a problem that I am hoping someone might have some insight in to. I am running a x4500 with solaris 5.10 and a zfs filesystem named nasPool. I am also running NetBackup on the box as well...server and client all in one. I have had this up and running for sometime now and recently ran in to a problem that Netbackup, running as root, cannot seem to write to a directory backup and its subdirectories on the zfs filesystem. The directory backup has ownership of backup:backup and at this point also has perms of 777 (did that while trying to figure out this issue). Netbackup cannot write to those directories any longer. Any insight in to this would be greatly appreciated. Thank you in advance. -- Matthew Arguin Production Support Jackpotrewards, Inc. 275 Grove St Newton, MA 02466 617-795-2850 x 2325 www.jackpotrewards.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Two-level ZFS
On Mon, Feb 02, 2009 at 08:22:13AM -0600, Gary Mills wrote: On Sun, Feb 01, 2009 at 11:44:14PM -0500, Jim Dunham wrote: I wrote: I realize that this configuration is not supported. The configuration is supported, but not in the manner mentioned below. If there are two (or more) instances of ZFS in the end-to-end data path, each instance is responsible for its own redundancy and error recovery. There is no in-band communication between one instance of ZFS and another instances of ZFS located elsewhere in the same end-to- end data path. I must have been unclear when I stated my question. The configuration, with ZFS on both systems, redundancy only on the file server, and end-to-end error detection and correction, does not exist. What additions to ZFS are required to make this work? This is a variant of the HW RAID thread that recurs every so often. When redundancy happens below ZFS then ZFS cannot provide end-to-end error correction other than by using ditto blocks. But people using HW RAID typically don't want to dedicate even more space to redundancy by using ditto blocks for data. You still get end-to-end error detection, of course. ZFS layered atop ZFS across iSCSI, with the lower layer providing redundancy, exhibits the same result. You get end-to-end error detection, but not end-to-end error correction. Nico -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
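As a concrete example of the ditto-block option mentioned above, extra data redundancy can be layered on a pool whose redundancy lives below ZFS by using the copies property; the dataset name is hypothetical:

    # keep two copies of every user data block in this dataset
    zfs set copies=2 appool/data
    zfs get copies appool/data

This roughly doubles the space consumed by the dataset, which is exactly the trade-off the message above notes that HW RAID users tend to resist.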
Re: [zfs-discuss] ZFS core contributor nominations
Looks reasonable +1 Neil. On 02/02/09 08:55, Mark Shellenbaum wrote: The time has come to review the current Contributor and Core contributor grants for ZFS. Since all of the ZFS core contributors grants are set to expire on 02-24-2009 we need to renew the members that are still contributing at core contributor levels. We should also add some new members to both Contributor and Core contributor levels. First the current list of Core contributors: Bill Moore (billm) Cindy Swearingen (cindys) Lori M. Alt (lalt) Mark Shellenbaum (marks) Mark Maybee (maybee) Matthew A. Ahrens (ahrens) Neil V. Perrin (perrin) Jeff Bonwick (bonwick) Eric Schrock (eschrock) Noel Dellofano (ndellofa) Eric Kustarz (goo)* Georgina A. Chua (chua)* Tabriz Holtz (tabriz)* Krister Johansen (johansen)* All of these should be renewed at Core contributor level, except for those with a *. Those with a * are no longer involved with ZFS and we should let their grants expire. I am nominating the following to be new Core Contributors of ZFS: Jonathan W. Adams (jwadams) Chris Kirby Lin Ling Eric C. Taylor (taylor) Mark Musante Rich Morris George Wilson Tim Haley Brendan Gregg Adam Leventhal Pawel Jakub Dawidek Ricardo Correia For Contributor I am nominating the following: Darren Moffat Richard Elling I am voting +1 for all of these (including myself) Feel free to nominate others for Contributor or Core Contributor. -Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS core contributor nominations
+1. I would like to nominate roch.bourbonn...@sun.com for his work on improving the performance of ZFS over the last few years. thanks, -neel On Feb 2, 2009, at 4:02 PM, Neil Perrin wrote: Looks reasonable +1 Neil. On 02/02/09 08:55, Mark Shellenbaum wrote: The time has come to review the current Contributor and Core contributor grants for ZFS. Since all of the ZFS core contributors grants are set to expire on 02-24-2009 we need to renew the members that are still contributing at core contributor levels. We should also add some new members to both Contributor and Core contributor levels. First the current list of Core contributors: Bill Moore (billm) Cindy Swearingen (cindys) Lori M. Alt (lalt) Mark Shellenbaum (marks) Mark Maybee (maybee) Matthew A. Ahrens (ahrens) Neil V. Perrin (perrin) Jeff Bonwick (bonwick) Eric Schrock (eschrock) Noel Dellofano (ndellofa) Eric Kustarz (goo)* Georgina A. Chua (chua)* Tabriz Holtz (tabriz)* Krister Johansen (johansen)* All of these should be renewed at Core contributor level, except for those with a *. Those with a * are no longer involved with ZFS and we should let their grants expire. I am nominating the following to be new Core Contributors of ZFS: Jonathan W. Adams (jwadams) Chris Kirby Lin Ling Eric C. Taylor (taylor) Mark Musante Rich Morris George Wilson Tim Haley Brendan Gregg Adam Leventhal Pawel Jakub Dawidek Ricardo Correia For Contributor I am nominating the following: Darren Moffat Richard Elling I am voting +1 for all of these (including myself) Feel free to nominate others for Contributor or Core Contributor. -Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS core contributor nominations
I would like to nominate roch.bourbonn...@sun.com for his work on improving the performance of ZFS over the last few years. Absolutely. Jeff ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] snapshot identity
The Validated Execution project is investigating how to utilize ZFS snapshots as the basis of a validated filesystem. Given that the blocks of the dataset form a Merkle tree of hashes, it seemed straightforward to validate the individual objects in the snapshot and then sign the hash of the root as a means of indicating that the contents of the dataset were validated. Unfortunately, the block hashes are used to assure the integrity of the physical representation of the dataset. Those hash values can be updated during scrub operations, or even during data error recovery, while the logical content of the dataset remains intact. This would invalidate the signature mechanism proposed above, even though the logical content remains undisturbed. We want to build on the data integrity given us by ZFS. However, we need some means of knowing that the dataset we are currently using is in fact the same snapshot that was validated earlier. We can't use the name, since cloning, promotion, and renaming can lead to a different snapshot having the name under which the prior snapshot was validated. My attempt to forge a replacement snapshot stumbled over the creation time property, but that seems capable of duplication with minimal effort. Does the snapshot dataset include identity information? While a dataset index would be a help, is there perhaps a UUID generated when the snapshot is taken? With regard to the signing mechanism, it might be useful to be able to set properties on a snapshot. Since ZFS expressly prohibits this, how feasible would it be to provide for creation of a snapshot from a snapshot while setting a specific property on the child snapshot, thus avoiding the exposure to modification of the filesystem objects that cloning and snapshotting would entail? Thanks -JZ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
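Not an answer to the UUID question, but one hedged sketch of fingerprinting the logical (rather than physical) contents of a snapshot, with a hypothetical dataset name; whether a send stream is byte-stable across software releases is something that would need to be verified before relying on it:

    # hash the replication stream for the snapshot; this reflects logical content,
    # not on-disk block placement, so scrubs and self-healing repairs should not change it
    zfs send tank/data@validated | digest -a sha256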
Re: [zfs-discuss] ZFS root mirror / moving disks to machine with different hostid
On January 30, 2009 2:26:36 PM -0800 Marcus Reid mar...@blazingdot.com wrote: I am investigating using ZFS as a possible replacement for SVM for root disk mirroring. ... Great. However, if I place the disks into a different machine and try to boot, I get:

Executing last command: boot
Boot device: disk File and args:
SunOS Release 5.10 Version Generic_137137-09 64-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
WARNING: pool 'rpool' could not be loaded as it was last accessed by another system (host: hostid: 0x80c29c4c)
...
Is there a way to work through this?

I am doing the exact same thing. I just updated to U6 (138889-03) and am using a ZFS root mirror to replace SVM. After learning the joys of ZFS root, imagine my horror upon reading your message. I just tested this, and luckily for me I am running x86, where the hostid is software-generated. So it is identical when I boot from different hardware, and ZFS does not complain.

Here's my first shot at workarounds.

1) Clone the hostid.

2) Keep the backup system in sync with the primary system as far as installed software and configuration. Zones may make this easier for you; I used to keep zones in sync on different machines for failover purposes, and it's not too awful. Neither of those might be useful if you have one backup system for multiple primary systems. But what is probably best,

3) when it comes time to make your backup system act as the failed system, first boot it from the network or from cdrom, or possibly you could have it already running, ready to go. Forcibly mount the root pool (just as any other pool), which will record the hostid, then reboot from that drive.

A kernel option to forcibly import the root pool would make things a lot easier. Looking at kernel(1M) and boot(1M) there doesn't appear to be one. Maybe OpenSolaris has it? -frank ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Time taken Backup using ZFS Send Receive
Upgrading to b105 seems to improve zfs send/recv quite a bit. See this thread: http://www.opensolaris.org/jive/message.jspa?messageID=330988 -- Dave Kok Fong Lau wrote: I have been using ZFS send and receive for a while and I noticed that when I try to do a send on a zfs file system of about 3 gig plus it took only about 3 minutes max. zfs send application/sam...@back /backup/sample.zfs However when I tried to send a file system that's about 20 gig, it took almost an hour. I would had expected that since 3 gig took 3 mins, then 20 gig should take 20 mins instead of 60 mins or more. Is there something that I'm doing wrong or could I looks into any logs / enable any logs to find out what is going on. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Time taken Backup using ZFS Send Receive
Kok Fong Lau wrote: I have been using ZFS send and receive for a while and I noticed that when I try to do a send on a zfs file system of about 3 gig plus it took only about 3 minutes max. zfs send application/sam...@back /backup/sample.zfs However when I tried to send a file system that's about 20 gig, it took almost an hour. I would had expected that since 3 gig took 3 mins, then 20 gig should take 20 mins instead of 60 mins or more. Is there something that I'm doing wrong or could I looks into any logs / enable any logs to find out what is going on. What's the nature of the data in the two filesystems? -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS root mirror / moving disks to machine with different hostid
On Mon, Feb 02, 2009 at 08:41:13PM -0800, Frank Cusack wrote: On January 30, 2009 2:26:36 PM -0800 Marcus Reid mar...@blazingdot.com wrote: But what is probably best, 3) when it comes time to make your backup system act as the failed system, first boot it from the network or from cdrom, or possibly you could have it already running, ready to go. forcibly mount the root pool (just as any other pool) which will record the hostid, then reboot from that drive. Hi Frank, Thanks for the response. It turns out that there is a way to do this after the fact. Doing a 'boot disk0 -F failsafe' at the ok prompt (replace disk0 with another device if needed). This gets you to a failsafe shell, where a 'zpool import -f rpool' will import the pool even if it was previously imported by another host. Once that's done the system comes up normally. I wasn't aware of the failsafe shell before, and it seems like a real lifesaver. Hat tip to Scott Dickson at Sun for the tip. Also a (re) read of the boot manpage was long overdue for me, as there seems to be a lot of new stuff. Marcus ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
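In command form, the recovery sequence described above looks roughly like this (SPARC ok-prompt syntax; the boot device argument may differ):

    ok boot disk0 -F failsafe
    # then, from the failsafe shell, force-import the root pool to record the new hostid and reboot
    zpool import -f rpool
    reboot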