Re: [zfs-discuss] ZFS needs a viable backup mechanism
Spare a thought also for the remote serviceability aspects of these systems. If customers raise calls/escalations against such systems, then our remote support/solution centre staff would find such an output useful in identifying and verifying the config. I don't have visibility of the Explorer development sites at the moment, but I believe that the last publicly available Explorer I looked at (v5.4) still didn't gather any ZFS-related info, which would scare me mightily for a filesystem released in a production-grade Solaris 10 release ... how do we expect our support personnel to engage?

Craig

On 18 Jul 2006, at 00:53, Matthew Ahrens wrote:
> On Fri, Jul 07, 2006 at 04:00:38PM -0400, Dale Ghent wrote:
> > Add an option to zpool(1M) to dump the pool config as well as the configuration of the volumes within it to an XML file. This file could then be sucked in to zpool at a later date to recreate/replicate the pool and its volume structure in one fell swoop. After that, Just Add Data(tm).
>
> Yep, this has been on our to-do list for quite some time:
>
>   RFE #6276640 "zpool config"
>   RFE #6276912 "zfs config"
>
> --matt

--
Craig Morgan, Cinnabar Solutions Ltd
t: +44 (0)791 338 3190  f: +44 (0)870 705 1726
e: [EMAIL PROTECTED]  w: www.cinnabar-solutions.com
Re: [zfs-discuss] ZFS components for a minimal Solaris 10 U2 install?
Included below is a thread which dealt with trying to find the packages necessary for a minimal Solaris 10 U2 install with ZFS functionality. In addition to SUNWzfskr, SUNWzfsr and SUNWzfsu, the SUNWsmapi package needs to be installed: the libdiskmgt.so.1 library is required for the zpool(1M) command. This was found via trial and error; there is no dependency mentioned for SUNWsmapi in the SUNWzfsr depend file. Apologies if this is nitpicking, but is this missing dependency worthy of submitting a P5 CR?

-- Jim C

Jason Schroeder wrote:
> Dale Ghent wrote:
> > On Jun 28, 2006, at 4:27 PM, Jim Connors wrote:
> > > For an embedded application, I'm looking at creating a minimal Solaris 10 U2 image which would include ZFS functionality. In quickly taking a look at the opensolaris.org site under pkgdefs, I see three packages that appear to be related to ZFS: SUNWzfskr, SUNWzfsr, and SUNWzfsu. Is it naive to think that this would be all that is needed for ZFS?
> >
> > Those packages, as well as what's listed in the depend files for those packages. Ahh, don't you love climbing the dependency tree? /dale
>
> Glenn Brunette wrote a nifty little tool ... have to assume that all of the dependencies are appropriately doc'ed of course *cough*.
>
> http://blogs.sun.com/roller/page/gbrunett?entry=solaris_package_companion
>
> /jason
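A quick way to confirm the missing dependency on an installed U2 system might be something along these lines (a sketch only, using the standard SVR4 package locations; not verified on every build):

  # zpool(1M) links against libdiskmgt, which SUNWsmapi delivers
  ldd /usr/sbin/zpool | grep libdiskmgt

  # check which package owns the library
  grep libdiskmgt.so.1 /var/sadm/install/contents

  # and the SUNWzfsr depend file makes no mention of SUNWsmapi
  grep SUNWsmapi /var/sadm/pkg/SUNWzfsr/install/depend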
Re: [zfs-discuss] ZFS components for a minimal Solaris 10 U2 install?
No argument from me. For better or for worse, most of the customers I speak with minimize their OS distributions. The more we can accurately describe dependencies within our current methods, the better.

/jason

Jim Connors wrote:
> In addition to SUNWzfskr, SUNWzfsr and SUNWzfsu, the SUNWsmapi package needs to be installed: the libdiskmgt.so.1 library is required for the zpool(1M) command. [...] is this missing dependency worthy of submitting a P5 CR?
Re: [zfs-discuss] ZFS components for a minimal Solaris 10 U2 install?
On Tue, Jul 25, 2006 at 10:25:04AM -0400, Jim Connors wrote:
> In addition to SUNWzfskr, SUNWzfsr and SUNWzfsu, the SUNWsmapi package needs to be installed: the libdiskmgt.so.1 library is required for the zpool(1M) command. [...] Apologies if this is nitpicking, but is this missing dependency worthy of submitting a P5 CR?

Absolutely.

- Eric

--
Eric Schrock, Solaris Kernel Development
http://blogs.sun.com/eschrock
Re: [zfs-discuss] ZFS needs a viable backup mechanism
Craig Morgan wrote:
> Spare a thought also for the remote serviceability aspects of these systems ... I believe that the last publicly available Explorer I looked at (v5.4) still didn't gather any ZFS-related info, which would scare me mightily for a FS released in a production-grade Solaris 10 release ... how do we expect our support personnel to engage?

Explorer *should* collect "zfs get all" and "zpool status", which will give you all(?) of the file system parameters and pool/device configuration information for first-level troubleshooting. You might check with the Explorer developers and see when that is planned.
-- richard
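Until Explorer picks this up, a stop-gap a support engineer could ask a customer to run is a manual capture along these lines (a sketch; the output file names are arbitrary):

  zpool status -v  > /var/tmp/zpool_status.out
  zpool list       > /var/tmp/zpool_list.out
  zfs list         > /var/tmp/zfs_list.out
  zfs get all      > /var/tmp/zfs_get_all.out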
[zfs-discuss] ZFS state between reboots for RAM resident OS?
Guys,

Thanks for the help so far; now come the more interesting questions ...

Piggybacking off of some work being done to minimize Solaris for embedded use, I have a version of Solaris 10 U2 with ZFS functionality with a disk footprint of about 60MB. Creating a miniroot based upon this image, it can be compressed to under 30MB. Currently, I load this image onto a USB keyring and boot from the USB device, running the Solaris miniroot out of RAM. Note: the USB keyring is a hideously slow device, but for the sake of this proof of concept it works fine. In addition, some more packages will need to be added later on (i.e. NFS, Samba?) which will increase the footprint.

My ultimate goal here would be to demonstrate a network storage appliance using ZFS, where the OS is effectively stateless, or as stateless as possible. ZFS goes a long way in assisting here since, for example, mount and NFS share information can be managed by ZFS. But I suppose it's not as stateless as I thought. Upon booting from the USB device into memory, I can do a `zpool create poo1 c1d0', but a subsequent reboot does not remember this work. Doing a `zpool list' yields 'no pools available'.

So the question is, what sort of state is required between reboots for ZFS?

Regards,
-- Jim C
[zfs-discuss] Re: ZFS state between reboots for RAM resident OS?
I understand. Thanks.

Just curious: ZFS manages NFS shares. Have you given any thought to what might be involved for ZFS to manage SMB shares in the same manner? This all goes towards my stateless OS theme.

-- Jim C

Eric Schrock wrote:
> You need the following file:
>
>   /etc/zfs/zpool.cache
>
> This file 'knows' about all the pools on the system. These pools can typically be discovered via 'zpool import', but we can't do this at boot because:
>
>   a. It can be really, really expensive (tasting every disk on the system)
>   b. Pools can be comprised of files or devices not in /dev/dsk
>
> So, we have the cache file, which must be editable if you want to remember newly created pools. Note this only affects configuration changes to pools - everything else is stored within the pool itself.
>
> - Eric
>
> On Tue, Jul 25, 2006 at 12:18:07PM -0400, Jim Connors wrote:
> > Upon booting from the USB device into memory, I can do a `zpool create poo1 c1d0', but a subsequent reboot does not remember this work. Doing a `zpool list' yields 'no pools available'. [...] So the question is, what sort of state is required between reboots for ZFS?

--
Eric Schrock, Solaris Kernel Development
http://blogs.sun.com/eschrock
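In other words, on a RAM-resident root the cache file evaporates along with /etc, but the pools themselves are intact on disk and can be rediscovered by hand (a sketch; 'mypool' is just an example name):

  # scan attached devices and list importable pools
  zpool import

  # import one of the pools the scan found (this rewrites /etc/zfs/zpool.cache)
  zpool import mypool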
[zfs-discuss] Re: ZFS state between reboots for RAM resident OS?
On Tue, Jul 25, 2006 at 01:07:59PM -0400, Jim Connors wrote:
> I understand. Thanks. Just curious: ZFS manages NFS shares. Have you given any thought to what might be involved for ZFS to manage SMB shares in the same manner? This all goes towards my stateless OS theme.

Yep, this is in the works. We have folks working on an integrated CIFS stack, as well as a rewrite of the way shares are managed. We named the property 'sharenfs' to allow for future, non-NFS share mechanisms. Once the above work is nearing completion, we'll work on integrating it closely with the ZFS administration model.

- Eric

--
Eric Schrock, Solaris Kernel Development
http://blogs.sun.com/eschrock
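For NFS today this is already carried in the dataset itself, which is what makes the stateless-appliance idea workable (a sketch; the dataset name is an example):

  # publish and inspect an NFS share purely through ZFS properties
  zfs set sharenfs=on tank/export
  zfs get sharenfs tank/export

  # share options are passed straight through to share(1M), e.g.
  zfs set sharenfs=ro,anon=0 tank/export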
[zfs-discuss] ON build on Blade 1500 ATA disk extremely slow
I've recently started doing ON nightly builds on zfs filesystems on the internal ATA disk of a Blade 1500 running snv_42. Unfortunately, the builds are extremely slow compared to building on an external IEEE 1394 disk attached to the same machine:

ATA disk:

  Elapsed build time (DEBUG)
    real 21:40:57.7
    user  4:32:15.6
    sys   8:22:24.1

IEEE 1394 disk:

  Elapsed build time (DEBUG)
    real  6:14:11.4
    user  4:28:54.1
    sys      36:04.1

Running a kernel profile with lockstat (lockstat -kIW -D 20 sleep 300), I find in the ATA case:

  Profiling interrupt: 29117 events in 300.142 seconds (97 events/sec)

  Count indv cuml rcnt     nsec Hottest CPU+PIL  Caller
  ------------------------------------------------------------
  15082  52%  52% 0.00     1492 cpu[0]           (usermode)
   9565  33%  85% 0.00      318 cpu[0]           usec_delay

compared to IEEE 1394:

  Profiling interrupt: 29195 events in 300.969 seconds (97 events/sec)

  Count indv cuml rcnt     nsec Hottest CPU+PIL  Caller
  ------------------------------------------------------------
  20042  69%  69% 0.00     2000 cpu[0]           (usermode)
   5414  19%  87% 0.00      317 cpu[0]           usec_delay

At other times, the kernel time can be even as high as 80%. Unfortunately, I've not been able to investigate how usec_delay is called since there's no fbt provider for that function (nor for the alternative entry point drv_usecwait found in uts/sun4[uv]/cpu/common_asm.s), so I'm a bit stuck how to further investigate this. I suspect that the dad(7D) driver is the culprit, but it is only included in the closed tarball. In the EDU S9 sources, I find that dcd_flush_cache() calls drv_usecwait(100), which might be the cause of this.

How should I proceed to further investigate this, and can this be fixed somehow? This way, the machine is almost unusable as a build machine.

        Rainer

-----------------------------------------------------------------------------
Rainer Orth, Faculty of Technology, Bielefeld University
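One way to get full kernel stacks without an fbt probe on usec_delay is the DTrace profile provider; a sketch (the rate, duration and truncation count are arbitrary, and arg0 is non-zero only when the profile interrupt lands in the kernel):

  dtrace -n 'profile-997 /arg0/ { @[stack()] = count(); }
             tick-60s { trunc(@, 20); printa(@); exit(0); }'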
[zfs-discuss] Re: Quotas and Snapshots
I've run into this myself. (I am in a university setting.) After reading bug ID 6431277 (URL below for noobs like myself who didn't know what "see 6431277" meant):

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6431277

...it's not clear to me how this will be resolved. What I'd really like to see is the ability to specify where the snapshot backing store will be (more specifically, the ability for the snapshot space to *not* impact the filesystem space).

We have a Network Appliance box, whose snapshots are very popular for their value as online backups. Netapp charges snapshots to the storage pool, so they don't cost the filesystem anything. I'm drooling over ZFS as an alternative to the expensive netapp hardware/software, but since we sell RAID space but perform backups administratively, I can't have the snapshots consuming people's space.

I could increase the filesystem quota to accommodate the snapshots, but since the snapshot size is dynamic, I would have to increase it well beyond the current snapshot size. Once I do that, users *will* fill the space (that they have not paid for). I could tune the size of the filesystem to match the snapshot + filesystem data, but since snapshot size is dynamic, this is impractical.

I also have some very small quotas (50 MB) for users, and would like to be able to create snapshots of them going back 30 days or so without it costing the user anything. The snapshots save us tons of time and effort, but they're not worth it to the user to pay for double or triple the space they're currently using, and I don't want the users going over the original quota of 50 MB, so I can't make enough space in the filesystem to make snapshots of their data... it's maddening.

If we must contain snapshots inside a filesystem, perhaps it's possible to set a distinct quota for snapshot space vs. live data space? I could then set snapshot quotas for my filesystems arbitrarily large for my administrative backups, or down to the filesystem size or some other value if there has been delegated authority for the filesystem.

It would also be nice to be able to make snapshots of parent filesystems that include their descendants. Then, for example, I could create zfspool/grandparent/parent/child ...and set a filesystem quota on parent, a user quota on child, and a snapshot quota on grandparent, and this solves most of my problems.

In fact, I think a lot of ZFS's hierarchical features would be more valuable if parent filesystems included their descendants (backups and NFS sharing, for example), but I'm sure there are just as many arguments against that as for it.
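For reference, the way this shows up today is that snapshot blocks are charged to the filesystem's USED figure (and hence its quota), while REFERENCED reflects only the live data. A sketch of how to see the split (the dataset name is an example):

  # USED counts snapshot blocks against the quota; REFERENCED is the live data
  zfs list -o name,quota,used,referenced tank/home/user1

  # the individual snapshots and the space they hold
  zfs list -r -t snapshot tank/home/user1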
Re: [zfs-discuss] ON build on Blade 1500 ATA disk extremely slow
On Tue, 2006-07-25 at 13:45, Rainer Orth wrote:
> At other times, the kernel time can be even as high as 80%. Unfortunately, I've not been able to investigate how usec_delay is called since there's no fbt provider for that function (nor for the alternative entry point drv_usecwait found in uts/sun4[uv]/cpu/common_asm.s), so I'm a bit stuck how to further investigate this. I suspect that the dad(7D) driver is the culprit, but it is only included in the closed tarball. In the EDU S9 sources, I find that dcd_flush_cache() calls drv_usecwait(100), which might be the cause of this.

In the future, you can try:

  # lockstat -s 10 -I sleep 10

which aggregates on the full stack trace, not just the caller, during profiling interrupts. (-s 10 sets the stack depth; tweak up or down to taste.)

> How should I proceed to further investigate this, and can this be fixed somehow? This way, the machine is almost unusable as a build machine.

you've rediscovered

  6421427 netra x1 slagged by NFS over ZFS leading to long spins in the ATA driver code

I've updated the bug to indicate that this was seen on the Sun Blade 1500 as well.

- Bill
[zfs-discuss] Re: ZFS state between reboots for RAM resident OS?
Eric Schrock wrote:
> You need the following file:
>
>   /etc/zfs/zpool.cache

So as a workaround (or more appropriately, a kludge) would it be possible to:

1. At boot time do a 'zpool import' of some pool guaranteed to exist. For the sake of this discussion call it 'system'.
2. Have /etc/zfs/zpool.cache be symbolically linked to /system/ZPOOL.CACHE.

-- Jim C
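Spelled out as a boot-time script, the kludge might look something like this (a sketch only - untested, and it assumes a pool named 'system' exists and has space for the cache file):

  # 1. import the pool that is guaranteed to exist; this recreates
  #    /etc/zfs/zpool.cache in the RAM-resident /etc
  zpool import -f system

  # 2. move the cache into the pool and leave a symlink behind, so that
  #    later 'zpool create'/'zpool import' state lands inside the pool
  mv /etc/zfs/zpool.cache /system/ZPOOL.CACHE
  ln -s /system/ZPOOL.CACHE /etc/zfs/zpool.cache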
Re: [zfs-discuss] ON build on Blade 1500 ATA disk extremely slow
Bill,

> In the future, you can try:
>
>   # lockstat -s 10 -I sleep 10
>
> which aggregates on the full stack trace, not just the caller, during profiling interrupts. (-s 10 sets the stack depth; tweak up or down to taste.)

nice. Perhaps lockstat(1M) should be updated to include something like this in the EXAMPLES section.

> > How should I proceed to further investigate this, and can this be fixed somehow? This way, the machine is almost unusable as a build machine.
>
> you've rediscovered
>
>   6421427 netra x1 slagged by NFS over ZFS leading to long spins in the ATA driver code
>
> I've updated the bug to indicate that this was seen on the Sun Blade 1500 as well.

Ok, thanks. One important difference compared to that CR is that those were local accesses to the FS, but the stack traces from lockstat are identical.

Any word when this might be fixed?

Thanks.
        Rainer

-----------------------------------------------------------------------------
Rainer Orth, Faculty of Technology, Bielefeld University
Re: [zfs-discuss] Re: Quotas and Snapshots
On Tue, 25 Jul 2006, Brad Plecs wrote:
> What I'd really like to see is the ability to specify where the snapshot backing store will be (more specifically, the ability for the snapshot space to *not* impact the filesystem space). [...] I'm drooling over ZFS as an alternative to the expensive netapp hardware/software, but since we sell RAID space but perform backups administratively, I can't have the snapshots consuming people's space. [...] I also have some very small quotas (50 MB) for users, and would like to be able to create snapshots of them going back 30 days or so without it costing the user anything.

I'll play devil's advocate here - because I don't see this as a ZFS-related issue, or even one that is in the ZFS domain, in terms of resolving the issue at hand.

First, ZFS allows one to take advantage of large, inexpensive Serial ATA disk drives. Paraphrased: ZFS loves large, cheap SATA disk drives. So the first part of the solution looks (to me) as simple as adding some cheap SATA disk drives.

Next, after extra storage space has been added to the pool, it's a simple matter of accounting to subtract the size of the ZFS snapshots from the users' disk space to calculate their actual live storage and bill for it!

Next, periodic snapshots can be made and older snapshots either deleted or moved to even lower cost storage media (e.g., tape, CDROMs, DVDs etc).

Next - 50 MB quotas!?? You've got to be kidding. Let me check my calendar; yep - it's 2006. You are kidding ... right? If you're not kidding, then you've got a business/management issue and not a technical issue to resolve.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
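The accounting step described above can be approximated from the properties ZFS already reports: USED includes snapshot space, while REFERENCED is the live data one would actually bill for. A sketch ('tank/home' is an example parent dataset):

  # per-user totals: bill on the REFERENCED column, not USED
  zfs list -rH -o name,used,referenced tank/home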
[zfs-discuss] somewhat OT: inexpensive 64-bit CPUs for ZFS
A couple of weeks ago, there was a discussion on the best system for ZFS and I mentioned that AMD would reduce pricing and withdraw some of the 939-pin (non AM2) processors from the marketplace. Update: I see a dual-core AMD X2 4400+ (1Mb cache per core) processor on www.monarchcomputers.com for ~ $255. And there's the X2 4600+, with 512kb cache per core for around the same price. http://www.monarchcomputer.com/Merchant2/merchant.mv?Screen=PRODStore_Code=MProduct_Code=120241Category_Code=amddc They won't last long Al Hopper Logical Approach Inc, Plano, TX. [EMAIL PROTECTED] Voice: 972.379.2133 Fax: 972.379.2134 Timezone: US CDT OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005 OpenSolaris Governing Board (OGB) Member - Feb 2006 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Quotas and Snapshots
> First, ZFS allows one to take advantage of large, inexpensive Serial ATA disk drives. Paraphrased: ZFS loves large, cheap SATA disk drives. So the first part of the solution looks (to me) as simple as adding some cheap SATA disk drives.

I hope not. We have quotas available for a reason. There are legitimate reasons for putting an administrative upper bound on storage space. It's not always about disk acquisition costs.

The ability to have the user own their space and the administrator own the snapshot would look good to me. If the user owns both, I expect them to trade snapshots for space. I would prefer to be able to guarantee snapshots as part of a defined recovery system without limiting access to their quota.

> Next, after extra storage space has been added to the pool, it's a simple matter of accounting to subtract the size of the ZFS snapshots from the users' disk space to calculate their actual live storage and bill for it! Next, periodic snapshots can be made and older snapshots either deleted or moved to even lower cost storage media (e.g., tape, CDROMs, DVDs etc).

I don't care how much bigger you make it. At some point the space will be used. Then the user deletes a subtree and doesn't get any of the space back because of snapshots and I don't want the snapshots deleted.

> Next - 50Mb quotas!?? You've got to be kidding. Let me check my calendar; yep - it's 2006. You are kidding ... right? If you're not kidding, then you've got a business/management issue and not a technical issue to resolve.

So pretend it's 500G. The suggestions still seem very valid to me.

--
Darren Dunham   [EMAIL PROTECTED]
Senior Technical Consultant   TAOS   http://www.taos.com/   Got some Dr Pepper?
San Francisco, CA bay area
This line left intentionally blank to confuse you.
Re: [zfs-discuss] Re: Quotas and Snapshots
> First, ZFS allows one to take advantage of large, inexpensive Serial ATA disk drives. Paraphrased: ZFS loves large, cheap SATA disk drives. So the first part of the solution looks (to me) as simple as adding some cheap SATA disk drives. [...] Next, after extra storage space has been added to the pool, it's a simple matter of accounting to subtract the size of the ZFS snapshots from the users' disk space to calculate their actual live storage and bill for it! [...] Next - 50Mb quotas!?? You've got to be kidding. [...] If you're not kidding, then you've got a business/management issue and not a technical issue to resolve.

Our problem isn't that we don't have enough storage space. Our problem is that the snapshots reduce the filesystem space available to users. Simply creating the snapshot begins squeezing them out of their own space. Just billing people for it doesn't solve the problem. Simply giving people more space doesn't solve the problem, either.

I did think of another option -- allow filesystems to dynamically grow their quotas themselves by the size of the snapshots. This might be easiest to implement -- just add a toggle in the zfs set/get parameters. If it's integrated into the filesystem itself, keeping up with the changing size of the data should be a non-issue. Actually, I guess that's fairly similar to the separate-quota-for-snapshot-data idea.

BP
--
[EMAIL PROTECTED]
Re: [zfs-discuss] Proposal: delegated administration
I would like to make a couple of additions to the proposed model.

Permission Sets.

Allow the administrator to define a named set of permissions, and then use the name as a permission later on. Permission sets would be evaluated dynamically, so that changing the set definition would change what is allowed everywhere the set is used. Permission sets would need a special character to differentiate them from a normal permission. I would like to recommend we use the '@' character for this.

The -s option will be used to manipulate a named set.

  # zfs allow -s @setname perm,perm... dataset
  # zfs unallow -s @setname perm,perms... dataset
  # zfs unallow -s @setname dataset

Set Examples:

First we need to define the set (@myset):

  # zfs allow -s @myset create,destroy,snapshot,clone datapool

Now let group staff use the named set (@myset):

  # zfs allow staff @myset datapool

You could also mix a named set with a normal permission list:

  # zfs allow fred @myset,mount,promote datapool/fred

When a permission set is evaluated, the nearest ancestor that defines the named set would be used.

Permission printing.

With permission sets, displaying the various permissions becomes a little messy. I would like to propose the following format. It's a bit verbose, but it is readable.

  -------------------------------------------------------------
  Permission sets on (pool/fred)
          @set1   create,destroy,snapshot,mount,clone,promote,rename
          @simple create,mount
  Create time permissions on (pool/fred)
          @set1,mountpoint
  Local permissions on (pool/fred)
          user tom @set1
          user joe create,destroy,mount
  Local+Descendent permissions on (pool/fred)
          user fred @basic,share,rename
  Descendent permissions on (pool/fred)
          user barney @basic
          group staff @basic
  -------------------------------------------------------------
  Permission sets on (pool)
          @simple create,destroy,mount
  Local permissions on (pool)
          group staff @simple
  -------------------------------------------------------------

Mark Shellenbaum wrote:
> The following is the delegated admin model that Matt and I have been working on. At this point we are ready for your feedback on the proposed model.
>
> -Mark
>
> PERMISSION GRANTING
>
>   zfs allow [-l] [-d] everyone|user|group ability[,ability...] \
>       dataset
>   zfs allow [-l] [-d] -u user ability[,ability...] dataset
>   zfs allow [-l] [-d] -g group ability[,ability...] dataset
>   zfs allow [-l] [-d] -e ability[,ability...] dataset
>   zfs allow -c ability[,ability...] dataset
>
> If no flags are used, the ability will be allowed for the specified dataset and all of its descendents.
>
> -l "Local" means that the permission will be allowed for the specified dataset, and not its descendents (unless -d is also specified).
>
> -d "Descendents" means that the permission will be allowed for descendent datasets, and not for this dataset (unless -l is also specified). (needed for 'zfs allow -d ahrens quota tank/home/ahrens')
>
> When using the first form (without -u, -g, or -e), the everyone|user|group argument will be interpreted as the keyword "everyone" if possible, then as a user if possible, then as a group if possible. The -u user, -g group, and -e (everyone) forms allow one to specify a user named "everyone", or a group whose name conflicts with a user (or "everyone"). (note: the -e form is not necessary since 'zfs allow everyone' will always mean the keyword everyone, not the user everyone.)
>
> As a possible extension, multiple who's could be allowed in one command (eg. 'zfs allow -u ahrens,marks create tank/project')
>
> -c "Create" means that the permission will be granted (Locally) to the creator on any newly-created descendant filesystems.
>
> Abilities are mostly self explanatory: the ability to run 'zfs [set] ability ds'. Note, this implicitly collapses the subcommand and property namespaces into one. (I think that the 'set' is superfluous anyway; it would be more convenient to say 'zfs property=value' anyway.)
>
>   create       create descendent datasets
>   destroy
>   snapshot
>   rollback
>   clone        create clone of any of the ds's snaps
>                (must also have 'create' ability in clone's parent)
>   promote      (must also have 'promote' ability in origin fs)
>   rename       (must also have 'create' ability in new parent)
>   mount        mount and unmount the ds
>   share        share and unshare this ds
>   send         send any of the ds's snapshots
>   receive      create a descendent with 'zfs receive'
>                (must also have 'create' ability)
>   quota
>   reservation
>   volsize
>   recordsize
>   mountpoint
>   sharenfs
>   checksum
>   compression
>   atime
>   devices
Re: [zfs-discuss] ON build on Blade 1500 ATA disk extremely slow
On Tue, 2006-07-25 at 14:36, Rainer Orth wrote:
> Perhaps lockstat(1M) should be updated to include something like this in the EXAMPLES section.

I filed 6452661 with this suggestion.

> Any word when this might be fixed?

I can't comment in terms of time, but the engineer working on it has a partially tested fix; he needs to complete testing and integrate the fix.. not clear how long this will take.

- Bill
Re: [zfs-discuss] ON build on Blade 1500 ATA disk extremely slow
Bill,

> > Perhaps lockstat(1M) should be updated to include something like this in the EXAMPLES section.
>
> I filed 6452661 with this suggestion.

excellent, thanks.

> > Any word when this might be fixed?
>
> I can't comment in terms of time, but the engineer working on it has a partially tested fix; he needs to complete testing and integrate the fix.. not clear how long this will take.

No problem: I can use that IEEE 1394 disk for now. Good to know that this is being worked on, though.

        Rainer

-----------------------------------------------------------------------------
Rainer Orth, Faculty of Technology, Bielefeld University
Re: [zfs-discuss] Re: Quotas and Snapshots
On Tue, Jul 25, 2006 at 11:13:16AM -0700, Brad Plecs wrote: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6431277 What I'd really like to see is ... the ability for the snapshot space to *not* impact the filesystem space). Yep, as Eric mentioned, that is the purpose of this RFE (want filesystem-only quotas). I imagine that this would be implemented as a quota against the space referenced (as currently reported by 'zfs list', 'zfs get refer', 'df', etc; see the zfs(1m) manpage for details). in fact, I think a lot of ZFS's hierarchical features would be more valuable if parent filesystems included their descendants (backups and NFS sharing, for example), but I'm sure there are just as many arguments against that as for it. Yep, we're working on making more features work on this and all descendents. For example, the recently implemented 'zfs snapshot -r' can create snapshots of a filesystem and all its descendents. This feature will be part of Solaris 10 update 3. We're also working on 'zfs send -r' (RFE 6421958). --matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
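Once a build with that feature is available, the recursive form is just an extra flag (a sketch; the pool/filesystem and snapshot names are examples):

  # snapshot a filesystem and every descendent in one step
  zfs snapshot -r tank/home@20060725

  # confirm what was created
  zfs list -t snapshot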
[zfs-discuss] How to best layout our filesystems
Our application Canary has approx 750 clients uploading to the server every 10 mins, that's approx 108,000 gzip tarballs per day writing to the /upload directory. The parser untars the tarball, which consists of 8 ascii files, into the /archives directory. /app is our application and tools (apache, tomcat, etc) directory. We also have batch jobs that run throughout the day; I would say we read 2 to 3 times more than we write. Since we have an alternate server, downtime or data loss is somewhat acceptable. How can we best lay out our filesystems to get the most performance?

directory info
--------------
  /app      - 30G
  /upload   - 10G
  /archives - 35G

HW info
-------
  System Configuration: Sun Microsystems sun4v Sun Fire T200
  System clock frequency: 200 MHz
  Memory size: 8184 Megabytes
  CPU: 32 x 1000 MHz SUNW,UltraSPARC-T1
  Disks: 4x68G  Vendor: FUJITSU  Product: MAV2073RCSUN72G  Revision: 0301

We plan on using 1 disk for the OS, the other 3 disks for canary filesystems /app, /upload, and /archives. Should I create 3 pools, i.e.

  zpool create canary_app c1t1d0
  zpool create canary_upload c1t2d0
  zpool create canary_archives c1t3d0

--OR-- create 1 pool using a dynamic stripe, i.e.

  zpool create canary c1t1d0 c1t2d0 c1t3d0

--OR-- create a single-parity raid-z pool, i.e.

  zpool create canary raidz c1t1d0 c1t2d0 c1t3d0

Which option gives us the best performance? If there's another method that's not mentioned, please let me know. Also, should we enable read/write cache on the OS disk as well as the other disks?

Is build 9 in S10U2 RR?? If not, please point me to the OS image on nana.eng.

Thanks,
karen
Re: [zfs-discuss] How to best layout our filesystems
Given the amount of I/O, wouldn't it make sense to get more drives involved, or something that has cache on the front end, or both? If you're really pushing the amount of I/O you're alluding to - hard to tell without all the details - then you're probably going to hit a limitation on the drive IOPS. (Even with the cache on.)

Karen Chau wrote:
> Our application Canary has approx 750 clients uploading to the server every 10 mins, that's approx 108,000 gzip tarballs per day writing to the /upload directory. The parser untars the tarball, which consists of 8 ascii files, into the /archives directory. /app is our application and tools (apache, tomcat, etc) directory. We also have batch jobs that run throughout the day; I would say we read 2 to 3 times more than we write.
Re: [zfs-discuss] Re: Quotas and Snapshots
On Tue, Jul 25, 2006 at 07:24:51PM -0500, Mike Gerdts wrote:
> On 7/25/06, Brad Plecs [EMAIL PROTECTED] wrote:
> > What I'd really like to see is ... the ability for the snapshot space to *not* impact the filesystem space).
>
> The idea is that you have two storage pools - one for live data, one for backup data. Your live data is *probably* on faster disks than your backup data. The live data and backup data may or may not be on the same server. Whenever you need to perform backups you do something along the lines of:
>
>   yesterday=$1
>   today=$2
>
>   for user in $allusers ; do
>       zfs snapshot users/[EMAIL PROTECTED]
>       zfs snapshot backup/$user/[EMAIL PROTECTED]
>       zfs clone backup/$user/[EMAIL PROTECTED] backup/$user/$today
>       rsync -axuv /users/$user/.zfs/snapshot/$today /backup/$user/$today
>       zfs destroy users/[EMAIL PROTECTED]
>       zfs destroy backup/$user/$lastweek
>   done

You can simplify and improve the performance of this considerably by using 'zfs send':

  for user in $allusers ; do
      zfs snapshot users/[EMAIL PROTECTED]
      zfs send -i $yesterday users/[EMAIL PROTECTED] | \
          ssh $host zfs recv -d $backpath
      ssh $host zfs destroy $backpath/$user/$lastweek
  done

You can send the backup to the same or different host, and the same or different pool, as your hardware needs dictate. 'zfs send' will be much faster than rsync because we can use ZFS metadata to determine which blocks were changed without traversing all files and directories.

--matt
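(The list archive has mangled the snapshot names after the '@' sign into [EMAIL PROTECTED]. A cleaned-up sketch of the same send loop, assuming the snapshots were simply named for the day, and that $allusers, $host, $backpath and $lastweek are set elsewhere in the script:)

  yesterday=$1
  today=$2

  for user in $allusers ; do
      zfs snapshot users/$user@$today
      # send only the blocks changed since yesterday's snapshot
      zfs send -i $yesterday users/$user@$today | \
          ssh $host zfs recv -d $backpath
      # retire last week's copy on the backup host
      ssh $host zfs destroy $backpath/$user/$lastweek
  done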
Re: [zfs-discuss] Re: Quotas and Snapshots
On 7/25/06, Matthew Ahrens [EMAIL PROTECTED] wrote: You can simplify and improve the performance of this considerably by using 'zfs send': for user in $allusers ; do zfs snapshot users/[EMAIL PROTECTED] zfs send -i $yesterday users/[EMAIL PROTECTED] | \ ssh $host zfs recv -d $backpath ssh $host zfs destroy $backpath/$user/$lastweek done You can send the backup to the same or different host, and the same or different pool, as your hardware needs dictate. 'zfs send' will be much faster than rsync because we can use ZFS metadata to determine which blocks were changed without traversing all files directories. This is what I had originally intended to say, but it seems with this approach the yesterday snapshot has to stick around in order to do incrementals. The fact that snapshots counted against current quota was part of the problem statement. My approach with rsync avoids this but, as I said before, is an ugly hack because it doesn't use the features of zfs. Mike -- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to best layout our filesystems
Hi Torrey; we are the cobbler's kids. We borrowed this T2000 from Niagara engineering after we did some performance tests for them. I am trying to get a thumper to run this data set; this could take up to 3-4 months. Today we are watching 750 Sun Ray servers and 30,000 employees. Let's see:

1) Solaris 10
2) ZFS version 6
3) T2000 32x1000 with the poorer performing drives that come with the Niagara

We need a short term solution. Niagara engineering has given us two more of the internal drives so we can max out the Niagara with 4 internal drives. This is the hardware we need to use this week. When we get a new box and more drives we will reconfigure. Our graphs have 5000 data points per month, 140 data points per day; we can stand to lose data.

My suggestion was one drive as the system volume and the remaining three drives as one big zfs volume, probably raidz.

thanks
sean

Torrey McMahon wrote:
> Given the amount of I/O, wouldn't it make sense to get more drives involved, or something that has cache on the front end, or both? If you're really pushing the amount of I/O you're alluding to - hard to tell without all the details - then you're probably going to hit a limitation on the drive IOPS. (Even with the cache on.)
>
> Karen Chau wrote:
> > Our application Canary has approx 750 clients uploading to the server every 10 mins, that's approx 108,000 gzip tarballs per day writing to the /upload directory. The parser untars the tarball, which consists of 8 ascii files, into the /archives directory. /app is our application and tools (apache, tomcat, etc) directory. We also have batch jobs that run throughout the day; I would say we read 2 to 3 times more than we write.

--
Sean Meighan
Mgr ITSM Engineering
Sun Microsystems, Inc.
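Concretely, with the disk names from Karen's original post, that layout might look like the following (a sketch; the mountpoints are assumptions based on the directory list above):

  # one disk stays as the system/boot disk; the other three form one raidz pool
  zpool create canary raidz c1t1d0 c1t2d0 c1t3d0

  # one filesystem per tree, so each can get its own properties later
  zfs create canary/app
  zfs create canary/upload
  zfs create canary/archives
  zfs set mountpoint=/app canary/app
  zfs set mountpoint=/upload canary/upload
  zfs set mountpoint=/archives canary/archives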