Re: [zfs-discuss] periodic slow responsiveness
On 08/09/2009, at 2:01 AM, Ross Walker wrote:

On Sep 7, 2009, at 1:32 AM, James Lever wrote:

Well, an MD1000 holds 15 drives; a good compromise might be 2 x 7-drive RAIDZ2s with a hot spare... That should provide 320 IOPS instead of 160, a big difference.

The issue is interactive responsiveness and whether there is a way to tune the system to give that while still having good performance for builds when they are run.

Look at the write IOPS of the pool with 'zpool iostat -v' and look at how many are happening on the RAIDZ2 vdev.

I was suggesting that slog writes were possibly starving reads from the l2arc as they were on the same device. This appears not to have been the issue, as the problem has persisted even with the l2arc devices removed from the pool.

The SSD will handle a lot more IOPS than the pool, and L2ARC is a lazy reader; it mostly just holds on to read cache data. It may just be that the pool configuration can't handle the write IOPS needed and reads are starving.

Possible, but hard to tell. Have a look at the iostat results I've posted.

The busy times of the disks while the issue is occurring should let you know.

So it turns out that the problem is that all writes coming via NFS are going through the slog. When that happens, the transfer speed to the device drops to ~70MB/s (the write speed of this SLC SSD) and, until the load drops, all new write requests are blocked, causing a noticeable delay (which has been observed to be up to 20s, but is generally only 2-4s).

I can reproduce this behaviour by copying a large file (hundreds of MB in size) using 'cp src dst' on an NFS (still currently v3) client and observing that all data is pushed through the slog device (a 10GB partition of a Samsung 50GB SSD behind a PERC 6/i w/256MB BBC) rather than going direct to the primary storage disks.

On a related note, I had 2 of these devices (both using just 10GB partitions) connected as log devices (so the pool had 2 separate log devices) and the second one was consistently running significantly slower than the first. Removing the second device improved performance, but did not remove the occasional observed pauses.

I was of the (mis)understanding that only metadata and writes smaller than 64k went via the slog device in the event of an O_SYNC write request?

The clients are (mostly) RHEL5.

Is there a way to tune this on the NFS server or clients such that when I perform a large synchronous write, the data does not go via the slog device? I have investigated using the logbias setting, but that will just kill small-file performance on any filesystem using it and defeat the purpose of having a slog device at all.

cheers,
James

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
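[Editor's note: a minimal sketch of the per-dataset logbias approach mentioned above. The pool and dataset names (tank/bulk, tank/home) are hypothetical; the point is only that logbias is a per-filesystem property, so data that receives large bulk synchronous writes could live on a dataset with logbias=throughput while latency-sensitive small writes elsewhere keep using the slog:

  # zfs create tank/bulk
  # zfs set logbias=throughput tank/bulk     # large sync writes on this dataset bypass the dedicated slog
  # zfs get logbias tank/bulk tank/home      # other datasets stay at the default 'latency'

Whether this helps depends on being able to separate the bulk-write data onto its own filesystem.]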
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Would I just do the following then:

> zpool create -f zone1 c1t1d0s0
> zfs create zone1/test1
> zfs create zone1/test2

Would I then use zfs set quota=xxxG to handle disk usage? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Mike Gerdts wrote: On Wed, Sep 23, 2009 at 7:32 AM, bertram fukuda wrote: Thanks for the info Mike. Just so I'm clear. You suggest 1) create a single zpool from my LUN 2) create a single ZFS filesystem 3) create 2 zones in the ZFS filesystem. Sound right? Correct

Well, I would actually recommend creating a dedicated zfs file system for each zone (which zoneadm should do for you anyway). The reason is that it is then much easier to get information on how much storage each zone is using, you can set a quota or reservation on storage for each zone independently, you can easily clone each zone, snapshot it, etc.

-- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
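[Editor's note: a brief illustration of the per-zone benefits Robert describes; the dataset names below (zonepool/zone1) are illustrative:

  # zfs set quota=50G zonepool/zone1            # cap how much this zone can use
  # zfs set reservation=10G zonepool/zone1      # guarantee it a minimum amount of space
  # zfs snapshot zonepool/zone1@prepatch        # cheap checkpoint before patching
  # zfs list -o name,used,quota,reservation -r zonepool

None of this is possible per zone if both zones share a single filesystem.]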
Re: [zfs-discuss] Checksum property change does not change pre-existing data - right?
On Wed, 23 Sep 2009, Ray Clark wrote: My understanding is that if I "zfs set checksum=" to change the algorithm that this will change the checksum algorithm for all FUTURE data blocks written, but does not in any way change the checksum for previously written data blocks. This is correct. The same applies to blocksize and compression. I need to corroborate this understanding. Could someone please point me to a document that states this? I have searched and searched and cannot find this. Sorry, I am not aware of a document and don't have time to look. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
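[Editor's note: to make Bob's point concrete, a small sketch with a hypothetical dataset name (tank/data). The property change only affects blocks written afterwards; blocks already on disk keep their original checksums until they are rewritten, for example by a send/receive into a new dataset or by copying the files:

  # zfs set checksum=sha256 tank/data       # future writes use sha256
  # zfs get checksum tank/data
  # zfs snapshot tank/data@rewrite
  # zfs send tank/data@rewrite | zfs receive tank/data-new
  # the received copy is freshly written, so its blocks carry whatever
  # checksum (and compression, recordsize) settings the target dataset has

The same "future writes only" behaviour applies to the compression and recordsize properties.]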
[zfs-discuss] Checksum property change does not change pre-existing data - right?
My understanding is that if I use "zfs set checksum=" to change the algorithm, this will change the checksum algorithm for all FUTURE data blocks written, but will not in any way change the checksum for previously written data blocks. I need to corroborate this understanding. Could someone please point me to a document that states this? I have searched and searched and cannot find this. Thank you. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Wed, Sep 23, 2009 at 8:12 PM, David Magda wrote:
> On Sep 23, 2009, at 20:48, bertram fukuda wrote:
>
>> What if we have no plans on moving or cloning the zone, I can get away
>> with only one pool, right?
>
> Sure.
>
>> If I'm doing a separate FS for each zone is it just slice up my LUN,
>> create a FS for each zone and I'm done?
>
> One pool from the LUN ('zpool create zonepool c2t0d0'), but within that
> pool you do a 'zfs create' for each zone (I think zoneadm can do this
> automatically as well):
>
> zfs create zonepool/zone1
> zfs create zonepool/zone2
> zfs create zonepool/zone3
>
> This allows you to do snapshots and rollbacks on the zone for things like
> patching and other major changes.

Agreed. And it allows you to do migrations sometime in the future with

host1# zoneadm -z zone1 detach
host1# zfs snapshot zonepool/zone1@migrate
host1# zfs send -R zonepool/zone1@migrate \
       | ssh host2 zfs receive zones/zone1@migrate
host2# zonecfg -z zone1 create -a /zones/zone1
host2# zoneadm -z zone1 attach
host2# zoneadm -z zone1 boot

-- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Sep 23, 2009, at 20:48, bertram fukuda wrote:

What if we have no plans on moving or cloning the zone, I can get away with only one pool, right?

Sure.

If I'm doing a separate FS for each zone is it just slice up my LUN, create a FS for each zone and I'm done?

One pool from the LUN ('zpool create zonepool c2t0d0'), but within that pool you do a 'zfs create' for each zone (I think zoneadm can do this automatically as well):

zfs create zonepool/zone1
zfs create zonepool/zone2
zfs create zonepool/zone3

This allows you to do snapshots and rollbacks on the zone for things like patching and other major changes. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
David, What if we have no plans on moving or cloning the zone? Can I get away with only one pool then? If I'm doing a separate FS for each zone, do I just slice up my LUN, create a FS for each zone, and I'm done? Thanks, Bert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] You're Invited: OpenSolaris Security Summit
To: Developers and Students You are invited to participate in the first OpenSolaris Security Summit OpenSolaris Security Summit Tuesday, November 3rd, 2009 Baltimore Marriott Waterfront 700 Aliceanna Street Baltimore, Maryland 21202 Join us as we explore the latest trends of OpenSolaris Security technologies, as well as key insights from security community members, technologists, and users. You will also have the unique opportunity to hear from our keynote speaker William Cheswick, Lead Member of the Technical Staff at AT&T labs Bio: Ches is an early innovator in Internet security. He is known for his work in firewalls, proxies, and Internet mapping at Bell Labs and Lumeta Corp. He is best known for the book he co-authored with Steve Bellovin and now Avi Rubin, Firewalls and Internet Security; Repelling the Wily Hacker. Ches is now a member of the technical staff at AT&T Labs - Research in Florham Park, NJ, where he is working on security, visualization, user interfaces, and a variety of other things. Registration is now available http://wikis.sun.com/display/secsummit09/ http://www.usenix.org/events/lisa09/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] URGENT: very high busy and average service time with ZFS and USP1100
Thanks Richard and Jim,

Your answers helped me to show the customer that there was no issue with ZFS and the HDS. I went onsite to see the problem and, as Jim suggested, the customer just saw the %b and average service time and thought there was a problem.

The server is running an Oracle DB, and the 2 zfs file systems showing a lot of activity were the one with the database files and the one for the redo logs. For the DB file system, the recordsize is set to 8k, and that's why we see around 2000 IOPS with an asvc_t of 10 ms. The redo log file system has the default recordsize of 128k, so we see far fewer IOPS, the same transfer rate, and an asvc_t of 30 ms. Everything is normal on this system.

I showed that the I/O activity while the DB was not responding correctly was no different from the rest of the day. I still don't know where the problem ultimately was, but it seems to have been solved now.

Regards, Javier

Richard Elling wrote: comment below...

On Sep 22, 2009, at 9:57 AM, Jim Mauro wrote:

Cross-posting to zfs-discuss. This does not need to be on the confidential alias. It's a performance query - there's nothing confidential in here. Other folks post performance queries to zfs-discuss.

Forget %b - it's useless. It's not the bandwidth that's hurting you, it's the IOPS. One of the hot devices did 1515.8 reads-per-second, the other did over 500. Is this Oracle? You never actually tell us what the huge performance problem is - what's the workload, what's the delivered level of performance? IO service times in the 22-32 millisecond range are not great, but not the worst I've seen. Do you have any data that connects the delivered performance of the workload to an IO latency issue, or did the customer just run "iostat", see "100 %b", and assume this was the problem?

I need to see zpool stats. Is each of these c3txx devices actually a raid 7+1 (which means 7 data disks and 1 parity disk)?

There's nothing here that tells us there's something that needs to be done on the ZFS side. Not enough data. It looks like a very lopsided IO load distribution problem. You have 8 c3tXX LUN devices, 2 of which are getting slammed with IOPS, while the other 6 are relatively idle.

Thanks, /jim

Javier Conde wrote:

Hello, IHAC with a huge performance problem in a newly installed M8000 configured with a USP1100 and ZFS. From what we can see, 2 disks used in different zpools are 100% busy and the average service time is also quite high (between 30 and 5 ms).

     r/s    w/s     kr/s    kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
     0.0   11.4      0.0   224.1   0.0   0.2     0.0    20.7   0    5  c3t5000C5000F94A607d0
     0.0   11.8      0.0   224.1   0.0   0.3     0.0    24.2   0    6  c3t5000C5000F94E38Fd0
     0.2    0.0     25.6     0.0   0.0   0.0     0.0     7.9   0    0  c3t60060E8015321F01321F0032d0
     0.0    3.6      0.0    20.8   0.0   0.0     0.0     0.5   0    0  c3t60060E8015321F01321F0020d0
     0.2   24.0     25.6   488.0   0.0   0.0     0.0     0.6   0    1  c3t60060E8015321F01321F001Cd0
    11.4    0.8     92.8     8.0   0.0   0.0     0.0     3.9   0    4  c3t60060E8015321F01321F0019d0
   573.4    0.0  73395.5     0.0   0.0  20.6     0.0    36.0   0  100  c3t60060E8015321F01321F000Bd0

avg read size ~128 kBytes... which is good

     0.8    0.8    102.4     8.0   0.0   0.0     0.0    22.8   0    4  c3t60060E8015321F01321F0008d0
  1515.8   10.2  30420.9   148.0   0.0  34.9     0.0    22.9   1  100  c3t60060E8015321F01321F0006d0

avg read size ~20 kBytes... not so good

These look like single-LUN pools. What is the workload?

     0.4    0.4     51.2     1.6   0.0   0.0     0.0     5.1   0    0  c3t60060E8015321F01321F0055d0

The USP1100 is configured with a raid 7+1, which is the default recommendation. Check the starting sector for the partition.
For older OpenSolaris and Solaris 10 installations, the default starting sector is 34, which has the unfortunate effect of misaligning with most hardware RAID arrays. For newer installations, the default starting sector is 256, which has a better chance of aligning with hardware RAID arrays. This will be more pronounced when using RAID-5. To check, look at the partition table in format(1m) or prtvtoc(1m).

BTW, the customer is surely not expecting super database performance from RAID-5, are they?

The data transferred is not very high, between 50 and 150 MB/sec. Is it normal to see the disks busy at 100% all the time and the average time always greater than 30 ms? Is there something we can do from the ZFS side? We have followed the recommendations regarding the block size for the database file systems, we use 4 different zpools for the DB, indexes, redolog and archive logs, the vdev_cache_bshift is set to 13 (8k blocks)...

hmmm... what OS release? The vdev cache should only read metadata, unless you are running on an old OS. In other words, the solution which suggests changing vdev_cache_bshift has been
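[Editor's note: a quick way to check the alignment Richard mentions; the device name below is just one example taken from the iostat output, and the interesting field is the first sector of the slice ZFS is using:

  # prtvtoc /dev/rdsk/c3t60060E8015321F01321F0006d0s2

If the ZFS slice starts at sector 34 (the old EFI default) rather than a value that lines up with the array's RAID-5 segment size, ZFS records can straddle segment boundaries on the array and generate extra back-end I/O.]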
[zfs-discuss] zfs snapshot -r panic on b114
While a resilver was running, we attempted a recursive snapshot which resulted in a kernel panic: panic[cpu1]/thread=ff00104c0c60: assertion failed: 0 == zap_remove_int(mos, next_clones_obj, dsphys->ds_next_snap_obj, tx) (0x0 == 0x2), file: ../../common/ fs/zfs/dsl_dataset.c, line: 1869 ff00104c0960 genunix:assfail3+c1 () ff00104c0a00 zfs:dsl_dataset_snapshot_sync+4a2 () ff00104c0a50 zfs:snapshot_sync+41 () ff00104c0aa0 zfs:dsl_sync_task_group_sync+eb () ff00104c0b10 zfs:dsl_pool_sync+196 () ff00104c0ba0 zfs:spa_sync+32a () ff00104c0c40 zfs:txg_sync_thread+265 () ff00104c0c50 unix:thread_start+8 () System is a X4100M2 running snv_114. Any ideas? -- albert chin (ch...@thewrittenword.com) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] libzfs.h versioning
Richard, I compared the libzfs_jni source code and they're pretty different from what we're doing. libzfs_jni is essentially a jni wrapper to (yet?) another set of zfs-related programs written in C. zfs for Java, on the other hand, is a Java wrapper to the functionality of (and only of) libzfs. I suppose that libzfs_jni capabilities could be implemented on top of zfs for java but the approach is pretty different: the main difference is the purpose of the exposed methods: libzfs is the interface to ZFS and its methods are low level while libzfs_jni exposes a set of operations which are coarse grained and targeted to management. Nevertheless, the functionality provided by libzfs_jni is interesting and I'd like to build something similar by using zfs for java. Personally, I'm doing this for two reasons: having a libzfs wrapper for Java seems like a good thing to have and I'd like to use to build some management interfaces (such as web but not only) instead on having to rely on shell scripting with zfs and zpool commands. I'll keep an eye to libzfs_jni. Now, to return to the original question, I haven't found a way to correlate libzfs.h versions (and dependencies) to Nevada releases. At the moment, I'm willing to extract information from a sysinfo call (any suggestion about a better way?) and the next step, whose logic I'm missing, is how to correlate this information with to a concrete libzfs.h version from openGrok: maybe it's just trivial, but I do not find it. Have you got some information to help me address this problem? Thanks, Enrico On Fri, Sep 11, 2009 at 12:53 AM, Enrico Maria Crisostomo wrote: > On Fri, Sep 11, 2009 at 12:26 AM, Richard Elling > wrote: >> On Sep 10, 2009, at 1:03 PM, Peter Tribble wrote: >>> >>> On Thu, Sep 10, 2009 at 8:52 PM, Richard Elling >>> wrote: Enrico, Could you compare and contrast your effort with the existing libzfs_jni? http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs_jni/common/ >>> >>> Where's the source for the java code that uses that library? >> >> Excellent question! It is used for the BUI ZFS manager webconsole that >> comes with S10 and SXCE. So you might find the zfs.jar as >> /usr/share/webconsole/webapps/zfs/WEB-INF/lib/zfs.jar >> The jar only contains the class files, though. > Yes, that's what I thought when I saw it. Furthermore, the last time I > tried it was still unaligned with the new ZFS capabilites: it crashed > because of an unknown gzip compression type... > >> >> Someone from Sun could comment on the probability that they >> will finally get their act together and have a real BUI framework for >> systems management... they've tried dozens (perhaps hundreds) >> of times, with little to show for the effort :-( > By the way, one of the goals I'd like to reach with such kind of > library is just that: putting the basis for building a java based > management framework for ZFS. Unfortunately wrapping libzfs will > hardly fulfill this goal and the more I dig into the code the more I > realize that we will need to wrap (or reimplement) some of the logic > of the zfs and zpool commands. I'm also confident that building a good > library on top of this wrapper will give us a very powerful tool to > play with from Java. > >> -- richard >> >> > > > > -- > Ελευθερία ή θάνατος > "Programming today is a race between software engineers striving to > build bigger and better idiot-proof programs, and the Universe trying > to produce bigger and better idiots. So far, the Universe is winning." 
> GPG key: 1024D/FD2229AF > -- Ελευθερία ή θάνατος "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." GPG key: 1024D/FD2229AF ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
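[Editor's note: on the question of mapping a running system to an ON build (and hence to a particular libzfs.h), one low-tech illustration is to read the kernel version string, which on Nevada-based systems carries the build number and can then be matched against the corresponding tag in OpenGrok:

  $ uname -r       # 5.11
  $ uname -v       # e.g. snv_111b on a Nevada/OpenSolaris system,
                   # Generic_139556-08 on Solaris 10

The same string is what sysinfo(SI_VERSION) returns, so a JNI or libzfs wrapper could obtain it programmatically rather than shelling out. This is offered only as a possible starting point, not as the mapping mechanism the thread settles on.]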
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Sep 23, 2009, at 08:13, Mike Gerdts wrote: That is, at time t1 I have zones z1 and z2 on host h1. I think that at some time in the future I would like to move z2 to host h2 while leaving z1 on h1. You can have a single pool, but it's probably good to have each zone in its own file system. As mentioned in another message this would allow you to delegate (if so desired), but what it also allows you to do is move the zone later on (also with a 'zfs send / recv'): http://prefetch.net/blog/index.php/2006/09/27/moving-solaris-zones/ http://blogs.sun.com/gz/entry/how_to_move_a_solaris It also allows you to clone zones if you want cookie-cutter configurations (or even have a base set up that's common to multiple zones): http://www.cuddletech.com/blog/pivot/entry.php?id=751 http://www.google.com/search?q=zone+clone+zfs ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
> zfs share -a Ah-ha! Thanks. FYI, I got between 2.5x and 10x improvement in performance, depending on the test. So tempting :) -Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
On 23 September, 2009 - Scott Meilicke sent me these 0,5K bytes: > Thank you both, much appreciated. > > I ended up having to put the flag into /etc/system. When I disabled > the ZIL and umount/mounted without a reboot, my ESX host would not see > the NFS export, nor could I create a new NFS connection from my ESX > host. I could get into the file system from the host itself of course. zfs share -a /Tomas -- Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
Thank you both, much appreciated. I ended up having to put the flag into /etc/system. When I disabled the ZIL and umount/mounted without a reboot, my ESX host would not see the NFS export, nor could I create a new NFS connection from my ESX host. I could get into the file system from the host itself of course. -Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Cindy: AWESOME! Didn't know about that property, I'll make sure I set it :). All I did to replace the drives was to power off the machine (the failed drive had hard-locked the SCSI bus, so I had to anyways). Once the machine was powered off, I pulled the bad drive, inserted the new drive, and powered the machine on. That's when the machine came up showing the pool in a corrupted state. I'm assuming if I had removed the old drive, booted it with the drive missing, let it come up DEGRADED, and then inserted the new drive and did a zpool replace, it would have been fine. So I was going by the guess that zpool didn't know that the disk was replaced, and I was just curious why. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
On 23 Sep 2009, at 19:07, Neil Perrin wrote:

On 09/23/09 10:59, Scott Meilicke wrote: How can I verify if the ZIL has been disabled or not? I am trying to see how much benefit I might get by using an SSD as a ZIL. I disabled the ZIL via the ZFS Evil Tuning Guide: echo zil_disable/W0t1 | mdb -kw

- this only temporarily disables the zil until the reboot. In fact it has no effect unless file systems are remounted, as the variable is only looked at on mount.

Scott, just set it; zfs umount xxx; zfs mount xxx; and then run your experiment. Directly compare the fast/incorrect xxx dataset with a slower/correct yyy mount point. No need to reboot.

and then rebooted. However, I do not see any benefits for my NFS workload.

To set zil_disable from boot put the following in /etc/system and reboot: set zfs:zil_disable=1

Actually you need to have these 2 lines or it won't work:

* TEMPORARY zil disable on non-production system; Sept 23 for a test by
set zfs:zil_disable=1

-r

Neil

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Dustin, You didn't describe the process that you used to replace the disk, so it's difficult to comment on what happened. In general, you physically replace the disk and then let ZFS know that the disk is replaced, like this:

# zpool replace pool-name device-name

This process is described here: http://docs.sun.com/app/docs/doc/819-5461/gazgd?a=view

If you want to reduce the steps in the future, you can enable the autoreplace property on the pool, and all you need to do is physically replace the disks in the pool.

Cindy

On 09/23/09 11:23, Dustin Marquess wrote: Tim: I couldn't do a zpool scrub, since the pool was marked as UNAVAIL. Believe me, I tried :) Bob: Ya, I realized that after I clicked send. My brain was a little frazzled, so I completely overlooked it. Solaris 10u7 - Sun E450 ZFS pool version 10 ZFS filesystem version 3 -Dustin ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
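[Editor's note: an illustration of the autoreplace property Cindy describes; the pool name "tank" is a placeholder:

  # zpool set autoreplace=on tank
  # zpool get autoreplace tank

With autoreplace=on, a new device found in the same physical location as the failed one is automatically formatted and resilvered, so no explicit 'zpool replace' is needed; with the default autoreplace=off, the 'zpool replace' step is required.]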
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Tim: I couldn't do a zpool scrub, since the pool was marked as UNAVAIL. Believe me, I tried :) Bob: Ya, I realized that after I clicked send. My brain was a little frazzled, so I completely overlooked it. Solaris 10u7 - Sun E450 ZFS pool version 10 ZFS filesystem version 3 -Dustin -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to verify if the ZIL is disabled
On 09/23/09 10:59, Scott Meilicke wrote: How can I verify if the ZIL has been disabled or not? I am trying to see how much benefit I might get by using an SSD as a ZIL. I disabled the ZIL via the ZFS Evil Tuning Guide: echo zil_disable/W0t1 | mdb -kw - this only temporarily disables the zil until the reboot. In fact it has no effect unless file systems are remounted as the variable is only looked at on mount. and then rebooted. However, I do not see any benefits for my NFS workload. To set zil_disable from boot put the following in /etc/system and reboot: set zfs:zil_disable=1 Neil ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
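[Editor's note: to answer the original "how can I verify" question, the current value of the tunable can be read back from the live kernel with a read-only mdb query (the output shown is illustrative):

  # echo zil_disable/D | mdb -k
  zil_disable:
  zil_disable:    1

A value of 1 means the ZIL is disabled for file systems mounted after the value was set; 0 is the default, with the ZIL enabled.]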
[zfs-discuss] How to verify if the ZIL is disabled
How can I verify if the ZIL has been disabled or not? I am trying to see how much benefit I might get by using an SSD as a ZIL. I disabled the ZIL via the ZFS Evil Tuning Guide: echo zil_disable/W0t1 | mdb -kw and then rebooted. However, I do not see any benefits for my NFS workload. Thanks, Scott -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Awesome!!! Thanks for your help. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
On Wed, 23 Sep 2009, Dustin Marquess wrote: Okay.. I "fixed" it by powering the server off, removing the new drive, letting the pool come up degraded, and then doing zpool replace. I'm assuming what happened was ZFS saw that the disk was online, tried to use it, and then noticed that the checksums didn't match (of course) and marked the pool as corrupted. The question is why didn't ZFS check the labels on the drive and see that the drive wasn't in the pool and kick it out itself? You never told us what OS and version (OpenSolaris, Solaris 10, FreeBSD, NetBSD, Linux Fuse, OS X zfs preview) you are using. If you are using an older version of zfs, maybe a newer version works as expected? Never report a problem without identifying the software and hardware you are using. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
On Wed, Sep 23, 2009 at 10:57 AM, Dustin Marquess wrote: > Okay.. I "fixed" it by powering the server off, removing the new drive, > letting the pool come up degraded, and then doing zpool replace. > > I'm assuming what happened was ZFS saw that the disk was online, tried to > use it, and then noticed that the checksums didn't match (of course) and > marked the pool as corrupted. The question is why didn't ZFS check the > labels on the drive and see that the drive wasn't in the pool and kick it > out itself? > -- > Did you do a zpool scrub after you replaced the drive? How would zfs know what you wanted done with the drive if you didn't tell it? --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID-Z2 won't come online after replacing failed disk
Okay.. I "fixed" it by powering the server off, removing the new drive, letting the pool come up degraded, and then doing zpool replace. I'm assuming what happened was ZFS saw that the disk was online, tried to use it, and then noticed that the checksums didn't match (of course) and marked the pool as corrupted. The question is why didn't ZFS check the labels on the drive and see that the drive wasn't in the pool and kick it out itself? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Real help
Here is my menu.lst. I tried to change the zpool mountpoints, but that did not work either.

menu.lst:

j...@opensolaris:~# more /a/boot/grub/menu.lst
splashimage /boot/grub/splash.xpm.gz
background 215ECA
timeout 30
default 10
#-- ADDED BY BOOTADM - DO NOT EDIT --
title OpenSolaris 2008.11 snv_101b_rc2 X86
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
#-END BOOTADM
# Unknown partition of type 175 found on /dev/rdsk/c4t0d0p0 partition: 1
# It maps to the GRUB device: (hd2,0) .
# Unknown partition of type 175 found on /dev/rdsk/c4t0d0p0 partition: 2
# It maps to the GRUB device: (hd2,1) .
# Unknown partition of type 5 found on /dev/rdsk/c6d0p0 partition: 2
# It maps to the GRUB device: (hd0,1) .
title Ubuntu
root (hd0,4)
kernel$ /boot/vmlinuz-2.6.27-7-generic root=UUID=8dea655b-ce58-4cf6-8097-84f6fa0d44e3 ro quiet splash
initrd /boot/initrd.img-2.6.27-7-generic quite
title OpenSolaris 2008.11 snv_101b_rc2 X86 text boot
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/opensolaris
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
title opensolaris-1
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-1
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-2
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-2
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-3
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-3
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-4
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-4
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title opensolaris-5
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/opensolaris-5
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title be_name
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/be_name
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title be_name-1
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/be_name-1
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title OSOL_123
findroot (pool_rpool,0,a)
splashimage /boot/solaris.xpm
foreground d25f00
background 115d93
bootfs rpool/ROOT/OSOL_123
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
j...@opensolaris:~#

and here is the zfs list:

j...@opensolaris:~$ zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     66.2G  5.65G    78K  /a
rpool/ROOT                21.7G  5.65G    18K  /a/
rpool/ROOT/OSOL_123       16.9G  5.65G  10.7G  /tmp/tmpd38Ilg
rpool/ROOT/be_name         160M  5.65G  10.5G  /a/
rpool/ROOT/be_name-1      4.23G  5.65G  10.5G  /a/
rpool/ROOT/opensolaris    35.4M  5.65G  5.69G  /
rpool/ROOT/opensolaris-1   188M  5.65G  5.05G  /tmp/tmp2GKUrN
rpool/ROOT/opensolaris-2  43.0M  5.65G  5.60G  /tmp/tmpUbjG28
rpool/ROOT/opensolaris-3  42.8M  5.65G  5.67G  /tmp/tmpKIjHhB
rpool/ROOT/opensolaris-4  32.3M  5.65G  6.49G  /tmp/tmp7JeXnR
rpool/ROOT/opensolaris-5  87.5M  5.65G  10.4G  /tmp/tmpfvpm6V
rpool/dump                1.50G  5.65G  1.50G  -
rpool/export              41.5G  5.65G    21K  /export
rpool/export/home         41.5G  5.65G   195M  /export/home/
rpool/export/home/hazz
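[Editor's note: for comparison, on a standard OpenSolaris 2008.11-style install the root pool datasets normally look roughly like this: rpool/ROOT set to legacy and the active boot environment with mountpoint=/ and canmount=noauto. A heavily simplified sketch of putting that back from the live CD, assuming rpool/ROOT/opensolaris is the BE named in the menu.lst bootfs line being booted (adjust to whichever BE is actually wanted); this is an illustration of the normal layout, not a diagnosis of this particular system:

  # zpool import -f -R /a rpool
  # zfs set mountpoint=legacy rpool/ROOT
  # zfs set canmount=noauto rpool/ROOT/opensolaris
  # zfs set mountpoint=/ rpool/ROOT/opensolaris
  # zpool set bootfs=rpool/ROOT/opensolaris rpool
  # zfs mount rpool/ROOT/opensolaris         # mounts under the /a altroot
  # bootadm update-archive -R /a
  # zpool export rpool
]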
Re: [zfs-discuss] Real help
On Wed, Sep 23, 2009 at 3:32 AM, vattini giacomo wrote: > Hi there i'v been able to restore my zpool on a live cd,reinstall the > grub,but booting from the HD it hangs for a while and than nothing comes up > j...@opensolaris:~# zfs list > NAME USED AVAIL REFER MOUNTPOINT > rpool 66.2G 5.65G78K /a > rpool/ROOT21.7G 5.65G18K / > rpool/ROOT/OSOL_123 16.9G 5.65G 10.7G /tmp/tmpd38Ilg > rpool/ROOT/be_name 160M 5.65G 10.5G / > rpool/ROOT/be_name-1 4.23G 5.65G 10.5G / > rpool/ROOT/opensolaris35.4M 5.65G 5.69G / > rpool/ROOT/opensolaris-1 188M 5.65G 5.05G /tmp/tmp2GKUrN > rpool/ROOT/opensolaris-2 43.0M 5.65G 5.60G /tmp/tmpUbjG28 > rpool/ROOT/opensolaris-3 42.8M 5.65G 5.67G /tmp/tmpKIjHhB > rpool/ROOT/opensolaris-4 32.3M 5.65G 6.49G /tmp/tmp7JeXnR > rpool/ROOT/opensolaris-5 87.5M 5.65G 10.4G /tmp/tmpfvpm6V > rpool/dump1.50G 5.65G 1.50G - > rpool/export 41.5G 5.65G21K /export > rpool/export/home 41.5G 5.65G 195M /export/home/ > rpool/export/home/hazz41.3G 5.65G 41.3G /export/home/hazz/ > rpool/swap1.50G 5.86G 1.29G - > Any clue to get on the rescue? > -- > > What does the grub.conf look like now that you've re-installed grub? --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] RAID-Z2 won't come online after replacing failed disk
I replaced a bad disk in a RAID-Z2 pool, and now the pool won't come online. Status shows nothing helpful at all. I don't understand why this is, since I should be able to lose 2 drives, and I only replaced one!

# zpool status -v pool
  pool: pool
 state: UNAVAIL
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        pool        UNAVAIL      0     0     0  insufficient replicas
          raidz2    UNAVAIL      0     0     0  corrupted data
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] lots of zil_clean threads
I wonder if a taskq pool does not suffer from a similar effect to the one observed for the nfsd pool:

6467988 Minimize the working set of nfsd threads

Threads created round-robin out of the taskq loop do little work, but wake up at least once every 5 minutes and so are never reaped.

-r

Nils Goroll writes:
> Hi Neil and all,
>
> thank you very much for looking into this:
>
> > So I don't know what's going on. What is the typical call stack for those
> > zil_clean() threads?
>
> I'd say they are all blocking on their respective CVs:
>
> ff0009066c60 fbc2c0300 0 60 ff01d25e1180
>   PC: _resume_from_idle+0xf1    TASKQ: zil_clean
>   stack pointer for thread ff0009066c60: ff0009066b60
>   [ ff0009066b60 _resume_from_idle+0xf1() ]
>   swtch+0x147()
>   cv_wait+0x61()
>   taskq_thread+0x10b()
>   thread_start+8()
>
> I should add that I have quite a lot of datasets:
>
> r...@haggis:~# zfs list -r -t filesystem | wc -l
>   49
> r...@haggis:~# zfs list -r -t volume | wc -l
>   14
> r...@haggis:~# zfs list -r -t snapshot | wc -l
>   6018
>
> Nils
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)
> The only thing that jumps out at me is the ARC size - > 53.4GB, or > most of your 64GB of RAM. This in-and-of-itself is > not necessarily > a bad thing - if there are no other memory consumers, > let ZFS cache > data in the ARC. But if something is coming along to > flush dirty > ARC pages periodically The workload is a set of 50 python processes, each receiving a stream of data via TCP/IP. The processes run until they notice something interesting in the stream (sorry I can't be more specific), then they connect to a server via TCP/IP and issue a command or two. Log files are written that takes up about 50M per day per process. It's relatively low-traffic. > I found what looked to be an applicable bug; > CR 6699438 zfs induces crosscall storm under heavy > mapped sequential > read workload > but the stack signature for the above bug is > different than yours, and > it doesn't sound like your workload is doing mmap'd > sequential reads. > That said, I would be curious to know if your > workload used mmap(), > versus read/write? I asked and they couldn't say. It's python so I think it's unlikely. > For the ZFS folks just seeing this, here's the stack > frame; > > unix`xc_do_call+0x8f > unix`xc_wait_sync+0x36 > unix`x86pte_invalidate_pfn+0x135 > unix`hat_pte_unmap+0xa9 > unix`hat_unload_callback+0x109 > unix`hat_unload+0x2a > unix`segkmem_free_vn+0x82 > unix`segkmem_zio_free+0x10 > genunix`vmem_xfree+0xee > genunix`vmem_free+0x28 > genunix`kmem_slab_destroy+0x80 > genunix`kmem_slab_free+0x1be > genunix`kmem_magazine_destroy+0x54 > genunix`kmem_depot_ws_reap+0x4d > genunix`taskq_thread+0xbc > unix`thread_start+0x8 > > Let's see what the fsstat and zpool iostat data looks > like when this > starts happening.. Both are unremarkable, I'm afraid. Here's the fsstat from when it starts happening: new name name attr attr lookup rddir read read write write file remov chng get set ops ops ops bytes ops bytes 0 0 0 75 0 0 0 0 0 10 1.25M zfs 0 0 0 83 0 0 0 0 0 7 896K zfs 0 0 0 78 0 0 0 0 0 13 1.62M zfs 0 0 0 229 0 0 0 0 0 29 3.62M zfs 0 0 0 217 0 0 0 0 0 28 3.37M zfs 0 0 0 212 0 0 0 0 0 26 3.03M zfs 0 0 0 151 0 0 0 0 0 18 2.07M zfs 0 0 0 184 0 0 0 0 0 31 3.41M zfs 0 0 0 187 0 0 0 0 0 32 2.74M zfs 0 0 0 219 0 0 0 0 0 24 2.61M zfs 0 0 0 222 0 0 0 0 0 29 3.29M zfs 0 0 0 206 0 0 0 0 0 29 3.26M zfs 0 0 0 205 0 0 0 0 0 19 2.26M zfs -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)
(posted to zfs-discuss) Hmmm...this is nothing in terms of load. So you say that the system becomes sluggish/unresponsive periodically, and you noticed the xcall storm when that happens, correct? Refresh my memory - what is the frequency and duration of the sluggish cycles? Could you capture a kernel profile during a sluggish cycle; #dtrace -n 'profile-997hz / arg0 && curthread->t_pri != -1 / { @[stack()]=count(); } tick-1sec { trunc(@,10); printa(@); clear(@); }' And/or - #lockstat -i997 -kIW -s 10 sleep 30 > lockstat.kprof.out And #lockstat -Cc sleep 30 > lockstat.locks.out Thanks, /jim Jim Leonard wrote: Can you gather some ZFS IO statistics, like "fsstat zfs 1" for a minute or so. Here is a snapshot from when it is exhibiting the behavior: new name name attr attr lookup rddir read read write write file remov chng get setops ops ops bytes ops bytes 0 0 075 0 0 0 0 010 1.25M zfs 0 0 083 0 0 0 0 0 7 896K zfs 0 0 078 0 0 0 0 013 1.62M zfs 0 0 0 229 0 0 0 0 029 3.62M zfs 0 0 0 217 0 0 0 0 028 3.37M zfs 0 0 0 212 0 0 0 0 026 3.03M zfs 0 0 0 151 0 0 0 0 018 2.07M zfs 0 0 0 184 0 0 0 0 031 3.41M zfs 0 0 0 187 0 0 0 0 032 2.74M zfs 0 0 0 219 0 0 0 0 024 2.61M zfs 0 0 0 222 0 0 0 0 029 3.29M zfs 0 0 0 206 0 0 0 0 029 3.26M zfs 0 0 0 205 0 0 0 0 019 2.26M zfs Unless attr_get is ludicrously costly, I can't see any issues...? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [dtrace-discuss] How to drill down cause of cross-calls in the kernel? (output provided)
I'm cross-posting to zfs-discuss, as this is now more of a ZFS query than a dtrace query at this point, and I'm not sure if all the ZFS experts are listening on dtrace-discuss (although they probably are... :^). The only thing that jumps out at me is the ARC size - 53.4GB, or most of your 64GB of RAM. This in-and-of-itself is not necessarily a bad thing - if there are no other memory consumers, let ZFS cache data in the ARC. But if something is coming along to flush dirty ARC pages periodically I found what looked to be an applicable bug; CR 6699438 zfs induces crosscall storm under heavy mapped sequential read workload but the stack signature for the above bug is different than yours, and it doesn't sound like your workload is doing mmap'd sequential reads. That said, I would be curious to know if your workload used mmap(), versus read/write? For the ZFS folks just seeing this, here's the stack frame; unix`xc_do_call+0x8f unix`xc_wait_sync+0x36 unix`x86pte_invalidate_pfn+0x135 unix`hat_pte_unmap+0xa9 unix`hat_unload_callback+0x109 unix`hat_unload+0x2a unix`segkmem_free_vn+0x82 unix`segkmem_zio_free+0x10 genunix`vmem_xfree+0xee genunix`vmem_free+0x28 genunix`kmem_slab_destroy+0x80 genunix`kmem_slab_free+0x1be genunix`kmem_magazine_destroy+0x54 genunix`kmem_depot_ws_reap+0x4d genunix`taskq_thread+0xbc unix`thread_start+0x8 Let's see what the fsstat and zpool iostat data looks like when this starts happening.. Thanks, /jim Jim Leonard wrote: It would also be interesting to see some snapshots of the ZFS arc kstats kstat -n arcstats Here you go, although I didn't see anything jump out (massive amounts of cache misses or something). Any immediate trouble spot? # kstat -n arcstats module: zfs instance: 0 name: arcstatsclass:misc c 53490612870 c_max 67636535296 c_min 8454566912 crtime 212.955493179 deleted 7267003 demand_data_hits179708165 demand_data_misses 189797 demand_metadata_hits9959277 demand_metadata_misses 194228 evict_skip 1709 hash_chain_max 9 hash_chains 205513 hash_collisions 9372169 hash_elements 851634 hash_elements_max 886509 hdr_size143082240 hits198822346 l2_abort_lowmem 0 l2_cksum_bad0 l2_evict_lock_retry 0 l2_evict_reading0 l2_feeds0 l2_free_on_write0 l2_hdr_size 0 l2_hits 0 l2_io_error 0 l2_misses 0 l2_rw_clash 0 l2_size 0 l2_writes_done 0 l2_writes_error 0 l2_writes_hdr_miss 0 l2_writes_sent 0 memory_throttle_count 0 mfu_ghost_hits 236508 mfu_hits165895558 misses 388618 mru_ghost_hits 70149 mru_hits24777390 mutex_miss 6094 p 49175731760 prefetch_data_hits 7993639 prefetch_data_misses370 prefetch_metadata_hits 1161265 prefetch_metadata_misses4223 recycle_miss7149 size53490565328 snaptime5759009.53378144 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
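[Editor's note: not something this thread has established as the fix, but since the ARC size is the one thing that stands out, one commonly used way to bound it (and hence the amount of kmem that later has to be reaped) is to cap the ARC in /etc/system; the 32GB value below is purely illustrative:

  * Cap the ZFS ARC at 32GB (value in bytes) - example only
  set zfs:zfs_arc_max = 0x800000000

This takes effect at boot. Whether it would help here depends on whether the xcall storms really correlate with ARC/kmem reaping, which the stack above only suggests.]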
[zfs-discuss] ZFS send-receive between remote machines as non-root user
Hi list, I have a question about setting up zfs send-receive functionality (between remote machine) as non-root user. "server1" - is a server where "zfs send" will be executed "server2" - is a server where "zfs receive" will be executed. I am using the following zfs structure: [server1]$ zfs list -t filesystem -r datapool/data NAME USED AVAIL REFER MOUNTPOINT datapool/data 2.05G 223G 2.05G /opt/data datapool/data/logs 35K 223G19K /opt/data/logs datapool/data/db18K 223G18K /opt/data/db [server1]$ zfs list -t filesystem -r datapool2/data NAME USED AVAIL REFER MOUNTPOINT datapool2/data 72K 6.91G18K /datapool2/data datapool2/data/fastdb 18K 6.91G18K /opt/data/fastdb datapool2/data/fastdblog18K 6.91G18K /opt/data/fastdblog datapool2/data/dblog18K 6.91G18K /opt/data/dblog ZFS delegated permissions setup on the sending machine: [server1]$ zfs allow datapool/data - Local+Descendent permissions on (datapool/data) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot - [server1]$ zfs allow datapool2/data - Local+Descendent permissions on (data2/data) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot - The idea is to create a snapshot and send it to another machine with zfs using zfs send-receive. So I am creating a snapshot and ... get the following error: [server1]$ zfs list -t snapshot -r datapool/data NAMEUSED AVAIL REFER MOUNTPOINT datapool/d...@rolling-2009092314071448K - 2.05G - datapool/data/l...@rolling-20090923140714 16K -18K - datapool/data/d...@rolling-20090923140714 0 -18K - [server1]$ zfs list -t snapshot -r datapool2/data NAMEUSED AVAIL REFER MOUNTPOINT datapool2/d...@rolling-20090923140714 0 -18K - datapool2/data/fas...@rolling-20090923140714 0 -18K - datapool2/data/fastdb...@rolling-20090923140714 0 -18K - datapool2/data/db...@rolling-20090923140714 0 -18K - To send the snapshot I'm using the following command (for "datapool" datapool): [server1]$ zfs send -R datapool/d...@rolling-20090923140714 | ssh server2 zfs receive -vd datapool/data_backups/`hostname`/datapool receiving full stream of datapool/d...@rolling-20090923140714 into datapool/data_backups/server1/datapool/data @rolling-20090923140714 received 2.06GB stream in 62 seconds (34.0MB/sec) receiving full stream of datapool/data/l...@rolling-20090923140714 into datapool/data_backups/server2/datapool/data/l...@rolling-20090923140714 cannot mount 'datapool/data_backups/server1/datapool/data/logs': Insufficient privileges Seems like user "joe" on the remote server ("server2") can not mount the filesystem: [server2]$ zfs mount datapool/data_backups/server1/datapool/data/logs cannot mount 'datapool/data_backups/server1/datapool/data/logs': Insufficient privileges ZFS delegated permissions on the receiving side look fine for me: [server2]$ zfs allow datapool/data_backups/server1/datapool/data/logs - Local+Descendent permissions on (datapool/data_backups) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot - Local+Descendent permissions on (datapool) user joe atime,canmount,create,destroy,mount,receive,rollback,send,snapshot "zfs receive" creates a mountpoint with "root:root" permissions: [server2]$ ls -ld /opt/data_backups/server2/datapool/data/logs/ drwxr-xr-x 2 root root 2 Sep 23 14:02 /opt/data_backups/server1/datapool/data/logs/ I've tried to play with RBAC a bit ..: [server2]$ id uid=750(joe) gid=750(prod) [server2]$ profiles File System Security ZFS File System Management File System Management Service Management Basic Solaris User All ... 
but no luck - I still have zfs mount error while receiving a snapshot: Both servers are running Solaris U7 x86_64, Generic_139556-08. Is there any method to setup zfs send-receive functionality for descending zfs filesystems as non-root user? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
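[Editor's note: one possible workaround, offered only as a sketch since it depends on the zfs version in use. If the installed zfs supports the -u flag to receive, the received file systems are not mounted at all, which sidesteps the mount-privilege problem (mounting still requires privileges beyond what zfs allow can delegate on Solaris 10):

  [server1]$ zfs send -R datapool/data@rolling-20090923140714 | \
      ssh server2 zfs receive -u -d datapool/data_backups/server1/datapool

If -u is not available, the receive side may simply need to run with elevated privileges (for example via pfexec or a dedicated role), since it is the mount step that requires them.]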
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Wed, Sep 23, 2009 at 7:32 AM, bertram fukuda wrote: > Thanks for the info Mike. > > Just so I'm clear. You suggest 1)create a single zpool from my LUN 2) create > a single ZFS filesystem 3) create 2 zone in the ZFS filesystem. Sound right? Correct -- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
2009/9/23 bertram fukuda > Thanks for the info Mike. > > Just so I'm clear. You suggest 1)create a single zpool from my LUN 2) > create a single ZFS filesystem 3) create 2 zone in the ZFS filesystem. Sound > right? > > You can create zfs filesystems for each zone and you also can delegate zfs filesystems to be managed by zones. I usually put each zone's root on its own filesystem atleast. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [indiana-discuss] Boot failure with snv_122 and snv_123
Cross-posted to ZFS-Discuss per Vikram's suggestion. Summary: I upgraded to snv_123 and the system hangs on boot. snv_121, and earlier are working fine. Booting with -kv, the system still hung, but after a few minutes, the system continued, spit out more text (referring to disks, but I could not capture the text). Here is what was left on the screen once the debugger kicked in: PCI Express-device: i...@0, ata0 ata0 is /p...@0,0/pci-...@14,1/i...@0 PCI Express-device: pci1002,5...@4, pcieb1 pcieb1 is /p...@0,0/pci1002,5...@4 PCI Express-device: pci1002,5...@7 pcieb3 is /p...@0,0/pci1002,5...@7 UltraDMA mode 4 selected sd4 at ata0: target 0 lun 0 sd4 is /p...@0,0/pci-...@14,1/i...@0/s...@0,0 NOTICE: Can not read the pool label from '/p...@0,0/pci1043,8...@12/d...@0,0:a' NOTICE: spa_import_rootpool: error 5 Cannot mount root on /p...@0,0/pci1043,8...@12/d...@0,0:a fstype zfs panic[cpu0]/thread=fbc2efe0: vfs_mountroot: cannot mount root fbc50ce0 genunix:vfs_mountroot+350 () fbc50d10 genunix:main+e7 () fbc50d20 unix:_locore_start+92 () panic: entering debugger (do dump device, continue to reboot) Then I ran ::stack and ::status: [0]> ::stack kmdb_enter+0xb() debug_enter+0x38(fb934340) panicsys+0x41c(fbb89070, fbc50c70, fbc58e80, 1) vpanic+0x15c() panic+0x94() vfs_mountroot+0x350() main+0xe7() _locore_start+0x92() [0]> ::status debugging live kernel (64-bit) on (not set) operating system: 5.11 snv_123 (i86pc) CPU-specific support: AMD DTrace state: inactive stopped on: debugger entry trap To clarify, when build 122 was announced, I tried upgrading. The new BE would not boot, hanging in the same way that snv_123 does. I later deleted the snv_122 BE. Also, I checked my grub config, and nothing seems out of line there (though I have edited the boot entries to remove the splashimage, foreground, background, and console=graphics). Thanks, Charles > Hi, > > A problem with your root pool - something went wrong > when you upgraded > which explains why snv_122 no longer works as well. > One of the ZFS > experts on this list could help you - I suspect > others may have run into > similar issues before. > > Vikram > > Charles Menser wrote: > > Vikram, > > > > Thank you for the prompt reply! > > > > I have made no BIOS changes. The last time I > changed the BIOS was before reinstalling OpenSolaris > 2009.06 after changing my SATA controller to AHCI > mode. This was some time ago, and I have been using > the /dev repo and installed several development > builds since then (the latest that worked was > snv_121). > > > > I switched to a USB keyboard and mdb was happy. I > am curious why a PS/AUX keyboard works with the > system normally, but not MDB. > > > > Here is what I have from MDB so far: > > > > I rebooted with -kv, and after a few minutes, the > system continued, spit out more text (referring to > disks, but I could not capture the text). 
Here is > what was left on the screen once the debugger kicked > in: > > > > PCI Express-device: i...@0, ata0 > > ata0 is /p...@0,0/pci-...@14,1/i...@0 > > PCI Express-device: pci1002,5...@4, pcieb1 > > pcieb1 is /p...@0,0/pci1002,5...@4 > > PCI Express-device: pci1002,5...@7 > > pcieb3 is /p...@0,0/pci1002,5...@7 > > UltraDMA mode 4 selected > > sd4 at ata0: target 0 lun 0 > > sd4 is /p...@0,0/pci-...@14,1/i...@0/s...@0,0 > > NOTICE: Can not read the pool label from > '/p...@0,0/pci1043,8...@12/d...@0,0:a' > > NOTICE: spa_import_rootpool: error 5 > > Cannot mount root on > /p...@0,0/pci1043,8...@12/d...@0,0:a fstype zfs > > > > panic[cpu0]/thread=fbc2efe0: vfs_mountroot: > cannot mount root > > > > fbc50ce0 genunix:vfs_mountroot+350 () > > fbc50d10 genunix:main+e7 () > > fbc50d20 unix:_locore_start+92 () > > > > panic: entering debugger (do dump device, continue > to reboot) > > > > [again, the above is hand transcribed, and may > contain typos] > > > > Then I ran ::stack and ::status: > > > > [0]> ::stack > > kmdb_enter+0xb() > > debug_enter+0x38(fb934340) > > panicsys+0x41c(fbb89070, fbc50c70, > fbc58e80, 1) > > vpanic+0x15c() > > panic+0x94() > > vfs_mountroot+0x350() > > main+0xe7() > > _locore_start+0x92() > > > > [0]> ::status > > debugging live kernel (64-bit) on (not set) > > operating system: 5.11 snv_123 (i86pc) > > CPU-specific support: AMD > > DTrace state: inactive > > stopped on: debugger entry trap > > > > The motherboard is an ASUS M3A32-MVP, with BIOS Rev > 1705. > > > > There are four 500G SATA drives connected to the > on-board SATA controller. > > > > There is only one pool (rpool), setup as a > three-way mirror: > > > > char...@carbon-box:~$ zpool status > > pool: rpool > > state: ONLINE > > status: The pool is formatted using an older > on-disk format. The pool can > > still be used, but some features are > unavailable. > > action: Upgrade the pool using 'zpool upgrade'. > Once this is
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
Thanks for the info Mike. Just so I'm clear. You suggest 1)create a single zpool from my LUN 2) create a single ZFS filesystem 3) create 2 zone in the ZFS filesystem. Sound right? Thanks again, Bert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] New to ZFS: One LUN, multiple zones
On Wed, Sep 23, 2009 at 7:04 AM, bertram fukuda wrote:
> I have a 1TB LUN being presented to me from our storage team. I need to
> create 2 zones and share the storage between them. Would it be best to
> repartition the LUNs (2 500Gb slices), create 2 separate storage pools then
> assign them separately to each zone? If not, what would be the recommended
> way.
> I've read a ton of documentation but end up getting more confused than
> anything.

The only time that I would create multiple storage pools for zones is if I intend to migrate them to other hosts independently. That is, at time t1 I have zones z1 and z2 on host h1. I think that at some time in the future I would like to move z2 to host h2 while leaving z1 on h1. Since you only have one LUN, you are not able to move the zones independently via reassigning a LUN to another host. That is, it is impossible to split the LUN and unsafe to share the LUN to multiple hosts.

In your situation, I would create one pool and put both zones on it. When you decide you need more zones, put them in it too.

As an aside, I am rarely convinced by the reasoning "I have X amount of space and Y things to put into it, so I will give each thing X/Y space." This is because it is quite likely that someone will do the operation Y++ and there are very few storage technologies that allow you to shrink the amount of space allocated to each item.

-- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] New to ZFS: One LUN, multiple zones
I have a 1TB LUN being presented to me from our storage team. I need to create 2 zones and share the storage between them. Would it be best to repartition the LUNs (2 500Gb slices), create 2 separate storage pools then assign them separately to each zone? If not, what would be the recommended way. I've read a ton of documentation but end up getting more confused than anything. Thanks, Bert -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Real help
Hi there, I've been able to restore my zpool from a live CD and reinstall GRUB, but booting from the HD it hangs for a while and then nothing comes up.

j...@opensolaris:~# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
rpool                     66.2G  5.65G    78K  /a
rpool/ROOT                21.7G  5.65G    18K  /
rpool/ROOT/OSOL_123       16.9G  5.65G  10.7G  /tmp/tmpd38Ilg
rpool/ROOT/be_name         160M  5.65G  10.5G  /
rpool/ROOT/be_name-1      4.23G  5.65G  10.5G  /
rpool/ROOT/opensolaris    35.4M  5.65G  5.69G  /
rpool/ROOT/opensolaris-1   188M  5.65G  5.05G  /tmp/tmp2GKUrN
rpool/ROOT/opensolaris-2  43.0M  5.65G  5.60G  /tmp/tmpUbjG28
rpool/ROOT/opensolaris-3  42.8M  5.65G  5.67G  /tmp/tmpKIjHhB
rpool/ROOT/opensolaris-4  32.3M  5.65G  6.49G  /tmp/tmp7JeXnR
rpool/ROOT/opensolaris-5  87.5M  5.65G  10.4G  /tmp/tmpfvpm6V
rpool/dump                1.50G  5.65G  1.50G  -
rpool/export              41.5G  5.65G    21K  /export
rpool/export/home         41.5G  5.65G   195M  /export/home/
rpool/export/home/hazz    41.3G  5.65G  41.3G  /export/home/hazz/
rpool/swap                1.50G  5.86G  1.29G  -

Any clue on how to rescue this? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss