Re: [zfs-discuss] Maximum zfs send/receive throughput
On 25.06.2010 14:32, Mika Borner wrote:
> It seems we are hitting a boundary with zfs send/receive over a network
> link (10Gb/s). We can see peak values of up to 150MB/s, but on average
> about 40-50MB/s are replicated. This is far away from the bandwidth that
> a 10Gb link can offer. Is it possible that ZFS is giving replication too
> low a priority/throttling it too much?

You can probably improve overall performance by using mbuffer [1] to stream the data over the network; at least some people have reported increased performance. mbuffer buffers the data stream and decouples the zfs send operation from network latencies.

[1] Get it here:
original source: http://www.maier-komor.de/mbuffer.html
binary package: http://www.opencsw.org/packages/CSWmbuffer/

- Thomas

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
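For the archives, a typical mbuffer invocation for this setup looks like the following sketch (hostname, port, dataset/snapshot names, and buffer sizes are placeholders; start the receiver first):

```shell
# On the receiving host: listen on a port, buffer, and feed zfs receive.
#   mbuffer -s 128k -m 1G -I 9090 | zfs receive -d tank
# On the sending host: stream the snapshot through a 1 GB buffer.
zfs send tank/data@today | mbuffer -s 128k -m 1G -O recvhost:9090
```

The -m value should be large enough to ride out the bursts between transaction groups on the receiving side.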
Re: [zfs-discuss] zfs corruptions in pool
On 06.06.2010 08:06, devsk wrote:
> I had an unclean shutdown because of a hang and suddenly my pool is
> degraded (I realized something was wrong when python dumped core a
> couple of times). This is before I ran scrub:
>
>   pool: mypool
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 0h7m with 0 errors on Mon May 31 09:00:27 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         mypool      DEGRADED     0     0     0
>           c6t0d0s0  DEGRADED     0     0     0  too many errors
>
> errors: Permanent errors have been detected in the following files:
>
>         mypool/ROOT/May25-2010-Image-Update:0x3041e
>         mypool/ROOT/May25-2010-Image-Update:0x31524
>         mypool/ROOT/May25-2010-Image-Update:0x26d24
>         mypool/ROOT/May25-2010-Image-Update:0x37234
>         //var/pkg/download/d6/d6be0ef348e3c81f18eca38085721f6d6503af7a
>         mypool/ROOT/May25-2010-Image-Update:0x25db3
>         //var/pkg/download/cb/cbb0ff02bcdc6649da3763900363de7cff78ec72
>         mypool/ROOT/May25-2010-Image-Update:0x26cf6
>
> I ran scrub and this is what it has to say afterwards:
>
>   pool: mypool
>  state: DEGRADED
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the
>         errors using 'zpool clear' or replace the device with
>         'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>   scan: scrub repaired 0 in 0h11m with 0 errors on Sat Jun 5 22:43:54 2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         mypool      DEGRADED     0     0     0
>           c6t0d0s0  DEGRADED     0     0     0  too many errors
>
> errors: No known data errors
>
> A few questions:
> 1. Have the errors really gone away? Can I just clear and be content
>    that the errors are really gone?
> 2. Why did the errors occur anyway if ZFS guarantees on-disk
>    consistency? I wasn't writing anything. Those files were definitely
>    not being touched when the hang and unclean shutdown happened. I
>    don't mind if I create or modify a file and it doesn't land on disk
>    because an unclean shutdown happened, but a bunch of unrelated
>    files getting corrupted is sort of painful to digest.
> 3. The action says "Determine if the device needs to be replaced." How
>    the heck do I do that?

Is it possible that this system runs on a VirtualBox? At least I've seen such a thing happen on a VirtualBox, but never on a real machine.

The reason why the errors have gone away might be that metadata has three copies, IIRC. So if your disk only had corruption in the metadata area, these errors can be repaired by scrubbing the pool.

The smartmontools might help you figure out whether the disk is broken. But if you only had an unexpected shutdown and now everything is clean after a scrub, I wouldn't expect the disk to be broken. You can get the smartmontools from opencsw.org.

If your system is really running on VirtualBox, I'd recommend that you turn off VirtualBox's disk write caching. Search the OpenSolaris forum of VirtualBox; there is an article somewhere on how to do this. IIRC the subject is something like 'zfs pool corruption'. But it is also somewhere in the docs.

HTH,
Thomas
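If smartmontools is installed, a quick health check of the suspect disk would look roughly like this (the device path matches the pool above but is effectively a placeholder; on Solaris you may additionally need a -d device-type option depending on the controller):

```shell
# Query the disk's SMART self-assessment and full attribute/error log:
smartctl -H /dev/rdsk/c6t0d0s0   # overall health verdict (PASSED/FAILED)
smartctl -a /dev/rdsk/c6t0d0s0   # all attributes plus the device error log
```

Reallocated or pending sector counts climbing between runs would be a reason to replace the disk; a clean log after an unclean shutdown usually is not.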
Re: [zfs-discuss] Fileserver help.
On 13.04.2010 10:12, Ian Collins wrote:
> On 04/13/10 05:47 PM, Daniel wrote:
>> Hi all. I'm pretty new to the whole OpenSolaris thing; I've been
>> doing a bit of research but can't find anything on what I need. I am
>> thinking of making myself a home file server running OpenSolaris with
>> ZFS and utilizing RAID-Z. I was wondering if there is anything I can
>> get that will allow Windows Media Center based hardware (HTPC or
>> XBOX 360) to stream from my new fileserver? Any help is appreciated,
>> and remember I'm new :)
>
> OpenSolaris has a native CIFS service, which enables sharing
> filesystems to Windows clients. I used this blog entry to set up my
> Windows shares:
> http://blogs.sun.com/timthomas/entry/solaris_cifs_in_workgroup_mode
> With OpenSolaris, you can get the SMB server with the package manager
> GUI.

I guess Daniel is rather looking for a UPnP media server [1] like ushare or coherence that is able to transcode media files and hand them out to streaming clients.

I have been trying to get this up and running on a Solaris 10 based SPARC box, but I had no luck. I am not sure if the problem is my streaming client (Philips TV), because my FritzBox, which has a streaming server, is also not always visible on the Philips TV. But the software running on the Solaris box never showed up as a service provider on the TV...

Anyway, this was on Solaris 10, and I didn't bother too much to get it set up and running on OpenSolaris. There might even be a package available in the repository. Just look for the candidates like ushare, coherence, and of course libupnp. If those aren't available, you'll have to build by hand; I guess this will also require some portability work.

Cheers,
Thomas

[1] http://en.wikipedia.org/wiki/UPnP_AV_MediaServers
Re: [zfs-discuss] Reclaiming Windows Partitions
On 07.04.2010 18:05, Ron Marshall wrote:
> I finally decided to get rid of my Windows XP partition as I rarely
> used it except to fire it up to install OS updates and virus
> signatures. I had some trouble locating information on how to do this,
> so I thought I'd document it here.
>
> My system is a Toshiba Tecra M9. It had four partitions on it:
>
> Partition 1 - NTFS Windows XP OS (Drive C:)
> Partition 2 - NTFS Windows data partition (D:)
> Partition 3 - FAT32
> Partition 4 - Solaris2
>
> Partitions 1 and 2 were laid down by my company's standard OS install.
> I had shrunk these using QTparted to enable me to install OpenSolaris.
> Partition 3 was set up to have a common file system mountable by
> OpenSolaris and Windows. There may be ways to do this with NTFS now,
> but this was a legacy from older Solaris installs. Partition 4 is my
> OpenSolaris ZFS install.
>
> Step 1) Backed up all my data from Partition 3, and any files I needed
> from Partitions 1 and 2. I also had a current snapshot of my
> OpenSolaris partition (Partition 4).
>
> Step 2) Delete Partitions 1, 2, and 3. I did this using the fdisk
> option in format under OpenSolaris.
>
> format - Select Disk 0 (make note of the short drive name alias, mine
> was c4t0d0). You will receive a warning something like this:
>
> [disk formatted]
> /dev/dsk/c4t0d0s0 is part of active ZFS pool rpool. Please see zpool(1M).
>
> Then select fdisk from the FORMAT MENU. You will see something like
> this:
>
> Total disk size is 14593 cylinders
> Cylinder size is 16065 (512 byte) blocks
>
>                                       Cylinders
> Partition   Status     Type          Start   End     Length   %
> =========   ======     ====          =====   =====   ======   ===
>     1                  FAT32LBA      x       xx      xx       x
>     2                  FAT32LBA      xx      xx      xx       x
>     3                  Win95 FAT32   5481    8157    2677     18
>     4       Active     Solaris2      8158    14579   6422     44
>
> SELECT ONE OF THE FOLLOWING:
>    1. Create a partition
>    2. Specify the active partition
>    3. Delete a partition
>    4. Change between Solaris and Solaris2 Partition IDs
>    5. Edit/View extended partitions
>    6. Exit (update disk configuration and exit)
>    7. Cancel (exit without updating disk configuration)
> Enter Selection:
>
> Delete partitions 1, 2 and 3 (don't forget to back them up before you
> do this). Using the fdisk menu, create a new Solaris2 partition for
> use by ZFS. When you are done you should see something like this:
>
> Cylinder size is 16065 (512 byte) blocks
>
>                                       Cylinders
> Partition   Status     Type          Start   End     Length   %
> =========   ======     ====          =====   =====   ======   ===
>     1                  Solaris2      1       8157    8157     56
>     4       Active     Solaris2      8158    14579   6422     44
>
> Exit and update the disk configuration.
>
> Step 3) Create the ZFS pool. First you can test whether zpool will
> succeed in creating the pool by using the -n option:
>
> zpool create -n datapool c4t0d0p1
>
> (I will make some notes about this disk name at the end.) It should
> report something like:
>
> would create 'datapool' with the following layout:
>         datapool
>           c4t0d0p1
>
> By default the zpool command will make a mount-point in your root /
> with the same name as your pool. If you don't want this, you can
> change it in the create command (see the man page for details). Now
> issue the command without the -n option:
>
> zpool create datapool c4t0d0p1
>
> Now check to see if it is there:
>
> zpool list
>
> It should report something like this:
>
> NAME       SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
> datapool   62G   30.7G  31.3G  49%  1.06x  ONLINE  -
> rpool      49G   43.4G  5.65G  88%  1.00x  ONLINE  -
>
> Step 4) Remember to take any of the mount parameters out of your
> /etc/vfstab file. You should be good to go at this point.
>
> == Notes about disk/partition naming ==
>
> In my case the disk is called c4t0d0. So how did I come up with
> c4t0d0p1? The whole disk name is c4t0d0p0. Each partition has the
> following naming convention:
>
> Partition 1 = c4t0d0p1
> Partition 2 = c4t0d0p2
> Partition 3 = c4t0d0p3
> Partition 4 = c4t0d0p4
>
> The fdisk command does not
Re: [zfs-discuss] unionfs help
On 04.02.2010 12:12, dick hoogendijk wrote:
> Frank Cusack wrote:
>> Is it possible to emulate a unionfs with zfs and zones somehow? My
>> zones are sparse zones and I want to make part of /usr writable
>> within a zone (/usr/perl5/mumble to be exact).
>
> Why don't you just export that directory with NFS (rw) to your sparse
> zone and mount it on /usr/perl5/mumble? Or is this too simple a
> thought?

What about lofs? I think lofs is the Solaris equivalent of unionfs. E.g.:

mount -F lofs /original/path /my/alternate/mount/point

- Thomas
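If the lofs mount should survive zone reboots, the same idea can go into the zone's configuration instead of a manual mount. A hedged sketch, assuming a zone called 'myzone' and a made-up writable backing directory in the global zone:

```shell
# Loopback-mount a writable global-zone directory over /usr/perl5/mumble
# inside the sparse zone; takes effect on the next zone boot.
zonecfg -z myzone <<EOF
add fs
set dir=/usr/perl5/mumble
set special=/export/zones/myzone-perl5
set type=lofs
end
commit
EOF
```

The rest of /usr stays read-only as usual for a sparse zone; only the lofs-backed subtree becomes writable.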
Re: [zfs-discuss] cannot attach c5d0s0 to c4d0s0: device is too small
On 28.01.2010 15:55, dick hoogendijk wrote:
> Cindy Swearingen wrote:
>> On some disks, the default partitioning is not optimal and you have
>> to modify it so that the bulk of the disk space is in slice 0.
>
> Yes, I know, but in this case the second disk indeed is smaller ;-(
> So I wonder, should I reinstall the whole thing on this smaller disk
> and then let the bigger second one attach? That would mean opening up
> the case and all that, because I don't have a DVD player built in. So
> I thought I'd go the zfs send|recv way. What are your thoughts about
> this?
>
>> Another thought is that a recent improvement was that you can attach
>> a disk that is an equivalent size, but not exactly the same geometry.
>> Which OpenSolaris release is this?
>
> b131. And this only works if the difference is really (REALLY)
> small. :)

Have you considered creating an alternate boot environment on the smaller disk, rebooting into this new boot environment, and then attaching the larger disk after destroying the old boot environment? beadm might do this job for you...
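A hedged sketch of that beadm route (BE names, pool names, and device names are placeholders; check beadm(1M) on your build before relying on the exact flags):

```shell
# Create a boot environment on a pool living on the smaller disk and
# make it the active one for the next boot:
beadm create -p rpool2 smalldiskBE
beadm activate smalldiskBE
init 6
# After rebooting into smalldiskBE, remove the old BE and attach the
# bigger disk as a mirror of the smaller one:
# beadm destroy oldBE
# zpool attach rpool2 c5d0s0 c4d0s0
```

This avoids the "cannot attach: device is too small" problem entirely, because the pool is recreated on the smaller disk first.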
Re: [zfs-discuss] ZFS dedup clarification
Chavdar Ivanov schrieb:
> Hi,
>
> I BFUd successfully snv_128 over snv_125:
>
> # cat /etc/release
>        Solaris Express Community Edition snv_125 X86
>        Copyright 2009 Sun Microsystems, Inc. All Rights Reserved.
>        Use is subject to license terms.
>        Assembled 05 October 2009
> # uname -a
> SunOS cheeky 5.11 snv_128 i86pc i386 i86pc
>
> ... being impatient to test zfs dedup. I was able to set dedup=on (I
> presume with the default sha256 key) on a few filesystems and did the
> following trivial test (this is an edited script session):
>
> Script started on Wed Oct 28 09:38:38 2009
> # zfs get dedup rpool/export/home
> NAME               PROPERTY  VALUE  SOURCE
> rpool/export/home  dedup     on     local
> # for i in 1 2 3 4 5 ; do
>     mkdir /export/home/d${i}
>     df -k /export/home/d${i}
>     zfs get used rpool/export/home
>     cp /testfile /export/home/d${i}
>   done
> Filesystem         kbytes    used    avail    capacity  Mounted on
> rpool/export/home  17418240  27      6063425  1%        /export/home
> NAME               PROPERTY  VALUE  SOURCE
> rpool/export/home  used      27K    -
> Filesystem         kbytes    used    avail    capacity  Mounted on
> rpool/export/home  17515512  103523  6057381  2%        /export/home
> NAME               PROPERTY  VALUE  SOURCE
> rpool/export/home  used      102M   -
> Filesystem         kbytes    used    avail    capacity  Mounted on
> rpool/export/home  17682840  271077  6056843  5%        /export/home
> NAME               PROPERTY  VALUE  SOURCE
> rpool/export/home  used      268M   -
> Filesystem         kbytes    used    avail    capacity  Mounted on
> rpool/export/home  17852184  442345  6054919  7%        /export/home
> NAME               PROPERTY  VALUE  SOURCE
> rpool/export/home  used      432M   -
> Filesystem         kbytes    used    avail    capacity  Mounted on
> rpool/export/home  17996580  587996  6053933  9%        /export/home
> NAME               PROPERTY  VALUE  SOURCE
> rpool/export/home  used      574M   -
> # zfs get all rpool/export/home
> NAME               PROPERTY         VALUE                  SOURCE
> rpool/export/home  type             filesystem             -
> rpool/export/home  creation         Mon Sep 21  9:27 2009  -
> rpool/export/home  used             731M                   -
> rpool/export/home  available        5.77G                  -
> rpool/export/home  referenced       731M                   -
> rpool/export/home  compressratio    1.00x                  -
> rpool/export/home  mounted          yes                    -
> rpool/export/home  quota            none                   default
> rpool/export/home  reservation      none                   default
> rpool/export/home  recordsize       128K                   default
> rpool/export/home  mountpoint       /export/home           inherited from rpool/export
> rpool/export/home  sharenfs         off                    default
> rpool/export/home  checksum         on                     default
> rpool/export/home  compression      off                    default
> rpool/export/home  atime            on                     default
> rpool/export/home  devices          on                     default
> rpool/export/home  exec             on                     default
> rpool/export/home  setuid           on                     default
> rpool/export/home  readonly         off                    default
> rpool/export/home  zoned            off                    default
> rpool/export/home  snapdir          hidden                 default
> rpool/export/home  aclmode          groupmask              default
> rpool/export/home  aclinherit       restricted             default
> rpool/export/home  canmount         on                     default
> rpool/export/home  shareiscsi       off                    default
> rpool/export/home  xattr            on                     default
> rpool/export/home  copies           1                      default
> rpool/export/home  version          4                      -
> rpool/export/home  utf8only         off                    -
> rpool/export/home  normalization    none                   -
> rpool/export/home  casesensitivity  sensitive              -
> rpool/export/home  vscan            off                    default
> rpool/export/home  nbmand           off                    default
> rpool/export/home  sharesmb         off                    default
> rpool/export/home  refquota         none                   default
> rpool/export/home  refreservation   none                   default
> rpool/export/home  primarycache     all                    default
> rpool/export/home  secondarycache   all                    default
> rpool/export/home  usedbysnapshots  0                      -
> rpool/export/home  usedbydataset
Re: [zfs-discuss] ZFS dedup clarification
Michael Schuster schrieb:
> Thomas Maier-Komor wrote:
>>> Script started on Wed Oct 28 09:38:38 2009
>>> # zfs get dedup rpool/export/home
>>> NAME               PROPERTY  VALUE  SOURCE
>>> rpool/export/home  dedup     on     local
>>> # for i in 1 2 3 4 5 ; do
>>>     mkdir /export/home/d${i}
>>>     df -k /export/home/d${i}
>>>     zfs get used rpool/export/home
>>>     cp /testfile /export/home/d${i}
>>>   done
>>
>> As far as I understood it, dedup works during writing, and won't
>> deduplicate already written data (this is planned for a later
>> release).
>
> Isn't he doing just that (writing, that is)?
>
> Michael

Oh - I overlooked this very line... Maybe zfs's used property returns the accumulated usage, and only zpool shows the real usage?

- Thomas
[zfs-discuss] raidz-1 vs mirror
Hi everybody,

I am considering moving my data pool from a two-disk (10k rpm) mirror layout to a three-disk raidz-1. This is just a single-user workstation environment, where I mostly perform compile jobs.

From past experience with raid5 I am a little bit reluctant to do so, as software raid5 has a major impact on write performance. Is this similar with raidz-1, or does the ZFS stack work around the limitations that raid5 brings into play? How big would the penalty be?

As an alternative I could swap the drives for bigger ones - but those would probably then be 7.2k rpm disks, because of costs.

Any experiences or thoughts?

TIA,
Thomas
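One low-risk way to get a feel for the relative mirror-vs-raidz write behaviour before buying disks is to benchmark against throwaway file-backed vdevs (paths, sizes, and the pool name below are made up; absolute numbers won't match real spindles, but the relative penalty shows up):

```shell
# Build a disposable 3-way raidz pool from files and run a workload on it:
mkfile 256m /var/tmp/d1 /var/tmp/d2 /var/tmp/d3
zpool create testz raidz /var/tmp/d1 /var/tmp/d2 /var/tmp/d3
# ... run a representative compile/IO job against /testz, time it ...
zpool destroy testz
rm /var/tmp/d1 /var/tmp/d2 /var/tmp/d3
```

The same can be repeated with `zpool create testm mirror /var/tmp/d1 /var/tmp/d2` for a direct comparison.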
Re: [zfs-discuss] zpool resilver - error history
Marcel Gschwandl schrieb:
> Hi all!
>
> I'm running a Solaris 10 Update 6 (10/08) system and had to resilver a
> zpool. It's now showing
>
> scrub: resilver completed after 9h0m with 21 errors on Wed Nov 4
> 22:07:49 2009
>
> but I haven't found an option to see what files were affected. Is
> there any way to do that?
>
> Thanks in advance
> Marcel

Try zpool status -v poolname
[zfs-discuss] zdb assertion failure/zpool recovery
Hi,

I have a corrupt pool, which lives on a .vdi file of a VirtualBox. IIRC the corruption (i.e. the pool being not importable) was caused when I killed VirtualBox, because it was hung.

This pool consists of a single vdev and I would really like to get some files out of that thing. So I tried running zdb, but this fails with an assertion failure:

Assertion failed: object_count == usedobjs (0xce == 0xcd), file ../zdb.c, line 1215
Abort (core dumped)

The core file consists of 107 threads. The thread with the assertion failure has the following stack trace:

 - lwp# 1 / thread# 1
 d2af1997 _lwp_kill  (1, 6, 8047638, d2a9ab6e) + 7
 d2a9ab7a raise      (6, 0, 8047688, d2a71fea) + 22
 d2a7200a abort      (65737341, 6f697472, 6166206e, 64656c69, 626f203a, 7463656a) + f2
 d2a7225a _assert    (80478b0, 8062eec, 4bf, 1) + 82
 08057116 dump_dir   (82cadd8, 8047dac, 805aff8, 0) + 33e
 08058b4f dump_zpool (81740c0, 8047dac) + 93
 0805a1d8 main       (0, 8047e0c, 8047e24, d2bfc7b4) + 598
 08053d1d _start     (5, 8047ec4, 8047ec8, 8047ecb, 8047ed0, 8047ed3) + 7d

So to me this looks like the object_count of a directory is inconsistent. Any idea or hint what I could do now?

I read that there is some utility to roll back the pool for simple (mirror) setups. This setup is even simpler, as it consists of a single vdev, so I would like to try it out. Does anybody know where I can get the tool, or how I could use zdb in this situation to roll back the pool?

TIA,
Thomas
Re: [zfs-discuss] [osol-discuss] zdb assertion failure/zpool recovery
Thomas Maier-Komor wrote:
> I have a corrupt pool, which lives on a .vdi file of a VirtualBox.
> [...]
> Does anybody know where I can get the tool, or how I could use zdb in
> this situation to roll back the pool?

I've searched the web some more and came across
http://www.opensolaris.org/jive/thread.jspa?threadID=85794

The posting by nhand gave me the information I needed to get my pool up and running again. Thanks!

- Thomas
[zfs-discuss] assertion failure
Hi,

I am just having trouble with my OpenSolaris in a VirtualBox. It refuses to boot with the following crash dump:

panic[cpu0]/thread=d5a3edc0: assertion failed: 0 == dmu_buf_hold_array(os, object, offset, size, FALSE, FTAG, numbufs, dbp), file: ../../common/fs/zfs/dmu.c, line: 614

d5a3eb08 genunix:assfail+5a (f9ce09da4, f9ce0a9c)
d5a3eb68 zfs:dmu_write+1a0 (d55af620, 57, 0, ba)
d5a3ec08 zfs:space_map_sync+304 (d5f13ed4, 1, d5f13c)
d5a3ec7b zfs:metaslab_sync+284 (d5f1ecc0, 122f3, 0,)
d5a3ecb8 zfs:vdev_sync+c6 (d579d940, 122f3, 0)
d5a3ed28 zfs:spa_sync+3d0 (d579c980, 122f3, 0,)
d5a3eda8 zfs:txg_sync_thread+308 (d55045c0, 0)
d5a3edb8 unix:thread_start+8 ()

This is on snv_117, 32-bit. Is this a known issue? Any workarounds?

- Thomas
Re: [zfs-discuss] Increase size of ZFS mirror
dick hoogendijk schrieb:
> On Wed, 24 Jun 2009 03:14:52 PDT Ben no-re...@opensolaris.org wrote:
>> If I detach c5d1s0, add a 1TB drive, attach that, wait for it to
>> resilver, then detach c5d0s0 and add another 1TB drive and attach
>> that to the zpool, will that up the storage of the pool?
>
> That will do the trick perfectly. I just did the same last week ;-)

Doesn't the detach command render the detached disk unassociated with the pool? I think it might be better to import the pool with only one half of the mirror, without detaching the disk, and then do a zpool replace. In this case, if something goes wrong during the resilver, you still have the other half of the mirror to bring your pool back up again. If you detach the disk up front, this won't be possible.

Just an idea...

- Thomas
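A hedged sketch of that safer one-side-at-a-time sequence (the pool name and the new drives' device names are made up; the old names follow the thread):

```shell
# Swap mirror halves via zpool replace instead of detaching first, so a
# failed resilver still leaves the other half intact:
zpool replace tank c5d1s0 c7d0s0   # resilver onto the first 1TB drive
zpool status tank                  # wait until the resilver completes
zpool replace tank c5d0s0 c8d0s0   # then swap the second half
```

Only after both replaces have resilvered cleanly does the pool run entirely on the new drives.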
Re: [zfs-discuss] SPARC SATA, please.
Andre van Eyssen schrieb:
> On Mon, 22 Jun 2009, Jacob Ritorto wrote:
>> Is there a card for OpenSolaris 2009.06 SPARC that will do SATA
>> correctly yet? Need it for a super cheapie, low expectations,
>> SunBlade 100 filer, so I think it has to be notched for the 5v PCI
>> slot, iirc. I'm OK with slow -- main goals here are power saving
>> (sleep all 4 disks) and 1TB+ space. Oh, and I hate to be an old
>> head, but I don't want a peecee. They still scare me :) Thinking
>> root pool on 16GB ssd, perhaps, so the thing can spin down the main
>> pool and idle *really* cheaply.
>
> The LSI SAS controllers with SATA ports work nicely with SPARC. I
> have one in my V880. On a Blade-100, however, you might have some
> issues due to the craptitude of the PCI slots. To be honest, the
> Grover was a fun machine at the time, but I think that time may have
> passed. Oh, and if you do grab the LSI card, don't let James catch
> you using the itmpt driver or lsiutils ;-)

I'm also using an LSI SAS card for attaching SATA disks to a Blade 2500. In my experience there are some severe problems:

1) Once the disks spin down due to idleness, it can become impossible to reactivate them without doing a full reboot (i.e. hot plugging won't help).

2) Disks that were attached once leave a stale /dev/dsk entry behind that takes a full 7 seconds to stat(), with the kernel running at 100%.

Apart from that it works fine.

- Thomas
Re: [zfs-discuss] SPARC SATA, please.
Volker A. Brandt schrieb:
>>>> 2) Disks that were attached once leave a stale /dev/dsk entry
>>>> behind that takes a full 7 seconds to stat(), with the kernel
>>>> running at 100%.
>>>
>>> Such entries should go away with an invocation of devfsadm -vC. If
>>> they don't, it's a bug IMHO.
>>
>> Yes, they go away. But the problem is when you do this and replug
>> the disks, they don't show up again... And that's even worse IMO...
>
> So you want such disks to behave more like USB sticks? If there was a
> good way to mark certain devices or a device tree as volatile, then
> this would be an interesting RFE. I would certainly not want *all* of
> my disks to come and go as they please. :-) I am not sure how
> feasible an implementation would be, though.
>
> Regards -- Volker

Yes - that's my usage scenario. Or to be more precise, I have a small chassis with two disks, which I only want to attach for backup purposes. I just send/receive from my active pool to the backup pool, and then detach the backup pool. I like having backup disks physically detached when not in use. Like this, nothing can really screw them up but a fire in the room...

I thought SAS/SATA would be hot-pluggable - so what's the difference between USB's hot-plug feature and the one of SAS/SATA, other than that USB is handled by the volume manager? So, yes, it would be nice if one could assign a SATA disk to the volume manager.

- Thomas
[zfs-discuss] replication issue
Hi,

I just tried replicating a zfs dataset, which failed because the dataset has a mountpoint set and zfs receive tried to mount the target dataset on the same directory. I.e. I did the following:

$ zfs send -R mypool/h...@20090615 | zfs receive -d backup
cannot mount '/var/hg': directory is not empty

Is this a known issue, or is this a user error because of -d on the receiving side?

This happened on:

% uname -a
SunOS azalin 5.10 Generic_139555-08 sun4u sparc SUNW,Sun-Blade-2500

- Thomas
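If the installed zfs supports it, receiving with -u should sidestep the collision by not mounting the received datasets at all. A hedged sketch (the dataset and snapshot names below are placeholders standing in for the ones from the failing command):

```shell
# Receive without mounting, then give the backup copy its own mountpoint
# before anything tries to mount it:
zfs send -R mypool/hg@20090615 | zfs receive -d -u backup
zfs set mountpoint=/backup/hg backup/hg
```

Without -u, the received dataset inherits the source's mountpoint property, which is exactly what produces the "directory is not empty" failure.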
Re: [zfs-discuss] Monitoring ZFS host memory use
Troy Nancarrow (MEL) schrieb:
> Hi,
>
> Please forgive me if my searching-fu has failed me in this case, but
> I've been unable to find any information on how people are going
> about monitoring and alerting regarding memory usage on Solaris hosts
> using ZFS.
>
> The problem is not that the ZFS ARC is using up the memory, but that
> the script Nagios uses to check memory usage simply sees, say, 96%
> RAM used, and alerts.
>
> The options I can see, and the risks I see with them, are:
>
> 1) Raise the alert thresholds so that they are both (warn and crit)
> above the maximum that the ARC should let itself reach. The problem
> is I can see those being in the order of 98/99%, which doesn't leave
> a lot of room for response if memory usage is headed towards 100%.
>
> 2) Alter the warning script to ignore the ARC cache and do alerting
> based on what's left, perhaps with a third threshold somewhere above
> where the ARC should let things get, in case for some reason the ARC
> isn't returning memory to apps. The risk I see here is that ignoring
> the ARC may present other odd scenarios where I'm essentially
> ignoring what's causing the memory problems.
>
> So how are others monitoring memory usage on ZFS servers?
>
> I've read (but can't find a written reference) that the ARC limits
> itself such that 1GB of memory is always free. Is that a hard-coded
> number? Is there a bit of leeway around it, or can I rely on that
> exact number of bytes being free unless there is impending 100%
> memory utilisation?
>
> Regards,
> Troy Nancarrow

The ZFS Evil Tuning Guide contains a description of how to limit the ARC size. Look here:
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Solaris_10_8.2F07_and_Solaris_Nevada_.28snv_51.29_Releases

Concerning monitoring of the ARC size, I use (of course) my own tool called sysstat. It shows all key system metrics on one terminal page, similar to top. You can get it here:
http://www.maier-komor.de/sysstat.html

HTH,
Thomas
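For reference, the ARC-capping approach described in the Evil Tuning Guide boils down to one /etc/system line plus a reboot, and the current ARC size is visible via kstat. A hedged sketch (the 2 GB cap is an example value, pick one that leaves your Nagios thresholds sensible headroom):

```shell
# Cap the ARC at 2 GB so memory-usage thresholds become predictable:
echo "set zfs:zfs_arc_max = 0x80000000" >> /etc/system
# (reboot required) -- afterwards, the ARC size in bytes can be watched with:
kstat -p zfs:0:arcstats:size
```

A monitoring script could subtract that kstat value from "used" memory to alert on real application pressure rather than cache growth.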
[zfs-discuss] ext4 bug zfs handling of the very same situation
Hi,

there was recently a bug reported against ext4 that gets triggered by KDE:
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781

Now I'd like to verify that my understanding of ZFS behavior and implementation is correct, and that ZFS is unaffected by this kind of issue. Maybe somebody would like to comment on this.

The underlying problem with ext4 is that some KDE executables do something like this:

1a) open and read data from file x, close file x
1b) open and truncate file x
1c) write data to file x
1d) close file x

or

2a) open and read data from file x, close file x
2b) open and truncate file x.new
2c) write data to file x.new
2d) close file x.new
2e) rename file x.new to file x

Concerning case 1, I think ZFS may lose data if power is lost right after 1b) and the open(xxx, O_WRONLY|O_TRUNC|O_CREAT) is issued in a transaction group separate from the one containing 1c/1d.

Concerning case 2, I cannot see ZFS losing any data, because of copy-on-write and transaction grouping.

Theodore Ts'o (ext4 developer) commented that both cases are flawed and cannot be supported correctly, because of a lacking fsync() before close. Is this correct? His comment is over here:
https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/54

Any thoughts or comments?

TIA,
Thomas
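For reference, the robust variant of case 2 adds an explicit flush before the rename. A minimal shell sketch of the pattern (file names are made up; `sync` here is a coarse stand-in for calling fsync() on the new file's descriptor, which is what Ts'o's comment asks for):

```shell
# Demonstrate the write-new/flush/rename-over pattern from case 2:
cd "$(mktemp -d)"
printf 'old contents\n' > config.txt       # the existing file x
printf 'new contents\n' > config.txt.new   # 2b/2c: write x.new in full
sync                                       # flush before replacing x
mv config.txt.new config.txt               # 2e: atomic rename over x
cat config.txt
```

After the rename, readers see either the complete old file or the complete new one; the flush ensures the new data is durable before the old copy becomes unreachable.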
Re: [zfs-discuss] How to use mbuffer with zfs send/recv
Julius Roberts wrote:
>>> How do I compile mbuffer for our system?
>
> Thanks to Mike Futerko for help with the compile, I now have it
> installed OK.
>
>>> and what syntax do I use to invoke it within the zfs send recv?
>
> Still looking for answers to this one? Any example syntax, gotchas
> etc. would be much appreciated.

First start the receive side, then the sender side:

receiver: mbuffer -s 128k -m 200M -I sender:8000 | zfs receive filesystem
sender:   zfs send pool/filesystem | mbuffer -s 128k -m 200M -O receiver:8000

Of course, you should adjust the hostnames accordingly, and set the mbuffer buffer size to a value that fits your needs (option -m).

BTW: I've just released a new version of mbuffer which defaults to a TCP buffer size of 1M; this can be adjusted with option --tcpbuffer.

Cheers,
Thomas
Re: [zfs-discuss] 'zfs recv' is very slow
> Seems like there's a strong case to have such a program bundled in
> Solaris.

I think the idea of having a separate, configurable buffer program with a rich feature set fits the UNIX philosophy of having small programs that can be used as building blocks to solve larger problems.

mbuffer is already bundled with several Linux distros, and that is also the reason its feature set expanded over time. In the beginning there wasn't even support for network transfers. Today mbuffer supports direct transfer to multiple receivers, data transfer rate limitation, a high/low water mark algorithm, on-the-fly md5 calculation, multi-volume tape access, usage of sendfile, and a configurable buffer size/layout. So ZFS send/receive is just another use case for this tool.

- Thomas
Re: [zfs-discuss] 'zfs recv' is very slow
Joerg Schilling schrieb: Andrew Gabriel [EMAIL PROTECTED] wrote: That is exactly the issue. When the zfs recv data has been written, zfs recv starts reading the network again, but there's only a tiny amount of data buffered in the TCP/IP stack, so it has to wait for the network to heave more data across. In effect, it's a single-buffered copy. The addition of a buffer program turns it into a double-buffered (or cyclic-buffered) copy, with the disks running flat out continuously, and the network streaming data across continuously at the disk platter speed. rmt and star increase the socket read/write buffer sizes via setsockopt(STDOUT_FILENO, SOL_SOCKET, SO_SNDBUF, ...) and setsockopt(STDIN_FILENO, SOL_SOCKET, SO_RCVBUF, ...) when doing remote tape access. This has a notable effect on throughput. Jörg Yesterday I released a new version of mbuffer, which also enlarges the default TCP buffer size. So everybody using mbuffer for network data transfer might want to update. For everybody unfamiliar with mbuffer, it might be worth noting that it has a bunch of additional features, e.g. sending to multiple clients at once, and high/low watermark flushing to prevent tape drives from stop/rewind/restart cycles. - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mbuffer WAS'zfs recv' is very slow
Jerry K schrieb: Hello Thomas, What is mbuffer? Where might I go to read more about it? Thanks, Jerry Yesterday I released a new version of mbuffer, which also enlarges the default TCP buffer size. So everybody using mbuffer for network data transfer might want to update. For everybody unfamiliar with mbuffer, it might be worth noting that it has a bunch of additional features, e.g. sending to multiple clients at once, and high/low watermark flushing to prevent tape drives from stop/rewind/restart cycles. - Thomas The man page is included in the source, which you can get over here: http://www.maier-komor.de/mbuffer.html New releases are announced on freshmeat.org. Maybe I should add an HTML version of the man page to the mbuffer homepage... - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] 'zfs recv' is very slow
- original message Subject: Re: [zfs-discuss] 'zfs recv' is very slow Sent: Fri, 14 Nov 2008 From: Bob Friesenhahn [EMAIL PROTECTED] On Fri, 14 Nov 2008, Joerg Schilling wrote: On my first Sun at home (a Sun 2/50 with 1 MB of RAM) in 1986, I could set the socket buffer size to 63 kB. 63 kB : 1 MB is the same ratio as 256 MB : 4 GB. BTW: a lot of numbers in Solaris have not grown in a long time and thus create problems now. Just think about the maxphys values: 63 kB on x86 does not even allow writing a single BluRay disc sector in a single transfer. Bloating kernel memory is not the right answer. Solaris comes with a quite effective POSIX threads library (standard since 1996) which makes it easy to quickly shuttle the data into a buffer in your own application. One thread deals with the network while the other thread deals with the device. I imagine that this is what the supreme mbuffer program is doing. Bob Basically, mbuffer does just this - but it additionally has a whole bunch of extra functionality. At least there are people who use it to lengthen the life of their tape drives with the high/low watermark feature... Thomas --- end of original message ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Roch schrieb: Thomas, for long-latency fat links, it should be quite beneficial to set the socket buffer on the receive side (instead of having users tune tcp_recv_hiwat). Throughput of a TCP connection is gated by receive socket buffer / round trip time. Could that be Ross' problem? -r Hmm, I'm not a TCP expert, but that sounds absolutely possible, if Solaris 10 isn't tuning the TCP buffer automatically. The default receive buffer seems to be 48k (at least on a V240 running 118833-33). So if the block size is something like 128k, it would absolutely make sense to enlarge the receive buffer to compensate for the round trip time... Ross: Would you like a patch to test if this is the case? Which version of mbuffer are you currently using? - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS wrecked our system...
Christiaan Willemsen schrieb: do the disks show up as expected in format? Is your root pool just a single disk or is it a mirror of multiple disks? Did you attach/detach any disks to the root pool before rebooting? No, we did nothing at all to the pools. The root pool is a hardware mirror, not a zfs mirror. Actually, it looks like OpenSolaris can't find any of the disks. There was recently a thread where someone had an issue importing a known-to-be-healthy pool after a BIOS update. It turned out that the new BIOS had a different host protected area on the disks and therefore reported a different disk size to the OS. I'd check the controller and BIOS settings that are concerned with disks. Any change in this area might lead to this effect. Additionally, I think it is not a good idea to use a RAID controller to mirror disks for ZFS. This way a silently corrupted sector cannot be corrected by ZFS. In contrast, if you give ZFS both disks as individual disks and create a zpool mirror, ZFS is able to detect corrupted sectors and correct them from the healthy side of the mirror. A hardware mirror will never know which side of the mirror is good and which is bad... ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS wrecked our system...
Christiaan Willemsen schrieb: Since the last reboot, our system won't boot anymore. It hangs at the Use is subject to license terms. line for a few minutes, and then gives an error that it can't find the device it needs for the root pool, and eventually reboots. We did not change anything on the system or the Adaptec controller. So I tried the OpenSolaris boot CD. It also takes a few minutes to boot (this was never the case before), halting at the exact same line as the normal boot. It also complains about drives being offline, but this actually cannot be the case; all drives are working fine. When I get to a console and do a zpool import, it can't find any pool. There should be two pools, one for booting, and another one for the data. This is all on snv_98... do the disks show up as expected in format? Is your root pool just a single disk or is it a mirror of multiple disks? Did you attach/detach any disks to the root pool before rebooting? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] change in zpool_get_prop_int
Hi, I'm observing a change in the values returned by zpool_get_prop_int. In Solaris 10 update 5 this function returned the values for ZPOOL_PROP_CAPACITY in bytes, but in update 6 (i.e. nv88?) it seems to be returning the value in kB. Both Solaris versions were shipped with libzfs.so.2. So how can one distinguish between those two variants? Any comments on this change? - Thomas P.S.: I know this is a private interface, but it is quite handy for my system observation tool sysstat... -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Building a 2nd pool, can I do it in stages?
Bob Friesenhahn schrieb: On Tue, 21 Oct 2008, Håvard Krüger wrote: Is it possible to build a RaidZ with 3x 1TB disks and 5x 0.5TB disks, and then swap out the 0.5TB disks as time goes by? Is there documentation/a wiki on doing this? Yes, you can build a raidz vdev with all of these drives, but only 0.5TB will be used from your 1TB drives. Once you replace *all* of the 0.5TB drives with 1TB drives, the full space of the 1TB drives will be used. Depending on how likely it is that you will replace all of these old drives, you might consider using the new drives to add a second vdev to the pool so that the disk space on all the existing drives may be fully used and you obtain better multiuser performance. Bob But in this case one should be aware that once another vdev is added, it is currently impossible to get rid of it afterwards. I.e. the pool will always have two RaidZ vdevs, and the new vdev, which in this scenario would consist of 3 1TB disks, couldn't be grown by adding another disk. So one would be forced to add yet another raidz vdev. IMO, I'd go for replacing the 0.5TB disks one by one and stick to a single vdev. - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Thomas Maier-Komor schrieb: BTW: I released a new version of mbuffer today. WARNING!!! Sorry people!!! The latest version of mbuffer has a regression that can CORRUPT output if stdout is used. Please fall back to the previous version. A fix is on the way... - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Ross Smith schrieb: I'm using 2008-05-07 (latest stable), am I right in assuming that one is ok? Date: Wed, 15 Oct 2008 13:52:42 +0200 From: [EMAIL PROTECTED] To: [EMAIL PROTECTED]; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Improving zfs send performance Thomas Maier-Komor schrieb: BTW: I released a new version of mbuffer today. WARNING!!! Sorry people!!! The latest version of mbuffer has a regression that can CORRUPT output if stdout is used. Please fall back to the previous version. A fix is on the way... - Thomas Yes, that one is OK. The regression appeared in 20081014. - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Ross schrieb: Hi, I'm just doing my first proper send/receive over the network and I'm getting just 9.4MB/s over a gigabit link. Would you be able to provide an example of how to use mbuffer / socat with ZFS for a Solaris beginner? thanks, Ross -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

receiver$ mbuffer -I sender:1 -s 128k -m 512M | zfs receive
sender$ zfs send mypool/[EMAIL PROTECTED] | mbuffer -s 128k -m 512M -O receiver:1

BTW: I released a new version of mbuffer today. HTH, Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Carsten Aulbert schrieb: Hi Thomas, Thomas Maier-Komor wrote: Carsten, the summary looks like you are using mbuffer. Can you elaborate on what options you are passing to mbuffer? Maybe changing the blocksize to be consistent with the recordsize of the zpool could improve performance. Is the buffer running full or is it empty most of the time? Are you sure that the network connection is 10Gb/s all the way through from machine to machine? Well spotted :) right now plain mbuffer with plenty of buffer (-m 2048M) on both ends, and I have not seen any buffer exceeding the 10% watermark level. The network connections are via Neterion XFrame II Sun Fire NICs, then via CX4 cables to our core switch where both boxes are directly connected (Woven Systems EFX1000). netperf tells me that the TCP performance is close to 7.5 GBit/s duplex, and if I use cat /dev/zero | mbuffer | socat --- socat | mbuffer /dev/null I easily see speeds of about 350-400 MB/s, so I think the network is fine. Cheers Carsten ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss I don't know socat or what benefit it gives you, but have you tried using mbuffer to send and receive directly (options -I and -O)? Additionally, try to set the block size of mbuffer to the recordsize of zfs (usually 128k):

receiver$ mbuffer -I sender:1 -s 128k -m 2048M | zfs receive
sender$ zfs send blabla | mbuffer -s 128k -m 2048M -O receiver:1

As transmitting from /dev/zero to /dev/null runs at a rate of 350MB/s, I guess you are really hitting the maximum speed of your zpool. From my understanding, I'd guess sending is always slower than receiving, because reads are random and writes are sequential. So it should be quite normal that mbuffer's buffer doesn't really see a lot of usage. Cheers, Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Carsten Aulbert schrieb: Hi again, Thomas Maier-Komor wrote: Carsten Aulbert schrieb: Hi Thomas, I don't know socat or what benefit it gives you, but have you tried using mbuffer to send and receive directly (options -I and -O)? I thought we tried that in the past and with socat it seemed faster, but I just made a brief test and I got (/dev/zero - remote /dev/null) 330 MB/s with mbuffer+socat and 430 MB/s with mbuffer alone. Additionally, try to set the block size of mbuffer to the recordsize of zfs (usually 128k): receiver$ mbuffer -I sender:1 -s 128k -m 2048M | zfs receive sender$ zfs send blabla | mbuffer -s 128k -m 2048M -O receiver:1 We are using 32k since many of our users use tiny files (and then I need to reduce the buffer size because of this 'funny' error): mbuffer: fatal: Cannot address so much memory (32768*65536=2147483648). Does this qualify for a bug report? Thanks for the hint of looking into this again! Cheers Carsten ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Yes, this qualifies for a bug report. As a workaround for now, you can compile in 64-bit mode, i.e.:

$ ./configure CFLAGS="-g -O -m64"
$ make
$ make install

This works for Sun Studio 12 and gcc. For older versions of Sun Studio, you need to pass -xarch=v9 instead of -m64. I am planning to release an updated version of mbuffer this week. I'll include a patch for this issue. Cheers, Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Improving zfs send performance
Carsten Aulbert schrieb: Hi all, although I'm running all this on a Sol10u5 X4500, I hope I may ask this question here. If not, please let me know where to head to. We are running several X4500s with only 3 raidz2 zpools since we want quite a bit of storage space[*], but the performance we get when using zfs send is sometimes really lousy. Of course this depends on what's in the file system, but when doing a few backups today I have seen the following:

receiving full stream of atlashome/[EMAIL PROTECTED] into atlashome/BACKUP/[EMAIL PROTECTED]
in @ 11.1 MB/s, out @ 11.1 MB/s, 14.9 GB total, buffer 0% full
summary: 14.9 GByte in 45 min 42.8 sec - average of 5708 kB/s

So, a mere 15 GB were transferred in 45 minutes; another user's home which is quite large (7TB) took more than 42 hours to be transferred. Since all this is going over a 10 Gb/s network and the CPUs are all idle, I would really like to know why * zfs send is so slow and * how can I improve the speed? Thanks a lot for any hint Cheers Carsten [*] we have done quite a few tests with more zpools but were not able to improve the speeds substantially. For this particularly bad file system I still need to histogram the file sizes. Carsten, the summary looks like you are using mbuffer. Can you elaborate on what options you are passing to mbuffer? Maybe changing the blocksize to be consistent with the recordsize of the zpool could improve performance. Is the buffer running full or is it empty most of the time? Are you sure that the network connection is 10Gb/s all the way through from machine to machine? - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Pros/Cons of multiple zpools?
Joseph Mocker schrieb: Hello, I haven't seen this discussed before. Any pointers would be appreciated. I'm curious: if I have a set of disks in a system, is there any benefit or disadvantage to breaking the disks into multiple pools instead of a single pool? Do multiple pools cause any additional overhead for ZFS, for example? Can they cause cache contention/starvation issues? Thanks... --joe ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Currently, I have two pools in my system: one for live data and the other for backup. When doing large backups (i.e. tar'ing one directory hierarchy from live to backup), I've seen severe memory pressure on the system - as if both pools were competing for memory... Maybe with zfs boot/root becoming available, I'll add a third pool for the OS. From what I've seen, zfs makes very much sense for boot/root if you are using live upgrade. I like the idea of having OS and data separated, but on a system with only two disks, I'd definitely go for a single mirrored zpool where both OS and data reside. I guess sharing one physical disk among multiple zpools could have severe negative impacts during concurrent accesses. But I really have no in-depth knowledge to say for sure. Maybe somebody else can comment on this... - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS, SATA, LSI and stability
Frank Fischer wrote: After having massive problems with a Supermicro X7DBE box using AOC-SAT2-MV8 Marvell controllers and opensolaris snv79 (same as described here: http://sunsolve.sun.com/search/document.do?assetkey=1-66-233341-1) we just started over using new hardware and opensolaris 2008.05 upgraded to snv94. We again used a Supermicro X7DBE, but now with two LSI SAS3081E SAS controllers. And guess what? Now we get these error messages in /var/adm/messages: Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd11): Aug 11 18:20:52 thumper2Error for Command: read(10) Error Level: Retryable Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice] Requested Block: 1423173120Error Block: 1423173120 Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: WD-WCAP Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice] Sense Key: Unit_Attention Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0 Along with these messages there are a lot of these messages: Aug 11 18:20:51 thumper2 scsi: [ID 365881 kern.info] /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt1): Aug 11 18:20:51 thumper2Log info 0x31123000 received for target 5. Aug 11 18:20:51 thumper2scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc I could believe one faulty disk, but not two: Aug 11 17:47:47 thumper2 scsi: [ID 365881 kern.info] /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt1): Aug 11 17:47:47 thumper2Log info 0x31123000 received for target 4. 
Aug 11 17:47:47 thumper2scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd10): Aug 11 17:47:48 thumper2Error for Command: read(10) Error Level: Retryable Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice] Requested Block: 252165120 Error Block: 252165120 Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice] Sense Key: Unit_Attention Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0 Aug 11 17:48:34 thumper2 scsi: [ID 243001 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0): Does somebody know what is going on here? I have checked the disks with iostat -En : -bash-3.2# iostat -En ... c4t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: FUJITSU Product: MBA3073RCRevision: 0103 Serial No: Size: 73.54GB 73543163904 bytes Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 0 Predictive Failure Analysis: 0 c4t5d0 Soft Errors: 4 Hard Errors: 24 Transport Errors: 179 Vendor: ATA Product: ST3750330NS Revision: SN04 Serial No: Size: 750.16GB 750156374016 bytes Media Error: 0 Device Not Ready: 0 No Device: 22 Recoverable: 4 Illegal Request: 0 Predictive Failure Analysis: 0 c4t6d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Vendor: ATA Product: WDC WD7500AYYS-0 Revision: 4G30 Serial No: Size: 750.16GB 750156374016 bytes Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 0 Predictive Failure Analysis: 0 c6t4d0 Soft Errors: 6 Hard Errors: 17 Transport Errors: 466 Vendor: ATA Product: ST3750640NS Revision: GSerial No: Size: 750.16GB 750156374016 bytes Media Error: 0 Device Not Ready: 0 No Device: 17 Recoverable: 6 Illegal Request: 0 
Predictive Failure Analysis: 0 c6t5d0 Soft Errors: 2 Hard Errors: 23 Transport Errors: 539 Vendor: ATA Product: WDC WD7500AYYS-0 Revision: 4G30 Serial No: Size: 750.16GB 750156374016 bytes Media Error: 0 Device Not Ready: 0 No Device: 23 Recoverable: 2 Illegal Request: 0 Predictive Failure Analysis: 0 I have check the drives with smartctl: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 115 075 006Pre-fail Always - 94384069 3 Spin_Up_Time0x0003 093 093 000Pre-fail Always - 0 4 Start_Stop_Count0x0032 100 100 020Old_age Always - 15 5 Reallocated_Sector_Ct 0x0033 100 100 036Pre-fail Always - 0 7 Seek_Error_Rate 0x000f 084 060 030
Re: [zfs-discuss] SATA controller suggestion
Tom Buskey schrieb: On Fri, Jun 6, 2008 at 16:23, Tom Buskey [EMAIL PROTECTED] wrote: I have an AMD 939 MB w/ Nvidia on the motherboard and 4 500GB SATA II drives in a RAIDZ. ... I get 550 MB/s I doubt this number a lot. That's almost 200 (550/(N-1) = 183) MB/s per disk, and drives I've seen are usually more in the neighborhood of 80 MB/s. How did you come up with this number? What benchmark did you run? While it's executing, what does zpool iostat mypool 10 show?

time gdd if=/dev/zero bs=1048576 count=10240 of=/data/video/x
real 0m13.503s
user 0m0.016s
sys 0m8.981s

Are you sure gdd doesn't create a sparse file? - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Get your SXCE on ZFS here!
[EMAIL PROTECTED] wrote: Uwe, Please see pages 55-80 of the ZFS Admin Guide, here: http://opensolaris.org/os/community/zfs/docs/ Basically, the process is to upgrade from nv81 to nv90 by using the standard upgrade feature. Then, use lucreate to migrate your UFS root file system to a ZFS file system, like this:

1. Verify you have a current backup.
2. Read the known issues and requirements.
3. Upgrade from nv81 to nv90 using the standard upgrade feature.
4. Migrate your UFS root file system to a ZFS root file system, like this:
# zpool create rpool mirror c0t1d0s0 c0t2d0s0
# lucreate -c c0t0d0s0 -n zfsBE -p rpool
5. Activate the ZFS BE, like this:
# luactivate zfsBE

Please see the doc for more examples of this process. Cindy Hi Cindy, unfortunately, this approach fails for me, because lucreate errors out (see below). Does anybody know if this is a known issue? - Thomas # lucreate -n nv90ext -p ext1 Analyzing system configuration. Comparing source boot environment c0t1d0s0 file systems with the file system(s) you specified for the new boot environment. Determining which file systems should be in the new boot environment. Updating boot environment description database on all BEs. Updating system configuration files. The device /dev/dsk/c1t9d0 is not a root device for any boot environment; cannot get BE ID. Creating configuration for boot environment nv90ext. Source boot environment is c0t1d0s0. Creating boot environment nv90ext. Creating file systems on boot environment nv90ext. Creating zfs file system for / in zone global on ext1/ROOT/nv90ext. Populating file systems on boot environment nv90ext. Checking selection integrity. Integrity check OK. Populating contents of mount point /. Copying. WARNING: The file /tmp/lucopy.errors.5981 contains a list of 45 potential problems (issues) that were encountered while populating boot environment nv90ext. INFORMATION: You must review the issues listed in /tmp/lucopy.errors.5981 and determine if any must be resolved. 
In general, you can ignore warnings about files that were skipped because they did not exist or could not be opened. You cannot ignore errors such as directories or files that could not be created, or file systems running out of disk space. You must manually resolve any such problems before you activate boot environment nv90ext. Creating shared file system mount points. Creating compare databases for boot environment nv90ext. Creating compare database for file system /. Updating compare databases on boot environment nv90ext. Making boot environment nv90ext bootable. ERROR: Unable to determine the configuration of the target boot environment nv90ext. ERROR: Update of loader failed. ERROR: Cannot make ABE nv90ext bootable. Making the ABE nv90ext bootable FAILED. ERROR: Unable to make boot environment nv90ext bootable. ERROR: Unable to populate file systems on boot environment nv90ext. ERROR: Cannot make file systems for boot environment nv90ext. $ cat /tmp/lucopy.errors.5981 Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/template Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/latest Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/1/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/4/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/5/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/14/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/16/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/18/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/19/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/23/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/25/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/28/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/37/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/43/ctl Restoring existing 
/.alt.tmp.b-aEb.mnt/system/contract/process/44/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/45/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/46/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/47/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/48/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/51/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/52/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/53/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/55/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/56/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/57/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/58/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/59/ctl Restoring existing /.alt.tmp.b-aEb.mnt/system/contract/process/60/ctl Restoring existing
Re: [zfs-discuss] zfs equivalent of ufsdump and ufsrestore
Darren J Moffat schrieb: Joerg Schilling wrote: Poulos, Joe [EMAIL PROTECTED] wrote: Is there a ZFS equivalent of ufsdump and ufsrestore? Will creating a tar file work with ZFS? We are trying to back up a ZFS file system to a separate disk, and would like to take advantage of something like ufsdump rather than using expensive backup software. The closest equivalent to ufsdump and ufsrestore is star. I very strongly disagree. The closest ZFS equivalent to ufsdump is 'zfs send'. 'zfs send', like ufsdump, has intimate awareness of the actual on-disk layout and is an integrated part of the filesystem implementation. star is a userland archiver. The man page for zfs states the following for send: The format of the stream is evolving. No backwards compatibility is guaranteed. You may not be able to receive your streams on future versions of ZFS. I think this should be taken into account when considering 'zfs send' for backup purposes... - Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] opensolaris 2008.05 boot recovery
Hi, I've run into an issue with a test machine that I'm happy to encounter on this machine, because here it is no real trouble. But I'd like to know the solution for this issue in case I run into it again... I've installed OpenSolaris 2008.05 on a USB disk on a laptop. After installing, I modified /etc/passwd by hand: I added another entry for the same uid with a different login name and home directory, so that I can log in with either a local or an NFS-imported home directory. Unfortunately I forgot to update /etc/shadow accordingly, so I ended up being unable to log in at all, because root is a profile and the newly added account came before the original account with the same uid. So no login was possible anymore. Normally, such a situation is not a problem. So I did what I usually would do and booted from the CD. OK, now I zpool imported rpool, modified /etc/passwd and /etc/shadow, exported the pool, and rebooted. BANG! Now the machine doesn't boot anymore, because during this process it somehow lost the ramdisk image. Now my question is: How do I reinstall the boot procedure on the laptop from the install CD? I've tried using bootadm with -R and the like but couldn't make any progress... Any ideas? Any hint would be highly appreciated... TIA, Thomas ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance of one single 'cp'
after some fruitful discussions with Jörg, it turned out that my mtwrite patch prevents tar, star, gtar, and unzip from setting the file times correctly. I've investigated this issue and updated the patch accordingly. Unfortunately, I encountered an issue concerning semaphores, which seem to have a race condition. At least I couldn't get it to work reliably with semaphores, so I switched over to condition variables, which work now. I'll investigate the semaphore issue as soon as I have time, but I'm pretty convinced that there is a race condition in the semaphore implementation, as the semaphore value from time to time grew larger than the number of elements in the work list. This was on Solaris 10 - so I'll try to generate a test for SX. Does anybody know of any issues related to semaphores? The work creator did the following:

- lock the structure containing the list
- attach an element to the list
- post the semaphore
- unlock the structure

The worker thread did the following:

- wait on the semaphore
- lock the structure containing the list
- remove an element from the list
- unlock the structure
- perform the work described by the list element
- lock the structure
- update the structure to reflect the work results
- unlock the structure
- restart from the beginning

Is anything wrong with this approach? Replacing the semaphore calls with condition calls and swapping steps 1 and 2 of the worker thread made it reliable... - Thomas P.S.: I published the updated mtwrite on my website yesterday - get it here: http://www.maier-komor.de/mtwrite.html ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance of one single 'cp'
Bob Friesenhahn schrieb: On my drive array (capable of 260MB/second single-process writes and 450MB/second single-process reads) 'zfs iostat' reports a read rate of about 59MB/second and a write rate of about 59MB/second when executing 'cp -r' on a directory containing thousands of 8MB files. This seems very similar to the performance you are seeing. The system indicators (other than disk I/O) are almost flatlined at zero while the copy is going on. It seems that a multi-threaded 'cp' could be much faster. With GNU xargs, find, and cpio, I think that it is possible to cobble together a much faster copy since GNU xargs supports --max-procs and --max-args arguments to allow executing commands concurrently with different sets of files. Bob That's the reason I wrote a binary patch (preloadable shared object) for cp, tar, and friends. You might want to take a look at it... Here: http://www.maier-komor.de/mtwrite.html - Thomas
Re: [zfs-discuss] kernel memory and zfs
Richard Elling wrote: The size of the ARC (cache) is available from kstat in the zfs module (kstat -m zfs). Neel wrote a nifty tool to track it over time called arcstat. See http://www.solarisinternals.com/wiki/index.php/Arcstat Remember that this is a cache and subject to eviction when memory pressure grows. The Solaris Internals books have more details on how the Solaris virtual memory system works and are recommended reading. -- richard The arcsize is also displayed in sysstat, which additionally shows a lot more information in a 'top'-like fashion. Get it here: http://www.maier-komor.de/sysstat.html - Thomas
Re: [zfs-discuss] LVM on ZFS
Kava schrieb: My 2 cents ... read somewhere that you should not be running LVM on top of ZFS ... something about additional overhead. Just for clarification: I was talking about different aspects of SVM and ZFS separately. I never recommended that you should run LVM on top of ZFS. I said that it might be possible, but I would rather do something else... - Thomas
Re: [zfs-discuss] LVM on ZFS
Thiago Sobral schrieb: Hi Thomas, Thomas Maier-Komor escreveu: Thiago Sobral schrieb: I need to manage volumes like LVM does on Linux or AIX, and I think that ZFS can solve this issue. I read the SVM specification and it certainly won't be the solution I'll adopt. I don't have Veritas here. Why do you think it doesn't fit your needs? What would you do on Linux or AIX that you think SVM cannot do? On Linux and AIX it's possible to create volume groups and create logical volumes inside them, so I can expand or reduce a logical volume. How can I do the same with SVM? If I create a slice (c0t0d0s0) with 100GB, can I create two metadevices inside (d0 and d1) and grow them? I think I should use soft partitions, shouldn't I? AFAIK, you can create soft partitions and grow them, but shrinking is not possible. Use metattach d0 10g to enlarge a logical volume. After that, use growfs to grow the filesystem within the enlarged volume. This is documented here: http://docs.sun.com/app/docs/doc/816-4520/tasks-softpart-1?a=view $ zfs create black/lv00 would give you a filesystem named lv00. Ok, but this filesystem gets the whole size of the pool and I want to limit it, e.g. to 10GB. If I need more later, I grow it. No. The newly created filesystem will only consume as much space as currently needed, i.e. it grows and shrinks automagically within the pool. Additionally, you can reserve space for the filesystem or set an upper bound on how much it may consume, using something like zfs set reservation=16G black/lv00 and zfs set quota=20G black/lv00. IMO this is more flexible than anything you will get with any logical volume manager. I think you should investigate the docs a little bit more closely or be a little bit more precise when posting your question. What are you actually trying to accomplish? I am reading the Sun docs, but I haven't found satisfactory answers. Do you have a good document or URL? Where did you look? Yes, docs.sun.com is pretty exhaustive.
But there are always tasks sections (see the doc referenced above), where you can look up common standard tasks with step-by-step guides on how to get things done. I think this is pretty good. HTH, Thomas
Re: [zfs-discuss] LVM on ZFS
Thiago Sobral schrieb: Hi folks, I need to manage volumes like LVM does on Linux or AIX, and I think that ZFS can solve this issue. I read the SVM specification and it certainly won't be the solution I'll adopt. I don't have Veritas here. Why do you think it doesn't fit your needs? What would you do on Linux or AIX that you think SVM cannot do? I created a pool with the name black and a volume lv00, then created a filesystem with the 'newfs' command: #newfs /dev/zvol/rdsk/black/lv00 Is this the right way? What is the best way to manage volumes in Solaris? Do you have a URL or document describing this? You can do it this way, but why would you? Doing a $ zfs create black/lv00 would give you a filesystem named lv00. I think you should investigate the docs a little bit more closely or be a little bit more precise when posting your question. What are you actually trying to accomplish? cheers, TS - Thomas
Re: [zfs-discuss] how to relocate a disk
Robert Milkowski schrieb: Hello Thomas, Friday, January 18, 2008, 10:31:17 AM, you wrote: TMK Hi, TMK I'd like to move a disk from one controller to another. This disk is TMK part of a mirror in a zfs pool. How can one do this without having to TMK export/import the pool or reboot the system? TMK I tried taking it offline and online again, but then zpool says the disk TMK is unavailable. Trying a zpool replace didn't work because it complains TMK that the new disk is part of a zfs pool... TMK So how can one do this? Instead of offline'ing it try to detach it and then attach it. However offline/online should work... does detach/attach work with just a very short resilvering or will this resync the disk completely?
[zfs-discuss] how to relocate a disk
Hi, I'd like to move a disk from one controller to another. This disk is part of a mirror in a zfs pool. How can one do this without having to export/import the pool or reboot the system? I tried taking it offline and online again, but then zpool says the disk is unavailable. Trying a zpool replace didn't work because it complains that the new disk is part of a zfs pool... So how can one do this? TIA, Thomas
Re: [zfs-discuss] Intent logs vs Journaling
the ZIL is always there in host memory, even when no synchronous writes are being done, since the POSIX fsync() call could be made on an open write channel at any time, requiring all to-date writes on that channel to be committed to persistent store before it returns to the application ... it's cheaper to write the ZIL at this point than to force the entire 5 sec buffer out prematurely I have a question related to this topic: why is there only a (tunable) 5 second threshold and not also an additional threshold on the buffer size (e.g. 50MB)? Sometimes I see my system writing huge amounts of data to a zfs while the disks stay idle for 5 seconds, although memory consumption is already quite high and it really would make sense (from my uneducated point of view as an observer) to start writing the data out to the disks. I think this leads to the pumping effect that has been mentioned previously in one of the forums here. Can anybody comment on this? TIA, Thomas
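The suggestion above amounts to flushing on whichever of two thresholds fires first: elapsed time or buffered bytes. As a toy sketch (the class, its API, and the 50MB default are invented for illustration; this is in no way how the ZFS transaction-group engine is actually implemented):

```python
import time

class WriteBuffer:
    """Toy write buffer that flushes when EITHER threshold is hit:
    a time limit (like the tunable 5 s interval) or a size limit
    (the additional threshold proposed above, e.g. 50 MB)."""

    def __init__(self, flush_cb, max_age=5.0, max_bytes=50 * 1024 * 1024):
        self.flush_cb = flush_cb
        self.max_age = max_age
        self.max_bytes = max_bytes
        self.buf = []
        self.size = 0
        self.born = time.monotonic()

    def write(self, data):
        self.buf.append(data)
        self.size += len(data)
        # flush if the buffer is big enough OR old enough
        if (self.size >= self.max_bytes
                or time.monotonic() - self.born >= self.max_age):
            self.flush()

    def flush(self):
        if self.buf:
            self.flush_cb(b"".join(self.buf))
        self.buf, self.size = [], 0
        self.born = time.monotonic()

flushes = []
wb = WriteBuffer(flushes.append, max_age=5.0, max_bytes=10)
wb.write(b"abcd")      # 4 bytes buffered, neither threshold hit
wb.write(b"efghijkl")  # 12 bytes total, size threshold triggers a flush
print(len(flushes))    # -> 1
```

With only the time threshold, a fast writer piles up data for the full interval and then hammers the disks all at once, which matches the idle-then-burst "pumping" behaviour described above; the size threshold caps how large each burst can get.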
[zfs-discuss] speedup 2-8x of tar xf on ZFS
Hi, now that I'm back in Germany, I have access to my machine at home with ZFS, so I could test my binary patch for multi-threading with tar on a ZFS filesystem. The results look like this:
.tar, small files (e.g. gcc source tree): speedup x8
.tar.gz, small files (gcc source tree): speedup x4
.tar, medium size files (e.g. object files of a compiled binutils tree): speedup x5
.tar.gz, medium size files: speedup x2-x3
Speedup is a comparison of the wallclock time (timex real) of tar against the patched multi-threaded tar, where the patched version is 2x-8x faster. Be aware that on a UFS filesystem it is about 1:1 speed - you may even suffer a 5%-10% decrease in performance. This test was on a Blade 2500 with 5GB RAM (i.e. everything in cache) running Solaris 10U3, and a ZFS filesystem on two 10k rpm 146G SCSI drives arranged as a ZFS mirror. To me this looks like a pretty good speedup. If you also want to benefit from this patch, grab it here: http://www.maier-komor.de/mtwrite.html. The current version includes a wrapper for tar called mttar to ease use, and has some enhancements concerning performance and error handling (see the Changelog for details). Have fun with Solaris! Cheers, Thomas
[zfs-discuss] patch making tar multi-thread
Hi everybody, many people, like myself, have tested the performance of the ZFS filesystem by doing a tar xf something.tar. Unfortunately, ZFS doesn't handle this workload very well, as all writes are executed sequentially. So some people requested a multi-threaded tar... Well, here it comes: I have written a small patch that intercepts the write system calls and friends and passes them off to worker threads. I'd really like to see some performance metrics with this patch on a ZFS filesystem. Unfortunately, I am currently far from home, where I have such a system. I'd be pleased if anybody could send me some results. Feedback and RFEs are also welcome. Get it here: http://www.maier-komor.de/mtwrite.html Cheers, Tom
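For illustration, the core idea of handing intercepted writes off to worker threads can be sketched as follows. This is a self-contained Python toy, not the actual preloadable shared object, and the function names (mt_write, mt_shutdown) are invented; the point is only that writes to the same fd always land on the same worker, so per-file ordering is preserved while different files proceed in parallel:

```python
import os
import queue
import threading

NWORKERS = 4
queues = [queue.Queue() for _ in range(NWORKERS)]

def worker(q):
    # each worker drains its own queue; None is the shutdown sentinel
    while True:
        job = q.get()
        if job is None:
            return
        fd, data = job
        os.write(fd, data)   # the real write happens here, asynchronously

threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
for t in threads:
    t.start()

def mt_write(fd, data):
    """Stand-in for the interposed write(): enqueue and return immediately.
    Hashing on fd keeps all writes to one file on one worker (ordering)."""
    queues[fd % NWORKERS].put((fd, bytes(data)))
    return len(data)

def mt_shutdown():
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()

# usage: the "extracting" thread queues writes and moves on
fd = os.open("/tmp/mtwrite_demo", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
mt_write(fd, b"hello ")
mt_write(fd, b"world")
mt_shutdown()
os.close(fd)
print(open("/tmp/mtwrite_demo").read())  # -> hello world
```

The win on ZFS comes from the caller (tar's extraction loop) no longer blocking on each write before it can open and create the next file.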
[zfs-discuss] Re: patch making tar multi-thread
Hello Thomas, With ZFS as a local file system it shouldn't be a problem unless tar fdsync's each file, but then removing the fdsyncs would be easier. In the case of nfs/zfs, a multi-threaded tar should help, but I guess not for writes but rather for file/dir creation and file closes. If you only put writes into worker threads, I'm not sure it will really help much, if at all. write and close are sent to worker threads. For open I cannot imagine a way to parallelize the operations without rewriting tar completely. I'd be interested in both local ZFS and ZFS over NFS. Masking out the fdsyncs could be an easy enhancement. But Sun's tar doesn't do them anyway IIRC, and star can be told to suppress the fdsyncs.
[zfs-discuss] ZFS and savecore
Hi, I'm not sure if this is the right forum, but I guess this topic will be bounced in the right direction from here. With ZFS using as much physical memory as it can get, dumps and live dumps via 'savecore -L' are huge. I just tested it on my workstation and got a 1.8G vmcore file when dumping only kernel pages. Might it be possible to add an option to dump without the whole ZFS cache? I guess this would make kernel live dumps small again, as they used to be... Any comments? Cheers, Tom
[zfs-discuss] Re: zpool snapshot fails on unmounted filesystem
Hi Tim, I just retried to reproduce it in order to generate a reliable test case. Unfortunately, I cannot reproduce the error message, so I really have no idea what might have caused it. Sorry, Tom
[zfs-discuss] zpool snapshot fails on unmounted filesystem
Is this a known problem/bug? $ zfs snapshot zpool/[EMAIL PROTECTED] internal error: unexpected error 16 at line 2302 of ../common/libzfs_dataset.c This occurred on: $ uname -a SunOS azalin 5.10 Generic_118833-24 sun4u sparc SUNW,Sun-Blade-2500
[zfs-discuss] live upgrade incompability
Hi, concerning this issue I didn't find anything in the bug database, so I thought I'd report it here... When running live upgrade on a system with a zfs, LU creates directories for all ZFS filesystems in the ABE. This causes svc:/system/filesystem/local to go into maintenance state when booting the ABE, because the zpool won't be imported due to the existing directory structure in its mount point. I observed this behavior on a Solaris 10 system with live upgrade 11.10. Tom
[zfs-discuss] howto reduce ?zfs introduced? noise
Hi, after switching my ~/ at home from ufs to zfs, I am a little bit disturbed by the noise the disks are making. To be more precise, I always have Thunderbird and Firefox running on my desktop, and either or both seem to be writing to my ~/ at short intervals; ZFS flushes these transactions to the disks at intervals of about 2-5 seconds. In contrast, UFS seems to do somewhat more aggressive caching, which reduces disk noise. I didn't really track down who the offender is or what the precise reason is. I only know that the noise disappears as soon as I close Thunderbird and Firefox, so maybe there is an easy way to solve this problem at the application level. And anyway, I want to move my $HOME to quieter disks. But I am curious: am I the only one who has observed this behaviour? Maybe there is even an easy way to reduce the noise. Additionally, I'd guess that moving the disk heads all the time won't make the disks last any longer... Cheers, Tom
[zfs-discuss] ZFS home/JDS interaction issue
Hi, I just upgraded my machine at home to Solaris 10U2. As I already had a ZFS, I wanted to migrate my home directories at once from a local UFS metadisk to ZFS. Copying and changing the automounter config succeeded without any problems. But when I tried to log in to JDS, login succeeded, yet JDS did not start and the X session always gets terminated after a couple of seconds. /var/dt/Xerrors says that /dev/fb could not be accessed, although it works without any problem when running from the UFS filesystem. Switching back to my UFS-based home resolved the issue. I even tried switching over to ZFS again and rebooted the machine to make 100% sure everything was in a sane state (i.e. no gconfd etc.), but the issue persisted, and switching back to UFS resolved it again. Has anybody else had similar problems? Any idea how to resolve this? TIA, Tom