Re: [zfs-discuss] Small stalls slowing down rsync from holding network saturation every 5 seconds
On 06/04/2010 06:15 PM, Bob Friesenhahn wrote:
> On Fri, 4 Jun 2010, Sandon Van Ness wrote:
>>
>> Interestingly enough, when I went to copy the data back I got even worse
>> download speeds than I did write speeds! It looks like I need some sort
>> of read-ahead, as unlike the writes it doesn't appear to be CPU bound:
>> using mbuffer/tar gives me full gigabit speeds. You can see it in my graph here:
>>
>> http://uverse.houkouonchi.jp/stats/netusage/1.1.1.3_2.html
>
> I am still not sure what you are doing, however, it should not
> surprise that gigabit ethernet is limited to one gigabit of traffic
> (1000 Mb/s) in either direction. Theoretically you should be able to
> get a gigabit of traffic in both directions at once, but this depends
> on the quality of your ethernet switch, ethernet adaptor card, device
> driver, and capabilities of where the data is read and written to.
>
> Bob

The problem is that just using rsync I am not getting gigabit. For me gigabit maxes out at around 930-940 megabits, but with rsync alone I was only getting around 720 megabits incoming. This is only when it is reading from the block device; when reading from memory (i.e. cat a few big files on the server first so they are cached) it gets ~935 megabits. The machine is easily able to sustain that read speed (and write speed), but the problem is getting it to actually do so. The only way I was able to get full gigabit (935 megabits) was using tar and mbuffer, because mbuffer acts as a read-ahead buffer.

Is there any way to turn up the prefetch? There really is no reason I should only be getting 720 megabits when copying files off with rsync (or NFS) as I am seeing.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
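For what it's worth, file-level prefetch in builds of that era was controlled by a few tunables in the zfs kernel module. A hedged sketch of /etc/system settings (tunable names and sane values vary between builds, so check the source for your release before applying any of these):

* make sure file-level prefetch is not disabled
set zfs:zfs_prefetch_disable = 0
* allow more concurrent prefetch streams and deeper read-ahead per stream
set zfs:zfetch_max_streams = 16
set zfs:zfetch_block_cap = 512

A reboot is required for /etc/system changes to take effect.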
Re: [zfs-discuss] Small stalls slowing down rsync from holding network saturation every 5 seconds
On Fri, 4 Jun 2010, Sandon Van Ness wrote: Interesting enough when I went to copy the data back I got even worse download speeds than I did write speeds! It looks like i need some sort of read-ahead as unlike the writes it doesn't appear to be CPU bound as using mbuffer/tar gives me full gigabit speeds. You can see in my graph here: http://uverse.houkouonchi.jp/stats/netusage/1.1.1.3_2.html I am still not sure what you are doing, however, it should not surprise that gigabit ethernet is limited to one gigabit of traffic (1000 Mb/s) in either direction. Theoretically you should be able to get a gigabit of traffic in both directions at once, but this depends on the quality of your ethernet switch, ethernet adaptor card, device driver, and capabilities of where the data is read and written to. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ssd pool + ssd cache ?
On Fri, Jun 4, 2010 at 2:59 PM, David Magda wrote: > Are you referring to a read cache or a write cache? A cache vdev is a L2ARC, used for reads. A log vdev is a slog/zil, used for writes. Oh, how we overload our terms. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
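To make the distinction concrete, a minimal sketch of the two commands (pool and device names are made up):

# add an SSD as a read cache device (L2ARC)
zpool add tank cache c9t0d0
# add an SSD as a separate intent log (slog) for synchronous writes
zpool add tank log c9t1d0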
Re: [zfs-discuss] ssd pool + ssd cache ?
On Fri, Jun 4, 2010 at 11:28 AM, zfsnoob4 wrote:
> Does anyone know if opensolaris supports Trim?

It does not. However, it doesn't really matter for a cache device. The cache device is written to rather slowly, and only needs to have low-latency access on reads. Most current-gen SSDs such as the Intel X25-M, Indilinx Barefoot, etc. also support garbage collection, which reduces the need for TRIM.

It's important that you align blocks on a 4k or 8k boundary though. (OCZ recommends 8k for the Vertex drives.) I think that most current drives have between a 128k and 512k erase block size, which is another alignment point you can use.

-B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
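As a rough sketch of the alignment arithmetic above (assuming 512-byte sectors):

# 8 KiB   / 512 B =  16 sectors -> start sector must be a multiple of 16
# 4 KiB   / 512 B =   8 sectors -> covered automatically by 8 KiB alignment
# 128 KiB / 512 B = 256 sectors -> erase-block alignment needs a multiple of 256

A slice that starts at sector 256, for example, satisfies all three.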
Re: [zfs-discuss] Migrating to ZFS
Frank,

The format utility is not technically correct because it refers to slices as partitions. Check the output below. We might describe that the "partition" menu is used to partition the disk into slices, but all of format refers to partitions, not slices.

I agree with Brandon's explanation, but no amount of explanation resolves the confusion for those unfamiliar with how we use the same term to describe different disk components.

Cindy

format> p

PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        expand - expand label to use whole disk
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit

partition> p
Current partition table (original):
Total disk sectors available: 286722878 + 16384 (reserved sectors)

Part      Tag    Flag     First Sector        Size        Last Sector
  0        usr    wm               256    136.72GB          286722911
  1 unassigned    wm                 0           0                  0
  2 unassigned    wm                 0           0                  0
  3 unassigned    wm                 0           0                  0
  4 unassigned    wm                 0           0                  0
  5 unassigned    wm                 0           0                  0
  6 unassigned    wm                 0           0                  0
  8   reserved    wm         286722912      8.00MB          286739295

partition>

On 06/04/10 15:43, Frank Cusack wrote:
On 6/4/10 11:46 AM -0700 Brandon High wrote:
Be aware that Solaris on x86 has two types of partitions. There are fdisk partitions (c0t0d0p1, etc) which is what gparted, windows and other tools will see. There are also Solaris partitions or slices (c0t0d0s0). You can create or edit these with the 'format' command in Solaris. These are created in an fdisk partition that is the SOLARIS2 type. So yeah, it's a partition table inside a partition table.

That's not correct, at least not technically. Solaris *slices* within the Solaris fdisk partition, are not also known as partitions. They are simply known as slices. By calling them "Solaris partitions or slices" you are just adding confusion.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] slog / log recovery is here!
Victor Latushkin wrote: On Jun 4, 2010, at 5:01 PM, Sigbjørn Lie wrote: R. Eulenberg wrote: Sorry for reviving this old thread. I even have this problem on my (productive) backup server. I lost my system-hdd and my separate ZIL-device while the system crashs and now I'm in trouble. The old system was running under the least version of osol/dev (snv_134) with zfs v22. After the server crashs I was very optimistic of solving the problems the same day. It's a long time ago. I was setting up a new systen (osol 2009.06 and updating to the lastest version of osol/dev - snv_134 - with deduplication) and then I tried to import my backup zpool, but it does not work. # zpool import pool: tank1 id: 5048704328421749681 state: UNAVAIL status: The pool was last accessed by another system. action: The pool cannot be imported due to damaged devices or data. see: http://www.sun.com/msg/ZFS-8000-EY config: tank1UNAVAIL missing device raidz2-0 ONLINE c7t5d0 ONLINE c7t0d0 ONLINE c7t6d0 ONLINE c7t3d0 ONLINE c7t1d0 ONLINE c7t4d0 ONLINE c7t2d0 ONLINE # zpool import -f tank1 cannot import 'tank1': one or more devices is currently unavailable Destroy and re-create the pool from a backup source Any other option (-F, -X, -V, -D) and any combination of them doesn't helps too. I can not add / attach / detach / remove a vdev and the ZIL-device either, because the system tells me: there is no zpool 'tank1'. In the last ten days I read a lot of threads, guides to solve problems and best practice documentations with ZFS and so on, but I do not found a solution for my problem. I created a fake-zpool with separate ZIL-device to combine the new ZIL-file with my old zpool for importing them, but it doesn't work in course of the different GUID and checksum (the name I was modifiing by an binary editor). The output of: e...@opensolaris:~# zdb -e tank1 Configuration for import: vdev_children: 2 version: 22 pool_guid: 5048704328421749681 name: 'tank1' state: 0 hostid: 946038 hostname: 'opensolaris' vdev_tree: type: 'root' id: 0 guid: 5048704328421749681 children[0]: type: 'raidz' id: 0 guid: 16723866123388081610 nparity: 2 metaslab_array: 23 metaslab_shift: 30 ashift: 9 asize: 7001340903424 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 6858138566678362598 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a' whole_disk: 1 DTL: 4345 create_txg: 4 path: '/dev/dsk/c7t5d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a' children[1]: type: 'disk' id: 1 guid: 16136237447458434520 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a' whole_disk: 1 DTL: 4344 create_txg: 4 path: '/dev/dsk/c7t0d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a' children[2]: type: 'disk' id: 2 guid: 10876853602231471126 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a' whole_disk: 1 DTL: 4343 create_txg: 4 path: '/dev/dsk/c7t6d0s0' devid: 'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a' children[3]: type: 'disk' id: 3 guid: 2384677379114262201 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a' whole_disk: 1 DTL: 4342 create_txg: 4 path: '/dev/dsk/c7t3d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a' children[4]: type: 'disk' id: 4 guid: 15143849195434333247 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a' whole_disk: 1 DTL: 4341 create_txg: 4 path: '/dev/dsk/c7t1d0s0' devid: 'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a' children[5]: type: 'disk' id: 5 guid: 11627603446133164653 phys_path:
Re: [zfs-discuss] ssd pool + ssd cache ?
On Jun 4, 2010, at 14:28, zfsnoob4 wrote: Does anyone know if opensolaris supports Trim? Not at this time. Are you referring to a read cache or a write cache? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Migrating to ZFS
On 6/4/10 11:46 AM -0700 Brandon High wrote: Be aware that Solaris on x86 has two types of partitions. There are fdisk partitions (c0t0d0p1, etc) which is what gparted, windows and other tools will see. There are also Solaris partitions or slices (c0t0d0s0). You can create or edit these with the 'format' command in Solaris. These are created in an fdisk partition that is the SOLARIS2 type. So yeah, it's a partition table inside a partition table. That's not correct, at least not technically. Solaris *slices* within the Solaris fdisk partition, are not also known as partitions. They are simply known as slices. By calling them "Solaris partitions or slices" you are just adding confusion. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Deduplication and ISO files
On 05.06.10 00:10, Ray Van Dolson wrote:
> On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:
>> On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson wrote:
>>> Makes sense. So, as someone else suggested, decreasing my block size
>>> may improve the deduplication ratio.
>>
>> It might. It might make your performance tank, too.
>>
>> Decreasing the block size increases the size of the dedup table (DDT). Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT gets too large to fit in memory, it will have to be read from disk, which will destroy any sort of write performance (although an L2ARC on SSD can help).
>>
>> If you move to 64k blocks, you'll double the DDT size and may not actually increase your ratio. Moving to 8k blocks will increase your DDT by a factor of 16, and still may not help.
>>
>> Changing the recordsize will not affect files that are already in the dataset. You'll have to recopy them to re-write with the smaller block size.
>>
>> -B
>
> Gotcha. Just trying to make sure I understand how all this works, and whether I _would_ in fact see an improvement in dedup ratio by tweaking the recordsize with our data set.

You can use zdb -S to assess how effective deduplication would be without actually turning it on for your pool.

regards
victor

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
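For reference, a minimal sketch of the two commands under discussion (pool and dataset names are made up):

# print simulated dedup statistics, including an estimated dedup ratio, without enabling dedup
zdb -S tank

# use a smaller block size for newly written files; existing files must be recopied
zfs set recordsize=64K tank/isos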
Re: [zfs-discuss] ZFS recovery tools
On Jun 4, 2010, at 10:18 PM, Miles Nordin wrote:
>> "sl" == Sigbjørn Lie writes:
>
> sl> Excellent! I wish I would have known about these features when
> sl> I was attempting to recover my pool using 2009.06/snv111.
>
> the OP tried the -F feature. It doesn't work after you've lost zpool.cache:

Starting from build 128, -F is a documented option for 'zpool import' and 'zpool clear', and it has nothing to do with zpool.cache. The old -F has been renamed to -V.

In some cases it may be possible to extract configuration details from the in-pool copy of the configuration by running zdb -eC.

regards
victor

> op> I was setting up a new systen (osol 2009.06 and updating to
> op> the lastest version of osol/dev - snv_134 - with
> op> deduplication) and then I tried to import my backup zpool, but
> op> it does not work.
>
> op> # zpool import -f tank1
> op> cannot import 'tank1': one or more devices is currently unavailable
> op> Destroy and re-create the pool from a backup source
>
> op> Any other option (-F, -X, -V, -D) and any combination of them
> op> doesn't helps too.
>
> I have been in here repeatedly warning about this incompleteness of
> the feature while fanbois keep saying ``we have slog recovery so don't
> worry.''
>
> R., please let us know if the 'zdb -e -bcsvL ' incantation
> Sigbjorn suggested ends up working for you or not.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] slog / log recovery is here!
On Jun 4, 2010, at 5:01 PM, Sigbjørn Lie wrote: > > R. Eulenberg wrote: >> Sorry for reviving this old thread. >> >> I even have this problem on my (productive) backup server. I lost my >> system-hdd and my separate ZIL-device while the system crashs and now I'm in >> trouble. The old system was running under the least version of osol/dev >> (snv_134) with zfs v22. After the server crashs I was very optimistic of >> solving the problems the same day. It's a long time ago. >> I was setting up a new systen (osol 2009.06 and updating to the lastest >> version of osol/dev - snv_134 - with deduplication) and then I tried to >> import my backup zpool, but it does not work. >> >> # zpool import >> pool: tank1 >>id: 5048704328421749681 >> state: UNAVAIL >> status: The pool was last accessed by another system. >> action: The pool cannot be imported due to damaged devices or data. >> see: http://www.sun.com/msg/ZFS-8000-EY >> config: >> >>tank1UNAVAIL missing device >> raidz2-0 ONLINE >>c7t5d0 ONLINE >>c7t0d0 ONLINE >>c7t6d0 ONLINE >>c7t3d0 ONLINE >>c7t1d0 ONLINE >>c7t4d0 ONLINE >>c7t2d0 ONLINE >> >> # zpool import -f tank1 >> cannot import 'tank1': one or more devices is currently unavailable >>Destroy and re-create the pool from >>a backup source >> >> Any other option (-F, -X, -V, -D) and any combination of them doesn't helps >> too. >> I can not add / attach / detach / remove a vdev and the ZIL-device either, >> because the system tells me: there is no zpool 'tank1'. >> In the last ten days I read a lot of threads, guides to solve problems and >> best practice documentations with ZFS and so on, but I do not found a >> solution for my problem. I created a fake-zpool with separate ZIL-device to >> combine the new ZIL-file with my old zpool for importing them, but it >> doesn't work in course of the different GUID and checksum (the name I was >> modifiing by an binary editor). 
>> The output of: >> e...@opensolaris:~# zdb -e tank1 >> >> Configuration for import: >>vdev_children: 2 >>version: 22 >>pool_guid: 5048704328421749681 >>name: 'tank1' >>state: 0 >>hostid: 946038 >>hostname: 'opensolaris' >>vdev_tree: >>type: 'root' >>id: 0 >>guid: 5048704328421749681 >>children[0]: >>type: 'raidz' >>id: 0 >>guid: 16723866123388081610 >>nparity: 2 >>metaslab_array: 23 >>metaslab_shift: 30 >>ashift: 9 >>asize: 7001340903424 >>is_log: 0 >>create_txg: 4 >>children[0]: >>type: 'disk' >>id: 0 >>guid: 6858138566678362598 >>phys_path: >> '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a' >>whole_disk: 1 >>DTL: 4345 >>create_txg: 4 >>path: '/dev/dsk/c7t5d0s0' >>devid: >> 'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a' >>children[1]: >>type: 'disk' >>id: 1 >>guid: 16136237447458434520 >>phys_path: >> '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a' >>whole_disk: 1 >>DTL: 4344 >>create_txg: 4 >>path: '/dev/dsk/c7t0d0s0' >>devid: >> 'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a' >>children[2]: >>type: 'disk' >>id: 2 >>guid: 10876853602231471126 >>phys_path: >> '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a' >>whole_disk: 1 >>DTL: 4343 >>create_txg: 4 >>path: '/dev/dsk/c7t6d0s0' >>devid: >> 'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a' >>children[3]: >>type: 'disk' >>id: 3 >>guid: 2384677379114262201 >>phys_path: >> '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a' >>whole_disk: 1 >>DTL: 4342 >>create_txg: 4 >>path: '/dev/dsk/c7t3d0s0' >>devid: >> 'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a' >>children[4]: >>type: 'disk' >>id: 4 >>guid: 15143849195434333247 >>phys_path: >> '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a' >>whole_disk: 1 >>
Re: [zfs-discuss] Deduplication and ISO files
On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote: > On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson wrote: > > Makes sense. So, as someone else suggested, decreasing my block size > > may improve the deduplication ratio. > > It might. It might make your performance tank, too. > > Decreasing the block size increases the size of the dedup table (DDT). > Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT > gets too large to fit in memory, it will have to be read from disk, > which will destroy any sort of write performance (although a L2ARC on > SSD can help) > > If you move to 64k blocks, you'll double the DDT size and may not > actually increase your ratio. Moving to 8k blocks will increase your > DDT by a factor of 16, and still may not help. > > Changing the recordsize will not affect files that are already in the > dataset. You'll have to recopy them to re-write with the smaller block > size. > > -B Gotcha. Just trying to make sure I understand how all this works, and if I _would_ in fact see an improvement in dedupe-ratio by tweaking the recordsize with our data-set. Once we know that we can decide if it's worth the extra costs in RAM/L2ARC. Thanks all. Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Deduplication and ISO files
On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson wrote: > Makes sense. So, as someone else suggested, decreasing my block size > may improve the deduplication ratio. It might. It might make your performance tank, too. Decreasing the block size increases the size of the dedup table (DDT). Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT gets too large to fit in memory, it will have to be read from disk, which will destroy any sort of write performance (although a L2ARC on SSD can help) If you move to 64k blocks, you'll double the DDT size and may not actually increase your ratio. Moving to 8k blocks will increase your DDT by a factor of 16, and still may not help. Changing the recordsize will not affect files that are already in the dataset. You'll have to recopy them to re-write with the smaller block size. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Deduplication and ISO files
On Fri, Jun 04, 2010 at 12:37:01PM -0700, Ray Van Dolson wrote: > On Fri, Jun 04, 2010 at 11:16:40AM -0700, Brandon High wrote: > > On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson wrote: > > > The ISO's I'm testing with are the 32-bit and 64-bit versions of the > > > RHEL5 DVD ISO's. While both have their differences, they do contain a > > > lot of similar data as well. > > > > Similar != identical. > > > > Dedup works on blocks in zfs, so unless the iso files have identical > > data aligned at 128k boundaries you won't see any savings. > > > > > If I explode both ISO files and copy them to my ZFS filesystem I see > > > about a 1.24x dedup ratio. > > > > Each file starts a new block, so the identical files can be deduped. > > > > -B > > Makes sense. So, as someone else suggested, decreasing my block size > may improve the deduplication ratio. > > recordsize I presume is the value to tweak? Yes, but I'd not expect that much commonality between 32-bit and 64-bit Linux ISOs... Do the same check again with the ISOs "exploded", as you say. Nico -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Deduplication and ISO files
> Makes sense. So, as someone else suggested, decreasing my block size
> may improve the deduplication ratio.
>
> recordsize I presume is the value to tweak?

It is, but keep in mind that ZFS will need about 150 bytes for each block. 1TB with 128k blocks will need about 1GB of memory for the index to stay in RAM; with 64k blocks, double that, et cetera. An L2ARC will help a lot if memory is low.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all educators to avoid excessive use of idioms of foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
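As a rough worked example of the sizing above (per-entry estimates in this thread range from about 150 to 270 bytes, so treat these as order-of-magnitude figures):

1 TiB / 128 KiB per block = 8,388,608 blocks
8,388,608 blocks x 150 bytes ~ 1.2 GiB of dedup table
8,388,608 blocks x 270 bytes ~ 2.1 GiB of dedup table

Halving the recordsize to 64k doubles the block count and therefore roughly doubles the table; 8k blocks multiply it by 16.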
Re: [zfs-discuss] Deduplication and ISO files
On Fri, Jun 04, 2010 at 11:16:40AM -0700, Brandon High wrote: > On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson wrote: > > The ISO's I'm testing with are the 32-bit and 64-bit versions of the > > RHEL5 DVD ISO's. While both have their differences, they do contain a > > lot of similar data as well. > > Similar != identical. > > Dedup works on blocks in zfs, so unless the iso files have identical > data aligned at 128k boundaries you won't see any savings. > > > If I explode both ISO files and copy them to my ZFS filesystem I see > > about a 1.24x dedup ratio. > > Each file starts a new block, so the identical files can be deduped. > > -B Makes sense. So, as someone else suggested, decreasing my block size may improve the deduplication ratio. recordsize I presume is the value to tweak? Thanks, Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Small stalls slowing down rsync from holding network saturation every 5 seconds
On 06/01/2010 07:57 AM, Bob Friesenhahn wrote:
> On Mon, 31 May 2010, Sandon Van Ness wrote:
>> With sequential writes I don't see how parity writing would be any
>> different from when I just created a 20 disk zpool which is doing the
>> same writes every 5 seconds, but the only difference is it isn't maxing
>> out CPU usage when doing the writes and I don't see the transfer
>> stall during the writes like I did on raidz2.
>
> I am not understanding the above paragraph, but hopefully you agree
> that raidz2 issues many more writes (based on vdev stripe width) to
> the underlying disks than a simple non-redundant load-shared pool
> does. Depending on your system, this might not be an issue, but it is
> possible that there is an I/O threshold beyond which something
> (probably hardware) causes a performance issue.
>
> Bob

Interestingly enough, when I went to copy the data back I got even worse download speeds than I did write speeds! It looks like I need some sort of read-ahead, as unlike the writes it doesn't appear to be CPU bound: using mbuffer/tar gives me full gigabit speeds. You can see it in my graph here:

http://uverse.houkouonchi.jp/stats/netusage/1.1.1.3_2.html

The weekly graph is when I was sending to the ZFS server, and the daily graph shows the data coming back. I stopped the copy and shut down the computer for a while (the low-speed flat line), then started it up again, this time using mbuffer, and speeds are great. I don't see why I am having trouble getting full speed on reads unless ZFS needs to read ahead more than it does.

I decided to go ahead and use tar + mbuffer for the first pass and then run rsync afterwards for the final sync, just to make sure nothing was missed.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
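For reference, a hedged sketch of the kind of tar + mbuffer pipeline being described (host name, port, buffer size and paths are made up; check your mbuffer and tar builds for the exact option names):

# receiving side: listen on a TCP port, buffer 1 GB in RAM, unpack as data arrives
mbuffer -I 9090 -m 1G | gtar -xf - -C /tank/restore

# sending side: stream the tree through a 1 GB buffer and push it over the network
gtar -cf - /tank/data | mbuffer -m 1G -O receiver:9090

The large in-memory buffer is what hides the stalls: the disks can keep reading ahead while the network drains the buffer, and vice versa.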
Re: [zfs-discuss] zfs list sizes - newbie question
On Fri, Jun 4, 2010 at 11:41 AM, Andres Noriega wrote: > I understand now. So each vol's available space is reporting it's > reservation and whatever is still available in the pool. > > I appreciate the explanation. Thank you! > > If you want the available space to be a hard limit, have a look at the quota property. The reservation tells the pool to reserve that amount of space for the dataset, meaning that space is no longer available to anything else in the pool. The quota tells the pool the max amount of storage the dataset can use, and is reflected in the "space available" output of various tools (like zfs list, df, etc). -- Freddie Cash fjwc...@gmail.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
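A minimal sketch of the difference (dataset names are made up):

# guarantee 1T to the dataset; that space is taken away from the rest of the pool
zfs set reservation=1T vtl_pool/lun00

# cap the dataset at 1T; this is the hard limit reflected by 'zfs list' and df
zfs set quota=1T vtl_pool/lun00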
Re: [zfs-discuss] Migrating to ZFS
On Fri, Jun 4, 2010 at 12:59 AM, zfsnoob4 wrote: > This is what I'm thinking: > 1) Use Gparted to resize the windows partition and therefore create a 50GB > raw partition. > 2) Use the opensolaris installer to format the raw partition into a Solaris > FS. > 3) Install opensolaris 2009.06, the setup should automatically configure the > dual boot with windows and opensolaris. > > Does that make sense? That will work fine. Be aware that Solaris on x86 has two types of partitions. There are fdisk partitions (c0t0d0p1, etc) which is what gparted, windows and other tools will see. There are also Solaris partitions or slices (c0t0d0s0). You can create or edit these with the 'format' command in Solaris. These are created in an fdisk partition that is the SOLARIS2 type. So yeah, it's a partition table inside a partition table. The caiman installer will allow you to create and install into fdisk partitions. It creates a Solaris slice that uses the entire fdisk partition. If you want to change the size or layout of the slices, you can't do it at install time. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs list sizes - newbie question
I understand now. So each vol's available space is reporting its reservation and whatever is still available in the pool.

I appreciate the explanation. Thank you!

> On Thu, Jun 3, 2010 at 1:06 PM, Andres Noriega wrote:
> > Hi everyone, I have a question about the zfs list output. I created a large zpool and then carved out 1TB volumes (zfs create -V 1T vtl_pool/lun##). Looking at the zfs list output, I'm a little thrown off by the AVAIL amount. Can anyone clarify for me why it is saying 2T?
>
> You have a 16T zpool, and have created 15x 1T zvols, leaving 1T free.
> Each zvol is mostly unused, so it has 1T available in its refreservation, and an additional 1T available from the zpool.
>
> The zvols won't actually hold 2T, because they were created with 1T of space. The space beyond 1T can be used for snapshots though.
>
> -B
> --
> Brandon High : bh...@freaks.com

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS Usage on drives
On Fri, Jun 4, 2010 at 6:36 AM, Andreas Iannou wrote:
> I'm wondering if we can see the amount of usage for a drive in ZFS raidz
> mirror. I'm in the process of replacing some drives but I want to replace

By definition, a mirror has a copy of all the data on each drive. A raidz vdev is auto-balancing, and effort is made to spread data across as many devices as possible. Unless access patterns are weird, each drive should hold the same amount of data within a reasonable margin of error.

-B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ssd pool + ssd cache ?
I'm also considering adding a cheap SSD as a cache drive. The only problem is that SSDs lose performance over time, because when something is deleted it is not actually erased; the next time something is written to the same blocks, the drive must first erase them and then write. To fix this, SSDs support a command called TRIM which cleans up the blocks after something is deleted.

Does anyone know if opensolaris supports Trim?

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Depth of Scrub
On Fri, Jun 4, 2010 at 1:29 AM, sensille wrote: > But what I'm really targeting with my question: How much coverage can be > reached with a find | xargs wc in contrast to scrub? It misses the snapshots, > but anything beyond that? Your script will also update the atime on every file, which may not be the desired effect. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
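If the goal is only to read-verify files without side effects, one hedged workaround is to disable atime updates on the dataset before running the sweep (dataset and path names are made up):

zfs set atime=off tank/data
find /tank/data -type f -print | xargs wc -c > /dev/null

A scrub is still the more thorough check, since it also covers snapshots, parity and redundant metadata copies.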
Re: [zfs-discuss] ZFS recovery tools
> "sl" == Sigbjørn Lie writes:

sl> Excellent! I wish I would have known about these features when
sl> I was attempting to recover my pool using 2009.06/snv111.

the OP tried the -F feature. It doesn't work after you've lost zpool.cache:

op> I was setting up a new systen (osol 2009.06 and updating to
op> the lastest version of osol/dev - snv_134 - with
op> deduplication) and then I tried to import my backup zpool, but
op> it does not work.

op> # zpool import -f tank1
op> cannot import 'tank1': one or more devices is currently unavailable
op> Destroy and re-create the pool from a backup source

op> Any other option (-F, -X, -V, -D) and any combination of them
op> doesn't helps too.

I have been in here repeatedly warning about this incompleteness of the feature while fanbois keep saying ``we have slog recovery so don't worry.''

R., please let us know if the 'zdb -e -bcsvL ' incantation Sigbjorn suggested ends up working for you or not.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Deduplication and ISO files
On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson wrote: > The ISO's I'm testing with are the 32-bit and 64-bit versions of the > RHEL5 DVD ISO's. While both have their differences, they do contain a > lot of similar data as well. Similar != identical. Dedup works on blocks in zfs, so unless the iso files have identical data aligned at 128k boundaries you won't see any savings. > If I explode both ISO files and copy them to my ZFS filesystem I see > about a 1.24x dedup ratio. Each file starts a new block, so the identical files can be deduped. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs list sizes - newbie question
On Thu, Jun 3, 2010 at 1:06 PM, Andres Noriega wrote: > Hi everyone, I have a question about the zfs list output. I created a large > zpool and then carved out 1TB volumes (zfs create -V 1T vtl_pool/lun##). > Looking at the zfs list output, I'm a little thrown off by the AVAIL amount. > Can anyone clarify for me why it is saying 2T? You have a 16T zpool, and have created 15x 1T zvols, leaving 1T free. Each zvol is mostly unused, so it has 1T available in its refreservation, and an additional 1T available from the zpool. The zvols won't actually hold 2T, because they were created with 1T of space. The space beyond 1T can be used for snapshots though. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
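For reference, a small sketch of how one of these zvols is created and how its space accounting can be inspected (names are made up):

zfs create -V 1T vtl_pool/lun15
zfs get volsize,refreservation,usedbyrefreservation vtl_pool/lun15

The refreservation is what guarantees the zvol its full 1T before anything is written to it.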
Re: [zfs-discuss] zfs list sizes - newbie question
Thanks... here's the requested output:

NAME            AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
vtl_pool        1020G  15.0T         0   46.3K              0      15.0T
vtl_pool/lun00  1.99T     1T         0   6.05G          1018G          0
vtl_pool/lun01  1.99T     1T         0   4.46G          1020G          0
vtl_pool/lun02  1.99T     1T         0   4.44G          1020G          0
vtl_pool/lun03  1.99T     1T         0   4.49G          1020G          0
vtl_pool/lun04  2.00T     1T         0    869M          1023G          0
vtl_pool/lun05  2.00T     1T         0    725M          1023G          0
vtl_pool/lun06  2.00T     1T         0    722M          1023G          0
vtl_pool/lun07  2.00T     1T         0    700M          1023G          0
vtl_pool/lun08  2.00T     1T         0    534M          1023G          0
vtl_pool/lun09  2.00T     1T         0    518M          1023G          0
vtl_pool/lun10  2.00T     1T         0    309M          1024G          0
vtl_pool/lun11  2.00T     1T         0   4.84M          1024G          0
vtl_pool/lun12  2.00T     1T         0   4.84M          1024G          0
vtl_pool/lun13  2.00T     1T         0   4.84M          1024G          0
vtl_pool/lun14  2.00T     1T         0   4.84M          1024G          0

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Deduplication and ISO files
I'm running zpool version 23 (via ZFS fuse on Linux) and have a zpool with deduplication turned on. I am testing how well deduplication will work for the storage of many, similar ISO files and so far am seeing unexpected results (or perhaps my expectations are wrong). The ISO's I'm testing with are the 32-bit and 64-bit versions of the RHEL5 DVD ISO's. While both have their differences, they do contain a lot of similar data as well. If I explode both ISO files and copy them to my ZFS filesystem I see about a 1.24x dedup ratio. However, if I have only the ISO files on the ZFS filesystem, the ratio is 1.00x -- no savings at all. Does this make sense? I'm going to experiment with other combinations of ISO files as well... Thanks, Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] nfs share of nested zfs directories?
Well, yes I understand I need to research the issue of running the idmapd service, but I also need to figure out how to use nfsv4 and automount. - Cassandra (609) 243-2413 Unix Administrator "From a little spark may burst a mighty flame." -Dante Alighieri On Fri, Jun 4, 2010 at 10:00 AM, Pasi Kärkkäinen wrote: > On Fri, Jun 04, 2010 at 08:43:32AM -0400, Cassandra Pugh wrote: > >Thank you, when I manually mount using the "mount -t nfs4" option, I > am > >able to see the entire tree, however, the permissions are set as > >nfsnobody. > >"Warning: rpc.idmapd appears not to be running. > > All uids will be mapped to the nobody uid." > > > > Did you actually read the error message? :) > Finding a solution shouldn't be too difficult after that.. > > -- Pasi > > >- > >Cassandra > >(609) 243-2413 > >Unix Administrator > > > >"From a little spark may burst a mighty flame." > >-Dante Alighieri > > > >On Thu, Jun 3, 2010 at 4:33 PM, Brandon High <[1]bh...@freaks.com> > wrote: > > > > On Thu, Jun 3, 2010 at 12:50 PM, Cassandra Pugh <[2]cp...@pppl.gov> > > wrote: > > > The special case here is that I am trying to traverse NESTED zfs > > systems, > > > for the purpose of having compressed and uncompressed directories. > > > > Make sure to use "mount -t nfs4" on your linux client. The standard > > "nfs" type only supports nfs v2/v3. > > > > -B > > -- > > Brandon High : [3]bh...@freaks.com > > > > References > > > >Visible links > >1. mailto:bh...@freaks.com > >2. mailto:cp...@pppl.gov > >3. mailto:bh...@freaks.com > > > ___ > > zfs-discuss mailing list > > zfs-discuss@opensolaris.org > > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [zones-discuss] ZFS ARC cache issue
On Fri, 2010-06-04 at 16:03 +0100, Robert Milkowski wrote: > On 04/06/2010 15:46, James Carlson wrote: > > Petr Benes wrote: > > > >> add to /etc/system something like (value depends on your needs) > >> > >> * limit greedy ZFS to 4 GiB > >> set zfs:zfs_arc_max = 4294967296 > >> > >> And yes, this has nothing to do with zones :-). > >> > > That leaves unanswered the underlying question: why do you need to do > > this at all? Isn't the ZFS ARC supposed to release memory when the > > system is under pressure? Is that mechanism not working well in some > > cases ... ? > > > > > > My understanding is that if kmem gets heavily fragmaneted ZFS won't be > able to give back much memory. > The slab allocator and virtual memory are designed to prevent memory fragmentation. That said, it is possible that certain devices which need physically contiguous memory may be affected by physical address fragmentation. I'm not sure exactly what kind of fragmentation you're talking about here though... - Garrett ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [zones-discuss] ZFS ARC cache issue
On 04/06/2010 15:46, James Carlson wrote: Petr Benes wrote: add to /etc/system something like (value depends on your needs) * limit greedy ZFS to 4 GiB set zfs:zfs_arc_max = 4294967296 And yes, this has nothing to do with zones :-). That leaves unanswered the underlying question: why do you need to do this at all? Isn't the ZFS ARC supposed to release memory when the system is under pressure? Is that mechanism not working well in some cases ... ? My understanding is that if kmem gets heavily fragmaneted ZFS won't be able to give back much memory. -- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
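For anyone who wants to watch this on a live system, two common ways of inspecting the ARC (field names and output format vary by build):

# kernel debugger summary of current and target ARC sizes
echo ::arc | mdb -k

# raw kstat counters, including size, c (target) and c_max
kstat -m zfs -n arcstats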
Re: [zfs-discuss] Depth of Scrub
On Fri, June 4, 2010 03:29, sensille wrote: > Hi, > > I have a small question about the depth of scrub in a raidz/2/3 > configuration. > I'm quite sure scrub does not check spares or unused areas of the disks > (it > could check if the disks detects any errors there). > But what about the parity? Obviously it has to be checked, but I can't > find > any indications for it in the literature. The man page only states that > the > data is being checksummed and only if that fails the redundancy is being > used. > Please tell me I'm wrong ;) I believe you're wrong. Scrub checks all the blocks used by ZFS, regardless of what's in them. (It doesn't check free blocks.) > But what I'm really targeting with my question: How much coverage can be > reached with a find | xargs wc in contrast to scrub? It misses the > snapshots, but anything beyond that? Your find script misses the redundant data; scrub checks it all. It may well miss some of the metadata as well, and probably misses the redundant copies of metadata. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS Usage on drives
On Fri, Jun 4, 2010 at 6:36 AM, Andreas Iannou < andreas_wants_the_w...@hotmail.com> wrote: > Hello again, > > I'm wondering if we can see the amount of usage for a drive in ZFS raidz > mirror. I'm in the process of replacing some drives but I want to replace > the less used drives first (maybe only 40-50% utilisation). Is there such a > thing? I saw somewhere that a guy had 3 drives in a raidz, one drive only > had to be resilvered 612Gb to replace. > > I'm hoping as theres quite a bit of free space that some drives only occupy > a little and therefore only resilver 200-300Gb of data. > When in doubt, read the man page. :) zpool iostat -v -- Freddie Cash fjwc...@gmail.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] nfs share of nested zfs directories?
On Fri, Jun 04, 2010 at 08:43:32AM -0400, Cassandra Pugh wrote: >Thank you, when I manually mount using the "mount -t nfs4" option, I am >able to see the entire tree, however, the permissions are set as >nfsnobody. >"Warning: rpc.idmapd appears not to be running. > All uids will be mapped to the nobody uid." > Did you actually read the error message? :) Finding a solution shouldn't be too difficult after that.. -- Pasi >- >Cassandra >(609) 243-2413 >Unix Administrator > >"From a little spark may burst a mighty flame." >-Dante Alighieri > >On Thu, Jun 3, 2010 at 4:33 PM, Brandon High <[1]bh...@freaks.com> wrote: > > On Thu, Jun 3, 2010 at 12:50 PM, Cassandra Pugh <[2]cp...@pppl.gov> > wrote: > > The special case here is that I am trying to traverse NESTED zfs > systems, > > for the purpose of having compressed and uncompressed directories. > > Make sure to use "mount -t nfs4" on your linux client. The standard > "nfs" type only supports nfs v2/v3. > > -B > -- > Brandon High : [3]bh...@freaks.com > > References > >Visible links >1. mailto:bh...@freaks.com >2. mailto:cp...@pppl.gov >3. mailto:bh...@freaks.com > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Depth of Scrub
> I have a small question about the depth of scrub in a
> raidz/2/3 configuration.
> I'm quite sure scrub does not check spares or unused
> areas of the disks (it
> could check if the disks detects any errors there).
> But what about the parity?

From some informal performance testing of RAIDZ2/3 arrays, I am confident that scrub reads the parity blocks and normal reads do not. You can see this for yourself with "iostat -x" or "zpool iostat -v".

Start monitoring and watch read I/O. You will regularly see that a RAIDZ3 array reads from all but three drives, which I presume is the unread parity. Do the same monitoring while a scrub is underway and you will see all drives being read from equally. My experience suggests something similar is taking place with mirrors.

If you think about it, having a scrub check everything but the parity would be a rather pointless operation.

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
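For example, to watch per-device reads while a scrub runs (pool name is made up; 5-second interval):

zpool iostat -v tank 5
iostat -xn 5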
[zfs-discuss] ZFS Usage on drives
Hello again,

I'm wondering if we can see the amount of usage for a drive in a ZFS raidz or mirror. I'm in the process of replacing some drives, but I want to replace the less-used drives first (maybe only 40-50% utilisation). Is there such a thing? I saw somewhere that a guy had 3 drives in a raidz and one drive only had to resilver 612Gb when it was replaced.

I'm hoping, as there's quite a bit of free space, that some drives only hold a little data and would therefore only resilver 200-300Gb.

Thanks,
Andre

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Migrating to ZFS
On Fri, Jun 4, 2010 at 2:59 PM, zfsnoob4 wrote: > "It's not easy to make Solaris slices on the boot drive." > > As I am just realizing. The installer does not have any kind of partition > software. > > I have a linux boot disc and I am contemplating using gparted to resize the > win partition to create a raw 50GB empty partition. Can the installer format > a raw partition into a Solaris FS? If it can it will be easy (assuming it can > set up the dual boot properly). > > This is what I'm thinking: > 1) Use Gparted to resize the windows partition and therefore create a 50GB > raw partition. > 2) Use the opensolaris installer to format the raw partition into a Solaris > FS. > 3) Install opensolaris 2009.06, the setup should automatically configure the > dual boot with windows and opensolaris. > > Does that make sense? > that's exactly what I usually do -- O< ascii ribbon campaign - stop html mail - www.asciiribbon.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] slog / log recovery is here!
R. Eulenberg wrote: Sorry for reviving this old thread. I even have this problem on my (productive) backup server. I lost my system-hdd and my separate ZIL-device while the system crashs and now I'm in trouble. The old system was running under the least version of osol/dev (snv_134) with zfs v22. After the server crashs I was very optimistic of solving the problems the same day. It's a long time ago. I was setting up a new systen (osol 2009.06 and updating to the lastest version of osol/dev - snv_134 - with deduplication) and then I tried to import my backup zpool, but it does not work. # zpool import pool: tank1 id: 5048704328421749681 state: UNAVAIL status: The pool was last accessed by another system. action: The pool cannot be imported due to damaged devices or data. see: http://www.sun.com/msg/ZFS-8000-EY config: tank1UNAVAIL missing device raidz2-0 ONLINE c7t5d0 ONLINE c7t0d0 ONLINE c7t6d0 ONLINE c7t3d0 ONLINE c7t1d0 ONLINE c7t4d0 ONLINE c7t2d0 ONLINE # zpool import -f tank1 cannot import 'tank1': one or more devices is currently unavailable Destroy and re-create the pool from a backup source Any other option (-F, -X, -V, -D) and any combination of them doesn't helps too. I can not add / attach / detach / remove a vdev and the ZIL-device either, because the system tells me: there is no zpool 'tank1'. In the last ten days I read a lot of threads, guides to solve problems and best practice documentations with ZFS and so on, but I do not found a solution for my problem. I created a fake-zpool with separate ZIL-device to combine the new ZIL-file with my old zpool for importing them, but it doesn't work in course of the different GUID and checksum (the name I was modifiing by an binary editor). The output of: e...@opensolaris:~# zdb -e tank1 Configuration for import: vdev_children: 2 version: 22 pool_guid: 5048704328421749681 name: 'tank1' state: 0 hostid: 946038 hostname: 'opensolaris' vdev_tree: type: 'root' id: 0 guid: 5048704328421749681 children[0]: type: 'raidz' id: 0 guid: 16723866123388081610 nparity: 2 metaslab_array: 23 metaslab_shift: 30 ashift: 9 asize: 7001340903424 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 6858138566678362598 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a' whole_disk: 1 DTL: 4345 create_txg: 4 path: '/dev/dsk/c7t5d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a' children[1]: type: 'disk' id: 1 guid: 16136237447458434520 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a' whole_disk: 1 DTL: 4344 create_txg: 4 path: '/dev/dsk/c7t0d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a' children[2]: type: 'disk' id: 2 guid: 10876853602231471126 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a' whole_disk: 1 DTL: 4343 create_txg: 4 path: '/dev/dsk/c7t6d0s0' devid: 'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a' children[3]: type: 'disk' id: 3 guid: 2384677379114262201 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a' whole_disk: 1 DTL: 4342 create_txg: 4 path: '/dev/dsk/c7t3d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a' children[4]: type: 'disk' id: 4 guid: 15143849195434333247 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a' whole_disk: 1 DTL: 4341 create_txg: 4 path: '/dev/dsk/c7t1d0s0' devid: 'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a' children[5]: type: 'disk' id: 5 guid: 11627603446133164653 ph
Re: [zfs-discuss] ZFS recovery tools
David Magda wrote: On Wed, June 2, 2010 02:20, Sigbjorn Lie wrote: I have just recovered from a ZFS crash. During the antagonizing time this took, I was surprised to learn how undocumented the tools and options for ZFS recovery we're. I managed to recover thanks to some great forum posts from Victor Latushkin, however without his posts I would still be crying at night... For the archives, from a private exchange: Zdb(1M) is complicated and in-flux, so asking on zfs-discuss or calling Oracle isn't a very onerous request IMHO. As for recovery, see zpool(1M): zpool import [-o mntopts] [ -o property=value] ... [-d dir | -c cachefile] [-D] [-f] [-R root] [-F [-n]] pool | id [newpool] [...] -F Recovery mode for a non-importable pool. Attempt to return the pool to an importable state by discarding the last few transactions. Not all damaged pools can be recovered by using this option. If successful, the data from the discarded transactions is irretrievably lost. This option is ignored if the pool is importable or already imported. http://docs.sun.com/app/docs/doc/819-2240/zpool-1m This is available as of svn_128, and not in Solaris as of Update 8 (10/09): http://bugs.opensolaris.org/view_bug.do?bug_id=6667683 This was part of PSARC 2009/479: http://arc.opensolaris.org/caselog/PSARC/2009/479/ http://www.c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html http://sparcv9.blogspot.com/2009/09/zpool-recovery-support-psarc2009479.html Personally I'm waiting for Solaris 10u9 for a lot of these fixes and updates [...]. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Excellent! I wish I would have known about these features when I was attempting to recover my pool using 2009.06/snv111. I still believe there are some document updates to be done. While I was attempting to recover my pool and googling for information I found none of these documents. What I did find was a lot of forum posts about people that did not manage to make a recover, and assumed their data was lost. "ZFS Troubleshooting and Data Recovery" from the "Solaris ZFS Administration Guide" and the ZFS Troubleshooting Guide at SolarisInternals would greatly benefit from being updated with the information you provided. One of the reasons for this is that they appear at the top of googles rankings for "zfs recovery" as search topic. :) Thank you for the links. :) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Depth of Scrub
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of sensille > > I'm quite sure scrub does not check spares or unused areas of the disks > (it > could check if the disks detects any errors there). > But what about the parity? Obviously it has to be checked, but I can't > find > any indications for it in the literature. The man page only states that > the > data is being checksummed and only if that fails the redundancy is > being used. > Please tell me I'm wrong ;) If my understanding is correct, a scrub reads and checksums all the used blocks on all the primary storage devices. Meaning: The scrub is not checking log devices or spares, and I don't think it checks cache devices. And as you said, it's not reading empty space. The main reason to use scrub, as opposed to your find command (which has some serious shortcomings) or even a "zfs send > /dev/null" command (which has far fewer shortcomings) is: When you just tell the system to read data, you're only sure to read one half of redundant data. You might coincidentally just read the good side of the mirror, or whatever, and therefore fail to detect the corrupted data on the other side of the mirror. You've got to use the scrub. It is very wise to perform a scrub occasionally, because you can only correct errors as long as you still have redundancy. If a device fails, and degrades redundancy, and then some rarely used block is discovered to be corrupt during the resilver ... too bad for you. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
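For reference, the two approaches being compared, expressed as commands (pool and dataset names are made up):

# reads the data once, but may only touch one side of each mirror and skips parity
zfs snapshot tank/data@check
zfs send tank/data@check > /dev/null

# reads and checksums every allocated block, including parity and redundant metadata
zpool scrub tank
zpool status -v tank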
Re: [zfs-discuss] one more time: pool size changes
On Jun 3, 2010 7:35 PM, David Magda wrote:
> On Jun 3, 2010, at 13:36, Garrett D'Amore wrote:
>
>> Perhaps you have been unlucky. Certainly, there is a window with N+1 redundancy where a single failure leaves the system exposed in the face of a 2nd fault. This is a statistics game...
>
> It doesn't even have to be a drive failure, but an unrecoverable read error.

Well said. Also include a controller burp, a bit flip somewhere, a drive going offline briefly, a momentary fibre cable interruption, etc. The list goes on.

My experience is that these weirdo "once in a lifetime" issues tend to present in clumps which are not as evenly distributed as statistics would lead you to believe. Rather, like my kids, they save up their fun into coordinated bursts. When these bursts happen, the ensuing conversations with stakeholders, about how all of this "redundancy" you tricked them into purchasing has still left them exposed, are no fun. Not good times.

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] nfs share of nested zfs directories?
Thank you, when I manually mount using the "mount -t nfs4" option, I am able to see the entire tree, however, the permissions are set as nfsnobody. "Warning: rpc.idmapd appears not to be running. All uids will be mapped to the nobody uid." - Cassandra (609) 243-2413 Unix Administrator "From a little spark may burst a mighty flame." -Dante Alighieri On Thu, Jun 3, 2010 at 4:33 PM, Brandon High wrote: > On Thu, Jun 3, 2010 at 12:50 PM, Cassandra Pugh wrote: > > The special case here is that I am trying to traverse NESTED zfs systems, > > for the purpose of having compressed and uncompressed directories. > > Make sure to use "mount -t nfs4" on your linux client. The standard > "nfs" type only supports nfs v2/v3. > > -B > > -- > Brandon High : bh...@freaks.com > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] slog / log recovery is here!
Sorry for reviving this old thread. I even have this problem on my (productive) backup server. I lost my system-hdd and my separate ZIL-device while the system crashs and now I'm in trouble. The old system was running under the least version of osol/dev (snv_134) with zfs v22. After the server crashs I was very optimistic of solving the problems the same day. It's a long time ago. I was setting up a new systen (osol 2009.06 and updating to the lastest version of osol/dev - snv_134 - with deduplication) and then I tried to import my backup zpool, but it does not work. # zpool import pool: tank1 id: 5048704328421749681 state: UNAVAIL status: The pool was last accessed by another system. action: The pool cannot be imported due to damaged devices or data. see: http://www.sun.com/msg/ZFS-8000-EY config: tank1UNAVAIL missing device raidz2-0 ONLINE c7t5d0 ONLINE c7t0d0 ONLINE c7t6d0 ONLINE c7t3d0 ONLINE c7t1d0 ONLINE c7t4d0 ONLINE c7t2d0 ONLINE # zpool import -f tank1 cannot import 'tank1': one or more devices is currently unavailable Destroy and re-create the pool from a backup source Any other option (-F, -X, -V, -D) and any combination of them doesn't helps too. I can not add / attach / detach / remove a vdev and the ZIL-device either, because the system tells me: there is no zpool 'tank1'. In the last ten days I read a lot of threads, guides to solve problems and best practice documentations with ZFS and so on, but I do not found a solution for my problem. I created a fake-zpool with separate ZIL-device to combine the new ZIL-file with my old zpool for importing them, but it doesn't work in course of the different GUID and checksum (the name I was modifiing by an binary editor). The output of: e...@opensolaris:~# zdb -e tank1 Configuration for import: vdev_children: 2 version: 22 pool_guid: 5048704328421749681 name: 'tank1' state: 0 hostid: 946038 hostname: 'opensolaris' vdev_tree: type: 'root' id: 0 guid: 5048704328421749681 children[0]: type: 'raidz' id: 0 guid: 16723866123388081610 nparity: 2 metaslab_array: 23 metaslab_shift: 30 ashift: 9 asize: 7001340903424 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 6858138566678362598 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@0,0:a' whole_disk: 1 DTL: 4345 create_txg: 4 path: '/dev/dsk/c7t5d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pj1bq709050/a' children[1]: type: 'disk' id: 1 guid: 16136237447458434520 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@1,0:a' whole_disk: 1 DTL: 4344 create_txg: 4 path: '/dev/dsk/c7t0d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pjdwq317311/a' children[2]: type: 'disk' id: 2 guid: 10876853602231471126 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@2,0:a' whole_disk: 1 DTL: 4343 create_txg: 4 path: '/dev/dsk/c7t6d0s0' devid: 'id1,s...@sata_hitachi_hdt72101__stf604mh14s56w/a' children[3]: type: 'disk' id: 3 guid: 2384677379114262201 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@3,0:a' whole_disk: 1 DTL: 4342 create_txg: 4 path: '/dev/dsk/c7t3d0s0' devid: 'id1,s...@sata_samsung_hd103uj___s13pj1nq811135/a' children[4]: type: 'disk' id: 4 guid: 15143849195434333247 phys_path: '/p...@0,0/pci8086,2...@1e/pci11ab,1...@9/d...@4,0:a' whole_disk: 1 DTL: 4341 create_txg: 4 path: '/dev/dsk/c7t1d0s0' devid: 'id1,s...@sata_hitachi_hdt72101__stf604mh16v73w/a' children[5]: type: 'disk' id: 5 guid: 11627603446133164653 phys_path: '/p...@0,0
[zfs-discuss] Depth of Scrub
Hi,

I have a small question about the depth of scrub in a raidz/2/3 configuration. I'm quite sure scrub does not check spares or unused areas of the disks (though it could check whether the disks detect any errors there). But what about the parity? Obviously it has to be checked, but I can't find any indication of it in the literature. The man page only states that the data is checksummed, and only if that fails is the redundancy used. Please tell me I'm wrong ;)

But what I'm really targeting with my question: how much coverage can be reached with a find | xargs wc in contrast to scrub? It misses the snapshots, but anything beyond that?

Thanks,
Arne

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Migrating to ZFS
"It's not easy to make Solaris slices on the boot drive." As I am just realizing. The installer does not have any kind of partition software. I have a linux boot disc and I am contemplating using gparted to resize the win partition to create a raw 50GB empty partition. Can the installer format a raw partition into a Solaris FS? If it can it will be easy (assuming it can set up the dual boot properly). This is what I'm thinking: 1) Use Gparted to resize the windows partition and therefore create a 50GB raw partition. 2) Use the opensolaris installer to format the raw partition into a Solaris FS. 3) Install opensolaris 2009.06, the setup should automatically configure the dual boot with windows and opensolaris. Does that make sense? Thanks again. Message was edited by: zfsnoob4 -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss