Re: [zfs-discuss] How does resilver/scrub work?

2012-05-18 Thread Jim Klimov
on that below :) 2012-05-18 15:30, Daniel Carosone wrote: On Fri, May 18, 2012 at 03:05:09AM +0400, Jim Klimov wrote: While waiting for that resilver to complete last week, I caught myself wondering how the resilvers (are supposed to) work in ZFS? The devil finds work for idle hands

Re: [zfs-discuss] How does resilver/scrub work?

2012-05-18 Thread Jim Klimov
2012-05-18 19:08, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Jim Klimov I'm reading the ZFS on-disk spec, and I get the idea that there's an uberblock pointing to a self-balancing tree (some say b-tree, some say

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-17 Thread Jim Klimov
with no deletions so far is oh-so-good! ;) 2012-05-17 1:21, Jim Klimov wrote: 2012-05-15 19:17, casper@oracle.com wrote: Your old release of Solaris (nearly three years old) doesn't support disks over 2TB, I would think. (A 3TB is 3E12, the 2TB limit is 2^41 and the difference is around 800Gb

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-17 Thread Jim Klimov
New question: if the snv_117 does see the 3Tb disks well, the matter of upgrading the OS becomes not so urgent - we might prefer to delay that until the next stable release of OpenIndiana or so. Now that I think of it, when was raidz3 introduced?.. I don't see it in the zpool manpage as of SXCE

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-17 Thread Jim Klimov
2012-05-18 1:39, Jim Klimov wrote: A small follow-up on my tests, just in case readers are interested in some numbers: the UltraStar 3Tb disk got filled up by a semi-random selection of data from our old pool in 24 hours sharp One more number: the smaller pool completed its scrub in 57

[zfs-discuss] How does resilver/scrub work?

2012-05-17 Thread Jim Klimov
Hello all, While waiting for that resilver to complete last week, I caught myself wondering how the resilvers (are supposed to) work in ZFS? Based on what I see in practice and read in this list and some blogs, I've built a picture and would be grateful if some experts actually familiar

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-16 Thread Jim Klimov
2012-05-16 6:18, Bob Friesenhahn wrote: You forgot IDEA #6 where you take advantage of the fact that zfs can be told to use sparse files as partitions. This is rather like your IDEA #3 but does not require that disks be partitioned. This is somewhat the method of making missing devices when

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-16 Thread Jim Klimov
2012-05-16 13:30, Joerg Schilling wrote: Jim Klimov jimkli...@cos.ru wrote: We know that large redundancy is highly recommended for big HDDs, so in-place autoexpansion of the raidz1 pool onto 3Tb disks is out of the question. Before I started to use my thumper, I reconfigured it to use

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-16 Thread Jim Klimov
Hello fellow BOFH, I also went by that title in a previous life ;) 2012-05-16 21:58, bofh wrote: Err, why go to all that trouble? Replace one disk per pool. Wait for resilver to finish. Replace next disk. Once all/enough disks have been replaced, turn on autoexpand, and you're done. As

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-16 Thread Jim Klimov
2012-05-16 22:21, bofh wrote: There's something going on then. I have 7x 3TB disk at home, in raidz3, so about 12TB usable. 2.5TB actually used. Scrubbing takes about 2.5 hours. I had done the resilvering as well, and that did not take 15 hours/drive. That is the critical moment ;) The

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-16 Thread Jim Klimov
2012-05-15 19:17, casper@oracle.com wrote: Your old release of Solaris (nearly three years old) doesn't support disks over 2TB, I would think. (A 3TB is 3E12, the 2TB limit is 2^41 and the difference is around 800Gb) While this was proven correct by my initial experiments, it seems that
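In concrete numbers (an illustrative calculation based on the figures quoted above): 2^41 bytes is roughly 2.2E12, while a marketed 3TB drive is 3E12 bytes, so on the order of 0.8E12 bytes - about 800GB - would lie beyond what such an old release can address.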

[zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-15 Thread Jim Klimov
Hello all, I'd like some practical advice on migration of a Sun Fire X4500 (Thumper) from aging data disks to a set of newer disks. Some questions below are my own, others are passed from the customer and I may consider not all of them sane - but must ask anyway ;) 1) They hope to use 3Tb disks,

Re: [zfs-discuss] Migration of a Thumper to bigger HDDs

2012-05-15 Thread Jim Klimov
, check! ;} 2012-05-15 13:41, Jim Klimov wrote: Hello all, I'd like some practical advice on migration of a Sun Fire X4500 (Thumper) from aging data disks to a set of newer disks. Some questions below are my own, others are passed from the customer and I may consider not all of them sane - but must ask

Re: [zfs-discuss] Resilver restarting several times

2012-05-12 Thread Jim Klimov
2012-05-11 14:22, Jim Klimov wrote: What conditions can cause the reset of the resilvering process? My lost-and-found disk can't get back into the pool because of resilvers restarting... FOLLOW-UP AND NEW QUESTIONS Here is a new piece of evidence - I've finally got something out of fmdump

Re: [zfs-discuss] Resilver restarting several times

2012-05-12 Thread Jim Klimov
2012-05-12 15:52, Jim Klimov wrote: 2012-05-11 14:22, Jim Klimov wrote: What conditions can cause the reset of the resilvering process? My lost-and-found disk can't get back into the pool because of resilvers restarting... Guess I must assume that the disk is dying indeed, losing connection

Re: [zfs-discuss] Resilver restarting several times

2012-05-12 Thread Jim Klimov
Thanks for staying tuned! ;) 2012-05-12 18:34, Richard Elling wrote: On May 12, 2012, at 4:52 AM, Jim Klimov wrote: 2012-05-11 14:22, Jim Klimov wrote: What conditions can cause the reset of the resilvering process? My lost-and-found disk can't get back into the pool because of resilvers

Re: [zfs-discuss] Resilver restarting several times

2012-05-12 Thread Jim Klimov
2012-05-12 7:01, Jim Klimov wrote: Overall the applied question is whether the disk will make it back into the live pool (ultimately with no continuous resilvering), and how fast that can be done - I don't want to risk the big pool with nonredundant arrays for too long. Here lies another

[zfs-discuss] Resilver restarting several times

2012-05-11 Thread Jim Klimov
zfs_resilver_min_time_ms/W0t2 | mdb -kw mdb: failed to dereference symbol: unknown symbol name Thanks for any ideas, //Jim Klimov

Re: [zfs-discuss] Resilver restarting several times

2012-05-11 Thread Jim Klimov
2012-05-11 17:18, Bob Friesenhahn wrote: On Fri, 11 May 2012, Jim Klimov wrote: Hello all, SHORT VERSION: What conditions can cause the reset of the resilvering process? My lost-and-found disk can't get back into the pool because of resilvers restarting... I recall that with sufficiently

Re: [zfs-discuss] Resilver restarting several times

2012-05-11 Thread Jim Klimov
2012-05-11 17:18, Bob Friesenhahn wrote: On Fri, 11 May 2012, Jim Klimov wrote: Hello all, SHORT VERSION: What conditions can cause the reset of the resilvering process? My lost-and-found disk can't get back into the pool because of resilvers restarting... I recall that with sufficiently

Re: [zfs-discuss] Resilver restarting several times

2012-05-11 Thread Jim Klimov
2012-05-12 5:50, Robert Milkowski wrote: What conditions can cause the reset of the resilvering process? My lost-and-found disk can't get back into the pool because of resilvers restarting... Well, for the night I rebooted the machine into single-user mode, to rule out zones, crontabs and

Re: [zfs-discuss] Resilver restarting several times

2012-05-11 Thread Jim Klimov
2012-05-12 4:26, Jim Klimov wrote: Wonder if things would get better or worse if I kick one of the drives (i.e. hotspare c5t6d0) out of the equation: raidz1 ONLINE 0 0 0 c0t1d0 ONLINE 0 0 0 spare ONLINE 0 0 0 c1t2d0 ONLINE 0 0 0 6.72G resilvered c5t6d0 ONLINE 0 0 0 c4t3d0 ONLINE 0 0 0 c6t5d0

Re: [zfs-discuss] slow zfs send

2012-05-07 Thread Jim Klimov
2012-05-07 20:45, Karl Rossing wrote: I'm wondering why the zfs send could be so slow. Could the other server be slowing down the sas bus? I hope other posters would have more relevant suggestions, but you can see if the buses are contended by dd'ing from the drives. At least that would give

Re: [zfs-discuss] autoexpand in a physical disk with 2 zpool

2012-05-03 Thread Jim Klimov
2012-05-03 9:44, Jordi Espasa Clofent wrote: Note, as you can see, the slice 0 is used for 'rpool' and the slice 7 is used for 'opt'. The autoexpand property is enabled in 'rpool' but is disabled in 'opt' This machine is a virtual one (VMware), so I can enlarge the disk easily if I need.

Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?

2012-05-02 Thread Jim Klimov
2012-05-03 3:07, Fred Liu wrote: There is no specific problem to resolve. Just want to get sort of accurate equation between the raw storage size and the usable storage size although the *meta file* size is trivial. If you do mass storage budget, this equation is meaningful. I don't think

Re: [zfs-discuss] cluster vs nfs

2012-04-26 Thread Jim Klimov
On 2012-04-26 2:20, Ian Collins wrote: On 04/26/12 09:54 AM, Bob Friesenhahn wrote: On Wed, 25 Apr 2012, Rich Teer wrote: Perhaps I'm being overly simplistic, but in this scenario, what would prevent one from having, on a single file server, /exports/nodes/node[0-15], and then having each node

Re: [zfs-discuss] cluster vs nfs

2012-04-26 Thread Jim Klimov
On 2012-04-26 14:47, Ian Collins wrote: I don't think it even made it into Solaris 10. Actually, I see the kernel modules available in both Solaris 10, several builds of OpenSolaris SXCE and an illumos-current. $ find /kernel/ /platform/ /usr/platform/ /usr/kernel/ | grep -i cachefs

Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?

2012-04-26 Thread Jim Klimov
On 2012-04-26 11:27, Fred Liu wrote: “zfs 'userused@' properties” and “'zfs userspace' command” are good enough to gather usage statistics. ... Since no one is focusing on enabling default user/group quota now, the temporarily remedy could be a script which traverse all the users/groups in the

Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed

2012-04-24 Thread Jim Klimov
On 2012-04-24 19:14, Tim Cook wrote: Personally unless the dataset is huge and you're using z3, I'd be scrubbing once a week. Even if it's z3, just do a window on Sundays or something so that you at least make it through the whole dataset at least once a month. +1 I guess Among other

Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed

2012-04-23 Thread Jim Klimov
2012-04-23 9:35, Daniel Carosone wrote: I'll try to leave all 6 original disks in the machine while replacing, maybe zfs will be smart enough to use the 6 drives to build the replacement disk ? I don't think it will.. others who know the code, feel free to comment otherwise. Well, I've heard

Re: [zfs-discuss] Aaron Toponce: Install ZFS on Debian GNU/Linux

2012-04-18 Thread Jim Klimov
2012-04-18 6:57, David E. wrote: Now, make your zpool, and start playing: $ sudo zpool create test raidz sdd sde sdf sdg sdh sdi It is stable enough to run a ZFS root filesystem on a GNU/Linux installation for your workstation as something to play around with. It is copy-on-write, supports

Re: [zfs-discuss] Aaron Toponce: Install ZFS on Debian GNU/Linux

2012-04-18 Thread Jim Klimov
2012-04-18 18:54, Cindy Swearingen wrote: Hmmm, how come they have encryption and we don't? As in Solaris releases, or some other we? With all due respect, I did not mean to start a flame war, so I'll frantically try to stomp out the sparks ;) Still, this is a zfs discuss list at

Re: [zfs-discuss] Drive upgrades

2012-04-17 Thread Jim Klimov
2012-04-17 5:15, Richard Elling wrote: For the archives... Write-back cache enablement is toxic for file systems that do not issue cache flush commands, such as Solaris' UFS. In the early days of ZFS, on Solaris 10 or before ZFS was bootable on OpenSolaris, it was not uncommon to have ZFS and

Re: [zfs-discuss] zpool split failing

2012-04-17 Thread Jim Klimov
2012-04-17 14:47, Matt Keenan wrote: - or is it possible that one of the devices being a USB device is causing the failure ? I don't know. Might be, I've got little experience with those beside LiveUSB imagery ;) My reason for splitting the pool was so I could attach the clean USB rpool to

Re: [zfs-discuss] What's wrong with LSI 3081 (1068) + expander + (bad) SATA disk?

2012-04-08 Thread Jim Klimov
2012-04-08 6:06, Richard Elling wrote: You can't get past the age-old idiom: you get what you pay for. True... but it can be somewhat countered with DrHouse-age idiom: people lie, even if they don't mean to ;) Rhetoric follows ;) Hardware breaks sooner or later, due to poor design, brownian

Re: [zfs-discuss] What's wrong with LSI 3081 (1068) + expander + (bad) SATA disk?

2012-04-07 Thread Jim Klimov
I'm not familiar with the J4400 at all, but isn't Sun/Oracle using interposer cards (like NetApp does) and thus handling the SATA drives more or less like SAS ones? Out of curiosity, are there any third-party hardware vendors that make server/storage chassis (Supermicro et al) who make SATA

Re: [zfs-discuss] no valid replicas

2012-04-05 Thread Jim Klimov
2012-04-04 23:27, Jan-Aage Frydenbø-Bruvoll wrote: Which OS and release? This is OpenIndiana oi_148, ZFS pool version 28. There was a bug in some releases circa 2010 that you might be hitting. It is harmless, but annoying. Ok - what bug is this, how do I verify whether I am facing it here

Re: [zfs-discuss] no valid replicas

2012-04-05 Thread Jim Klimov
2012-04-05 16:04, Jim Klimov wrote: 2012-04-04 23:27, Jan-Aage Frydenbø-Bruvoll wrote: Which OS and release? This is OpenIndiana oi_148, ZFS pool version 28. There was a bug in some releases circa 2010 that you might be hitting. It is harmless, but annoying. Ok - what bug is this, how

Re: [zfs-discuss] kernel panic during zfs import

2012-03-27 Thread Jim Klimov
2012-03-27 11:14, Carsten John wrote: I saw a similar effect some time ago on an OpenSolaris box (build 111b). That time my final solution was to copy over the read only mounted stuff to a newly created pool. As it is the second time this failure occurs (on different machines) I'm really

Re: [zfs-discuss] volblocksize for VMware VMFS-5

2012-03-26 Thread Jim Klimov
heard that VMWare has some smallish limit on the number of NFS connections, but 30 should be bearable... HTH, //Jim Klimov

Re: [zfs-discuss] webserver zfs root lock contention under heavy load

2012-03-26 Thread Jim Klimov
2012-03-26 14:27, Aubrey Li wrote: The php temporary folder is set to /tmp, which is tmpfs. By the way, how much RAM does the box have available? tmpfs in Solaris is backed by virtual memory. It is like a RAM disk, although maybe slower than ramdisk FS as seen in livecd, as long as there is

Re: [zfs-discuss] webserver zfs root lock contention under heavy load

2012-03-26 Thread Jim Klimov
As a random guess, try pointing PHP tmp directory to /var/tmp (backed by zfs) and see if any behaviors change? Good luck, //Jim Thanks for your suggestions. Actually the default PHP tmp directory was /var/tmp, and I changed /var/tmp to /tmp. This reduced zfs root lock contention

Re: [zfs-discuss] Good tower server for around 1,250 USD?

2012-03-24 Thread Jim Klimov
2012-03-24 2:02, Bob Friesenhahn wrote: On Fri, 23 Mar 2012, The Honorable Senator and Mrs. John Blutarsky wrote: Obtaining an approved system seems very difficult. Because of the list being out of date and so the systems are no longer available, or because systems available now don't show

Re: [zfs-discuss] Basic ZFS Questions + Initial Setup Recommendation

2012-03-22 Thread Jim Klimov
2012-03-21 22:53, Richard Elling wrote: ... This is why a single vdev's random-read performance is equivalent to the random-read performance of a single drive. It is not as bad as that. The actual worst case number for a HDD with zfs_vdev_max_pending of one is: average IOPS * ((D+P) / D)
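Putting assumed numbers into that formula for illustration: a 6-disk raidz2 (D=4 data disks, P=2 parity) built from drives averaging about 100 random-read IOPS each would give roughly 100 * (6 / 4) = 150 IOPS for the whole vdev - better than a single drive, but nowhere near six drives' worth.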

Re: [zfs-discuss] Basic ZFS Questions + Initial Setup Recommendation

2012-03-22 Thread Jim Klimov
2012-03-22 20:52, Richard Elling wrote: Yes, but it is a rare case for 512b sectors. It could be more common for 4KB sector disks when ashift=12. ... Were there any research or tests regarding storage of many small files (1-sector sized or close to that) on different vdev layouts? It is not

Re: [zfs-discuss] Basic ZFS Questions + Initial Setup Recommendation

2012-03-21 Thread Jim Klimov
should consider? Thanks for any help. Again, I hope someone else would correctly suggest the setup for your numbers. I'm somewhat more successful with theory now ;( HTH, //Jim Klimov

Re: [zfs-discuss] Basic ZFS Questions + Initial Setup Recommendation

2012-03-21 Thread Jim Klimov
2012-03-21 16:41, Paul Kraus wrote: I have been running ZFS in a mission critical application since zpool version 10 and have not seen any issues with some of the vdevs in a zpool full while others are virtually empty. We have been running commercial Solaris 10 releases. The configuration

Re: [zfs-discuss] Basic ZFS Questions + Initial Setup Recommendation

2012-03-21 Thread Jim Klimov
2012-03-21 17:28, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of MLR ... Am I correct in thinking this means, for example, I have a single 14 disk raidz2 vdev zpool, It's not advisable to put more than ~8 disks

Re: [zfs-discuss] Basic ZFS Questions + Initial Setup Recommendation

2012-03-21 Thread Jim Klimov
2012-03-21 21:40, Marion Hakanson wrote: Small, random read performance does not scale with the number of drives in each raidz[123] vdev because of the dynamic striping. In order to read a single logical block, ZFS has to read all the segments of that logical block, which have been spread out

Re: [zfs-discuss] Convert pool from ashift=12 to ashift=9

2012-03-20 Thread Jim Klimov
2012-03-18 23:47, Richard Elling wrote: ... Yes, it is wrong to think that. Ok, thanks, we won't try that :) copy out, copy in. Whether this is easy or not depends on how well you plan your storage use ... Home users and personal budgets do tend to have a problem with planning. Any

[zfs-discuss] Question about Seagate Pipeline HD or SV35 seriesHDDs

2012-03-18 Thread Jim Klimov
Hello, while browsing around today I stumbled across the Seagate Pipeline HD lineup of HDDs (i.e. ST2000VM002). Have any ZFS users had experience with them? http://www.seagate.com/www/en-us/products/consumer_electronics/pipeline/

[zfs-discuss] Convert pool from ashift=12 to ashift=9

2012-03-18 Thread Jim Klimov
Hello all, I was asked if it is possible to convert a ZFS pool created explicitly with ashift=12 (via the tweaked binary) and filled with data back into ashift=9 so as to use the slack space from small blocks (BP's, file tails, etc.) The user's HDD marketing text says that it efficiently
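For a rough illustration of that slack (sizes assumed): with ashift=12 every on-disk allocation is rounded up to a 4KiB multiple, so a 1KiB file tail still consumes a full 4KiB (about 3KiB of slack), whereas with ashift=9 the same tail would occupy only two 512-byte sectors.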

Re: [zfs-discuss] Move files between ZFS folder/Datasets ?

2012-03-14 Thread Jim Klimov
2012-03-14 13:57, Svavar Örn Eysteinsson wrote: Hello. I can't seem to find any good info about this case. I'm running OpenIndiana 151 and have some files on a ZFS folder (located under /datapool/stuff). I'm in the need to create a new ZFS folder (/datapool/temp) and move some files from

Re: [zfs-discuss] Move files between ZFS folder/Datasets ?

2012-03-14 Thread Jim Klimov
2012-03-14 16:38, Paul Kraus wrote: On Wed, Mar 14, 2012 at 5:57 AM, Svavar Örn Eysteinsson sva...@fiton.is wrote: I'm running OpenIndiana 151 and have some files on a ZFS folder (located under /datapool/stuff). I'm in the need to create a new ZFS folder (/datapool/temp) and move some files

Re: [zfs-discuss] Unable to import exported zpool on a new server

2012-03-13 Thread Jim Klimov
2012-03-13 16:52, Hung-Sheng Tsao (LaoTsao) Ph.D wrote: hi are the disk/sas controller the same on both server? Seemingly no. I don't see the output of format on Server2, but for Server1 I see that the 3TB disks are used as IDE devices (probably with motherboard SATA-IDE emulation?) while on

[zfs-discuss] zfs command botched in HG source?

2012-03-12 Thread Jim Klimov
VM takes a few days. I did check that the /sbin/zfs binary from original oi_151a works as expected with the same kernel/libs that I've built. Tested on two different pools. Just my 2c about a possible regression in the bleeding-edge code. Posting a bug to tracker as well... HTH, //Jim Klimov

Re: [zfs-discuss] Receive failing with invalid backup stream error

2012-03-09 Thread Jim Klimov
2012-03-09 9:24, Ian Collins wrote: I sent the snapshot to a file, copied the file to the remote host and piped the file into zfs receive. That worked and I was able to send further snapshots with ssh. Odd. Is it possible that in case of zfs send ... | ssh | zfs recv piping, the two ZFS

Re: [zfs-discuss] Compatibility of Hitachi Deskstar 7K3000 HDS723030ALA640 with ZFS

2012-03-08 Thread Jim Klimov
2012-03-07 17:21, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of luis Johnstone As far as I can tell, the Hitachi Deskstar 7K3000 (HDS723030ALA640) uses 512B sectors and so I presume does not suffer from such

Re: [zfs-discuss] Problem with ESX NFS store on ZFS

2012-02-29 Thread Jim Klimov
2012-02-29 21:15, Mark Wolek wrote: Running Solaris 11 with ZFS and the VM’s on this storage can only be opened and run on 1 ESX host, if I move the files to another host I get access denied, even though root has full permissions to the files. Any ideas or does it ring any bells for anyone

Re: [zfs-discuss] zpool fails with panic in zio_ddt_free()

2012-02-23 Thread Jim Klimov
2012-02-09 22:43, Jim Klimov wrote: 2012-02-04 18:27, Jim Klimov wrote: panicstr = BAD TRAP: type=e (#pf Page fault) rp=ff0010a5e920 addr=30 occurred in module zfs due to a NULL pointer dereference panicstack = unix:die+dd () | unix:trap+1799 () | unix:cmntrap+e6 () | zfs:ddt_phys_decref+c

Re: [zfs-discuss] [o.seib...@cs.ru.nl: A broken ZFS pool...]

2012-02-16 Thread Jim Klimov
2012-02-16 14:57, Olaf Seibert wrote: On Wed 15 Feb 2012 at 14:49:14 +0100, Olaf Seibert wrote: NAME STATE READ WRITE CKSUM tank FAULTED 0 0 2 raidz2-0 DEGRADED 0 0 8 da0

Re: [zfs-discuss] zpool fails with panic in zio_ddt_free()

2012-02-09 Thread Jim Klimov
2012-02-04 18:27, Jim Klimov wrote: panicstr = BAD TRAP: type=e (#pf Page fault) rp=ff0010a5e920 addr=30 occurred in module zfs due to a NULL pointer dereference panicstack = unix:die+dd () | unix:trap+1799 () | unix:cmntrap+e6 () | zfs:ddt_phys_decref+c () | zfs:zio_ddt_free+5c

Re: [zfs-discuss] zpool fails with panic in zio_ddt_free()

2012-02-04 Thread Jim Klimov
rollbacks until required to... Thanks, //Jim 2012-02-04 4:28, Jim Klimov wrote: I got the machine with my 6-disk raidz2 pool booted again, into oi_151a, but it reboots soon after importing the pool. Kernel hits a NULL pointer dereference in DDT-related routines and crashes. According to fmdump, error

Re: [zfs-discuss] HP JBOD D2700 - ok?

2012-02-01 Thread Jim Klimov
2012-02-01 6:22, Ragnar Sundblad wrote: That is almost what I do, except that I only have one HBA. We haven't seen many HBAs fail during the years, none actually, so we thought it was overkill to double those too. But maybe we are wrong? Question: if you use two HBAs on different PCI buses to

[zfs-discuss] (gang?)block layout question, and how to decipher ZDB output?

2012-01-31 Thread Jim Klimov
, :) //Jim Klimov

Re: [zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread Jim Klimov
2012-01-24 13:05, Mickaël CANÉVET wrote: Hi, Unless I misunderstood something, zfs send of a volume that has compression activated uncompresses it. So if I do a zfs send|zfs receive from a compressed volume to a compressed volume, my data are uncompressed and compressed again. Right? Is there

Re: [zfs-discuss] What is your data error rate?

2012-01-24 Thread Jim Klimov
RAM or overheated CPUs, power surges from PSU... There is a lot of stuff that can break :) //Jim Klimov

Re: [zfs-discuss] zfs send recv without uncompressing data stream

2012-01-24 Thread Jim Klimov
2012-01-24 19:52, Jim Klimov wrote: 2012-01-24 13:05, Mickaël CANÉVET wrote: Hi, Unless I misunderstood something, zfs send of a volume that has compression activated uncompresses it. So if I do a zfs send|zfs receive from a compressed volume to a compressed volume, my data are uncompressed

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-23 Thread Jim Klimov
2012-01-22 22:58, Richard Elling wrote: On Jan 21, 2012, at 6:32 AM, Jim Klimov wrote: ... So it currently seems to me, that: 1) My on-disk data could get corrupted for whatever reason ZFS tries to protect it from, at least once probably from misdirected writes (i.e. the head landed

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-23 Thread Jim Klimov
2012-01-23 18:25, Jim Klimov wrote: 4) I did not get to check whether dedup=verify triggers a checksum mismatch alarm if the preexisting on-disk data does not in fact match the checksum. All checksum mismatches are handled the same way. I have yet to test (to be certain) whether writing

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-21 Thread Jim Klimov
2012-01-21 0:33, Jim Klimov wrote: 2012-01-13 4:12, Jim Klimov wrote: As I recently wrote, my data pool has experienced some unrecoverable errors. It seems that a userdata block of deduped data got corrupted and no longer matches the stored checksum. For whatever reason, raidz2 did not help

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-21 Thread Jim Klimov
2012-01-21 19:18, Bob Friesenhahn wrote: On Sat, 21 Jan 2012, Jim Klimov wrote: 5) It seems like a worthy RFE to include a pool-wide option to verify-after-write/commit - to test that recent TXG sync data has indeed made it to disk on (consumer-grade) hardware into the designated sector

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-21 Thread Jim Klimov
2012-01-21 20:50, Bob Friesenhahn wrote: TXGs get forgotten from memory as soon as they are written. As I said, that can be arranged - i.e. free the TXG cache after the corresponding TXG number has been verified? Point about ARC being overwritten seems valid... Zfs already knows how to

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-21 Thread Jim Klimov
2012-01-22 0:55, Bob Friesenhahn wrote: On Sun, 22 Jan 2012, Jim Klimov wrote: So far I rather considered flaky hardware with lousy consumer qualities. The server you describe is likely to exceed that bar ;) The most common flaky behavior of consumer hardware which causes troubles for zfs

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-20 Thread Jim Klimov
2012-01-13 4:12, Jim Klimov wrote: As I recently wrote, my data pool has experienced some unrecoverable errors. It seems that a userdata block of deduped data got corrupted and no longer matches the stored checksum. For whatever reason, raidz2 did not help in recovery of this data, so I rsync'ed

Re: [zfs-discuss] Data loss by memory corruption?

2012-01-19 Thread Jim Klimov
2012-01-18 20:36, Nico Williams wrote: On Wed, Jan 18, 2012 at 4:53 AM, Jim Klimov jimkli...@cos.ru wrote: 2012-01-18 1:20, Stefan Ring wrote: I don’t care too much if a single document gets corrupted – there’ll always be a good copy in a snapshot. I do care however if a whole directory branch

Re: [zfs-discuss] Data loss by memory corruption?

2012-01-18 Thread Jim Klimov
2012-01-18 1:20, Stefan Ring wrote: The issue is definitely not specific to ZFS. For example, the whole OS depends on relable memory content in order to function. Likewise, no one likes it if characters mysteriously change in their word processing documents. I don’t care too much if a single

Re: [zfs-discuss] S11 vs illumos zfs compatiblity

2012-01-17 Thread Jim Klimov
of ZFS be the highest quality possible. I'd like to think that we all try to achieve that to the extent that it is possible within our corporate priorities. Thank you for shedding some light of hope. ;) //Jim Klimov

Re: [zfs-discuss] Failing WD desktop drive in mirror, how to identify?

2012-01-17 Thread Jim Klimov
2012-01-17 16:17, casper@oracle.com wrote: I have a desktop system with 2 ZFS mirrors. One drive in one mirror is starting to produce read errors and slowing things down dramatically. I detached it and the system is running fine. I can't tell which drive it is though! The error message

Re: [zfs-discuss] zfs disapeare on FreeBSD.

2012-01-17 Thread Jim Klimov
to cache file: # zpool import -R /RPOOL-BACKUP -f 12076177533503245216 rpool-backup Hope these tips help you, //Jim Klimov

[zfs-discuss] ZDB returning strange values

2012-01-17 Thread Jim Klimov
segment line in the end, does it mean that it was not deduplicated with any other blocks (and is stored contiguously)? Thanks, //Jim Klimov === ZDB output samples: strange output Dataset pool/mydata [ZPL], ID 281, cr_txg 291705, 850G, 338171 objects, rootbp DVA[0]=0:49c1a79:3000 DVA[1]=0

Re: [zfs-discuss] Does raidzN actually protect against bitrot? If yes - how?

2012-01-16 Thread Jim Klimov
Thanks again for answering! :) 2012-01-16 10:08, Richard Elling wrote: On Jan 15, 2012, at 7:04 AM, Jim Klimov wrote: Does raidzN actually protect against bitrot? That's a kind of radical, possibly offensive, question formula that I have lately. Simple answer: no. raidz provides data

Re: [zfs-discuss] zfs defragmentation via resilvering?

2012-01-16 Thread Jim Klimov
2012-01-16 8:39, Bob Friesenhahn wrote: On Sun, 15 Jan 2012, Edward Ned Harvey wrote: While I'm waiting for this to run, I'll make some predictions: The file is 2GB (16 Gbit) and the disk reads around 1Gbit/sec, so reading the initial sequential file should take ~16 sec After fragmentation, it

Re: [zfs-discuss] Injection of ZFS snapshots into existing data, and replacement of older snapshots with zfs recv without truncating newer ones

2012-01-16 Thread Jim Klimov
2012-01-16 23:14, Matthew Ahrens wrote: On Thu, Jan 12, 2012 at 5:00 PM, Jim Klimov jimkli...@cos.ru wrote: While reading about zfs on-disk formats, I wondered once again why it is not possible to create a snapshot on existing data, not of the current TXG

Re: [zfs-discuss] Data loss by memory corruption?

2012-01-15 Thread Jim Klimov
it as it matches the checksum. On the good side, there is a smaller window that data is exposed unprotected, so statistically this solution should help. HTH, //Jim Klimov

[zfs-discuss] RaidzN + mirror

2012-01-15 Thread Jim Klimov
How nested can the VDEV tree be? All the examples I've seen suggested 3 layers - root vdev, striping over some top-level vdevs (if present), made (redundantly) of some physical/leaf vdevs. In trivial cases this goes down to two levels (a root striping over non-redundant leaf vdevs) or one

[zfs-discuss] Does raidzN actually protect against bitrot? If yes - how?

2012-01-15 Thread Jim Klimov
that be wonderful for ZFS in general? :) Thanks in advance, //Jim Klimov

[zfs-discuss] ZFS Metadata on-disk grouping

2012-01-15 Thread Jim Klimov
Does ZFS currently attempt to group metadata in large sector-ranges on the disk? Can this be expected to happen automagically - i.e. during each TXG close we have to COW-update whole branches of the blockpointer tree, so these new blocks might just happen to always coalesce into larger sector

Re: [zfs-discuss] RaidzN + mirror

2012-01-15 Thread Jim Klimov
2012-01-15 19:16, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Jim Klimov 2) In particular, if I wanted to make a mirror of raidzN's, can it be done in one ZFS pool, or would I have to play with iSCSI and ZVOLs

Re: [zfs-discuss] Does raidzN actually protect against bitrot? If yes - how?

2012-01-15 Thread Jim Klimov
2012-01-15 19:38, Edward Ned Harvey wrote: 1) How does raidzN protect against bit-rot without known full death of a component disk, if it at all does? zfs can read disks 1,2,3,4... Then read disks 1,2,3,5... Then read disks 1,2,4,5... ZFS can figure out which disk returned the faulty
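The combination-testing idea described there can be sketched in Python roughly as below. This is only a conceptual illustration - checksum_ok() and rebuild_from_parity() are hypothetical helpers standing in for the block-pointer checksum test and the raidz parity math, not actual ZFS routines:

from itertools import combinations

def recover_block(segments, parity, checksum_ok, rebuild_from_parity):
    # Conceptual sketch of combinational raidz recovery, not the real ZFS code.
    data = b"".join(segments)
    if checksum_ok(data):
        return data  # the plain read already matches the stored checksum
    for bad_count in range(1, parity + 1):
        # Assume 1..P segments came back silently wrong: try every combination
        # of suspects, recompute them from parity, and keep the reconstruction
        # whose reassembled block matches the block-pointer checksum.
        for suspects in combinations(range(len(segments)), bad_count):
            candidate = rebuild_from_parity(segments, suspects)
            if checksum_ok(candidate):
                return candidate  # culprits found; their segments can be rewritten
    raise IOError("more than P segments damaged; block is unrecoverable")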

Re: [zfs-discuss] Does raidzN actually protect against bitrot? If yes - how?

2012-01-15 Thread Jim Klimov
for the replies, //Jim Klimov

Re: [zfs-discuss] Does raidzN actually protect against bitrot? If yes - how?

2012-01-15 Thread Jim Klimov
with the on-wire CRC/ECC, perhaps the IDE (and maybe consumer SATA) protocols? Thanks for replies, //Jim Klimov

Re: [zfs-discuss] Do the disks in a zpool have a private region that I can read to get a zpool name or id?

2012-01-12 Thread Jim Klimov
-discuss HTH, //Jim Klimov

Re: [zfs-discuss] Do the disks in a zpool have a private region that I can read to get a zpool name or id?

2012-01-12 Thread Jim Klimov
Followup: namely, the 'hostname' field should report the host which has last (or currently) imported the pool, and the 'name' field is the pool name as of last import (can be changed by something like zpool import pool1 testpool2). HTH, //Jim Klimov

Re: [zfs-discuss] Do the disks in a zpool have a private region that I can read to get a zpool name or id?

2012-01-12 Thread Jim Klimov
there is popular. Good luck, and let us know if your practice proves my generic rant wrong! ;) //Jim Klimov

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
detail and many examples, and gave me a better understanding of it all even though I deal with this for several years now. A good read, I suggest it to others ;) //Jim Klimov

[zfs-discuss] ZFS Dedup and bad checksums

2012-01-12 Thread Jim Klimov
), so corruption of the checksum would also cause replacement of really-good-but-normally-inaccessible data. //Jim Klimov (Bug reported to Illumos: https://www.illumos.org/issues/1981)

Re: [zfs-discuss] ZFS Dedup and bad checksums

2012-01-12 Thread Jim Klimov
2012-01-13 4:26, Richard Elling wrote: On Jan 12, 2012, at 4:12 PM, Jim Klimov wrote: As I recently wrote, my data pool has experienced some unrecoverable errors. It seems that a userdata block of deduped data got corrupted and no longer matches the stored checksum. For whatever reason, raidz2

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
2012-01-13 2:34, Jim Klimov wrote: I guess I have another practical rationale for a second checksum, be it ECC or not: my scrubbing pool found some unrecoverable errors. ...Applications need to know whether the digest has been changed. As Richard reminded me in another thread, both metadata

[zfs-discuss] Injection of ZFS snapshots into existing data, and replacement of older snapshots with zfs recv without truncating newer ones

2012-01-12 Thread Jim Klimov
, why not use the available knowledge of known-good blocks to repair detected {small} errors in large volumes of same data? What do you think?.. //Jim Klimov
