Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Stephan Budach

> I am always experiencing chksum errors while scrubbing my zpool(s), but I never experienced chksum errors while resilvering. Does anybody know why that would be?

When you resilver, you're not reading all the data on all the drives. Only just enough to resilver, which doesn't include all the data that was previously in-sync (maybe a little of it, but mostly not). Even if you have a completely failed drive, replaced with a completely new empty drive, if you have a 3-way mirror, you only need to read one good copy of the data in order to write the resilver'd data onto the new drive. So you could still be failing to detect cksum errors on the *other* side of the mirror, which wasn't read during the resilver.

What's more, when you resilver, the system is just going to write the target disk. Not go back and verify every written block of the target disk.

So, think of a scrub as a complete, thorough resilver, whereas a resilver is just a lightweight version, doing only the parts that are known to be out-of-sync, and without subsequent read verification.

> This happens on all of my servers, Sun Fire 4170M2, Dell PE 650 and on any FC storage that I have.

While you apparently have been able to keep the system in production for a while, consider yourself lucky. You have a real problem, and solving it probably won't be easy. Your problem is either hardware, firmware, or drivers. If you have a support contract on the Sun, I would recommend starting there. The Dell is definitely a configuration that you won't find official support for - just a lot of community contributors, who will likely not provide a super awesome answer for you super soon.
Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov

> And regarding the considerable activity - AFAIK there is little way for ZFS to reliably read and test TXGs newer than X

My understanding is like this: When you make a snapshot, you're just creating a named copy of the present latest TXG. When you zfs send incremental from one snapshot to another, you're creating the delta between two TXGs that happen to have names. So when you break a mirror and resilver, it's essentially the same operation as an incremental zfs send: it needs to calculate the delta from the latest (older) TXG on the previously UNAVAIL device up to the latest TXG on the current pool.

Yes, this involves examining the meta tree structure, and yes, the system will be very busy while that takes place. But the workload is very small relative to whatever else you're likely to do with your pool during normal operation, because that's the nature of the meta tree structure ... very small relative to the rest of your data.
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Nico Williams

> I've wanted a system where dedup applies only to blocks being written that have a good chance of being dups of others. I think one way to do this would be to keep a scalable Bloom filter (on disk) into which one inserts block hashes. To decide if a block needs dedup one would first check the Bloom filter, then if the block is in it, use the dedup code path,

How is this different or better than the existing dedup architecture? If you found that some block about to be written in fact matches the hash of an existing block on disk, then you've already determined it's a duplicate block, exactly as you would if you had dedup enabled. In that situation, gosh, it sure would be nice to have the extra information like reference count, and pointer to the duplicate block, which exists in the dedup table. In other words, exactly the way existing dedup is already architected.

> The nice thing about this is that Bloom filters can be sized to fit in main memory, and will be much smaller than the DDT.

If you're storing all the hashes of all the blocks, how is that going to be smaller than the DDT storing all the hashes of all the blocks?
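For readers who, like Ned, hadn't met the idea before: the gating Nico proposes boils down to a few lines. A minimal Python sketch - every class, parameter and write path here is invented for illustration, not taken from ZFS or any proposed patch:

import hashlib

class BloomFilter:
    """Toy Bloom filter: m bits, k hash positions derived from SHA-256."""
    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray(m_bits // 8 + 1)

    def _positions(self, data):
        # Derive k bit positions by hashing the data with k different salts.
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, data):
        for pos in self._positions(data):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, data):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(data))

def write_block(block, bloom, write_dedup, write_plain):
    """Gate the dedup code path on a Bloom-filter hit, as proposed."""
    if bloom.might_contain(block):
        # Possibly seen before (or a false positive): take the DDT path.
        write_dedup(block)
    else:
        # Definitely never seen: write without a DDT entry, remember the hash.
        write_plain(block)
        bloom.add(block)

The point is only that the filter answers "definitely new" cheaply; a hit still has to fall through to a real DDT lookup.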
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
So ... The way things presently are, ideally you would know in advance what stuff you were planning to write that has duplicate copies. You could enable dedup, then write all the stuff that's highly duplicated, then turn off dedup and write all the non-duplicate stuff. Obviously, however, this is a fairly implausible actual scenario.

In reality, while you're writing, you're going to have duplicate blocks mixed in with your non-duplicate blocks, which fundamentally means the system needs to be calculating the cksums and entering them into the DDT, even for the unique blocks... just because the first time the system sees each duplicate block, it doesn't yet know that it's going to be duplicated later. But as you said, after data is written and sits around for a while, the probability of duplicating unique blocks diminishes over time. So they're just a burden.

I would think the ideal situation would be to take your idea of un-dedup for unique blocks, and take it a step further: un-dedup unique blocks that are older than some configurable threshold. Maybe you could have a command for a sysadmin to run, to scan the whole pool performing this operation, but it's the kind of maintenance that really should be done upon access, too. Somebody goes back and reads a jpg from last year, the system reads it and consequently loads the DDT entry, discovers that it's unique and has been for a long time, so throw out the DDT info.

But, by talking about it, we're just smoking pipe dreams. Cuz we all know zfs is developmentally challenged now. But one can dream...

finglonger
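Purely as a sketch of the pipe dream, the on-access aging rule might look like this - nothing like it exists in ZFS, and every name below (ddt.lookup, refcount, birth_time, the six-month threshold) is hypothetical:

import time

UNIQUE_AGE_THRESHOLD = 180 * 24 * 3600  # e.g. drop entries unique for ~6 months

def on_block_read(block_ptr, ddt):
    """Hypothetical maintenance hook run when a deduped block is read."""
    entry = ddt.lookup(block_ptr.checksum)
    if entry is None:
        return  # block was never deduped; nothing to do
    age = time.time() - entry.birth_time
    if entry.refcount == 1 and age > UNIQUE_AGE_THRESHOLD:
        # Unique for a long time: rewrite the block pointer without the
        # dedup bit and drop the DDT entry, shrinking the table.
        block_ptr.dedup = False
        ddt.remove(entry)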
Re: [zfs-discuss] iSCSI access patterns and possible improvements?
From: Richard Elling [mailto:richard.ell...@gmail.com]
Sent: Saturday, January 19, 2013 5:39 PM

> the space allocation more closely resembles a variant of mirroring, like some vendors call RAID-1E

Awesome, thank you. :-)
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
Bloom filters are very small, that's the difference. You might only need a few bits per block for a Bloom filter. Compare to the size of a DDT entry. A Bloom filter could be cached entirely in main memory.
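A back-of-the-envelope comparison makes the point concrete. The Python below assumes ~10 bits per block for the filter and ~320 bytes per in-core DDT entry; both figures are assumptions for illustration (the DDT entry size in particular varies by platform and release):

blocks = 100_000_000          # e.g. ~12 TB of 128K blocks

# Bloom filter: ~10 bits per entry gives roughly a 1% false-positive rate.
bloom_bits_per_block = 10
bloom_bytes = blocks * bloom_bits_per_block / 8

# DDT: assume ~320 bytes per in-core entry (an assumption, not a measured figure).
ddt_bytes_per_entry = 320
ddt_bytes = blocks * ddt_bytes_per_entry

print(f"Bloom filter: {bloom_bytes / 2**30:.1f} GiB")   # ~0.1 GiB
print(f"DDT:          {ddt_bytes / 2**30:.1f} GiB")     # ~29.8 GiB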
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Nico Williams

> To decide if a block needs dedup one would first check the Bloom filter, then if the block is in it, use the dedup code path, else the non-dedup codepath and insert the block in the Bloom filter.

Sorry, I didn't know what a Bloom filter was before I replied before - Now I've read the wikipedia article and am consequently an expert. *sic* ;-)

It sounds like what you're describing... The first time some data gets written, it will not produce a hit in the Bloom filter, so it will get written to disk without dedup. But now it has an entry in the Bloom filter. So the second time the data block gets written (the first duplicate) it will produce a hit in the Bloom filter, and consequently get a dedup DDT entry. But since the system didn't dedup the first one, it means the second one still needs to be written to disk independently of the first one. So in effect, you'll always miss the first duplicated block write, but you'll successfully dedup n-1 duplicated blocks. Which is entirely reasonable, although not strictly optimal.

And sometimes you'll get a false positive out of the Bloom filter, so sometimes you'll be running the dedup code on blocks which are actually unique, but with some intelligently selected parameters such as Bloom table size, you can get this probability to be reasonably small, like less than 1%.

In the wikipedia article, they say you can't remove an entry from the Bloom filter table, which would over time cause a consistent increase of false-positive probability (approaching 100% false positives) from the Bloom filter and consequently a high probability of dedup'ing blocks that are actually unique; but with even a minimal amount of thinking about it, I'm quite sure that's a solvable implementation detail. Instead of storing a single bit for each entry in the table, store a counter. Every time you create a new entry in the table, increment the different locations; every time you remove an entry from the table, decrement. Obviously a counter requires more bits than a bit, but it's a linear increase of size, exponential increase of utility, and within the implementation limits of available hardware. But there may be a more intelligent way of accomplishing the same goal. (Like I said, I've only thought about this minimally).

Meh, well. Thanks for the interesting thought. For whatever it's worth.
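The counter-per-slot idea arrived at above is a known variant, usually called a counting Bloom filter. A minimal Python sketch, with sizes and hash choices picked only for illustration:

import hashlib

class CountingBloomFilter:
    """Bloom filter with a small counter per slot, so entries can be removed."""
    def __init__(self, slots, k_hashes):
        self.slots = slots
        self.k = k_hashes
        self.counts = [0] * slots   # in practice 4-bit counters usually suffice

    def _positions(self, data):
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(h[:8], "big") % self.slots

    def add(self, data):
        for pos in self._positions(data):
            self.counts[pos] += 1

    def remove(self, data):
        # Only valid for items previously added, otherwise counters would underflow.
        for pos in self._positions(data):
            if self.counts[pos] > 0:
                self.counts[pos] -= 1

    def might_contain(self, data):
        return all(self.counts[pos] > 0 for pos in self._positions(data))

Freed or rewritten blocks could then be removed from the filter as they are released, which keeps the false-positive rate from creeping toward 100% as the pool churns.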
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
On 19 January, 2013 - Jim Klimov sent me these 2,0K bytes:

> Hello all, While revising my home NAS which had dedup enabled before I gathered that its RAM capacity was too puny for the task, I found that there is some deduplication among the data bits I uploaded there (makes sense, since it holds backups of many of the computers I've worked on - some of my homedirs' contents were bound to intersect). However, a lot of the blocks are in fact unique - have entries in the DDT with count=1 and the blkptr_t bit set. In fact they are not deduped, and with my pouring of backups complete - they are unlikely to ever become deduped.

Another RFE would be 'zfs dedup mypool/somefs' and basically go through and do a one-shot dedup. Would be useful in various scenarios. Possibly go through the entire pool at once, to make dedups intra-datasets (like the real thing).

/Tomas

--
Tomas Forsman, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors
Am 20.01.13 16:51, schrieb Edward Ned Harvey (opensolarisisdeadlongliveopensolaris):

> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Stephan Budach
>> I am always experiencing chksum errors while scrubbing my zpool(s), but I never experienced chksum errors while resilvering. Does anybody know why that would be?
> When you resilver, you're not reading all the data on all the drives. Only just enough to resilver, which doesn't include all the data that was previously in-sync (maybe a little of it, but mostly not). Even if you have a completely failed drive, replaced with a completely new empty drive, if you have a 3-way mirror, you only need to read one good copy of the data in order to write the resilver'd data onto the new drive. So you could still be failing to detect cksum errors on the *other* side of the mirror, which wasn't read during the resilver. What's more, when you resilver, the system is just going to write the target disk. Not go back and verify every written block of the target disk. So, think of a scrub as a complete, thorough resilver, whereas a resilver is just a lightweight version, doing only the parts that are known to be out-of-sync, and without subsequent read verification.

Well, I always used to issue a scrub after resilver, but since we completely re-designed our server room, things started to act up and each scrub would at least come up with chksum errors. On the Fire 4170 I only noticed these chksum errors, while on the Dell sometimes the whole thing broke down and ZFS would mark numerous disks as faulted.

>> This happens on all of my servers, Sun Fire 4170M2, Dell PE 650 and on any FC storage that I have.
> While you apparently have been able to keep the system in production for a while, consider yourself lucky. You have a real problem, and solving it probably won't be easy. Your problem is either hardware, firmware, or drivers. If you have a support contract on the Sun, I would recommend starting there. The Dell is definitely a configuration that you won't find official support for - just a lot of community contributors, who will likely not provide a super awesome answer for you super soon.

I know, I dedicated quite some of my time to keep this setup up and running. I do have support coverage for my two Sun Solaris servers, but as you may have experienced as well, you're sometimes better off asking here first… ;)

I have gone over our SAN setup/topology and maybe I have found at least one issue worth looking at: we do have five QLogic 5600 SanBoxes and one of them basically operates as a core switch, where all other ISLs are hooked up. That is, this switch has 4 ISLs and 12 storage array connects, while the Dell sits on another Sanbox and thus all traffic is routed through that switch. I don't know, but maybe this is a bit too much for this setup; the Dell hosts around 240 drives, which are mostly located on a neighbour switch. I will try and tweak this setup such that the Dell gets a connection on that Sanbox directly, which will vastly reduce the inter-switch traffic.
I am also seeing these warnings in /var/adm/messages on both the Dell and my new Sun Server X2:

Jan 20 18:22:10 solaris11b scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,3c08@3/pci1077,171@0,1/fp@0,0 (fcp0):
Jan 20 18:22:10 solaris11b SCSI command to d_id=0x10601 lun=0x0 failed, Bad FCP response values: rsvd1=0, rsvd2=0, sts-rsvd1=0, sts-rsvd2=0, rsplen=0, senselen=0
Jan 20 18:22:10 solaris11b scsi: [ID 243001 kern.warning] WARNING: /pci@0,0/pci8086,3c08@3/pci1077,171@0,1/fp@0,0 (fcp0):
Jan 20 18:22:10 solaris11b SCSI command to d_id=0x30e01 lun=0x1 failed, Bad FCP response values: rsvd1=0, rsvd2=0, sts-rsvd1=0, sts-rsvd2=0, rsplen=0, senselen=0

These are always targeted at LUNs on remote Sanboxes…
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
On 2013-01-20 19:55, Tomas Forsman wrote:
> On 19 January, 2013 - Jim Klimov sent me these 2,0K bytes:
>> Hello all, While revising my home NAS which had dedup enabled before I gathered that its RAM capacity was too puny for the task, I found that there is some deduplication among the data bits I uploaded there (makes sense, since it holds backups of many of the computers I've worked on - some of my homedirs' contents were bound to intersect). However, a lot of the blocks are in fact unique - have entries in the DDT with count=1 and the blkptr_t bit set. In fact they are not deduped, and with my pouring of backups complete - they are unlikely to ever become deduped.
> Another RFE would be 'zfs dedup mypool/somefs' and basically go through and do a one-shot dedup. Would be useful in various scenarios. Possibly go through the entire pool at once, to make dedups intra-datasets (like the real thing).

Yes, but that was asked before =) Actually, the pool's metadata does contain all the needed bits (i.e. checksum and size of blocks) such that a scrub-like procedure could try and find same blocks among unique ones (perhaps with a filter of this block being referenced from a dataset that currently wants dedup), throw one out and add a DDT entry to another.

On 2013-01-20 17:16, Edward Harvey wrote:
> So ... The way things presently are, ideally you would know in advance what stuff you were planning to write that has duplicate copies. You could enable dedup, then write all the stuff that's highly duplicated, then turn off dedup and write all the non-duplicate stuff. Obviously, however, this is a fairly implausible actual scenario.

Well, I guess I could script a solution that uses ZDB to dump the blockpointer tree (about 100Gb of text on my system), and some perl or sort/uniq/grep parsing over this huge text to find blocks that are the same but not deduped - as well as those single-copy deduped ones, and toggle the dedup property while rewriting the block inside its parent file with DD. This would all be within current ZFS's capabilities and ultimately reach the goals of deduping pre-existing data as well as dropping unique blocks from the DDT.

It would certainly not be a real-time solution (likely might take months on my box - just fetching the BP tree took a couple of days) and would require more resources than needed otherwise (rewrites of same userdata, storing and parsing of addresses as text instead of binaries, etc.) But I do see how this is doable even today even by a non-expert ;) (Not sure I'd ever get around to actually doing this, though - it is not a very clean solution nor a performant one).

As a bonus, however, this ZDB dump would also provide an answer to a frequently-asked question: which files on my system intersect or are the same - and have some/all blocks in common via dedup? Knowledge of this answer might help admins with some policy decisions, be it witch-hunt for hoarders of same files or some pattern-making to determine which datasets should keep dedup=on...

My few cents,
//Jim
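For what it's worth, the parsing pass Jim describes could start out roughly like the Python below. It assumes the ZDB block-pointer dump has already been reduced to one record per line of "dataset object offset checksum dedup-flag" - the real zdb output is far messier, and every field name here is made up:

import sys
from collections import defaultdict

def find_undeduped_duplicates(dump_path):
    """Group block-pointer records by checksum and report repeats.

    Expects one record per line: dataset, object, offset, checksum, dedup-flag.
    """
    by_checksum = defaultdict(list)
    with open(dump_path) as dump:
        for line in dump:
            dataset, obj, offset, checksum, dedup_flag = line.split()
            by_checksum[checksum].append((dataset, obj, offset, dedup_flag))

    for checksum, blocks in by_checksum.items():
        deduped = [b for b in blocks if b[3] == "D"]
        if len(blocks) > 1 and len(deduped) < len(blocks):
            # Same checksum stored more than once, not (fully) deduped:
            # candidates for a rewrite with dedup=on.
            print(checksum, *(f"{d}:{o}:{off}" for d, o, off, _ in blocks))
        elif len(blocks) == 1 and deduped:
            # A count=1 DDT entry: candidate for dropping from the DDT.
            print("unique-deduped", checksum, blocks[0][:3])

if __name__ == "__main__":
    find_undeduped_duplicates(sys.argv[1])

An in-memory dict obviously won't hold a 100Gb dump; in practice the grouping step would be an on-disk sort/uniq pass, as Jim suggests.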
Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors
On 2013-01-20 16:56, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote:

> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov
>> And regarding the considerable activity - AFAIK there is little way for ZFS to reliably read and test TXGs newer than X
> My understanding is like this: When you make a snapshot, you're just creating a named copy of the present latest TXG. When you zfs send incremental from one snapshot to another, you're creating the delta between two TXGs that happen to have names. So when you break a mirror and resilver, it's essentially the same operation as an incremental zfs send: it needs to calculate the delta from the latest (older) TXG on the previously UNAVAIL device up to the latest TXG on the current pool. Yes, this involves examining the meta tree structure, and yes, the system will be very busy while that takes place. But the workload is very small relative to whatever else you're likely to do with your pool during normal operation, because that's the nature of the meta tree structure ... very small relative to the rest of your data.

Hmmm... Given that many people use automatic snapshots, those do provide us many roots for branches of the block-pointer tree after a certain TXG (creation of snapshot and the next live variant of the dataset). This might allow resilvering to quickly select only those branches of the metadata tree that are known or assumed to have changed after a disk was temporarily lost - and not go over datasets (snapshots) that are known to have been committed and closed (became read-only) while that disk was online.

I have no idea if this optimization does take place in ZFS code, but it seems bound to be there... if not - a worthy RFE, IMHO ;)

//Jim
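The pruning Jim hopes for falls out naturally from birth-TXG bookkeeping in a copy-on-write tree. A schematic Python sketch - whether the real resilver code is organized this way is exactly Jim's open question, and the node fields and callbacks below are invented for illustration:

def resilver_walk(node, last_good_txg, repair):
    """Visit only subtrees modified after last_good_txg.

    In a copy-on-write tree a parent is rewritten whenever any child changes,
    so a parent's birth TXG is never older than its children's. If a node was
    born at or before last_good_txg (the newest TXG known to be on the disk
    that dropped out), the whole subtree below it was already on that disk
    and can be skipped.
    """
    if node.birth_txg <= last_good_txg:
        return
    repair(node)                     # re-copy this block to the returning disk
    for child in node.children:      # indirect blocks point at child blocks
        resilver_walk(child, last_good_txg, repair)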
Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors
Did you try replacing the patch-cables and/or SFPs on the path between servers and disks, or at least cleaning them? A speck of dust (or, God forbid, a pixel of body fat from a fingerprint) caught between the two optic cable cutoffs might cause any kind of signal weirdness from time to time... and lead to improper packets of that optic protocol.

Are there switch stats on whether it has seen media errors?

//Jim
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
On Jan 20, 2013, at 8:16 AM, Edward Harvey imaginat...@nedharvey.com wrote:

> But, by talking about it, we're just smoking pipe dreams. Cuz we all know zfs is developmentally challenged now. But one can dream...

I disagree that ZFS is developmentally challenged. There is more development now than ever in every way: # of developers, companies, OSes, KLOCs, features. Perhaps the level of maturity makes progress appear to be moving slower than it seems in early life?

-- richard
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
On 2013-01-20 17:16, Edward Harvey wrote:

> But, by talking about it, we're just smoking pipe dreams. Cuz we all know zfs is developmentally challenged now. But one can dream...

I beg to disagree. Most of my contribution so far has been about learning stuff and sharing it with others, as well as planting some new ideas and (hopefully, seen as constructively) doubting others - including the implementation we have now - and I have yet to see someone pick up my ideas and turn them into code (or prove why they are rubbish). Still, overall I can't say that development has stagnated, by any metric of stagnation or activity. Yes, maybe there were more cool new things per year popping up with Sun's concentrated engineering talent and financing, but now it seems that most players - wherever they work now - took a pause from the marathon, to refine what was done in the decade before. And this is just as important as churning out innovations faster than people can comprehend or audit or use them.

As a loud example of present active development - take the LZ4 quests completed by Saso recently. From what I gather, this is a single man's job done on-line in the view of fellow list members over a few months, almost like a reality show; and I guess anyone with enough concentration, time and devotion could do likewise. I suspect many of my proposals to the list might also take some half of a man-year to complete.

Unfortunately for the community and for part of myself, I now have some higher daily priorities, so I likely won't sit down and code lots of stuff in the nearest years (until that Priority goes to school, or so). Maybe that's why I'm eager to suggest quests for brilliant coders here who can complete the job better and faster than I ever would ;) So I'm doing the next best things I can do to help the progress :)

And I don't believe this is in vain, that development has ceased and my writings are only destined to be stuffed under the carpet. Be it these RFEs or some others, better and more useful, I believe they shall be coded and published in common ZFS code. Sometime...

//Jim
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
On Sun, Jan 20, 2013 at 6:19 PM, Richard Elling richard.ell...@gmail.com wrote:
> On Jan 20, 2013, at 8:16 AM, Edward Harvey imaginat...@nedharvey.com wrote:
>> But, by talking about it, we're just smoking pipe dreams. Cuz we all know zfs is developmentally challenged now. But one can dream...
> I disagree that ZFS is developmentally challenged. There is more development now than ever in every way: # of developers, companies, OSes, KLOCs, features. Perhaps the level of maturity makes progress appear to be moving slower than it seems in early life?
> -- richard

Well, perhaps a part of it is marketing. Maturity isn't really an excuse for not having a long-term feature roadmap. It seems as though maturity in this case equals stagnation. What are the features being worked on we aren't aware of? The big ones that come to mind that everyone else is talking about for not just ZFS but openindiana as a whole and other storage platforms would be:

1. SMB3 - hyper-v WILL be gaining market share over the next couple years; not supporting it means giving up a sizeable portion of the market. Not to mention finally being able to run SQL (again) and Exchange on a fileshare.
2. VAAI support.
3. The long-sought bp-rewrite.
4. Full drive encryption support.
5. Tiering (although I'd argue caching is superior, it's still a checkbox).

There's obviously more, but those are just ones off the top of my head that others are supporting/working on. Again, it just feels like all the work is going into fixing bugs and refining what is there, not adding new features. Obviously Saso personally added features, but overall there don't seem to be a ton of announcements to the list about features that have been added or are being actively worked on. It feels like all these companies are just adding niche functionality they need that may or may not be getting pushed back to mainline.

/debbie-downer
Re: [zfs-discuss] RFE: Un-dedup for unique blocks
On Jan 20, 2013, at 4:51 PM, Tim Cook t...@cook.ms wrote:
> On Sun, Jan 20, 2013 at 6:19 PM, Richard Elling richard.ell...@gmail.com wrote:
>> On Jan 20, 2013, at 8:16 AM, Edward Harvey imaginat...@nedharvey.com wrote:
>>> But, by talking about it, we're just smoking pipe dreams. Cuz we all know zfs is developmentally challenged now. But one can dream...
>> I disagree that ZFS is developmentally challenged. There is more development now than ever in every way: # of developers, companies, OSes, KLOCs, features. Perhaps the level of maturity makes progress appear to be moving slower than it seems in early life?
>> -- richard
> Well, perhaps a part of it is marketing.

A lot of it is marketing :-/

> Maturity isn't really an excuse for not having a long-term feature roadmap. It seems as though maturity in this case equals stagnation. What are the features being worked on we aren't aware of?

Most of the illumos-centric discussion is on the developer's list. The ZFSonLinux and BSD communities are also quite active. Almost none of the ZFS developers hang out on this zfs-discuss@opensolaris.org list anymore. In fact, I wonder why I'm still here...

> The big ones that come to mind that everyone else is talking about for not just ZFS but openindiana as a whole and other storage platforms would be:
> 1. SMB3 - hyper-v WILL be gaining market share over the next couple years; not supporting it means giving up a sizeable portion of the market. Not to mention finally being able to run SQL (again) and Exchange on a fileshare.

I know of at least one illumos community company working on this. However, I do not know their public plans.

> 2. VAAI support.

VAAI has 4 features, 3 of which have been in illumos for a long time. The remaining feature (SCSI UNMAP) was done by Nexenta and exists in their NexentaStor product, but the CEO made a conscious (and unpopular) decision to keep that code from the community. Over the summer, another developer picked up the work in the community, but I've lost track of the progress and haven't seen an RTI yet.

> 3. The long-sought bp-rewrite.

Go for it!

> 4. Full drive encryption support.

This is a key management issue mostly. Unfortunately, the open source code for handling this (trousers) covers much more than keyed disks and can be unwieldy. I'm not sure which distros picked up trousers, but it doesn't belong in the illumos-gate and it doesn't expose itself to ZFS.

> 5. Tiering (although I'd argue caching is superior, it's still a checkbox).

You want to add tiering to the OS? That has been available for a long time via the (defunct?) SAM-QFS project that actually delivered code: http://hub.opensolaris.org/bin/view/Project+samqfs/ If you want to add it to ZFS, that is a different conversation.
-- richard

> There's obviously more, but those are just ones off the top of my head that others are supporting/working on. Again, it just feels like all the work is going into fixing bugs and refining what is there, not adding new features. Obviously Saso personally added features, but overall there don't seem to be a ton of announcements to the list about features that have been added or are being actively worked on. It feels like all these companies are just adding niche functionality they need that may or may not be getting pushed back to mainline.
> /debbie-downer

--
richard.ell...@richardelling.com
+1-760-896-4422
Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors
Am 21.01.13 00:21, schrieb Jim Klimov:

> Did you try replacing the patch-cables and/or SFPs on the path between servers and disks, or at least cleaning them? A speck of dust (or, God forbid, a pixel of body fat from a fingerprint) caught between the two optic cable cutoffs might cause any kind of signal weirdness from time to time... and lead to improper packets of that optic protocol.

I cleaned the patch cables that run from the Dell to its Sanbox, but not the other ones - especially not the ISLs, since this would almost interrupt our SAN.

> Are there switch stats on whether it has seen media errors?

Has anybody gotten QLogic's SanSurfer to work with anything newer than Java 1.4.2? ;) I checked the logs on my switches and they don't seem to indicate such issues, but I am lacking the real-time monitoring that the old SanSurfer provides.

Stephan