Re: [zfs-discuss] File contents changed with no ZFS error
Why don't you see which byte differs, and how it differs? Maybe that would suggest the failure mode. Is it the same byte data in all affected files, for instance?

Mark

Sent from my iPhone

On Oct 22, 2011, at 2:08 PM, Robert Watzlavick rob...@watzlavick.com wrote:

On Oct 22, 2011, at 13:14, Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

How can you rule out the possibility that something changed the file intentionally, not as a form of filesystem corruption?

I suppose that's possible but it seems unlikely. One byte in a file changed on disk with no corresponding change in the mod time seems unlikely. I did access that file for read sometime in the past few months, but again, if it had accidentally been written to, the time would have been updated.

If you have snapshots on your ZFS filesystem, you can use zhist (or whatever technique you want) to see in which snapshot(s) it changed, and find all the unique versions of it. 'Course that will only give you any valuable information if you have different versions of the file in different snapshots.

I only have one or two snapshots but I'll look.

Thanks, -Bob

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
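If an old copy of the file survives in a snapshot, locating the differing byte is a one-liner with `cmp -l`. A minimal sketch using two stand-in files (the file names are hypothetical; in the real case the second copy would come from `.zfs/snapshot/<name>/...`):

```shell
# Two stand-in copies of a file, differing in exactly one byte
# (playing the roles of the live file and the snapshot copy).
printf 'hello world\n' > live.bin
printf 'hellp world\n' > snap.bin

# cmp -l lists every differing byte as:
#   <1-based offset> <octal value in file1> <octal value in file2>
# cmp exits nonzero when the files differ, hence the || true.
cmp -l live.bin snap.bin || true
```

Here it reports byte 5 as 0157 ('o') versus 0160 ('p'), which answers both of Mark's questions: where the byte is, and how it changed.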
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 11:09 AM, Nico Williams wrote:

On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote:

I just wanted to add something on fsck on ZFS - because for me that used to make ZFS 'not ready for prime-time' in 24x7, 5+ 9s uptime environments. Where ZFS doesn't have an fsck command - and that really used to bug me - it does now have a -F option on zpool import. To me it's the same functionality for my environment - the ability to try to roll back to a 'hopefully' good state and get the filesystem mounted up, leaving the corrupted data objects corrupted. [...]

Yes, that's exactly what it is. There's no point calling it fsck, because fsck fixes individual filesystems, while ZFS fixups need to happen at the volume level (at volume import time). It's true that this should have been in ZFS from the word go. But it's there now, and that's what matters, IMO.

Doesn't a scrub do more than what 'fsck' does?

It's also true that this was never necessary with hardware that doesn't lie, but it's good to have it anyways, and is critical for personal systems such as laptops.

IIRC, fsck was seldom needed at my former site once UFS journalling became available. Sweet update.

Mark
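For reference, the two mechanisms being compared above are driven like this (the pool name is hypothetical, and -F only matters when a normal import fails):

```shell
# Rewind import: if the most recent transaction groups are damaged,
# -F discards them and imports the pool at the last consistent state,
# sacrificing the final few seconds of writes rather than the pool.
zpool import -F tank

# Scrub: unlike fsck, it reads every allocated block in the pool,
# verifies its checksum, and repairs bad copies from mirror/raidz
# redundancy -- all while the pool stays online.
zpool scrub tank
zpool status tank    # shows scrub progress and any errors repaired
```

This is why a scrub "does more than fsck": fsck only checks metadata consistency, while a scrub verifies the integrity of the data blocks themselves.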
Re: [zfs-discuss] Large scale performance query
Shouldn't the choice of RAID type also be based on the I/O requirements?

Anyway, with RAID-10, even a second failed disk is not catastrophic, so long as it is not the counterpart of the first failed disk, no matter the number of disks. (With 2-way mirrors.) But that's why we do backups, right?

Mark

Sent from my iPhone

On Aug 6, 2011, at 7:01 AM, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote:

Ok, so mirrors resilver faster. But it is not uncommon that another disk shows problems during resilver (for instance r/w errors); this scenario would mean your entire raid is gone, right? If you are using mirrors, and one disk crashes and you start a resilver, and then the other disk shows r/w errors because of the increased load - then you are screwed? Because large disks take a long time to resilver, possibly weeks? In that case, it would be preferable to use mirrors with 3 disks in each vdev. Tri-mirrors. Each vdev should be one raidz3.

-- This message posted from opensolaris.org
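Mark's "counterpart" point can be made concrete: with N 2-way mirror vdevs, after one disk fails, only 1 of the remaining 2N-1 disks is fatal if it fails next. A quick sketch of the odds (the pool size here is an assumption, not from the thread):

```shell
# With N two-way mirror vdevs (2N disks total), one disk has failed.
# A second random failure loses the pool only if it hits the surviving
# half of the degraded mirror: 1 chance out of the 2N-1 other disks.
N=12   # hypothetical pool: 12 mirror pairs, 24 disks
awk -v n="$N" 'BEGIN {
    printf "disks remaining: %d\n", 2*n - 1
    printf "chance second failure is fatal: %.1f%%\n", 100 / (2*n - 1)
}'
```

For the hypothetical 24-disk pool that works out to roughly a 4% chance, which is why a second failure during a mirror resilver is usually survivable; tri-mirrors drive it to zero for any single pair.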
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 2:37 AM, Ian Collins i...@ianshome.com wrote:

On 04/ 8/11 06:30 PM, Erik Trimble wrote:

On 4/7/2011 10:25 AM, Chris Banal wrote:

While I understand everything at Oracle is top secret these days, does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun?

http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html

What do X4500 / X4540 owners use if they'd like more comparable ZFS-based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle-supported hardware. However, does anyone know if these types of vendors will be at NAB this year? I'd like to talk to a few if they are...

The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff.

http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html

Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances.

Can you elaborate briefly on what exactly the problem is? I don't follow. What else would an X4540 or a 7xxx box be used for, other than a storage appliance? Guess I'm slow. :-)

Mark
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 3:29 AM, Ian Collins i...@ianshome.com wrote:

On 04/ 8/11 08:08 PM, Mark Sandrock wrote:

On Apr 8, 2011, at 2:37 AM, Ian Collins i...@ianshome.com wrote:

On 04/ 8/11 06:30 PM, Erik Trimble wrote:

On 4/7/2011 10:25 AM, Chris Banal wrote:

While I understand everything at Oracle is top secret these days, does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun?

http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html

What do X4500 / X4540 owners use if they'd like more comparable ZFS-based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle-supported hardware. However, does anyone know if these types of vendors will be at NAB this year? I'd like to talk to a few if they are...

The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff.

http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html

Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances.

Can you elaborate briefly on what exactly the problem is? I don't follow. What else would an X4540 or a 7xxx box be used for, other than a storage appliance? Guess I'm slow. :-)

No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a fail-over box (both data and applications) for the other.

You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDAV out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites.

I see AR clusters going in at local high schools and small universities. Anything's a fraction of the price of a SAN, isn't it? :-)

Mark

They replaced several application servers backed by a SAN for a fraction of the price of a new SAN.

-- Ian.
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 7:50 AM, Evaldas Auryla evaldas.aur...@edqm.eu wrote:

On 04/ 8/11 01:14 PM, Ian Collins wrote:

You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDAV out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in at local high schools and small universities.

Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more X4540s!

Hi, same here. It's sad news that Oracle decided to stop the X4540 production line. Before, ZFS geeks had a choice - buy the 7000 series if you want quick out-of-the-box storage with a nice GUI, or build your own storage with the X4540 line, which by the way has a brilliant engineering design. That choice is gone now.

Okay, so what is the great advantage of an X4540 versus an x86 server plus disk array(s)?

Mark
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 9:39 PM, Ian Collins i...@ianshome.com wrote:

On 04/ 9/11 03:20 AM, Mark Sandrock wrote:

On Apr 8, 2011, at 7:50 AM, Evaldas Auryla evaldas.aur...@edqm.eu wrote:

On 04/ 8/11 01:14 PM, Ian Collins wrote:

You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDAV out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in at local high schools and small universities.

Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more X4540s!

Hi, same here. It's sad news that Oracle decided to stop the X4540 production line. Before, ZFS geeks had a choice - buy the 7000 series if you want quick out-of-the-box storage with a nice GUI, or build your own storage with the X4540 line, which by the way has a brilliant engineering design. That choice is gone now.

Okay, so what is the great advantage of an X4540 versus an x86 server plus disk array(s)?

One less x86 box (even more of an issue now we have to mortgage the children for support), and a lot less $. Not to mention an existing infrastructure built using X4540s, and me looking a fool explaining to the client that they can't get any more, so the systems we have spent two years building up are a dead end. One size does not fit all; choice is good for business.

I'm not arguing. If it were up to me, we'd still be selling those boxes.

Mark

-- Ian.
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 11:19 PM, Ian Collins i...@ianshome.com wrote:

On 04/ 9/11 03:53 PM, Mark Sandrock wrote:

I'm not arguing. If it were up to me, we'd still be selling those boxes.

Maybe you could whisper in the right ear?

I wish. I'd have a long list if I could do that.

Mark

:) -- Ian.
Re: [zfs-discuss] Any use for extra drives?
On Mar 24, 2011, at 7:23 AM, Anonymous wrote:

Generally, you choose your data pool config based on data size, redundancy, and performance requirements. If those are all satisfied with your single mirror, the only thing left for you to do is think about splitting your data off onto a separate pool due to better performance etc. (Because there are things you can't do with the root pool, such as striping and raidz.) That's all there is to it. To split, or not to split.

Thanks for the update. I guess there's not much to do for this box since it's a development machine and doesn't have much need for extra redundancy, although if I had had some extra 500s I would have liked to stripe the root pool. I see from your answer that's not possible anyway. Cheers.

If you plan to generate a lot of data, why use the root pool? You can put the /home and /proj filesystems (/export/...) on a separate pool, thus off-loading the root pool.

My two cents,
Mark
Re: [zfs-discuss] Any use for extra drives?
On Mar 24, 2011, at 5:42 AM, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Nomen Nescio

Hi ladies and gents, I've got a new Solaris 10 development box with a ZFS mirror root using 500G drives. I've got several extra 320G drives and I'm wondering if there's any way I can use these to good advantage in this box. I've got enough storage for my needs with the 500G pool. At this point I would be looking for a way to speed things up if possible, or add redundancy if necessary, but I understand I can't use these smaller drives to stripe the root pool, so what would you suggest? Thanks.

Generally, you choose your data pool config based on data size, redundancy, and performance requirements. If those are all satisfied with your single mirror, the only thing left for you to do is think about splitting your data off onto a separate pool due to better performance etc. (Because there are things you can't do with the root pool, such as striping and raidz.) That's all there is to it. To split, or not to split.

I'd just put /export/home on this second set of drives, as a striped mirror. Same as I would have done in the old days under SDS. :-)

Mark
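The split Mark suggests might look like this; the pool name and device names below are hypothetical, and assume four spare 320G drives:

```shell
# Build a separate data pool from the spare 320G drives as a striped
# mirror: two 2-way mirror vdevs, striped across. ZFS stripes across
# all top-level vdevs automatically -- the restriction on striping
# applies only to the root pool.
zpool create export mirror c1t2d0 c1t3d0 mirror c1t4d0 c1t5d0

# Put user data on the new pool, off-loading the root pool.
zfs create -o mountpoint=/export/home export/home
```

This is the ZFS equivalent of the old SDS/SVM mirrored-stripe metadevice, with the bonus that adding another `mirror` pair later grows the stripe on the fly.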
Re: [zfs-discuss] ZFS and spindle speed (7.2k / 10k / 15k)
On Feb 2, 2011, at 8:10 PM, Eric D. Mudama wrote:

All other things being equal, the 15k and the 7200 drive, which share electronics, will have the same max transfer rate at the OD.

Is that true? So the only difference is in the access time?

Mark
Re: [zfs-discuss] Best choice - file system for system
Why do you say fssnap has the same problem? If it write-locks the file system, it is only for a matter of seconds, as I recall. Years ago, I used it on a daily basis to do ufsdumps of large fs'es.

Mark

On Jan 30, 2011, at 5:41 PM, Torrey McMahon wrote:

On 1/30/2011 5:26 PM, Joerg Schilling wrote:

Richard Elling richard.ell...@gmail.com wrote:

ufsdump is the problem, not ufsrestore. If you ufsdump an active file system, there is no guarantee you can ufsrestore it. The only way to guarantee this is to keep the file system quiesced during the entire ufsdump. Needless to say, this renders ufsdump useless for backup when the file system also needs to accommodate writes.

This is why there is a ufs snapshot utility.

You'll have the same problem. fssnap_ufs(1M) write locks the file system when you run the lock command. See the notes section of the man page.

http://download.oracle.com/docs/cd/E19253-01/816-5166/6mbb1kq1p/index.html#Notes
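The fssnap/ufsdump workflow Mark describes is roughly the following sketch; the filesystem, backing-store path, and dump target are hypothetical, and it must run on a Solaris host with UFS:

```shell
# Create a UFS snapshot. The write-lock applies only while the
# snapshot is being established -- typically seconds -- not for the
# duration of the backup. fssnap prints the snapshot pseudo-device.
fssnap -F ufs -o bs=/var/tmp /export/home

# Dump the quiesced snapshot (raw device) instead of the live fs,
# so ufsdump sees a frozen, consistent image while writes continue.
ufsdump 0uf /backup/home.dump /dev/rfssnap/0

# Release the snapshot when the dump completes.
fssnap -d /export/home
```

The backing store absorbs copy-on-write blocks while the snapshot exists, which is why the live filesystem stays writable during the dump.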
Re: [zfs-discuss] Best choice - file system for system
iirc, we would notify the user community that the FS'es were going to hang briefly. Locking the FS'es is the best way to quiesce them, when users are worldwide, imo.

Mark

On Jan 31, 2011, at 9:45 AM, Torrey McMahon wrote:

A matter of seconds is a long time for a running Oracle database. The point is that if you have to keep writing to a UFS filesystem - when the file system also needs to accommodate writes - you're still out of luck. If you can quiesce the apps, great, but if you can't then you're still stuck. In other words, fssnap_ufs doesn't solve the quiesce problem.

On 1/31/2011 10:24 AM, Mark Sandrock wrote:

Why do you say fssnap has the same problem? If it write-locks the file system, it is only for a matter of seconds, as I recall. Years ago, I used it on a daily basis to do ufsdumps of large fs'es.

Mark

On Jan 30, 2011, at 5:41 PM, Torrey McMahon wrote:

On 1/30/2011 5:26 PM, Joerg Schilling wrote:

Richard Elling richard.ell...@gmail.com wrote:

ufsdump is the problem, not ufsrestore. If you ufsdump an active file system, there is no guarantee you can ufsrestore it. The only way to guarantee this is to keep the file system quiesced during the entire ufsdump. Needless to say, this renders ufsdump useless for backup when the file system also needs to accommodate writes.

This is why there is a ufs snapshot utility.

You'll have the same problem. fssnap_ufs(1M) write locks the file system when you run the lock command. See the notes section of the man page.

http://download.oracle.com/docs/cd/E19253-01/816-5166/6mbb1kq1p/index.html#Notes
Re: [zfs-discuss] A few questions
On Dec 18, 2010, at 12:23 PM, Lanky Doodle wrote:

Now this is getting really complex, but can you have server failover in ZFS, much like DFS-R in Windows - you point clients to a clustered ZFS namespace so if a complete server failed nothing is interrupted.

This is the purpose of an Amber Road dual-head cluster (7310C, 7410C, etc.) -- not only the storage pool fails over, but also the server IP address fails over, so that NFS, etc. shares remain active when one storage head goes down. Amber Road uses ZFS, but the clustering and failover are not related to the filesystem type.

Mark
Re: [zfs-discuss] A few questions
Erik, just a hypothetical what-if ...

In the case of resilvering on a mirrored disk, why not take a snapshot, and then resilver by doing a pure block copy from the snapshot? It would be sequential, so long as the original data was unmodified; and random access in dealing with the modified blocks only, right?

After the original snapshot had been replicated, a second pass would be done, in order to update the clone to 100% live data.

Not knowing enough about the inner workings of ZFS snapshots, I don't know why this would not be doable. (I'm biased towards mirrors for busy filesystems.) I'm supposing that a block-level snapshot is not doable -- or is it?

Mark

On Dec 20, 2010, at 1:27 PM, Erik Trimble wrote:

On 12/20/2010 9:20 AM, Saxon, Will wrote:

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
Sent: Monday, December 20, 2010 11:46 AM
To: 'Lanky Doodle'; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] A few questions

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Lanky Doodle

I believe Oracle is aware of the problem, but most of the core ZFS team has left. And of course, a fix for Oracle Solaris no longer means a fix for the rest of us.

OK, that is a bit concerning then. As good as ZFS may be, I'm not sure I want to commit to a file system that is 'broken' and may not be fully fixed, if at all.

ZFS is not broken. It is, however, a weak spot that resilver is very inefficient. For example: on my server, which is made up of 10krpm SATA drives, 1TB each, my drives can each sustain 1Gbit/sec sequential read/write. This means, if I needed to resilver the entire drive (in a mirror) sequentially, it would take 8,000 sec = 133 minutes. About 2 hours. In reality, I have ZFS mirrors, disks are around 70% full, and resilver takes 12-14 hours.

So although resilver is broken by some standards, it is bounded, and you can limit it to something which is survivable, by using mirrors instead of raidz. For most people, even using 5-disk or 7-disk raidzN will still be fine. But you start getting unsustainable if you get up to 21-disk raidz3, for example.

This argument keeps coming up on the list, but I don't see where anyone has made a good suggestion about whether this can even be 'fixed' or how it would be done. As I understand it, you have two basic types of array reconstruction: in a mirror you can make a block-by-block copy and that's easy, but in a parity array you have to perform a calculation on the existing data and/or existing parity to reconstruct the missing piece. This is pretty easy when you can guarantee that all your stripes are the same width, start/end on the same sectors/boundaries/whatever, and thus know a piece of them lives on all drives in the set. I don't think this is possible with ZFS since we have variable stripe width. A failed disk d may or may not contain data from stripe s (or transaction t). This information has to be discovered by looking at the transaction records. Right?

Can someone speculate as to how you could rebuild a variable-stripe-width array without replaying all the available transactions? I am no filesystem engineer but I can't wrap my head around how this could be handled any better than it already is. I've read that resilvering is throttled - presumably to keep performance degradation to a minimum during the process - maybe this could be a tunable (e.g. priority: low, normal, high)? Do we know if resilvers on a mirror are actually handled differently from those on a raidz?

Sorry if this has already been explained. I think this is an issue that everyone who uses ZFS should understand completely before jumping in, because the behavior (while not 'wrong') is clearly NOT the same as with more conventional arrays.

-Will

The problem is NOT the checksum/error correction overhead. That's relatively trivial. The problem isn't really even variable-width slabs (i.e. a variable number of disks one crosses). The problem boils down to this: when ZFS does a resilver, it walks the METADATA tree to determine what order to rebuild things in. That means it resilvers the very first slab ever written, then the next oldest, etc. The problem here is that slab age has nothing to do with where that data physically resides on the actual disks. If you've used the zpool as a WORM device, then, sure, there should be a strict correlation between increasing slab age and locality on the disk. However, in any reasonable case, files get deleted regularly. This means that the probability is high that a slab B, written immediately after slab A, WON'T be physically near slab A. In the end, the problem is that using metadata order, while reducing the total amount of work to do in the resilver (as
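Edward's back-of-envelope numbers above check out. A quick sketch of the best-case sequential rewrite time for the drive he describes:

```shell
# Best case from the post: a 1 TB mirror member sustaining 1 Gbit/s
# sequential could be rewritten end-to-end in about 2 hours. The
# observed 12-14 hours is the cost of metadata-order random seeks.
awk 'BEGIN {
    bits = 1e12 * 8      # 1 TB drive capacity, in bits
    rate = 1e9           # 1 Gbit/s sustained sequential rate
    secs = bits / rate
    printf "%d sec = %d minutes\n", secs, secs / 60
}'
```

This reproduces the "8,000 sec = 133 minutes" figure quoted above; the gap between that bound and the observed 12-14 hours is the inefficiency the thread is debating.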
Re: [zfs-discuss] A few questions
On Dec 20, 2010, at 2:05 PM, Erik Trimble wrote:

On 12/20/2010 11:56 AM, Mark Sandrock wrote:

Erik, just a hypothetical what-if ... In the case of resilvering on a mirrored disk, why not take a snapshot, and then resilver by doing a pure block copy from the snapshot? It would be sequential, so long as the original data was unmodified; and random access in dealing with the modified blocks only, right? After the original snapshot had been replicated, a second pass would be done, in order to update the clone to 100% live data. Not knowing enough about the inner workings of ZFS snapshots, I don't know why this would not be doable. (I'm biased towards mirrors for busy filesystems.) I'm supposing that a block-level snapshot is not doable -- or is it?

Mark

Snapshots on ZFS are true snapshots - they take a picture of the current state of the system. They DON'T copy any data around when created. So a ZFS snapshot would be just as fragmented as the ZFS filesystem was at the time.

But if one does a raw (block) copy, there isn't any fragmentation -- except for the COW updates. If there were no updates to the snapshot, then it becomes a 100% sequential block copy operation. But even with COW updates, presumably the large majority of the copy would still be sequential I/O.

Maybe for the 2nd pass, the filesystem would have to be locked, so that the operation could ever complete; but if this is fairly short in relation to the overall resilvering time, then it could still be a win in many cases. I'm probably not explaining it well, and may be way off, but it seemed an interesting notion.

Mark

The problem is this: let's say I write blocks A, B, C, and D on a clean zpool (what kind, it doesn't matter). I now delete block C. Later on, I write block E.

There is a probability (increasing dramatically as time goes on) that the on-disk layout will now look like: A, B, E, D rather than A, B, [space], D, E. So, in the first case, I can do a sequential read to get A and B, but then must do a seek to get D, and a seek to get E. The fragmentation problem is mainly due to file deletion, NOT to file re-writing. (Though, in ZFS, being a C-O-W filesystem, re-writing generally looks like a delete-then-write process, rather than a modify process.)

-- Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] A few questions
It well may be that different methods are optimal for different use cases. Mechanical disk vs. SSD; mirrored vs. raidz[123]; sparse vs. populated; etc. It would be interesting to read more in this area, if papers are available. I'll have to take a look. ... Or does someone have pointers?

Mark

On Dec 20, 2010, at 6:28 PM, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Erik Trimble

In the case of resilvering on a mirrored disk, why not take a snapshot, and then resilver by doing a pure block copy from the snapshot? It would be sequential,

So, a ZFS snapshot would be just as fragmented as the ZFS filesystem was at the time.

I think Mark was suggesting something like dd copy device 1 onto device 2, in order to guarantee a first-pass sequential resilver. And my response would be: creative thinking and suggestions are always a good thing. In fact, the above suggestion is already faster than the present-day solution for what I'm calling typical usage, but there are an awful lot of use cases where the dd solution would be worse... such as a pool which is largely sequential already, or largely empty, or made of high-IOPS devices such as SSDs. However, there is a desire to avoid resilvering unused blocks, so I hope a better solution is possible... The fundamental requirement for a better-optimized solution would be a way to resilver according to disk ordering... And it's just a question for somebody that actually knows the answer ... How terrible is the idea of figuring out the on-disk order?
Re: [zfs-discuss] Excruciatingly slow resilvering on X4540 (build 134)
On Nov 2, 2010, at 12:10 AM, Ian Collins wrote:

On 11/ 2/10 08:33 AM, Mark Sandrock wrote:

I'm working with someone who replaced a failed 1TB drive (50% utilized) on an X4540 running OS build 134, and I think something must be wrong. Last Tuesday afternoon, zpool status reported:

scrub: resilver in progress for 306h0m, 63.87% done, 173h7m to go

and a week being 168 hours, that put completion at sometime tomorrow night. However, he just reported zpool status shows:

scrub: resilver in progress for 447h26m, 65.07% done, 240h10m to go

so it's looking more like 2011 now. That can't be right.

How is the pool configured?

Both 10- and 12-disk RAIDZ-2. That, plus too much other I/O, must be the problem. I'm thinking 5 x (7-2) would be better, assuming he doesn't want to go RAID-10. Thanks much for all the helpful replies.

Mark

I look after a very busy x5400 with 500G drives configured as 8-drive raidz2, and these take about 100 hours to resilver. The workload on this box is probably worst case for resilvering; it receives a steady stream of snapshots.

-- Ian.
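For what it's worth, the two status lines quoted above imply an even worse outlook than the driver's own "to go" estimate. Extrapolating from the progress actually made between the two reports:

```shell
# Progress between the two zpool status reports quoted above:
# 63.87% done at 306h0m, then 65.07% done at 447h26m.
awk 'BEGIN {
    h1 = 306;           p1 = 63.87
    h2 = 447 + 26/60;   p2 = 65.07
    rate   = (p2 - p1) / (h2 - h1)   # percent per hour, recent
    remain = (100 - p2) / rate       # hours left at that rate
    printf "recent rate: %.4f %%/hour\n", rate
    printf "remaining: %.0f hours (~%.0f days)\n", remain, remain / 24
}'
```

Only 1.2% of progress in 141 hours works out to roughly 4,100 hours remaining, i.e. months, which supports Mark's conclusion that something must be wrong with the resilver rather than it merely being slow.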
Re: [zfs-discuss] Faster than 1G Ether... ESX to ZFS
Edward,

I recently installed a 7410 cluster, which had added Fibre Channel HBAs. I know the site also has Blade 6000s running VMware, but I have no idea if they were planning to run fiber to those blades (or even had the option to do so). But perhaps FC would be an option for you?

Mark

On Nov 12, 2010, at 9:03 AM, Edward Ned Harvey wrote:

Since combining a ZFS storage backend, via NFS or iSCSI, with ESXi heads, I'm in love. But for one thing: the interconnect between the head and storage. 1G Ether is so cheap, but not as fast as desired. 10G Ether is fast enough, but it's overkill, and why is it so bloody expensive? Why is there nothing in between? Is there something in between? Is there a better option? I mean ... SATA is cheap, and it's 3G or 6G, but it's not suitable for this purpose. But the point remains, there isn't a fundamental limitation that *requires* 10G to be expensive, or *requires* a leap directly from 1G to 10G. I would very much like to find a solution which is a good fit ... to attach ZFS storage to VMware. What are people using, as interconnect, to use ZFS storage on ESX(i)?

Any suggestions?
[zfs-discuss] Excruciatingly slow resilvering on X4540 (build 134)
Hello,

I'm working with someone who replaced a failed 1TB drive (50% utilized) on an X4540 running OS build 134, and I think something must be wrong. Last Tuesday afternoon, zpool status reported:

scrub: resilver in progress for 306h0m, 63.87% done, 173h7m to go

and a week being 168 hours, that put completion at sometime tomorrow night. However, he just reported zpool status shows:

scrub: resilver in progress for 447h26m, 65.07% done, 240h10m to go

so it's looking more like 2011 now. That can't be right.

I'm hoping for a suggestion or two on this issue. I'd search the archives, but they don't seem searchable. Or am I wrong about that? Thanks.

Mark (subscription pending)