Re: [zfs-discuss] TLER and ZFS
>This would require a low-level re-format and would significantly
>reduce the available space if it was possible at all.

I don't think it is possible.

>> WD has a jumper,
>>but it is there explicitly to work with WindowsXP, and is not a real way
>>to dumb down the drive to 512.
>
>All it does is offset the sector numbers by 1 so that sector 63
>becomes physical sector 64 (a multiple of 4KB).

Is that all? And this forces 4K alignment?

>> I would presume that any vendor that
>>is shipping 4K sector size drives now, with a jumper to make it
>>'real' 512, would be supporting that over the long run?
>
>I would be very surprised if any vendor shipped a drive that could
>be jumpered to "real" 512 bytes. The best you are going to get is
>jumpered to logical 512 bytes and maybe a 1-sector offset (needed
>for WindozeXP only). These jumpers will probably last as long as
>the 8GB jumpers that were needed by old BIOS code. (Eg BIOS boots
>using simulated 512-byte sectors and then the OS tells the drive to
>switch to native mode).

I would assume that such a jumper would change the drive from "4K native" to "pretend to have 512-byte sectors".

>It's unfortunate that Sun didn't bite the bullet several decades
>ago and provide support for block sizes other than 512-bytes
>instead of getting custom firmware for their CD drives to make
>them provide 512-byte logical blocks for 2KB CD-ROMs.

Since Solaris x86 works fine with standard CD/DVD drives, that is no longer an issue. Solaris does support larger sectors.

>It's even more idiotic of WD to sell a drive with 4KB sectors but
>not provide any way for an OS to identify those drives and perform
>4KB aligned I/O.

I'm not sure that that is correct; the drive works on naive clients, but I believe it can reveal its true colors.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] TLER and ZFS
>Changing the sector size (if it's possible at all) would require a
>reformat of the drive.

The WD drives only support a 4K sector size, but they pretend to have 512-byte sectors. I don't think they need to reformat the drive when changing to 4K sectors.

A non-aligned write requires a read-modify-write operation, and that makes writes slower.

Casper
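Casper's read-modify-write point can be sketched numerically. Below is a minimal, hypothetical Python model (not a benchmark and not actual drive firmware behavior): it only counts how many physical 4K sectors a 512-byte-emulating drive must read back and rewrite for a given logical write.

```python
PHYS = 4096  # native physical sector size (bytes)
LOG = 512    # emulated logical sector size (bytes)

def rmw_cost(offset: int, length: int) -> dict:
    """Count physical sectors touched by a write of `length` bytes at
    byte offset `offset`.  Physical sectors only partially covered at
    either end must be read first, merged, then rewritten -- that is
    the read-modify-write penalty for unaligned I/O."""
    first = offset // PHYS
    last = (offset + length - 1) // PHYS
    touched = last - first + 1
    partial = int(offset % PHYS != 0) + int((offset + length) % PHYS != 0)
    return {"writes": touched, "extra_reads": min(partial, touched)}

# A 4K write starting at logical sector 64 (4K-aligned): one write,
# no extra reads.
print(rmw_cost(64 * LOG, 4096))  # {'writes': 1, 'extra_reads': 0}
# The same 4K write starting at logical sector 63 (the classic DOS/XP
# partition offset): it straddles two physical sectors, and both must
# be read back before being rewritten.
print(rmw_cost(63 * LOG, 4096))  # {'writes': 2, 'extra_reads': 2}
```

So a misaligned layout can roughly double the physical I/O per write, which matches the performance complaints in this thread.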
Re: [zfs-discuss] TLER and ZFS
ZFS already aligns the beginning of data areas to 4KB offsets from the label. For modern OpenSolaris and Solaris implementations, the default starting block for partitions is also aligned to 4KB.

On Oct 5, 2010, at 6:36 PM, Michael DeMan wrote:

> Hi upfront, and thanks for the valuable information.
>
> On Oct 5, 2010, at 4:12 PM, Peter Jeremy wrote:
>
>>> Another annoying thing with the whole 4K sector size, is what happens
>>> when you need to replace drives next year, or the year after?
>>
>> About the only mitigation needed is to ensure that any partitioning is
>> based on multiples of 4KB.
>
> I agree, but to be quite honest, I have no clue how to do this with ZFS. It
> seems that it should be something under the regular tuning documentation.

Disagree. Starting alignment is not a problem OOB. You have to go out of your way to make the starting alignments not be 4KB aligned.

> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
>
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
>
> Is it going to be the case that basic information about how to deal with
> common scenarios like this is no longer going to be publicly available, and
> Oracle will simply keep it 'close to the vest', with the relevant information
> simply available for those who choose to research it themselves, or only
> available to those with certain levels of support contracts from Oracle?
>
> To put it another way - does the community that uses ZFS need to fork 'ZFS
> Best Practices' and 'ZFS Evil Tuning' to ensure that they are reasonably up to
> date?

The ZFS Best Practices and Evil Tuning guides are not hosted by Oracle. They are hosted at the SolarisInternals.com site.
-- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com
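Peter Jeremy's mitigation above ("ensure that any partitioning is based on multiples of 4KB") is easy to check mechanically. A small illustrative Python sketch (the sector numbers are examples, not output of any Solaris tool): a partition start expressed in 512-byte logical sectors sits on a 4 KiB boundary iff the sector number is a multiple of 8.

```python
def start_is_4k_aligned(start_sector: int, logical_bytes: int = 512) -> bool:
    """True if a slice/partition starting at `start_sector` (counted
    in logical sectors) begins on a 4 KiB boundary, i.e. its byte
    offset is a multiple of 4096.  For 512-byte logical sectors that
    means the sector number must be a multiple of 8."""
    return (start_sector * logical_bytes) % 4096 == 0

# Old DOS-style label: first slice at sector 63 -> misaligned.
# Common modern defaults such as 256 or 2048 -> aligned.
for start in (63, 64, 256, 2048):
    print(start, start_is_4k_aligned(start))
```

This is the whole check Richard describes as working out of the box: unless a tool deliberately picks an odd starting sector, the default starts are already multiples of 8.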
Re: [zfs-discuss] invalid vdev configuration after power failure
Kyle Kakligian gmail.com> writes:
> I'm not sure why `zfs import` choked on this [typical?] error case,
> but its easy to fix with a very careful dd. I took a different and
> very roundabout approach to recover my data, however, since I'm not
> confident in my 'careful' skills. (after all, where's my backup?)
> Instead, on a linux workstation where I am more cozy, I compiled
> zfs-fuse from the source with a slight modification to ignore labels 2
> and 3. fusermount worked great and I recovered my data without issue.

Hi,

Waking up this old thread: would you mind sharing how you modified zfs-fuse to ignore labels 2 and 3?

thanks, regards,
diyanc
Re: [zfs-discuss] TLER and ZFS
Hi upfront, and thanks for the valuable information.

On Oct 5, 2010, at 4:12 PM, Peter Jeremy wrote:

>> Another annoying thing with the whole 4K sector size, is what happens
>> when you need to replace drives next year, or the year after?
>
> About the only mitigation needed is to ensure that any partitioning is
> based on multiples of 4KB.

I agree, but to be quite honest, I have no clue how to do this with ZFS. It seems that it should be something under the regular tuning documentation.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

Is it going to be the case that basic information about how to deal with common scenarios like this is no longer going to be publicly available, and Oracle will simply keep it 'close to the vest', with the relevant information simply available for those who choose to research it themselves, or only available to those with certain levels of support contracts from Oracle?

To put it another way - does the community that uses ZFS need to fork 'ZFS Best Practices' and 'ZFS Evil Tuning' to ensure that they are reasonably up to date?

Sorry for the somewhat hostile tone in the above, but the changes with the merger have demoralized a lot of folks, I think.

- Mike
Re: [zfs-discuss] TLER and ZFS
On 2010-Oct-06 05:59:06 +0800, Michael DeMan wrote:
>Another annoying thing with the whole 4K sector size, is what happens
>when you need to replace drives next year, or the year after?

About the only mitigation needed is to ensure that any partitioning is based on multiples of 4KB.

> Does
>anybody know if there are any vendors that are shipping 4K sector drives
>that have a jumper option to make them 512 size?

This would require a low-level re-format and would significantly reduce the available space if it was possible at all.

> WD has a jumper,
>but it is there explicitly to work with WindowsXP, and is not a real way
>to dumb down the drive to 512.

All it does is offset the sector numbers by 1 so that sector 63 becomes physical sector 64 (a multiple of 4KB).

> I would presume that any vendor that
>is shipping 4K sector size drives now, with a jumper to make it
>'real' 512, would be supporting that over the long run?

I would be very surprised if any vendor shipped a drive that could be jumpered to "real" 512 bytes. The best you are going to get is jumpered to logical 512 bytes and maybe a 1-sector offset (needed for WindozeXP only). These jumpers will probably last as long as the 8GB jumpers that were needed by old BIOS code. (Eg BIOS boots using simulated 512-byte sectors and then the OS tells the drive to switch to native mode).

It's unfortunate that Sun didn't bite the bullet several decades ago and provide support for block sizes other than 512 bytes instead of getting custom firmware for their CD drives to make them provide 512-byte logical blocks for 2KB CD-ROMs.

It's even more idiotic of WD to sell a drive with 4KB sectors but not provide any way for an OS to identify those drives and perform 4KB aligned I/O.

--
Peter Jeremy
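Peter's description of the WD jumper (shift every LBA by one so sector 63 lands on physical sector 64) also answers the "does this force 4K alignment?" question raised elsewhere in the thread: it fixes alignment only for layouts that start at sector 63. A hypothetical Python sketch of the arithmetic (only the one-sector offset is modeled; real firmware details may differ):

```python
def physical_lba(logical_lba: int, jumpered: bool) -> int:
    """The WD jumper simply adds 1 to every logical block address."""
    return logical_lba + (1 if jumpered else 0)

def is_4k_aligned(lba: int) -> bool:
    """8 logical 512-byte sectors fit in one 4K physical sector."""
    return lba % 8 == 0

# WindowsXP starts its first partition at LBA 63 -- misaligned without
# the jumper, aligned with it (63 + 1 = 64 = 8 * 8).
print(is_4k_aligned(physical_lba(63, jumpered=False)))   # False
print(is_4k_aligned(physical_lba(63, jumpered=True)))    # True
# But a layout that was already aligned (say, starting at LBA 2048)
# becomes misaligned once the jumper shifts everything by one.
print(is_4k_aligned(physical_lba(2048, jumpered=True)))  # False
```

Which is why the jumper is a WindowsXP-specific band-aid rather than a general way to "dumb down" the drive.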
Re: [zfs-discuss] TLER and ZFS
Michael DeMan wrote:
> The WD 1TB 'enterprise' drives are still 512 sector size and safe to use;
> who knows though, maybe they just started shipping with 4K sector size as
> I write this e-mail?
>
> Another annoying thing with the whole 4K sector size, is what happens when
> you need to replace drives next year, or the year after? That part has me
> worried on this whole 4K sector migration thing more than what to buy
> today. Given the choice, I would prefer to buy 4K sector size now, but
> operating system support is still limited.
>
> Does anybody know if there are any vendors that are shipping 4K sector
> drives that have a jumper option to make them 512 size? WD has a jumper,
> but it is there explicitly to work with WindowsXP, and is not a real way to
> dumb down the drive to 512.

Changing the sector size (if it's possible at all) would require a reformat of the drive. On SCSI disks which support it, you do it by changing the sector size on the relevant mode select page, and then sending a format-unit command to make the drive relayout all the sectors. I've no idea if these 4K sata drives have any such mechanism, but I would expect they would.

BTW, I've been using a pair of 1TB Hitachi Ultrastars for something like 18 months without any problems at all. Of course, a 1-year-old disk model is no longer available now. I'm going to have to swap out for bigger disks in the not too distant future.

--
Andrew Gabriel
Re: [zfs-discuss] zfs volume snapshot
On Oct 4, 2010, at 8:53 AM, Wei Li wrote:
> Hi All,
>
> If a ZFS volume is presented to an LDOM guest domain as a whole disk (used as a
> root disk), does anyone know how to snapshot it? It is something like how to
> snapshot a zfs raw volume. (NOTE: no ufs file system is created directly on the
> ZFS volume in the above case.)

zfs snapshot poolname/volumen...@snapshotname
-- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com
Re: [zfs-discuss] TLER and ZFS
On Oct 5, 2010, at 2:06 PM, Michael DeMan wrote:
>
> On Oct 5, 2010, at 1:47 PM, Roy Sigurd Karlsbakk wrote:
>
>>> Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
>>> Hard Drive -Bare Drive
>>>
>>> are only $129.
>>>
>>> vs. $89 for the 'regular' black drives.
>>>
>>> 45% higher price, but it is my understanding that the 'RAID Edition'
>>> ones also are physically constructed for longer life, lower vibration
>>> levels, etc.
>>
>> Well, here it's about 60% up and for 150 drives, that makes a wee
>> difference...
>>
>> Vennlige hilsener / Best regards
>>
>> roy
>
> Understood on 1.6 times cost, especially for quantity 150 drives.

One service outage will consume far more in person-hours and downtime than this little bit of money. Penny-wise == Pound-foolish?
-- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com
Re: [zfs-discuss] TLER and ZFS
On Oct 5, 2010, at 2:47 PM, casper@sun.com wrote:
>
> I've seen several important features when selecting a drive for
> a mirror:
>
> TLER (the ability of the drive to timeout a command)
> sector size (native vs virtual)
> power use (specifically at home)
> performance (mostly for work)
> price
>
> I've heard scary stories about a mismatch of the native sector size and
> unaligned Solaris partitions (4K sectors, unaligned cylinder).

Yes, avoiding the 4K sector sizes is a huge issue right now too - another item I forgot on the reasons to absolutely avoid those WD 'green' drives.

Three good reasons to avoid WD 'green' drives for ZFS...
- TLER issues
- IntelliPower head park issues
- 4K sector size issues
...they are an absolute nightmare.

The WD 1TB 'enterprise' drives are still 512 sector size and safe to use; who knows though, maybe they just started shipping with 4K sector size as I write this e-mail?

Another annoying thing with the whole 4K sector size, is what happens when you need to replace drives next year, or the year after? That part has me worried on this whole 4K sector migration thing more than what to buy today. Given the choice, I would prefer to buy 4K sector size now, but operating system support is still limited.

Does anybody know if there are any vendors that are shipping 4K sector drives that have a jumper option to make them 512 size? WD has a jumper, but it is there explicitly to work with WindowsXP, and is not a real way to dumb down the drive to 512. I would presume that any vendor that is shipping 4K sector size drives now, with a jumper to make it 'real' 512, would be supporting that over the long run?

I would be interested, and probably others would too, in what the original poster finally decides on this?

- Mike
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
On Mon, Oct 04, 2010 at 02:28:18PM -0400, Miles Nordin wrote:
> > "nw" == Nicolas Williams writes:
>
> nw> I would think that 777 would invite chmods. I think you are
> nw> handwaving.
>
> it is how AFS worked. Since no file on a normal unix box besides /tmp

But would the AFS experience translate into double plus happiness for us?

> ever had 777 it would send a SIGWTF to any AFS-unaware graybeards that
> stumbled onto the directory, alerting them that they needed to go
> learn something and come back.

A signal?! How would that work when the entity doing a chmod is on a remote NFS client?

> I understand that everything:everyone on windows doesn't send SIGWTF,
> but 777 on unix for AFS sites it did. You realize it's not
> hypothetical, right? AFS was actually implemented, widely, and
> there's experience with it.

Yes... but I'm skeptical about the universality of that experience's applicability. Specifically: I don't think it could work for us. AFS developers had fewer constraints than Solaris developers. It is no surprise that they were able to find happy solutions to these sorts of problems long ago.

OpenAFS has a Windows native client and an Explorer shell extension (which surely handles chmod?). However, we don't have the luxury of telling customers to install third-party (possibly ours, whatever) Windows native clients for protocols other than SMB, nor can we tell them to install Explorer shell extensions. Solaris' SMB server needs to work out of the box and without the limitations implied by having a separate ACL and mode (well, we have that now, but we always compute a new mode from the new ACL when ACLs are changed).

> If they failed to act on the SIGWTF, the overall system enforced the
> tighter of the unix permissions and the AFS ACL, so it fails closed.
> The current system fails open.

The current system fails closed (by discarding the ACL and replacing it with a new one based entirely on the new mode).

> Also AFS did no translation between unix permissions and AFS ACL's so
> it was easy to undo such a mistake when it happened: double-check the
> AFS ACL is not too wide on the directories where you see unix people
> mucking around in case the muckers were responding to a real problem,
> then set the unix modes back to 777.

Right, but with SMB in the picture we don't have this luxury. You seem unwilling to accept that one constraint.

> nw> When chmod()ing an object... ZFS would search for the most
> nw> specific matching file in .zfs/ACLs/ and, if found, would
> nw> replace the chmod()ed object's ACL with that of the
> nw> .zfs/ACLs/... file found. The .inherit suffix would indicate
> nw> that if the chmod() target's parent directory has inheritable
> nw> ACEs then they will be groupmasked and added to the ACEs from
> nw> the .zfs/ACLs/... file to produce a final ACL.
>
> This proposal, like the current situation, seems to make chmod
> configurable to act like ``not chmod'' which IMHO is exactly what's
> unpopular about the current regime. You've tried to leave chmod

To some degree, yes. It's different though, and might conceivably be acceptable, though I don't think it will be (I was illustrating potential alternatives).

But I really like one thing about it: most apps shouldn't care about ACL contents, they should care about context-specific permissions changes. In a directory containing shared documents the intention should typically be "share with all these people", while in home directories the intention should typically be "don't share with anyone" (but this will vary; e.g., ~/.ssh/authorized_keys needs to be reachable and readable by everyone). Add in executable versus not-executable, and you have a pretty complete picture -- just a few "named" ACLs at most, per-dataset. If we could replace chmod(2) with a version that takes actual names for pre-configured ACLs, _that_ would be great. But we can't, for the same reason that we can't remove chmod(2): it's a widely used interface.

> active on windows trees and guess at the intent of whoever invokes
> chmod, providing no warning that you're secretly doing
> ``approximately'' what he asked for rather than exactly. Maybe that
> flies on Windows, but on Unix people expect more precision: thorough
> abstractions that survive corner cases and have good exception
> handling.

Look, mode is a pretty lame hammer -- ACLs are far, far more granular -- but it's a hammer that many apps use. Given the lack of granularity of modes, I think an approximation of intent is the best we can do.

Consider: both aclmode=discard and aclmode=groupmask behaviors can be considered to be what the user intended. How do you know if the user intended for other users and groups to retain access limited to the group bits of a new mode? You can't, not without asking the user. So aclmode=discard is certainly an approximation of user intent, and so aclmode=groupmask must be considered an approximation
Re: [zfs-discuss] TLER and ZFS
>My immediate reaction to this is "time to avoid WD drives for a while";
>until things shake out and we know what's what reliably.
>
>But, um, what do we know about say the Seagate Barracuda 7200.12 ($70),
>the SAMSUNG Spinpoint F3 1TB ($75), or the HITACHI Deskstar 1TB 3.5"
>($70)?

I've seen several important features when selecting a drive for a mirror:

	TLER (the ability of the drive to timeout a command)
	sector size (native vs virtual)
	power use (specifically at home)
	performance (mostly for work)
	price

I've heard scary stories about a mismatch of the native sector size and unaligned Solaris partitions (4K sectors, unaligned cylinder).

I was pretty happy with the WD drives (except for the one with a seriously broken cache), but I see the reasons not to pick WD drives over the 1TB range.

Are people now using 4K native sectors and formatting them with 4K sectors in (Open)Solaris? Performance sucks when you use unaligned accesses, but is performance good when the access is aligned?

Casper
Re: [zfs-discuss] TLER and ZFS
On Tue, Oct 5, 2010 at 3:47 PM, Roy Sigurd Karlsbakk wrote:
> > Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
> > Hard Drive -Bare Drive
> >
> > are only $129.
> >
> > vs. $89 for the 'regular' black drives.
> >
> > 45% higher price, but it is my understanding that the 'RAID Edition'
> > ones also are physically constructed for longer life, lower vibration
> > levels, etc.
>
> Well, here it's about 60% up and for 150 drives, that makes a wee
> difference...
>
> Vennlige hilsener / Best regards
>
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685
> r...@karlsbakk.net
> http://blogg.karlsbakk.net/

If you're spending upwards of $30,000 on a storage system, you probably shouldn't skimp on the most important component. You might as well be complaining that ECC ram costs more.

--Tim
Re: [zfs-discuss] TLER and ZFS
On Tue, October 5, 2010 15:30, Roy Sigurd Karlsbakk wrote:
> I just discovered WD Black drives are rumored not to be set to allow TLER.
> Does anyone know how much performance impact the lack of TLER might have
> on a large pool? Choosing Enterprise drives will cost about 60% more, and
> on a large install, that means a lot of money...

My immediate reaction to this is "time to avoid WD drives for a while"; until things shake out and we know what's what reliably.

But, um, what do we know about say the Seagate Barracuda 7200.12 ($70), the SAMSUNG Spinpoint F3 1TB ($75), or the HITACHI Deskstar 1TB 3.5" ($70)?

This is not a completely theoretical question to me; it's getting on towards time to at least consider replacing my oldest mirrored pair; those are 400GB Seagates, I think, dating from 2006. I'd want something at least twice as big (to make the space upgrade worthwhile), and I'm expecting to buy three of them rather than just two, because I think it's time to add a hot spare to the system (currently 3 pairs of data disks, and I've got two more bays; I think a hot spare is a better use for them than a fourth pair; safety of the data is very important, performance is adequate, and I need a modest capacity upgrade, but the whole pool is currently 1.2TB usable, not large).

On the third hand, there's the Barracuda 7200.11 1.5TB for only $75, which is a really small price increment for a big space increment.

The WD RE3 1TB is $130 (all these prices are from Newegg just now). That's very close to TWICE the price of the competing 1TB drives.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] TLER and ZFS
On Oct 5, 2010, at 1:47 PM, Roy Sigurd Karlsbakk wrote:
>> Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
>> Hard Drive -Bare Drive
>>
>> are only $129.
>>
>> vs. $89 for the 'regular' black drives.
>>
>> 45% higher price, but it is my understanding that the 'RAID Edition'
>> ones also are physically constructed for longer life, lower vibration
>> levels, etc.
>
> Well, here it's about 60% up and for 150 drives, that makes a wee
> difference...
>
> Vennlige hilsener / Best regards
>
> roy

Understood on 1.6 times cost, especially for quantity 150 drives.

I think (and if I am wrong, somebody else correct me) that if you are using commodity controllers, which seem to be generally fine for ZFS, then if a drive times out trying to constantly re-read a bad sector, it could stall out reads on the entire pool. On the other hand, if the drives are exported as JBOD from a RAID controller, I would think the RAID controller itself would just mark the drive as bad and offline it quickly based on its own internal algorithms.

The above is also relevant to the anticipated usage. For instance, if it is some sort of backup machine, then delays due to some reads stalling out without TLER are perhaps not a big deal. If it is for more of an up-front production use, that could be intolerable.
Re: [zfs-discuss] TLER and ZFS
> Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
> Hard Drive -Bare Drive
>
> are only $129.
>
> vs. $89 for the 'regular' black drives.
>
> 45% higher price, but it is my understanding that the 'RAID Edition'
> ones also are physically constructed for longer life, lower vibration
> levels, etc.

Well, here it's about 60% up and for 150 drives, that makes a wee difference...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
[Signature, translated from Norwegian:] In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] TLER and ZFS
I'm not sure on the TLER issues by themselves, but after the nightmares I have gone through dealing with the 'green' drives, which have both the TLER issue and the IntelliPower head parking issues, I would just stay away from it all entirely and pay extra for the 'RAID Edition' drives.

Just out of curiosity, I took a peek at newegg.

Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal Hard Drive -Bare Drive

are only $129.

vs. $89 for the 'regular' black drives.

45% higher price, but it is my understanding that the 'RAID Edition' ones also are physically constructed for longer life, lower vibration levels, etc.

On Oct 5, 2010, at 1:30 PM, Roy Sigurd Karlsbakk wrote:

> Hi all
>
> I just discovered WD Black drives are rumored not to be set to allow TLER.
> Does anyone know how much performance impact the lack of TLER might have on a
> large pool? Choosing Enterprise drives will cost about 60% more, and on a
> large install, that means a lot of money...
>
> Vennlige hilsener / Best regards
>
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685
> r...@karlsbakk.net
> http://blogg.karlsbakk.net/
[zfs-discuss] TLER and ZFS
Hi all

I just discovered WD Black drives are rumored not to be set to allow TLER. Does anyone know how much performance impact the lack of TLER might have on a large pool? Choosing Enterprise drives will cost about 60% more, and on a large install, that means a lot of money...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
[Signature, translated from Norwegian:] In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] ZFS crypto bug status change
> "dm" == David Magda writes:

dm> Thank you Mr. Moffat et al. Hopefully the rest of us will be
dm> able to bang on this at some point. :)

Thanks for the heads-up on the gossip.

This etiquette seems weird, though: I don't thank Microsoft for releasing a new version of Word. I'll postpone my thanks for 2 years until the source is released, though by then who knows if I'll still be using ZFS at all.

Maybe more appropriate would be: congrats on finally finishing your seven-year project, Darren! must be a huge relief. I'm glad it wasn't my project, though. If I were in Darren's place I'd have signed on to work for an open-source company, spent seven years of my life working on something, delaying it and pushing hard to make it a generation beyond other filesystem crypto, and then when I'm finally done, .

That's me, though. I shouldn't speculate on someone else's situation. Maybe he signed on under different circumstances, or delayed for different reasons than feature-ambition, or cares about different things than I do. I only mean to make an example of how politics, featuresets, and IT planning interact to make an ecosystem that's got more complicated implications than just a bulleted list of features and a license with an OSI logo.

--
READ CAREFULLY. By reading this fortune, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
[zfs-discuss] Long import due to spares.
Just for history as to why Fishworks was running on this box: we were in the beta program and have upgraded along the way. This box is an X4240 with 16x 146GB disks running the Feb 2010 release of FW with de-dupe.

We were getting ready to re-purpose the box and getting our data off. We then deleted a filesystem that was using de-duplication, and the box suddenly went into a freeze with pool activity like crazy. After several failed attempts to recover the box to a usable state (days of importing failed), we reloaded the boot drives with Nexenta 3.0 (b134) (which was our goal anyway). When we tried to import this pool again, after 24 hours the pool finally imported, but with the error that the two spares were FAULTED with too many errors.

Controller is an LSI 1068E-IR.

Normally, I'd believe the drive was dead, except: both spares? Could this be related to the de-dupe FS being deleted?

-J
Re: [zfs-discuss] moving newly created pool to alternate host
Hi Sridhar,

After a zpool split operation, you can access the newly created pool by using the zpool import command. If the LUNs from mypool are available on host1 and host2, you should be able to import mypool_snap from host2.

After mypool_snap is imported, it will be available for backups, but not in read-only mode. It is important that data from these pools is not accessed from the two different hosts at the same time.

An upcoming feature is a read-only import that might be helpful in your environment.

Thanks,

Cindy

On 10/05/10 07:41, sridhar surampudi wrote:
> Hi,
>
> If I have the below kind of configuration (as an example):
>
> c1t1d1 and c2t2d2 are two LUNs visible (unmasked) to both host1 and host2.
>
> Created a pool mypool as below:
> zpool create mypool mirror c1t1d1 c2t2d2
>
> Now I did a zpool split:
> zpool split mypool mypool_snap
>
> Once I run zpool split, is there a way I can make the newly created
> mypool_snap visible to the other host, i.e. host2, so that I can access
> all file systems and files in read-only mode for backup?
>
> Thanks & Regards,
> sridhar.
Re: [zfs-discuss] Migrating to an aclmode-less world
On Mon, Oct 04, 2010 at 04:30:05PM -0600, Cindy Swearingen wrote:
> Hi Simon,
>
> I don't think you will see much difference for these reasons:
>
> 1. The CIFS server ignores the aclinherit/aclmode properties.

Because CIFS/SMB has no chmod operation :)

> 2. Your aclinherit=passthrough setting overrides the aclmode
> property anyway.

aclinherit=passthrough-x is a better choice. Also, aclinherit doesn't override aclmode: aclinherit applies on create, and aclmode used to apply on chmod.

> 3. The only difference is that if you use chmod on these files
> to manually change the permissions, you will lose the ACL values.

Right. That only happens from NFSv3 clients [that don't instead edit the POSIX Draft ACL translated from the ZFS ACL], from non-Windows NFSv4 clients [that don't instead edit the ACL], and from local applications [that don't instead edit the ZFS ACL].

Nico
--
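The properties discussed above can be inspected and changed with `zfs get` and `zfs set`. A small sketch, assuming a live ZFS system; `tank/share` is a hypothetical dataset name, not one from the thread:

```shell
# Inspect the current ACL-related properties on a dataset.
zfs get aclinherit,aclmode tank/share

# passthrough-x inherits ACEs like passthrough, but grants execute
# in an inherited ACE only when the creating application requests
# execute permission, which is why Nico suggests it over plain
# passthrough.
zfs set aclinherit=passthrough-x tank/share
```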
[zfs-discuss] moving newly created pool to alternate host
Hi,

If I have the below kind of configuration (as an example):

c1t1d1 and c2t2d2 are two LUNs visible (unmasked) to both host1 and host2.

I created a pool mypool as below:

  zpool create mypool mirror c1t1d1 c2t2d2

Then I ran zpool split:

  zpool split mypool mypool_snap

Once I run zpool split, is there a way I can make the newly created mypool_snap visible to the other host, i.e. host2, so that I can access all file systems and files in read-only mode for backup?

Thanks & Regards,
sridhar.
-- 
This message posted from opensolaris.org
Re: [zfs-discuss] Migrating to an aclmode-less world
Hi Cindy,

That sounds very reassuring. Thanks a lot.

Simon
-- 
This message posted from opensolaris.org
Re: [zfs-discuss] Resilver endlessly restarting at completion
This seems to have been a false alarm, sorry for that. As soon as I started paying attention (logging zpool status, peeking around with zdb & mdb), the resilver didn't restart unless provoked. A cleartext log would have been nice ("restarted due to c11t7 becoming online").

A slight problem I can see is that a resilver always restarts if a device is added to the array. In my case devices were absent for a short period (some SATA failure that corrected itself by running cfgadm -c disconnect & connect), and it would have been beneficial to let the resilver run to completion and only then restart it to resilver the missing data on the added device. ZFS does have some intelligence in these cases, in that not all data is resilvered, only blocks that were born after the outage.

Also, as I had a spare in the array, it kicked in, which was probably not what I wanted, as that triggered a full resilver rather than a partial one. After the fact I could not kick the spare out, and could not make the resilvering process forget about doing a full resilver. Plus, now I have to replace it back out and make it a cold spare again.

But all's well that ends well... mostly. Devices are still dropping from the SATA bus randomly. Maybe I'll put together a report and post to storage-discuss.

On Wed, Sep 29, 2010 at 8:13 PM, Tuomas Leikola wrote:
> The endless resilver problem still persists on OI b147. It restarts when it
> should complete.
>
> I see no other solution than to copy the data to safety and recreate the
> array. Any hints would be appreciated, as that takes days unless I can stop
> or pause the resilvering.
>
> On Mon, Sep 27, 2010 at 1:13 PM, Tuomas Leikola wrote:
>> Hi!
>>
>> My home server had some disk outages due to flaky cabling and whatnot, and
>> started resilvering to a spare disk. During this another disk or two
>> dropped, and were reinserted into the array. So no devices were actually
>> lost, they just were intermittently away for a while each.
>> The situation is currently as follows:
>>
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>>         attempt was made to correct the error.  Applications are unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>>         using 'zpool clear' or replace the device with 'zpool replace'.
>>    see: http://www.sun.com/msg/ZFS-8000-9P
>>  scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go
>> config:
>>
>>         NAME                       STATE     READ WRITE CKSUM
>>         tank                       ONLINE       0     0     0
>>           raidz1-0                 ONLINE       0     0     0
>>             c11t1d0p0              ONLINE       0     0     0
>>             c11t2d0                ONLINE       0     0     5
>>             c11t6d0p0              ONLINE       0     0     0
>>             spare-3                ONLINE       0     0     0
>>               c11t3d0p0            ONLINE       0     0     0  106M resilvered
>>               c9d1                 ONLINE       0     0     0  104G resilvered
>>             c11t4d0p0              ONLINE       0     0     0
>>             c11t0d0p0              ONLINE       0     0     0
>>             c11t5d0p0              ONLINE       0     0     0
>>             c11t7d0p0              ONLINE       0     0     0  93.6G resilvered
>>           raidz1-2                 ONLINE       0     0     0
>>             c6t2d0                 ONLINE       0     0     0
>>             c6t3d0                 ONLINE       0     0     0
>>             c6t4d0                 ONLINE       0     0     0  2.50K resilvered
>>             c6t5d0                 ONLINE       0     0     0
>>             c6t6d0                 ONLINE       0     0     0
>>             c6t7d0                 ONLINE       0     0     0
>>             c6t1d0                 ONLINE       0     0     1
>>         logs
>>           /dev/zvol/dsk/rpool/log  ONLINE       0     0     0
>>         cache
>>           c6t0d0p0                 ONLINE       0     0     0
>>         spares
>>           c9d1                     INUSE     currently in use
>>
>> errors: No known data errors
>>
>> And this has been going on for a week now, always restarting when it
>> should complete.
>>
>> The questions in my mind atm:
>>
>> 1. How can I determine the cause for each resilver? Is there a log?
>>
>> 2. Why does it resilver the same data over and over, and not just the
>> changed bits?
>>
>> 3. Can I force remove c9d1, as it is no longer needed but c11t3 can be
>> resilvered instead?
>>
>> I'm running opensolaris 134, but the event originally happened on 111b. I
>> upgraded and tried quiescing snapshots and IO, none of which helped.
>>
>> I've already ordered some new hardware to recreate this entire array as
>> raidz2 among other things, but there's about a week of time when I can run
>> debuggers and traces if instructed to.
>>
>> - Tuo
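For question 3 in the quoted message, the usual way to dismiss an in-use hot spare once the original device is healthy again is zpool detach. A hedged sketch, not a tested fix for this thread's stuck-resilver case: it uses the pool and device names from the status output, assumes root on a live system, and detach of an INUSE spare may well be refused while the resilver is still running.

```shell
# Detach the spare from the spare-3 vdev; c11t3d0p0 stays as the
# vdev's permanent member and the spare returns to AVAIL.
zpool detach tank c9d1

# Optionally drop it from the spares list entirely, making it a
# cold spare as described above, then verify.
zpool remove tank c9d1
zpool status tank
```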