Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Casper . Dik


>This would require a low-level re-format and would significantly
>reduce the available space if it was possible at all.

I don't think it is possible.

>>  WD has a jumper,
>>but it is there explicitly to work with WindowsXP, and is not a real way
>>to dumb down the drive to 512.
>
>All it does is offset the sector numbers by 1 so that sector 63
>becomes physical sector 64 (a multiple of 4KB).

Is that all?  And this forces 4K alignment?

>>  I would presume that any vendor that
>>is shipping 4K sector size drives now, with a jumper to make it
>>'real' 512, would be supporting that over the long run?
>
>I would be very surprised if any vendor shipped a drive that could
>be jumpered to "real" 512 bytes.  The best you are going to get is
>jumpered to logical 512 bytes and maybe a 1-sector offset (needed
>for WindozeXP only).  These jumpers will probably last as long as
>the 8GB jumpers that were needed by old BIOS code.  (Eg BIOS boots
>using simulated 512-byte sectors and then the OS tells the drive to
>switch to native mode).

I would assume that such a jumper would change the drive from
"4K native" to "pretend to be have 512 byte sectors"/

>It's unfortunate that Sun didn't bite the bullet several decades
>ago and provide support for block sizes other than 512-bytes
>instead of getting custom firmware for their CD drives to make
>them provide 512-byte logical blocks for 2KB CD-ROMs.

Since Solaris x86 works fine with standard CD/DVD drives, that is no 
longer an issue.  Solaris does support larger sectors.

>It's even more idiotic of WD to sell a drive with 4KB sectors but
>not provide any way for an OS to identify those drives and perform
>4KB aligned I/O.

I'm not sure that that is correct; the drive works on naive clients but I 
believe it can reveal its true colors.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Casper . Dik


>Changing the sector size (if it's possible at all) would require a
>reformat of the drive.

The WD drives only support 4K sectors but they pretend to have 512-byte
sectors.  I don't think they need to reformat the drive when switching to 4K 
sectors.  A non-aligned write requires a read-modify-write operation and 
that makes writes slower.
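
For example: rewriting a single 512-byte logical sector on a 4K-native 
drive means the firmware must read the surrounding 4KB physical sector, 
splice in the 512 bytes, and write the whole 4KB back; an aligned 4KB 
write needs none of that.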

Casper



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Richard Elling
ZFS already aligns the beginning of data areas to 4KB offsets from the label.
For modern OpenSolaris and Solaris implementations, the default starting 
block for partitions is also aligned to 4KB.

On Oct 5, 2010, at 6:36 PM, Michael DeMan wrote:

> Hi upfront, and thanks for the valuable information.
> 
> 
> On Oct 5, 2010, at 4:12 PM, Peter Jeremy wrote:
> 
>>> Another annoying thing with the whole 4K sector size, is what happens
>>> when you need to replace drives next year, or the year after?
>> 
>> About the only mitigation needed is to ensure that any partitioning is
>> based on multiples of 4KB.
> 
> I agree, but to be quite honest, I have no clue how to do this with ZFS.  It 
> seems that it should be something under the regular tuning documentation.  

Disagree.  Starting alignment is not a problem OOB. You have to go out of your
way to make the starting alignments not be 4KB aligned.
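
For anyone who wants to double-check, a quick sanity test of the slice 
starting sectors is easy enough (device name hypothetical; whole-disk vdevs 
get an EFI label whose first slice typically starts at sector 256, which is 
4KB-aligned):

  # with 512-byte logical sectors, a first-sector value that is a
  # multiple of 8 is 4KB-aligned
  prtvtoc /dev/rdsk/c0t0d0s2 | awk '!/^\*/ { print $1, $4, (($4 % 8) ? "UNALIGNED" : "4K-aligned") }'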

> 
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
> 
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
> 
> 
> Is it going to be the case that basic information about how to deal with 
> common scenarios like this is no longer going to be publicly available, and 
> Oracle will simply keep it 'close to the vest', with the relevant information 
> simply available for those who choose to research it themselves, or only 
> available to those with certain levels of support contracts from Oracle?
> 
> To put it another way - does the community that uses ZFS need to fork 'ZFS 
> Best Practices' and 'ZFS Evil Tuning' to ensure that it is reasonably up to 
> date?

ZFS Best Practices and Evil Tuning Guide are not hosted by Oracle.  They are
hosted at the SolarisInternals.com site.
 -- richard

-- 
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] invalid vdev configuration after power failure

2010-10-05 Thread diyanc

Kyle Kakligian  gmail.com> writes:

> I'm not sure why `zfs import` choked on this [typical?] error case,
> but its easy to fix with a very careful dd. I took a different and
> very roundabout approach to recover my data, however, since I'm not
> confident in my 'careful' skills. (after all, where's my backup?)
> Instead, on a linux workstation where I am more cozy, I compiled
> zfs-fuse from the source with a slight modification to ignore labels 2
> and 3. fusermount worked great and I recovered my data without issue.

Hi,

waking up this old thread:

would you mind sharing how you edited zfs-fuse to ignore labels 2 and 3?

thanks,

regards,

diyanc

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Michael DeMan
Hi upfront, and thanks for the valuable information.


On Oct 5, 2010, at 4:12 PM, Peter Jeremy wrote:

>> Another annoying thing with the whole 4K sector size, is what happens
>> when you need to replace drives next year, or the year after?
> 
> About the only mitigation needed is to ensure that any partitioning is
> based on multiples of 4KB.

I agree, but to be quite honest, I have no clue how to do this with ZFS.  It 
seems that it should be something under the regular tuning documentation.  

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide


Is it going to be the case that basic information about how to deal with 
common scenarios like this is no longer going to be publicly available, and 
Oracle will simply keep it 'close to the vest', with the relevant information 
simply available for those who choose to research it themselves, or only 
available to those with certain levels of support contracts from Oracle?

To put it another way - does the community that uses ZFS need to fork 'ZFS Best 
Practices' and 'ZFS Evil Tuning' to ensure that it is reasonably up to date?

Sorry for the somewhat hostile tone in the above, but the changes w/ the merger 
have demoralized a lot of folks, I think.

- Mike




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Peter Jeremy
On 2010-Oct-06 05:59:06 +0800, Michael DeMan  wrote:
>Another annoying thing with the whole 4K sector size, is what happens
>when you need to replace drives next year, or the year after?

About the only mitigation needed is to ensure that any partitioning is
based on multiples of 4KB.

>  Does
>anybody know if there are any vendors that are shipping 4K sector drives
>that have a jumper option to make them 512 size?

This would require a low-level re-format and would significantly
reduce the available space if it was possible at all.

>  WD has a jumper,
>but it is there explicitly to work with WindowsXP, and is not a real way
>to dumb down the drive to 512.

All it does is offset the sector numbers by 1 so that sector 63
becomes physical sector 64 (a multiple of 4KB).
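
(The arithmetic: LBA 63 starts at byte 63 x 512 = 32,256, which is not a 
multiple of 4,096; shifted to LBA 64 it starts at 64 x 512 = 32,768 = 
8 x 4,096, so the classic XP partition start lands on a 4KB boundary.)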

>  I would presume that any vendor that
>is shipping 4K sector size drives now, with a jumper to make it
>'real' 512, would be supporting that over the long run?

I would be very surprised if any vendor shipped a drive that could
be jumpered to "real" 512 bytes.  The best you are going to get is
jumpered to logical 512 bytes and maybe a 1-sector offset (needed
for WindozeXP only).  These jumpers will probably last as long as
the 8GB jumpers that were needed by old BIOS code.  (Eg BIOS boots
using simulated 512-byte sectors and then the OS tells the drive to
switch to native mode).

It's unfortunate that Sun didn't bite the bullet several decades
ago and provide support for block sizes other than 512-bytes
instead of getting custom firmware for their CD drives to make
them provide 512-byte logical blocks for 2KB CD-ROMs.

It's even more idiotic of WD to sell a drive with 4KB sectors but
not provide any way for an OS to identify those drives and perform
4KB aligned I/O.

-- 
Peter Jeremy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Andrew Gabriel

Michael DeMan wrote:
> The WD 1TB 'enterprise' drives still use 512-byte sectors and are safe to
> use; who knows though, maybe they just started shipping with 4K sector
> size as I write this e-mail?
>
> Another annoying thing with the whole 4K sector size is what happens
> when you need to replace drives next year, or the year after?  That
> part has me worried on this whole 4K sector migration thing more than
> what to buy today.  Given the choice, I would prefer to buy 4K sector
> size now, but operating system support is still limited.  Does anybody
> know if there are any vendors that are shipping 4K sector drives that have
> a jumper option to make them 512 size?  WD has a jumper, but it is there
> explicitly to work with WindowsXP, and is not a real way to dumb down
> the drive to 512.  I would presume that any vendor that is shipping 4K
> sector size drives now, with a jumper to make it 'real' 512, would be
> supporting that over the long run?


Changing the sector size (if it's possible at all) would require a
reformat of the drive.

On SCSI disks which support it, you do it by changing the sector size on
the relevant mode select page, and then sending a format-unit command to
make the drive relayout all the sectors.
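
On Linux with sg3_utils, for example, the whole sequence would be a 
one-liner, assuming the target actually accepts the mode change (device 
name hypothetical):

  sg_format --format --size=4096 /dev/sg2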

I've no idea if these 4K sata drives have any such mechanism, but I
would expect they would.

BTW, I've been using a pair of 1TB Hitachi Ultrastars for something like
18 months without any problems at all. Of course, a 1-year-old disk
model is no longer available now. I'm going to have to swap out for
bigger disks in the not too distant future.

--
Andrew Gabriel

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs volume snapshot

2010-10-05 Thread Richard Elling
On Oct 4, 2010, at 8:53 AM, Wei Li wrote:
> Hi All,
> 
> If a ZFS volume is presented to an LDOM guest domain as a whole disk (used as 
> its root disk), does anyone know how to snapshot it?  That is, how do you 
> snapshot a raw ZFS volume (NOTE: no UFS file system is created directly on the 
> ZFS volume in the above case)?  

zfs snapshot poolname/volumen...@snapshotname
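
For example, with hypothetical names, snapshotting the guest's root volume 
and streaming it off-box (note the snapshot is crash-consistent unless the 
guest quiesces its I/O first):

  zfs snapshot mypool/ldom1-root@backup-20101005
  zfs send mypool/ldom1-root@backup-20101005 | ssh backuphost "cat > /backup/ldom1-root.zfs"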
 -- richard

-- 
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Richard Elling
On Oct 5, 2010, at 2:06 PM, Michael DeMan wrote:
> 
> On Oct 5, 2010, at 1:47 PM, Roy Sigurd Karlsbakk wrote:
> 
>>> Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
>>> Hard Drive -Bare Drive
>>> 
>>> are only $129.
>>> 
>>> vs. $89 for the 'regular' black drives.
>>> 
>>> 45% higher price, but it is my understanding that the 'RAID Edition'
>>> ones also are physically constructed for longer life, lower vibration
>>> levels, etc.
>> 
>> Well, here it's about 60% up and for 150 drives, that makes a wee 
>> difference...
>> 
>> Vennlige hilsener / Best regards
>> 
>> roy
> 
> Understood on the 1.6x cost, especially for a quantity of 150 drives.

One service outage will consume far more in person-hours and downtime than this
little bit of money.  Penny-wise == Pound-foolish?
 -- richard

-- 
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Michael DeMan

On Oct 5, 2010, at 2:47 PM, casper@sun.com wrote:
> 
> 
> I've seen several important features when selecting a drive for
> a mirror:
> 
>   TLER (the ability of the drive to timeout a command)
>   sector size (native vs virtual)
>   power use (specifically at home)
>   performance (mostly for work)
>   price
> 
> I've heard scary stories about a mismatch of the native sector size and
> unaligned Solaris partitions (4K sectors, unaligned cylinder).
> 

Yes, avoiding the 4K sector sizes is a huge issue right now too - another item 
I forgot in the list of reasons to absolutely avoid those WD 'green' drives.

Three good reasons to avoid WD 'green' drives for ZFS...

- TLER issues
- IntelliPower head park issues
- 4K sector size issues

...they are an absolute nightmare.  

The WD 1TB 'enterprise' drives still use 512-byte sectors and are safe to use; 
who knows though, maybe they just started shipping with 4K sector size as I 
write this e-mail?

Another annoying thing with the whole 4K sector size is what happens when you 
need to replace drives next year, or the year after?  That part has me worried 
on this whole 4K sector migration thing more than what to buy today.  Given the 
choice, I would prefer to buy 4K sector size now, but operating system support 
is still limited.  Does anybody know if there are any vendors that are shipping 
4K sector drives that have a jumper option to make them 512 size?  WD has a 
jumper, but it is there explicitly to work with WindowsXP, and is not a real way 
to dumb down the drive to 512.  I would presume that any vendor that is 
shipping 4K sector size drives now, with a jumper to make it 'real' 512, would 
be supporting that over the long run?

I would be interested, and probably others would be too, in what the original 
poster finally decides on this.

- Mike


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side

2010-10-05 Thread Nicolas Williams
On Mon, Oct 04, 2010 at 02:28:18PM -0400, Miles Nordin wrote:
> > "nw" == Nicolas Williams  writes:
> 
> nw> I would think that 777 would invite chmods.  I think you are
> nw> handwaving.
> 
> it is how AFS worked.  Since no file on a normal unix box besides /tmp

But would the AFS experience translate into double plus happiness for us?

> ever had 777 it would send a SIGWTF to any AFS-unaware graybeards that
> stumbled onto the directory, alerting them that they needed to go
> learn something and come back.

A signal?!  How would that work when the entity doing a chmod is on a
remote NFS client?

> I understand that everything:everyone on windows doesn't send SIGWTF,
> but 777 on unix for AFS sites it did.  You realize it's not
> hypothetical, right?  AFS was actually implemented, widely, and
> there's experience with it.

Yes... but I'm skeptical about the universality of that experience's
applicability.  Specifically: I don't think it could work for us.

AFS developers had fewer constraints than Solaris developers.  It is no
surprise that they were able to find happy solutions to these sorts of
problems long ago.

OpenAFS has a Windows native client and an Explorer shell extension
(which surely handles chmod?).  However, we don't have the luxury of
telling customers to install third-party (possibly ours, whatever)
Windows native clients for protocols other than SMB, nor can we tell
them to install Explorer shell extensions.  Solaris' SMB server needs to
work out of the box and without the limitations implied by having a
separate ACL and mode (well, we have that now, but we always compute a
new mode from the new ACL when ACLs are changed).

> If they failed to act on the SIGWTF, the overall system enforced the
> tighter of the unix permissions and the AFS ACL, so it fails closed.
> The current system fails open.

The current system fails closed (by discarding the ACL and replacing it
with a new one based entirely on the new mode).

> Also AFS did no translation between unix permissions and AFS ACL's so
> it was easy to undo such a mistake when it happened: double-check the
> AFS ACL is not too wide on the directories where you see unix people
> mucking around in case the muckers were responding to a real problem,
> then set the unix modes back to 777.

Right, but with SMB in the picture we don't have this luxury.  You seem
unwilling to accept that one constraint.

> nw> When chmod()ing an object... ZFS would search for the most
> nw> specific matching file in .zfs/ACLs/ and, if found, would
> nw> replace the chmod()ed object's ACL with that of the
> nw> .zfs/ACLs/... file found.  The .inherit suffix would indicate
> nw> that if the chmod() target's parent directory has inherittable
> nw> ACEs then they will be groupmasked and added to the ACEs from
> nw> the .zfs/ACLs/... file to produce a final ACL.
> 
> This proposal, like the current situation, seems to make chmod
> configurable to act like ``not chmod'' which IMHO is exactly what's
> unpopular about the current regime.  You've tried to leave chmod

To some degree, yes.  It's different though, and might conceivably be
acceptable, though I don't think it will be (I was illustrating
potential alternatives).

But I really like one thing about it: most apps shouldn't care about ACL
contents, they should care about context-specific permissions changes.
In a directory containing shared documents the intention should
typically be "share with all these people", while in home directories
the intention should typically be "don't share with anyone" (but this
will vary; e.g., ~/.ssh/authorized_keys needs to be reachable and
readable by everyone).  Add in executable versus non-executable, and
you have a pretty complete picture -- just a few "named" ACLs at most,
per-dataset.

If we could replace chmod(2) with a version that takes actual names for
pre-configured ACLs, _that_ would be great.  But we can't for the same
reason that we can't remove chmod(2): it's a widely used interface.

> active on windows trees and guess at the intent of whoever invokes
> chmod, providing no warning that you're secretly doing
> ``approximately'' what he asked for rather than exactly.  Maybe that
> flies on Windows, but on Unix people expect more precision: thorough
> abstractions that survive corner cases and have good exception
> handling.

Look, mode is a pretty lame hammer -- ACLs are far, far more granular --
but it's a hammer that many apps use.  Given the lack of granularity of
modes, I think an approximation of intent is the best we can do.

Consider: both aclmode=discard and aclmode=groupmask behaviors can be
considered to be what the user intended.  How do you know if the user
intended for other users and groups to retain access limited to the
group bits of a new mode?  You can't, not without asking the user.  So
aclmode=discard is certainly an approximation of user intent, and so
aclmode=groupmask must be considered an approximation

Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Casper . Dik


>My immediate reaction to this is "time to avoid WD drives for a while";
>until things shake out and we know what's what reliably.
>
>But, um, what do we know about say the Seagate Barracuda 7200.12 ($70),
>the SAMSUNG Spinpoint F3 1TB ($75), or the HITACHI Deskstar 1TB 3.5"
>($70)?


I've seen several important features when selecting a drive for
a mirror:

TLER (the ability of the drive to timeout a command)
sector size (native vs virtual)
power use (specifically at home)
performance (mostly for work)
price

I've heard scary stories about a mismatch of the native sector size and
unaligned Solaris partitions (4K sectors, unaligned cylinder).

I was pretty happy with the WD drives (except for the one with a seriously
broken cache) but I see the reasons not to pick WD drives above the 1TB
range.

Are people now using 4K native sectors and formatting them with 4K sectors 
in (Open)Solaris?

Performance sucks when you use unaligned accesses, but is performance good 
when the accesses are aligned?

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Tim Cook
On Tue, Oct 5, 2010 at 3:47 PM, Roy Sigurd Karlsbakk wrote:

> > Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
> > Hard Drive -Bare Drive
> >
> > are only $129.
> >
> > vs. $89 for the 'regular' black drives.
> >
> > 45% higher price, but it is my understanding that the 'RAID Edition'
> > ones also are physically constructed for longer life, lower vibration
> > levels, etc.
>
> Well, here it's about 60% up and for 150 drives, that makes a wee
> difference...
>
> Vennlige hilsener / Best regards
>
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685
> r...@karlsbakk.net
> http://blogg.karlsbakk.net/
>
>

If you're spending upwards of $30,000 on a storage system, you probably
shouldn't skimp on the most important component.  You might as well be
complaining that ECC ram costs more.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread David Dyer-Bennet

On Tue, October 5, 2010 15:30, Roy Sigurd Karlsbakk wrote:


> I just discovered WD Black drives are rumored to no longer allow enabling TLER.
> Does anyone know how much performance impact the lack of TLER might have
> on a large pool? Choosing Enterprise drives will cost about 60% more, and
> on a large install, that means a lot of money...

My immediate reaction to this is "time to avoid WD drives for a while";
until things shake out and we know what's what reliably.

But, um, what do we know about say the Seagate Barracuda 7200.12 ($70),
the SAMSUNG Spinpoint F3 1TB ($75), or the HITACHI Deskstar 1TB 3.5"
($70)?

This is not a completely theoretical question to me; it's getting on
towards time to at least consider replacing my oldest mirrored pair; those
are 400GB Seagates, I think, dating from 2006.  I'd want something at least
twice as big (to make the space upgrade worthwhile), and I'm expecting to
buy three of them rather than just two because I think it's time to add a
hot spare to the system (currently 3 pairs of data disks, and I've got two
more bays; I think a hot spare is a better use for them than a fourth
pair; safety of the data is very important, performance is adequate, and I
need a modest capacity upgrade, but the whole pool is currently 1.2TB
usable, not large).

On the third hand, there's the Barracuda 7200.11 1.5TB for only $75, which
is a really small price increment for a big space increment.

The WD RE3 1TB is $130 (all these prices are from Newegg just now). 
That's very close to TWICE the price of the competing 1TB drives.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Michael DeMan

On Oct 5, 2010, at 1:47 PM, Roy Sigurd Karlsbakk wrote:

>> Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
>> Hard Drive -Bare Drive
>> 
>> are only $129.
>> 
>> vs. $89 for the 'regular' black drives.
>> 
>> 45% higher price, but it is my understanding that the 'RAID Edition'
>> ones also are physically constructed for longer life, lower vibration
>> levels, etc.
> 
> Well, here it's about 60% up and for 150 drives, that makes a wee 
> difference...
> 
> Vennlige hilsener / Best regards
> 
> roy

Understood on the 1.6x cost, especially for a quantity of 150 drives.

I think (and if I am wrong, somebody else correct me) that if you are using 
commodity controllers, which seem to generally be fine for ZFS, then if a drive 
times out trying to constantly re-read a bad sector, it could stall out the 
read on the entire pool overall.  On the other hand, if the drives are exported 
as JBOD from a RAID controller, I would think the RAID controller itself would 
just mark the drive as bad and offline it quickly based on its own internal 
algorithms. 

The above would also be relevant to the anticipated usage.  For instance, if it 
is some sort of backup machine, then delays due to some reads stalling without 
TLER are perhaps not a big deal.  If it is for more of a front-line 
production use, that could be intolerable.
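
For what it's worth, newer smartmontools builds can query and, on drives 
that actually expose SCT Error Recovery Control, set the timers, e.g. 
(device name hypothetical):

  smartctl -l scterc /dev/sda          # report the current read/write ERC timers
  smartctl -l scterc,70,70 /dev/sda    # set both to 7.0 seconds (units of 100 ms)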
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Roy Sigurd Karlsbakk
> Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal
> Hard Drive -Bare Drive
> 
> are only $129.
> 
> vs. $89 for the 'regular' black drives.
> 
> 45% higher price, but it is my understanding that the 'RAID Edition'
> ones also are physically constructed for longer life, lower vibration
> levels, etc.

Well, here it's about 60% up and for 150 drives, that makes a wee difference...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases adequate and relevant synonyms exist 
in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] TLER and ZFS

2010-10-05 Thread Michael DeMan
I'm not sure on the TLER issues by themselves, but after the nightmares I have 
gone through dealing with the 'green drives', which have both the TLER issue 
and the IntelliPower head parking issues, I would just stay away from it all 
entirely and pay extra for the 'RAID Edition' drives.

Just out of curiosity, I took a peek at newegg.

Western Digital RE3 WD1002FBYS 1TB 7200 RPM SATA 3.0Gb/s 3.5" Internal Hard 
Drive -Bare Drive  

are only $129.

vs. $89 for the 'regular' black drives.

45% higher price, but it is my understanding that the 'RAID Edition' ones also 
are physically constructed for longer life, lower vibration levels, etc.


On Oct 5, 2010, at 1:30 PM, Roy Sigurd Karlsbakk wrote:

> Hi all
> 
> I just discovered WD Black drives are rumored to no longer allow enabling TLER. 
> Does anyone know how much performance impact the lack of TLER might have on a 
> large pool? Choosing Enterprise drives will cost about 60% more, and on a 
> large install, that means a lot of money...
> 
> Vennlige hilsener / Best regards
> 
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685
> r...@karlsbakk.net
> http://blogg.karlsbakk.net/
> --
> In all pedagogy it is essential that the curriculum be presented intelligibly. 
> It is an elementary imperative for all pedagogues to avoid excessive use of 
> idioms of foreign origin. In most cases adequate and relevant synonyms exist 
> in Norwegian.
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] TLER and ZFS

2010-10-05 Thread Roy Sigurd Karlsbakk
Hi all

I just discovered WD Black drives are rumored to no longer allow enabling TLER. Does 
anyone know how much performance impact the lack of TLER might have on a large 
pool? Choosing Enterprise drives will cost about 60% more, and on a large 
install, that means a lot of money...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases adequate and relevant synonyms exist 
in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS crypto bug status change

2010-10-05 Thread Miles Nordin
> "dm" == David Magda  writes:

dm> Thank you Mr. Moffat et al. Hopefully the rest of us will be
dm> able to bang on this at some point. :)

Thanks for the heads-up on the gossip.  

This etiquette seems weird, though: I don't thank Microsoft for
releasing a new version of Word.  I'll postpone my thanks for 2 years
until the source is released, though by then who knows if I'll still
be using ZFS at all.

Maybe more appropriate would be: congrats on finally finishing your
seven-year project, Darren!  must be a huge relief.

I'm glad it wasn't my project, though.  If I were in Darren's place
I'd have signed on to work for an open-source company, spent seven
years of my life working on something, delaying it and pushing hard to
make it a generation beyond other filesystem crypto, and then when I'm
finally done, ...  

That's me, though.  I shouldn't speculate on someone else's situation.
Maybe he signed on under different circumstances, or delayed for
different reasons than feature-ambition, or cares about different
things than I do.  I only mean to make an example of how politics,
featuresets, and IT planning interact to make an ecosystem that's got
more complicated implications than just a bulleted list of features
and a license with an OSI logo.


-- 
READ CAREFULLY. By reading this fortune, you agree, on behalf of your employer,
to release me from all obligations and waivers arising from any and all
NON-NEGOTIATED  agreements, licenses, terms-of-service, shrinkwrap, clickwrap,
browsewrap, confidentiality, non-disclosure, non-compete and acceptable use
policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its
partners, licensors, agents and assigns, in perpetuity, without prejudice to my
ongoing rights and privileges. You further represent that you have the
authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Long import due to spares.

2010-10-05 Thread Jason J. W. Williams
Just for history as to why Fishworks was running on this box...we were
in the beta program and have upgraded along the way. This box is an
X4240 with 16x 146GB disks running the Feb 2010 release of FW with
de-dupe.

We were getting ready to re-purpose the box and getting our data off.
We then deleted a filesystem that was using de-duplication and the box
suddenly went into a freeze and the pool had activity like crazy.

After several failed attempts to recover the box to usable state (days
of importing failed), we reloaded the boot drives with Nexenta 3.0
(b134) (which was our goal anyway). When we tried to import this pool
again, after 24 hours the pool finally imported but with the error
that the two spares were FAULTED with too many errors.

Controller is an LSI 1068E-IR

Normally, I'd believe the drive was dead, except: both spares? Could
this be related to the de-dupe FS being deleted?

-J
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] moving newly created pool to alternate host

2010-10-05 Thread Cindy Swearingen

Hi Sridhar,

After a zpool split operation, you can access the newly created
pool by using the zpool import command.

If the LUNs from mypool are available on host1 and host2, you
should be able to import mypool_snap from host2. After mypool_snap
is imported, it will be available for backups, but not in read-only
mode.
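
In other words, something like this (pool name from your example):

   # on host2, after the split completes on host1
   zpool import mypool_snap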

It is important that data from these pools is not accessed from the two
different hosts at the same time.

An upcoming feature is a read-only import that might be helpful
in your environment.

Thanks,

Cindy

On 10/05/10 07:41, sridhar surampudi wrote:

Hi,

If I have the below kind of configuration (as an example):

c1t1d1 and c2t2d2 are two LUNs visible (unmasked) to both host1 and host2. 


Created a pool mypool as below

   zpool create mypool mirror c1t1d1  c2t2d2

Now I did zpool split
  zpool split mypool mypool_snap

Once I run zpool split, is there a way I can move the newly created 
mypool_snap to the other host, i.e. host2, and make it visible there, 
so that I am able to access all file systems and files in read-only mode 
for backup?


Thanks & Regards,
sridhar.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Migrating to an aclmode-less world

2010-10-05 Thread Nicolas Williams
On Mon, Oct 04, 2010 at 04:30:05PM -0600, Cindy Swearingen wrote:
> Hi Simon,
> 
> I don't think you will see much difference for these reasons:
> 
> 1. The CIFS server ignores the aclinherit/aclmode properties.

Because CIFS/SMB has no chmod operation :)

> 2. Your aclinherit=passthrough setting overrides the aclmode
> property anyway.

aclinherit=passthrough-x is a better choice.

Also, aclinherit doesn't override aclmode.  aclinherit applies on create
and aclmode used to apply on chmod.
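
For example (dataset name hypothetical):

  zfs set aclinherit=passthrough-x tank/export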

> 3. The only difference is that if you use chmod on these files
> to manually change the permissions, you will lose the ACL values.

Right.  That only happens from NFSv3 clients [that don't instead edit
the POSIX Draft ACL translated from the ZFS ACL], from non-Windows NFSv4
clients [that don't instead edit the ACL], and from local applications
[that don't instead edit the ZFS ACL].

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] moving newly created pool to alternate host

2010-10-05 Thread sridhar surampudi
Hi,

If I have the below kind of configuration (as an example):

c1t1d1 and c2t2d2 are two LUNs visible (unmasked) to both host1 and host2. 

Created a pool mypool as below

   zpool create mypool mirror c1t1d1  c2t2d2

Now I did zpool split
  zpool split mypool mypool_snap

Once I run zpool split, is there a way I can move the newly created 
mypool_snap to the other host, i.e. host2, and make it visible there, 
so that I am able to access all file systems and files in read-only mode 
for backup?


Thanks & Regards,
sridhar.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Migrating to an aclmode-less world

2010-10-05 Thread Simon Breden
Hi Cindy,

That sounds very reassuring.

Thanks a lot.

Simon
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resilver endlessly restarting at completion

2010-10-05 Thread Tuomas Leikola
This seems to have been a false alarm, sorry for that. As soon as I started
paying attention (logging zpool status, peeking around with zdb & mdb) the
resilver didn't restart unless provoked. A cleartext log would have been
nice ("restarted due to c11t7 becoming online").

A slight problem I can see is that the resilver always restarts if a device is
added to the array. In my case devices were absent for a short period (some
SATA failure that corrected itself by running cfgadm -c disconnect &
connect) and it would have been beneficial to let the resilver run to completion
and restart it only after that, to resilver the missing data on the added device.
ZFS does have some intelligence in those cases, in that not all data is
resilvered, only blocks that were born after the outage.

Also, as I had a spare in the array, that kicked in, which probably was not
what I would have wanted, as that triggered a full resilver, and not a
partial one. After the fact I could not kick the spare out, and could not
make the resilvering process forget about doing a full resilver. Plus now I
have to replace it back out and make it a cold spare.
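
(In principle detaching the spare, e.g. "zpool detach tank c9d1", should 
return it to the spares list; here that didn't take while the resilver kept 
restarting.)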

But all is well that ends well... mostly. Devices seem to be still dropping
from the SATA bus randomly. Maybe I'll cough together a report and post to
storage-discuss.

On Wed, Sep 29, 2010 at 8:13 PM, Tuomas Leikola wrote:

> The endless resilver problem still persists on OI b147. Restarts when it
> should complete.
>
> I see no other solution than to copy the data to safety and recreate the
> array. Any hints would be appreciated as that takes days unless i can stop
> or pause the resilvering.
>
>
> On Mon, Sep 27, 2010 at 1:13 PM, Tuomas Leikola 
> wrote:
>
>> Hi!
>>
>> My home server had some disk outages due to flaky cabling and whatnot, and
>> started resilvering to a spare disk. During this another disk or two
>> dropped, and were reinserted into the array. So no devices were actually
>> lost, they just were intermittently away for a while each.
>>
>> The situation is currently as follows:
>>   pool: tank
>>  state: ONLINE
>> status: One or more devices has experienced an unrecoverable error.  An
>> attempt was made to correct the error.  Applications are
>> unaffected.
>> action: Determine if the device needs to be replaced, and clear the errors
>> using 'zpool clear' or replace the device with 'zpool replace'.
>>see: http://www.sun.com/msg/ZFS-8000-9P
>>  scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go
>> config:
>>
>> NAME   STATE READ WRITE CKSUM
>> tank   ONLINE   0 0 0
>>   raidz1-0 ONLINE   0 0 0
>> c11t1d0p0  ONLINE   0 0 0
>> c11t2d0ONLINE   0 0 5
>> c11t6d0p0  ONLINE   0 0 0
>> spare-3ONLINE   0 0 0
>>   c11t3d0p0ONLINE   0 0 0  106M
>> resilvered
>>   c9d1 ONLINE   0 0 0  104G
>> resilvered
>> c11t4d0p0  ONLINE   0 0 0
>> c11t0d0p0  ONLINE   0 0 0
>> c11t5d0p0  ONLINE   0 0 0
>> c11t7d0p0  ONLINE   0 0 0  93.6G
>> resilvered
>>   raidz1-2 ONLINE   0 0 0
>> c6t2d0 ONLINE   0 0 0
>> c6t3d0 ONLINE   0 0 0
>> c6t4d0 ONLINE   0 0 0  2.50K
>> resilvered
>> c6t5d0 ONLINE   0 0 0
>> c6t6d0 ONLINE   0 0 0
>> c6t7d0 ONLINE   0 0 0
>> c6t1d0 ONLINE   0 0 1
>> logs
>>   /dev/zvol/dsk/rpool/log  ONLINE   0 0 0
>> cache
>>   c6t0d0p0 ONLINE   0 0 0
>> spares
>>   c9d1 INUSE currently in use
>>
>> errors: No known data errors
>>
>> And this has been going on for a week now, always restarting when it
>> should complete.
>>
>> The questions in my mind atm:
>>
>> 1. How can i determine the cause for each resilver? Is there a log?
>>
>> 2. Why does it resilver the same data over and over, and not just the
>> changed bits?
>>
>> 3. Can i force remove c9d1 as it is no longer needed but c11t3 can be
>> resilvered instead?
>>
>> I'm running opensolaris 134, but the event originally happened on 111b. I
>> upgraded and tried quiescing snapshots and IO, none of which helped.
>>
>> I've already ordered some new hardware to recreate this entire array as
>> raidz2 among other things, but there's about a week of time when I can run
>> debuggers and traces if instructed to.
>>
>> - Tuo