Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Casper . Dik


On 27-Jan-07, at 10:15 PM, Anantha N. Srirama wrote:

 ... ZFS will not stop alpha-particle-induced memory corruption
 after data has been received by the server and verified to be correct.
 Sadly, I've been hit with that as well.


My brother points out that you can use a rad-hardened CPU. ECC should
take care of the RAM. :-)

I wonder when the former will become data centre best practice?

Alpha particles which hit CPUs must have their origin inside said CPU.

(Alpha particles do not penetrate skin or paper, let alone system cases
or CPU packaging.)

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Erik Trimble

[EMAIL PROTECTED] wrote:

On 27-Jan-07, at 10:15 PM, Anantha N. Srirama wrote:


... ZFS will not stop alpha-particle-induced memory corruption
after data has been received by the server and verified to be correct.
Sadly, I've been hit with that as well.
  
My brother points out that you can use a rad-hardened CPU. ECC should
take care of the RAM. :-)


I wonder when the former will become data centre best practice?



Alpha particles which hit CPUs must have their origin inside said CPU.

(Alpha particles do not penetrate skin or paper, let alone system cases
or CPU packaging.)

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  


But, but, but, they'll get my brain without this nice shiny aluminum cap 
I made!


Cosmic (aka Gamma) Radiation, folks.


And, I think we've jumped the shark.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Joerg Schilling
[EMAIL PROTECTED] wrote:

 Alpha particles which hit CPUs must have their origin inside said CPU.

 (Alpha particles do not penetrate skin or paper, let alone system cases
 or CPU packaging.)

Gamma rays cannot be shielded against in any sensible way.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
   [EMAIL PROTECTED](uni)  
   [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: can I use zfs on just a partition?

2007-01-28 Thread roland
 Take note though, that giving zfs the entire disk gives a possible
 performance win, as zfs will only enable the write cache for the disk
 if it is given the entire disk.

really?
why is this?
is this tunable somehow/somewhere? can i enable the write cache if only using a
dedicated partition?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: can I use zfs on just a partition?

2007-01-28 Thread Casper . Dik

 Take note though, that giving zfs the entire disk gives a possible
 performance win, as zfs will only enable the write cache for the disk
 if it is given the entire disk.

really?
why is this?

In the old days, Sun never enabled the write cache on devices because
of reliability issues.  (Sun SCSI disks were shipped with caches disabled,
and the OS never bothered to change the caching behaviour; on SCSI drives
the setting is persistent.)

On ATA drives, the drive cache was specifically disabled (there the setting
is not persistent, and drives default to write cache on).

This behaviour was changed under competitive pressure for SATA disks;
they now default to write cache on, set using the sata:sata_write_cache
variable.

The change came about with ZFS and the addition of a mechanism to
flush the write cache (ZFS needs this to guarantee transactional
safety).

is this tunable somehow/somewhere? can i enable the write cache if only using a
dedicated partition?

It does put the additional data at somewhat of a risk; not really
for swap, but perhaps not nice for UFS.
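
For anyone who wants to experiment anyway, a minimal sketch of the two
knobs mentioned above (assuming the Solaris sata(7D) framework; the usual
data-at-risk caveats apply):

    # force the SATA write cache on or off globally (add to /etc/system
    # and reboot; 1 = cache on, 0 = cache off):
    set sata:sata_write_cache = 1

    # per drive, format(1M) in expert mode exposes a cache menu on SCSI
    # targets (SATA support varies by driver):
    #   format -e  ->  cache  ->  write_cache  ->  enable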

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Toby Thain


On 28-Jan-07, at 7:59 AM, [EMAIL PROTECTED] wrote:





On 27-Jan-07, at 10:15 PM, Anantha N. Srirama wrote:


... ZFS will not stop alpha-particle-induced memory corruption
after data has been received by the server and verified to be correct.
Sadly, I've been hit with that as well.



My brother points out that you can use a rad-hardened CPU. ECC should
take care of the RAM. :-)

I wonder when the former will become data centre best practice?


Alpha particles which hit CPUs must have their origin inside said  
CPU.


(Alpha particles do not penetrate skin or paper, let alone system cases
or CPU packaging.)


Thanks. But what about cosmic rays?
--T



Casper


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Casper . Dik


On 28-Jan-07, at 7:59 AM, [EMAIL PROTECTED] wrote:



 On 27-Jan-07, at 10:15 PM, Anantha N. Srirama wrote:

 ... ZFS will not stop alpha-particle-induced memory corruption
 after data has been received by the server and verified to be correct.
 Sadly, I've been hit with that as well.


 My brother points out that you can use a rad-hardened CPU. ECC should
 take care of the RAM. :-)

 I wonder when the former will become data centre best practice?

 Alpha particles which hit CPUs must have their origin inside said  
 CPU.

 (Alpha particles do not penetrate skin or paper, let alone system cases
 or CPU packaging.)

Thanks. But what about cosmic rays?


I was just in pedantic mode; "cosmic rays" is the umbrella term covering
all the different particles, including alpha, beta and gamma rays.

Alpha rays don't reach us from the cosmos; they are caught
long before they can do any harm.  Ditto beta rays.  Both carry
an electrical charge, which makes passing through magnetic fields or
through materials difficult.  Both do occur freely, but they are commonly
produced by the slow radioactive decay of our natural environment.

Gamma rays are high-energy photons; they are not captured by
magnetic fields (such as those created by the charged particles in
atoms: electrons, protons).  They need to take a direct hit before
they're stopped, and they can only be stopped by dense materials, such
as lead.  Unfortunately, naturally occurring lead is polluted by
polonium and uranium and is an alpha/beta source in its own right.
That's why 100-year-old lead from roofs is worth more money than new
lead: its radioisotopes have been depleted.

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Anantha N. Srirama
You're right that storage-level snapshots are filesystem agnostic. I'm not sure
why you believe you won't be able to restore individual files by using a NetApp
snapshot. In the case of ZFS you'd take a periodic snapshot and use it to
restore files; in the case of NetApp you can do the same (of course you have
the additional step of mounting the new snapshot volume). Is this convenience
tipping the scales for you to pursue ZFS?
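
(For completeness, the ZFS side of that restore is just a path lookup;
a sketch with hypothetical names:

    # take the periodic snapshot:
    zfs snapshot tank/data@2007-01-27
    # files in it are visible read-only under the hidden .zfs directory:
    cp /tank/data/.zfs/snapshot/2007-01-27/some/file /tank/data/some/file

No extra volume needs to be mounted.)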
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] bug id 6381203

2007-01-28 Thread Leon Koll
Hello,
what is the status of the bug 6381203 fix in S10 u3?
(deadlock due to i/o while assigning (tc_lock held))

Was it integrated? Is there a patch?

Thanks,
[i]-- leon[/i]
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs rewrite?

2007-01-28 Thread Pawel Jakub Dawidek
On Fri, Jan 26, 2007 at 06:08:50PM -0800, Darren Dunham wrote:
  What do you guys think about implementing a 'zfs/zpool rewrite' command?
  It'll read every block older than the date when the command was executed
  and write it again (using the standard ZFS COW mechanism, similar to how
  resilvering works, but the data is read from the same disk it is written
  to).
 
 #1 How do you control I/O overhead?

The same way it is handled for scrub and resilver.

 #2 Snapshot blocks are never rewritten at the moment.  Most of your
 suggestions seem to imply working on the live data, but doing that
 for snapshots as well might be tricky.

Good point, see below.

  3. I created a file system with a huge amount of data, where most of the
  data is read-only. I changed my server from an intel to a sparc64 machine.
  Adaptive endianness only changes byte order to native on write, and because
  the file system is mostly read-only, it'll need to byte-swap all the time.
  And here comes 'zfs rewrite'!
 
 It's only the metadata that is modified anyway, not the file data.  I
 would hope that this could be done more easily than a full tree rewrite
 (and again the issue with snapshots).  Also, the overhead there probably
 isn't going to be very high (since the metadata will be cached in most
 cases).  

Agreed. Probably in this case there should be a rewrite-only-metadata
mode. I agree the overhead is probably not high, but on the other hand,
I'm quite sure there are workloads that will see the difference, e.g.
'find / -name something'.

 Other than that, I'm guessing something like this will be necessary to
 implement disk evacuation/removal.  If you have to rewrite data from one
 disk to elsewhere in the pool, then rewriting the entire tree shouldn't
 be much harder.

How did I forget about this one? :) That's right. I believe ZFS will gain
such an ability at some point, and rewrite functionality fits very nicely
here: mark the disk/mirror/raid-z as no-more-writes and start the rewrite
process (probably limited to this entity only). To implement such
functionality there also has to be a way to migrate snapshot data, so
sooner or later there will be a need for moving snapshot blocks.
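
Until something like that exists, the closest userland approximation I
can think of is to copy every file in place; a rough sketch only
(hypothetical path; not atomic, and no help for snapshot blocks):

    find /tank/data -type f | while read f; do
        cp -p "$f" "$f.rw.$$" && mv "$f.rw.$$" "$f"
    done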

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] bug id 6381203

2007-01-28 Thread Neil Perrin

Hi Leon,

This was fixed in March 2006, and is in S10_U2.

Neil.

Leon Koll wrote On 01/28/07 08:58,:

Hello,
what is the status of the bug 6381203 fix in S10 u3?
(deadlock due to i/o while assigning (tc_lock held))

Was it integrated? Is there a patch?

Thanks,
[i]-- leon[/i]
 
 
This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: bug id 6381203

2007-01-28 Thread Leon Koll
Too bad... I was in a situation where every zpool ... command was stuck (as
well as the df command), and my hope was that it's a known/fixed bug. I could
not save the core files, and I'm not sure I can reproduce the bug.

Thank you for quick reply,
[i]-- leon[/i]
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Gary Mills
On Sat, Jan 27, 2007 at 04:15:30PM -0800, Anantha N. Srirama wrote:
 
 I'm not sure what benefit you foresee by running a COW filesystem
 (ZFS) on a COW array (NetApp).

The application requires a filesystem with POSIX semantics.  My first
choice would be NFS from the Netapp, but this won't work in this case.
My next choice is an iSCSI LUN with a local filesystem on it.  I'm
assuming that since ZFS is more modern than UFS, it would be the
better of the two, even though the JBOD-oriented features of ZFS will
not be used.

ZFS does seem to be more manageable than UFS.  Filesystems that draw
their space from a common pool are ideal for our application.  The
ability to expand a pool by adding another device, or by extending an
existing device, is also ideal.  Another feature is snapshots, which
I've mentioned earlier.
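
For example, the operations I'm relying on look like this (a sketch,
hypothetical device names):

    zpool create tank c4t0d0        # pool on a single iSCSI LUN
    zfs create tank/app             # filesystems draw from the shared pool
    zpool add tank c4t1d0           # expand the pool with another LUN
    zfs set quota=100g tank/app     # cap an individual filesystem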

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Gary Mills
On Sun, Jan 28, 2007 at 06:19:25AM -0800, Anantha N. Srirama wrote:
 
 You're right that storage-level snapshots are filesystem agnostic. I'm
 not sure why you believe you won't be able to restore individual files
 by using a NetApp snapshot. In the case of ZFS you'd take a periodic
 snapshot and use it to restore files; in the case of NetApp you can do
 the same (of course you have the additional step of mounting the new
 snapshot volume). Is this convenience tipping the scales for you to
 pursue ZFS?

Yes, we'd run out of LUNs.  We're talking about two weeks of daily
snapshots on six filesystems.  Each snapshot on the Netapp would
become a separate iSCSI LUN.  They need to be mounted on the server so
that our admins can locate and restore missing files when necessary.
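
On the ZFS side the same two-week window costs no LUNs at all; a daily
cron job is enough (a sketch, hypothetical filesystem names; computing
the expiry date is left out since Solaris date(1) has no GNU-style date
arithmetic):

    today=`date +%Y-%m-%d`
    for fs in tank/fs1 tank/fs2 tank/fs3; do
        zfs snapshot $fs@$today
        # then: zfs destroy $fs@<the date that fell out of the window>
    done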

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs rewrite?

2007-01-28 Thread Frank Cusack
On January 28, 2007 4:59:48 PM +0100 Pawel Jakub Dawidek [EMAIL PROTECTED] 
wrote:

On Fri, Jan 26, 2007 at 06:08:50PM -0800, Darren Dunham wrote:

 3. I created a file system with a huge amount of data, where most of the
 data is read-only. I changed my server from an intel to a sparc64 machine.
 Adaptive endianness only changes byte order to native on write, and
 because the file system is mostly read-only, it'll need to byte-swap all
 the time. And here comes 'zfs rewrite'!

It's only the metadata that is modified anyway, not the file data.  I
would hope that this could be done more easily than a full tree rewrite
(and again the issue with snapshots).  Also, the overhead there probably
isn't going to be very high (since the metadata will be cached in most
cases).


Agreed. Probably in this case there should be a rewrite-only-metadata
mode. I agree the overhead is probably not high, but on the other hand,
I'm quite sure there are workloads that will see the difference, e.g.
'find / -name something'.


I'd imagine even for that it wouldn't matter.  The I/O time will dwarf
any time spent byte-swapping.  Easily tested though.  Make sure you
set atime=off so that your find isn't causing write I/O.
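
The test is a one-liner (a sketch, hypothetical pool name):

    zfs set atime=off tank
    ptime find /tank -name something > /dev/null   # time the tree walk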

-frank
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: high density SAS

2007-01-28 Thread Richard Elling

Anton B. Rang wrote:

How badly can you mess up a JBOD?

Two words: vibration, cooling.


Three more: power, signal quality.

I've seen even individual drive cases with bad enough signal quality to cause 
bit errors.


Yep, if I crank up the amp to over 1kW, then on some frequencies I see lots
of noise on USB links, as an example.  You may have noticed that many vendors
are now making USB cables with toroids built in.  There is still some black
art involved in eliminating noise problems.  However, one easy way to do it
is well proven in the PCB design space.  We leverage that with Thumper, which
has no internal disk cables.  In fact, you should notice that many Sun designs
have few, if any, internal cables.  Cables are a source of reliability issues,
so they are best when they don't exist.

[waxing nostalgic]
When the designers were planning the Shinkansen (the Japanese high-speed
train system), they had 150 years of train accident data to study.  Not
surprisingly, most train accidents occurred at crossings.  To help avoid
accidents, they eliminated crossings.
Good design is a good thing.
 -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] data wanted: disk kstats

2007-01-28 Thread Robert Milkowski
Hello Richard,

Friday, January 26, 2007, 11:36:07 PM, you wrote:

RE We've been talking a lot recently about failure rates and types of
RE failures.  As you may know, I do look at field data and generally don't
RE ask the group for more data.  But this time, for various reasons (I
RE might have found a bug or deficiency) I'm soliciting for more data at
RE large.

RE What I'd like to gather is the error rates per bytes transferred. This
RE data is collected in kstats, but is reset when you reboot.  One of the
RE features of my vast collection of field data is that it is often collected
RE rather soon after a reboot. Thus, there aren't very many bytes transferred
RE yet, and the corresponding error rates tend to be small (often 0).  A
RE perfect collection would be from a machine connected to lots of busy disks
RE which has been up for a very long time.

RE Can you help?  It is real simple.  Just email me the output of:

I've sent you off list.

Will you make the aggregate results (total statistics, not site-specific)
publicly available here?
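
For anyone else who wants to contribute, per-disk error and byte counters
can be pulled with something like the following (a sketch; kstat module
names vary by driver, and it may not be exactly the output Richard asked
for):

    iostat -En                          # soft/hard/transport error counts
    kstat -p sd:::nread sd:::nwritten   # bytes transferred per sd instance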

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] zfs rewrite?

2007-01-28 Thread Robert Milkowski
Hello Jeff,

Saturday, January 27, 2007, 8:27:09 AM, you wrote:


JB You're all correct.  File data is never byte-swapped.  Most metadata
JB needs to be byte-swapped, but it's generally only 1-2% of your space.
JB So the overhead shouldn't be significant, even if you never rewrite.

I remember that some time ago Sun touted ZFS as having some interesting new
technology to deal with endianness, and that a patent was pending for it.
Can you share what it was about?

-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS or UFS - what to do?

2007-01-28 Thread Robert Milkowski
Hello Anantha,

Friday, January 26, 2007, 5:06:46 PM, you wrote:

ANS All my feedback is based on Solaris 10 Update 2 (aka 06/06) and
ANS I've no comments on NFS. I strongly recommend that you use ZFS
ANS data redundancy (z1, z2, or mirror) and simply delegate the
ANS Engenio to stripe the data for performance.

Striping on an array and then doing redundancy with ZFS has at least
one drawback - what if one of the disks fails? You've got to replace the
bad disk, re-create the stripe on the array, and resilver on ZFS (or stay
with a hot spare). A lot of hassle.


-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re[2]: [zfs-discuss] ZFS or UFS - what to do?

2007-01-28 Thread Robert Milkowski
Hello Francois,

Friday, January 26, 2007, 4:09:43 PM, you wrote:

FD On Fri, 2007-01-26 at 06:16 -0800, Jeffery Malloch wrote:
 Hi Folks,
 
 I am currently in the midst of setting up a completely new file server using 
 a pretty well loaded Sun T2000 (8x1GHz, 16GB RAM) connected to an Engenio 
 6994 product (I work for LSI Logic so Engenio is a no brainer).  I have 
 configured a couple of zpools from Volume groups on the Engenio box - 
 1x2.5TB and 1x3.75TB.  I then created sub zfs systems below that and set 
 quotas and sharenfs'd them so that it appears that these file systems are 
 dynamically shrinkable and growable.  It looks very good...  I can see the 
 correct file system sizes on all types of machines (Linux 32/64bit and of 
 course Solaris boxes) and if I resize the quota it's picked up in NFS right 
 away.  But I would be the first in our organization to use this in an 
 enterprise system so I definitely have some concerns that I'm hoping someone 
 here can address.
 
 1.  How stable is ZFS?  The Engenio box is completely configured for RAID5 
 with hot spares

FD That partly defeats the purpose of ZFS. ZFS offers raid-z and raid-z2
FD (double parity) with all the advantages of raid-5 or raid-6 but without
FD several of the raid-5 issues. It also does things that a raid-5
FD controller never could: ensure data integrity from the kernel to the
FD disk, and self-correct.

Not always true. Actually, for some workloads you can get much more
performance doing raid-5 in HW than raid-z.

Also, with some entry-level arrays there are limits on how many LUNs can
be presented, and you actually can't expose each disk as a LUN due to
that limit (yes, Sun's 3510).

  and write cache (8GB) has battery backup so I'm not too concerned from a 
 hardware side.

FD Whereas the cache/battery backup is a requirement if you run raid-5, it
FD is not for zfs.

Still, that doesn't mean it won't help for some workloads.


 2.  Recommended config.

FD The most reliable setup is a JBOD + zfs. But if you have cache, on your

I would argue with this. No matter what, you still get a less reliable
setup using ZFS on top of a simple JBOD than with a Symmetrix box. It's
just that in many cases that simple JBOD can be good enough.


FD box, there might be some magic setup you have to do for that box, and
FD I'm sure somebody on the list will help you with that. I don't have an
FD Engenio.

There's a workaround for Engenio devices.


-- 
Best regards,
 Robertmailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: ZFS or UFS - what to do?

2007-01-28 Thread Anantha N. Srirama
Agreed, I guess I didn't articulate my point very well. The best config
is to present JBoDs and let ZFS provide the data protection. This has been
a very stimulating conversation thread; it is shedding new light on how
best to use ZFS.
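
Concretely, handing a JBoD to ZFS is a single command (a sketch,
hypothetical device names):

    # raidz gives raid-5-style parity plus end-to-end checksums:
    zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0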
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss