Re: [zfs-discuss] S11 vs illumos zfs compatibility
On 12/14/12 10:07 AM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote: Is that right? You can't use zfs send | zfs receive to send from a newer version and receive on an older version?

That is my experience. If you do a zfs upgrade on the sending machine, the receiving machine requires a version >= the sending machine.

No. You can, with recv, override any property in the sending stream that can be set from the command line (i.e., a writable). Version is not one of those properties. It only gets changed, in an upward direction, when you do a zfs upgrade. i.e.:

# zfs get version repo/support
NAME          PROPERTY  VALUE  SOURCE
repo/support  version   5      -
# zfs send repo/support@cpu-0412 | zfs recv -o version=4 repo/test
cannot receive: cannot override received version

You can send a version 6 file system into a version 28 pool, but it will still be a version 6 file system. Bob

I am not disagreeing with this, but isn't this the opposite test from what Ned asked? You can send from an old version (6) to a new version (28), but I don't believe you can send the other way from the new version (28) to receive on the old version (6). Or am I reading this wrong? Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
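To make the asymmetry concrete, a rough sketch (host, pool, and snapshot names are made up, and the exact error output is omitted since it varies by build):

# old -> new: works; the received file system keeps its old on-disk version
oldbox# zfs send tank6/fs@s1 | ssh newbox zfs recv tank28/fs
# new -> old: fails once the sender's fs/pool version is newer than what the
# receiver understands -- zfs upgrade only ever moves versions up
newbox# zfs send tank28/fs@s1 | ssh oldbox zfs recv tank6/fs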
Re: [zfs-discuss] ZFS snapshot used space question
Is there a way to get the total amount of data referenced by a snapshot that isn't referenced by a specified snapshot/filesystem? I think this is what is really desired in order to locate snapshots with offending space usage. The written and written@ attributes seem to only do the reverse. I think you can back-calculate it from the snapshot and filesystem referenced sizes and the written@snap property of the filesystem, but that isn't particularly convenient to do (looks like zfs get -Hp ... makes it possible to hack a script together for it, though).

This is what I was hoping to get as well, but I am not sure it's really possible. Even if you calculate the referenced space plus the displayed used space and compare that against the active filesystem, it doesn't really tell you much, because the data on the active filesystem might not be as static as you want. For example: if a snapshot references 10G and the active filesystem shows 10G used, you might expect that the snapshot isn't using any space. However, the 10G the snapshot referenced might have been deleted, and the 10G in the active filesystem might be new data, which means your snapshot could be pinning the full 10G. But if 9G of that was also on another snapshot, you would have something like this:

rootpool/export/home@snap.0  -  1G   -  -  -  -
rootpool/export/home@snap.1  -  27K  -  -  -  -
rootpool/export/home@snap.2  -  0    -  -  -  -

And the referenced would look something like:

rootpool/export/home@snap.0  0  -  10G  -
rootpool/export/home@snap.1  0  -  9G   -
rootpool/export/home@snap.2  0  -  10G  -

And the current filesystem would be:

rootpool/export/home  40G  20G  10G  10G  0  0

Then imagine that across more than three snapshots. I can't wrap my head around logic that would work there. I would love it if someone could figure out a good way, though... - Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
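Following up on the written@ idea: a rough sketch of the back-calculation using zfs get -Hp (dataset and snapshot names are made up; it assumes a build with the written@ property, and it only estimates what a snapshot references that the live filesystem no longer does):

#!/bin/ksh
# estimate data referenced by a snapshot that the live fs no longer references
FS=rootpool/export/home
SNAP=snap.0
ref_snap=$(zfs get -Hpo value referenced $FS@$SNAP)
ref_fs=$(zfs get -Hpo value referenced $FS)
written=$(zfs get -Hpo value written@$SNAP $FS)
# written@ = bytes the live fs references that were written after the snapshot,
# so the space they still share is:
shared=$((ref_fs - written))
# and what the snapshot references that the live fs has dropped is roughly:
echo "$((ref_snap - shared)) bytes"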
Re: [zfs-discuss] ZFS snapshot used space question
On Wed, Aug 29, 2012 at 8:58 PM, Timothy Coalson tsc...@mst.edu wrote: As I understand it, the used space of a snapshot does not include anything that is in more than one snapshot.

True. It shows the amount that would be freed if you destroyed the snapshot right away. Data held onto by more than one snapshot cannot be removed when you destroy just one of them, obviously. The act of destroying a snapshot will likely change the USED value of the neighbouring snapshots, though.

Yup, this is the same thing I came up with as well. Though I am a bit disappointed in the results, at least things make sense again. Thank you all for your help!

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS ok for single disk dev box?
Now that is interesting. But how do you do a receive before you reinstall? Live CD?

Just boot off the CD (or jumpstart server) to single-user mode. Format your new disk, create a zpool, zfs recv, installboot (or installgrub), reboot, and done.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
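A rough sketch of that sequence on x86 (device, pool, snapshot, and BE names are made up; on SPARC you'd use installboot against the slice instead of installgrub):

# booted single-user from the CD or jumpstart miniroot:
zpool create -f rpool c0t0d0s0
# restore the backup stream (assumes a recursive snapshot named @backup)
ssh backuphost zfs send -R rpool@backup | zfs recv -Fdu rpool
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0
zpool set bootfs=rpool/ROOT/mybe rpool   # point at the received boot environment
init 6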
[zfs-discuss] ZFS snapshot used space question
All, I apologize in advance for what appears to be a question asked quite often, but I am not sure I have ever seen an answer that explains it. This may also be a bit long-winded, so I apologize for that as well. I would like to know how much unique space each individual snapshot is using. I have a ZFS filesystem that shows:

$ zfs list -o space rootpool/export/home
NAME                  AVAIL  USED   USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rootpool/export/home  5.81G  14.4G  8.81G     5.54G   0              0

So reading this I see that I have a total of 14.4G of space used by this dataset. Currently 5.54G is active data that is available on the normal filesystem, and 8.81G is used in snapshots. 8.81G + 5.54G = 14.4G (roughly). I 100% agree with these numbers and the world makes sense. This is also backed up by:

$ zfs get usedbysnapshots rootpool/export/home
NAME                  PROPERTY         VALUE  SOURCE
rootpool/export/home  usedbysnapshots  8.81G  -

Now if I wanted to see how much space any individual snapshot is currently using, I would like to think that this would show me:

$ zfs list -ro space rootpool/export/home
NAME                            AVAIL  USED   USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
rootpool/export/home            5.81G  14.4G  8.81G     5.54G   0              0
rootpool/export/home@week3      -      202M   -         -       -              -
rootpool/export/home@week2      -      104M   -         -       -              -
rootpool/export/home@7daysago   -      1.37M  -         -       -              -
rootpool/export/home@6daysago   -      1.20M  -         -       -              -
rootpool/export/home@5daysago   -      1020K  -         -       -              -
rootpool/export/home@4daysago   -      342K   -         -       -              -
rootpool/export/home@3daysago   -      1.28M  -         -       -              -
rootpool/export/home@week1      -      0      -         -       -              -
rootpool/export/home@2daysago   -      0      -         -       -              -
rootpool/export/home@yesterday  -      360K   -         -       -              -
rootpool/export/home@today      -      1.26M  -         -       -              -

So normal logic would tell me that if USEDSNAP is 8.81G and is composed of 11 snapshots, adding up the size of each of those snapshots should roughly equal 8.81G. So time to break out the calculator: 202M + 104M + 1.37M + 1.20M + 1020K + 342K + 1.28M + 0 + 0 + 360K + 1.26M equals... ~312M! That is nowhere near 8.81G. I would accept it even if it was within 15%, but it's not even close. That is definitely not metadata or ZFS overhead or anything. I understand that snapshots are just the delta between the time when the snapshot was taken and the current active filesystem, and are truly just references to blocks on disk rather than copies. I also understand how two (or more) snapshots can reference the same block on disk while there is still only that one block used. If I delete a recent snapshot I may not save as much space as advertised, because some of it may still be held by an adjacent snapshot. But that sharing does not create duplicate used space on disk, so it doesn't justify the huge difference in sizes. Even with this logic in place, there is currently 8.81G of blocks referred to by snapshots which are not currently on the active filesystem, and I don't believe anyone can argue with that. Can something show me how much space a single snapshot has reserved? I searched through some of the archives and found this thread (http://mail.opensolaris.org/pipermail/zfs-discuss/2012-August/052163.html) from early this month, and I feel as if I have the same problem as the OP, but hopefully I am attacking it with a little more background. I am not arguing about discrepancies between df/du and zfs output, and I have read the Oracle documentation about it, but I haven't found what I feel should be a simple answer. I currently have a ticket open with Oracle, but I am getting answers to all kinds of questions except the question I am asking, so I am hoping someone out there might be able to help me.
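For the record, the adding-up can be scripted instead of punched into a calculator. A quick sketch (same dataset as above; if your build lacks zfs list -p, looping zfs get -Hpo value used over each snapshot gives the same numbers):

$ zfs list -Hrpo used -t snapshot rootpool/export/home | \
    awk '{sum += $1} END {printf "%.0f MB total snapshot USED\n", sum/1048576}'
$ zfs get -Hpo value usedbysnapshots rootpool/export/home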
I am a little concerned I am going to find out that there is no real way to show it and that makes for one sad SysAdmin. Thanks, Chad ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Ideas for ghetto file server data reliability?
On Nov 15, 2010, at 8:32 AM, Bryan Horstmann-Allen wrote:

On 2010-11-15 10:21:06, Edward Ned Harvey wrote: Backups. Even if you upgrade your hardware to better stuff... with ECC and so on... There is no substitute for backups. Period. If you care about your data, you will do backups. Period.

Backups are not going to save you from bad memory writing corrupted data to disk. If your RAM flips a bit and writes garbage to disk, and you back up that garbage, guess what: your backups are full of garbage. Invest in ECC RAM and hardware that is, at the least, less likely to screw you. Test your backups to ensure you can trust them.

The resources someone invests trying to fix this are probably worth more than the cost of some ECC RAM and a motherboard.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] couple of ZFS questions
On Nov 12, 2010, at 5:54 AM, Edward Ned Harvey wrote: Why are you sharing iscsi from nexenta to freebsd? Wouldn't it be better for nexenta to simply create zfs filesystems, and then share nfs? Much more flexible in a lot of ways. Unless your design requirements require limiting the flexibility intentionally... I can't think of any reason you'd want to do the iscsi thing from nexenta to freebsd.

Because for running jails (in very simple terms, a FreeBSD jail is a really fancy chroot, or a really simple approximation of a zone) NFS does not work very well (at least it did not in the past -- I have tried it, though not recently). Things like Apache don't want to run off an NFS-mounted file system, for example (the actual httpd daemon -- not the webroots, etc.).

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] couple of ZFS questions
I will be setting up a NexentaStor Community Edition based ZFS file server. I will be serving some zvols over iSCSI to some FreeBSD machines to host jails in.

1) The ZFS box offers a single iSCSI target that exposes all the zvols as individual disks. When the FreeBSD initiator finds it, it creates a separate disk for each zvol. I assume that if I have multiple FreeBSD machines connecting to this iSCSI target, as long as no individual zvol is mounted on more than one FreeBSD machine, the fact that a disk exists for each zvol on each FreeBSD machine is irrelevant and won't cause problems.

2) I am thinking about formatting the virtual disks served from the Nexenta iSCSI target as ZFS on the FreeBSD machine, even though the FreeBSD side has no redundancy. I see this as safe since the backing store on the Nexenta machine is a zvol on a redundant ZFS pool... Is this correct thinking?

Thanks Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
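For what it's worth, a minimal sketch of the Nexenta-side plumbing via COMSTAR (zvol sizes and names are made up, and the unrestricted add-view is just the simplest case; NexentaStor's GUI wraps the same steps):

# create one zvol per FreeBSD consumer
zfs create -V 32G tank/jails/web1
# register it as a SCSI logical unit, then expose it to all initiators
sbdadm create-lu /dev/zvol/rdsk/tank/jails/web1
stmfadm add-view 600144f0...        # the GUID printed by create-lu
# make sure an iSCSI target exists (stmf and iscsi/target services enabled)
itadm create-target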
Re: [zfs-discuss] couple of ZFS questions
On Nov 11, 2010, at 7:18 PM, Xin LI wrote:

On 11/11/10 17:57, Chad Leigh -- Shire.Net LLC wrote: I will be setting up a NexentaStor Community Edition based ZFS file server. I will be serving some zvols over iSCSI to some FreeBSD machines to host jails in. 1) The ZFS box offers a single iSCSI target that exposes all the zvols as individual disks. When the FreeBSD initiator finds it, it creates a separate disk for each zvol. I assume that if I have multiple FreeBSD machines connecting to this iSCSI target, as long as no individual zvol is mounted on more than one FreeBSD machine, the fact that a disk exists for each zvol on each FreeBSD machine is irrelevant and won't cause problems.

This is correct.

A follow-on question. If the zvols (virtual disks) are mounted READ ONLY, is it possible to mount one on multiple FreeBSD systems at the same time and access it for reading only from all the systems? (With only one system having it R/W, and that only being used occasionally when new software needs to be installed for the jails.) What I want to do does not rely on this, but it could make things easier for me...

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
Hi Garrett, Since my problem did turn out to be a debug kernel in my compilations, I booted back into the Nexenta 3 RC2 CD and let a scrub run for about half an hour, to see if I just hadn't waited long enough the first time around. It never made it past 159 MB/s. I finally rebooted into my 145 non-debug kernel, and within a few seconds of reimporting the pool the scrub was up to ~400 MB/s, so it does indeed seem like the Nexenta CD kernel is either in debug mode, or something else is slowing it down. Chad

On Wed, Jul 21, 2010 at 09:12:35AM -0700, Garrett D'Amore wrote: On Wed, 2010-07-21 at 02:21 -0400, Richard Lowe wrote: I built in the normal fashion, with the CBE compilers (cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30), and 12u1 lint. I'm not subscribed to zfs-discuss, but have you established whether the problematic build is DEBUG? (the bits I uploaded were non-DEBUG).

That would make a *huge* difference. DEBUG bits have zero optimization, and also have a great number of sanity tests included that are absent from the non-DEBUG bits. If these are expensive checks on a hot code path, it can have a very nasty impact on performance. Now that said, I *hope* the bits that Nexenta delivered were *not* DEBUG. But I've seen at least one bug that makes me think we might be delivering DEBUG binaries. I'll check into it. -- Garrett

-- Rich

Haudy Kazemi wrote: Could it somehow not be compiling 64-bit support? -- Brent Jones

I thought about that, but it says when it boots up that it is 64-bit, and I'm able to run 64-bit binaries. I wonder if it's compiling for the wrong processor optimization though? Maybe if it is missing some of the newer SSEx instructions, the zpool checksum checking is slowed down significantly? I don't know how to check for this though, and it seems strange it would slow things down this significantly. I'd expect even a non-SSE enabled binary to be able to calculate a few hundred MB of checksums per second on a 2.5+ GHz processor. Chad

Would it be possible to do a closer comparison between Rich Lowe's fast 142 build and your slow 142 build? For example, run a diff on the source, build options, and build scripts. If the build settings are close enough, a comparison of the generated binaries might be a faster way to narrow things down (if the optimizations are different then a resultant binary comparison probably won't be useful). You said previously that: The procedure I followed was basically what is outlined here: http://insanum.com/blog/2010/06/08/how-to-build-opensolaris using the SunStudio 12 compilers for ON and 12u1 for lint. Are these the same compiler versions Rich Lowe used? Maybe there is a compiler optimization bug. Rich Lowe's build readme doesn't tell us which compiler he used. http://genunix.org/dist/richlowe/README.txt

I suppose the easiest way for me to confirm whether there is a regression or my compiling is flawed is to just try compiling snv_142 using the same procedure and see if it works as well as Rich Lowe's copy or if it's slow like my other compilations. Chad

Another older compilation guide: http://hub.opensolaris.org/bin/view/Community+Group+tools/building_opensolaris

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
Hi, My bits were originally debug because I didn't know any better. I thought I had then recompiled without debug to test again, but I didn't realize until just now that the packages end up in a different directory (nightly vs nightly-nd), so I believe after compiling non-debug I just reinstalled the debug bits. I'm about to test again with an actual non-debug 142, and after that a non-debug 145, which just came out. Thanks, Chad

On Wed, Jul 21, 2010 at 02:21:51AM -0400, Richard Lowe wrote: I built in the normal fashion, with the CBE compilers (cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30), and 12u1 lint. I'm not subscribed to zfs-discuss, but have you established whether the problematic build is DEBUG? (the bits I uploaded were non-DEBUG). -- Rich

Haudy Kazemi wrote: Could it somehow not be compiling 64-bit support? -- Brent Jones

I thought about that, but it says when it boots up that it is 64-bit, and I'm able to run 64-bit binaries. I wonder if it's compiling for the wrong processor optimization though? Maybe if it is missing some of the newer SSEx instructions, the zpool checksum checking is slowed down significantly? I don't know how to check for this though, and it seems strange it would slow things down this significantly. I'd expect even a non-SSE enabled binary to be able to calculate a few hundred MB of checksums per second on a 2.5+ GHz processor. Chad

Would it be possible to do a closer comparison between Rich Lowe's fast 142 build and your slow 142 build? For example, run a diff on the source, build options, and build scripts. If the build settings are close enough, a comparison of the generated binaries might be a faster way to narrow things down (if the optimizations are different then a resultant binary comparison probably won't be useful). You said previously that: The procedure I followed was basically what is outlined here: http://insanum.com/blog/2010/06/08/how-to-build-opensolaris using the SunStudio 12 compilers for ON and 12u1 for lint. Are these the same compiler versions Rich Lowe used? Maybe there is a compiler optimization bug. Rich Lowe's build readme doesn't tell us which compiler he used. http://genunix.org/dist/richlowe/README.txt

I suppose the easiest way for me to confirm whether there is a regression or my compiling is flawed is to just try compiling snv_142 using the same procedure and see if it works as well as Rich Lowe's copy or if it's slow like my other compilations. Chad

Another older compilation guide: http://hub.opensolaris.org/bin/view/Community+Group+tools/building_opensolaris

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
It does seem to be faster now that I really installed the non-debug bits. I let it resume a scrub after reboot, and while it's not as fast as it usually is (280 - 300 MB/s vs 500+), I assume it's just presently checking a part of the filesystem with smaller files, thus reducing the speed, since it's well past the prior limitation. I tested 142 non-debug briefly until the scrub reached at least 250 MB/s, and then booted into 145 non-debug, where I'm letting the scrub finish now. I'll test the Nexenta disc again to be sure it was slow, since I don't recall exactly how much time I gave it in my prior tests for the scrub to reach its normal speed, although I can't do that until this evening when I'm home again. Chad

On Wed, Jul 21, 2010 at 09:44:42AM -0700, Chad Cantwell wrote: Hi, My bits were originally debug because I didn't know any better. I thought I had then recompiled without debug to test again, but I didn't realize until just now that the packages end up in a different directory (nightly vs nightly-nd), so I believe after compiling non-debug I just reinstalled the debug bits. I'm about to test again with an actual non-debug 142, and after that a non-debug 145, which just came out. Thanks, Chad

On Wed, Jul 21, 2010 at 02:21:51AM -0400, Richard Lowe wrote: I built in the normal fashion, with the CBE compilers (cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30), and 12u1 lint. I'm not subscribed to zfs-discuss, but have you established whether the problematic build is DEBUG? (the bits I uploaded were non-DEBUG). -- Rich

Haudy Kazemi wrote: Could it somehow not be compiling 64-bit support? -- Brent Jones

I thought about that, but it says when it boots up that it is 64-bit, and I'm able to run 64-bit binaries. I wonder if it's compiling for the wrong processor optimization though? Maybe if it is missing some of the newer SSEx instructions, the zpool checksum checking is slowed down significantly? I don't know how to check for this though, and it seems strange it would slow things down this significantly. I'd expect even a non-SSE enabled binary to be able to calculate a few hundred MB of checksums per second on a 2.5+ GHz processor. Chad

Would it be possible to do a closer comparison between Rich Lowe's fast 142 build and your slow 142 build? For example, run a diff on the source, build options, and build scripts. If the build settings are close enough, a comparison of the generated binaries might be a faster way to narrow things down (if the optimizations are different then a resultant binary comparison probably won't be useful). You said previously that: The procedure I followed was basically what is outlined here: http://insanum.com/blog/2010/06/08/how-to-build-opensolaris using the SunStudio 12 compilers for ON and 12u1 for lint. Are these the same compiler versions Rich Lowe used? Maybe there is a compiler optimization bug. Rich Lowe's build readme doesn't tell us which compiler he used. http://genunix.org/dist/richlowe/README.txt

I suppose the easiest way for me to confirm whether there is a regression or my compiling is flawed is to just try compiling snv_142 using the same procedure and see if it works as well as Rich Lowe's copy or if it's slow like my other compilations. Chad

Another older compilation guide: http://hub.opensolaris.org/bin/view/Community+Group+tools/building_opensolaris

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On Mon, Jul 19, 2010 at 07:01:54PM -0700, Chad Cantwell wrote: On Tue, Jul 20, 2010 at 10:54:44AM +1000, James C. McPherson wrote: On 20/07/10 10:40 AM, Chad Cantwell wrote: fyi, everyone, I have some more info here. in short, rich lowe's 142 works correctly (fast) on my hardware, while both my compilations (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with backports) are horribly slow. I finally got around to trying rich lowe's snv 142 compilation in place of my own compilation of 143 (and later 144, not mentioned below), and unlike my own two compilations, his works very fast again on my same zpool (scrubbing avg increased from low 100s to over 400 MB/s within a few minutes after booting into this copy of 142). I should note that since my original message, I also tried booting from a Nexenta Core 3.0 RC2 ISO after realizing it had zpool 26 support backported into 134, and it was in fact able to read my zpool despite the upgraded version. Running a scrub from the F2 shell on the Nexenta CD was also slow, just like the 143 and 144 that I compiled. So, there seem to be two possibilities. Either (and this seems unlikely) there is a problem introduced post-142 which slows things down, and it occurred in 143 and 144 and was brought back to 134 with Nexenta's backports, or else (more likely) there is something different or wrong with how I'm compiling the kernel that makes the hardware not perform up to its specifications with a zpool, and possibly the Nexenta 3 RC2 ISO has the same problem as my own compilations.

So - what's your env file contents, which closed bins are you using, which crypto bits are you using, and what changeset is your own workspace synced with? James C. McPherson -- Oracle http://www.jmcp.homeunix.com/blog

The procedure I followed was basically what is outlined here: http://insanum.com/blog/2010/06/08/how-to-build-opensolaris using the SunStudio 12 compilers for ON and 12u1 for lint. For each build (143, 144) I cloned the exact tag for that build, i.e.:

# hg clone ssh://a...@hg.opensolaris.org/hg/onnv/onnv-gate onnv-b144
# cd onnv-b144
# hg update onnv_144

Then I downloaded the corresponding closed and crypto bins from http://dlc.sun.com/osol/on/downloads/b143 or http://dlc.sun.com/osol/on/downloads/b144 The only environment variables I modified from the default opensolaris.sh file were the basic ones: GATE, CODEMGR_WS, STAFFER, and ON_CRYPTO_BINS, to point to my work directory for the build, my username, and the relevant crypto bin:

$ egrep -e "^GATE|^CODEMGR_WS|^STAFFER|^ON_CRYPTO_BINS" opensolaris.sh
GATE=onnv-b144; export GATE
CODEMGR_WS=/work/compiling/$GATE; export CODEMGR_WS
STAFFER=chad; export STAFFER
ON_CRYPTO_BINS=$CODEMGR_WS/on-crypto-latest.$MACH.tar.bz2

I suppose the easiest way for me to confirm whether there is a regression or my compiling is flawed is to just try compiling snv_142 using the same procedure and see if it works as well as Rich Lowe's copy or if it's slow like my other compilations. Chad

I've just compiled and booted into snv_142, and I experienced the same slow dd and scrubbing as I did with my 143 and 144 compilations and with the Nexenta 3 RC2 CD. So, this would seem to indicate a build environment/process flaw rather than a regression. Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
Yes, I think this might have been it. I missed the NIGHTLY_OPTIONS variable in opensolaris.sh and I think it was compiling a debug build. I'm not sure what the ramifications of this are or how much slower a debug build should be, but I'm recompiling a release build now, so hopefully all will be well. Thanks, Chad

On Tue, Jul 20, 2010 at 08:39:42AM +0100, Robert Milkowski wrote: On 20/07/2010 07:59, Chad Cantwell wrote: I've just compiled and booted into snv_142, and I experienced the same slow dd and scrubbing as I did with my 143 and 144 compilations and with the Nexenta 3 RC2 CD. So, this would seem to indicate a build environment/process flaw rather than a regression.

Are you sure it is not a debug vs. non-debug issue? -- Robert Milkowski http://milek.blogspot.com

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
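For anyone else tripped up by the same thing: the knob is NIGHTLY_OPTIONS in the env file. Per nightly's option list, 'D' builds DEBUG objects and 'F' suppresses the non-DEBUG build; the exact default flag string below is illustrative rather than checked against a stock opensolaris.sh:

# DEBUG-only build; packages land in packages/$MACH/nightly
NIGHTLY_OPTIONS="-FnCDlmprt"; export NIGHTLY_OPTIONS
# drop D and F for a non-DEBUG build; packages land in packages/$MACH/nightly-nd
NIGHTLY_OPTIONS="-nClmprt"; export NIGHTLY_OPTIONS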
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
No, this wasn't it. A non-debug build with the same NIGHTLY_OPTIONS as Rich Lowe's 142 build is still very slow...

On Tue, Jul 20, 2010 at 09:52:10AM -0700, Chad Cantwell wrote: Yes, I think this might have been it. I missed the NIGHTLY_OPTIONS variable in opensolaris.sh and I think it was compiling a debug build. I'm not sure what the ramifications of this are or how much slower a debug build should be, but I'm recompiling a release build now, so hopefully all will be well. Thanks, Chad

On Tue, Jul 20, 2010 at 08:39:42AM +0100, Robert Milkowski wrote: On 20/07/2010 07:59, Chad Cantwell wrote: I've just compiled and booted into snv_142, and I experienced the same slow dd and scrubbing as I did with my 143 and 144 compilations and with the Nexenta 3 RC2 CD. So, this would seem to indicate a build environment/process flaw rather than a regression.

Are you sure it is not a debug vs. non-debug issue? -- Robert Milkowski http://milek.blogspot.com

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On Tue, Jul 20, 2010 at 10:45:58AM -0700, Brent Jones wrote: On Tue, Jul 20, 2010 at 10:29 AM, Chad Cantwell c...@iomail.org wrote: No, this wasn't it. A non-debug build with the same NIGHTLY_OPTIONS as Rich Lowe's 142 build is still very slow...

On Tue, Jul 20, 2010 at 09:52:10AM -0700, Chad Cantwell wrote: Yes, I think this might have been it. I missed the NIGHTLY_OPTIONS variable in opensolaris.sh and I think it was compiling a debug build. I'm not sure what the ramifications of this are or how much slower a debug build should be, but I'm recompiling a release build now, so hopefully all will be well. Thanks, Chad

On Tue, Jul 20, 2010 at 08:39:42AM +0100, Robert Milkowski wrote: On 20/07/2010 07:59, Chad Cantwell wrote: I've just compiled and booted into snv_142, and I experienced the same slow dd and scrubbing as I did with my 143 and 144 compilations and with the Nexenta 3 RC2 CD. So, this would seem to indicate a build environment/process flaw rather than a regression. Are you sure it is not a debug vs. non-debug issue? -- Robert Milkowski http://milek.blogspot.com

Could it somehow not be compiling 64-bit support? -- Brent Jones br...@servuhome.net

I thought about that, but it says when it boots up that it is 64-bit, and I'm able to run 64-bit binaries. I wonder if it's compiling for the wrong processor optimization though? Maybe if it is missing some of the newer SSEx instructions, the zpool checksum checking is slowed down significantly? I don't know how to check for this though, and it seems strange it would slow things down this significantly. I'd expect even a non-SSE enabled binary to be able to calculate a few hundred MB of checksums per second on a 2.5+ GHz processor. Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
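Two quick checks for the 64-bit side of that question (standard Solaris commands; they won't settle the SSE-optimization question, but they do rule out a 32-bit kernel):

$ isainfo -kv        # e.g. "64-bit amd64 kernel modules"
$ isainfo -x         # instruction-set extensions the system reports (sse2, etc.)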
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
fyi, everyone, I have some more info here. in short, rich lowe's 142 works correctly (fast) on my hardware, while both my compilations (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with backports) are horribly slow. I finally got around to trying rich lowe's snv 142 compilation in place of my own compilation of 143 (and later 144, not mentioned below), and unlike my own two compilations, his works very fast again on my same zpool (scrubbing avg increased from low 100s to over 400 MB/s within a few minutes after booting into this copy of 142). I should note that since my original message, I also tried booting from a Nexenta Core 3.0 RC2 ISO after realizing it had zpool 26 support backported into 134, and it was in fact able to read my zpool despite the upgraded version. Running a scrub from the F2 shell on the Nexenta CD was also slow, just like the 143 and 144 that I compiled. So, there seem to be two possibilities. Either (and this seems unlikely) there is a problem introduced post-142 which slows things down, and it occurred in 143 and 144 and was brought back to 134 with Nexenta's backports, or else (more likely) there is something different or wrong with how I'm compiling the kernel that makes the hardware not perform up to its specifications with a zpool, and possibly the Nexenta 3 RC2 ISO has the same problem as my own compilations. Chad

On Tue, Jul 06, 2010 at 03:08:50PM -0700, Chad Cantwell wrote: Hi all, I've noticed something strange in the throughput of my zpool between different snv builds, and I'm not sure if it's an inherent difference in the build or a kernel parameter that differs between the builds. I've set up two similar machines and this happens with both of them. Each system has 16 2TB Samsung HD203WI drives (total) directly connected to two LSI 3081E-R 1068e cards with IT firmware in one raidz3 vdev. In both computers, after a fresh installation of snv 134, the throughput is a maximum of about 300 MB/s during scrub or something like dd if=/dev/zero bs=1024k of=bigfile. If I bfu to snv 138, I then get throughput of about 700 MB/s with both scrub and a single-thread dd. I assumed at first this was some sort of bug or regression in 134 that made it slow. However, I've now also tested, from the fresh 134 installation, compiling the OS/Net build 143 from the mercurial repository and booting into it, after which the dd throughput is still only about 300 MB/s, just like snv 134. The scrub throughput in 143 is even slower, rarely surpassing 150 MB/s. I wonder if the scrubbing being extra slow here is related to the additional statistics displayed during the scrub that didn't used to be shown. Is there some kind of debug option that might be enabled in the 134 build and persist if I compile snv 143, which would be off if I installed a 138 through bfu? If not, it makes me think that the bfu to 138 is changing the configuration somewhere to make it faster, rather than fixing a bug or being a debug flag on or off. Does anyone have any idea what might be happening? One thing I haven't tried is bfu'ing to 138, and from this faster-working snv 138 installing the snv 143 build, which may possibly create a 143 that performs faster if it's simply a configuration parameter. I'm not sure offhand if installing source-compiled ON builds from a bfu'd rpool is supported, although I suppose it's simple enough to try.
Thanks, Chad Cantwell

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On Tue, Jul 20, 2010 at 10:54:44AM +1000, James C. McPherson wrote: On 20/07/10 10:40 AM, Chad Cantwell wrote: fyi, everyone, I have some more info here. in short, rich lowe's 142 works correctly (fast) on my hardware, while both my compilations (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with backports) are horribly slow. I finally got around to trying rich lowe's snv 142 compilation in place of my own compilation of 143 (and later 144, not mentioned below), and unlike my own two compilations, his works very fast again on my same zpool (scrubbing avg increased from low 100s to over 400 MB/s within a few minutes after booting into this copy of 142). I should note that since my original message, I also tried booting from a Nexenta Core 3.0 RC2 ISO after realizing it had zpool 26 support backported into 134, and it was in fact able to read my zpool despite the upgraded version. Running a scrub from the F2 shell on the Nexenta CD was also slow, just like the 143 and 144 that I compiled. So, there seem to be two possibilities. Either (and this seems unlikely) there is a problem introduced post-142 which slows things down, and it occurred in 143 and 144 and was brought back to 134 with Nexenta's backports, or else (more likely) there is something different or wrong with how I'm compiling the kernel that makes the hardware not perform up to its specifications with a zpool, and possibly the Nexenta 3 RC2 ISO has the same problem as my own compilations.

So - what's your env file contents, which closed bins are you using, which crypto bits are you using, and what changeset is your own workspace synced with? James C. McPherson -- Oracle http://www.jmcp.homeunix.com/blog

The procedure I followed was basically what is outlined here: http://insanum.com/blog/2010/06/08/how-to-build-opensolaris using the SunStudio 12 compilers for ON and 12u1 for lint. For each build (143, 144) I cloned the exact tag for that build, i.e.:

# hg clone ssh://a...@hg.opensolaris.org/hg/onnv/onnv-gate onnv-b144
# cd onnv-b144
# hg update onnv_144

Then I downloaded the corresponding closed and crypto bins from http://dlc.sun.com/osol/on/downloads/b143 or http://dlc.sun.com/osol/on/downloads/b144 The only environment variables I modified from the default opensolaris.sh file were the basic ones: GATE, CODEMGR_WS, STAFFER, and ON_CRYPTO_BINS, to point to my work directory for the build, my username, and the relevant crypto bin:

$ egrep -e "^GATE|^CODEMGR_WS|^STAFFER|^ON_CRYPTO_BINS" opensolaris.sh
GATE=onnv-b144; export GATE
CODEMGR_WS=/work/compiling/$GATE; export CODEMGR_WS
STAFFER=chad; export STAFFER
ON_CRYPTO_BINS=$CODEMGR_WS/on-crypto-latest.$MACH.tar.bz2

I suppose the easiest way for me to confirm whether there is a regression or my compiling is flawed is to just try compiling snv_142 using the same procedure and see if it works as well as Rich Lowe's copy or if it's slow like my other compilations. Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
On Mon, Jul 19, 2010 at 06:00:04PM -0700, Brent Jones wrote: On Mon, Jul 19, 2010 at 5:40 PM, Chad Cantwell c...@iomail.org wrote: fyi, everyone, I have some more info here. in short, rich lowe's 142 works correctly (fast) on my hardware, while both my compilations (snv 143, snv 144) and also the Nexenta 3 RC2 kernel (134 with backports) are horribly slow. I finally got around to trying rich lowe's snv 142 compilation in place of my own compilation of 143 (and later 144, not mentioned below), and unlike my own two compilations, his works very fast again on my same zpool (scrubbing avg increased from low 100s to over 400 MB/s within a few minutes after booting into this copy of 142). I should note that since my original message, I also tried booting from a Nexenta Core 3.0 RC2 ISO after realizing it had zpool 26 support backported into 134, and it was in fact able to read my zpool despite the upgraded version. Running a scrub from the F2 shell on the Nexenta CD was also slow, just like the 143 and 144 that I compiled. So, there seem to be two possibilities. Either (and this seems unlikely) there is a problem introduced post-142 which slows things down, and it occurred in 143 and 144 and was brought back to 134 with Nexenta's backports, or else (more likely) there is something different or wrong with how I'm compiling the kernel that makes the hardware not perform up to its specifications with a zpool, and possibly the Nexenta 3 RC2 ISO has the same problem as my own compilations. Chad

On Tue, Jul 06, 2010 at 03:08:50PM -0700, Chad Cantwell wrote: Hi all, I've noticed something strange in the throughput of my zpool between different snv builds, and I'm not sure if it's an inherent difference in the build or a kernel parameter that differs between the builds. I've set up two similar machines and this happens with both of them. Each system has 16 2TB Samsung HD203WI drives (total) directly connected to two LSI 3081E-R 1068e cards with IT firmware in one raidz3 vdev. In both computers, after a fresh installation of snv 134, the throughput is a maximum of about 300 MB/s during scrub or something like dd if=/dev/zero bs=1024k of=bigfile. If I bfu to snv 138, I then get throughput of about 700 MB/s with both scrub and a single-thread dd. I assumed at first this was some sort of bug or regression in 134 that made it slow. However, I've now also tested, from the fresh 134 installation, compiling the OS/Net build 143 from the mercurial repository and booting into it, after which the dd throughput is still only about 300 MB/s, just like snv 134. The scrub throughput in 143 is even slower, rarely surpassing 150 MB/s. I wonder if the scrubbing being extra slow here is related to the additional statistics displayed during the scrub that didn't used to be shown. Is there some kind of debug option that might be enabled in the 134 build and persist if I compile snv 143, which would be off if I installed a 138 through bfu? If not, it makes me think that the bfu to 138 is changing the configuration somewhere to make it faster, rather than fixing a bug or being a debug flag on or off. Does anyone have any idea what might be happening? One thing I haven't tried is bfu'ing to 138, and from this faster-working snv 138 installing the snv 143 build, which may possibly create a 143 that performs faster if it's simply a configuration parameter. I'm not sure offhand if installing source-compiled ON builds from a bfu'd rpool is supported, although I suppose it's simple enough to try.
Thanks, Chad Cantwell

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

I'm surprised you're even getting 400MB/s on the fast configurations, with only 16 drives in a Raidz3 configuration. To me, 16 drives in Raidz3 (single vdev) would do about 150MB/sec, as your slow speeds suggest. -- Brent Jones br...@servuhome.net

With which drives and controllers? For a single dd thread writing a large file to fill up a new zpool from /dev/zero, in this configuration I can sustain over 700 MB/s for the duration of the process, and can fill up the ~26T of usable space overnight. This is with two 8-port LSI 1068e controllers and no expanders. RAIDZ operates similarly to regular RAID, and you should get striped speeds for sequential access, minus any inefficiencies and processing time for the parity. 16 disks in raidz3 is 13 disks' worth of striping, so at ~700 MB/s I'm getting about 50% efficiency after the parity calculations etc., which is fine with me. I understand that some people need to have higher performance random I/O to many
[zfs-discuss] zpool throughput: snv 134 vs 138 vs 143
Hi all, I've noticed something strange in the throughput of my zpool between different snv builds, and I'm not sure if it's an inherent difference in the build or a kernel parameter that differs between the builds. I've set up two similar machines and this happens with both of them. Each system has 16 2TB Samsung HD203WI drives (total) directly connected to two LSI 3081E-R 1068e cards with IT firmware in one raidz3 vdev. In both computers, after a fresh installation of snv 134, the throughput is a maximum of about 300 MB/s during scrub or something like dd if=/dev/zero bs=1024k of=bigfile. If I bfu to snv 138, I then get throughput of about 700 MB/s with both scrub and a single-thread dd. I assumed at first this was some sort of bug or regression in 134 that made it slow. However, I've now also tested, from the fresh 134 installation, compiling the OS/Net build 143 from the mercurial repository and booting into it, after which the dd throughput is still only about 300 MB/s, just like snv 134. The scrub throughput in 143 is even slower, rarely surpassing 150 MB/s. I wonder if the scrubbing being extra slow here is related to the additional statistics displayed during the scrub that didn't used to be shown. Is there some kind of debug option that might be enabled in the 134 build and persist if I compile snv 143, which would be off if I installed a 138 through bfu? If not, it makes me think that the bfu to 138 is changing the configuration somewhere to make it faster, rather than fixing a bug or being a debug flag on or off. Does anyone have any idea what might be happening? One thing I haven't tried is bfu'ing to 138, and from this faster-working snv 138 installing the snv 143 build, which may possibly create a 143 that performs faster if it's simply a configuration parameter. I'm not sure offhand if installing source-compiled ON builds from a bfu'd rpool is supported, although I suppose it's simple enough to try. Thanks, Chad Cantwell

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
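For anyone wanting to reproduce the numbers, the tests described above boil down to the following (pool name and sizes are examples; note /dev/zero data is highly compressible, so leave compression off for this):

# sequential write test, ~16 GB
$ dd if=/dev/zero of=/tank/bigfile bs=1024k count=16384
# scrub throughput
$ zpool scrub tank
$ zpool status tank      # run repeatedly; the scrub line shows rate/progress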
[zfs-discuss] compressed root pool at installation time with flash archive predeployment script
I was trying to think of a way to set compression=on at the beginning of a jumpstart. The only idea I've come up with is to do so with a flash archive predeployment script. Has anyone else tried this approach? Thanks, Chad

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
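In case it helps anyone else, a minimal sketch of what such a predeployment script might look like (the pool name, and the assumption that the root pool already exists by the time predeployment scripts run, are both unverified here):

#!/bin/sh
# flash archive predeployment script: turn on compression before the
# archive payload is extracted, so the deployed files are written compressed
zfs set compression=on rpool
exit 0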
[zfs-discuss] ZFS replace - many to one
I'm looking to migrate a pool from using multiple smaller LUNs to one larger LUN. I don't see a way to do a zpool replace for multiple-to-one. Anybody know how to do this? It needs to be non-disruptive. -- This message posted from opensolaris.org

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
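For context, zpool replace is strictly one-to-one, so there is no direct route; a sketch of the two realistic options (pool and device names are made up):

# one-to-one replace is non-disruptive: swap each small LUN for a larger one,
# waiting for resilver between steps; the pool grows once every device in the
# vdev is bigger (autoexpand or an export/import, depending on build)
zpool replace tank c1t0d0 c5t0d0
# many-to-one really means a new pool plus send/receive, which is
# disruptive at cutover:
zpool create newtank c6t0d0
zfs snapshot -r tank@migrate
zfs send -R tank@migrate | zfs recv -Fd newtank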
Re: [zfs-discuss] mpt errors on snv 127
fyi to everyone, the Asus P5W64 motherboard previously in my opensolaris machine was the culprit, and not the general mpt issues. At the time the motherboard was originally put in that machine, there was not enough zfs i/o load to trigger the problem, which led to the false impression the hardware was fine. I'm using a 5400-chipset Xeon board now (Asus DSEB-GH) and my LSI cards are working perfectly again; over 2 hours of heavy I/O and no errors or warnings with snv 127 (with the P5W64/LSI combo and build 127 it would never run more than 15 minutes without warnings). I chose this board partly because it has PCI-X slots and I thought those might be useful for AOC-SAT2-MV8 cards if I couldn't shake the mpt issues, but now that the mpt issues are gone I can continue with that controller if I want. Thanks everyone for your help, Chad

On Sun, Dec 06, 2009 at 11:12:50PM -0800, Chad Cantwell wrote: Thanks for the info on the yukon driver. I realize too many variables makes things impossible to determine, but I had made these hardware changes awhile back, and they seemed to work fine at the time. Since they aren't now, even in the older OpenSolaris (I've tried 2009.06 and 2008.11 now), the problem seems to be a hardware quirk, and the only way to narrow that down is to change hardware back until it works like it used to in at least the older snv builds. I've ruled out the ethernet controller. I'm leaning toward the current motherboard (Asus P5W64) not playing nicely with the LSI cards, but it will probably be several days until I get to the bottom of this since it takes awhile to test after making a change... Thanks, Chad

On Mon, Dec 07, 2009 at 11:09:39AM +1000, James C. McPherson wrote: Gday Chad, the more swaptronics you partake in, the more difficult it is going to be for us (collectively) to figure out what is going wrong on your system. Btw, since you're running a build past 124, you can use the yge driver instead of the yukonx (from Marvell) or myk (from Murayama-san) drivers. As another comment in this thread has mentioned, a full scrub can be a serious test of your hardware depending on how much data you've got to walk over. If you can keep the hardware variables to a minimum then clarity will be more achievable. thankyou, James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
Thanks for the info on the yukon driver. I realize too many variables makes things impossible to determine, but I had made these hardware changes awhile back, and they seemed to work fine at the time. Since they aren't now, even in the older OpenSolaris (i've tried 2009.06 and 2008.11 now), the problem seems to be a hardware quirk, and the only way to narrow that down is to change hardware back until it works like it used to in at least the older snv builds. I've ruled out the ethernet controller. I'm leaning toward the current motherboard (Asus P5W64) not playing nicely with the LSI cards, but it will probably be several days until I get to the bottom of this since it takes awhile to test after making a change... Thanks, Chad On Mon, Dec 07, 2009 at 11:09:39AM +1000, James C. McPherson wrote: Gday Chad, the more swaptronics you partake in, the more difficult it is going to be for us (collectively) to figure out what is going wrong on your system. Btw, since you're running a build past 124, you can use the yge driver instead of the yukonx (from Marvell) or myk (from Murayama-san) drivers. As another comment in this thread has mentioned, a full scrub can be a serious test of your hardware depending on how much data you've got to walk over. If you can keep the hardware variables to a minimum then clarity will be more achievable. thankyou, James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
Hi all, Unfortunately for me, there does seem to be a hardware component to my problem. Although my rsync copied almost 4TB of data with no iostat errors after going back to OpenSolaris 2009.06, I/O on one of my mpt cards did eventually hang, with 6 disk lights on and 2 off, until rebooting. There are a few hardware changes made since the last time I did a full backup, so it's possible that whatever problem was introduced didn't happen frequently enough under low i/o usage for me to detect it until now, when I was reinstalling and copying massive amounts of data back. The changes I had made since originally installing osol2009.06 several months ago are:

- stopped using the onboard Marvell Yukon2 ethernet (which used a 3rd party driver) in favor of an Intel 1000 PT dual port, which necessitated an extra pci-e slot, prompting the following item:
- swapped motherboards between 2 machines (they were similar though, with similar onboard hardware, and it shouldn't have been a major change). Originally an Asus P5Q Deluxe w/3 pci-e slots, now a slightly older Asus P5W64 w/4 pci-e slots.
- the Intel 1000 PT dual port card has been aggregated as aggr0 since it was installed (the older Yukon2 was a basic interface)

the above changes were made a while ago, before upgrading opensolaris to 127, and things seemed to be working fine for at least 2-3 months with rsync updating (it never hung, had a fatal zfs error, or lost access to data requiring a reboot).

new changes since troubleshooting the snv 127 mpt issues:

- upgraded LSI 3081 firmware from 1.28.2 (or was it .02) to 1.29, the latest. If this turns out to be an issue, I do have the previous IT firmware that I was using before, which I can flash back.

another, albeit unlikely, factor: when I originally copied all my data to my first opensolaris raidz2 pool, I didn't use rsync at all, I used netcat + tar, and only set up rsync later for updates. Perhaps the huge initial single rsync of the large tree does something strange that the original initial netcat + tar copy did not (I know, unlikely, but I'm grasping at straws here to determine what has happened).

I'll work on ruling out the potential sources of hardware problems before I report any more on the mpt issues, since my test case would probably confound things at this point. I am affected by the mpt bugs, since I would get the timeouts almost constantly in snv 127+, but since I'm also apparently affected by some other unknown hardware issue, my data on the mpt problems might lead people in the wrong direction at this point. I will first try going back to the non-aggregated Yukon ethernet and removing the Intel dual port pci-e network adapter, then if the problem persists try half of my drives on each LSI controller individually, to confirm whether one controller has a problem the other does not, or one drive in one set is causing a new problem for a particular controller. I hope to have some kind of answer at that point and not have to resort to motherboard swapping again. Chad

On Thu, Dec 03, 2009 at 10:44:53PM -0800, Chad Cantwell wrote: I eventually performed a few more tests, adjusting some zfs tuning options, which had no effect, and trying the itmpt driver, which someone had said would work, and regardless my system would always freeze quite rapidly in snv 127 and 128a. Just to double-check my hardware, I went back to the opensolaris 2009.06 release version, and everything is working fine. The system has been running a few hours and has copied a lot of data without any trouble, mpt syslog events, or iostat errors.
One thing I found interesting, and I don't know if it's significant or not, is that under the recent builds and under 2009.06, I had run echo '::interrupts' | mdb -k to check the interrupts used. (I don't have the printout handy for snv 127+, though.) I have a dual port gigabit Intel 1000 P PCI-e card, which shows up as e1000g0 and e1000g1. In snv 127+, each of my e1000g devices shares an IRQ with my mpt devices (mpt0, mpt1) in the IRQ listing, whereas in opensolaris 2009.06, all 4 devices are on different IRQs. I don't know if this is significant, but most of my testing when I encountered errors was data transfer via the network, so it could potentially have been interfering with the mpt drivers when it was on the same IRQ. The errors did seem to be less frequent when the server I was copying from was linked at 100 instead of 1000 (one of my tests), but that is as likely to be a result of the slower zpool throughput as it is to be related to the network traffic. I'll probably stay with 2009.06 for now since it works fine for me, but I can try a newer build again once some more progress is made in this area and people want to see if it's fixed (this machine is mainly to back up another array, so it's not too big a deal to test later, when the mpt drivers are looking better, and wipe again in the event of problems).
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
I was under the impression that the problem affecting most of us was introduced much later than b104, sometime between ~114 and ~118. When I first started using my LSI 3081 cards, they had the IR firmware on them, and it caused me all kinds of problems. The disks showed up but I couldn't write to them, I believe. Eventually I found that I needed the IT firmware for them to work properly, which is what I have used ever since, but maybe some builds do work with IR firmware? I remember that back when I was originally trying to set them up with the IR firmware, OpenSolaris saw my two cards as one device, whereas with the IT firmware they were always mpt0 and mpt1. It could also be that IR works with one card but not well when two cards are combined... Chad

On Sat, Dec 05, 2009 at 02:47:55PM -0800, Calvin Morrow wrote: I found this thread after fighting the same problem in Nexenta, which uses the OpenSolaris kernel from b104. Thankfully, I think I have (for the moment) solved my problem. Background: I have an LSI 3081E-R (1068E based) adapter which experiences the same disconnected command timeout error under relatively light load. This card connects to a Supermicro chassis using 2 MiniSAS cables to redundant expanders that are attached to 18 SAS drives. The card ran the latest IT firmware (1.29?). This server is a new install, and even installing from the CD to two disks in a mirrored ZFS root would randomly cause the disconnect error. The system remained unresponsive until after a reboot. I tried the workarounds mentioned in this thread, namely using set mpt:mpt_enable_msi = 0 and set xpv_psm:xen_support_msi = -1 in /etc/system. Once I added those lines, the system never really became unresponsive; however, there were partial read and partial write messages that littered dmesg. At one point there appeared to be a disconnect error (cannot confirm) that the system recovered from. Eventually, I became desperate and flashed the IR (Integrated RAID) firmware over the top of the IT firmware. Since then, I have had no errors in dmesg of any kind. I even removed the workarounds from /etc/system and still have had no issues. The mpt driver is exceptionally quiet now. I'm interested to know if anyone who has a 1068E based card is having these problems using the IR firmware, or if they all seem to be IT (initiator target) related. -- This message posted from opensolaris.org

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
I eventually performed a few more tests, adjusting some zfs tuning options, which had no effect, and trying the itmpt driver, which someone had said would work, and regardless my system would always freeze quite rapidly in snv 127 and 128a. Just to double-check my hardware, I went back to the opensolaris 2009.06 release version, and everything is working fine. The system has been running a few hours and has copied a lot of data without any trouble, mpt syslog events, or iostat errors.

One thing I found interesting, and I don't know if it's significant or not, is that under the recent builds and under 2009.06, I had run echo '::interrupts' | mdb -k to check the interrupts used. (I don't have the printout handy for snv 127+, though.) I have a dual port gigabit Intel 1000 P PCI-e card, which shows up as e1000g0 and e1000g1. In snv 127+, each of my e1000g devices shares an IRQ with my mpt devices (mpt0, mpt1) in the IRQ listing, whereas in opensolaris 2009.06, all 4 devices are on different IRQs. I don't know if this is significant, but most of my testing when I encountered errors was data transfer via the network, so it could potentially have been interfering with the mpt drivers when it was on the same IRQ. The errors did seem to be less frequent when the server I was copying from was linked at 100 instead of 1000 (one of my tests), but that is as likely to be a result of the slower zpool throughput as it is to be related to the network traffic. I'll probably stay with 2009.06 for now since it works fine for me, but I can try a newer build again once some more progress is made in this area and people want to see if it's fixed (this machine is mainly to back up another array, so it's not too big a deal to test later, when the mpt drivers are looking better, and wipe again in the event of problems). Chad

On Tue, Dec 01, 2009 at 03:06:31PM -0800, Chad Cantwell wrote: To update everyone, I did a complete zfs scrub, and it generated no errors in iostat, and I have 4.8T of data on the filesystem, so it was a fairly lengthy test. The machine also has exhibited no evidence of instability. If I were to start copying a lot of data to the filesystem again, though, I'm sure it would generate errors and crash again. Chad

On Tue, Dec 01, 2009 at 12:29:16AM -0800, Chad Cantwell wrote: Well, ok, the msi=0 thing didn't help after all. A few minutes after my last message a few errors showed up in iostat, and then in a few minutes more the machine was locked up hard... Maybe I will try just doing a scrub instead of my rsync process and see how that does. Chad

On Tue, Dec 01, 2009 at 12:13:36AM -0800, Chad Cantwell wrote: I don't think the hardware has any problems, it only started having errors when I upgraded OpenSolaris. It's still working fine again now after a reboot. Actually, I reread one of your earlier messages, and I didn't realize at first when you said non-Sun JBOD that this didn't apply to me (in regards to the msi=0 fix) because I didn't realize JBOD was shorthand for an external expander device. Since I'm just using bare metal and passive backplanes, I think the msi=0 fix should apply to me based on what you wrote earlier; anyway, I've put set mpt:mpt_enable_msi = 0 in /etc/system now and rebooted, as was suggested earlier. I've resumed my rsync, and so far there have been no errors, but it's only been 20 minutes or so.
I should have a good idea by tomorrow if this definitely fixed the problem (since even when the machine was not crashing it was tallying up iostat errors fairly rapidly) Thanks again for your help. Sorry for wasting your time if the previously posted workaround fixes things. I'll let you know tomorrow either way. Chad On Tue, Dec 01, 2009 at 05:57:28PM +1000, James C. McPherson wrote: Chad Cantwell wrote: After another crash I checked the syslog and there were some different errors than the ones I saw previously during operation: ... Nov 30 20:59:13 the-vault LSI PCI device (1000,) not supported. ... Nov 30 20:59:13 the-vault mpt_config_space_init failed ... Nov 30 20:59:15 the-vault mpt_restart_ioc failed Nov 30 21:33:02 the-vault fmd: [ID 377184 daemon.error] SUNW-MSG-ID: PCIEX-8000-8R, TYPE: Fault, VER: 1, SEVERITY: Major Nov 30 21:33:02 the-vault EVENT-TIME: Mon Nov 30 21:33:02 PST 2009 Nov 30 21:33:02 the-vault PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: the-vault Nov 30 21:33:02 the-vault SOURCE: eft, REV: 1.16 Nov 30 21:33:02 the-vault EVENT-ID: 7886cc0d-4760-60b2-e06a-8158c3334f63 Nov 30 21:33:02 the-vault DESC: The transmitting device sent an invalid request. Nov 30 21:33:02 the-vault Refer to http://sun.com/msg/PCIEX-8000-8R for more information. Nov 30
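For anyone who wants to repeat the interrupt check mentioned above, the one-liner is echo '::interrupts' | mdb -k (run as root). The thing to look for is whether an mpt instance and an e1000g instance report their ISRs on the same row; the sample line below is illustrative only, not captured output from this machine, and the exact columns vary by build:

# echo '::interrupts' | mdb -k
IRQ  Vect IPL Bus   Trg Type   CPU Share APIC/INT# ISR(s)
24   0x60 5   PCI   Lvl Fixed  1   2     0x0/0x18  mpt_intr, e1000g_intr

A Share count above 1 with both drivers on one row would confirm the sharing Chad describes.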
Re: [zfs-discuss] mpt errors on snv 127
I don't think the hardware has any problems, it only started having errors when I upgraded OpenSolaris. It's still working fine again now after a reboot. Actually, I reread one of your earlier messages, and I didn't realize at first when you said non-Sun JBOD that this didn't apply to me (in regards to the msi=0 fix) because I didn't realize JBOD was shorthand for an external expander device. Since I'm just using bare metal, and passive backplanes, I think the msi=0 fix should apply to me based on what you wrote earlier, anyway, I've put set mpt:mpt_enable_msi = 0 now in /etc/system and rebooted as it was suggested earlier. I've resumed my rsync, and so far there have been no errors, but it's only been 20 minutes or so. I should have a good idea by tomorrow if this definitely fixed the problem (since even when the machine was not crashing it was tallying up iostat errors fairly rapidly). Thanks again for your help. Sorry for wasting your time if the previously posted workaround fixes things. I'll let you know tomorrow either way. Chad On Tue, Dec 01, 2009 at 05:57:28PM +1000, James C. McPherson wrote: Chad Cantwell wrote: After another crash I checked the syslog and there were some different errors than the ones I saw previously during operation: ... Nov 30 20:59:13 the-vault LSI PCI device (1000,) not supported. ... Nov 30 20:59:13 the-vault mpt_config_space_init failed ... Nov 30 20:59:15 the-vault mpt_restart_ioc failed Nov 30 21:33:02 the-vault fmd: [ID 377184 daemon.error] SUNW-MSG-ID: PCIEX-8000-8R, TYPE: Fault, VER: 1, SEVERITY: Major Nov 30 21:33:02 the-vault EVENT-TIME: Mon Nov 30 21:33:02 PST 2009 Nov 30 21:33:02 the-vault PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: the-vault Nov 30 21:33:02 the-vault SOURCE: eft, REV: 1.16 Nov 30 21:33:02 the-vault EVENT-ID: 7886cc0d-4760-60b2-e06a-8158c3334f63 Nov 30 21:33:02 the-vault DESC: The transmitting device sent an invalid request. Nov 30 21:33:02 the-vault Refer to http://sun.com/msg/PCIEX-8000-8R for more information. Nov 30 21:33:02 the-vault AUTO-RESPONSE: One or more device instances may be disabled Nov 30 21:33:02 the-vault IMPACT: Loss of services provided by the device instances associated with this fault Nov 30 21:33:02 the-vault REC-ACTION: Ensure that the latest drivers and patches are installed. Otherwise schedule a repair procedure to replace the affected device(s). Use fmadm faulty to identify the devices or contact Sun for support. Sorry to have to tell you, but that HBA is dead. Or at least dying horribly. If you can't init the config space (that's the PCI bus config space), then you've got about 1/2 the nails in the coffin hammered in. Then the failure to restart the IOC (IO controller unit) == the rest of the lid hammered down. best regards, James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
Well, ok, the msi=0 thing didn't help after all. A few minutes after my last message a few errors showed up in iostat, and then in a few minutes more the machine was locked up hard... Maybe I will try just doing a scrub instead of my rsync process and see how that does. Chad On Tue, Dec 01, 2009 at 12:13:36AM -0800, Chad Cantwell wrote: I don't think the hardware has any problems, it only started having errors when I upgraded OpenSolaris. It's still working fine again now after a reboot. Actually, I reread one of your earlier messages, and I didn't realize at first when you said non-Sun JBOD that this didn't apply to me (in regards to the msi=0 fix) because I didn't realize JBOD was shorthand for an external expander device. Since I'm just using bare metal, and passive backplanes, I think the msi=0 fix should apply to me based on what you wrote earlier, anyway, I've put set mpt:mpt_enable_msi = 0 now in /etc/system and rebooted as it was suggested earlier. I've resumed my rsync, and so far there have been no errors, but it's only been 20 minutes or so. I should have a good idea by tomorrow if this definitely fixed the problem (since even when the machine was not crashing it was tallying up iostat errors fairly rapidly). Thanks again for your help. Sorry for wasting your time if the previously posted workaround fixes things. I'll let you know tomorrow either way. Chad On Tue, Dec 01, 2009 at 05:57:28PM +1000, James C. McPherson wrote: Chad Cantwell wrote: After another crash I checked the syslog and there were some different errors than the ones I saw previously during operation: ... Nov 30 20:59:13 the-vault LSI PCI device (1000,) not supported. ... Nov 30 20:59:13 the-vault mpt_config_space_init failed ... Nov 30 20:59:15 the-vault mpt_restart_ioc failed Nov 30 21:33:02 the-vault fmd: [ID 377184 daemon.error] SUNW-MSG-ID: PCIEX-8000-8R, TYPE: Fault, VER: 1, SEVERITY: Major Nov 30 21:33:02 the-vault EVENT-TIME: Mon Nov 30 21:33:02 PST 2009 Nov 30 21:33:02 the-vault PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: the-vault Nov 30 21:33:02 the-vault SOURCE: eft, REV: 1.16 Nov 30 21:33:02 the-vault EVENT-ID: 7886cc0d-4760-60b2-e06a-8158c3334f63 Nov 30 21:33:02 the-vault DESC: The transmitting device sent an invalid request. Nov 30 21:33:02 the-vault Refer to http://sun.com/msg/PCIEX-8000-8R for more information. Nov 30 21:33:02 the-vault AUTO-RESPONSE: One or more device instances may be disabled Nov 30 21:33:02 the-vault IMPACT: Loss of services provided by the device instances associated with this fault Nov 30 21:33:02 the-vault REC-ACTION: Ensure that the latest drivers and patches are installed. Otherwise schedule a repair procedure to replace the affected device(s). Use fmadm faulty to identify the devices or contact Sun for support. Sorry to have to tell you, but that HBA is dead. Or at least dying horribly. If you can't init the config space (that's the PCI bus config space), then you've got about 1/2 the nails in the coffin hammered in. Then the failure to restart the IOC (IO controller unit) == the rest of the lid hammered down. best regards, James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
First I tried just upgrading to b127, which had a few issues besides the mpt driver. After that I did a clean install of b127, but no, I don't have my osol2009.06 root still there. I wasn't sure how to install another copy and leave it there (I suspect it is possible, since I saw when doing upgrades it creates a second root environment, but my forte isn't Solaris, so I just reformatted the root device). On Tue, Dec 01, 2009 at 08:09:32AM -0500, Mark Johnson wrote: Chad Cantwell wrote: Hi, I was using for quite a while OpenSolaris 2009.06 with the opensolaris-provided mpt driver to operate a zfs raidz2 pool of about 20T and this worked perfectly fine (no issues or device errors logged for several months, no hanging). A few days ago I decided to reinstall with the latest OpenSolaris in order to take advantage of raidz3. Just to be clear... The same setup was working fine on osol2009.06, you upgraded to b127 and it started failing? Did you keep the osol2009.06 BE around so you can reboot back to it? If so, have you tried the osol2009.06 mpt driver in the BE with the latest bits (make sure you make a backup copy of the mpt driver)? MRJ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
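As an aside, on OpenSolaris the old root can be kept around as a boot environment instead of being reformatted; a rough sketch with the beadm tool that shipped with 2008.11 and later (the BE name here is made up):

# beadm list                       (show existing boot environments)
# beadm create osol-106-backup     (clone the running BE before upgrading)
# beadm activate osol-106-backup   (make it the default for the next boot)

pkg image-update creates such a clone automatically, which is the second root environment Chad noticed during upgrades.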
Re: [zfs-discuss] mpt errors on snv 127
To update everyone, I did a complete zfs scrub, and it generated no errors in iostat, and I have 4.8T of data on the filesystem so it was a fairly lengthy test. The machine has also exhibited no evidence of instability. If I were to start copying a lot of data to the filesystem again though, I'm sure it would generate errors and crash again. Chad On Tue, Dec 01, 2009 at 12:29:16AM -0800, Chad Cantwell wrote: Well, ok, the msi=0 thing didn't help after all. A few minutes after my last message a few errors showed up in iostat, and then in a few minutes more the machine was locked up hard... Maybe I will try just doing a scrub instead of my rsync process and see how that does. Chad On Tue, Dec 01, 2009 at 12:13:36AM -0800, Chad Cantwell wrote: I don't think the hardware has any problems, it only started having errors when I upgraded OpenSolaris. It's still working fine again now after a reboot. Actually, I reread one of your earlier messages, and I didn't realize at first when you said non-Sun JBOD that this didn't apply to me (in regards to the msi=0 fix) because I didn't realize JBOD was shorthand for an external expander device. Since I'm just using bare metal, and passive backplanes, I think the msi=0 fix should apply to me based on what you wrote earlier, anyway, I've put set mpt:mpt_enable_msi = 0 now in /etc/system and rebooted as it was suggested earlier. I've resumed my rsync, and so far there have been no errors, but it's only been 20 minutes or so. I should have a good idea by tomorrow if this definitely fixed the problem (since even when the machine was not crashing it was tallying up iostat errors fairly rapidly). Thanks again for your help. Sorry for wasting your time if the previously posted workaround fixes things. I'll let you know tomorrow either way. Chad On Tue, Dec 01, 2009 at 05:57:28PM +1000, James C. McPherson wrote: Chad Cantwell wrote: After another crash I checked the syslog and there were some different errors than the ones I saw previously during operation: ... Nov 30 20:59:13 the-vault LSI PCI device (1000,) not supported. ... Nov 30 20:59:13 the-vault mpt_config_space_init failed ... Nov 30 20:59:15 the-vault mpt_restart_ioc failed Nov 30 21:33:02 the-vault fmd: [ID 377184 daemon.error] SUNW-MSG-ID: PCIEX-8000-8R, TYPE: Fault, VER: 1, SEVERITY: Major Nov 30 21:33:02 the-vault EVENT-TIME: Mon Nov 30 21:33:02 PST 2009 Nov 30 21:33:02 the-vault PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: the-vault Nov 30 21:33:02 the-vault SOURCE: eft, REV: 1.16 Nov 30 21:33:02 the-vault EVENT-ID: 7886cc0d-4760-60b2-e06a-8158c3334f63 Nov 30 21:33:02 the-vault DESC: The transmitting device sent an invalid request. Nov 30 21:33:02 the-vault Refer to http://sun.com/msg/PCIEX-8000-8R for more information. Nov 30 21:33:02 the-vault AUTO-RESPONSE: One or more device instances may be disabled Nov 30 21:33:02 the-vault IMPACT: Loss of services provided by the device instances associated with this fault Nov 30 21:33:02 the-vault REC-ACTION: Ensure that the latest drivers and patches are installed. Otherwise schedule a repair procedure to replace the affected device(s). Use fmadm faulty to identify the devices or contact Sun for support. Sorry to have to tell you, but that HBA is dead. Or at least dying horribly. If you can't init the config space (that's the PCI bus config space), then you've got about 1/2 the nails in the coffin hammered in. Then the failure to restart the IOC (IO controller unit) == the rest of the lid hammered down. best regards, James C. 
McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
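For anyone reproducing this kind of test, the scrub-as-load-generator recipe from the thread is just the following (pool name taken from the messages above):

# zpool scrub vault       (read and verify every allocated block in the pool)
# zpool status -v vault   (watch scrub progress and per-vdev error counts)
# iostat -en              (soft/hard/transport error totals per device)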
[zfs-discuss] mpt errors on snv 127
Hi, Sorry for not replying to one of the already open threads on this topic; I've just joined the list for the purposes of this discussion and have nothing in my client to reply to yet. I have an x86_64 OpenSolaris machine running on a Core 2 Quad Q9650 platform with two LSI SAS3081E-R PCI-E 8-port SAS controllers, with 8 drives each. The LSI cards are flashed with IT firmware from Feb 2009 (I think, I can double-check if it's important). The drives are Samsung HD154UI 1.5TB disks. I was using for quite a while OpenSolaris 2009.06 with the opensolaris-provided mpt driver to operate a zfs raidz2 pool of about 20T and this worked perfectly fine (no issues or device errors logged for several months, no hanging). A few days ago I decided to reinstall with the latest OpenSolaris in order to take advantage of raidz3. I hadn't known at the time about the current mpt issues, or I may have held off on upgrading. I installed Solaris Nevada build 127 from the DVD image. I then proceeded to set up a raidz3 pool with the same disks as before, of a slightly smaller size (obviously) than the former raidz2 pool. I started a moderately long-running and heavy-load rsync to copy my data back to the pool from another host. Several times during the day (sometimes a couple times an hour, or it could go up to a few hours with no errors), I get several syslog errors and warnings about mpt, similar but not identical to what I've seen reported here by others. Also, iostat -en shows several hw and trn errors of varying amounts for all the drives (in OpenSolaris 2009.06 I never had any iostat errors). After a while the machine will hang in a variety of ways. The first time it was pingable, and I could authenticate through ssh but it would never spawn a shell. The second time it crashed it was unpingable from the network, and the display was black, although the numlock key was still properly toggling the numlock light on the console. Here's a sample of my errors. I've included the complete series of errors from one timestamp, and a few lines from a subsequent series of errors a couple minutes later: (if there's any other info I can provide or more things to test just let me know. 
Thanks, --Chad )

Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 29 04:42:55 the-vault mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 29 04:42:55 the-vault mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 29 04:42:55 the-vault mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 29 04:42:55 the-vault mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1):
Nov 29 04:42:55 the-vault mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31120200
Nov 29 04:42:55 the-vault scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@1/pci1000,3...@0 (mpt1
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
Hi, I just posted a summary of a similar issue I'm having with non-Sun hardware. For the record, it's in a Chenbro RM41416 chassis with 4 Chenbro SAS backplanes but no expanders (each backplane is 4 disks connected by SFF-8087 cable). Each of my LSI-brand SAS3081E PCI-E cards is connected to two backplanes with 1m SFF-8087 (both ends) cables. For more details, if they are important, see my other post. I haven't tried the MSI workaround yet (although I'm not sure what MSI is) but from what I've read the workaround won't fix the issues in my case with non-Sun hardware. Thanks, Chad On Tue, Dec 01, 2009 at 12:36:33PM +1000, James C. McPherson wrote: Hi all, I believe it's an accurate summary of the emails on this thread over the last 18 hours to say that

(1) disabling MSI (Message Signaled Interrupt) support in xVM makes the problem go away
(2) disabling MSI support on bare metal when you only have disks internal to your host (no jbods) makes the problem go away (several reports of this)
(3) disabling MSI support on bare metal when you have a non-Sun jbod (and cables) does _not_ make the problem go away (several reports of this)
(4) the problem is not seen with a Sun-branded jbod and cables (only one report of this)
(5) the problem is seen with both mpt(7d) and itmpt(7d)
(6) mpt(7d) without MSI support is slow

For those who've been suffering this problem and who have non-Sun jbods, could you please let me know what model of jbod and cables (including length thereof) you have in your configuration. For those of you who have been running xVM without MSI support, could you please confirm whether the devices exhibiting the problem are internal to your host, or connected via jbod. And if via jbod, please confirm the model number and cables. Please note that Jianfei and I are not making assumptions about the root cause here, we're just trying to nail down specifics of what seems to be a likely cause. thankyou in advance, James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
Hi, Replied to your previous general query already, but in summary, they are in the server chassis. It's a Chenbro 16 hotswap bay case. It has 4 mini backplanes that each connect via an SFF-8087 cable (1m) to my LSI cards (2 cables / 8 drives per card). Chad On Tue, Dec 01, 2009 at 01:02:34PM +1000, James C. McPherson wrote: Chad Cantwell wrote: Hi, Sorry for not replying to one of the already open threads on this topic; I've just joined the list for the purposes of this discussion and have nothing in my client to reply to yet. I have an x86_64 opensolaris machine running on a Core 2 Quad Q9650 platform with two LSI SAS3081E-R PCI-E 8 port SAS controllers, with 8 drives each. Are these disks internal to your server's chassis, or external in a jbod? If in a jbod, which one? Also, which cables are you using? thankyou, James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
Hi, The Chenbro chassis contains everything - the motherboard/CPU, and the disks. As far as I know, the Chenbro backplanes are basically electrical jumpers that the LSI cards shouldn't be aware of. They pass the SATA signals straight through from the SFF-8087 cables to the disks. Thanks, Chad On Tue, Dec 01, 2009 at 01:43:06PM +1000, James C. McPherson wrote: Chad Cantwell wrote: Hi, Replied to your previous general query already, but in summary, they are in the server chassis. It's a Chenbro 16 hotswap bay case. It has 4 mini backplanes that each connect via an SFF-8087 cable (1m) to my LSI cards (2 cables / 8 drives per card). Hi Chad, thanks for the followup. Just to confirm - you've got this Chenbro chassis connected to the actual server chassis (where the cpu is), or do you have the cpu inside the Chenbro chassis? thankyou, James -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] mpt errors on snv 127
/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 22:38:21 the-vault mpt_config_space_init failed
Nov 30 22:38:22 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 22:38:22 the-vault LSI PCI device (1000,) not supported.
Nov 30 22:38:22 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 22:38:22 the-vault mpt_config_space_init failed
Nov 30 22:38:46 the-vault sshd[636]: [ID 800047 auth.crit] monitor fatal: protocol error during kex, no DH_GEX_REQUEST: 254
Nov 30 22:38:46 the-vault sshd[637]: [ID 800047 auth.crit] fatal: Protocol error in privilege separation; expected packet type 254, got 20
Nov 30 23:11:23 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 23:11:23 the-vault mpt_send_handshake_msg task 3 failed
Nov 30 23:11:23 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 23:11:23 the-vault LSI PCI device (1000,) not supported.
Nov 30 23:11:23 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 23:11:23 the-vault mpt_config_space_init failed
Nov 30 23:11:25 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 23:11:25 the-vault LSI PCI device (1000,) not supported.
Nov 30 23:11:25 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 23:11:25 the-vault mpt_config_space_init failed
Nov 30 23:11:25 the-vault scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@3/pci111d,8...@0/pci111d,8...@0/pci1000,3...@0 (mpt0):
Nov 30 23:11:25 the-vault mpt_restart_ioc failed

(and that's the last message before I hit the reset button. Host was unpingable, and just moving the mouse around on the screen was extremely delayed)

Nov 30 23:32:05 the-vault genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version snv_127 64-bit
Nov 30 23:32:05 the-vault genunix: [ID 943908 kern.notice] Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.

Also, it says it resilvered some data; this is the first time I've seen any notes next to a device. Still no zpool errors though.

# zpool status vault
  pool: vault
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Nov 30 23:33:16 2009
config:

        NAME         STATE     READ WRITE CKSUM
        vault        ONLINE       0     0     0
          raidz3-0   ONLINE       0     0     0
            c1t6d0   ONLINE       0     0     0
            c1t7d0   ONLINE       0     0     0
            c1t8d0   ONLINE       0     0     0
            c1t9d0   ONLINE       0     0     0
            c1t11d0  ONLINE       0     0     0
            c1t12d0  ONLINE       0     0     0
            c1t13d0  ONLINE       0     0     0
            c1t14d0  ONLINE       0     0     0
            c2t3d0   ONLINE       0     0     0
            c2t4d0   ONLINE       0     0     0
            c2t5d0   ONLINE       0     0     0  11.5K resilvered
            c2t6d0   ONLINE       0     0     0
            c2t7d0   ONLINE       0     0     0
            c2t8d0   ONLINE       0     0     0
            c2t9d0   ONLINE       0     0     0
            c2t10d0  ONLINE       0     0     0

errors: No known data errors
#

On Mon, Nov 30, 2009 at 06:46:13PM -0800, Chad Cantwell wrote: Hi, Sorry for not replying to one of the already open threads on this topic; I've just joined the list for the purposes of this discussion and have nothing in my client to reply to yet. I have an x86_64 OpenSolaris machine running on a Core 2 Quad Q9650 platform with two LSI SAS3081E-R PCI-E 8-port SAS controllers, with 8 drives each. 
The LSI cards are flashed with IT firmware from Feb 2009 (I think, I can double-check if it's important). The drives are Samsung HD154UI 1.5TB disks. I was using for quite a while OpenSolaris 2009.06 with the opensolaris-provided mpt driver to operate a zfs raidz2 pool of about 20T and this worked perfectly fine (no issues or device errors logged for several months, no hanging). A few days ago I decided to reinstall with the latest OpenSolaris in order to take advantage of raidz3. I hadn't known at the time about the current mpt issues, or I may have held off on upgrading. I installed Solaris Nevada build 127 from the DVD image. I then proceeded to set up a raidz3 pool with the same disks as before, of a slightly smaller size (obviously) than the former raidz2 pool. I started a moderately long-running and heavy-load rsync to copy my data back to the pool from another
[zfs-discuss] doing HDS shadow copy of a zpool
I apologize if this has been answered already, but I've tried to RTFM and haven't found much. I'm trying to get HDS shadow copy to work for zpool replication. We do this with VXVM by modifying each target disk ID after it's been shadowed from the source LUN. This allows us to import each target disk into the target diskgroup and then have its volumes mounted for backup over the network. From what I can tell, each LUN in a zpool will have two 256K vdev labels at the front and two at the end. Is there a way to modify the vdev labels so that the target LUNs don't end up with the same zpool ID as the source LUNs? Better yet, is there a way to import and rename a zpool that has the exact same ID and name as an existing one? As it stands now, after shadow copy, format can tell that each target LUN is labeled to be part of the source zpool, but that is invisible to zpool import. Thanks, Chad ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
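For what it's worth, the rename half of the question has an answer when the pool IDs differ: zpool import accepts a pool's numeric ID and an optional new name, along the lines of the sketch below (the ID shown is made up). The catch in the shadow-copy case is exactly what Chad observes - the copied LUNs carry the source pool's GUID, so while the source pool is imported the copies never show up as an importable pool at all.

# zpool import                            (lists importable pools by name and numeric id)
# zpool import 6930279553036405 clonepool (import that pool under a new name)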
Re: [zfs-discuss] Can I trust ZFS?
On Jul 31, 2008, at 2:56 PM, Bob Netherton wrote: On Thu, 2008-07-31 at 13:25 -0700, Ross wrote: Hey folks, I guess this is an odd question to be asking here, but I could do with some feedback from anybody who's actually using ZFS in anger. ZFS in anger ? That's an interesting way of putting it :-) If you watch Phil Liggett and/or Paul Sherwen commentating on a cycling event, you're pretty much guaranteed to hear turning the pedals in anger at some point when a rider goes on the attack. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs, raidz, spare and jbod
On Jul 25, 2008, at 7:27 AM, Claus Guttesen wrote: I'm running the version that was supplied on the CD, this is 1.20.00.15 from 2007-04-04. The firmware is V1.45 from 2008-3-27. Check the version at the Areca website. They may have a more recent driver there. The dates are later for the 1.20.00.15 and there is a -71010 extension. Otherwise, file a bug with Areca. They are pretty good about responding. Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RFE: -t flag for 'zfs destroy'
http://www.opensolaris.org/bug/report.jspa You'll need an OpenSolaris.org account to file the RFE of course. On Jul 17, 2008, at 10:52 AM, Will Murnane wrote: I would like to request an additional flag for the command line zfs tools. Specifically, I'd like to have a -t flag for zfs destroy, as shown below. Suppose I have a pool home with child filesystem will, and a snapshot home/will@snap. Then I run the following commands:

# zfs destroy -t volume home/will@snap
zfs: not destroying home/will@snap, as it is not a volume.
# zfs destroy -t snapshot home/will@snap
(succeeds)
# zfs destroy -t snapshot home/will
zfs: not destroying home/will, as it is not a snapshot.
# zfs destroy -t volume home/will
zfs: not destroying home/will, as it is not a volume.
# zfs destroy -t filesystem home/will
(succeeds)

Now, to test the behavior of '-r', I recreate the same structure as before, and run some more commands:

zfs destroy -r -t snapshot home
(succeeds)
zfs list -Hro name home
home
home/will

One more time, to demonstrate -R:

zfs clone home/will@snap home/oldwill
zfs destroy -R -t snapshot home
(???)

The two ways I can think of at this point are to destroy the clone as well, or to promote it and then destroy the snapshots. Or, I suppose, make -R incompatible with -t for zfs destroy. I imagine this would be easy to implement, and for scripting use it would be a good sanity check; if you're trying to clean up snapshots you don't accidentally kill the filesystems by messing up some string operation and naming a valid filesystem by mistake. Especially with -r, this could prevent silly mistakes. Also, it might be a helpful thing to add to 'zfs get'; if one wants to see some property for all user home directories and not the snapshots of them, syntax like zfs get used -r -t filesystem home could list the used property of all the children of the home filesystem. This is a slightly different semantic from the proposed zfs destroy enhancement: it's a filter rather than a predicate. I think this is the Right Thing to do with this flag, and it will be intuitive for users. Any suggestions on better specifying the behavior? How can I formally propose this? I'd be glad to implement it if would help this get finished. Thanks! Will ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
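As a data point for the filter semantics, zfs list already accepts a -t filter today, so the proposed zfs get behaviour would line up with existing usage (dataset names from Will's example):

# zfs list -t snapshot -r home     (only snapshots under home)
# zfs list -t filesystem -r home   (only filesystems, no snapshots)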
[zfs-discuss] previously mentioned J4000 released
Here's the announcement for those new Sun JBOD devices mentioned the other day. http://www.sun.com/aboutsun/pr/2008-07/sunflash.20080709.1.xml ckl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [perf-discuss] [storage-discuss] zpool io to 6140 is really slow
On 11/20/07, Asif Iqbal [EMAIL PROTECTED] wrote: On Nov 20, 2007 7:01 AM, Chad Mynhier [EMAIL PROTECTED] wrote: On 11/20/07, Asif Iqbal [EMAIL PROTECTED] wrote: On Nov 19, 2007 1:43 AM, Louwtjie Burger [EMAIL PROTECTED] wrote: On Nov 17, 2007 9:40 PM, Asif Iqbal [EMAIL PROTECTED] wrote: (Including storage-discuss) I have 6 6140s with 96 disks, 64 of which are Seagate ST337FC (300GB 10K RPM FC-AL). Those disks are 2Gb disks, so the tray will operate at 2Gb. That is still 256MB/s. I am getting about 194MB/s. 2Gb fibre channel is going to max out at a data transmission rate around 200MB/s rather than the 256MB/s that you'd expect. Fibre channel uses an 8-bit/10-bit encoding, so it transmits 8 bits of data in 10 bits on the wire. So while 256MB/s is being transmitted on the connection itself, only 200MB/s of that is the data that you're transmitting. But I am running 4Gb fibre channel with 4GB of NVRAM on 6 trays of 300GB FC 10K RPM (2Gb/s) disks, so I should get a lot more than ~200MB/s. Shouldn't I? Here, I'm relying on what Louwtjie said above, that the tray itself is going to be limited to 2Gb/s because of the 2Gb/s FC disks. Chad Mynhier ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
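Putting rough numbers on the 8b/10b point (nominal figures, ignoring FC framing overhead): a 2Gb FC link actually signals at 2.125 Gbaud, and only 8 of every 10 bits on the wire are payload, so

    2.125 Gbit/s x 8/10 = 1.7 Gbit/s of data
    1.7 Gbit/s / 8 bits per byte = ~212 MB/s

which is why ~200MB/s per 2Gb tray is about the ceiling, no matter how fast the HBA and NVRAM upstream are.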
Re: [zfs-discuss] nv-69 install panics dell precision 670
Apparently known bug, fixed in snv_70. http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6577473 On Aug 14, 2007, at 8:28 AM, Bill Moloney wrote: using hyperterm, I captured the panic message as:

SunOS Release 5.11 Version snv_69 32-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
panic[cpu0]/thread=fec1ede0: Can't handle mwait size 0
fec37e70 unix:mach_alloc_mwait+72 (fec2006c)
fec37e8c unix:mach_init+b0 (c0ce80, fe800010, f)
fec37eb8 unix:psm_install+95 (fe84166e, 3, fec37e)
fec37ec8 unix:startup_end+93 (fec37ee4, fe91731e,)
fec37ed0 unix:startup+3a (fe800010, fec33c98,)
fec37ee4 genunix:main+1e ()
skipping system dump - no dump device configured
rebooting...

this behavior loops endlessly This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: ZFS Apple WWDC Keynote Absence
On Jun 12, 2007, at 9:37 AM, Andy Lubel wrote: Yeah this is pretty sad, we had such plans for actually using our Apple (PPC) hardware in our datacenter for something other than AFP and web serving. It also shows how limited Apple's vision seems to be. I think you are jumping to conclusions. For two CEOs not to be on the same page demonstrates that there is something else going on rather than just "we chose not to put a future-ready file system into our next OS". And how it's being dismissed by Apple is quite upsetting. I think you are jumping to conclusions. Jonathan jumped the gun on something. Chad I wonder when we will see Johnny-cat and Steve-o in the same room talking about it. On 6/12/07 8:23 AM, Sunstar Dude [EMAIL PROTECTED] wrote: Yeah, What is the deal with this? I am so bummed :( What the heck was Sun's CEO talking about the other day? And why the heck did Apple not include at least non-default ZFS support in Leopard? If no ZFS in Leopard, then what is all the Apple-induced hype about? A trapezoidal Dock table? A transparent menu bar? Can anyone explain the absence of ZFS in Leopard??? I signed up for this forum just to post this. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Andy Lubel -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mac OS X Leopard to use ZFS
On Jun 7, 2007, at 12:50 PM, Rick Mann wrote: From Macintouch (http://macintouch.com/#other.2007.06.07): --- On stage Wednesday in Washington D.C., Sun Microsystems Inc. CEO Jonathan Schwartz revealed that his company's open-source ZFS file system will replace Apple's long-used HFS+ in Mac OS X 10.5, a.k.a. Leopard, when the new operating system ships this fall. This week, you'll see that Apple is announcing at their Worldwide Developers Conference that ZFS has become the file system in Mac OS X, said Schwartz. ZFS (Zettabyte File System), designed by Sun for its Solaris OS but licensed as open-source, is a 128-bit file storage system that features, among other things, pooled storage, which means that users simply plug in additional drives to add space, without worrying about such traditional storage parameters as volumes or partitions. [ZFS] eliminates volume management, it has extremely high performance. It permits the failure of disk drives, crowed Schwartz during a presentation focused on Sun's new blade servers. --- We'll see next week what Steve announces at the WWDC keynote (which is not under NDA like the rest of the conference). I'll be there and try to remember to post what is said (though it will probably be in a billion other places as well) Chad ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ARC and patents
With US patent laws the way they are, no one but a patent lawyer could safely give you an answer. If by some chance a patent lawyer is lurking and decided to comment, none of the rest of us could safely read such comments. No one working on ZFS could even safely look at the patent you've referenced. On Jun 5, 2007, at 11:40 PM, Kasper Nielsen wrote: Hi there, I was looking at using something very similar to arc.c http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c for an open source project. However, I'm a bit worried about the patent IBM is holding on the ARC data structure: http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220040098541%22.PGNR.&OS=DN/20040098541&RS=DN/20040098541 I remember PostgreSQL dropping their ARC implementation for 2Q some time ago. But I was hoping that someone on this list might have some constructive input on this issue? cheers Kasper ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS - Use h/w raid or not? Thoughts. Considerations.
On 5/17/07, Robert Milkowski [EMAIL PROTECTED] wrote: Hello Phillip, Thursday, May 17, 2007, 6:30:38 PM, you wrote: PF Given: A Solaris 10 u3 server with an externally attached PF disk array with RAID controller(s) PF Question: Is it better to create a zpool from a PF single external LUN on an external disk array, or is it PF better to use no RAID on the disk array and just present PF individual disks to the server and let ZFS take care of the RAID? The other thing - do you use SATA disks? How much of an issue is data loss or corruption for you? Doing software RAID in ZFS can detect AND correct such problems. HW RAID can too, but to a much lesser extent. I think this point needs to be emphasized. If reliability is a prime concern, you absolutely want to let ZFS handle redundancy in one way or another, either as mirroring or as raidz. You can think of redundancy in ZFS as much the same thing as packet retransmission in TCP. If the data comes through bad the first time, checksum verification will catch it, and you get a second chance to get the correct data. A single-LUN zpool is the moral equivalent of disabling retransmission in TCP. Chad Mynhier ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
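To make the two configurations concrete, the choice looks roughly like this at pool-creation time (device names are hypothetical): with a single array LUN ZFS can only detect corruption, while with ZFS-level redundancy it can also repair it from the other copy or from parity.

# zpool create tank c2t0d0                       (one big RAID LUN: detect only)
# zpool create tank mirror c2t0d0 c3t0d0         (ZFS mirror: detect and self-heal)
# zpool create tank raidz c2t0d0 c3t0d0 c4t0d0   (raidz: repair from parity)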
Re: [zfs-discuss] ZFS on the desktop
On Apr 17, 2007, at 7:47 AM, Toby Thain wrote: On 17-Apr-07, at 8:33 AM, Robert Milkowski wrote: Hello Rayson, Tuesday, April 17, 2007, 10:50:41 AM, you wrote: RH On 4/17/07, David R. Litwin [EMAIL PROTECTED] wrote: How about asking Microsoft to change Shared Source first?? Let's leave ms out of this, eh? :-) RH While ZFS is nice, I don't think it is a must for most desktop users. RH For servers and power users, yes. But most (over 90% of world RH population) people who just use the computers to browse the web, check RH emails, do word processing, etc... don't care. Even if they do care, I RH don't think those who do not back up their drive can really understand RH how to use ZFS. I believe that ZFS definitely belongs on a desktop, Apple (and I) assuredly agree with you. I would agree as well. With the proper UI (which I hope Apple has or will eventually have -- waiting to get Leopard! as I have not yet renewed my paid developer program at Apple) ZFS is a killer on the desktop, especially on OS X, where everything of importance has to be or likes to live on the boot device (I understand that OS X does not yet support booting on ZFS but someday it will), but on any consumer-class desktop it is killer because it removes the need for the end user to worry about disks. You need more space, buy a new disk or two and then just add them into the pool of storage. What's interesting about its integration in OS X - and OS X in general - is it diffuses hitherto server-grade technology (UNIX, inter alia) all the way down to everybody's grandmother's non-technical desktop/MacBook. Steve definitely proved his point (starting with NeXT, of course); Linux and Solaris will inevitably arrive there too. To M's detriment :-) Yep Chad --Toby mostly for its built-in reliability, free snapshots, built-in compression and cryptography (soon) and ease of use. ps. a few days ago I encountered my first checksum error on my desktop system on a submirror (two SATA drives in a zfs mirror). Thanks to zfs it won't be a problem and it's already repaired. -- Best regards, Robert mailto:[EMAIL PROTECTED] http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
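Robert's buy-a-disk-and-grow scenario really is a one-liner, for what it's worth (hypothetical device names; note that zpool add grows the pool permanently):

# zpool add tank mirror c4t0d0 c5t0d0   (new mirrored pair; the extra space appears immediately)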
Re: [zfs-discuss] ZFS on the desktop
On Apr 17, 2007, at 10:03 AM, Toby Thain wrote: On 17-Apr-07, at 12:15 PM, Chad Leigh -- Shire.Net LLC wrote: On Apr 17, 2007, at 7:47 AM, Toby Thain wrote: On 17-Apr-07, at 8:33 AM, Robert Milkowski wrote: ... I belive that ZFS definitely belongs on a desktop, Apple (and I) assuredly agree with you. I would agree as well. With the proper UI (which I hope Apple has or will eventually have -- waiting to get Leopard! Full disclosure: I don't think anyone outside Apple yet knows for SURE if it's going to be in Leopard (or even a future release). Found this sceptical article today - or is it out of date? http://arstechnica.com/staff/fatbits.ars/2006/8/15/4995 I don't have any insider or NDA knowledge (as I said, I have not yet re-upped my paid developer status and have not had any of the leopard seeds), but there have been screenshots from Leopard seeds posted that show ZFS volume creation options etc in dialog boxes. Again, who knows if it will actually ship with that feature. But it has been shipped in seeds as far as I know. Siracusa's column is old. Chad ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] FYI: ZFS on USB sticks (from Germany)
On Feb 1, 2007, at 10:51 AM, Richard Elling wrote: FYI, here is an interesting blog on using ZFS with a dozen USB drives from Constantin. http://blogs.sun.com/solarium/entry/solaris_zfs_auf_12_usb My German is somewhat rusty, but I see that Google Translate does a respectable job. Thanks Constantin! -- richard This is the best line: Hier ist die offizielle Dokumentation, echte Systemhelden jedoch kommen mit nur zwei man-Pages aus: zpool und zfs. Roughly, Here [link] is the official documentation; real system heroes need only the two manpages: zpool and zfs Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs / nfs issue (not performance :-) with courier-imap
I am not sure if this is a zfs issue, an nfs issue, a combination of the two, or not an issue with them per se (caching or whatever), or a courier-imap issue, or even a mail client issue. However, the issue happens in at least two different unrelated mail clients, so I don't think it is client related, and I have spoken to someone who uses courier-imap on nfs mounted directories for maildir mailstore using FreeBSD 6.x to NetApp nfs servers without issue (my nfs client is FreeBSD 6.x while the server is Solaris 10 x86 serving ZFS-backed filesystems over nfs), so maybe it is something to do with ZFS and NFS interaction. Basically, I have a few maildir mailstores that are mounted on my FreeBSD imap server from a Solaris 10 server that serves them using NFSv3 from ZFS filesystems (each maildir has its own ZFS filesystem). Most of my maildirs are on a local disk and do not have a problem, and a few on the nfs/zfs do not have the problem, and a few have the problem that appeared right after they were migrated from the local disk to the zfs/nfs filesystem for testing (we would eventually like to move all mail over to this nfs/zfs setup). Basically, in the affected accounts (under Apple Mail.app and Windows Thunderbird), you can delete 1 or more messages (mark for delete), expunge, and then mail starting some place in the list after the deleted messages starts to show the wrong mail content for the given message as shown in the list view. Say I have messages A B C D E F G etc:

A B C D E F G

I delete C and expunge. Now it looks like this:

A B D E F G

but if I click, say, E, it has F's contents, F has G's contents, and no mail has D's contents that I can see. But the list in the mail client list view is correct. -- Some feedback from the courier mail list, from a guy who runs FreeBSD nfs clients to NetApp nfs servers with courier without issue, suggested it might be an nfs caching issue or something on the client or server. Since this is ZFS-backed nfs, I thought to ask here to see if there were any gotchas or anything that might be causing this. ATIME is off (but was on earlier and the problem still happened before I switched it). CHECKSUM, COMPRESS, DEVICES, EXEC, and SETUID are ON, and RDONLY and ZONED are OFF. ACLMODE is groupmask and ACLINHERIT is secure. I have not messed around with the ZIL business to improve performance. Thanks for any insight on how I might have set this up wrong. Thanks Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
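If anyone wants to compare settings against a working maildir server, the relevant properties can be pulled in one go; a sketch with a made-up dataset name, plus the server-side check of how it is exported:

# zfs get atime,checksum,compression,aclmode,aclinherit tank/mail/user1
# share     (with no arguments, lists what is currently NFS-shared)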
Re: [zfs-discuss] Solaris-Supported cards with battery backup
On Jan 24, 2007, at 1:57 PM, Robert Milkowski wrote: Hello James, Wednesday, January 24, 2007, 3:20:14 PM, you wrote: JFH Since we're talking about various hardware configs, does anyone know JFH which controllers with battery backup are supported on Solaris? If JFH we build a big ZFS box I'd like to be able to turn on write caching JFH on the drives but have them battery-backed in the event of a power JFH loss. Are 3ware cards going to be supported any time soon? JFH I checked and there doesn't seem to be a battery backup option JFH for Thumper. Is that right? Does anyone know if there are plans for JFH that? ZFS itself makes sure a transaction is on disk by issuing a write-cache flush command to the disks, so you don't have to worry about it. Areca SATA cards are supported on Solaris x86 by Areca (drivers etc. from them, not from Sun) and they support battery backup. It is what I am using. Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS entry in /etc/vfstab
On 1/10/07, Vahid Moghaddasi [EMAIL PROTECTED] wrote: Hi, Why would I ever need to specify ZFS mount(s) in /etc/vfstab at all? I see it in some documents that zfs can be defined in /etc/vfstab with fstype zfs. Thanks. I don't think it's a question of needing to be able to do so as much as it is a useful transitional mechanism. Some people might not be comfortable with how ZFS keeps track of filesystems and where they should be mounted, and vfstab is something they're used to dealing with. For example, at a previous job, we had a sanity-check script running out of cron to verify that every file system that should have been mounted actually was mounted and that every file system that actually was mounted should have been mounted (in other words, that the mapping of vfstab entries to (non-auto-)mounted filesystems was both one-to-one and onto.)[1] In the pre-ZFS world, knowing what should be mounted was simply a question of looking at vfstab. With ZFS, the filesystems that should be mounted are those filesystems that _are_ mounted. In this model, a sanity-check script like this is meaningless, because there's no longer an independent source of information to say what should be mounted. This is an example where this feature is convenient. There might be other examples where this feature is necessary. Chad Mynhier [1] Note that the purpose of the script was mostly to guard against operator error rather than system problems. With vfstab, it would take two independent actions to change what is mounted on a server and the concept of what should be mounted there. With ZFS, a single action can change both of those. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
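For completeness, the vfstab route goes through the legacy mountpoint; a minimal sketch with a hypothetical dataset:

# zfs set mountpoint=legacy tank/export/home

and then in /etc/vfstab (device-to-fsck and fsck-pass are '-' for zfs):

tank/export/home  -  /export/home  zfs  -  yes  -

After that, mount and the vfstab entry manage the filesystem like any other, and zfs mount -a leaves it alone.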
Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)
On Dec 2, 2006, at 10:56 AM, Al Hopper wrote: On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote: On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote: While other file systems, when they become corrupt, allow you to salvage data :-) They allow you to salvage what you *think* is your data. But in reality, you have no clue what the disks are giving you. I stand by what I said. If you have a massive disk failure, yes. You are right. When you have subtle corruption, some of the data and metadata is bad but not all. In that case you can recover (and verify the data if you have the means to do so) the parts that did not get corrupted. My ZFS experience so far is that it basically said the whole 20GB pool was dead, and I seriously doubt all 20GB was corrupted. That was because you built a pool with no redundancy. In the case where ZFS does not have a redundant config from which to try to reconstruct the data (today) it simply says: sorry charlie - your pool is corrupt. Whereas a RAID system would still be salvageable. Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)
On Dec 2, 2006, at 12:29 PM, Jeff Victor wrote: Chad Leigh -- Shire.Net LLC wrote: On Dec 2, 2006, at 10:56 AM, Al Hopper wrote: On Sat, 2 Dec 2006, Chad Leigh -- Shire.Net LLC wrote: On Dec 2, 2006, at 6:01 AM, [EMAIL PROTECTED] wrote: While other file systems, when they become corrupt, allow you to salvage data :-) They allow you to salvage what you *think* is your data. But in reality, you have no clue what the disks are giving you. I stand by what I said. If you have a massive disk failure, yes. You are right. When you have subtle corruption, some of the data and metadata is bad but not all. In that case you can recover (and verify the data if you have the means to do so) the parts that did not get corrupted. My ZFS experience so far is that it basically said the whole 20GB pool was dead, and I seriously doubt all 20GB was corrupted. That was because you built a pool with no redundancy. In the case where ZFS does not have a redundant config from which to try to reconstruct the data (today) it simply says: sorry charlie - your pool is corrupt. Whereas a RAID system would still be salvageable. That is a comparison of apples to oranges. The RAID system has Redundancy. If the ZFS pool had been configured with redundancy, it would have fared at least as well as the RAID system. Without redundancy, neither of them can magically reconstruct data. The RAID system would simply be an AID system. That is not the question. Assuming the error came OUT of the RAID system (which it did in this case, as there was a bug in the driver and the cache did not get flushed in a certain shutdown situation), another FS would have been salvageable, as the whole 20GB of the pool was not corrupt. Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)
On Dec 1, 2006, at 9:50 AM, Al Hopper wrote: Followup: When you say you fixed the HW, I'm curious as to what you found and if this experience with ZFS convinced you that your trusted RAID H/W did, in fact, have issues? Do you think that it's likely that there are others running production systems on RAID systems that they trust, but don't realize may have bugs (causing data corruption) that have yet to be discovered? And this is different from any other storage system, how? (ie, JBOD controllers and disks can also have subtle bugs that corrupt data) Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)
On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote: Chad Leigh -- Shire.Net LLC wrote: On Dec 1, 2006, at 9:50 AM, Al Hopper wrote: Followup: When you say you fixed the HW, I'm curious as to what you found and if this experience with ZFS convinced you that your trusted RAID H/W did, in fact, have issues? Do you think that it's likely that there are others running production systems on RAID systems that they trust, but don't realize may have bugs (causing data corruption) that have yet to be discovered? And this is different from any other storage system, how? (ie, JBOD controllers and disks can also have subtle bugs that corrupt data) Of course, but there isn't the expectation of data reliability with a JBOD that there is with some RAID configurations. There is not? People buy disk drives and expect them to corrupt their data? I expect the drives I buy to work fine (knowing that there could be bugs etc in them, the same as with my RAID systems). Chad --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)
On Dec 1, 2006, at 10:17 PM, Ian Collins wrote: Chad Leigh -- Shire.Net LLC wrote: On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote: Chad Leigh -- Shire.Net LLC wrote: And this is different from any other storage system, how? (ie, JBOD controllers and disks can also have subtle bugs that corrupt data) Of course, but there isn't the expectation of data reliability with a JBOD that there is with some RAID configurations. There is not? People buy disk drives and expect them to corrupt their data? I expect the drives I buy to work fine (knowing that there could be bugs etc in them, the same as with my RAID systems). So you trust your important data to a single drive? I doubt it. But I bet you do trust your data to a hardware RAID array. Yes, but not because I expect a single drive to be more error-prone (versus total failure). Total drive failure on a single disk loses all your data. But we are not talking total failure, we are talking errors that corrupt data. I buy individual drives with the expectation that they are designed to be error-free and are error-free for the most part, and I do not expect a RAID array to be more robust in this regard (after all, the RAID is made up of a bunch of single drives). Some people on this list think that RAID arrays are more likely to corrupt your data than JBOD (both with ZFS on top, for example, a ZFS mirror of two RAID arrays versus a JBOD mirror or raidz). There is no proof of this, or even a reasonable hypothetical explanation for it, that I have seen presented. Chad Ian --- Chad Leigh -- Shire.Net LLC Your Web App and Email hosting provider chad at shire.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: Production ZFS Server Death (06/06)
On Dec 1, 2006, at 10:42 PM, Toby Thain wrote: On 1-Dec-06, at 6:36 PM, Chad Leigh -- Shire.Net LLC wrote: On Dec 1, 2006, at 4:34 PM, Dana H. Myers wrote: Chad Leigh -- Shire.Net LLC wrote: On Dec 1, 2006, at 9:50 AM, Al Hopper wrote: Followup: When you say you fixed the HW, I'm curious as to what you found and if this experience with ZFS convinced you that your trusted RAID H/W did, in fact, have issues? Do you think that it's likely that there are others running production systems on RAID systems that they trust, but don't realize may have bugs (causing data corruption) that have yet to be discovered? And this is different from any other storage system, how? (ie, JBOD controllers and disks can also have subtle bugs that corrupt data) Of course, but there isn't the expectation of data reliability with a JBOD that there is with some RAID configurations. There is not? People buy disk drives and expect them to corrupt their data? I expect the drives I buy to work fine (knowing that there could be bugs etc in them, the same as with my RAID systems). Yes, but in either case, ZFS will tell you. And then kill your whole pool :-) Other filesystems in general cannot. While other file systems, when they become corrupt, allow you to salvage data :-) Chad
Re: [zfs-discuss] poor NFS/ZFS performance
On Nov 22, 2006, at 4:11 PM, Al Hopper wrote: No problem there! ZFS rocks. NFS/ZFS is a bad combination. Has anyone tried sharing a ZFS fs using samba or afs or something else besides nfs? Do we have the same issues? Chad
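For anyone who wants to try the Samba route: nothing ZFS-specific is needed, you just point a share at the ZFS mountpoint. A minimal sketch for smb.conf (share name and path are made up):

[tank]
   path = /tank/export
   read only = no

Whether it dodges the performance problem is a separate question -- smbd does not, by default, fsync every write the way the NFS server must honour sync semantics, so I would expect rather different behaviour.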
Re: [zfs-discuss] poor NFS/ZFS performance
On Nov 21, 2006, at 1:36 PM, Joe Little wrote: On 11/21/06, Matthew B Sweeney - Sun Microsystems Inc. [EMAIL PROTECTED] wrote: Roch, Am I barking up the wrong tree? Or is ZFS over NFS not the right solution? I strongly believe it is.. We just are at odds as to some philosophy. Either we need NVRAM backed storage between NFS and ZFS, battery-backed memory that can survive other subsystem failure, or a change in the code path to allow some discretion here. Currently, the third option, 6280630, ZIL synchronicity, or as I reference it, sync_deferred functionality. A combination is best, but the sooner this arrives, the better for anyone who needs a general purpose file server / NAS that compares anywhere near to the competition. I had heard that some stuff in the latest OS and coming in Sol10 U3 should greatly help in NFS/ZFS performance. Something to do with ZFS not syncing the entire pool on every sync but just the stuff needed, or something like that. I heard it kind of 2nd or 3rd hand so cannot be too detailed in my description. Can someone here in the know confirm that this is so (or not)? Thanks Chad
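For completeness: the blunt workaround people use today, while waiting on 6280630, is to disable the ZIL outright. That discards synchronous write guarantees for NFS clients (a crash can lose acknowledged writes), so treat it strictly as a test knob, not a production setting. Assuming the tunable as it exists in current Solaris 10 / Nevada bits:

# echo 'set zfs:zil_disable = 1' >> /etc/system    (takes effect on reboot)
# echo zil_disable/D | mdb -k                      (check the live value)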
Re: [zfs-discuss] Re: ZFS Performance Question
On Oct 31, 2006, at 11:09 AM, Jay Grogan wrote: Thanks Robert, I was hoping something like that had turned up. A lot of what I will need to use ZFS for will be sequential writes at this time. I don't know what it is worth, but I was using iozone (http://www.iozone.org/) on my ZFS on top of Areca RAID volumes, as well as on UFS on a similar volume, and it showed, for many sorts of things, better performance under ZFS. I am not an expert on file systems and disk performance so I cannot say that there are not faults in its methodology, but it is interesting to run and look at. Chad
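If anyone wants to reproduce the comparison, iozone's automatic mode is an easy starting point. A sketch -- the pool path and sizes are placeholders; pick a file size well above RAM so the ARC can't cache the whole working set:

# iozone -a -e -s 4g -r 128k -f /tank/test/iozone.tmp

-a runs the full test matrix, -e includes fsync/fflush time in the timings, -s and -r pin the file and record sizes, and -f says where to put the scratch file.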
Re: [zfs-discuss] ZFS Performance Question
On Oct 30, 2006, at 10:45 PM, David Dyer-Bennet wrote: Also, stacking it on top of an existing RAID setup is kinda missing the entire point! Everyone keeps saying this, but I don't think it is missing the point at all. Checksumming and all the other goodies still work fine and you can run a ZFS mirror across 2 or more raid devices for the ultimate in reliability. My Dual RAID-6 with large ECC battery backed cache device mirrors will be much more reliable than your RAID-Z and probably perform better, and I still get the ZFS goodness. I can lose one whole RAID device (all the disks) and up to 2 of the disks on the second RAID device, all at the same time, and still be OK and fully recoverable and still operating. (ok, my second raid is not yet installed, so right now my ZFS'ed single RAID-6 is not as reliable as I would like, but the second half, ie, second RAID-6, will be installed before XMas) Chad
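For reference, the layout Chad describes is just a two-way ZFS mirror whose sides are hardware RAID-6 LUNs instead of bare disks. A sketch with invented device names, where each c?t0d0 is one RAID-6 LUN:

# zpool create tank mirror c2t0d0 c3t0d0
# zpool status tank

ZFS still checksums every block, so when one controller returns bad data it can read the good copy from the other LUN and repair the bad side.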
Re: [zfs-discuss] What is touching my filesystems?
On 10/17/06, Niclas Sodergard [EMAIL PROTECTED] wrote: Hi everyone, I have a very strange problem. I've written a simple script that uses zfs send/recv to send a filesystem between two hosts using ssh. Works like a charm - most of the time. As you know we need two snapshots when we do an incremental send. But the problem is something is touching my filesystems on the receiving side so they are no longer identical. Do you have atime updates on the recv side turned off? If you want to do incrementals, and you also want to be able to look at the data on the receive side, you'll need to do so. Chad Mynhier
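A sketch of that suggestion (dataset name invented). atime=off stops mere reads from dirtying the received filesystem; readonly=on goes further and blocks all local modification, so the incremental recv should then apply without needing a forced rollback:

# zfs set atime=off tank/received
# zfs set readonly=on tank/received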
Re: [zfs-discuss] A versioning FS
as it never once came up as an issue with VMS usability. Also, a big difference between Snapshots and FV tends to be who controls EOL-ing a version/Snapshot. Snapshots tend to be done by the Admin, and their aging strictly controlled and defined (e.g. we keep hourly snapshots for 1 week). File versioning is typically under the control of the End-User, as their utility is much more nebulously defined. Certainly, there is no ability to truncate based on number of versions (e.g. we only allow 100 versions to be kept), since the frequency of versioning a file varies widely. Aging on a version is possibly a better answer, but this runs into a problem of user education, where we have to retrain our users to stop making frequent copies of important documents (like they do now, in absence of FV), but _do_ remember to dig through the FV archive periodically to save a desirable old copy. Also, if managing FV is to be a User task, how are they to do it over NFS/SAMBA? And "log into the NFS server to do a cleanup" isn't an acceptable answer. Also, FV is only useful for apps which do a close() on a file (or at least, I'm assuming we wait for a file to signal that it is closed before taking a version - otherwise, we do what? take a version every X minutes while the file is still open? I shudder to think about the implementation of this, and its implications...). How many apps keep a file open for a long period of time? FV isn't useful to them, only an unlimited undo functionality INSIDE the app. Yes, any time you do a close() or equivalent. The idea is not to implement a universal undo stack. You can always find a scenario where FV doesn't help. So what. There are lots of scenarios where it does help. More positive scenarios than you can dream up negatives for. Lastly, consider the additional storage requirement of FV, and exactly how much utility you gain for sacrificing disk space. We have GB and TB of cheap space. A few extra versions lying around until people hit their quotas is the users' issue, not the sysadmin's. Look at this scenario: I'm editing a file, making 1MB of change per 5 minutes (a likely scenario when actively editing any Office-style document), of which only 50% do I actually make permanent (the rest being temp edits for ideas I decide to change or throw out). If I'm auto-saving every 5 minutes, that means I use 12MB of version space per hour. If I took an hourly snapshot, then I need only 6MB of storage. So. Your snapshot is much less useful, and 12MB is nothing in today's GBs of cheap space. Probably compressed too, so even less usage than you envision. The situation gets worse, for the primary usefulness of FV is for files which are frequently edited - meaning that they have rapid content change, and not in append-mode. Such a usage pattern means that FV will take up a much greater amount of space than periodic snapshots, as the longer interval in snapshots will allow the changes to settle. Not an issue. Cheap disk space. To me, FV is/was very useful in TOPS-20 and VMS, where you were looking at a system DESIGNED with the idea in mind, already have a user base trained to use and expect it, and virtually all usage was local (i.e. no network filesharing). None of this is true in the UNIX/POSIX world. And that does not affect its usefulness.
Chad
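Incidentally, the admin-controlled snapshot schedule Erik describes takes one crontab line. A rough sketch (dataset name invented; this keeps a rolling 24 hours by reusing one slot per hour, and the % signs must be escaped inside a crontab entry):

0 * * * * /usr/sbin/zfs destroy tank/home@hourly.`date +\%H` 2>/dev/null; /usr/sbin/zfs snapshot tank/home@hourly.`date +\%H`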
Re: [zfs-discuss] A versioning FS
On Oct 6, 2006, at 3:53 PM, Nicolas Williams wrote: On Fri, Oct 06, 2006 at 03:30:20PM -0600, Chad Leigh -- Shire.Net LLC wrote: On Oct 6, 2006, at 3:08 PM, Erik Trimble wrote: OK. So, now we're on to FV. As Nico pointed out, FV is going to need a new API. Using the VMS convention of simply creating file names with a version string afterwards is unacceptable, as it creates enormous directory pollution, Assumption, not supported. Eye of the beholder. No, you really need an API, otherwise you have to guess when to snapshot versions of files. What does snapshot versions of files mean? My line "Assumption, not supported. Eye of the beholder" was in reference to "enormous directory pollution, not to mention user confusion". Assumption, not supported. Maybe Erik would find it confusing. I know I would find it _annoying_. Then leave it set to 1 version. So, FV has to be invisible to non-aware programs. yes Interesting that you agree with this when you disagree with Erik's other points! To me this statement implies FV APIs. It has to do with the implementation details. I don't know what sort of APIs you are saying are needed. Maybe they are needed and maybe they would be handy. I am not disputing that. The above should be simple to do however -- a program does an open of a file named foo.bar. ZFS / the file system routine would use the most recent version by default if no version info is given. Now we have a problem: how do we access FV for non-local (e.g. SAMBA/NFS) clients? Since the VAST majority of usefulness of FV is in the network file server arena, Assumption, and definitely not supported. It is very useful outside of the file sharing arena. I agree with you, and I agree with Erik. We, Sun engineers that is, need to look at the big picture, and network access is part of the big picture. Sure, unless we can use FV over the network, it is useless. Wrong. Yes, but we have to provide for it. I never said that file sharing is not useful (in this or any context). I just said that FV is not useless except in the over-the-network use. And if it did not support filesharing scenarios, at least in the beginning, it still has great use. The same way that apache not supporting lockfiles on nfs file systems does not make apache or nfs useless, FV that is not 100% in every nook and cranny does not make it useless. I would find it of tremendous use just in managing system and configuration files. You can't modify the SMB or NFS protocol (easily or quickly) to add FV functionality (look how hard it was to add ACLs to these protocols). About the only way I can think around this problem is to store versions in a special subdir of each directory (e.g. .zfs_version), which would then be browsable over the network, using tools not normally FV-aware. But this puts us back into the problem of a directory which potentially has hundreds or thousands of files. This directory way of doing it is not a good way. It fails the ease-of-use test for the end user. No, it doesn't: it doesn't preclude having FV-aware UIs that make it easier to access versions. All Erik's .zfs_version proposal is about is remote access, not a user interface. one UI is the command line shell The VMS way is far superior. The problem is that you have to make sure that apps that are not FV aware have no problems, which means you cannot just append something to the actual file name. It has to be some sort of meta data. I.e., APIs.
Well, file system level meta data that the file system uses may or may not need APIs to expose it -- depends on how the final implementation works. However, I never came out against APIs. The big question though is: how to snapshot file versions when they are touched/created by applications that are not aware of FV? Don't use the word snapshot as it may draw in unintended comparisons to snapshot features. Certainly not with every write(2). no At fsync(2), close(2), open(2) for write/append? probably What if an application deals in multiple files? so? Etc... Automatically capturing file versions isn't possible in the general case with applications that aren't aware of FV. In most cases it is possible. At worst you make a copy on open and work on the copy, making it the most recent version. While this may indeed mean that you have all of your changes around, figuring out which version has them can be massively time-consuming. Your assumption. (And much less hard than using snapshots). I agree that with ZFS snapshots it could be hard to find the file versions you want. I don't agree that the same isn't true with FV *except* where you have FV-aware applications. How so? The shell / desktop is enough of a UI to deal with it. Yes, any time you do a close() or equivalent. The idea is not to implement a universal undo stack. Or open(2) for write, fsync(2)s
Re: [zfs-discuss] A versioning FS
On Oct 6, 2006, at 7:33 PM, Erik Trimble wrote: This is what Nico and I are talking about: if you turn on file versioning automatically (even for just a directory, and not a whole filesystem), the number of files being created explodes geometrically. But it doesn't. Unless you are editing geometrically more files. Chad
Re: [zfs-discuss] A versioning FS
FV, your editor must issue periodic close() and open() commands on the same file, as you edit, all without your intervention. No, you get the benefits of FV, just across editing sessions and not internal to an editing session. Exactly how many editors do this? I have no idea. So, the only way to enable FV is to require the user to periodically push the Save button. Which is how much different from the current situation? I edit a file. I realize I screwed up. I can go back to the previous version (or 2 ago or whatever). I cannot do that in the current situation. Chad
Re: [zfs-discuss] A versioning FS
On Oct 6, 2006, at 10:18 PM, Richard Elling - PAE wrote: Erik Trimble wrote: The problem is we are comparing apples to oranges in user bases here. TOPS-20 systems had a couple of dozen users (or, at most, a few hundred). VMS only slightly more. UNIX/POSIX systems have 10s of thousands. IIRC, I had about a dozen files under VMS, not counting versions. You mean in your system? There was a lot more than that... Plus, the number of files being created under typical modern systems is at least two (and probably three or four) orders of magnitude greater. I've got 100,000 files under /usr in Solaris, and almost 1,000 under my home directory. wimp :-) I count 88,148 in my main home directory. I'll bet just running gnome and firefox will get you in the ballpark of 1,000 :-/ None (well, maybe 1 or 2) of which you edit and hence would not generate versions. Chad
Re: [zfs-discuss] A versioning FS
On Oct 5, 2006, at 7:47 PM, Chad Leigh -- Shire.Net LLC wrote: I find the unix conventions of storying a file and file~ or any of the other myriad billion ways of doing it that each app has invented to be much more unwieldy. sorry, storing a file, not storying
Re: [zfs-discuss] A versioning FS
On Oct 5, 2006, at 6:48 PM, Frank Cusack wrote: On October 5, 2006 5:25:17 PM -0700 David Dyer-Bennet [EMAIL PROTECTED] wrote: Well, unless you have a better VCS than CVS or SVN. I first met this as an obscure, buggy, expensive, short-lived SUN product, actually; I believe it was called NSE, the Network Software Engineering environment. And I used one commercial product (written by an NSE user after NSE was discontinued) that supported the feature needed. Both of these had what I might call a two-level VCS. Each developer had one or more private repositories (the way people have working directories now with SVN), but you had full VCS checkin/checkout (and compare and rollback and so forth) within that. Then, when your code was ready for the repository, you did a commit step that pushed it up from your private repository to the public repository. I wouldn't call that 2-level, it's simply branching, and all VCS/SCM systems have this, even rcs. Some expose all changes in the private branch to everyone (modulo protection mechanisms), some only expose changes that are put back (to use Sun teamware terminology). Both CVS and SVN have this. -frank David is describing a different behavior. Even a branch is still ultimately on the single, master server with CVS, SVN, and most other versioning systems. Teamware, and a few other versioning systems, let you have more arbitrary parent and child relationships. In Teamware, you can create a project gate, have a variety of people check code into this project gate, and do all of this without ever touching the parent gate. When the project is done, you then check in the changes to the project gate's parent. The gate parent may itself be a child of some other gate, making the above project gate a grand-child of some higher gate. You can also change a child's parent, so you could in fact skip the parent and go straight to the grandparent if you wish. For that matter, you can re-parent the parent to sync with the former child if you had some reason to do so. A Teamware putback really isn't a matter of exposure. Until you do a putback to the parent, the code is not physically (or even logically) present in the parent. Teamware's biggest drawbacks are a lack of change sets (like how Subversion tracks simultaneous, individual changes as a group) and that it only runs via file access (no network protocol; filesystem or NFS only.) Mercurial seems to be similar to Teamware in terms of parenting, but with network protocol support built in. Which is presumably why OpenSolaris will be using it. ckl
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 26, 2006, at 12:26 PM, Chad Leigh -- Shire.Net LLC wrote: On Sep 26, 2006, at 12:24 PM, Mike Kupfer wrote: Chad == Chad Leigh -- Shire.Net LLC [EMAIL PROTECTED] writes: Chad snoop does not show me the reply packets going back. What do I Chad need to do to go both ways? It's possible that performance issues are causing snoop to miss the replies. If your server has multiple network interfaces, it's more likely that the server is routing the replies back on a different interface. We've run into that problem many times with the NFS server that has my home directory on it. If that is what's going on, you need to fire up multiple instances of snoop, one per interface. OK, I will try that. I did run tcpdump on the BSD client as well so the responses should show up there as well, as it only has the 1 interface on that net while the Solaris box has 3. That got me thinking. Since I had 3 dedicated ports to use for nfs, I changed it so each is on its own network (192.168.2, .3, .4) so there is no port switcheroo on incoming and outgoing packets. I also upgraded the FreeBSD to catch any bge updates and patches (there were some I think, but I am not sure they had anything to do with my issue). Anyway, after doing both of these my issue seems to have gone away... I am still testing / watching but I have not seen or experienced the issue in a day. I am not sure which one fixed my problem but it seems to have gone away. Thanks Chad
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 26, 2006, at 12:24 PM, Mike Kupfer wrote: Chad == Chad Leigh -- Shire.Net LLC [EMAIL PROTECTED] writes: Chad snoop does not show me the reply packets going back. What do I Chad need to do to go both ways? It's possible that performance issues are causing snoop to miss the replies. If your server has multiple network interfaces, it's more likely that the server is routing the replies back on a different interface. We've run into that problem many times with the NFS server that has my home directory on it. If that is what's going on, you need to fire up multiple instances of snoop, one per interface. OK, I will try that. I did run tcpdump on the BSD client as well so the responses should show up there as well, as it only has the 1 interface on that net while the Solaris box has 3. Thanks Chad
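Spelled out, Mike's suggestion is one capture per interface, so the replies show up no matter which interface the server routes them out of. A sketch using the interface and host names from this thread:

# snoop -d e1000g0 -o /var/tmp/e1000g0.cap freebsd-internal &
# snoop -d bge0 -o /var/tmp/bge0.cap freebsd-internal &
# snoop -d bge1 -o /var/tmp/bge1.cap freebsd-internal &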
[zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
I have set up a Solaris 10 U2 06/06 system that has basic patches to the latest -19 kernel patch and latest zfs genesis etc as recommended. I have set up a basic pool (local) and a bunch of sub-pools (local/mail, local/mail/shire.net, local/mail/shire.net/o, local/jailextras/shire.net/irsfl, etc). I am exporting these with [EMAIL PROTECTED],[EMAIL PROTECTED] and then mounting a few of these pools on a FreeBSD system using nfsv3. The FreeBSD has about 4 of my 10 or so subpools mounted. 2 are email imap account tests, 1 is generic storage, and one is a FreeBSD jail root. FreeBSD mounts them using TCP: /sbin/mount_nfs -s -i -3 -T foo-i1:/local/mail/shire.net/o/obar /local/2/hobbiton/local/mail/shire.net/o/obar The systems are both directly connected to a gigabit switch using 1000btx-fdx and both have an MTU set at 9000. The Solaris side is an e1000g port (the system has 2 bge and 2 e1000g ports all configured) and the FreeBSD is a bge port. I have heard that there are some ZFS/NFS sync performance problems etc that will be fixed in U3 or are fixed in OpenSolaris. I do not think my issue is related to that. I have also seen some of that with sometimes having piss-poor performance on writing. I have experienced the following issue several times since I started experimenting with this a few days ago. I periodically will get NFS server not responding errors on the FreeBSD machine for one of the mounted pools, and it will last 4-8 minutes or so and then come alive again and be fine for many hours. When this happens, access to the other mounted pools still works fine and, logged directly in to the Solaris machine, I am able to access the file systems (pools) just fine. Example error messages:
Sep 24 03:09:44 freebsdclient kernel: nfs server solzfs-i1:/local/jailextras/shire.net/irsfl: not responding
Sep 24 03:10:15 freebsdclient kernel: nfs server solzfs-i1:/local/jailextras/shire.net/irsfl: not responding
Sep 24 03:12:19 freebsdclient last message repeated 4 times
Sep 24 03:14:54 freebsdclient last message repeated 5 times
I would be interested in getting feedback on what might be the problem and also ways to track this down etc. Is this a known issue? Have others seen the nfs server sharing ZFS time out (but not for all pools)? Etc. Is there any functional difference with setting up the ZFS pools as legacy mounts and using a traditional share command to share them over nfs? I am mostly a Solaris noob and am happy to learn and can try anything people want me to test. Thanks in advance for any comments or help. thanks Chad
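On the legacy-mount question: both paths end up at the same NFS server code; the difference is just who manages the share at boot. A sketch with one of the dataset names above:

# zfs set sharenfs=on local/mail     (the property value can also carry share_nfs options, e.g. 'rw=...,root=...')

versus the legacy route:

# zfs set mountpoint=legacy local/mail
# mount -F zfs local/mail /local/mail              (plus a vfstab entry to make it stick)
# share -F nfs -o rw /local/mail                   (plus a dfstab entry)

So I would not expect a functional difference in the NFS behaviour itself.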
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 25, 2006, at 12:18 PM, eric kustarz wrote: Chad Leigh wrote: I have set up a Solaris 10 U2 06/06 system that has basic patches to the latest -19 kernel patch and latest zfs genesis etc as recommended. I have set up a basic pool (local) and a bunch of sub-pools (local/mail, local/mail/shire.net, local/mail/shire.net/o, local/jailextras/shire.net/irsfl, etc). I am exporting these with [EMAIL PROTECTED],[EMAIL PROTECTED] and then mounting a few of these pools on a FreeBSD system using nfsv3. The FreeBSD has about 4 of my 10 or so subpools mounted. 2 are email imap account tests, 1 is generic storage, and one is a FreeBSD jail root. FreeBSD mounts them using TCP: /sbin/mount_nfs -s -i -3 -T foo-i1:/local/mail/shire.net/o/obar /local/2/hobbiton/local/mail/shire.net/o/obar The systems are both directly connected to a gigabit switch using 1000btx-fdx and both have an MTU set at 9000. The Solaris side is an e1000g port (the system has 2 bge and 2 e1000g ports all configured) and the FreeBSD is a bge port. I have heard that there are some ZFS/NFS sync performance problems etc that will be fixed in U3 or are fixed in OpenSolaris. I do not think my issue is related to that. I have also seen some of that with sometimes having piss-poor performance on writing. I have experienced the following issue several times since I started experimenting with this a few days ago. I periodically will get NFS server not responding errors on the FreeBSD machine for one of the mounted pools, and it will last 4-8 minutes or so and then come alive again and be fine for many hours. When this happens, access to the other mounted pools still works fine and, logged directly in to the Solaris machine, I am able to access the file systems (pools) just fine. Example error messages:
Sep 24 03:09:44 freebsdclient kernel: nfs server solzfs-i1:/local/jailextras/shire.net/irsfl: not responding
Sep 24 03:10:15 freebsdclient kernel: nfs server solzfs-i1:/local/jailextras/shire.net/irsfl: not responding
Sep 24 03:12:19 freebsdclient last message repeated 4 times
Sep 24 03:14:54 freebsdclient last message repeated 5 times
I would be interested in getting feedback on what might be the problem and also ways to track this down etc. Is this a known issue? Have others seen the nfs server sharing ZFS time out (but not for all pools)? Etc. Could be lots of things - network partition, bad hardware, overloaded server, bad routers, etc. What's the server's load like (vmstat, prstat)? If you're banging on the server too hard and using up the server's resources then nfsd may not be able to respond to your client's requests. The server is not doing anything except this ZFS / NFS serving and only 1 client is attached to it (the one with the problems). prstat shows a load of 0.00 continually and vmstat is typically like:
# vmstat
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re mf pi po fr de sr s1 s2 -- --   in  sy  cs us sy id
 0 0 0 10640580 691412 0 1  0  0  0  0  2  0 11  0  0  421  85 120  0  0 100
#
You can also grab a snoop trace to see what packets are not being responded to? If I can catch it happening. Most of the time I am not around and I just see it in the logs. Sometimes it happens when I do a df -h on the client, for example. What are clients and local apps doing to the machine? Almost nothing. No local apps are running on the server. It is basically just doing ZFS and NFS. The client has 4 mounts from ZFS, all of them very low usage. 2 email accounts storage (imap maildir) are mounted for testing. Each receives 10-100 messages a day.
1 extra storage space is mounted and once a day rsync copies 2 files to it in the middle of the night -- one around 70mb and one 7mb. The other is being used as the root for a FreeBSD jail which is not being used for anything. Just proof of concept. No processes are running in the jail that are doing much of anything to the NFS mounted file system -- occasional log writes. What is your server hardware (# processors, memory) - is it underprovisioned for what you're doing to it? Tyan 2892 MB with a single dual core Opteron at 2.0 GHZ. 2GB memory. Single Areca 1130 raid card with 1gb RAM cache. Works very well with ZFS without the NFS component. (Has a 9 disk RAID 6 array on it). I have done lots of testing with this card and Solaris with and without ZFS and it has held up very well without any sort of IO issues. (Except the fact that it does not get a flush when the system powers down with init 5). The ZFS pools are currently on this single disk (to be augmented later this year when more funding comes through to buy more stuff). A dual port e1000g intel server card over PCIe is the Solaris side of the network. How is the freeBSD NFS client code - robust? I have
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 25, 2006, at 1:15 PM, Mike Kupfer wrote: Chad == Chad Leigh -- Shire.Net LLC [EMAIL PROTECTED] writes: Chad On Sep 25, 2006, at 12:18 PM, eric kustarz wrote: You can also grab a snoop trace to see what packets are not being responded to? Chad If I can catch it happening. Most of the time I am not around and Chad I just see it in the logs. I've attached a hack script that runs snoop in the background and rotates the capture files. If you start it as (for example) bgsnoop client server it will save the last 6 hours of capture files between the two hosts. If you notice a problem in the logs, you can find the corresponding capture file and extract from it what you need. Hi Mike Thanks. I set this up like so ./bgsnoop.sh -d e1000g0 freebsd-internal since my nfs is not going out the default interface. Soon thereafter I caught the problem. In looking at the snoop.trace file I am not sure what to look for. There seems to be no packet headers or time stamps or anything -- just a lot of binary data. What am I looking for? Thanks Chad
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 25, 2006, at 2:49 PM, Mike Kupfer wrote: Chad == Chad Leigh -- Shire.Net LLC [EMAIL PROTECTED] writes: Chad There seems to be no packet headers or time stamps or anything -- Chad just a lot of binary data. What am I looking for? Use snoop -i capture_file to decode the capture file. OK, a little snoop help is required. I ran bgsnoop as follows:
# ./bgsnoop.sh -t a -r -d e1000g0
According to the snoop man page:
-t [ r | a | d ]  Time-stamp presentation. Time-stamps are accurate to within 4 microseconds. The default is for times to be presented in d (delta) format (the time since receiving the previous packet). Option a (absolute) gives wall-clock time. Option r (relative) gives time relative to the first packet displayed. This can be used with the -p option to display time relative to any selected packet.
so -t a should show wall clock time. But my feed looks like the following and I don't see any wall clock time stamps. I need to be able to get some sort of wall-time stamp on this so that I can know where to look in my snoop dump for offending issues...
1 0.0 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=50E5 (read,lookup,modify,extend,delete,execute)
2 0.00045 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=339B (read,lookup,modify,extend,delete,execute)
3 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M400972P15189_courierlock.freebsd.shire.net
4 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M400972P15189_courierlock.freebsd.shire.net
5 0.00026 freebsd-internal.shire.net - bagend-i1 NFS C CREATE3 FH=339B (UNCHECKED) 1159219290.M400972P15189_courierlock.freebsd.shire.net
6 0.00045 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=878C (read,lookup,modify,extend,delete,execute)
7 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=50E5 tmp
8 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M400972P15189_courierlock.freebsd.shire.net
9 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=878C (read,lookup,modify,extend,delete,execute)
10 0.00026 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=878C (read,lookup,modify,extend,delete,execute)
11 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C WRITE3 FH=878C at 0 for 24 (ASYNC)
12 0.00026 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=878C (read,lookup,modify,extend,delete,execute)
13 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B courier.lock
14 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C COMMIT3 FH=878C at 0 for 24
15 0.00032 freebsd-internal.shire.net - bagend-i1 NFS C LINK3 FH=878C to FH=339B courier.lock
16 0.00026 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M400972P15189_courierlock.freebsd.shire.net
17 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C REMOVE3 FH=339B 1159219290.M400972P15189_courierlock.freebsd.shire.net
18 0.00032 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=339B (read,lookup,modify,extend,delete,execute)
19 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C FSSTAT3 FH=50E5
20 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C READDIR3 FH=339B Cookie=0 for 8192
21 0.00026 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B courier.lock
22 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M405999P15189_imapuid_164.freebsd.shire.net
23 0.00026 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M405999P15189_imapuid_164.freebsd.shire.net
24 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C CREATE3 FH=339B (UNCHECKED) 1159219290.M405999P15189_imapuid_164.freebsd.shire.net
25 0.00032 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=868C (read,lookup,modify,extend,delete,execute)
26 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C LOOKUP3 FH=339B 1159219290.M405999P15189_imapuid_164.freebsd.shire.net
27 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=EE81 (read,lookup,modify,extend,delete,execute)
28 0.00013 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=EE81 (read,lookup,modify,extend,delete,execute)
29 0.05840 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH=868C (read,lookup,modify,extend,delete,execute)
30 0.00019 freebsd-internal.shire.net - bagend-i1 NFS C ACCESS3 FH
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 25, 2006, at 3:54 PM, Mike Kupfer wrote: Chad == Chad Leigh -- Shire.Net LLC [EMAIL PROTECTED] writes: Chad so -t a should show wall clock time The capture file always records absolute time. So you (just) need to use -t a when you decode the capture file. Sorry for not making that clear earlier. OK, thanks. Sorry for being such a noob with snoop. I guess it is kind of obvious now that you would put that on the snoop that reads the file and outputs the human readable one, and not the one that saves things away... This appears to be the only stuff having to do with the hanging server (lots of other stuff that is with other zfs pools that are served over nfs):
68 15:29:27.53298 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC
72 15:29:28.54294 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
73 15:29:29.54312 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
74 15:29:31.54356 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
75 15:29:35.54443 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
76 15:29:43.54610 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
5890 15:29:59.55835 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC
5993 15:30:31.56506 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
6124 15:31:35.58971 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
6346 15:32:44.23048 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC
6347 15:32:44.23585 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC (retransmit)
6755 15:34:40.56138 freebsd-internal.shire.net - solaris-zfs-i1 NFS C FSSTAT3 FH=84EC
It comes alive again right about packet 6347 (15:32:44.23585) based on matching log entries and this snoop. snoop does not show me the reply packets going back. What do I need to do to go both ways? Thanks Chad
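For the archive: the answer to the "go both ways" question is the per-interface capture Mike suggested earlier in the thread, after which the reply direction can be pulled out of each file at decode time using snoop's from/to qualifiers. A sketch (capture file names assumed from the bgsnoop setup):

# snoop -i /var/tmp/e1000g0.cap -t a from solaris-zfs-i1     (server replies only)
# snoop -i /var/tmp/bge0.cap -t a solaris-zfs-i1             (both directions seen on that interface)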
Re: [zfs-discuss] Access to ZFS checksums would be nice and very useful feature
On Sep 14, 2006, at 1:32 PM, Henk Langeveld wrote: Bady, Brant RBCM:EX wrote: Part of the archiving process is to generate checksums (I happen to use MD5), and store them with other metadata about the digital object in order to verify data integrity and demonstrate the authenticity of the digital object over time. Wouldn't it be helpful if there was a utility to access/read the checksum data created by ZFS, and use it for those same purposes? Doesn't ZFS use block-level checksums? Hoping to see something like that in a future release, or a command line utility that could do the same. It might be possible to add a user set property to a file with the md5sum and a timestamp when it was computed. But what would this protect against? If you need to avoid tampering, you need the checksums offline anyway - cf. tripwire. Cheers, Henk Better still would be the forthcoming cryptographic extensions in some kind of digital-signature mode. ckl
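For the offline half of this, Solaris already ships a small utility, so no ZFS support is needed to generate MD5 sums and store them with the archive metadata (the path here is invented):

# digest -v -a md5 /archive/objects/map0001.tif

digest -l lists the other available algorithms (sha1 etc.).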
Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data
On Sep 12, 2006, at 4:39 PM, Celso wrote: On 12/09/06, Celso [EMAIL PROTECTED] wrote: I think it has already been said that in many peoples experience, when a disk fails, it completely fails. Especially on laptops. Of course ditto blocks wouldn't help you in this situation either! Exactly. I still think that silent data corruption is a valid concern, one that ditto blocks would solve. Also, I am not thrilled about losing that much space for duplication of unneccessary data (caused by partitioning a disk in two). Well, you'd only be duplicating the data on the mirror. If you don't want to mirror the base OS, no one's saying you have to. Yikes! that sounds like even more partitioning! The redundancy you're talking about is what you'd get from 'cp /foo/bar.jpg /foo/bar.jpg.ok', except it's hidden from the user and causing headaches for anyone trying to comprehend, port or extend the codebase in the future. the proposed solution differs in one important aspect: it automatically detects data corruption. Detecting data corruption is a function of the ZFS checksumming feature. The proposed solution has _nothing_ to do with detecting corruption. The difference is in what happens when/if such bad data is detected. Without a duplicate copy, via some RAID level or the proposed ditto block copies, the file is corrupted.
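This proposal did land as a per-dataset property, so assuming that final form, usage is just (dataset name invented):

# zfs set copies=2 tank/home

New writes then carry two ditto copies spread across the device, so a latent bad sector can be repaired even on a single-disk laptop pool -- while whole-disk failure, as noted above, still loses everything.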
Re: [zfs-discuss] Enabling compression/encryption on a populated filesystem
On 7/18/06, Brian Hechinger [EMAIL PROTECTED] wrote: On Tue, Jul 18, 2006 at 09:46:44AM -0400, Chad Mynhier wrote: On 7/18/06, Brian Hechinger [EMAIL PROTECTED] wrote: Being able to remove devices from a pool would be a good thing. I can't personally think of any reason that I would ever do it, but a friend of mine keeps asking me why it can't do it and that it should be able to. -brian This situation is implicitly included in what Jeff said, but live data migration is a good example of where this would come in handy. Size upgrades you can do in place, and even migrating to a new shelf you can do in place as well (replace individual disk in old shelf with individual disk in new shelf). The only place that removing disks from the pool would be useful in this scenario would be if the new array had a fewer number of larger disks. There are conceivable situations in which you're not able to do a simple one-to-one device replacement. One case is the one you give, where you have an array with fewer, larger disks. But it's also feasible that the zpool structure you want to use on the new storage doesn't match what you're doing on your current storage. (Although I guess the fewer, larger disk scenario is just a special case of the situation in which the resultant zpool structure doesn't match the original.) Chad Mynhier
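The in-place migration mentioned above is just a series of replaces, one disk at a time (device names invented). Each replace resilvers onto the new disk and then detaches the old one:

# zpool replace tank c1t0d0 c2t0d0
# zpool status tank        (wait for the resilver to complete before doing the next disk)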
Re: [zfs-discuss] Enabling compression/encryption on a populated filesystem
On 7/13/06, Darren Reed [EMAIL PROTECTED] wrote: When ZFS compression is enabled, although the man page doesn't explicitly say this, my guess is that only new data that gets written out is compressed - in keeping with the COW policy. [ ... ] Hmmm, well, I suppose the same problem might apply to encrypting data too...so maybe what I need is a zfs command that will walk the filesystem's data tree, read in data and write it back out according to the current data policy. It seems this could be made a function of 'zfs scrub' -- instead of simply verifying the data, it could rewrite the data as it goes. This comes in handy in other situations. For example, with the current state of things, if you add disks to a pool that contains mostly static data, you don't get the benefit of the additional spindles when reading old data. Rewriting the data would gain you that benefit, plus it would avoid the new disks becoming the hot spot for all new writes (assuming the old disks were very full.) Theoretically this could also be useful in a live data migration situation, where you have both new and old storage connected to a server. But this assumes there would be some way to tell ZFS to treat a subset of disks as read-only. Chad Mynhier
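Until a rewriting scrub exists, the only user-level workaround is to rewrite the files yourself after flipping the property -- crude, not atomic, and it breaks hard links, but it demonstrates that only newly written blocks pick up the policy (dataset and file names invented):

# zfs set compression=on tank/data
# cp -p /tank/data/big.db /tank/data/big.db.new
# mv /tank/data/big.db.new /tank/data/big.db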
Re: [zfs-discuss] COW question
It uses extra space in the middle of the write, in order to hold the new data, but once the write is complete, the space occupied by the old version is now free for use. ckl On Jul 12, 2006, at 8:05 PM, Robert Chen wrote: I still could not understand why Copy on Write does not waste file system capacity. Robert Raymond Xiong wrote: Robert Chen wrote: My question is, ZFS uses COW (copy on write); does this mean it will double usage of capacity or waste the capacity? What does COW really do? Does a mirror also have COW? Please help me, thanks. Robert It doesn't. Page 11 of the following slides illustrates how COW works in ZFS: http://www.opensolaris.org/os/community/zfs/docs/zfs_last.pdf Blocks containing active data are never overwritten in place; instead, a new block is allocated, modified data is written to it, and then any metadata blocks referencing it are similarly read, reallocated, and written. To reduce the overhead of this process, multiple updates are grouped into transaction groups, and an intent log is used when synchronous write semantics are required. (from http://en.wikipedia.org/wiki/ZFS) In snapshot scenarios, COW consumes much less disk space and is much faster. Raymond
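A quick way to see this at the command line (dataset name invented): without a snapshot, the old blocks are freed as soon as the rewrite commits; only a snapshot keeps the old copy around, and that shows up in its USED column.

# zfs snapshot tank/fs@before
# cp /var/tmp/new.dat /tank/fs/data.dat            (overwrite an existing file)
# zfs list -o name,used,refer tank/fs tank/fs@before

Destroy the snapshot and the space held by the overwritten blocks is returned to the pool.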