[zfs-discuss] Mounting a ZFS clone

2007-01-15 Thread Albert Chin
I have no hands-on experience with ZFS but have a question. If the
file server running ZFS exports the ZFS file system via NFS to
clients, based on previous messages on this list, it is not possible
for an NFS client to mount this NFS-exported ZFS file system on
multiple directories on the NFS client.

So, let's say I create a ZFS clone of some ZFS file system. Is it
possible for an NFS client to mount the ZFS file system _and_ the
clone without problems?

If the clone's mountpoint is underneath the parent ZFS file system's
hierarchy, will NFS-mounting the parent file system give the NFS
client access to both the parent file system and the clone?
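
For concreteness, a minimal sketch of the scenario (pool, dataset, and
mount-point names are hypothetical):
  [on the file server]
  # zfs snapshot tank/data@snap1
  # zfs clone tank/data@snap1 tank/data-clone
  # zfs set sharenfs=on tank/data
  # zfs set sharenfs=on tank/data-clone
  [on the NFS client]
  # mount file-server:/tank/data /mnt/data
  # mount file-server:/tank/data-clone /mnt/data-clone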

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mounting a ZFS clone

2007-01-16 Thread Albert Chin
On Tue, Jan 16, 2007 at 01:28:04PM -0800, Eric Kustarz wrote:
 Albert Chin wrote:
 On Mon, Jan 15, 2007 at 10:55:23AM -0600, Albert Chin wrote:
 
 I have no hands-on experience with ZFS but have a question. If the
 file server running ZFS exports the ZFS file system via NFS to
 clients, based on previous messages on this list, it is not possible
 for an NFS client to mount this NFS-exported ZFS file system on
 multiple directories on the NFS client.
 
 
 At least, I thought I read this somewhere. Is the above possible? I
 don't see why it should not be.
 
 Yes, you can mount multiple *filesystems* via NFS.

And the fact that the file systems on the remote server are ZFS is
irrelevant?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Thumper Origins Q

2007-01-25 Thread Albert Chin
On Wed, Jan 24, 2007 at 10:19:29AM -0800, Frank Cusack wrote:
 On January 24, 2007 10:04:04 AM -0800 Bryan Cantrill [EMAIL PROTECTED] 
 wrote:
 
 On Wed, Jan 24, 2007 at 09:46:11AM -0800, Moazam Raja wrote:
 Well, he did say fairly cheap. the ST 3511 is about $18.5k. That's
 about the same price for the low-end NetApp FAS250 unit.
 
 Note that the 3511 is being replaced with the 6140:
 
 Which is MUCH nicer but also much pricier.  Also, no non-RAID option.

So there's no way to treat a 6140 as JBOD? If you wanted to use a 6140
with ZFS, and really wanted JBOD, your only choice would be a RAID 0
config on the 6140?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Thumper Origins Q

2007-01-25 Thread Albert Chin
On Thu, Jan 25, 2007 at 10:16:47AM -0500, Torrey McMahon wrote:
 Albert Chin wrote:
 On Wed, Jan 24, 2007 at 10:19:29AM -0800, Frank Cusack wrote:
   
 On January 24, 2007 10:04:04 AM -0800 Bryan Cantrill [EMAIL PROTECTED] 
 wrote:
 
 On Wed, Jan 24, 2007 at 09:46:11AM -0800, Moazam Raja wrote:
   
 Well, he did say fairly cheap. the ST 3511 is about $18.5k. That's
 about the same price for the low-end NetApp FAS250 unit.
 
 Note that the 3511 is being replaced with the 6140:
   
 Which is MUCH nicer but also much pricier.  Also, no non-RAID option.
 
 
 So there's no way to treat a 6140 as JBOD? If you wanted to use a 6140
 with ZFS, and really wanted JBOD, your only choice would be a RAID 0
 config on the 6140?
 
 Why would you want to treat a 6140 like a JBOD? (See the previous 
 threads about JBOD vs HW RAID...)

Well, a 6140 with RAID 10 is not an option because we don't want to
lose 50% of the disk capacity. So we're left with RAID 5. Yes, we
could layer ZFS on top of this. But what do you do if you want RAID 6?
The easiest way to get it is ZFS RAIDZ2 on top of a JBOD. The only
reason I'd consider HW RAID is if its performance were enough of a win
over ZFS SW RAID.
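
For example (placeholder device names), the two layouts are roughly:
  [ZFS on a single HW RAID-5 LUN -- no ZFS-level redundancy]
  # zpool create tank c2t0d0
  [RAID 6 equivalent: ZFS RAIDZ2 across JBOD (or single-disk RAID-0) LUNs]
  # zpool create tank raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0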

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Thumper Origins Q

2007-01-25 Thread Albert Chin
On Thu, Jan 25, 2007 at 02:24:47PM -0600, Al Hopper wrote:
 On Thu, 25 Jan 2007, Bill Sommerfeld wrote:
 
  On Thu, 2007-01-25 at 10:16 -0500, Torrey McMahon wrote:
 
So there's no way to treat a 6140 as JBOD? If you wanted to use a 6140
with ZFS, and really wanted JBOD, your only choice would be a RAID 0
config on the 6140?
  
   Why would you want to treat a 6140 like a JBOD? (See the previous
   threads about JBOD vs HW RAID...)
 
  Let's turn this around.  Assume I want a FC JBOD.  What should I get?
 
 Many companies make FC expansion boxes to go along with their FC based
 hardware RAID arrays.  Often, the expansion chassis is identical to the
 RAID equipped chassis - same power supplies, same physical chassis and
 disk drive carriers - the only difference is that the slots used to house
 the (dual) RAID H/W controllers have been blanked off.  These expansion
 chassis are designed to be daisy chained back to the box with the H/W
 RAID.  So you simply use one of the expansion chassis and attach it
 directly to a system equipped with an FC HBA and ... you've got an FC
 JBOD.  Nearly all of them will support two FC connections to allow dual
 redundant connections to the FC RAID H/W.  So if you equip your ZFS host
 with either a dual-port FC HBA or two single-port FC HBAs - you have a
 pretty good redundant FC JBOD solution.
 
 An example of such an expansion box is the DS4000 EXP100 from IBM.  It's
 also possible to purchase a 3510FC box from Sun with no RAID controllers -
 but their nearest equivalent of an empty box comes with 6 (overpriced)
 disk drives pre-installed. :(
 
 Perhaps you could use your vast influence at Sun to persuade them to sell
 an empty 3510FC box?  Or an empty box bundled with a single or dual-port
 FC card (Qlogic based please).  Well - there's no harm in making the
 suggestion ... right?

Well, when you buy disk for the Sun 5320 NAS Appliance, you get a
Controller Unit shelf and, if you expand storage, an Expansion Unit
shelf that connects to the Controller Unit. Maybe the Expansion Unit
shelf is a JBOD 6140?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS or UFS - what to do?

2007-01-29 Thread Albert Chin
On Mon, Jan 29, 2007 at 11:17:05AM -0800, Jeffery Malloch wrote:
 From what I can tell from this thread, ZFS is VERY fussy about
 managing writes, reads and failures.  It wants to be bit perfect.  So
 if you use the hardware that comes with a given solution (in my case
 an Engenio 6994) to manage failures you risk a) bad writes that
 don't get picked up due to corruption from write cache to disk b)
 failures due to data changes that ZFS is unaware of that the
 hardware imposes when it tries to fix itself.
 
 So now I have a $70K+ lump that's useless for what it was designed
 for.  I should have spent $20K on a JBOD.  But since I didn't do
 that, it sounds like a traditional model works best (ie. UFS et al)
 for the type of hardware I have.  No sense paying for something and
 not using it.  And by using ZFS just as a method for ease of file
 system growth and management I risk much more corruption.

Well, ZFS with HW RAID makes sense in some cases. However, if you are
unwilling to lose 50% of your disk space to RAID 10 or to two mirrored
HW RAID arrays, it seems you either use RAID 0 on the array with ZFS
RAIDZ/RAIDZ2 on top of that, or a JBOD with ZFS RAIDZ/RAIDZ2 on top of
it.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hot spares - in standby?

2007-01-30 Thread Albert Chin
On Mon, Jan 29, 2007 at 09:37:57PM -0500, David Magda wrote:
 On Jan 29, 2007, at 20:27, Toby Thain wrote:
 
 On 29-Jan-07, at 11:02 PM, Jason J. W. Williams wrote:
 
 I seem to remember the Massive Array of Independent Disk guys ran  
 into
 a problem I think they called static friction, where idle drives  
 would
 fail on spin up after being idle for a long time:
 
 You'd think that probably wouldn't happen to a spare drive that was  
 spun up from time to time. In fact this problem would be (mitigated  
 and/or) caught by the periodic health check I suggested.
 
 What about a rotating spare?
 
 When setting up a pool a lot of people would (say) balance things  
 around buses and controllers to minimize single  points of failure,  
 and a rotating spare could disrupt this organization, but would it be  
 useful at all?

Agami Systems has the concept of Enterprise Sparing, where the hot
spare is distributed amongst data drives in the array. When a failure
occurs, the rebuild occurs in parallel across _all_ drives in the
array:
  http://www.issidata.com/specs/agami/enterprise-classreliability.pdf
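
For comparison, ZFS's built-in hot sparing attaches whole-disk spares
to a pool rather than distributing spare capacity; a minimal sketch
with placeholder device names:
  # zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 spare c0t4d0
  # zpool add tank spare c0t5d0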

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Hypothetical question about multiple host access to 6140 array

2007-02-23 Thread Albert Chin
Hypothetical question.

Say you have one 6140 controller tray and one 6140 expansion tray. In
the beginning, these are connected to a 5220 or 5320 NAS appliance.
Then, say you get a Sun server, a X4100, and connect it to the 6140
controller tray (the 6140 supports multiple data hosts). Is it
possible to allocate some disks from the 6140 array to ZFS on the
X4100 for the purpose of migrating data from the appliance to ZFS?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] File level snapshots in ZFS?

2007-03-29 Thread Albert Chin
On Thu, Mar 29, 2007 at 11:52:56PM +0530, Atul Vidwansa wrote:
 Is it possible to take file level snapshots in ZFS? Suppose I want to
 keep a version of the file before writing new data to it, how do I do
 that? My goal would be to rollback the file to earlier version (i.e.
 discard the new changes) depending upon a policy.  I would like to
 keep only 1 version of a file at a time and while writing new data,
 earlier version will be discarded and current state of file (before
 writing) would be saved in the version.

Doubt it. Snapshots are essentially free and take no time, so you
might as well just snapshot the file system.
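
A minimal sketch of that approach (dataset name hypothetical): snapshot
before writing, then either copy the old version back out of the
snapshot or roll the whole file system back:
  # zfs snapshot tank/data@before
  [write the new data]
  # cp /tank/data/.zfs/snapshot/before/somefile /tank/data/somefile
  [or, to discard everything since the snapshot]
  # zfs rollback tank/data@before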

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-21 Thread Albert Chin
We're testing an X4100M2, 4GB RAM, with a 2-port 4Gb Fibre Channel
QLogic HBA connected to a 2Gb Fibre Channel 6140 array. The X4100M2 is
running OpenSolaris b63.

We have 8 drives in the Sun 6140 configured as individual RAID-0
arrays and have a ZFS RAID-Z2 array comprising 7 of the drives (for
testing, we're treating the 6140 as JBOD for now). The RAID-0 stripe
size is 128k. We're testing updates to the X4100M2 using rsync across
the network with ssh and using NFS:
  1. [copy 400MB of gcc-3.4.3 via rsync/NFS]
 # mount file-server:/opt/test /mnt
 # rsync -vaHR --delete --stats gcc343 /mnt
 ...
 sent 409516941 bytes  received 80590 bytes  5025736.58 bytes/sec
  2. [copy 400MB of gcc-3.4.3 via rsync/ssh]
 # rsync -vaHR -e 'ssh' --delete --stats gcc343 file-server:/opt/test
 ...
 sent 409516945 bytes  received 80590 bytes  9637589.06 bytes/sec

The network is 100Mbps. /etc/system on the file server is:
  set maxphys = 0x80
  set ssd:ssd_max_throttle = 64
  set zfs:zfs_nocacheflush = 1

Why can't the NFS performance match that of SSH?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-21 Thread Albert Chin
On Mon, May 21, 2007 at 02:55:18PM -0600, Robert Thurlow wrote:
 Albert Chin wrote:
 
 Why can't the NFS performance match that of SSH?
 
 One big reason is that the sending CPU has to do all the comparisons to
 compute the list of files to be sent - it has to fetch the attributes
 from both local and remote and compare timestamps.  With ssh, local
 processes at each end do lstat() calls in parallel and chatter about
 the timestamps, and the lstat() calls are much cheaper.  I would wonder
 how long the attr-chatter takes in your two cases before bulk data
 starts to be sent - deducting that should reduce the imbalance you're
 seeing.  If rsync were more multi-threaded and could manage multiple
 lstat() calls in parallel NFS would be closer.

Well, there is no data on the file server as this is an initial copy,
so there is very little for rsync to do. To compare the rsync
overhead, I conducted some more tests, using tar:
  1. [copy 400MB of gcc-3.4.3 via tar/NFS to ZFS file system]
 # mount file-server:/opt/test /mnt
 # time tar cf - gcc343 | (cd /mnt; tar xpf - )
 ...
 419721216 bytes in 1:08.65 = 6113928.86 bytes/sec
  2. [copy 400MB of gcc-3.4.3 via tar/ssh to ZFS file system]
 # time tar cf - gcc343 | ssh -oForwardX11=no file-server \
 'cd /opt/test; tar xpf -'
 ...
     419721216 bytes in 35.82s = 11717510.21 bytes/sec

  3. [copy 400MB of gcc-3.4.3 via tar/NFS to Fibre-attached file system]
 # mount file-server:/opt/fibre-disk /mnt
 # time tar cf - gcc343 | (cd /mnt; tar xpf - )
 ...
     419721216 bytes in 56.87s = 7380362.51 bytes/sec
  4. [copy 400MB of gcc-3.4.3 via tar/ssh to Fibre-attached file system]
 # time tar cf - gcc343 | ssh -oForwardX11=no file-server \
 'cd /opt/fibre-disk; tar xpf -'
 ...
     419721216 bytes in 35.89s = 11694656.34 bytes/sec

So, judging from #1 and #2, NFS performance can stand some
improvement. And I'd have thought that since #2/#4 were similar,
#1/#3 should be as well. Maybe some NFS/ZFS interaction would explain
the discrepancy.

I think the bigger problem is the NFS performance penalty so we'll go
lurk somewhere else to find out what the problem is.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-21 Thread Albert Chin
On Mon, May 21, 2007 at 06:11:36PM -0500, Nicolas Williams wrote:
 On Mon, May 21, 2007 at 06:09:46PM -0500, Albert Chin wrote:
  But still, how is tar/SSH any more multi-threaded than tar/NFS?
 
 It's not that it is, but that NFS sync semantics and ZFS sync
 semantics conspire against single-threaded performance.

That's why we have set zfs:zfs_nocacheflush = 1 in /etc/system. But
that only helps ZFS. Is there something similar for NFS?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-21 Thread Albert Chin
On Mon, May 21, 2007 at 04:55:35PM -0600, Robert Thurlow wrote:
 Albert Chin wrote:
 
 I think the bigger problem is the NFS performance penalty so we'll go
 lurk somewhere else to find out what the problem is.
 
 Is this with Solaris 10 or OpenSolaris on the client as well?

Client is RHEL 4/x86_64.

But, we just ran a concurrent tar/SSH across Solaris 10, HP-UX
11.23/PA, 11.23/IA, AIX 5.2, 5.3, RHEL 4/x86, 4/x86_64 and the average
was ~4562187 bytes/sec. But, the gcc343 copy on each of these machines
isn't the same size. It's certainly less than 400MBx7 though.

While performance on one system is fine, things degrade when you add
clients.

 I guess this goes back to some of the why is tar slow over NFS
 discussions we've had, some here and some on nfs-discuss.  A more
 multi-threaded workload would help; so will planned work to focus
 on performance of NFS and ZFS together, which can sometimes be
 slower than expected.

But still, how is tar/SSH any more multi-threaded than tar/NFS?

I've posted to nfs-discuss so maybe someone knows something.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Rsync update to ZFS server over SSH faster than over

2007-05-21 Thread Albert Chin
On Mon, May 21, 2007 at 08:26:37PM -0700, Paul Armstrong wrote:
 Given you're not using compression for rsync, the only thing I can
 think of would be that the stream compression of SSH is helping
 here.

SSH compresses by default? I thought you had to specify -oCompression
and/or -oCompressionLevel?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Rsync update to ZFS server over SSH faster than over NFS?

2007-05-22 Thread Albert Chin
On Mon, May 21, 2007 at 13:23:48 -0800, Marion Hakanson wrote:
Albert Chin wrote:
 Why can't the NFS performance match that of SSH? 

 My first guess is the NFS vs array cache-flush issue.  Have you
 configured the 6140 to ignore SYNCHRONIZE_CACHE requests?  That'll
 make a huge difference for NFS clients of ZFS file servers.

Doesn't setting zfs:zfs_nocacheflush=1 achieve the same result:
  http://blogs.digitar.com/jjww/?itemid=44

The 6140 has a non-volatile cache. Dunno if it's order-preserving
though.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-24 Thread Albert Chin
On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
 I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
 and in [b]/etc/system[/b] I put:
 
 [b]set zfs:zfs_nocacheflush = 1[/b]
 
 And after rebooting, I get the message:
 
 [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module[/b]
 
 So is this variable not available in the Solaris kernel? 

I think zfs:zfs_nocacheflush is only available in Nevada.

 I'm getting really poor write performance with ZFS on a RAID5 volume
 (5 disks) from a storagetek 6140 array. I've searched the web and
 these forums and it seems that this zfs_nocacheflush option is the
 solution, but I'm open to others as well.

What type of poor performance? Is it because of ZFS? You can test this
by creating a RAID-5 volume on the 6140, creating a UFS file system on
it, and then comparing performance with what you get against ZFS.

It would also be worthwhile doing something like the following to
determine the max throughput the H/W RAID is giving you:
  # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.
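
To get the UFS comparison point mentioned above, something like this
(hypothetical device name) would do:
  # newfs /dev/rdsk/c2t1d0s2
  # mount /dev/dsk/c2t1d0s2 /mnt
  # time dd if=/dev/zero of=/mnt/testfile bs=1048576 count=1000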

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Albert Chin
On Fri, May 25, 2007 at 12:14:45AM -0400, Torrey McMahon wrote:
 Albert Chin wrote:
 On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
   
 
 I'm getting really poor write performance with ZFS on a RAID5 volume
 (5 disks) from a storagetek 6140 array. I've searched the web and
 these forums and it seems that this zfs_nocacheflush option is the
 solution, but I'm open to others as well.
 
 
 What type of poor performance? Is it because of ZFS? You can test this
 by creating a RAID-5 volume on the 6140, creating a UFS file system on
 it, and then comparing performance with what you get against ZFS.
 
 If it's ZFS then you might want to check into modifying the 6540 NVRAM 
 as mentioned in this thread
 
 http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html
 
 there is a fix that doesn't involve modifying the NVRAM in the works. (I 
 don't have an estimate.)

The above URL helps only if you have Santricity.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Albert Chin
On Fri, May 25, 2007 at 12:01:45PM -0400, Andy Lubel wrote:
 Im using: 
  
   zfs set:zil_disable 1
 
 On my se6130 with zfs, accessed by NFS and writing performance almost
 doubled.  Since you have BBC, why not just set that?

I don't think a battery-backed cache is enough to justify zil_disable=1.
Besides, I don't know anyone from Sun recommending zil_disable=1. If
your storage array has a battery-backed cache, it doesn't matter: what
matters is what happens when the ZIL isn't flushed and your file server
crashes (the ZFS file system is still consistent, but you'll lose data
that clients believed was committed). Even having your file server on a
UPS won't help here.

http://blogs.sun.com/erickustarz/entry/zil_disable discusses some of
the issues affecting zil_disable=1.

We know we get better performance with zil_disable=1 but we're not
taking any chances.
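
For reference, the /etc/system spelling of the tunable Andy is
referring to (it takes effect after a reboot; for testing only, as
noted above) is:
  set zfs:zil_disable = 1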

 -Andy
 
 
 
 On 5/24/07 4:16 PM, Albert Chin
 [EMAIL PROTECTED] wrote:
 
  On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
  I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
  and in [b]/etc/system[/b] I put:
  
  [b]set zfs:zfs_nocacheflush = 1[/b]
  
  And after rebooting, I get the message:
  
  [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' 
  module[/b]
  
  So is this variable not available in the Solaris kernel?
  
  I think zfs:zfs_nocacheflush is only available in Nevada.
  
  I'm getting really poor write performance with ZFS on a RAID5 volume
  (5 disks) from a storagetek 6140 array. I've searched the web and
  these forums and it seems that this zfs_nocacheflush option is the
  solution, but I'm open to others as well.
  
  What type of poor performance? Is it because of ZFS? You can test this
  by creating a RAID-5 volume on the 6140, creating a UFS file system on
  it, and then comparing performance with what you get against ZFS.
  
  It would also be worthwhile doing something like the following to
  determine the max throughput the H/W RAID is giving you:
# time dd of=raw disk if=/dev/zero bs=1048576 count=1000
  For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
  single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
  stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip.
 -- 
 
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Albert Chin
On Fri, May 25, 2007 at 09:54:04AM -0700, Grant Kelly wrote:
  It would also be worthwhile doing something like the following to
  determine the max throughput the H/W RAID is giving you:
  # time dd of=<raw disk> if=/dev/zero bs=1048576 count=1000
  For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
  single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
  stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k stripe.
 
 Well the Solaris kernel is telling me that it doesn't understand
 zfs_nocacheflush, but the array sure is acting like it!
 I ran the dd example, but increased the count for a longer running time.

I don't think a longer running time is going to give you a more
accurate measurement.

 5-disk RAID5 with UFS: ~79 MB/s

What about against a raw RAID-5 device?

 5-disk RAID5 with ZFS: ~470 MB/s

I don't think you want to use if=/dev/zero on ZFS. There's probably
some optimization going on. Better to use /dev/urandom or to
concatenate several files of random bits.
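
For example (paths are hypothetical), generate the random data once so
the source isn't the bottleneck, then time the copy into ZFS:
  # dd if=/dev/urandom of=/var/tmp/rand.1g bs=1048576 count=1000
  # time dd if=/var/tmp/rand.1g of=/zfspool/testfile bs=1048576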

 I'm assuming there's some caching going on with ZFS that's really
 helping out?

Yes.

 Also, no Santricity, just Sun's Common Array Manager. Is it possible
 to use both without completely confusing the array?

I think both are ok. CAM is free. Dunno about Santricity.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS version 5 to version 6 fails to import or upgrade

2007-06-19 Thread Albert Chin
On Tue, Jun 19, 2007 at 07:16:06PM -0700, John Brewer wrote:
 bash-3.00# zpool import
   pool: zones
 id: 4567711835620380868
  state: ONLINE
 status: The pool is formatted using an older on-disk version.
 action: The pool can be imported using its name or numeric identifier, though
 some features will not be available without an explicit 'zpool 
 upgrade'.
 config:
 
 zones   ONLINE
   c0d1s5ONLINE

zpool import lists the pools available for import. Maybe you need to
actually _import_ the pool first before you can upgrade.
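
Something along these lines (untested sketch using the pool name from
the output above):
  # zpool import zones
  # zpool upgrade zones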

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-03 Thread Albert Chin
PSARC 2007/171 will be available in b68. Any documentation anywhere on
how to take advantage of it?

Some of the Sun storage arrays contain NVRAM. It would be really nice
if the array NVRAM were available for ZIL storage. It would also be
nice to have extra hardware (a PCI-X or PCIe card) that added NVRAM
storage to the various Sun low/mid-range servers that are currently
acting as ZFS/NFS servers. Or maybe someone knows of cheap SSD storage
that could be used for the ZIL? I think several such drives are
available with SCSI/ATA interfaces.
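
For reference, my understanding is that the separate intent log shows
up as a "log" vdev, so (with placeholder device names) usage would
look something like:
  # zpool create tank mirror c0t0d0 c0t1d0 log c2t0d0
  # zpool add tank log c2t1d0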

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-03 Thread Albert Chin
On Tue, Jul 03, 2007 at 05:31:00PM +0200, [EMAIL PROTECTED] wrote:
 
 PSARC 2007/171 will be available in b68. Any documentation anywhere on
 how to take advantage of it?
 
 Some of the Sun storage arrays contain NVRAM. It would be really nice
 if the array NVRAM would be available for ZIL storage. It would also
 be nice for extra hardware (PCI-X, PCIe card) that added NVRAM storage
 to various sun low/mid-range servers that are currently acting as
 ZFS/NFS servers. Or maybe someone knows of cheap SSD storage that
 could be used for the ZIL? I think several HD's are available with
 SCSI/ATA interfaces.
 
 Would flash memory be fast enough (current flash memory has reasonable
 sequential write throughput but horrible I/O ops)

Good point. The speeds for the following don't seem very impressive:
  http://www.adtron.com/products/A25fb-SerialATAFlashDisk.html
  http://www.sandisk.com/OEM/ProductCatalog(1321)-SanDisk_SSD_SATA_5000_25.aspx

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-03 Thread Albert Chin
On Tue, Jul 03, 2007 at 09:01:50AM -0700, Richard Elling wrote:
 Albert Chin wrote:
  Some of the Sun storage arrays contain NVRAM. It would be really nice
  if the array NVRAM would be available for ZIL storage. It would also
  be nice for extra hardware (PCI-X, PCIe card) that added NVRAM storage
  to various sun low/mid-range servers that are currently acting as
  ZFS/NFS servers. Or maybe someone knows of cheap SSD storage that
  could be used for the ZIL? I think several HD's are available with
  SCSI/ATA interfaces.
 
 First, you need a workload where the ZIL has an impact.

ZFS/NFS + zil_disable is faster than ZFS/NFS without zil_disable. So,
I presume, ZFS/NFS + an NVRAM-backed ZIL would be noticeably faster
than ZFS/NFS + ZIL.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-03 Thread Albert Chin
On Tue, Jul 03, 2007 at 10:31:28AM -0700, Richard Elling wrote:
 Albert Chin wrote:
  On Tue, Jul 03, 2007 at 09:01:50AM -0700, Richard Elling wrote:
  Albert Chin wrote:
  Some of the Sun storage arrays contain NVRAM. It would be really nice
  if the array NVRAM would be available for ZIL storage. It would also
  be nice for extra hardware (PCI-X, PCIe card) that added NVRAM storage
  to various sun low/mid-range servers that are currently acting as
  ZFS/NFS servers. Or maybe someone knows of cheap SSD storage that
  could be used for the ZIL? I think several HD's are available with
  SCSI/ATA interfaces.
  First, you need a workload where the ZIL has an impact.
  
  ZFS/NFS + zil_disable is faster than ZFS/NFS without zil_disable. So,
  I presume, ZFS/NFS + an NVRAM-backed ZIL would be noticeably faster
  than ZFS/NFS + ZIL.
 
 ... for NFS workloads which are sync-sensitive.

Well, yes. We've made the decision not to set zil_disable because of
the possibility of the ZFS/NFS server crashing and leaving the clients
out of sync with what's on the server. I think this is the common case
for a ZFS/NFS server though.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-03 Thread Albert Chin
On Tue, Jul 03, 2007 at 11:02:24AM -0700, Bryan Cantrill wrote:
 On Tue, Jul 03, 2007 at 10:26:20AM -0500, Albert Chin wrote:
  PSARC 2007/171 will be available in b68. Any documentation anywhere on
  how to take advantage of it?
  
  Some of the Sun storage arrays contain NVRAM. It would be really nice
  if the array NVRAM would be available for ZIL storage. 
 
 It depends on your array, of course, but in most arrays you can control
 the amount of write cache (i.e., NVRAM) dedicated to particular LUNs.
 So to use the new separate logging most effectively, you should take
 your array, and dedicate all of your NVRAM to a single LUN that you then
 use as your separate log device.  Your pool should then use a LUN or LUNs
 that do not have any NVRAM dedicated to it.  

Hmm, interesting. We'll try to find out if the 6140's can do this.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-09 Thread Albert Chin
On Tue, Jul 03, 2007 at 11:02:24AM -0700, Bryan Cantrill wrote:
 
 On Tue, Jul 03, 2007 at 10:26:20AM -0500, Albert Chin wrote:
  It would also be nice for extra hardware (PCI-X, PCIe card) that
  added NVRAM storage to various sun low/mid-range servers that are
  currently acting as ZFS/NFS servers. 
 
 You can do it yourself very easily -- check out the umem cards from
 Micro Memory, available at http://www.umem.com.  Reasonable prices
 ($1000/GB), they have a Solaris driver, and the performance
 absolutely rips.

The PCIe card is in beta, they don't sell to individual customers, and
the person I spoke with didn't even know a vendor (Tier 1/2 OEMs) that
had a Solaris driver. They do have a number of PCI-X cards though.

So, I guess we'll be testing the "dedicate all NVRAM to one LUN"
solution once b68 is released.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-10 Thread Albert Chin
On Tue, Jul 10, 2007 at 07:12:35AM -0500, Al Hopper wrote:
 On Mon, 9 Jul 2007, Albert Chin wrote:
 
  On Tue, Jul 03, 2007 at 11:02:24AM -0700, Bryan Cantrill wrote:
 
  On Tue, Jul 03, 2007 at 10:26:20AM -0500, Albert Chin wrote:
  It would also be nice for extra hardware (PCI-X, PCIe card) that
  added NVRAM storage to various sun low/mid-range servers that are
  currently acting as ZFS/NFS servers.
 
  You can do it yourself very easily -- check out the umem cards from
  Micro Memory, available at http://www.umem.com.  Reasonable prices
  ($1000/GB), they have a Solaris driver, and the performance
  absolutely rips.
 
  The PCIe card is in beta, they don't sell to individual customers, and
  the person I spoke with didn't even know a vendor (Tier 1/2 OEMs) that
  had a Solaris driver. They do have a number of PCI-X cards though.
 
  So, I guess we'll be testing the dedicate all NVRAM to LUN solution
  once b68 is released.
 
 or ramdiskadm(1M) might be interesting...

Well, that's not really an option as a panic of the server would not
be good. While the on-disk data would be consistent, data the clients
wrote to the server might not have been committed.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] separate intent log blog

2007-07-18 Thread Albert Chin
On Wed, Jul 18, 2007 at 01:29:51PM -0600, Neil Perrin wrote:
 I wrote up a blog on the separate intent log called slog blog
 which describes the interface; some performance results; and
 general status:
 
 http://blogs.sun.com/perrin/entry/slog_blog_or_blogging_on

So, how did you get a "pci Micro Memory pci1332,5425" card :) I
presume this is the PCI-X version.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] separate intent log blog

2007-07-18 Thread Albert Chin
On Wed, Jul 18, 2007 at 01:00:22PM -0700, Eric Schrock wrote:
 You can find these at:
 
 http://www.umem.com/Umem_NVRAM_Cards.html
 
 And the one Neil was using in particular:
 
 http://www.umem.com/MM-5425CN.html

They only sell to OEMs. Our Sun VAR looked for one as well but could
not find anyone selling them.

 - Eric
 
 On Wed, Jul 18, 2007 at 01:54:23PM -0600, Neil Perrin wrote:
  
  
  Albert Chin wrote:
   On Wed, Jul 18, 2007 at 01:29:51PM -0600, Neil Perrin wrote:
   I wrote up a blog on the separate intent log called slog blog
   which describes the interface; some performance results; and
   general status:
  
   http://blogs.sun.com/perrin/entry/slog_blog_or_blogging_on
   
   So, how did you get a pci Micro Memory pci1332,5425 card :) I
   presume this is the PCI-X version.
  
  I wasn't involved in the acquisition but was just sent one internally
  for testing. Yes, it's PCI-X. I assume you're asking because they
  cannot (or can no longer) be obtained?
  
  Neil.
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] separate intent log blog

2007-07-18 Thread Albert Chin
On Wed, Jul 18, 2007 at 01:54:23PM -0600, Neil Perrin wrote:
 Albert Chin wrote:
  On Wed, Jul 18, 2007 at 01:29:51PM -0600, Neil Perrin wrote:
  I wrote up a blog on the separate intent log called slog blog
  which describes the interface; some performance results; and
  general status:
 
  http://blogs.sun.com/perrin/entry/slog_blog_or_blogging_on
  
  So, how did you get a pci Micro Memory pci1332,5425 card :) I
  presume this is the PCI-X version.
 
 I wasn't involved in the acquisition but was just sent one internally
 for testing. Yes, it's PCI-X. I assume you're asking because they
 cannot (or can no longer) be obtained?

Sadly, not from any reseller I know of.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] separate intent log blog

2007-07-27 Thread Albert Chin
On Fri, Jul 27, 2007 at 08:32:48AM -0700, Adolf Hohl wrote:
 what is necessary to get it working from the solaris side. Is a
 driver on board or is there no special one needed?

I'd imagine so.

 I just got a packed MM-5425CN with 256M. However i am lacking a
 pci-x 64bit connector and not sure if it is worth the whole effort
 for my personal purposes.

Huh? So your MM-5425CN doesn't fit into a PCI slot?

 Any comment are very appreciated

How did you obtain your card?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zfs log device (zil) ever coming to Sol10?

2007-09-18 Thread Albert Chin
On Tue, Sep 18, 2007 at 12:59:02PM -0400, Andy Lubel wrote:
 I think we are very close to using zfs in our production environment..  Now
 that I have snv_72 installed and my pools set up with NVRAM log devices
 things are hauling butt.

How did you get NVRAM log devices?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS array NVRAM cache?

2007-09-25 Thread Albert Chin
On Tue, Sep 25, 2007 at 06:01:00PM -0700, Vincent Fox wrote:
 I don't understand.  How do you
 
 setup one LUN that has all of the NVRAM on the array dedicated to it
 
 I'm pretty familiar with 3510 and 3310. Forgive me for being a bit
 thick here, but can you be more specific for the n00b?

If you're using CAM, disable NVRAM on all of your LUNs. Then, create
another LUN equivalent to the size of your NVRAM. Assign the ZIL to
this LUN. You'll then have an NVRAM-backed ZIL.

I posted a question along these lines to storage-discuss:
  http://mail.opensolaris.org/pipermail/storage-discuss/2007-July/003080.html

You'll need to determine the performance impact of removing NVRAM from
your data LUNs. Don't blindly do it.
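
(The "assign the ZIL to this LUN" step is just adding that LUN as a
log device to the pool; the LUN name below is a placeholder:)
  # zpool add tank log c4t0d0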

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Oops (accidentally deleted replaced drive)

2007-11-19 Thread Albert Chin
Running ON b66 and had a drive fail. Ran 'zpool replace' and resilvering
began. But then I accidentally deleted the replacement drive on the array
via CAM.

# zpool status -v
...
  raidz2                                   DEGRADED     0     0     0
    c0t600A0B800029996605964668CB39d0      ONLINE       0     0     0
    spare                                  DEGRADED     0     0     0
      replacing                            UNAVAIL      0 79.14     0  insufficient replicas
        c0t600A0B8000299966059E4668CBD3d0  UNAVAIL     27   370     0  cannot open
        c0t600A0B800029996606584741C7C3d0  UNAVAIL      0 82.32     0  cannot open
      c0t600A0B8000299CCC05D84668F448d0    ONLINE       0     0     0
    c0t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
    c0t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
    c0t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0


Is there a way to recover from this?
  # zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
  c0t600A0B8000299CCC06734741CD4Ed0
  cannot replace c0t600A0B8000299966059E4668CBD3d0 with
  c0t600A0B8000299CCC06734741CD4Ed0: cannot replace a replacing device

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Oops (accidentally deleted replaced drive)

2007-11-19 Thread Albert Chin
On Mon, Nov 19, 2007 at 06:23:01PM -0800, Eric Schrock wrote:
 You should be able to do a 'zpool detach' of the replacement and then
 try again.

Thanks. That worked.
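
For the archives, the recovery boiled down to detaching the dead
replacement and re-issuing the replace:
  # zpool detach tww c0t600A0B800029996606584741C7C3d0
  # zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
      c0t600A0B8000299CCC06734741CD4Ed0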

 - Eric
 
 On Mon, Nov 19, 2007 at 08:20:04PM -0600, Albert Chin wrote:
  Running ON b66 and had a drive fail. Ran 'zfs replace' and resilvering
  began. But, accidentally deleted the replacement drive on the array
  via CAM.
  
  # zpool status -v
  ...
    raidz2                                   DEGRADED     0     0     0
      c0t600A0B800029996605964668CB39d0      ONLINE       0     0     0
      spare                                  DEGRADED     0     0     0
        replacing                            UNAVAIL      0 79.14     0  insufficient replicas
          c0t600A0B8000299966059E4668CBD3d0  UNAVAIL     27   370     0  cannot open
          c0t600A0B800029996606584741C7C3d0  UNAVAIL      0 82.32     0  cannot open
        c0t600A0B8000299CCC05D84668F448d0    ONLINE       0     0     0
      c0t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
      c0t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
      c0t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0
  
  
  Is there a way to recover from this?
# zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
c0t600A0B8000299CCC06734741CD4Ed0
cannot replace c0t600A0B8000299966059E4668CBD3d0 with
c0t600A0B8000299CCC06734741CD4Ed0: cannot replace a replacing device
  
  -- 
  albert chin ([EMAIL PROTECTED])
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 --
 Eric Schrock, FishWorkshttp://blogs.sun.com/eschrock
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why did resilvering restart?

2007-11-20 Thread Albert Chin
On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
 Resilver and scrub are broken and restart when a snapshot is created
 -- the current workaround is to disable snaps while resilvering,
 the ZFS team is working on the issue for a long term fix.

But no snapshot was taken. If one had been, zpool history would have
shown it. So, in short, _no_ ZFS operations are going on during the
resilvering. Yet, it is restarting.
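
(For what it's worth, a quick way to double-check whether any pool
operations snuck in is to look at the tail of the pool history, e.g.:)
  # zpool history tww | tail -20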

 -Wade
 
 [EMAIL PROTECTED] wrote on 11/20/2007 09:58:19 AM:
 
  On b66:
# zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
c0t600A0B8000299CCC06734741CD4Ed0
 some hours later
# zpool status tww
  pool: tww
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool
 will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 62.90% done, 4h26m to go
 some hours later
# zpool status tww
  pool: tww
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool
 will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 3.85% done, 18h49m to go
 
# zpool history tww | tail -1
2007-11-20.02:37:13 zpool replace tww
 c0t600A0B8000299966059E4668CBD3d0
c0t600A0B8000299CCC06734741CD4Ed0
 
  So, why did resilvering restart when no zfs operations occurred? I
  just ran zpool status again and now I get:
# zpool status tww
  pool: tww
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool
 will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.00% done, 134h45m to go
 
  What's going on?
 
  --
  albert chin ([EMAIL PROTECTED])
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why did resilvering restart?

2007-11-20 Thread Albert Chin
On Tue, Nov 20, 2007 at 11:10:20AM -0600, [EMAIL PROTECTED] wrote:
 
 [EMAIL PROTECTED] wrote on 11/20/2007 10:11:50 AM:
 
  On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
   Resilver and scrub are broken and restart when a snapshot is created
   -- the current workaround is to disable snaps while resilvering,
   the ZFS team is working on the issue for a long term fix.
 
  But, no snapshot was taken. If so, zpool history would have shown
  this. So, in short, _no_ ZFS operations are going on during the
  resilvering. Yet, it is restarting.
 
 
 Does 2007-11-20.02:37:13 actually match the expected timestamp of
 the original zpool replace command before the first zpool status
 output listed below?

No. We ran some 'zpool status' commands after the last 'zpool
replace'. The 'zpool status' output in the initial email is from this
morning. The only ZFS command we've been running is 'zfs list', 'zpool
list tww', 'zpool status', or 'zpool status -v' after the last 'zpool
replace'.

Server is on GMT time.

 Is it possible that another zpool replace is further up on your
 pool history (ie it was rerun by an admin or automatically from some
 service)?

Yes, but a zpool replace for the same bad disk:
  2007-11-20.00:57:40 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
  c0t600A0B800029996606584741C7C3d0
  2007-11-20.02:35:22 zpool detach tww c0t600A0B800029996606584741C7C3d0
  2007-11-20.02:37:13 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
  c0t600A0B8000299CCC06734741CD4Ed0

We accidentally removed c0t600A0B800029996606584741C7C3d0 from the
array, hence the 'zpool detach'.

The last 'zpool replace' has been running for 15h now.

 -Wade
 
 
  
   [EMAIL PROTECTED] wrote on 11/20/2007 09:58:19 AM:
  
On b66:
  # zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
  c0t600A0B8000299CCC06734741CD4Ed0
   some hours later
  # zpool status tww
pool: tww
   state: DEGRADED
  status: One or more devices is currently being resilvered.  The
 pool
   will
  continue to function, possibly in a degraded state.
  action: Wait for the resilver to complete.
   scrub: resilver in progress, 62.90% done, 4h26m to go
   some hours later
  # zpool status tww
pool: tww
   state: DEGRADED
  status: One or more devices is currently being resilvered.  The
 pool
   will
  continue to function, possibly in a degraded state.
  action: Wait for the resilver to complete.
   scrub: resilver in progress, 3.85% done, 18h49m to go
   
  # zpool history tww | tail -1
  2007-11-20.02:37:13 zpool replace tww
   c0t600A0B8000299966059E4668CBD3d0
  c0t600A0B8000299CCC06734741CD4Ed0
   
So, why did resilvering restart when no zfs operations occurred? I
just ran zpool status again and now I get:
  # zpool status tww
pool: tww
   state: DEGRADED
  status: One or more devices is currently being resilvered.  The
 pool
   will
  continue to function, possibly in a degraded state.
  action: Wait for the resilver to complete.
   scrub: resilver in progress, 0.00% done, 134h45m to go
   
What's going on?
   
--
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  
   ___
   zfs-discuss mailing list
   zfs-discuss@opensolaris.org
   http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  
  
 
  --
  albert chin ([EMAIL PROTECTED])
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why did resilvering restart?

2007-11-21 Thread Albert Chin
On Tue, Nov 20, 2007 at 11:39:30AM -0600, Albert Chin wrote:
 On Tue, Nov 20, 2007 at 11:10:20AM -0600, [EMAIL PROTECTED] wrote:
  
  [EMAIL PROTECTED] wrote on 11/20/2007 10:11:50 AM:
  
   On Tue, Nov 20, 2007 at 10:01:49AM -0600, [EMAIL PROTECTED] wrote:
Resilver and scrub are broken and restart when a snapshot is created
-- the current workaround is to disable snaps while resilvering,
the ZFS team is working on the issue for a long term fix.
  
   But, no snapshot was taken. If so, zpool history would have shown
   this. So, in short, _no_ ZFS operations are going on during the
   resilvering. Yet, it is restarting.
  
  
  Does 2007-11-20.02:37:13 actually match the expected timestamp of
  the original zpool replace command before the first zpool status
  output listed below?
 
 No. We ran some 'zpool status' commands after the last 'zpool
 replace'. The 'zpool status' output in the initial email is from this
 morning. The only ZFS command we've been running is 'zfs list', 'zpool
 list tww', 'zpool status', or 'zpool status -v' after the last 'zpool
 replace'.

I think the 'zpool status' command was resetting the resilvering. We
upgraded to b77 this morning, which does not exhibit this problem.
Resilvering is now done.

 Server is on GMT time.
 
  Is it possible that another zpool replace is further up on your
  pool history (ie it was rerun by an admin or automatically from some
  service)?
 
 Yes, but a zpool replace for the same bad disk:
   2007-11-20.00:57:40 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
   c0t600A0B800029996606584741C7C3d0
   2007-11-20.02:35:22 zpool detach tww c0t600A0B800029996606584741C7C3d0
   2007-11-20.02:37:13 zpool replace tww c0t600A0B8000299966059E4668CBD3d0
   c0t600A0B8000299CCC06734741CD4Ed0
 
 We accidentally removed c0t600A0B800029996606584741C7C3d0 from the
 array, hence the 'zpool detach'.
 
 The last 'zpool replace' has been running for 15h now.
 
  -Wade
  
  
   
[EMAIL PROTECTED] wrote on 11/20/2007 09:58:19 AM:
   
 On b66:
   # zpool replace tww c0t600A0B8000299966059E4668CBD3d0 \
   c0t600A0B8000299CCC06734741CD4Ed0
some hours later
   # zpool status tww
 pool: tww
state: DEGRADED
   status: One or more devices is currently being resilvered.  The
  pool
will
   continue to function, possibly in a degraded state.
   action: Wait for the resilver to complete.
scrub: resilver in progress, 62.90% done, 4h26m to go
some hours later
   # zpool status tww
 pool: tww
state: DEGRADED
   status: One or more devices is currently being resilvered.  The
  pool
will
   continue to function, possibly in a degraded state.
   action: Wait for the resilver to complete.
scrub: resilver in progress, 3.85% done, 18h49m to go

   # zpool history tww | tail -1
   2007-11-20.02:37:13 zpool replace tww
c0t600A0B8000299966059E4668CBD3d0
   c0t600A0B8000299CCC06734741CD4Ed0

 So, why did resilvering restart when no zfs operations occurred? I
 just ran zpool status again and now I get:
   # zpool status tww
 pool: tww
state: DEGRADED
   status: One or more devices is currently being resilvered.  The
  pool
will
   continue to function, possibly in a degraded state.
   action: Wait for the resilver to complete.
scrub: resilver in progress, 0.00% done, 134h45m to go

 What's going on?

 --
 albert chin ([EMAIL PROTECTED])
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   
   
  
   --
   albert chin ([EMAIL PROTECTED])
   ___
   zfs-discuss mailing list
   zfs-discuss@opensolaris.org
   http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  
  
 
 -- 
 albert chin ([EMAIL PROTECTED])
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Trial x4500, zfs with NFS and quotas.

2007-11-28 Thread Albert Chin
On Wed, Nov 28, 2007 at 05:40:57PM +0900, Jorgen Lundman wrote:
 
 Ah it's a somewhat mis-leading error message:
 
 bash-3.00# mount -F lofs /zpool1/test /export/test
 bash-3.00# share -F nfs -o rw,anon=0 /export/test
 Could not share: /export/test: invalid path
 bash-3.00# umount /export/test
 bash-3.00# zfs set sharenfs=off zpool1/test
 bash-3.00# mount -F lofs /zpool1/test /export/test
 bash-3.00# share -F nfs -o rw,anon=0 /export/test
 
 So if any zfs file-system has sharenfs enabled, you will get invalid 
 path. If you disable sharenfs, then you can export the lofs.

I reported bug #6578437. We recently upgraded to b77 and this bug
appears to be fixed now.

 Lund
 
 
 J.P. King wrote:
 
  I can not export lofs on NFS. Just gives invalid path,
  
  Tell that to our mirror server.
  
  -bash-3.00$ /sbin/mount -p | grep linux
  /data/linux - /linux lofs - no ro
  /data/linux - /export/ftp/pub/linux lofs - no ro
  -bash-3.00$ grep linux /etc/dfs/sharetab
  /linux  -   nfs ro  Linux directories
  -bash-3.00$ df -k /linux
  Filesystem   1K-blocks  Used Available Use% Mounted on
  data 3369027462 3300686151  68341312  98% /data
  
  and:
 
  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6578437
  
  I'm using straight Solaris, not Solaris Express or equivalents:
  
  -bash-3.00$ uname -a
  SunOS leprechaun.csi.cam.ac.uk 5.10 Generic_127111-01 sun4u sparc 
  SUNW,Sun-Fire-V240 Solaris
  
  I can't comment on the bug, although I notice it is categorised under 
  nfsv4, but the description doesn't seem to match that.
  
  Jorgen Lundman   | [EMAIL PROTECTED]
  
  Julian
  -- 
  Julian King
  Computer Officer, University of Cambridge, Unix Support
  
 
 -- 
 Jorgen Lundman   | [EMAIL PROTECTED]
 Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
 Shibuya-ku, Tokyo| +81 (0)90-5578-8500  (cell)
 Japan| +81 (0)3 -3375-1767  (home)
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)

2008-01-03 Thread Albert Chin
On Thu, Jan 03, 2008 at 02:57:08PM -0700, Jason J. W. Williams wrote:
 There seems to be a persistent issue we have with ZFS where one of the
 SATA disk in a zpool on a Thumper starts throwing sense errors, ZFS
 does not offline the disk and instead hangs all zpools across the
 system. If it is not caught soon enough, application data ends up in
 an inconsistent state. We've had this issue with b54 through b77 (as
 of last night).
 
 We don't seem to be the only folks with this issue reading through the
 archives. Are there any plans to fix this behavior? It really makes
 ZFS less than desirable/reliable.

http://blogs.sun.com/eschrock/entry/zfs_and_fma

FMA For ZFS Phase 2 (PSARC/2007/283) was integrated in b68:
  http://www.opensolaris.org/os/community/arc/caselog/2007/283/
  http://www.opensolaris.org/os/community/on/flag-days/all/

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Albert Chin
On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
 
 My primary use case, is NFS base storage to a farm of software build 
 servers, and developer desktops.

For the above environment, you'll probably see a noticeable improvement
with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive
cards exist for the common consumer (with ECC memory anyway). If you
convince http://www.micromemory.com/ to sell you one, let us know :)

Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of
improvement you can expect. Don't use this in production though.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] sharenfs with over 10000 file systems

2008-01-23 Thread Albert Chin
On Wed, Jan 23, 2008 at 08:02:22AM -0800, Akhilesh Mritunjai wrote:
 I remember reading a discussion where these kind of problems were
 discussed.
 
 Basically it boils down to everything not being aware of the
 radical changes in filesystems concept.
 
 All these things are being worked on, but it might take some time
 before everything is made aware that it's no longer unusual for there
 to be 10000+ filesystems on one machine.

But shouldn't sharemgr(1M) be aware? It's relatively new.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-25 Thread Albert Chin
On Fri, Jan 25, 2008 at 12:59:18AM -0500, Kyle McDonald wrote:
 ... With the 256MB doing write caching, is there any further benefit
 to moving the ZIL to a flash or other fast NV storage?

Do some tests with and without the ZIL enabled. You should see a big
difference. With the ZIL on battery-backed RAM, you should get performance
close to what you see with the ZIL disabled. I'd put the ZIL on a
battery-backed RAM card in a heartbeat if I could find one. I think others
would as well.

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance with Sun StorageTek 2540

2008-02-15 Thread Albert Chin
On Fri, Feb 15, 2008 at 09:00:05PM +, Peter Tribble wrote:
 On Fri, Feb 15, 2008 at 8:50 PM, Bob Friesenhahn
 [EMAIL PROTECTED] wrote:
  On Fri, 15 Feb 2008, Peter Tribble wrote:
   
May not be relevant, but still worth checking - I have a 2530 (which 
  ought
to be that same only SAS instead of FC), and got fairly poor performance
at first. Things improved significantly when I got the LUNs properly
balanced across the controllers.
 
   What do you mean by properly balanced across the controllers?  Are
   you using the multipath support in Solaris 10 or are you relying on
   ZFS to balance the I/O load?  Do some disks have more affinity for a
   controller than the other?
 
 Each LUN is accessed through only one of the controllers (I presume the
 2540 works the same way as the 2530 and 61X0 arrays). The paths are
 active/passive (if the active fails it will relocate to the other path).
 When I set mine up the first time it allocated all the LUNs to controller B
 and performance was terrible. I then manually transferred half the LUNs
 to controller A and it started to fly.

http://groups.google.com/group/comp.unix.solaris/browse_frm/thread/59b43034602a7b7f/0b500afc4d62d434?lnk=stq=#0b500afc4d62d434

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/recv question

2008-03-11 Thread Albert Chin
On Thu, Mar 06, 2008 at 10:34:07PM -0800, Bill Shannon wrote:
 Darren J Moffat wrote:
  I know this isn't answering the question but rather than using today 
  and yesterday why not not just use dates ?
 
 Because then I have to compute yesterday's date to do the incremental
 dump.

Not if you set a ZFS property with the date of the last backup.
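For example (a sketch; the property name below is made up, any lowercase
"module:name" style user property will do):

  # zfs set com.example:lastsnap=today tank/home
  # zfs get -H -o value com.example:lastsnap tank/home
  today

Then the script just reads the property back to pick the "from" snapshot for
the incremental send, instead of computing yesterday's date.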

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS configuration for VMware

2008-06-27 Thread Albert Chin
On Fri, Jun 27, 2008 at 08:13:14AM -0700, Ross wrote:
 Bleh, just found out the i-RAM is 5v PCI only.  Won't work on PCI-X
 slots which puts that out of the question for the motherboad I'm
 using.  Vmetro have a 2GB PCI-E card out, but it's for OEM's only:
 http://www.vmetro.com/category4304.html, and I don't have any space in
 this server to mount a SSD.

Maybe you can call Vmetro and get the names of some resellers whom you
could call to get pricing info?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] J4200/J4400 Array

2008-07-02 Thread Albert Chin
On Wed, Jul 02, 2008 at 04:49:26AM -0700, Ben B. wrote:
 According to the Sun Handbook, there is a new array :
 SAS interface
 12 disks SAS or SATA
 
 ZFS could be used nicely with this box.

Doesn't seem to have any NVRAM storage on board, so seems like JBOD.

 There is an another version called
 J4400 with 24 disks.
 
 Doc is here :
 http://docs.sun.com/app/docs/coll/j4200

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] J4200/J4400 Array

2008-07-03 Thread Albert Chin
On Thu, Jul 03, 2008 at 01:43:36PM +0300, Mertol Ozyoney wrote:
 You are right that J series do not have nvram onboard. However most Jbods
 like HP's MSA series have some nvram.
 The idea behind not using nvram on the Jbod's is 
 
 -) There is no use to add limited ram to a JBOD as disks already have a lot
 of cache.
 -) It's easy to design a redundant Jbod without nvram. If you have nvram and
 need redundancy you need to design more complex HW and more complex firmware
 -) Batteries are the first thing to fail
 -) Servers already have too much ram

Well, if the server attached to the J series is doing ZFS/NFS,
performance will increase with zfs:zfs_nocacheflush=1. But, without
battery-backed NVRAM, that setting really isn't safe. So, for this use
case, unless the server itself has battery-backed NVRAM, I don't see how
the J series is a good fit for ZFS/NFS.
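For reference, the tunable in question is set like this (just a sketch; it is
only safe when every device behind the pool has a nonvolatile,
battery-backed write cache):

  # echo 'set zfs:zfs_nocacheflush = 1' >> /etc/system
  # reboot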

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do you grow a ZVOL?

2008-07-17 Thread Albert Chin
On Thu, Jul 17, 2008 at 04:28:34PM -0400, Charles Menser wrote:
 I've looked for anything I can find on the topic, but there does not
 appear to be anything documented.
 
 Can a ZVOL be expanded?

I think setting the volsize property expands it. Dunno what happens on
the clients though.
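Something like this, I'd guess (hypothetical zvol name; I haven't checked
what an initiator sees afterwards):

  # zfs get volsize tank/vol01
  # zfs set volsize=20G tank/vol01

Growing should be fine; shrinking volsize is the dangerous direction.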

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL NVRAM partitioning?

2008-09-06 Thread Albert Chin
On Sat, Sep 06, 2008 at 11:16:15AM -0700, Kaya Bekiroğlu wrote:
  The big problem appears to be getting your hands on these cards.   
  Although I have the drivers now my first supplier let me down, and  
  while the second insists they have shipped the cards it's been three  
  weeks now and there's no sign of them.
 
 Thanks to Google Shopping I was able to order two of these cards from:
 http://www.printsavings.com/01390371OP-discount-MICRO+MEMORY-MM5425--512MB-NVRAM-battery.aspx
 
 They appear to be in good working order, but unfortunately I am unable
 to verify the driver. pkgadd -d umem_Sol_Drv_Cust_i386_v01_11.pkg
 hangs on ## Installing  part 1 of 3. on snv_95.  I do not have other
 Solaris versions to  experiment with; this is really just a hobby for
 me.

Does the card come with any programming specs to help debug the driver?

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS space efficiency when copying files from another source

2008-11-24 Thread Albert Chin
On Mon, Nov 24, 2008 at 08:43:18AM -0800, Erik Trimble wrote:
 I _really_ wish rsync had an option to copy in place or something like 
 that, where the updates are made directly to the file, rather than a 
 temp copy.

Isn't this what --inplace does?
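i.e. something like (from memory -- check the rsync man page):

  $ rsync -av --inplace /source/dir/ /backup/dir/

which writes changed blocks directly into the existing destination file
instead of building a temp copy and renaming it, so ZFS only COWs the blocks
that actually changed.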

-- 
albert chin ([EMAIL PROTECTED])
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Why so many data errors with raidz2 config and one failing drive?

2009-08-24 Thread Albert Chin
Added a third raidz2 vdev to my pool:
  pool: tww
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 0h57m, 13.36% done, 6h9m to go
config:

NAME                                       STATE     READ WRITE CKSUM
tww                                        DEGRADED     0     0 16.9K
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605964668CB39d0      ONLINE       0     0     0
    c6t600A0B8000299CCC06C84744C892d0      ONLINE       0     0     0
    c6t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
    c6t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0
    c6t600A0B800029996605AA4668CDB1d0      ONLINE       0     0     0
    c6t600A0B8000299966073547C5CED9d0      ONLINE       0     0     0
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605B04668F17Dd0      ONLINE       0     0     0
    c6t600A0B8000299CCC099E4A400B94d0      ONLINE       0     0     0
    c6t600A0B800029996605B64668F26Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05CC4668F30Ed0      ONLINE       0     0     0
    c6t600A0B800029996605BC4668F305d0      ONLINE       0     0     0
    c6t600A0B8000299CCC099B4A400A9Cd0      ONLINE       0     0     0
    c6t600A0B800029996605C24668F39Bd0      ONLINE       0     0     0
  raidz2                                   DEGRADED     0     0 34.0K
    c6t600A0B8000299CCC0A154A89E426d0      ONLINE       0     0     0
    c6t600A0B800029996609F74A89E1A5d0      ONLINE       0     0     7  4K resilvered
    c6t600A0B8000299CCC0A174A89E520d0      ONLINE       0     0     2  4K resilvered
    c6t600A0B800029996609F94A89E24Bd0      ONLINE       0     0    48  24.5K resilvered
    replacing                              DEGRADED     0     0 78.7K
      c6t600A0B8000299CCC0A194A89E634d0    UNAVAIL     20  277K     0  experienced I/O failures
      c6t600A0B800029996609EE4A89DA51d0    ONLINE       0     0     0  38.1M resilvered
    c6t600A0B8000299CCC0A0C4A89DDE8d0      ONLINE       0     0     6  6K resilvered
    c6t600A0B800029996609F04A89DB1Bd0      ONLINE       0     0    86  92K resilvered
spares
  c6t600A0B8000299CCC05D84668F448d0        AVAIL
  c6t600A0B800029996605C84668F461d0        AVAIL

errors: 17097 data errors, use '-v' for a list


Seems some of the new drives are having problems, resulting in CKSUM
errors. I don't understand why I have so many data errors though. Why
does the third raidz2 vdev report 34.0K CKSUM errors?

The number of data errors appears to be increasing as well as the
resilver process continues.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why so many data errors with raidz2 config and one failing drive?

2009-08-24 Thread Albert Chin
On Mon, Aug 24, 2009 at 02:01:39PM -0500, Bob Friesenhahn wrote:
 On Mon, 24 Aug 2009, Albert Chin wrote:

 Seems some of the new drives are having problems, resulting in CKSUM
 errors. I don't understand why I have so many data errors though. Why
 does the third raidz2 vdev report 34.0K CKSUM errors?

 Is it possible that this third raidz2 is inflicted with a shared
 problem such as a cable, controller, backplane, or power supply? Only
 one drive is reported as being unscathed.

Well, we're just using unused drives on the existing array. No other
changes.

 Do you periodically scrub your array?

No. Guess we will now :) But, I think all of the data loss is a result
of the new drives, not ones that were already part of the two previous
vdevs.
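Probably just a cron entry on the server, something like (sketch; pick your
own schedule):

  # crontab -e
  0 3 1 * * /usr/sbin/zpool scrub tww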

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Resilver complete, but device not replaced, odd zpool status output

2009-08-25 Thread Albert Chin
NAME                                       STATE     READ WRITE CKSUM
tww                                        DEGRADED     0     0 76.0K
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605964668CB39d0      ONLINE       0     0     0
    c6t600A0B8000299CCC06C84744C892d0      ONLINE       0     0     0
    c6t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
    c6t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0
    c6t600A0B800029996605AA4668CDB1d0      ONLINE       0     0     0
    c6t600A0B8000299966073547C5CED9d0      ONLINE       0     0     0
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605B04668F17Dd0      ONLINE       0     0     0
    c6t600A0B8000299CCC099E4A400B94d0      ONLINE       0     0     0
    c6t600A0B800029996605B64668F26Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05CC4668F30Ed0      ONLINE       0     0     0
    c6t600A0B800029996605BC4668F305d0      ONLINE       0     0     0
    c6t600A0B8000299CCC099B4A400A9Cd0      ONLINE       0     0     0
    c6t600A0B800029996605C24668F39Bd0      ONLINE       0     0     0
  raidz2                                   DEGRADED     0     0  153K
    c6t600A0B8000299CCC0A154A89E426d0      ONLINE       0     0     1  1K resilvered
    c6t600A0B800029996609F74A89E1A5d0      ONLINE       0     0 2.14K  5.67M resilvered
    c6t600A0B8000299CCC0A174A89E520d0      ONLINE       0     0   299  34K resilvered
    c6t600A0B800029996609F94A89E24Bd0      ONLINE       0     0 29.7K  23.5M resilvered
    replacing                              DEGRADED     0     0  118K
      c6t600A0B8000299CCC0A194A89E634d0    OFFLINE     20 1.28M     0
      c6t600A0B800029996609EE4A89DA51d0    ONLINE       0     0     0  1.93G resilvered
    c6t600A0B8000299CCC0A0C4A89DDE8d0      ONLINE       0     0   247  54K resilvered
    c6t600A0B800029996609F04A89DB1Bd0      ONLINE       0     0 24.2K  51.3M resilvered
spares
  c6t600A0B8000299CCC05D84668F448d0        AVAIL
  c6t600A0B800029996605C84668F461d0        AVAIL

errors: 27886 data errors, use '-v' for a list

  # zpool replace c6t600A0B8000299CCC0A194A89E634d0 \
  c6t600A0B800029996609EE4A89DA51d0
  invalid vdev specification
  use '-f' to override the following errors:
  /dev/dsk/c6t600A0B800029996609EE4A89DA51d0s0 is part of active ZFS
  pool tww. Please see zpool(1M).

So, what is going on?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resilver complete, but device not replaced, odd zpool status output

2009-08-25 Thread Albert Chin
On Tue, Aug 25, 2009 at 06:05:16AM -0500, Albert Chin wrote:
 [[ snip snip ]]
 
 After the resilver completed:
   # zpool status tww
   pool: tww
  state: DEGRADED
 status: One or more devices has experienced an error resulting in data
 corruption.  Applications may be affected.
 action: Restore the file in question if possible.  Otherwise restore the
 entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
  scrub: resilver completed after 6h9m with 27886 errors on Tue Aug 25 08:32:41 2009
 config:
 
 NAME                                       STATE     READ WRITE CKSUM
 tww                                        DEGRADED     0     0 76.0K
   raidz2                                   ONLINE       0     0     0
     c6t600A0B800029996605964668CB39d0      ONLINE       0     0     0
     c6t600A0B8000299CCC06C84744C892d0      ONLINE       0     0     0
     c6t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
     c6t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
     c6t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0
     c6t600A0B800029996605AA4668CDB1d0      ONLINE       0     0     0
     c6t600A0B8000299966073547C5CED9d0      ONLINE       0     0     0
   raidz2                                   ONLINE       0     0     0
     c6t600A0B800029996605B04668F17Dd0      ONLINE       0     0     0
     c6t600A0B8000299CCC099E4A400B94d0      ONLINE       0     0     0
     c6t600A0B800029996605B64668F26Fd0      ONLINE       0     0     0
     c6t600A0B8000299CCC05CC4668F30Ed0      ONLINE       0     0     0
     c6t600A0B800029996605BC4668F305d0      ONLINE       0     0     0
     c6t600A0B8000299CCC099B4A400A9Cd0      ONLINE       0     0     0
     c6t600A0B800029996605C24668F39Bd0      ONLINE       0     0     0
   raidz2                                   DEGRADED     0     0  153K
     c6t600A0B8000299CCC0A154A89E426d0      ONLINE       0     0     1  1K resilvered
     c6t600A0B800029996609F74A89E1A5d0      ONLINE       0     0 2.14K  5.67M resilvered
     c6t600A0B8000299CCC0A174A89E520d0      ONLINE       0     0   299  34K resilvered
     c6t600A0B800029996609F94A89E24Bd0      ONLINE       0     0 29.7K  23.5M resilvered
     replacing                              DEGRADED     0     0  118K
       c6t600A0B8000299CCC0A194A89E634d0    OFFLINE     20 1.28M     0
       c6t600A0B800029996609EE4A89DA51d0    ONLINE       0     0     0  1.93G resilvered
     c6t600A0B8000299CCC0A0C4A89DDE8d0      ONLINE       0     0   247  54K resilvered
     c6t600A0B800029996609F04A89DB1Bd0      ONLINE       0     0 24.2K  51.3M resilvered
 spares
   c6t600A0B8000299CCC05D84668F448d0        AVAIL
   c6t600A0B800029996605C84668F461d0        AVAIL
 
 errors: 27886 data errors, use '-v' for a list
 
   # zpool replace c6t600A0B8000299CCC0A194A89E634d0 \
   c6t600A0B800029996609EE4A89DA51d0
   invalid vdev specification
   use '-f' to override the following errors:
   /dev/dsk/c6t600A0B800029996609EE4A89DA51d0s0 is part of active ZFS
   pool tww. Please see zpool(1M).
 
 So, what is going on?

Rebooted the server and see the same problem. So, I ran:
  # zpool detach tww c6t600A0B8000299CCC0A194A89E634d0
and now the zpool status output looks normal:
  # zpool status tww
  pool: tww
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 0h16m, 7.88% done, 3h9m to go
config:

NAME   STATE READ WRITE CKSUM
tww                                    ONLINE   0 0 5
  raidz2   ONLINE   0 0 0
c6t600A0B800029996605964668CB39d0  ONLINE   0 0 0
c6t600A0B8000299CCC06C84744C892d0  ONLINE   0 0 0
c6t600A0B8000299CCC05B44668CC6Ad0  ONLINE   0 0 0
c6t600A0B800029996605A44668CC3Fd0  ONLINE   0 0 0
c6t600A0B8000299CCC05BA4668CD2Ed0  ONLINE   0 0 0
c6t600A0B800029996605AA4668CDB1d0  ONLINE   0 0 0
c6t600A0B8000299966073547C5CED9d0  ONLINE   0 0 0
  raidz2   ONLINE   0 0 0
c6t600A0B800029996605B04668F17Dd0  ONLINE   0 0 0
c6t600A0B8000299CCC099E4A400B94d0  ONLINE   0 0 0
c6t600A0B800029996605B64668F26Fd0  ONLINE   0 0 0

[zfs-discuss] zpool scrub started resilver, not scrub

2009-08-26 Thread Albert Chin
# cat /etc/release
  Solaris Express Community Edition snv_105 X86
   Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
   Assembled 15 December 2008
# zpool status tww
  pool: tww
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 6h15m with 27885 errors on Wed Aug 26 07:18:03 2009
config:

NAME                                       STATE     READ WRITE CKSUM
tww                                        ONLINE       0     0 54.5K
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605964668CB39d0      ONLINE       0     0     0
    c6t600A0B8000299CCC06C84744C892d0      ONLINE       0     0     0
    c6t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
    c6t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0
    c6t600A0B800029996605AA4668CDB1d0      ONLINE       0     0     0
    c6t600A0B8000299966073547C5CED9d0      ONLINE       0     0     0
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605B04668F17Dd0      ONLINE       0     0     0
    c6t600A0B8000299CCC099E4A400B94d0      ONLINE       0     0     0
    c6t600A0B800029996605B64668F26Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05CC4668F30Ed0      ONLINE       0     0     0
    c6t600A0B800029996605BC4668F305d0      ONLINE       0     0     0
    c6t600A0B8000299CCC099B4A400A9Cd0      ONLINE       0     0     0
    c6t600A0B800029996605C24668F39Bd0      ONLINE       0     0     0
  raidz2                                   ONLINE       0     0  109K
    c6t600A0B8000299CCC0A154A89E426d0      ONLINE       0     0     0
    c6t600A0B800029996609F74A89E1A5d0      ONLINE       0     0    18  2.50K resilvered
    c6t600A0B8000299CCC0A174A89E520d0      ONLINE       0     0    39  4.50K resilvered
    c6t600A0B800029996609F94A89E24Bd0      ONLINE       0     0   486  75K resilvered
    c6t600A0B80002999660A454A93CEDBd0      ONLINE       0     0     0  2.55G resilvered
    c6t600A0B8000299CCC0A0C4A89DDE8d0      ONLINE       0     0    34  2K resilvered
    c6t600A0B800029996609F04A89DB1Bd0      ONLINE       0     0   173  18K resilvered
spares
  c6t600A0B8000299CCC05D84668F448d0        AVAIL
  c6t600A0B800029996605C84668F461d0        AVAIL

errors: 27885 data errors, use '-v' for a list

# zpool scrub tww
# zpool status tww
  pool: tww
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 0h11m, 2.82% done, 6h21m to go
config:
...

So, why is a resilver in progress when I asked for a scrub?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-27 Thread Albert Chin
On Thu, Aug 27, 2009 at 06:29:52AM -0700, Gary Gendel wrote:
 It looks like It's definitely related to the snv_121 upgrade.  I
 decided to roll back to snv_110 and the checksum errors have
 disappeared.  I'd like to issue a bug report, but I don't have any
 information that might help track this down, just lots of checksum
 errors.

So, on snv_121, can you read the files with checksum errors? Is it
simply the reporting mechanism that is wrong or are the files really
damaged?
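A quick way to check one of the flagged files would be something like (path
is hypothetical):

  # dd if=/pool/fs/some/file of=/dev/null bs=128k

If the data is truly unrecoverable you should get an I/O error back; if the
read succeeds, it points more at a reporting/accounting problem.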

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool scrub started resilver, not scrub

2009-08-31 Thread Albert Chin
On Wed, Aug 26, 2009 at 02:33:39AM -0500, Albert Chin wrote:
 # cat /etc/release
   Solaris Express Community Edition snv_105 X86
Copyright 2008 Sun Microsystems, Inc.  All Rights Reserved.
 Use is subject to license terms.
Assembled 15 December 2008

 So, why is a resilver in progress when I asked for a scrub?

Still seeing the same problem with snv_114.
  # cat /etc/release
Solaris Express Community Edition snv_114 X86
 Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
  Use is subject to license terms.
Assembled 04 May 2009

How do I scrub this pool?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool replace complete but old drives not detached

2009-09-06 Thread Albert Chin
$ cat /etc/release
  Solaris Express Community Edition snv_114 X86
   Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
  Assembled 04 May 2009

I recently replaced two drives in a raidz2 vdev. However, after the
resilver completed, the old drives were not automatically detached.
Why? How do I detach the drives that were replaced?

# zpool replace tww c6t600A0B800029996605B04668F17Dd0 \
c6t600A0B8000299CCC099B4A400A9Cd0
# zpool replace tww c6t600A0B800029996605C24668F39Bd0 \
c6t600A0B8000299CCC0A744A94F7E2d0
... resilver runs to completion ...

# zpool status tww
  pool: tww
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver completed after 25h11m with 23375 errors on Sun Sep  6 02:09:07 2009
config:

NAME                                       STATE     READ WRITE CKSUM
tww                                        DEGRADED     0     0  207K
  raidz2                                   ONLINE       0     0     0
    c6t600A0B800029996605964668CB39d0      ONLINE       0     0     0
    c6t600A0B8000299CCC06C84744C892d0      ONLINE       0     0     0
    c6t600A0B8000299CCC05B44668CC6Ad0      ONLINE       0     0     0
    c6t600A0B800029996605A44668CC3Fd0      ONLINE       0     0     0
    c6t600A0B8000299CCC05BA4668CD2Ed0      ONLINE       0     0     0
    c6t600A0B800029996605AA4668CDB1d0      ONLINE       0     0     0
    c6t600A0B8000299966073547C5CED9d0      ONLINE       0     0     0
  raidz2                                   DEGRADED     0     0  182K
    replacing                              DEGRADED     0     0     0
      c6t600A0B800029996605B04668F17Dd0    DEGRADED     0     0     0  too many errors
      c6t600A0B8000299CCC099B4A400A9Cd0    ONLINE       0     0     0  255G resilvered
    c6t600A0B8000299CCC099E4A400B94d0      ONLINE       0     0  218K  10.2M resilvered
    c6t600A0B8000299CCC0A6B4A93D3EEd0      ONLINE       0     0   242  246G resilvered
    spare                                  DEGRADED     0     0     0
      c6t600A0B8000299CCC05CC4668F30Ed0    DEGRADED     0     0     3  too many errors
      c6t600A0B8000299CCC05D84668F448d0    ONLINE       0     0     0  255G resilvered
    spare                                  DEGRADED     0     0     0
      c6t600A0B800029996605BC4668F305d0    DEGRADED     0     0     0  too many errors
      c6t600A0B800029996605C84668F461d0    ONLINE       0     0     0  255G resilvered
    c6t600A0B800029996609EE4A89DA51d0      ONLINE       0     0     0  246G resilvered
    replacing                              DEGRADED     0     0     0
      c6t600A0B800029996605C24668F39Bd0    DEGRADED     0     0     0  too many errors
      c6t600A0B8000299CCC0A744A94F7E2d0    ONLINE       0     0     0  255G resilvered
  raidz2                                   ONLINE       0     0  233K
    c6t600A0B8000299CCC0A154A89E426d0      ONLINE       0     0     0
    c6t600A0B800029996609F74A89E1A5d0      ONLINE       0     0   758  6.50K resilvered
    c6t600A0B8000299CCC0A174A89E520d0      ONLINE       0     0   311  3.50K resilvered
    c6t600A0B800029996609F94A89E24Bd0      ONLINE       0     0 21.8K  32K resilvered
    c6t600A0B8000299CCC0A694A93D322d0      ONLINE       0     0     0  1.85G resilvered
    c6t600A0B8000299CCC0A0C4A89DDE8d0      ONLINE       0     0 27.4K  41.5K resilvered
    c6t600A0B800029996609F04A89DB1Bd0      ONLINE       0     0 7.13K  24K resilvered
spares
  c6t600A0B8000299CCC05D84668F448d0        INUSE     currently in use
  c6t600A0B800029996605C84668F461d0        INUSE     currently in use
  c6t600A0B80002999660A454A93CEDBd0        AVAIL
  c6t600A0B80002999660ADA4A9CF2EDd0        AVAIL

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to recover from can't open objset, cannot iterate filesystems?

2009-09-21 Thread Albert Chin
Recently upgraded a system from b98 to b114. Also replaced two 400G
Seagate Barracuda 7200.8 SATA disks with two WD 750G RE3 SATA disks
in a 6-device raidz1 pool. Replacing the first 750G went ok. While
replacing the second 750G disk, I noticed CKSUM errors on the first
disk. Once the second disk was replaced, I halted the system, upgraded
to b114, and rebooted. Both b98 and b114 gave the errors:
  WARNING: can't open objset for tww/opt/dists/cd-8.1
  cannot iterate filesystems: I/O error

How do I recover from this?

# zpool status tww
  pool: tww
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
tww ONLINE   0 0 3
  raidz1ONLINE   0 012
c4t0d0  ONLINE   0 0 0
c4t1d0  ONLINE   0 0 0
c4t4d0  ONLINE   0 0 0
c4t5d0  ONLINE   0 0 0
c4t6d0  ONLINE   0 0 0
c4t7d0  ONLINE   0 0 0

errors: 855 data errors, use '-v' for a list

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs snapshot -r panic on b114

2009-09-23 Thread Albert Chin
While a resilver was running, we attempted a recursive snapshot which
resulted in a kernel panic:
  panic[cpu1]/thread=ff00104c0c60: assertion failed: 0 == 
zap_remove_int(mos, next_clones_obj, dsphys->ds_next_snap_obj, tx) (0x0 ==
0x2), file: ../../common/fs/zfs/dsl_dataset.c, line: 1869

  ff00104c0960 genunix:assfail3+c1 ()
  ff00104c0a00 zfs:dsl_dataset_snapshot_sync+4a2 ()
  ff00104c0a50 zfs:snapshot_sync+41 ()
  ff00104c0aa0 zfs:dsl_sync_task_group_sync+eb ()
  ff00104c0b10 zfs:dsl_pool_sync+196 ()
  ff00104c0ba0 zfs:spa_sync+32a ()
  ff00104c0c40 zfs:txg_sync_thread+265 ()
  ff00104c0c50 unix:thread_start+8 ()

System is a X4100M2 running snv_114.

Any ideas?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Help! System panic when pool imported

2009-09-24 Thread Albert Chin
Running snv_114 on an X4100M2 connected to a 6140. Made a clone of a
snapshot a few days ago:
  # zfs snapshot a...@b
  # zfs clone a...@b tank/a
  # zfs clone a...@b tank/b

The system started panicing after I tried:
  # zfs snapshot tank/b...@backup

So, I destroyed tank/b:
  # zfs destroy tank/b
then tried to destroy tank/a
  # zfs destroy tank/a

Now, the system is in an endless panic loop, unable to import the pool
at system startup or with zpool import. The panic dump is:
  panic[cpu1]/thread=ff0010246c60: assertion failed: 0 == 
zap_remove_int(mos, ds_prev->ds_phys->ds_next_clones_obj, obj, tx) (0x0 ==
0x2), file: ../../common/fs/zfs/dsl_dataset.c, line: 1512

  ff00102468d0 genunix:assfail3+c1 ()
  ff0010246a50 zfs:dsl_dataset_destroy_sync+85a ()
  ff0010246aa0 zfs:dsl_sync_task_group_sync+eb ()
  ff0010246b10 zfs:dsl_pool_sync+196 ()
  ff0010246ba0 zfs:spa_sync+32a ()
  ff0010246c40 zfs:txg_sync_thread+265 ()
  ff0010246c50 unix:thread_start+8 ()

We really need to import this pool. Is there a way around this? We do
have snv_114 source on the system if we need to make changes to
usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the zfs
destroy transaction never completed and it is being replayed, causing
the panic. This cycle continues endlessly.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! System panic when pool imported

2009-09-25 Thread Albert Chin
On Fri, Sep 25, 2009 at 05:21:23AM +, Albert Chin wrote:
 [[ snip snip ]]
 
 We really need to import this pool. Is there a way around this? We do
 have snv_114 source on the system if we need to make changes to
 usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the zfs
 destroy transaction never completed and it is being replayed, causing
 the panic. This cycle continues endlessly.

What are the implications of adding the following to /etc/system:
  set zfs:zfs_recover=1
  set aok=1

And importing the pool with:
  # zpool import -o ro

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! System panic when pool imported

2009-09-27 Thread Albert Chin
On Sun, Sep 27, 2009 at 10:06:16AM -0700, Andrew wrote:
 This is what my /var/adm/messages looks like:
 
 Sep 27 12:46:29 solaria genunix: [ID 403854 kern.notice] assertion failed: ss 
 == NULL, file: ../../common/fs/zfs/space_map.c, line: 109
 Sep 27 12:46:29 solaria unix: [ID 10 kern.notice]
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a97a0 
 genunix:assfail+7e ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9830 
 zfs:space_map_add+292 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a98e0 
 zfs:space_map_load+3a7 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9920 
 zfs:metaslab_activate+64 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a99e0 
 zfs:metaslab_group_alloc+2b7 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9ac0 
 zfs:metaslab_alloc_dva+295 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9b60 
 zfs:metaslab_alloc+9b ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9b90 
 zfs:zio_dva_allocate+3e ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9bc0 
 zfs:zio_execute+a0 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9c40 
 genunix:taskq_thread+193 ()
 Sep 27 12:46:29 solaria genunix: [ID 655072 kern.notice] ff00089a9c50 
 unix:thread_start+8 ()

I'm not sure that aok=1/zfs:zfs_recover=1 would help you because
zfs_panic_recover isn't in the backtrace (see
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6638754).
Sometimes a Sun zfs engineer shows up on the freenode #zfs channel. I'd
pop up there and ask. There are somewhat similar bug reports at
bugs.opensolaris.org. I'd post a bug report just in case.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Quickest way to find files with cksum errors without doing scrub

2009-09-28 Thread Albert Chin
Without doing a zpool scrub, what's the quickest way to find files in a
filesystem with cksum errors? Iterating over all files with find takes
quite a bit of time. Maybe there's some zdb fu that will perform the
check for me?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Quickest way to find files with cksum errors without doing scrub

2009-09-28 Thread Albert Chin
On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
 On Mon, 28 Sep 2009, Richard Elling wrote:

 Scrub could be faster, but you can try
   tar cf - . > /dev/null

 If you think about it, validating checksums requires reading the data.
 So you simply need to read the data.

 This should work but it does not verify the redundant metadata.  For
 example, the duplicate metadata copy might be corrupt but the problem
 is not detected since it did not happen to be used.

Too bad we cannot scrub a dataset/object.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Quickest way to find files with cksum errors without doing scrub

2009-09-28 Thread Albert Chin
On Mon, Sep 28, 2009 at 10:16:20AM -0700, Richard Elling wrote:
 On Sep 28, 2009, at 3:42 PM, Albert Chin wrote:

 On Mon, Sep 28, 2009 at 12:09:03PM -0500, Bob Friesenhahn wrote:
 On Mon, 28 Sep 2009, Richard Elling wrote:

 Scrub could be faster, but you can try
tar cf - . > /dev/null

 If you think about it, validating checksums requires reading the  
 data.
 So you simply need to read the data.

 This should work but it does not verify the redundant metadata.  For
 example, the duplicate metadata copy might be corrupt but the problem
 is not detected since it did not happen to be used.

 Too bad we cannot scrub a dataset/object.

 Can you provide a use case? I don't see why scrub couldn't start and
 stop at specific txgs for instance. That won't necessarily get you to a
 specific file, though.

If your pool is borked but mostly readable, yet some file systems have
cksum errors, you cannot zfs send that file system (err, snapshot of
filesystem). So, you need to manually fix the file system by traversing
it to read all files to determine which must be fixed. Once this is
done, you can snapshot and zfs send. If you have many file systems,
this is time consuming.
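The brute-force pass I have in mind is basically just (a sketch; the
mountpoint is hypothetical):

  # cd /tank/somefs && find . -type f -exec cat {} + > /dev/null
  # zpool status -v tww

and then working from the file list zpool status -v prints afterwards.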

Of course, you could just rsync and be happy with what you were able to
recover, but if you have clones branched from the same parent, with only a
few differences between snapshots, having to rsync *everything* rather
than just the differences is painful. Hence the reason to try to get
zfs send to work.

But, this is an extreme example and I doubt pools are often in this
state so the engineering time isn't worth it. In such cases though, a
zfs scrub would be useful.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] refreservation not transferred by zfs send when sending a volume?

2009-09-28 Thread Albert Chin
snv114# zfs get used,reservation,volsize,refreservation,usedbydataset,usedbyrefreservation \
    tww/opt/vms/images/vios/mello-0.img
NAME                                 PROPERTY              VALUE  SOURCE
tww/opt/vms/images/vios/mello-0.img  used                  30.6G  -
tww/opt/vms/images/vios/mello-0.img  reservation           none   default
tww/opt/vms/images/vios/mello-0.img  volsize               25G    -
tww/opt/vms/images/vios/mello-0.img  refreservation        25G    local
tww/opt/vms/images/vios/mello-0.img  usedbydataset         5.62G  -
tww/opt/vms/images/vios/mello-0.img  usedbyrefreservation  25G    -

Sent tww/opt/vms/images/vios/mello-0.img from snv_114 server
to snv_119 server.

On snv_119 server:
snv119# zfs get used,reservation,volsize,refreservation,usedbydataset,usedbyrefreservation \
    t/opt/vms/images/vios/mello-0.img
NAME                               PROPERTY              VALUE  SOURCE
t/opt/vms/images/vios/mello-0.img  used                  5.32G  -
t/opt/vms/images/vios/mello-0.img  reservation           none   default
t/opt/vms/images/vios/mello-0.img  volsize               25G    -
t/opt/vms/images/vios/mello-0.img  refreservation        none   default
t/opt/vms/images/vios/mello-0.img  usedbydataset         5.32G  -
t/opt/vms/images/vios/mello-0.img  usedbyrefreservation  0      -

Any reason the refreservation and usedbyrefreservation properties are
not sent?
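The workaround, I assume, is just to reapply it by hand on the receiving
side:

  # zfs set refreservation=25G t/opt/vms/images/vios/mello-0.img

but I'd have expected a replication stream to preserve it.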

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Should usedbydataset be the same after zfs send/recv for a volume?

2009-09-28 Thread Albert Chin
When transferring a volume between servers, is it expected that the
usedbydataset property should be the same on both? If not, is it cause
for concern?

snv114# zfs list tww/opt/vms/images/vios/near.img
NAME   USED  AVAIL  REFER  MOUNTPOINT
tww/opt/vms/images/vios/near.img  70.5G   939G  15.5G  -
snv114# zfs get usedbydataset tww/opt/vms/images/vios/near.img
NAME  PROPERTY   VALUE   SOURCE
tww/opt/vms/images/vios/near.img  usedbydataset  15.5G   -

snv119# zfs list t/opt/vms/images/vios/near.img 
NAME USED  AVAIL  REFER  MOUNTPOINT
t/opt/vms/images/vios/near.img  14.5G  2.42T  14.5G  -
snv119# zfs get usedbydataset t/opt/vms/images/vios/near.img 
NAMEPROPERTY   VALUE   SOURCE
t/opt/vms/images/vios/near.img  usedbydataset  14.5G   -

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Should usedbydataset be the same after zfs send/recv for a volume?

2009-09-28 Thread Albert Chin
On Mon, Sep 28, 2009 at 07:33:56PM -0500, Albert Chin wrote:
 When transferring a volume between servers, is it expected that the
 usedbydataset property should be the same on both? If not, is it cause
 for concern?
 
 snv114# zfs list tww/opt/vms/images/vios/near.img
 NAME   USED  AVAIL  REFER  MOUNTPOINT
 tww/opt/vms/images/vios/near.img  70.5G   939G  15.5G  -
 snv114# zfs get usedbydataset tww/opt/vms/images/vios/near.img
 NAME  PROPERTY   VALUE   SOURCE
 tww/opt/vms/images/vios/near.img  usedbydataset  15.5G   -
 
 snv119# zfs list t/opt/vms/images/vios/near.img 
 NAME USED  AVAIL  REFER  MOUNTPOINT
 t/opt/vms/images/vios/near.img  14.5G  2.42T  14.5G  -
 snv119# zfs get usedbydataset t/opt/vms/images/vios/near.img 
 NAMEPROPERTY   VALUE   SOURCE
 t/opt/vms/images/vios/near.img  usedbydataset  14.5G   -

Don't know if it matters but disks on both send/recv server are
different, 300GB FCAL on the send, 750GB SATA on the recv.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs receive should allow to keep received system

2009-09-28 Thread Albert Chin
On Mon, Sep 28, 2009 at 03:16:17PM -0700, Igor Velkov wrote:
 Not so good as I hope.
 zfs send -R xxx/x...@daily_2009-09-26_23:51:00 |ssh -c blowfish r...@xxx.xx 
 zfs recv -vuFd xxx/xxx
 
 invalid option 'u'
 usage:
 receive [-vnF] filesystem|volume|snapshot
 receive [-vnF] -d filesystem
 
 For the property list, run: zfs set|get
 
 For the delegated permission list, run: zfs allow|unallow
 r...@xxx:~# uname -a
 SunOS xxx 5.10 Generic_13-03 sun4u sparc SUNW,Sun-Fire-V890
 
 What's wrong?

Looks like -u was a recent addition.

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] iscsi/comstar performance

2009-10-13 Thread Albert Chin
On Tue, Oct 13, 2009 at 01:00:35PM -0400, Frank Middleton wrote:
 After a recent upgrade to b124, decided to switch to COMSTAR for iscsi
 targets for VirtualBox hosted on AMD64 Fedora C10. Both target and
 initiator are running zfs under b124. This combination seems
 unbelievably slow compared to  the old iscsi subsystem.

 A scrub of a local 20GB disk on the target took 16 minutes. A scrub of
 a 20GB iscsi disk took 106 minutes! It seems to take much longer to
 boot from iscsi, so it seems to be reading more slowly too.

 There are a lot of variables - switching to Comstar, snv124, VBox
 3.08, etc., but such a dramatic loss of performance probably has a
 single cause. Is anyone willing to speculate?

Maybe this will help:
  
http://mail.opensolaris.org/pipermail/storage-discuss/2009-September/007118.html

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! System panic when pool imported

2009-10-19 Thread Albert Chin
On Mon, Oct 19, 2009 at 03:31:46PM -0700, Matthew Ahrens wrote:
 Thanks for reporting this.  I have fixed this bug (6822816) in build  
 127.

Thanks. I just installed OpenSolaris Preview based on 125 and will
attempt to apply the patch you made to this release and import the pool.

 --matt

 Albert Chin wrote:
 Running snv_114 on an X4100M2 connected to a 6140. Made a clone of a
 snapshot a few days ago:
   # zfs snapshot a...@b
   # zfs clone a...@b tank/a
   # zfs clone a...@b tank/b

 The system started panicing after I tried:
   # zfs snapshot tank/b...@backup

 So, I destroyed tank/b:
   # zfs destroy tank/b
 then tried to destroy tank/a
   # zfs destroy tank/a

 Now, the system is in an endless panic loop, unable to import the pool
 at system startup or with zpool import. The panic dump is:
   panic[cpu1]/thread=ff0010246c60: assertion failed: 0 == 
 zap_remove_int(mos, ds_prev->ds_phys->ds_next_clones_obj, obj, tx) (0x0 ==
 0x2), file: ../../common/fs/zfs/dsl_dataset.c, line: 1512

   ff00102468d0 genunix:assfail3+c1 ()
   ff0010246a50 zfs:dsl_dataset_destroy_sync+85a ()
   ff0010246aa0 zfs:dsl_sync_task_group_sync+eb ()
   ff0010246b10 zfs:dsl_pool_sync+196 ()
   ff0010246ba0 zfs:spa_sync+32a ()
   ff0010246c40 zfs:txg_sync_thread+265 ()
   ff0010246c50 unix:thread_start+8 ()

 We really need to import this pool. Is there a way around this? We do
 have snv_114 source on the system if we need to make changes to
 usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the zfs
 destroy transaction never completed and it is being replayed, causing
 the panic. This cycle continues endlessly.

   

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! System panic when pool imported

2009-10-20 Thread Albert Chin
On Mon, Oct 19, 2009 at 09:02:20PM -0500, Albert Chin wrote:
 On Mon, Oct 19, 2009 at 03:31:46PM -0700, Matthew Ahrens wrote:
  Thanks for reporting this.  I have fixed this bug (6822816) in build  
  127.
 
 Thanks. I just installed OpenSolaris Preview based on 125 and will
 attempt to apply the patch you made to this release and import the pool.

Did the above and the zpool import worked. Thanks!

  --matt
 
  Albert Chin wrote:
  Running snv_114 on an X4100M2 connected to a 6140. Made a clone of a
  snapshot a few days ago:
# zfs snapshot a...@b
# zfs clone a...@b tank/a
# zfs clone a...@b tank/b
 
  The system started panicing after I tried:
# zfs snapshot tank/b...@backup
 
  So, I destroyed tank/b:
# zfs destroy tank/b
  then tried to destroy tank/a
# zfs destroy tank/a
 
  Now, the system is in an endless panic loop, unable to import the pool
  at system startup or with zpool import. The panic dump is:
panic[cpu1]/thread=ff0010246c60: assertion failed: 0 == 
  zap_remove_int(mos, ds_prev->ds_phys->ds_next_clones_obj, obj, tx) (0x0 ==
  0x2), file: ../../common/fs/zfs/dsl_dataset.c, line: 1512
 
ff00102468d0 genunix:assfail3+c1 ()
ff0010246a50 zfs:dsl_dataset_destroy_sync+85a ()
ff0010246aa0 zfs:dsl_sync_task_group_sync+eb ()
ff0010246b10 zfs:dsl_pool_sync+196 ()
ff0010246ba0 zfs:spa_sync+32a ()
ff0010246c40 zfs:txg_sync_thread+265 ()
ff0010246c50 unix:thread_start+8 ()
 
  We really need to import this pool. Is there a way around this? We do
  have snv_114 source on the system if we need to make changes to
  usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the zfs
  destroy transaction never completed and it is being replayed, causing
  the panic. This cycle continues endlessly.
 

 
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 
 -- 
 albert chin (ch...@thewrittenword.com)
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance problems with Thumper and 7TB ZFS pool using RAIDZ2

2009-10-24 Thread Albert Chin
On Sat, Oct 24, 2009 at 03:31:25PM -0400, Jim Mauro wrote:
 Posting to zfs-discuss. There's no reason this needs to be
 kept confidential.

 5-disk RAIDZ2 - doesn't that equate to only 3 data disks?
 Seems pointless - they'd be much better off using mirrors,
 which is a better choice for random IO...

Is it really pointless? Maybe they want the insurance RAIDZ2 provides.
Given the choice between insurance and performance, I'll take insurance,
though it depends on your use case. We're using 5-disk RAIDZ2 vdevs.
While I want the performance a mirrored vdev would give, it scares me
that you're just one drive away from a failed pool. Of course, you could
have two mirrors in each vdev but I don't want to sacrifice that much
space. However, over the last two years, we haven't had any
demonstratable failures that would give us cause for concern. But, it's
still unsettling.

Would love to hear other opinions on this.
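For concreteness, the two layouts being weighed are roughly (device names
made up):

  # zpool create tank \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
      raidz2 c1t5d0 c1t6d0 c1t7d0 c2t0d0 c2t1d0

versus

  # zpool create tank \
      mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
      mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0

The raidz2 layout survives any two disk failures per vdev at 3/5 usable
capacity; the mirror layout gives much better random IOPS at 1/2 capacity
but loses the pool if both halves of any one mirror die.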

 Looking at this now...

 /jim


 Jeff Savit wrote:
 Hi all,

 I'm looking for suggestions for the following situation: I'm helping  
 another SE with a customer using Thumper with a large ZFS pool mostly  
 used as an NFS server, and disappointments in performance. The storage  
 is an intermediate holding place for data to be fed into a relational  
 database, and the statement is that the NFS side can't keep up with  
 data feeds written to it as flat files.

 The ZFS pool has 8 5-volume RAIDZ2 groups, for 7.3TB of storage, with  
 1.74TB available.  Plenty of idle CPU as shown by vmstat and mpstat.   
 iostat shows queued I/O and I'm not happy about the total latencies -  
 wsvc_t in excess of 75ms at times.  Average of ~60KB per read and only  
 ~2.5KB per write. Evil Tuning guide tells me that RAIDZ2 is happiest  
 for long reads and writes, and this is not the use case here.

 I was surprised to see commands like tar, rm, and chown running  
 locally on the NFS server, so it looks like they're locally doing file  
 maintenance and pruning at the same time it's being accessed remotely.  
 That makes sense to me for the short write lengths and for the high  
 ZFS ACL activity shown by DTrace. I wonder if there is a lot of sync  
 I/O that would benefit from separately defined ZILs (whether SSD or  
 not), so I've asked them to look for fsync activity.

 Data collected thus far is listed below. I've asked for verification  
 of the Solaris 10 level (I believe it's S10u6) and ZFS recordsize.   
 Any suggestions will be appreciated.

 regards, Jeff

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send... too slow?

2009-10-26 Thread Albert Chin
On Sun, Oct 25, 2009 at 01:45:05AM -0700, Orvar Korvar wrote:
 I am trying to backup a large zfs file system to two different
 identical hard drives. I have therefore started two commands to backup
 myfs and when they have finished, I will backup nextfs
 
 zfs send mypool/m...@now | zfs receive backupzpool1/now & zfs send
 mypool/m...@now | zfs receive backupzpool2/now ; zfs send
 mypool/nex...@now | zfs receive backupzpool3/now
 
 in parallel. The logic is that the same file data is cached and
 therefore easy to send to each backup drive.
 
 Should I instead have done one zfs send... and waited for it to
 complete, and then started the next?
 
 It seems that zfs send... takes quite some time? 300GB takes 10
 hours, this far. And I have in total 3TB to backup. This means it will
 take 100 hours. Is this normal? If I had 30TB to back up, it would
 take 1000 hours, which is more than a month. Can I speed this up?

It's not immediately obvious what the cause is. Maybe the server running
zfs send has slow MB/s performance reading from disk. Maybe the network.
Or maybe the remote system. This might help:
  http://tinyurl.com/yl653am
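Before anything else I'd time each leg separately, e.g. (snapshot name
reconstructed from your mail; adjust to the real one):

  # time zfs send mypool/myfs@now > /dev/null

and divide the snapshot size by the elapsed time. If that alone is slow,
the bottleneck is reading from the source pool; if it's fast, test the
network/ssh and the receiving pool on their own next.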

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)

2009-10-26 Thread Albert Chin
On Mon, Oct 26, 2009 at 09:58:05PM +0200, Mertol Ozyoney wrote:
 In all 2500 and 6000 series you can assign raid set's to a controller and
 that controller becomes the owner of the set. 

When I configured all 32 drives on a 6140 array and the expansion
chassis, CAM automatically split the drives amongst controllers evenly.

 The advantage of 2540 against it's bigger brothers (6140 which is EOL'ed)
 and competitors 2540 do use dedicated data paths for cache mirroring just
 like higher end unit disks (6180,6580, 6780) improving write performance
 significantly. 
 
 Spliting load between controllers can most of the time increase performance,
 but you do not need to split in two equal partitions. 
 
 Also do not forget that first tray have dedicated data lines to the
 controller so generaly it's wise not to mix those drives with other drives
 on other trays. 

But, if you have an expansion chassis, and create a zpool with drives on
the first tray and subsequent trays, what's the difference? You cannot
tell zfs which vdev to assign writes to so it seems pointless to balance
your pool based on the chassis when reads/writes are potentially spread
across all vdevs.

 Best regards
 Mertol  
 
 
 
 
 Mertol Ozyoney 
 Storage Practice - Sales Manager
 
 Sun Microsystems, TR
 Istanbul TR
 Phone +902123352200
 Mobile +905339310752
 Fax +90212335
 Email mertol.ozyo...@sun.com
 
 
 
 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org
 [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bob Friesenhahn
 Sent: Tuesday, October 13, 2009 10:59 PM
 To: Nils Goroll
 Cc: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
 
 On Tue, 13 Oct 2009, Nils Goroll wrote:
 
  Regarding my bonus question: I haven't found yet a definite answer if
 there 
  is a way to read the currently active controller setting. I still assume
 that 
  the nvsram settings which can be read with
 
  service -d arrayname -c read -q nvsram region=0xf2 host=0x00
 
  do not necessarily reflect the current configuration and that the only way
 to 
  make sure the controller is running with that configuration is to reset
 it.
 
 I believe that in the STK 2540, the controllers operate Active/Active 
 except that each controller is Active for half the drives and Standby 
 for the others.  Each controller has a copy of the configuration 
 information.  Whichever one you communicate with is likely required to 
 mirror the changes to the other.
 
 In my setup I load-share the fiber channel traffic by assigning six 
 drives as active on one controller and six drives as active on the 
 other controller, and the drives are individually exported with a LUN 
 per drive.  I used CAM to do that.  MPXIO sees the changes and does 
 map 1/2 the paths down each FC link for more performance than one FC 
 link offers.
 
 Bob
 --
 Bob Friesenhahn
 bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
 GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] heads up on SXCE build 125 (LU + mirrored root pools)

2009-11-05 Thread Albert Chin
On Thu, Nov 05, 2009 at 01:01:54PM -0800, Chris Du wrote:
 I think I finally see what you mean.
 
 # luactivate b126
 System has findroot enabled GRUB
 ERROR: Unable to determine the configuration of the current boot environment 
 b125.

A possible solution was posted in the thread:
  http://opensolaris.org/jive/thread.jspa?threadID=115503tstart=0

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] cannot receive new filesystem stream: invalid backup stream

2009-12-27 Thread Albert Chin
I have two snv_126 systems. I'm trying to zfs send a recursive snapshot
from one system to another:
  # zfs send -v -R tww/opt/chro...@backup-20091225 |\
  ssh backupserver zfs receive -F -d -u -v tww
  ...
  found clone origin tww/opt/chroots/a...@ab-1.0
  receiving incremental stream of tww/opt/chroots/ab-...@backup-20091225 into 
tww/opt/chroots/ab-...@backup-20091225
  cannot receive new filesystem stream: invalid backup stream

If I do the following on the origin server:
  # zfs destroy -r tww/opt/chroots/ab-1.0
  # zfs list -t snapshot -r tww/opt/chroots | grep ab-1.0 
  tww/opt/chroots/a...@ab-1.0
  tww/opt/chroots/hppa1.1-hp-hpux11...@ab-1.0
  tww/opt/chroots/hppa1.1-hp-hpux11...@ab-1.0
  ...
  # zfs list -t snapshot -r tww/opt/chroots | grep ab-1.0 |\
  while read a; do zfs destroy $a; done
then another zfs send like the above, the zfs send/receive succeeds.
However, If I then perform a few operations like the following:
  zfs snapshot tww/opt/chroots/a...@ab-1.0
  zfs clone tww/opt/chroots/a...@ab-1.0 tww/opt/chroots/ab-1.0
  zfs rename tww/opt/chroots/ab/hppa1.1-hp-hpux11.00 tww/opt/chroots/ab-1.0/hppa1.1-hp-hpux11.00
  zfs rename tww/opt/chroots/hppa1.1-hp-hpux11...@ab tww/opt/chroots/hppa1.1-hp-hpux11...@ab-1.0
  zfs destroy tww/opt/chroots/ab/hppa1.1-hp-hpux11.00
  zfs destroy tww/opt/chroots/hppa1.1-hp-hpux11...@ab
  zfs snapshot tww/opt/chroots/hppa1.1-hp-hpux11...@ab
  zfs clone tww/opt/chroots/hppa1.1-hp-hpux11...@ab tww/opt/chroots/ab/hppa1.1-hp-hpux11.00
  zfs rename tww/opt/chroots/ab/hppa1.1-hp-hpux11.11 tww/opt/chroots/ab-1.0/hppa1.1-hp-hpux11.11
  zfs rename tww/opt/chroots/hppa1.1-hp-hpux11...@ab tww/opt/chroots/hppa1.1-hp-hpux11...@ab-1.0
  zfs destroy tww/opt/chroots/ab/hppa1.1-hp-hpux11.11
  zfs destroy tww/opt/chroots/hppa1.1-hp-hpux11...@ab
  zfs snapshot tww/opt/chroots/hppa1.1-hp-hpux11...@ab
  zfs clone tww/opt/chroots/hppa1.1-hp-hpux11...@ab tww/opt/chroots/ab/hppa1.1-hp-hpux11.11
  ...
and then perform another zfs send/receive, the error above occurs. Why?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] CPU sizing for ZFS/iSCSI/NFS server

2011-12-12 Thread Albert Chin
We're preparing to purchase an X4170M2 as an upgrade for our existing
X4100M2 server for ZFS, NFS, and iSCSI. We have a choice for CPU, some
more expensive than others. Our current system has a dual-core 1.8Ghz
Opteron 2210 CPU with 8GB. Seems like either a 6-core Intel E5649
2.53Ghz CPU or 4-core Intel E5620 2.4Ghz CPU would be more than
enough. Based on what we're using the system for, it should be more
I/O bound than CPU bound. We are doing compression in ZFS but that
shouldn't be too CPU intensive. Seems we should care more about core
count than high GHz.

Recommendations?

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CPU sizing for ZFS/iSCSI/NFS server

2011-12-12 Thread Albert Chin
On Mon, Dec 12, 2011 at 02:40:52PM -0500, Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D. 
wrote:
 please check out the ZFS appliance 7120 spec 2.4Ghz /24GB memory and
 ZIL(SSD)
 may be try the ZFS simulator SW

Good point. Thanks.

 regards
 
 On 12/12/2011 2:28 PM, Albert Chin wrote:
 We're preparing to purchase an X4170M2 as an upgrade for our existing
 X4100M2 server for ZFS, NFS, and iSCSI. We have a choice for CPU, some
 more expensive than others. Our current system has a dual-core 1.8Ghz
 Opteron 2210 CPU with 8GB. Seems like either a 6-core Intel E5649
 2.53Ghz CPU or 4-core Intel E5620 2.4Ghz CPU would be more than
 enough. Based on what we're using the system for, it should be more
 I/O bound than CPU bound. We are doing compression in ZFS but that
 shouldn't be too CPU intensive. Seems we should be caring more about
 more cores than high Ghz.
 
 Recommendations?
 
 
 -- 
 Hung-Sheng Tsao Ph D.
 Founder  Principal
 HopBit GridComputing LLC
 cell: 9734950840
 http://laotsao.wordpress.com/
 http://laotsao.blogspot.com/
 

 


-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CPU sizing for ZFS/iSCSI/NFS server

2011-12-12 Thread Albert Chin
On Mon, Dec 12, 2011 at 03:01:08PM -0500, Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D. 
wrote:
 4c@2.4ghz

Yep, that's the plan. Thanks.

 On 12/12/2011 2:44 PM, Albert Chin wrote:
 On Mon, Dec 12, 2011 at 02:40:52PM -0500, Hung-Sheng Tsao (Lao Tsao 老曹) 
 Ph.D. wrote:
 please check out the ZFS appliance 7120 spec 2.4Ghz /24GB memory and
 ZIL(SSD)
 may be try the ZFS simulator SW
 Good point. Thanks.
 
 regards
 
 On 12/12/2011 2:28 PM, Albert Chin wrote:
 We're preparing to purchase an X4170M2 as an upgrade for our existing
 X4100M2 server for ZFS, NFS, and iSCSI. We have a choice for CPU, some
 more expensive than others. Our current system has a dual-core 1.8Ghz
 Opteron 2210 CPU with 8GB. Seems like either a 6-core Intel E5649
 2.53Ghz CPU or 4-core Intel E5620 2.4Ghz CPU would be more than
 enough. Based on what we're using the system for, it should be more
 I/O bound than CPU bound. We are doing compression in ZFS but that
 shouldn't be too CPU intensive. Seems we should be caring more about
 more cores than high Ghz.
 
 Recommendations?
 
 -- 
 Hung-Sheng Tsao Ph D.
 Founder   Principal
 HopBit GridComputing LLC
 cell: 9734950840
 http://laotsao.wordpress.com/
 http://laotsao.blogspot.com/
 
 
 
 
 -- 
 Hung-Sheng Tsao Ph D.
 Founder  Principal
 HopBit GridComputing LLC
 cell: 9734950840
 http://laotsao.wordpress.com/
 http://laotsao.blogspot.com/

-- 
albert chin (ch...@thewrittenword.com)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss