Re: [zfs-discuss] zfs data corruption

2008-04-24 Thread Victor Engle
Just to clarify this post. This isn't data I care about recovering.
I'm just interested in understanding how zfs determined there was data
corruption when I have checksums disabled and there were no
non-retryable read errors reported in the messages file.
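
For anyone following along, this is roughly how I'm checking (pool name taken
from the zpool status output below; the point about metadata is my assumption
until I've been through the docs):

# zfs get checksum zpool1
    (shows the user-data checksum setting, which is off here)
# zpool status -v zpool1
    (lists any persistent errors recorded for the pool)

My working assumption is that metadata checksums cannot be turned off, so
metadata errors would still be detected and counted even with checksum=off.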

On Wed, Apr 23, 2008 at 9:52 PM, Victor Engle [EMAIL PROTECTED] wrote:
 Thanks! That would explain things. I don't believe it was a real disk
  read error because of the absence of evidence in /var/adm/messages.

  I'll review the man page and documentation to confirm that metadata is
  checksummed.

  Regards,
  Vic




  On Wed, Apr 23, 2008 at 6:30 PM, Nathan Kroenert
  [EMAIL PROTECTED] wrote:
   I'm just taking a stab here, so could be completely wrong, but IIRC, even if
   you disable checksum, it still checksums the metadata...

   So, it could be metadata checksum errors.

   Others on the list might have some funky zdb thingies you could use to see
   what it actually is...
  
Note: typed pre caffeine... :)
  
Nathan
  
  
  
Vic Engle wrote:
  
I'm hoping someone can help me understand a zfs data corruption symptom.
   We have a zpool with checksum turned off. Zpool status shows that data
   corruption occurred. The application using the pool at the time reported a
   read error and zpool status (see below) shows 2 read errors on a device.
   The thing that is confusing to me is how ZFS determines that data corruption
   exists when reading data from a pool with checksum turned off.
   
Also, I'm wondering about the persistent errors in the output below. Since
   no specific file or directory is mentioned, does this indicate pool metadata
   is corrupt?
   
Thanks for any help interpreting the output...
   
   
# zpool status -xv
 pool: zpool1
 state: ONLINE
status: One or more devices has experienced an error resulting in data
   corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
   entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:
   
   NAME STATE READ WRITE CKSUM
   zpool1   ONLINE   2 0  0
 c4t60A9800043346859444A476B2D48446Fd0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D484352d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D484236d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D482D6Cd0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D483951d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D483836d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D48366Bd0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D483551d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D483435d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D48326Bd0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D483150d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D483035d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D47796Ad0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D477850d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D477734d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D47756Ad0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D47744Fd0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D477333d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D477169d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D47704Ed0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D476F33d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D476D68d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D476C4Ed0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D476B32d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D476968d0  ONLINE   0 0  0
 c4t60A98000433468656834476B2D453974d0  ONLINE   0 0  0
 c4t60A98000433468656834476B2D454142d0  ONLINE   0 0  0
 c4t60A98000433468656834476B2D454255d0  ONLINE   0 0  0
 c4t60A98000433468656834476B2D45436Dd0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D487346d0  ONLINE   2 0  0
 c4t60A9800043346859444A476B2D487175d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D48705Ad0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D486F45d0  ONLINE   0 0  0
 c4t60A9800043346859444A476B2D486D74d0  ONLINE   0 0

Re: [zfs-discuss] zfs data corruption

2008-04-23 Thread Victor Engle
Thanks! That would explain things. I don't believe it was a real disk
read error because of the absence of evidence in /var/adm/messages.

I'll review the man page and documentation to confirm that metadata is
checksummed.

Regards,
Vic


On Wed, Apr 23, 2008 at 6:30 PM, Nathan Kroenert
[EMAIL PROTECTED] wrote:
 I'm just taking a stab here, so could be completely wrong, but IIRC, even if
 you disable checksum, it still checksums the metadata...

  So, it could be metadata checksum errors.

  Others on the list might have some funky zdb thingies you could use to see what
 it actually is...

  Note: typed pre caffeine... :)

  Nathan



  Vic Engle wrote:

  I'm hoping someone can help me understand a zfs data corruption symptom.
 We have a zpool with checksum turned off. Zpool status shows that data
 corruption occurred. The application using the pool at the time reported a
 read error and zpool status (see below) shows 2 read errors on a device.
 The thing that is confusing to me is how ZFS determines that data corruption
 exists when reading data from a pool with checksum turned off.
 
  Also, I'm wondering about the persistent errors in the output below. Since
 no specific file or directory is mentioned, does this indicate pool metadata
 is corrupt?
 
  Thanks for any help interpreting the output...
 
 
  # zpool status -xv
   pool: zpool1
   state: ONLINE
  status: One or more devices has experienced an error resulting in data
 corruption.  Applications may be affected.
  action: Restore the file in question if possible.  Otherwise restore the
 entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
   scrub: none requested
  config:
 
 NAME STATE READ WRITE CKSUM
 zpool1   ONLINE   2 0 0
   c4t60A9800043346859444A476B2D48446Fd0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D484352d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D484236d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D482D6Cd0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D483951d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D483836d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D48366Bd0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D483551d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D483435d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D48326Bd0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D483150d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D483035d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D47796Ad0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D477850d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D477734d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D47756Ad0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D47744Fd0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D477333d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D477169d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D47704Ed0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D476F33d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D476D68d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D476C4Ed0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D476B32d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D476968d0  ONLINE   0 0 0
   c4t60A98000433468656834476B2D453974d0  ONLINE   0 0 0
   c4t60A98000433468656834476B2D454142d0  ONLINE   0 0 0
   c4t60A98000433468656834476B2D454255d0  ONLINE   0 0 0
   c4t60A98000433468656834476B2D45436Dd0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D487346d0  ONLINE   2 0 0
   c4t60A9800043346859444A476B2D487175d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D48705Ad0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486F45d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486D74d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486C5Ad0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486B44d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486974d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486859d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486744d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486573d0  ONLINE   0 0 0
   c4t60A9800043346859444A476B2D486459d0  ONLINE   0 0 

Re: [zfs-discuss] ZFS and multipath with iSCSI

2008-04-04 Thread Victor Engle
In /kernel/drv/scsi_vhci.conf you could do this

load-balance="none";

That way mpxio would use only one path at a time. I imagine you also need a
vid/pid entry in scsi_vhci.conf for your target.
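
As a rough sketch only (the vendor and product strings below are placeholders;
use the exact 8-character VID and 16-character PID your target reports, and
check the option value against your Solaris release), the scsi_vhci.conf
entries might look like:

load-balance="none";
auto-failback="enable";
device-type-scsi-options-list =
"VENDOR  PRODUCT", "symmetric-option";
symmetric-option = 0x1000000;

A reboot is normally needed for scsi_vhci.conf changes to take effect.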

Regards,
Vic


On Fri, Apr 4, 2008 at 3:36 PM, Chris Siebenmann [EMAIL PROTECTED] wrote:
  We're currently designing a ZFS fileserver environment with iSCSI based
  storage (for failover, cost, ease of expansion, and so on). As part of
  this we would like to use multipathing for extra reliability, and I am
  not sure how we want to configure it.

   Our iSCSI backend only supports multiple sessions per target, not
  multiple connections per session (and my understanding is that the
  Solaris initiator doesn't currently support multiple connections
  anyways). However, we have been cautioned that there is nothing in
  the backend that imposes a global ordering for commands between the
  sessions, and so disk IO might get reordered if Solaris's multipath load
  balancing submits part of it to one session and part to another.

   So: does anyone know if Solaris's multipath and iSCSI systems already
  take care of this, or if ZFS already is paranoid enough to deal
  with this, or if we should configure Solaris multipathing to not
  load-balance?

  (A load-balanced multipath configuration is simpler for us to
  administer, at least until I figure out how to tell Solaris multipathing
  which is the preferrred network for any given iSCSI target so we can
  balance the overall network load by hand.)

   Thanks in advance.

 - cks
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS hang and boot hang when iSCSI device removed

2008-02-05 Thread Victor Engle
I don't think this is so much a ZFS problem as an iSCSI initiator
problem. Are you using static configs or SendTargets discovery? There
are many reports of SendTargets discovery misbehavior in the
storage-discuss forum.

To recover:
1. Boot into single user from CD
2. mount the root slice on /a
3. rm /etc/iscsi/*
4. reboot
5. configure iscsi static discovery for the new target IPs.

A nice trick mentioned by David Weibel previously on storage discuss
is to use discovery addresses to provide all the info you need to
create the static configs. Just add the discovery addresses but don't
enable SendTargets. Then run iscsiadm list discovery-address -v.
The initiator will log in to the discovery address, issue a SendTargets
all command and print the results on stdout. Use the results
to create the static configs and then enable static discovery.
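
As a sketch, with a placeholder portal address and target name, the sequence
looks something like this:

# iscsiadm add discovery-address 192.168.10.5:3260
# iscsiadm list discovery-address -v
    (note the target names and portals it prints)
# iscsiadm add static-config iqn.1986-03.com.example:target0,192.168.10.5:3260
# iscsiadm modify discovery --static enable
# devfsadm -i iscsi

SendTargets discovery is never enabled here; the discovery address is only
used for the one-off listing.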

Good Luck,
Vic



On Feb 5, 2008 11:44 AM, Ross [EMAIL PROTECTED] wrote:
 We're currently evaluating ZFS prior to (hopefully) rolling it out across our 
 server room, and have managed to lock up a server after connecting to an 
 iSCSI target, and then changing the IP address of the target.

 Basically we have two test Solaris servers running, and I followed the 
 instructions on the post below to share a zpool on Server1 using the iSCSI 
 Target, and then import that into a new zpool on Server2.
 http://blogs.sun.com/chrisg/date/20070418.

 Everything appeared to work fine until I moved the servers to a new network 
 (while powered on), which changed their IP addresses.  The server running the 
 iSCSI Target is still fine, it has its IP address and from another machine I 
 can see that the iSCSI target is still visible.

 However, Server2 was not as happy with the move.  As far as I can tell, all 
 ZFS commands locked up on it.  I couldn't run zfs list, zpool list, 
 zpool status or zfs iostat.  Every single one locked up and I couldn't 
 even find a way to stop them.  Now I've seen a few posts about ZFS commands 
 locking up, but this is very concerning for something we're considering using 
 in a production system.

 Anyway, with Server 2 well and truly locked up, I restarted it hoping that 
 would clear the problem (figuring ZFS would simply mark the device as 
 offline), but found that the server can't even boot.  For the past hour it 
 has simply spammed the following message to the screen:

 NOTICE: iscsi connection(27) unable to connect to target 
 iqn.1986-03.com.sun:02:3d882af1-91cc-6d9e-9f19-edfa095fca6d

 Now that I wasn't expecting.  This volume isn't a boot volume, the server 
 doesn't need either ZFS or iSCSI to boot, and I don't think I even saved any 
 data on that drive.  I have found a post reporting a similar message to the 
 above, which was reporting a ten minute boot delay with a working iSCSI 
 volume, however I can't find anything to say what happens if the iSCSI volume 
 is no longer there:
 http://forum.java.sun.com/thread.jspa?threadID=5243777&messageID=10004717

 So, I have quite a few questions:

 1.  Does anybody know how I can recover from this, or am I going to have to 
 wipe my test server and start again?

 2.  How vulnerable are the ZFS admin tools to locking up like this?

 3.  How vulnerable is the iSCSI client to locking up like this during boot?

 4.  Is there any way we can disconnect the iSCSI share while ZFS is locked up 
 like this?  What could I have tried to regain control of my server before 
 rebooting?

 5.  If I can get the server booted, is there any way to redirect an iSCSI 
 volume so it's pointing at the new IP address?  (I was expecting to simply do 
 a zpool replace when ZFS reported the drive as missing).

 And finally, does anybody know why zpool status should lock up like this?  
 I'm really not happy that the ZFS admin tools seem so fragile.  At the very 
 least I would have expected zpool status to be able to list the devices 
 attached to the pools and report that they are timing out or erroring, and 
 for me to be able to use the other ZFS tools to forcibly remove failed drives 
 as needed.  Anything less means I'm risking my whole system should ZFS find 
 something it doesn't like.

 I admit I'm a solaris newbie, but surely something designed as a robust 
 filesystem also needs robust management tools?


 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Q : change disks to get bigger pool

2008-01-20 Thread Victor Engle
 Plan is to replace disks with new and larger disks.

 So will the pool get bigger just by replacing all 4 disks one-by-one?
 And if it will get larger, how should this be done: fail the disks one-by-one,
 or ???

 Or is data backup and pool recreation only way to get bigger pool


There is another possibility. If you can retain the original smaller
disks in the pool then you have the option of adding the additional 4
larger disks as another raidz set. In that case the command would be
something like...

zpool add yourpool raidz disk1 disk2 disk3 disk4

The pool would then stripe across the 2 raidz sets with more I/O to
the larger raidz. In this case your new space would be immediately
available.

Since the 4 new disks are larger you could alternatively add a 3 disk
raidz to the pool and add the 4th new disk as a spare. That way I
think you could survive 2 disk failures in either raidz set as long as the
2nd failure didn't occur during the resilver operation from the first
failure.
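
For example, the second layout might be built like this (device names are
placeholders for your 4 new disks):

# zpool add yourpool raidz c2t4d0 c2t5d0 c2t6d0
# zpool add yourpool spare c2t7d0

zpool status should then show the original raidz set, the new raidz set and
the spare.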

Regards,
Vic
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how to relocate a disk

2008-01-18 Thread Victor Engle
 I tried taking it offline and online again, but then zpool says the disk
 is unavailable. Trying a zpool replace didn't work because it complains
 that the new disk is part of a zfs pool...

So you would need to make it look like a new disk to ZFS and not like a disk
belonging to a zpool.

Vic
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how to relocate a disk

2008-01-18 Thread Victor Engle
 I tried taking it offline and online again, but then zpool says the disk
 is unavailable. Trying a zpool replace didn't work because it complains
 that the new disk is part of a zfs pool...

So you offlined the disk and moved it to the new controller and then
tried to add it back to the pool? A brute force approach might work:
offline the disk, move it, run format -e to restore the VTOC label and
then zpool replace it. Of course it would have to resilver, but you would
have avoided an export/import or reboot.
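
Roughly, and untested on my side (pool and device names are placeholders):

# zpool offline mypool c1t2d0
    (physically move the disk; say it shows up as c3t2d0 on the new controller)
# format -e
    (select the relocated disk and write a fresh label)
# zpool replace mypool c1t2d0 c3t2d0

The replace then resilvers the disk from the rest of the pool.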

Regards,
Vic
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Clearing partition/label info

2007-12-17 Thread Victor Engle
Hi Al,

That depends on whether you want to go back to a VTOC/SMI label or
keep the EFI label created by ZFS. To keep the EFI label just
repartition and use the partitions as desired. If you want to go back
to a VTOC/SMI label you have to run format -e and then relabel the
disk and select SMI.

Be sure to run zpool destroy poolname before relabeling a lun used for zfs.

To automatically recreate the default VTOC label you could incorporate
the following into a script and iterate over a list of disks.

1. Create a /tmp/label.dat file with the following line in it...

label  0  y

2. Then execute the following format command...

format -e -m -f /tmp/label.dat cxtxdx

That should apply a default VTOC SMI label.

For x86 you may need to run the following before the format command...

/usr/sbin/fdisk -B /dev/rdsk/cxtxdxp0
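
Pulled together, a sketch of such a script (the disk list is illustrative;
drop the fdisk line on SPARC):

#!/bin/sh
# write a default SMI/VTOC label to each listed disk
cat > /tmp/label.dat <<EOF
label  0  y
EOF
for d in c2t0d0 c2t1d0 c2t2d0; do
    /usr/sbin/fdisk -B /dev/rdsk/${d}p0    # x86 only: single Solaris fdisk partition
    format -e -m -f /tmp/label.dat $d
done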

Regards,
Vic


On Dec 17, 2007 9:36 AM, Al Slater [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Hi,

 What is the quickest way of clearing the label information on a disk
 that has been previously used in a zpool?

 regards

 - --
 Al Slater

 Technical Director
 SCL

 Phone : +44 (0)1273 07
 Fax   : +44 (0)1273 01
 email : [EMAIL PROTECTED]

 Stanton Consultancy Ltd
 Pavilion House, 6-7 Old Steine, Brighton, East Sussex, BN1 1EJ
 Registered in England Company number: 1957652 VAT number: GB 760 2433 55
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.7 (MingW32)
 Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

 iD8DBQFHZoluz4fTOFL/EDYRAnr5AJ4ie+xFNCi6gA5HLZ8IqI1wHItEEwCgj0ru
 EwSc9B16io3kBz2wS0LGoEQ=
 =eaZc
 -END PGP SIGNATURE-

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Training

2007-10-31 Thread Victor Engle
This class looks pretty good...

http://www.sun.com/training/catalog/courses/SA-229-S10.xml



On 10/31/07, Lisa Richards [EMAIL PROTECTED] wrote:




 Is there a class on ZFS installation and administration ?



 Lisa Richards

 Zykis Corporation

 [EMAIL PROTECTED]
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Victor Engle
Wouldn't this be the known feature where a write error to zfs forces a panic?

Vic



On 10/4/07, Ben Rockwood [EMAIL PROTECTED] wrote:
 Dick Davies wrote:
  On 04/10/2007, Nathan Kroenert [EMAIL PROTECTED] wrote:
 
 
  Client A
- import pool make couple-o-changes
 
  Client B
- import pool -f  (heh)
 
 
 
  Oct  4 15:03:12 fozzie ^Mpanic[cpu0]/thread=ff0002b51c80:
  Oct  4 15:03:12 fozzie genunix: [ID 603766 kern.notice] assertion
  failed: dmu_read(os, smo->smo_object, offset, size, entry_map) == 0 (0x5
  == 0x0)
  , file: ../../common/fs/zfs/space_map.c, line: 339
  Oct  4 15:03:12 fozzie unix: [ID 10 kern.notice]
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51160
  genunix:assfail3+b9 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51200
  zfs:space_map_load+2ef ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51240
  zfs:metaslab_activate+66 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51300
  zfs:metaslab_group_alloc+24e ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b513d0
  zfs:metaslab_alloc_dva+192 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51470
  zfs:metaslab_alloc+82 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514c0
  zfs:zio_dva_allocate+68 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b514e0
  zfs:zio_next_stage+b3 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51510
  zfs:zio_checksum_generate+6e ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51530
  zfs:zio_next_stage+b3 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515a0
  zfs:zio_write_compress+239 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b515c0
  zfs:zio_next_stage+b3 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51610
  zfs:zio_wait_for_children+5d ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51630
  zfs:zio_wait_children_ready+20 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51650
  zfs:zio_next_stage_async+bb ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51670
  zfs:zio_nowait+11 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51960
  zfs:dbuf_sync_leaf+1ac ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b519a0
  zfs:dbuf_sync_list+51 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a10
  zfs:dnode_sync+23b ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51a50
  zfs:dmu_objset_sync_dnodes+55 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51ad0
  zfs:dmu_objset_sync+13d ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51b40
  zfs:dsl_pool_sync+199 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51bd0
  zfs:spa_sync+1c5 ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c60
  zfs:txg_sync_thread+19a ()
  Oct  4 15:03:12 fozzie genunix: [ID 655072 kern.notice] ff0002b51c70
  unix:thread_start+8 ()
  Oct  4 15:03:12 fozzie unix: [ID 10 kern.notice]
 
 
 
  Is this a known issue, already fixed in a later build, or should I bug it?
 
 
  It shouldn't panic the machine, no. I'd raise a bug.
 
 
  After spending a little time playing with iscsi, I have to say it's
  almost inevitable that someone is going to do this by accident and panic
  a big box for what I see as no good reason. (though I'm happy to be
  educated... ;)
 
 
  You use ACLs and TPGT groups to ensure 2 hosts can't simultaneously
  access the same LUN by accident. You'd have the same problem with
  Fibre Channel SANs.
 
 I ran into similar problems when replicating via AVS.

 benr.
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] When I stab myself with this knife, it hurts... But - should it kill me?

2007-10-04 Thread Victor Engle
 Perhaps it's the same cause, I don't know...

 But I'm certainly not convinced that I'd be happy with a 25K, for
 example, panicing just because I tried to import a dud pool...

 I'm ok(ish) with the panic on a failed write to a non-redundant storage.
 I expect it by now...


I agree, forcing a panic seems to be pretty severe and may cause as
much grief as it prevents. Why not just stop allowing I/O to the pool
so the sys admin can gracefully shut down the system? Applications
would be disrupted but no more so than they would be disrupted during
a panic.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Again ZFS with expanding LUNs!

2007-09-12 Thread Victor Engle
I like option #1 because it is simple and quick. It seems unlikely
that this will lead to an excessive number of luns in the pool in most
cases unless you start with a large number of very small luns. If you
begin with 5 100GB luns and over time add 5 more it still seems like a
reasonable and manageable pool with twice the original capacity.

And considering the array can likely support hundreds and perhaps
thousands of luns, it really isn't an issue on the array side
either.
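
For what it's worth, growing the pool that way is a one-liner per new LUN
(pool and device names below are placeholders):

# zpool add mypool c4t<new-LUN-GUID>d0
# zpool list mypool

The extra capacity shows up immediately and every filesystem in the pool can
use it.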

Regards,
Vic

On 9/12/07, Bill Korb [EMAIL PROTECTED] wrote:
 I found this discussion just today as I recently set up my first S10 machine 
 with ZFS. We use a NetApp Filer via multipathed FC HBAs, and I wanted to know 
 what my options were in regards to growing a ZFS filesystem.

 After looking at this thread, it looks like there is currently no way to grow 
 an existing LUN on our NetApp and then tell ZFS to expand to fill the new 
 space. This may be coming down the road at some point, but I would like to be 
 able to do this now.

 At this point, I believe I have two options:

 1. Add a second LUN and simply do a zpool add to add the new space to the 
 existing pool.

 2. Create a new LUN that is the size I would like my pool to be, then use 
 zpool replace oldLUNdev newLUNdev to ask ZFS to resilver my data to the new 
 LUN then detach the old one.

 The advantage of the first option is that it happens very quickly, but it 
 could get kind of messy if you grow the ZFS pool on multiple occasions. I've 
 read that some SANs are also limited as to how many LUNs can be created (some 
 are limitations of the SAN itself whereas I believe that some others impose a 
 limit as part of the SAN license). That would also make the first approach 
 less attractive.

 The advantage of the second approach is that all of the space would be 
 contained in a single LUN. The disadvantages are that this would involve 
 copying all of the data from the old LUN to the new one and also this means 
 that you need to have enough free space on your SAN to create this new, 
 larger LUN.

 Is there a best practice regarding this? I'm leaning towards option #2 so as 
 to keep the number of LUNs I have to manage at a minimum, but #1 seems like a 
 reasonable alternative, too. Or perhaps there's an option #3 that I haven't 
 thought of?

 Thanks,
 Bill


 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Single SAN Lun presented to 4 Hosts

2007-08-25 Thread Victor Engle
On 8/25/07, Matt B [EMAIL PROTECTED] wrote:
 the 4 database servers are part of an Oracle RAC configuration. 3 databases 
 are hosted on these servers, BIGDB1 on all 4, littledb1 on the first 2, and 
 littledb2 on the last two. The oracle backup system spawns db backup jobs 
 that could occur on any node based on traffic and load. All nodes are fiber 
 attached to a SAN. They all have FC access to the same set of SAN disks where 
 the nightly dumps must go. The plan all along was to save the gigE network 
 for network traffic and have the nightly backups occur over the dedicated fc 
 network.


Matt,

Can you just alter the backup job that oracle spawns to import the
pool then do the backup and finally export the pool?
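
Something along these lines, as a sketch (the pool name and the backup step
are placeholders for whatever the job actually runs):

#!/bin/sh
# run on whichever node the backup job lands on
zpool import dumppool || exit 1
do_db_dump /dumppool/nightly      # placeholder for the real dump command
zpool export dumppool

The one thing to be careful of is that the pool is never imported on more
than one node at a time, so the job should bail out if the import fails
rather than forcing it.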

Regards,
Vic
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Again ZFS with expanding LUNs!

2007-08-07 Thread Victor Engle
I can understand lun expansion capability being an issue with a more
traditional volume manager or even a single lun, but with pooled
storage and the ability to expand the pool, benefiting all filesystems
in the pool, it seems a shame to consider lun expansion a show stopper.

Even so, having all the details automated and transparent to the
administrator would be very cool.

Regards,
Vic


On 8/7/07, George Wilson [EMAIL PROTECTED] wrote:
 I'm planning on putting back the changes to ZFS into Opensolaris in
 upcoming weeks. This will still require a manual step as the changes
 required in the sd driver are still under development.

 The ultimate plan is to have the entire process totally automated.

 If you have more questions, feel free to drop me a line.

 Thanks,
 George

 Yan wrote:
  Hey David
  I might need to track the evolution of that size-change utility to ZFS;
  could I have a contact at Sun that would be able to give me more
  information on that?
  Being able to resize LUNs dynamically is a reality here; I currently do it
  with UFS after an EMC Clariion LUN Migration to a larger LUN
 
  That is our current show-stopper to using ZFS
  thanks
  Yannick
 
 
  This message posted from opensolaris.org
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Again ZFS with expanding LUNs!

2007-08-07 Thread Victor Engle
Hi Yannick,

Just to be sure I understand the restriction: with the Clariion you
are limited in host-side volume management, so basically you use
single luns with ufs filesystems on them, and if you need additional
space in the ufs filesystem the only option is to resize the lun on
the Clariion, rewrite the vtoc on the lun and then growfs? That seems
like a significant limitation, especially in a very dynamic,
storage-centric environment. Just curious, any idea how much performance
suffers on the host if write coalescing is unusable?
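
For comparison, the UFS workflow you describe would be roughly this after the
array has grown the LUN (slice and mount point are placeholders):

# format -e
    (relabel the lun so Solaris sees the new size)
# growfs -M /data /dev/rdsk/c2t0d0s0

versus a zpool that just picks up new capacity when a device is added.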

Regards,
Vic



On 8/7/07, Yannick Mercier [EMAIL PROTECTED] wrote:
 From a storage administrator perspective, when doing capacity planning
 on a EMC Clariion Unit, it becomes a pain to have more than one LUN
 for the same volume. The Clariion with Navisphere agent gives the
 storage administrator more visibility in the storage management
 interface, showing mountpoints on the hosts for each LUN.
 The EMC Clariion storage best practices recommend using one LUN per volume.
 The write coalescing feature may be unusable if using more than one lun
 per volume when striped in ZFS.

 Yannick

 On 8/7/07, Victor Engle [EMAIL PROTECTED] wrote:
   I can understand lun expansion capability being an issue with a more
   traditional volume manager or even a single lun, but with pooled
   storage and the ability to expand the pool, benefiting all filesystems
   in the pool, it seems a shame to consider lun expansion a show stopper.
 
  Even so, having all the details automated and transparent to the
  administrator would be very cool.
 
  Regards,
  Vic
 
 
  On 8/7/07, George Wilson [EMAIL PROTECTED] wrote:
   I'm planning on putting back the changes to ZFS into Opensolaris in
   upcoming weeks. This will still require a manual step as the changes
   required in the sd driver are still under development.
  
   The ultimate plan is to have the entire process totally automated.
  
   If you have more questions, feel free to drop me a line.
  
   Thanks,
   George
  
   Yan wrote:
Hey David
 I might need to track the evolution of that size-change utility to ZFS;
 could I have a contact at Sun that would be able to give me more
 information on that?
 Being able to resize LUNs dynamically is a reality here; I currently do
 it with UFS after an EMC Clariion LUN Migration to a larger LUN
   
That is our current show-stopper to using ZFS
thanks
Yannick
   
   
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   ___
   zfs-discuss mailing list
   zfs-discuss@opensolaris.org
   http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Victor Engle

On 6/20/07, Torrey McMahon [EMAIL PROTECTED] wrote:
Also, how does replication at the ZFS level use more storage - I'm
assuming raw block - than at the array level?
___



Just to add to the previous comments. In the case where you have a SAN
array providing storage to a host for use with ZFS the SAN storage
really needs to be redundant in the array AND the zpools need to be
redundant pools.

The reason the SAN storage should be redundant is that SAN arrays are
designed to serve logical units. The logical units are usually
allocated from a raid set, storage pool or aggregate of some kind. The
array side pool/aggregate may include 10 300GB disks and may have 100+
luns allocated from it, for example. If redundancy is not used in the
array side pool/aggregate, then 1 disk failure will kill 100+ luns
at once.

On 6/20/07, Torrey McMahon [EMAIL PROTECTED] wrote:

James C. McPherson wrote:
 Roshan Perera wrote:

 But Roshan, if your pool is not replicated from ZFS' point of view,
 then all the multipathing and raid controller backup in the world will
 not make a difference.

 James, I agree from the ZFS point of view. However, from the EMC or the
 customer point of view they want to do the replication at the EMC level
 and not from ZFS. By replicating at the ZFS level they will lose some
 storage and it's doubling the replication. It's just that the customer is
 used to working with Veritas and UFS and they don't want to change their
 habits.
 I just have to convince the customer to use ZFS replication.

 Hi Roshan,
 that's a great shame because if they actually want
 to make use of the features of ZFS such as replication,
 then they need to be serious about configuring their
 storage to play in the ZFS world and that means
 replication that ZFS knows about.


Also, how does replication at the ZFS level use more storage - I'm
assuming raw block - than at the array level?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS - SAN and Raid

2007-06-19 Thread Victor Engle

Roshan,

Could you provide more detail please. The host and zfs should be
unaware of any EMC array side replication so this sounds more like an
EMC misconfiguration than a ZFS problem. Did you look in the messages
file to see if anything happened to the devices that were in your
zpools? If so then that wouldn't be a zfs error. If your EMC devices
fall offline because of something happening on the array or fabric
then zfs is not to blame. The same thing would have happened with any
other filesystem built on those devices.

What kind of pools were in use, raidz, mirror or simple stripe?

Regards,
Vic




On 6/19/07, Roshan Perera [EMAIL PROTECTED] wrote:

Hi All,

We have come across a problem at a client where ZFS brought the system down
with a write error on an EMC device due to mirroring being done at the EMC
level and not in ZFS. The client is totally EMC committed and not too happy to
use ZFS for mirroring/RAID-Z. I have seen the notes below about ZFS and
SAN-attached devices and understand the ZFS behaviour.

Can someone help me with the following Questions:

Is this the way ZFS will work in the future?
Is there going to be any compromise to allow SAN RAID and ZFS to do the rest?
If so, when, and if possible, the details of it?


Many Thanks

Rgds

Roshan

Does ZFS work with SAN-attached devices?

 Yes, ZFS works with either direct-attached devices or SAN-attached
 devices. However, if your storage pool contains no mirror or RAID-Z
 top-level devices, ZFS can only report checksum errors but cannot
 correct them. If your storage pool consists of mirror or RAID-Z
 devices built using storage from SAN-attached devices, ZFS can report
 and correct checksum errors.

 This says that if we are not using ZFS raid or mirror then the
 expected event would be for ZFS to report but not fix the error. In
 our case the system kernel panicked, which is something different. Is
 the FAQ wrong or is there a bug in ZFS?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: ZFS - SAN and Raid

2007-06-19 Thread Victor Engle

Roshan,

As far as I know, there is no problem at all with using SAN storage
with ZFS and it does look like you were having an underlying problem
with either powerpath or the array.

The best practices guide on opensolaris does recommend replicated
pools even if your backend storage is redundant. There are at least 2
good reasons for that. ZFS needs a replica for the self-healing
feature to work. Also, there is no fsck-like tool for ZFS, so it is a
good idea to make sure self-healing can work.
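
As a sketch of what that means in practice (pool and device names are
placeholders for your array LUNs):

# zpool create sanpool mirror emcpower0a emcpower1a

or, at minimum, keep two copies of the data blocks on an existing
non-redundant pool (this only applies to data written after it is set):

# zfs set copies=2 sanpool

With a replica available, a block that fails its checksum can be repaired
from the other copy.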

I think first I would track down the cause of the messages just prior
to the zfs write error because even with replicated pools if several
devices error at once then the pool could be lost.

Regards,
Vic


On 6/19/07, Roshan Perera [EMAIL PROTECTED] wrote:

Victor,
Thanks for your comments, but I believe it contradicts the ZFS information given 
below and now Bruce's mail.
After some digging around I found that the messages file has thrown out some 
PowerPath errors to one of the devices that may have caused the problem; 
attached below are the errors. But the question still remains: is ZFS only happy 
with JBOD disks and not SAN storage with hardware raid? Thanks
Roshan


Jun  4 16:30:09 su621dwdb ltid[23093]: [ID 815759 daemon.error] Cannot start 
rdevmi pr
ocess for remote shared drive operations on host su621dh01, cannot connect to 
vmd
Jun  4 16:30:12 su621dwdb emcp: [ID 801593 kern.notice] Info: Assigned volume 
Symm 000
290100491 vol 0ffe to
Jun  4 16:30:12 su621dwdb last message repeated 1 time
Jun  4 16:30:12 su621dwdb emcp: [ID 801593 kern.notice] Info: Assigned volume 
Symm 000
290100491 vol 0fee to
Jun  4 16:30:12 su621dwdb unix: [ID 836849 kern.notice]
Jun  4 16:30:12 su621dwdb ^Mpanic[cpu550]/thread=2a101dd9cc0:
Jun  4 16:30:12 su621dwdb unix: [ID 809409 kern.notice] ZFS: I/O failure (write on 
un
known off 0: zio 600574e7500 [L0 unallocated] 4000L/400P DVA[0]=5:55c00:400 
DVA[1]=
6:2b800:400 fletcher4 lzjb BE contiguous birth=107027 fill=0 
cksum=673200f97f:34804a
0e20dc:102879bdcf1d13:3ce1b8dac7357de): error 5
Jun  4 16:30:12 su621dwdb unix: [ID 10 kern.notice]
Jun  4 16:30:12 su621dwdb genunix: [ID 723222 kern.notice] 02a101dd9740 
zfs:zio_do
ne+284 (600574e7500, 0, a8, 708fdca0, 0, 6000f26cdc0)
Jun  4 16:30:12 su621dwdb genunix: [ID 179002 kern.notice]   %l0-3: 
060015beaf00 0
000708fdc00 0005 0005














 We have the same problem and I have just moved back to UFS because of
 this issue. According to the engineer at Sun that i spoke with, he
 implied that there is an RFE out internally that is to address
 this problem.

 The issue is this:

 When configuring a zpool with 1 vdev in it and zfs times out a write
 operation to the pool/filesystem for whatever reason, possibly just a
 hold back or retryable error, the zfs module will cause a system panic
 because it thinks there are no other mirrors in the pool to write to
 and forces a kernel panic.

 The way around this is to configure the zpools with mirrors, which
 negates the use of a hardware raid array and sends twice the amount of
 data down to the RAID cache than is actually required (because of the
 mirroring at the ZFS layer). In our case it was a little old Sun
 StorEdge 3511 FC SATA Array, but the principle applies to any RAID
 array that is not configured as a JBOD.



 Victor Engle wrote:
  Roshan,
 
  Could you provide more detail please. The host and zfs should be
  unaware of any EMC array side replication so this sounds more
 like an
  EMC misconfiguration than a ZFS problem. Did you look in the
 messages file to see if anything happened to the devices that
 were in your
  zpools? If so then that wouldn't be a zfs error. If your EMC devices
  fall offline because of something happening on the array or fabric
  then zfs is not to blame. The same thing would have happened
 with any
  other filesystem built on those devices.
 
  What kind of pools were in use, raidz, mirror or simple stripe?
 
  Regards,
  Vic
 
 
 
 
  On 6/19/07, Roshan Perera [EMAIL PROTECTED] wrote:
  Hi All,
 
  We have come across a problem at a client where ZFS brought the
 system down with a write error on an EMC device due to mirroring being
 done at the EMC level and not in ZFS. The client is totally EMC committed
 and not too happy to use ZFS for mirroring/RAID-Z. I have seen the notes
 below about ZFS and SAN-attached devices and understand the ZFS behaviour.
 
  Can someone help me with the following Questions:
 
  Is this the way ZFS will work in the future ?
  is there going to be any compromise to allow SAN Raid and ZFS
 to do
  the rest.
  If so when and if possible details of it ?
 
 
  Many Thanks
 
  Rgds
 
  Roshan
 
  ZFS work with SAN-attached devices?
  
   Yes, ZFS works with either direct-attached devices or SAN-
 attached  devices. However, if your storage pool contains no
 mirror or RAID-Z
   top-level devices, ZFS can only report checksum errors but cannot
   correct them. If your storage pool

Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-19 Thread Victor Engle


 The best practices guide on opensolaris does recommend replicated
 pools even if your backend storage is redundant. There are at least 2
 good reasons for that. ZFS needs a replica for the self healing
 feature to work. Also there is no fsck like tool for ZFS so it is a
 good idea to make sure self healing can work.




NB. fsck is not needed for ZFS because the on-disk format is always
consistent.  This is orthogonal to hardware faults.



I understand that the on disk state is always consistent but the self
healing feature can correct blocks that have bad checksums if zfs is
able to retrieve the block from a good replica. So even though the
filesystem is consistent, the data can be corrupt in non-redundant
pools. I am unsure of what happens with a non-redundant pool when a
block has a bad checksum, and perhaps you could clear that up. Does
this cause a problem for the pool, or is it limited to the file or
files affected by the bad block while the pool otherwise stays online and
healthy?

Thanks,
Vic
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual IP Integration

2007-06-15 Thread Victor Engle

Well I suppose complexity is relative. Still, to use Sun Cluster at
all I have to install the cluster framework on each node, correct? And
even before that I have to install an interconnect with 2 switches
unless I direct connect a simple 2 node cluster.

My thinking was that ZFS seems to try and bundle all storage related
tasks into 1 simple interface including making vfstab and dfstab
entries unnecessary and considered legacy wrt ZFS. If I am using ZFS
only to serve storage via IP then the only component I'm forced to
manage outside of ZFS is the IP and if that's really all I want then
it does seem like overkill to install, configure and administer sun
cluster framework on even 2 nodes.

I'm not really thinking about an application where I really need Sun
Cluster-like availability. Just the convenience factor of being able
to export a pool to another system if I need to do maintenance or
patching or whatever without having to go configure the other system.
As it is now, the only thing I might need to do is go bring up the
virtual IP on the system I import the pool to.

A good example would be maybe a system where I keep jumpstart images.
I really don't need HA for it but simple administration is always a
plus.

It's an easy enough task to script I suppose but it occurred to me
that it would be very convenient to have this task builtin to ZFS.
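
The sort of wrapper I have in mind, as a sketch (pool name, interface and
address are placeholders):

#!/bin/sh
# bring the pool and its service IP up on this node
zpool import mypool || exit 1
ifconfig nge0 addif 192.168.10.50 netmask 255.255.255.0 up
# and the reverse to move it away:
#   ifconfig nge0 removeif 192.168.10.50
#   zpool export mypool

Not HA by any stretch, but it keeps a by-hand failover down to two commands
per node.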

Regards,
Vic


On 6/15/07, Richard Elling [EMAIL PROTECTED] wrote:

Vic Engle wrote:
 Has there been any discussion here about the idea integrating a virtual IP 
into ZFS. It makes sense to me because of the integration of NFS and iSCSI with 
the sharenfs and shareiscsi properties. Since these are both dependent on an IP it 
would be pretty cool if there was also a virtual IP that would automatically move 
with the pool.

 Maybe something like zfs set ip.nge0=x.x.x.x mypool

 Or since we may have different interfaces on the nodes where we want to move 
the zpool...

 zfs set ip.server1.nge0=x.x.x.x mypool
 zfs set ip.server2.bge0=x.x.x.x mypool

 I know this could be handled with Sun Cluster but if I am only building a 
simple storage appliance to serve NFS and iSCSI along with CIFS via SAMBA then I 
don't want or need the overhead and complexity of Sun Cluster.

Overhead?

The complexity of a simple HA storage service is quite small.
The complexity arises when you have multiple dependencies where
various applications depend on local storage and other applications.
(think SMF, but spread across multiple OSes).  For a simple
relationship such as storage--ZFS--share, there isn't much complexity.

Reinventing the infrastructure needed to manage access in the
face of failures is a distinctly non-trivial task.  You can
even begin with a single node cluster, though a virtual IP on a
single node cluster isn't very interesting.
  -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss