[zfs-discuss] ZFS error handling - suggestion

2008-02-18 Thread Adrian Saul
Howdy,
 I have several times had issues with consumer-grade PC hardware and ZFS not 
getting along.  The problem is not the disks but the fact that I don't have ECC 
memory or end-to-end checking on the datapath.  What happens is that random 
memory errors and bit flips get written out to disk, and when the data is read 
back ZFS reports it as a checksum failure:

  pool: myth
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        myth        ONLINE       0     0    48
          raidz1    ONLINE       0     0    48
            c7t1d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

/myth/tv/1504_20080216203700.mpg
/myth/tv/1509_20080217192700.mpg
 
Note that there are no per-disk errors, only errors against the raidz vdev as a 
whole.  I get the same thing on a mirror pool, where both sides of the mirror 
have identical errors.  All I can assume is that the data was corrupted after 
the checksum was calculated and was flushed to disk like that.  In the past the 
cause was a motherboard capacitor that had popped - but it was enough to 
generate these errors under load.

At any rate, ZFS is doing the right thing by telling me - what I don't like is 
that from that point on I can't convince ZFS to ignore it.  The data in question 
is video files - a bit flip here or there won't matter.  But if ZFS reads the 
affected block it returns an I/O error, and until I restore the file I have no 
option but to try to make the application skip over it.  If it was UFS, for 
example, I would never have known, but ZFS makes a point of stopping anything 
using the data - understandably, but annoyingly as well.
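
(As a stop-gap I can at least try to salvage the readable records by hand with 
dd's noerror/sync conversions - a sketch only, with an example output path, and 
I have not verified how gracefully Solaris dd steps past the failing offsets:

  # copy what is readable, zero-padding any record that returns EIO
  dd if=/myth/tv/1504_20080216203700.mpg of=/var/tmp/1504_salvaged.mpg \
      bs=128k conv=noerror,sync

Most players seem to cope with zero-filled gaps, but it is hardly elegant.)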

What I would like to see is an option to ZFS in the style of 'onerror' for 
UFS, i.e. the ability to tell ZFS to join fight club - let what doesn't matter 
truly slide.  For example:

zfs set erroraction=[iofail|log|ignore]

This would default to the current behaviour of iofail, but in the event you 
wanted to try to recover or repair data you could set log, to say generate an 
FMA event recording the bad checksums, or ignore, to get on with your day.
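
To make it concrete, usage might look something like this (entirely 
hypothetical - no such property exists today, and the dataset and backup path 
are just examples from my setup):

  zfs set erroraction=log myth/tv       # log an FMA event per bad checksum, but return the data
  cp /myth/tv/1504_20080216203700.mpg /backup/   # salvage the recording
  zfs set erroraction=iofail myth/tv    # back to the strict default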

As mentioned, I see this as mostly an option to help repair data after the 
issue is identified or repaired.  Of course it's data-specific, but if the 
application can allow it or handle it, why should ZFS get in the way?

Just a thought.

Cheers,
  Adrian

PS: And yes, I am now buying some ECC memory.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance with Sun StorageTek 2540

2008-02-18 Thread Ralf Ramge
Mertol Ozyoney wrote:

 2540 controller can achieve a maximum of 250 MB/sec on writes on the first 
 12 drives. So you are pretty close to maximum throughput already.

 RAID 5 can be a little bit slower.


I'm a bit irritated now. I have ZFS running for some Sybase ASE 12.5 
databases using X4600 servers (8x dual core, 64 GB RAM, Solaris 10 
11/06) and 4 GBit/s lowest-cost Infortrend Fibre Channel JBODs with a 
total of 4x 16 FC drives imported into a single mirrored zpool. I 
benchmarked them with tiobench, using a file size of 64 GB and 32 
parallel threads. With an untweaked ZFS the average throughput I got 
was: sequential and random read over 1 GB/s, sequential write 296 MB/s, random 
write 353 MB/s, for a total of approx. 650,000 IOPS with a maximum latency 
under 350 ms after the databases went into production; the bottleneck is 
basically the FC HBAs. These are averages; the peaks flatline once they reach 
the 4 GBit/s Fibre Channel maximum capacity.
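
For the record, the benchmark invocation was roughly of this shape - exact 
option names and semantics vary a little between tiobench versions, and the 
target directory is only an example:

  tiobench.pl --dir /pool/bench --size 65536 --threads 32 --numruns 3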

I'm a bit disturbed because I'm thinking about switching to 2530/2540 
shelves, but a maximum of 250 MB/sec would disqualify them instantly, even 
with individual RAID controllers for each shelf. So my question is: can 
I do the same thing I did with the IFT shelves - can I buy only 2501 
JBODs and attach them directly to the server, thus *not* using the 2540 
RAID controller while still having access to the individual drives?

I'm quite nervous about this, because I'm not just talking about a 
single database - I'd need a total of 42 shelves, and I'm pretty 
sure Sun doesn't offer Try & Buy deals at that scale.

-- 

Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963 
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484

Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, 
Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Norbert Lang, 
Achim Weiss 
Aufsichtsratsvorsitzender: Michael Scheeren

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance with Sun StorageTek 2540

2008-02-18 Thread Bob Friesenhahn
On Mon, 18 Feb 2008, Ralf Ramge wrote:
 I'm a bit disturbed because I'm thinking about switching to 2530/2540
 shelves, but a maximum of 250 MB/sec would disqualify them instantly, even

Note that this is single-file/single-thread I/O performance. I suggest 
that you read the formal benchmark report for this equipment since it 
covers multi-thread I/O performance as well.  The multi-user 
performance is considerably higher.
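
If you want a quick-and-dirty feel for the multi-stream case before digging 
into the report, a few parallel writers are enough (a rough sketch - the pool 
path and sizes are only examples):

  # four concurrent 8 GB sequential writers against the pool under test
  for i in 1 2 3 4; do
    dd if=/dev/zero of=/tank/stream.$i bs=1024k count=8192 &
  done
  wait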

Given ZFS's smarts, the JBOD approach seems like a good one as long as 
the hardware provides a non-volatile cache.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance with Sun StorageTek 2540

2008-02-18 Thread Roch - PAE

Bob Friesenhahn writes:

  On Fri, 15 Feb 2008, Roch Bourbonnais wrote:
   What was the interlace on the LUN ?
  
   The question was about LUN interlace, not interface.
   128K to 1M works better.
  
  The segment size is set to 128K.  The max the 2540 allows is 512K. 
  Unfortunately, the StorageTek 2540 and CAM documentation does not 
  really define what segment size means.
  
   Any compression ?
  
  Compression is disabled.
  
   Does turning off checksums help the numbers (that would point to a CPU-limited 
   throughput)?
  
  I have not tried that but this system is loafing during the benchmark. 
  It has four 3GHz Opteron cores.
  
  Does this output from 'iostat -xnz 20' help to understand issues?
  
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    3.0    0.7   26.4     3.5  0.0  0.0    0.0    4.2   0   2 c1t1d0
    0.0  154.2    0.0 19680.3  0.0 20.7    0.0  134.2   0  59 c4t600A0B80003A8A0B096147B451BEd0
    0.0  211.5    0.0 26940.5  1.1 33.9    5.0  160.5  99 100 c4t600A0B800039C9B50A9C47B4522Dd0
    0.0  211.5    0.0 26940.6  1.1 33.9    5.0  160.4  99 100 c4t600A0B800039C9B50AA047B4529Bd0
    0.0  154.0    0.0 19654.7  0.0 20.7    0.0  134.2   0  59 c4t600A0B80003A8A0B096647B453CEd0
    0.0  211.3    0.0 26915.0  1.1 33.9    5.0  160.5  99 100 c4t600A0B800039C9B50AA447B4544Fd0
    0.0  152.4    0.0 19447.0  0.0 20.5    0.0  134.5   0  59 c4t600A0B80003A8A0B096A47B4559Ed0
    0.0  213.2    0.0 27183.8  0.9 34.1    4.2  159.9  90 100 c4t600A0B800039C9B50AA847B45605d0
    0.0  152.5    0.0 19453.4  0.0 20.5    0.0  134.5   0  59 c4t600A0B80003A8A0B096E47B456DAd0
    0.0  213.2    0.0 27177.4  0.9 34.1    4.2  159.9  90 100 c4t600A0B800039C9B50AAC47B45739d0
    0.0  213.2    0.0 27195.3  0.9 34.1    4.2  159.9  90 100 c4t600A0B800039C9B50AB047B457ADd0
    0.0  154.4    0.0 19711.8  0.0 20.7    0.0  134.0   0  59 c4t600A0B80003A8A0B097347B457D4d0
    0.0  211.3    0.0 26958.6  1.1 33.9    5.0  160.6  99 100 c4t600A0B800039C9B50AB447B4595Fd0
  

Interesting that a subset of 5 disks is responding faster
(which also leads to smaller actv queues and so lower
service times) than the 7 others.

And the slow ones are subject to more writes... haha.

If the sizes of the LUNs are different (or they have different
amounts of free blocks), then maybe ZFS is now trying to rebalance
free space by targeting the subset of disks with more
new data.  Pool throughput will be impacted by this.
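
One quick way to check that hypothesis is to compare per-LUN allocation while 
the load runs (a sketch - substitute the real pool name for 'tank'):

  zpool list tank            # overall size / used / free for the pool
  zpool iostat -v tank 20    # per-LUN alloc/free plus ongoing read/write rates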


-r





  Bob
  ==
  Bob Friesenhahn
  [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
  GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
  
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS error handling - suggestion

2008-02-18 Thread Richard Elling
comment below...

Adrian Saul wrote:
 Howdy,
  I have several times had issues with consumer-grade PC hardware and ZFS 
 not getting along.  The problem is not the disks but the fact that I don't have 
 ECC memory or end-to-end checking on the datapath.  What happens is that random 
 memory errors and bit flips get written out to disk, and when the data is read 
 back ZFS reports it as a checksum failure:

   pool: myth
  state: ONLINE
 status: One or more devices has experienced an error resulting in data
 corruption.  Applications may be affected.
 action: Restore the file in question if possible.  Otherwise restore the
 entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
  scrub: none requested
 config:

  NAME        STATE     READ WRITE CKSUM
  myth        ONLINE       0     0    48
    raidz1    ONLINE       0     0    48
      c7t1d0  ONLINE       0     0     0
      c7t3d0  ONLINE       0     0     0
      c6t1d0  ONLINE       0     0     0
      c6t2d0  ONLINE       0     0     0

 errors: Permanent errors have been detected in the following files:

 /myth/tv/1504_20080216203700.mpg
 /myth/tv/1509_20080217192700.mpg
  
 Note that there are no per-disk errors, only errors against the raidz vdev as a 
 whole.  I get the same thing on a mirror pool, where both sides of the mirror 
 have identical errors.  All I can assume is that the data was corrupted after 
 the checksum was calculated and was flushed to disk like that.  In the past the 
 cause was a motherboard capacitor that had popped - but it was enough to 
 generate these errors under load.

 At any rate, ZFS is doing the right thing by telling me - what I don't like is 
 that from that point on I can't convince ZFS to ignore it.  The data in 
 question is video files - a bit flip here or there won't matter.  But if ZFS 
 reads the affected block it returns an I/O error, and until I restore the 
 file I have no option but to try to make the application skip over it.  If 
 it was UFS, for example, I would never have known, but ZFS makes a point of 
 stopping anything using the data - understandably, but annoyingly as well.

 What I would like to see is an option to ZFS in the style of 'onerror' 
 for UFS, i.e. the ability to tell ZFS to join fight club - let what doesn't 
 matter truly slide.  For example:

 zfs set erroraction=[iofail|log|ignore]

 This would default to the current behaviour of iofail, but in the event you 
 wanted to try to recover or repair data you could set log, to say generate 
 an FMA event recording the bad checksums, or ignore, to get on with your day.

 As mentioned, I see this as mostly an option to help repair data after the 
 issue is identified or repaired.  Of course it's data-specific, but if the 
 application can allow it or handle it, why should ZFS get in the way?

 Just a thought.

 Cheers,
   Adrian

 PS: And yes, I am now buying some ECC memory.
   

I don't recall when this arrived in NV, but the failmode parameter
for storage pools has already been implemented.  From zpool(1M):

     failmode=wait | continue | panic

         Controls the system behavior in the event of catastrophic
         pool failure.  This condition is typically a result of a
         loss of connectivity to the underlying storage device(s)
         or a failure of all devices within the pool.  The behavior
         of such an event is determined as follows:

         wait        Blocks all I/O access until the device
                     connectivity is recovered and the errors are
                     cleared.  This is the default behavior.

         continue    Returns EIO to any new write I/O requests but
                     allows reads to any of the remaining healthy
                     devices.  Any write requests that have yet to
                     be committed to disk would be blocked.

         panic       Prints out a message to the console and
                     generates a system crash dump.
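
For example, on the pool from the original post (property name and values 
straight from the man page excerpt above):

     zpool get failmode myth
     zpool set failmode=continue myth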

 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS error handling - suggestion

2008-02-18 Thread Joe Peterson
Richard Elling wrote:
 Adrian Saul wrote:
 Howdy, I have several times had issues with consumer-grade PC
 hardware and ZFS not getting along.  The problem is not the disks
 but the fact that I don't have ECC memory or end-to-end checking on the
 datapath.  What happens is that random memory errors and bit
 flips get written out to disk, and when the data is read back ZFS
 reports it as a checksum failure:
 
    pool: myth
   state: ONLINE
  status: One or more devices has experienced an error resulting in data
          corruption.  Applications may be affected.
  action: Restore the file in question if possible.  Otherwise restore the
          entire pool from backup.
     see: http://www.sun.com/msg/ZFS-8000-8A
   scrub: none requested
  config:

          NAME        STATE     READ WRITE CKSUM
          myth        ONLINE       0     0    48
            raidz1    ONLINE       0     0    48
              c7t1d0  ONLINE       0     0     0
              c7t3d0  ONLINE       0     0     0
              c6t1d0  ONLINE       0     0     0
              c6t2d0  ONLINE       0     0     0

  errors: Permanent errors have been detected in the following files:

          /myth/tv/1504_20080216203700.mpg
          /myth/tv/1509_20080217192700.mpg
 
 Note that there are no per-disk errors, only errors against the raidz
 vdev as a whole.  I get the same thing on a mirror pool, where both
 sides of the mirror have identical errors.  All I can assume is that the
 data was corrupted after the checksum was calculated and was flushed to
 disk like that.  In the past the cause was a motherboard capacitor that
 had popped - but it was enough to generate these errors under load.

I got a similar CKSUM error recently in which a block from a different
file ended up in one of my files.  So it was not a simple bit-flip -
64K of the file was bad.  However, I do not think any disk
filesystem should tolerate even bit flips.  Even in video files, I'd
want to know that the corruption happened.

I hacked the ZFS source to temporarily ignore the error so I could see
what was wrong.  So your error(s) might be something of this kind
(except, if so, I do not understand how both sides of your mirror were
affected in the same way - do you know this for certain, or did ZFS simply
say that the file was not recoverable, i.e. it might have had different
bad bits on the two sides of the mirror?).

For me, at least on subsequent reboots, no read or write errors were
reported either, just CKSUM (I do seem to recall other errors being
listed - read or write - but they were cleared on reboot, so I cannot
recall exactly).  And I would think it is possible to get no errors if
it is simply a misdirected block write.  Still, I would then wonder why I
didn't see *2* files with errors, if this is what happened to me.  I
guess I am saying that this may not be a memory glitch; it could also
be an IDE cable issue (as mine turned out to be).  See my post here:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-February/040355.html

 At any rate, ZFS is doing the right thing by telling me - what I
 don't like is that from that point on I can't convince ZFS to ignore
 it.  The data in question is video files - a bit flip here or there
 won't matter.  But if ZFS reads the affected block it returns an
 I/O error, and until I restore the file I have no option but to try
 to make the application skip over it.  If it was UFS, for example, I
 would never have known, but ZFS makes a point of stopping anything
 using the data - understandably, but annoyingly as well.

I understand your situation, and I agree that user control might be nice
(in my case, I would not have had to tweak the ZFS code).  I do think
that zpool status should still reveal the error, however, even if the
file read does not report it (when ZFS is set to ignore the error).
I can also imagine this could be a bit dangerous if, e.g., the user
forgets that the option is set.

 PS: And yes, I am now buying some ECC memory.

Good practice in general - I always use ECC.  There is nothing worse
than silent data corruption.

 I don't recall when this arrived in NV, but the failmode parameter
 for storage pools has already been implemented.  From zpool(1m)
  failmode=wait | continue | panic

      Controls the system behavior in the event of catastrophic
      pool failure.  This condition is typically a result of a
      loss of connectivity to the underlying storage device(s)
      or a failure of all devices within the pool.  The behavior
      of such an event is determined as follows:

      wait        Blocks all I/O access until the device
                  connectivity is recovered and the errors are
                  cleared.  This is the default behavior.

      continue    Returns EIO to any new write I/O requests but
                  allows reads to any of the remaining healthy
                  devices.  Any write requests that have yet to
                  be committed to disk would be blocked.

      panic       Prints out a message to the console and
                  generates a system crash dump.

Is wait the default behavior now?  When I had CKSUM errors, reading
the file would return EIO and stop reading at that point (returning only
the good data so far).  Do you mean it blocks access on the errored
file, or on the whole device?  I've noticed the former, but not the latter.

Re: [zfs-discuss] ZFS error handling - suggestion

2008-02-18 Thread Eric Schrock
On Mon, Feb 18, 2008 at 11:52:48AM -0700, Joe Peterson wrote:
 
 Is wait the default behavior now?  When I had CKSUM errors, reading
 the file would return EIO and stop reading at that point (returning only
 the good data so far).  Do you mean it blocks access on the errored
 file, or on the whole device?  I've noticed the former, but not the latter.

The 'failmode' property only applies when writes fail, or
read-during-write dependies, such as the spacemaps.  It does not affect
normal reads.

- Eric

--
Eric Schrock, Fishworks    http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS error handling - suggestion

2008-02-18 Thread Eric Schrock
On Mon, Feb 18, 2008 at 11:15:34AM -0800, Eric Schrock wrote:
 
 The 'failmode' property only applies when writes fail, or
 read-during-write dependies, such as the spacemaps.  It does not affect
^

That should read 'dependencies', obviously ;-)

- Eric

--
Eric Schrock, Fishworks    http://blogs.sun.com/eschrock
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] vxfs vs ufs vs zfs

2008-02-18 Thread Richard L. Hamilton
 Hello,

 I have just done a comparison of all the above filesystems
 using the latest filebench.  If you are interested:
 http://przemol.blogspot.com/2008/02/zfs-vs-vxfs-vs-ufs-on-x4500-thumper.html

 Regards
 przemol

I would think there'd be a lot more variation based on workload,
such that the overall comparison may fall far short of telling the
whole story.  For example, IIRC, VxFS is more or less
extent-based (like mainframe storage), so serial I/O for large
files should be perhaps its strongest point, while other workloads
may do relatively better with the other filesystems.

The free basic edition sounds cool, though - downloading now.
I could use a bit of practice with VxVM/VxFS; it's always struck
me as very good when it was good (online reorgs of storage and
such), and an utter terror to untangle when it got messed up,
not to mention rather more complicated than DiskSuite/SVM
(and of course _waay_ more complicated than zfs :-)
Any idea if it works with a reasonably recent OpenSolaris (build 81)?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 'du' is not accurate on zfs

2008-02-18 Thread Richard L. Hamilton
 On Sat, 16 Feb 2008, Richard Elling wrote:

  ls -l shows the length.  ls -s shows the size, which may be
  different than the length.  You probably want size rather than du.

 That is true.  Unfortunately 'ls -s' displays in units of disk blocks
 and does not also consider the 'h' option in order to provide a value
 suitable for humans.

 Bob

ISTR someone already proposing to make 'ls -h -s' work in
a way one might hope for.
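
In the meantime the difference is easy enough to see by hand (the file name 
is just an example):

  ls -l bigfile    # logical length in bytes
  ls -s bigfile    # allocated size in disk blocks (no 'h' treatment today)
  du -k bigfile    # allocated size in kilobytes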
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] vxfs vs ufs vs zfs

2008-02-18 Thread Todd Stansell
 The free basic edition sounds cool, though - downloading now.
 I could use a bit of practice with VxVM/VxFS; it's always struck
 me as very good when it was good (online reorgs of storage and
 such), and an utter terror to untangle when it got messed up,
 not to mention rather more complicated than DiskSuite/SVM
 (and of course _waay_ more complicated than zfs :-)

Also note that Veritas has a Simple Admin Utility (beta) available that
works on Storage Foundation 4.0 or higher.  You can find it here:

  http://www.symantec.com/business/products/agents_options.jsp?pcid=2245&pvid=203_1

I played with it briefly when they first introduced it, after folks
complained that VxVM/VxFS was so much more complicated than ZFS.  I don't
really have a need for it myself, but it seemed to work fine.

Todd
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance with Sun StorageTek 2540

2008-02-18 Thread Robert Milkowski
Hello Joel,

Saturday, February 16, 2008, 4:09:11 PM, you wrote:

JM Bob,

JM Here is how you can tell the array to ignore cache sync commands
JM and the force unit access bits...(Sorry if it wraps..)

JM On a Solaris CAM install, the 'service' command is in /opt/SUNWsefms/bin

JM To read the current settings:
JM service -d arrayname -c read -q nvsram region=0xf2 host=0x00

JM save this output so you can reverse the changes below easily if needed...


JM To set new values:

JM service -d arrayname -c set -q nvsram region=0xf2 offset=0x17 value=0x01 host=0x00
JM service -d arrayname -c set -q nvsram region=0xf2 offset=0x18 value=0x01 host=0x00
JM service -d arrayname -c set -q nvsram region=0xf2 offset=0x21 value=0x01 host=0x00

JM Host region 00 is Solaris (w/Traffic Manager)

JM You will need to reboot both controllers after making the change before it becomes active.


Is it also necessary and does it work on 2530?


-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion

2008-02-18 Thread Stuart Anderson
Is this kernel panic a known ZFS bug, or should I open a new ticket?

Note, this happened on an X4500 running S10U4 (127112-06) with NCQ disabled.

Thanks.


Feb 18 17:55:18 thumper1 ^Mpanic[cpu1]/thread=fe8000809c80: 
Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion failed: 
arc_buf_remove_ref(db->db_buf, db) == 0, file: ../../common/fs/zfs/dbuf.c, 
line: 1692
Feb 18 17:55:18 thumper1 unix: [ID 10 kern.notice] 
Feb 18 17:55:18 thumper1 genunix: [ID 802836 kern.notice] fe80008099d0 
fb9c9853 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809a00 
zfs:zfsctl_ops_root+2fac59f2 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809a30 
zfs:dbuf_write_done+c8 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809a70 
zfs:arc_write_done+13b ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809ac0 
zfs:zio_done+1b8 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809ad0 
zfs:zio_next_stage+65 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b00 
zfs:zio_wait_for_children+49 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b10 
zfs:zio_wait_children_done+15 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b20 
zfs:zio_next_stage+65 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b60 
zfs:zio_vdev_io_assess+84 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809b70 
zfs:zio_next_stage+65 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809bd0 
zfs:vdev_mirror_io_done+c1 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809be0 
zfs:zio_vdev_io_done+14 ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809c60 
genunix:taskq_thread+bc ()
Feb 18 17:55:18 thumper1 genunix: [ID 655072 kern.notice] fe8000809c70 
unix:thread_start+8 ()
Feb 18 17:55:18 thumper1 unix: [ID 10 kern.notice] 
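
If anyone wants to poke at the dump themselves, a minimal mdb session looks 
like this (file names are the usual savecore defaults under /var/crash/<hostname>):

  cd /var/crash/thumper1
  mdb unix.0 vmcore.0
  > ::status      # panic string and dump details
  > ::stack       # stack of the panicking thread
  > ::quit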

-- 
Stuart Anderson  [EMAIL PROTECTED]
http://www.ligo.caltech.edu/~anderson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion

2008-02-18 Thread Prabahar Jeyaram
The patches (127728-06 : sparc, 127729-07 : x86) which have the fix for 
this panic are in a temporary state and will be released via SunSolve soon.

Please contact your support channel to get these patches.

--
Prabahar.

Stuart Anderson wrote:
 On Mon, Feb 18, 2008 at 06:28:31PM -0800, Stuart Anderson wrote:
 Is this kernel panic a known ZFS bug, or should I open a new ticket?

 Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion failed: 
 arc_buf_remove_ref(db->db_buf, db) == 0, file: ../../common/fs/zfs/dbuf.c, 
 line: 1692
 
 It looks like this might be bug 6523336,
 http://sunsolve.sun.com/search/document.do?assetkey=1-66-201229-1
 
 Does anyone know when the Binary relief for this and other Sol10 ZFS
 kernel panics will be released as normal kernel patches?
 
 Thanks.
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool shared between OSX and Solaris on a MacBook Pro

2008-02-18 Thread Peter Karlsson
Hi,

I got my MacBook Pro set up to dual-boot between Solaris and OS X, and I 
have created a zpool to use as shared storage for documents etc.  
However, I hit this strange thing when trying to access the zpool from 
Solaris: only root can see it??  I created the zpool on OS X, as it uses 
an old version of the on-disk format; if I create a zpool on Solaris, 
all users can see it.  Strange.

Any ideas on what might be the issue here??

Cheers,
Peter

root# zpool get all zpace
NAME   PROPERTY     VALUE   SOURCE
zpace  bootfs       -       default
zpace  autoreplace  off     default
zpace  delegation   off     default

root# zfs get all zpace/demo
NAME        PROPERTY       VALUE                  SOURCE
zpace/demo  type           filesystem             -
zpace/demo  creation       Sat Feb 16 13:25 2008  -
zpace/demo  used           66.2M                  -
zpace/demo  available      59.3G                  -
zpace/demo  referenced     66.2M                  -
zpace/demo  compressratio  1.00x                  -
zpace/demo  mounted        yes                    -
zpace/demo  quota          none                   default
zpace/demo  reservation    none                   default
zpace/demo  recordsize     128K                   default
zpace/demo  mountpoint     /Volumes/zpace/demo    default
zpace/demo  sharenfs       off                    default
zpace/demo  checksum       on                     default
zpace/demo  compression    off                    default
zpace/demo  atime          on                     default
zpace/demo  devices        on                     default
zpace/demo  exec           on                     default
zpace/demo  setuid         on                     default
zpace/demo  readonly       off                    default
zpace/demo  zoned          off                    default
zpace/demo  snapdir        hidden                 default
zpace/demo  aclmode        groupmask              default
zpace/demo  aclinherit     secure                 default
zpace/demo  canmount       on                     default
zpace/demo  shareiscsi     off                    default
zpace/demo  xattr          on                     default
zpace/demo  copies         1                      default
zpace/demo  version        2                      -

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion

2008-02-18 Thread Stuart Anderson
On Mon, Feb 18, 2008 at 06:28:31PM -0800, Stuart Anderson wrote:
 Is this kernel panic a known ZFS bug, or should I open a new ticket?
 
 Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion failed: 
 arc_buf_remove_ref(db->db_buf, db) == 0, file: ../../common/fs/zfs/dbuf.c, 
 line: 1692

It looks like this might be bug 6523336,
http://sunsolve.sun.com/search/document.do?assetkey=1-66-201229-1

Does anyone know when the Binary relief for this and other Sol10 ZFS
kernel panics will be released as normal kernel patches?

Thanks.

-- 
Stuart Anderson  [EMAIL PROTECTED]
http://www.ligo.caltech.edu/~anderson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion

2008-02-18 Thread Stuart Anderson
Thanks for the information.

How does the temporary patch 127729-07 relate to the IDR127787 (x86) which
I believe also claims to fix this panic?

Thanks.


On Mon, Feb 18, 2008 at 08:32:03PM -0800, Prabahar Jeyaram wrote:
 The patches (127728-06 : sparc, 127729-07 : x86) which have the fix for 
 this panic are in a temporary state and will be released via SunSolve soon.
 
 Please contact your support channel to get these patches.
 
 --
 Prabahar.
 
 Stuart Anderson wrote:
 On Mon, Feb 18, 2008 at 06:28:31PM -0800, Stuart Anderson wrote:
 Is this kernel panic a known ZFS bug, or should I open a new ticket?
 
 Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion 
 failed: arc_buf_remove_ref(db->db_buf, db) == 0, file: 
 ../../common/fs/zfs/dbuf.c, line: 1692
 
 It looks like this might be bug 6523336,
 http://sunsolve.sun.com/search/document.do?assetkey=1-66-201229-1
 
 Does anyone know when the Binary relief for this and other Sol10 ZFS
 kernel panics will be released as normal kernel patches?
 
 Thanks.
 

-- 
Stuart Anderson  [EMAIL PROTECTED]
http://www.ligo.caltech.edu/~anderson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic on arc_buf_remove_ref() assertion

2008-02-18 Thread Prabahar Jeyaram
Any IDRXX (released immediately) is interim relief (it also 
contains the fix) provided to customers until the official patch 
(which usually takes longer to be released) is available.  The patch should 
be considered the permanent solution.
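
Once the patches are out, you can confirm whether a given box already has the 
fix (the IDR should show up the same way) with the usual patch tooling - the 
patch IDs below are the ones from this thread:

  showrev -p | egrep '127728|127729|127787'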

--
Prabahar.

Stuart Anderson wrote:
 Thanks for the information.
 
 How does the temporary patch 127729-07 relate to the IDR127787 (x86) which
 I believe also claims to fix this panic?
 
 Thanks.
 
 
 On Mon, Feb 18, 2008 at 08:32:03PM -0800, Prabahar Jeyaram wrote:
 The patches (127728-06 : sparc, 127729-07 : x86) which have the fix for 
 this panic are in a temporary state and will be released via SunSolve soon.

 Please contact your support channel to get these patches.

 --
 Prabahar.

 Stuart Anderson wrote:
 On Mon, Feb 18, 2008 at 06:28:31PM -0800, Stuart Anderson wrote:
 Is this kernel panic a known ZFS bug, or should I open a new ticket?

 Feb 18 17:55:18 thumper1 genunix: [ID 403854 kern.notice] assertion 
 failed: arc_buf_remove_ref(db->db_buf, db) == 0, file: 
 ../../common/fs/zfs/dbuf.c, line: 1692
 It looks like this might be bug 6523336,
 http://sunsolve.sun.com/search/document.do?assetkey=1-66-201229-1

 Does anyone know when the Binary relief for this and other Sol10 ZFS
 kernel panics will be released as normal kernel patches?

 Thanks.

 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance with Sun StorageTek 2540

2008-02-18 Thread Joel Miller
It is the same for the 2530, and I am fairly certain it is also valid 
for the 6130, 6140, and 6540.

-Joel

On Feb 18, 2008, at 3:51 PM, Robert Milkowski [EMAIL PROTECTED] wrote:

 Hello Joel,

 Saturday, February 16, 2008, 4:09:11 PM, you wrote:

 JM Bob,

 JM Here is how you can tell the array to ignore cache sync commands
 JM and the force unit access bits...(Sorry if it wraps..)

 JM On a Solaris CAM install, the 'service' command is in /opt/SUNWsefms/bin

 JM To read the current settings:
 JM service -d arrayname -c read -q nvsram region=0xf2 host=0x00

 JM save this output so you can reverse the changes below easily if needed...


 JM To set new values:

 JM service -d arrayname -c set -q nvsram region=0xf2 offset=0x17 value=0x01 host=0x00
 JM service -d arrayname -c set -q nvsram region=0xf2 offset=0x18 value=0x01 host=0x00
 JM service -d arrayname -c set -q nvsram region=0xf2 offset=0x21 value=0x01 host=0x00

 JM Host region 00 is Solaris (w/Traffic Manager)

 JM You will need to reboot both controllers after making the change before it becomes active.


 Is it also necessary and does it work on 2530?


 -- 
 Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss