[zfs-discuss] COW updates [C1]

2008-10-28 Thread Cyril ROUHASSIA
Dear all,
please find below a test that I have run:

# zdb -v unxtmpzfs3      <-- uberblock for unxtmpzfs3 zpool
Uberblock

magic = 00bab10c
version = 4
txg = 86983
guid_sum = 9860489793107228114
timestamp = 1225183041 UTC = Tue Oct 28 09:37:21 2008
rootbp = [L0 DMU objset] 400L/200P DVA[0]=1:8:200 
DVA[1]=0:7ac00:200 DVA[2]=1:18013000:200 fletcher4 lzjb BE contiguous 
birth=86983 fill=38 
cksum=d7e4c6e6f:508f5121f9f:f66339b469f2:2025284ff2f12d


# echo titi > /unxtmpzfs3/mnt1/mnt4/te1      <-- update of the te1 file located in the zpool

# zdb -v unxtmpzfs3      <-- uberblock for unxtmpzfs3 zpool after the file update
Uberblock

magic = 00bab10c
version = 4
txg = 87012
guid_sum = 9860489793107228114
timestamp = 1225183186 UTC = Tue Oct 28 09:39:46 2008
rootbp = [L0 DMU objset] 400L/200P DVA[0]=1:82a00:200 
DVA[1]=0:7e400:200 DVA[2]=1:18015c00:200 fletcher4 lzjb BE contiguous 
birth=87012 fill=38 
cksum=c3ac8e047:46e375e1c21:d272d39402da:1aaadb02468e54



Conclusion is:

Because of one change to just one file, the MOS is a brand new one. The 
question, then, is:
  Is the new MOS a whole copy of the previous one, or does it share untouched 
data with the previous one and keep its own copy only of the changed data 
(like an update to a regular file)?
Indeed, I have checked the meta-dnode array entries, and it looks like only a 
few entries are different.

Is the uberblock a brand new one after the update (just 128 possible 
uberblocks!)?

Thanks for your answer.


C Rouhassia
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] COW updates [C1]

2008-10-28 Thread [EMAIL PROTECTED]
Hi Cyril,

Cyril ROUHASSIA wrote:

 Dear all,
 please find below test that I have run:

 #zdb -v unxtmpzfs3      <-- uberblock for unxtmpzfs3 zpool
 Uberblock

 magic = 00bab10c
 version = 4
 txg = 86983
 guid_sum = 9860489793107228114
 timestamp = 1225183041 UTC = Tue Oct 28 09:37:21 2008
 rootbp = [L0 DMU objset] 400L/200P DVA[0]=1:8:200 
 DVA[1]=0:7ac00:200 DVA[2]=1:18013000:200 fletcher4 lzjb BE 
 contiguous birth=86983 fill=38 
 cksum=d7e4c6e6f:508f5121f9f:f66339b469f2:2025284ff2f12d


 # echo titi > /unxtmpzfs3/mnt1/mnt4/te1      <-- update of the te1 file located in the zpool

 # zdb -v unxtmpzfs3      <-- uberblock for unxtmpzfs3 zpool after the file update
 Uberblock

 magic = 00bab10c
 version = 4
 txg = 87012
 guid_sum = 9860489793107228114
 timestamp = 1225183186 UTC = Tue Oct 28 09:39:46 2008
 rootbp = [L0 DMU objset] 400L/200P DVA[0]=1:82a00:200 
 DVA[1]=0:7e400:200 DVA[2]=1:18015c00:200 fletcher4 lzjb BE 
 contiguous birth=87012 fill=38 
 cksum=c3ac8e047:46e375e1c21:d272d39402da:1aaadb02468e54



 Conclusion is:

 * Because of one change to just one file, the MOS is a brand new
   one. Then the question is:

   Is the new MOS a  whole copy of the previous one  or  does it  share 
 untouched data with the previous one and has its own copy of specific 
 data (like an update onto a regular file)?
 Indeed, I have checked the metadnode array entries  and it sounds like 
 there are few entries which are different .
A block containing changed MOS data will be new.  Other blocks of the 
MOS should be unchanged.  Of course,
any indirect (gang) blocks that need to be updated will also be new.

 * Is the uberblock a brand new one after the update (just 128
   possible uberblocks!)?

Only one is active at any one time.  As I recall, the 128 possible 
uberblocks are treated as
a circular array.
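
For anyone who wants to watch this happen on their own pool, a minimal sketch
(the pool, path and file names are the ones from the thread above; exact zdb
output format may vary by build):

   # dump the active uberblock, dirty the pool, let the txg commit, dump again
   zdb -u unxtmpzfs3 > /tmp/ub.before
   echo titi > /unxtmpzfs3/mnt1/mnt4/te1
   sync; sleep 30                       # give the transaction group time to commit
   zdb -u unxtmpzfs3 > /tmp/ub.after
   diff /tmp/ub.before /tmp/ub.after    # txg, timestamp and rootbp DVAs change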
max

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disabling auto-snapshot by default.

2008-10-28 Thread Tim Foster

Chris Gerhard wrote:

I'm not sure there's an easy way to please everyone to be honest :-/


I'm not sure you are right there. If there was an SMF property that set the 
default behaviour, you could then set it to true on something that looked 
like a laptop and false otherwise. Or you could just turn it on by default 
for root pools. Indeed, the installation could set the property.


It's going to be turtles all the way down isn't it?

That is, we'd now have a property to set whether or not the system sets 
properties... Should we have another property set to tell the system 
whether to look at that property as well? :-)


I can see what you're asking for, but it just seems a bit redundant, 
when all you have to do is set a property on the pools you don't want 
auto-snapshotted, or just disable the service altogether.
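
Concretely, that would be something like the following (a hedged sketch using 
the property and service names that appear elsewhere in this thread, with a 
made-up pool name):

   # opt a pool's top-level dataset out of the automatic snapshots
   zfs set com.sun:auto-snapshot=false tank
   # or disable an auto-snapshot instance outright
   svcadm disable auto-snapshot:frequent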


I've got a suggested fix attached - but we'd need to coordinate with the 
desktop & install people to work out how they're actually going to set 
the service property at install time to true.  Can you file an rfe 
against solaris/zfs/utility with me as RE if you still think this is 
important, but read on...


If we have such a setting, auto-include, set to false by default, then 
on a brand new system, doing


# svcadm enable auto-snapshot:frequent

would have no effect whatsoever - other than running a cron job that 
does nothing every 15 minutes. This strikes me as weird.


While being on by default has some value on a desktop, on a server that has 
been upgraded, and would therefore already have a backup strategy in place, 
it does not.


In nv_100, SXCE has this off by default - the auto-snapshot SMF service 
already defaults to disabled, unfortunately time-slider had a 
postinstall script that enabled the services again. They've got that 
fixed in nv_102 I believe. So perhaps the auto-include property isn't 
needed after all.


Do we supply a script to delete all the auto-snapshot snapshots? Is 
there any way to recognise them apart from by their name?


Nope. The time-slider service just looks for snapshots of given names.

We could mark those snapshots with another zfs user property, but that'd 
break backwards compatibility with earlier versions of Solaris that 
don't have snapshot property support, so I'd rather not do that if possible?


cheers,
tim





That said, how often do you import or create pools where this would be 
an issue?  If you're importing pools you've used on such a system 
before, then you'd already have the property set on the root dataset.


If you're constantly importing brand new pools, then yes, you've got a 
point.


You only have to have this mess with one pool which is being used as the 
target for a backup for the consequences to be horrible. For that reason 
alone it needs to default to off.


Don't get me wrong. The service is really nice and the gui integration 
looks like it could be a thing of beauty if it can be made to scale. 
Indeed, on my daughter's laptop it looks great. It's just that it should 
default to off, with a really easy way to turn it on, rather than 
the other way round.
diff -r ebf8b658f257 README.zfs-auto-snapshot.txt
--- a/README.zfs-auto-snapshot.txt  Sun Oct 26 17:41:39 2008 +
+++ b/README.zfs-auto-snapshot.txt  Tue Oct 28 11:13:21 2008 +
@@ -110,6 +110,13 @@
6343667 need itinerary so interrupted scrub/resilver
doesn't have to start over
 
+ zfs/auto-include  Set to false by default, setting to true makes service
+   instances using the '//' fs-name value attempt to
+   automatically set the com.sun:auto-snapshot
+   property to true on service start for any pools on the
+   system where it's not already set to either 
+   true or false.
+
 
 An example instance manifest is included in this archive.
 
diff -r ebf8b658f257 src/lib/svc/method/zfs-auto-snapshot
--- a/src/lib/svc/method/zfs-auto-snapshot  Sun Oct 26 17:41:39 2008 +
+++ b/src/lib/svc/method/zfs-auto-snapshot  Tue Oct 28 11:13:21 2008 +
@@ -895,7 +895,7 @@
 function auto_include {
FS_NAME=$fs_name
LABEL=$label
-   if [ "$FS_NAME" == "//" ] ; then
+   if [[ "$FS_NAME" == "//" && "$auto_include" == "true" ]] ; then
POOLS=$(zpool list -H -o name)
for pool in $POOLS ; do
if ! zpool status -x $pool | grep "state: UNAVAIL" > /dev/null ; then
diff -r ebf8b658f257 src/samples/auto-snapshot-space-archive.xml
--- a/src/samples/auto-snapshot-space-archive.xml   Sun Oct 26 17:41:39 
2008 +
+++ b/src/samples/auto-snapshot-space-archive.xml   Tue Oct 28 11:13:21 
2008 +
@@ -45,6 +45,8 @@
 override="true"/>
  <propval name="avoidscrub" type="boolean" value="true"
   override="true"/>
+ <propval name="auto-include" type="boolean" value="false"
+ 
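
For reference, once a build carries this change, the property could 
presumably be flipped per service instance with svccfg; a hedged sketch (the 
instance name is taken from the example above, and the zfs/auto-include 
property path is assumed from the diff):

   svccfg -s auto-snapshot:frequent setprop zfs/auto-include = boolean: true
   svcadm refresh auto-snapshot:frequent
   svcadm restart auto-snapshot:frequent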

Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Niall Power
Bueller? Anyone?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-28 Thread Matt Harrison
On Mon, Oct 27, 2008 at 06:18:59PM -0700, Nigel Smith wrote:
 Hi Matt
 Unfortunately, I'm having problems un-compressing that zip file.
 I tried with 7-zip and WinZip reports this:
 
 skipping _1_20081027010354.cap: this file was compressed using an unknown 
 compression method.
Please visit www.winzip.com/wz54.htm for more information.
The compression method used for this file is 98.
 
 Please can you check it out, and if necessary use a more standard
 compression algorithm.
 Download File Size was 8,782,584 bytes.

Apologies, I had let WinZip compress it with whatever it thought was best; 
apparently that was the best method for size, not compatibility.

There's a new upload under the same URL compressed with 2.0 compatible
compression. Fingers crossed that works better for you.

Thanks

Matt


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Ross
Hi Niall,

I noticed myself that ZFS won't automatically import pools.  I didn't really 
consider it a problem since I wanted to script a bunch of stuff on USB 
insertion.  I was hoping to be able to write a script that would detect the 
insertion, attempt to automatically mount pools on devices that are recognised 
by the system, and issue a ZFS send to the device.
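
Something along these lines is what I had in mind; a rough, hedged sketch 
(the pool, dataset and snapshot names are made up, and the actual 
insertion-detection hook is left out):

   #!/bin/sh
   # try to import the backup pool that lives on the USB disk; bail out
   # quietly if its devices aren't all present yet
   zpool import usbbackup 2>/dev/null || exit 0

   # take a new snapshot and send the changes since the previous backup;
   # "prev" would be remembered from the last successful run
   prev=tank/home@backup-20081027
   new=tank/home@backup-20081028
   zfs snapshot $new
   zfs send -i $prev $new | zfs receive -F usbbackup/home

   zpool export usbbackup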

Regarding the hot plugging of USB devices, yes, that can cause problems.  I 
created a number of bug reports after finding out that ZFS can continue writing 
to a removed USB hard drive for some considerable period after the drive was 
removed.

The main thread where this is documented is here.  What you probably want is 
section 4 of the attached PDF:
http://www.opensolaris.org/jive/thread.jspa?threadID=68748

I reported a range of bugs from that, two that I think are probably relevant 
are:

Data loss when ZFS doesn't react to device removal:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6735932
(Hopefully not as severe with snv_100 onwards now the zpool status hang bug has 
been resolved.)

ZFS has inconsistent handling of device removal
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6735853

Ross
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Tim Foster
Niall Power wrote:
 Bueller? Anyone?

Yeah, I'd love to know the answer too. The furthest I got into
investigating this last time was:

http://mail.opensolaris.org/pipermail/zfs-discuss/2007-December/044787.html

- does that help at all Niall?

The context to Niall's question is to extend Time Slider to do proper
backups to usb devices whenever a device is inserted.  I nearly had this
working with:

http://blogs.sun.com/timf/entry/zfs_backups_to_usb_mass
http://blogs.sun.com/timf/entry/zfs_automatic_backup_0_1

but I used pcfs on the storage device to store flat zfs send-streams as
I didn't have a chance to work out what was going on. Getting ZFS plug
n' play on usb disks would be much much cooler though[1].

cheers,
tim

[1] and I reckon that by relying on the 'zfs/interval' 'none' setting
for the auto-snapshot service, doing this now will be a lot easier than
my previous auto-backup hack.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Performance with tens of thousands of zfs filesystems

2008-10-28 Thread Morten-Christian Bernson
I have been reading this forum for a little while, and am interested in more 
information about the performance of ZFS when creating large numbers of 
filesystems.  We are considering using ZFS for the users' home folders, and 
this could potentially be 30,000 filesystems; if using snapshots, that 
number would be multiplied by the number of snapshots per filesystem as well.

I am a bit nervous, after reading this forum, that the performance with huge 
numbers of filesystems is not very good.  Taking hours to boot the server, and 
possibly weeks to create the filesystems, does not seem right.
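
Before committing to a layout it may be worth timing dataset creation on the 
target hardware; a minimal, hedged sketch (the pool and path names are made 
up), whose result can then be extrapolated to the full user count:

   # create 1000 throwaway home filesystems under a test pool and time it
   ptime sh -c '
       i=0
       while [ $i -lt 1000 ]; do
           zfs create tank/home/user$i
           i=`expr $i + 1`
       done'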

Any official input on how this will be in the upcoming release of Solaris 10?

Yours sincerely,
Morten-Christian Bernson
Solaris System Administrator
University of Bergen
Norway
[EMAIL PROTECTED]
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Niall Power
Hi Tim,

Tim Foster wrote:
 Niall Power wrote:
 Bueller? Anyone?

 Yeah, I'd love to know the answer too. The furthest I got into
 investigating this last time was:

 http://mail.opensolaris.org/pipermail/zfs-discuss/2007-December/044787.html 


 - does that help at all Niall?

I dug around and found those few hald pieces for zpools also. It seems to me
that there was at least an intention or desire to make things work with
hald.
Some further searching around reveals this conversation thread:
http://opensolaris.org/jive/thread.jspa?messageID=257186
The trail goes cold there though.

 The context to Niall's question is to extend Time Slider to do proper
 backups to usb devices whenever a device is inserted.  I nearly had this
 working with:

 http://blogs.sun.com/timf/entry/zfs_backups_to_usb_mass
 http://blogs.sun.com/timf/entry/zfs_automatic_backup_0_1

 but I used pcfs on the storage device to store flat zfs send-streams as
 I didn't have a chance to work out what was going on. Getting ZFS plug
 n' play on usb disks would be much much cooler though[1].

Exactly. Having zfs as the native filesystem would enable snapshot browsing
from within nautilus so it's a requirement for this project.

 cheers,
 tim

 [1] and I reckon that by relying on the 'zfs/interval' 'none' setting
 for the auto-snapshot service, doing this now will be a lot easier than
 my previous auto-backup hack.
That could be quite useful alright. We might need to come up with a 
mechanism
to delete the snapshot after it's taken and backed up.

Cheers,
Niall

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread James Litchfield
I believe the answer is in the last email in that thread. hald doesn't offer
the notifications and it's not clear that ZFS can handle them. As is noted,
there are complications with ZFS due to the possibility of multiple disks
comprising a volume, etc. It would be a lot of work to make it work
correctly for any but the simplest single disk case.

Jim
---
Niall Power wrote:
 Hi Tim,

 Tim Foster wrote:
   
 Niall Power wrote:
 
 Bueller? Anyone?
   
 Yeah, I'd love to know the answer too. The furthest I got into
 investigating this last time was:

 http://mail.opensolaris.org/pipermail/zfs-discuss/2007-December/044787.html 


 - does that help at all Niall?
 

 I dug around and found those few hald pieces for zpools also. Seems to me
 that there was at least an intention or desire to make things work work with
 hald.
 Some further searching around reveals this conversation thread:
 http://opensolaris.org/jive/thread.jspa?messageID=257186
 The trail goes cold there though.
   
 The context to Niall's question is to extend Time Slider to do proper
 backups to usb devices whenever a device is inserted.  I nearly had this
 working with:

 http://blogs.sun.com/timf/entry/zfs_backups_to_usb_mass
 http://blogs.sun.com/timf/entry/zfs_automatic_backup_0_1

 but I used pcfs on the storage device to store flat zfs send-streams as
 I didn't have a chance to work out what was going on. Getting ZFS plug
 n' play on usb disks would be much much cooler though[1].
 

 Exactly. Having zfs as the native filesystem would enable snapshot browsing
 from within nautilus so it's a requirement for this project.
   
 cheers,
 tim

 [1] and I reckon that by relying on the 'zfs/interval' 'none' setting
 for the auto-snapshot service, doing this now will be a lot easier than
 my previous auto-backup hack.
 
 That could be quite useful alright. We might need to come up with a 
 mechanism
 to delete the snapshot after it's taken and backed up.

 Cheers,
 Niall


   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Niall Power
Hi James,

James Litchfield wrote:
 I believe the answer is in the last email in that thread. hald doesn't 
 offer
 the notifications and it's not clear that ZFS can handle them. As is 
 noted,
 there are complications with ZFS due to the possibility of multiple disks
 comprising a volume, etc. It would be a lot of work to make it work
 correctly for any but the simplest single disk case.

For the kind of use cases we have in mind, the percentage of people wanting
to back up to some kind of hot-pluggable multi-device volume is, in all
likelihood, minutely small. Would it be a lot of work to make it work for
single-disk volumes? It seems a shame not to provide this functionality and
convenience because of a few very rare/exotic configurations for which it
wouldn't work.

What would we stand to lose by getting at least some (majority?) of 
configurations
working?

Thanks,
Niall

 Jim
 ---
 Niall Power wrote:
 Hi Tim,

 Tim Foster wrote:
  
 Niall Power wrote:

 Bueller? Anyone?
   
 Yeah, I'd love to know the answer too. The furthest I got into
 investigating this last time was:

 http://mail.opensolaris.org/pipermail/zfs-discuss/2007-December/044787.html 


 - does that help at all Niall?
 

 I dug around and found those few hald pieces for zpools also. Seems 
 to me
 that there was at least an intention or desire to make things work 
 work with
 hald.
 Some further searching around reveals this conversation thread:
 http://opensolaris.org/jive/thread.jspa?messageID=257186
 The trail goes cold there though.
  
 The context to Niall's question is to extend Time Slider to do proper
 backups to usb devices whenever a device is inserted.  I nearly had 
 this
 working with:

 http://blogs.sun.com/timf/entry/zfs_backups_to_usb_mass
 http://blogs.sun.com/timf/entry/zfs_automatic_backup_0_1

 but I used pcfs on the storage device to store flat zfs send-streams as
 I didn't have a chance to work out what was going on. Getting ZFS plug
 n' play on usb disks would be much much cooler though[1].
 

 Exactly. Having zfs as the native filesystem would enable snapshot 
 browsing
 from within nautilus so it's a requirement for this project.
  
 cheers,
 tim

 [1] and I reckon that by relying on the 'zfs/interval' 'none' setting
 for the auto-snapshot service, doing this now will be a lot easier than
 my previous auto-backup hack.
 
 That could be quite useful alright. We might need to come up with a 
 mechanism
 to delete the snapshot after it's taken and backed up.

 Cheers,
 Niall


   


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Ross
Hey guys,

This may be a dumb thought from an end user, but why does it have to be hard 
for ZFS to automatically mount volumes on removable media?

Mounting single volumes should be straightforward, and couldn't you just try to 
import any others and silently fail if any required pieces are missing?  
That way you're using the existing ZFS import behaviour.  It means nothing 
would mount as you insert the first disk of a raid-z volume, but as soon as you 
plug enough of the disks in, ZFS would mount it automatically (albeit in a 
degraded state).

Then once the pool is mounted, the existing SATA and USB auto mount behaviour 
should be enough to incorporate any remaining devices that are inserted.

You might want to just allow simple mounts by default though.  Could you have a 
generic zfs automount property, with settings of 'off', 'simple', 'all'?

Simple pools are definitely going to be the most common usage, but it would be 
nice to have support for more complex setups too.  Especially since this would 
allow people to do things like easily expand their USB pool once it fills up, 
just by adding extra USB drives to the pool.

Ross
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import: all devices online but: insufficient replicas

2008-10-28 Thread kristof
HI,

Today I tried one more time from scratch.

I re-installed server B with the latest available OpenSolaris 2008.11 (b99); by 
the way, server A runs OpenSolaris 2008 (b98).

I also re-labeled all my disks.

This time I can successfully import the pool on server B:

[EMAIL PROTECTED]:~# zpool import
  pool: box3
id: 12004712858660209674
 state: ONLINE
status: One or more devices contains corrupted data.
action: The pool can be imported using its name or numeric identifier.
   see: http://www.sun.com/msg/ZFS-8000-4J
config:

box3   ONLINE
  mirror   ONLINE
c5t1d0s0   UNAVAIL  corrupted data
c8t600144F047BAB22EE081B33B9800d0  ONLINE
  mirror   ONLINE
c6t0d0s0   UNAVAIL  corrupted data
c8t600144F047BAB22FE081B33B9800d0  ONLINE
  mirror   ONLINE
c6t1d0s0   UNAVAIL  corrupted data
c8t600144F047BAB231E081B33B9800d0  ONLINE

 zpool import box3
[EMAIL PROTECTED]:~# zpool status
  pool: box3
 state: ONLINE
status: One or more devices could not be used because the label is missing or
invalid.  Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
box3   ONLINE   0 0 0
  mirror   ONLINE   0 0 0
745960039968786UNAVAIL  0 0 0  
was /dev/dsk/c5t1d0s0
c8t600144F047BAB22EE081B33B9800d0  ONLINE   0 0 0
  mirror   ONLINE   0 0 0
10898717263963950690   UNAVAIL  0 0 0  
was /dev/dsk/c6t0d0s0
c8t600144F047BAB22FE081B33B9800d0  ONLINE   0 0 0
  mirror   ONLINE   0 0 0
7304718211036211049UNAVAIL  0 0 0  
was /dev/dsk/c6t1d0s0
c8t600144F047BAB231E081B33B9800d0  ONLINE   0 0 0

errors: No known data errors


Then I exported the pool again on server B, and logged out of the iSCSI targets.

On server A I re-enabled the iSCSI targets and tried to import the pool back; this 
is what I get:

-bash-3.2# zpool import
  pool: box3
id: 12004712858660209674
 state: DEGRADED
status: One or more devices are missing from the system.
action: The pool can be imported despite missing or damaged devices.  The
fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-2Q
config:

box3   DEGRADED
  mirror   DEGRADED
c5t1d0s0   ONLINE
c0t600144F047BAB22EE081B33B9800d0  UNAVAIL  cannot open
  mirror   DEGRADED
c6t0d0s0   ONLINE
c0t600144F047BAB22FE081B33B9800d0  UNAVAIL  cannot open
  mirror   DEGRADED
c6t1d0s0   ONLINE
c0t600144F047BAB231E081B33B9800d0  UNAVAIL  cannot open

-bash-3.2# format
Searching for disks...
Error: can't open selected disk 
'/dev/rdsk/c0t600144F047BAB22EE081B33B9800d0p0'.
Error: can't open selected disk 
'/dev/rdsk/c0t600144F047BAB22FE081B33B9800d0p0'.
Error: can't open selected disk 
'/dev/rdsk/c0t600144F047BAB231E081B33B9800d0p0'.
done

c0t600144F047BAB22EE081B33B9800d0: configured with capacity of 446.24GB
c0t600144F047BAB22FE081B33B9800d0: configured with capacity of 446.24GB
c0t600144F047BAB231E081B33B9800d0: configured with capacity of 446.24GB


AVAILABLE DISK SELECTIONS:
   0. c0t600144F047BAB22EE081B33B9800d0 SUN-SOLARIS-1 cyl 56 alt 2 hd 
255 sec 65535
  /scsi_vhci/[EMAIL PROTECTED]
   1. c0t600144F047BAB22FE081B33B9800d0 SUN-SOLARIS-1 cyl 56 alt 2 hd 
255 sec 65535
  /scsi_vhci/[EMAIL PROTECTED]
   2. c0t600144F047BAB231E081B33B9800d0 SUN-SOLARIS-1 cyl 56 alt 2 hd 
255 sec 65535
  /scsi_vhci/[EMAIL PROTECTED]
   3. c5t0d0 DEFAULT cyl 60798 alt 2 hd 255 sec 63
  /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
   4. c5t1d0 ATA-WDCWD5001ABYS-0-1D01 cyl 60798 alt 2 hd 255 sec 63
  /[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
   5. c6t0d0 ATA-WDCWD1000FYPS-0-1B01 cyl 60797 alt 2 hd 255 sec 126
  /[EMAIL 

[zfs-discuss] lu ZFS root questions

2008-10-28 Thread Karl Rossing
Currently running b93.

I'd like to try out b101.

I previously had b90 running on the system. I ran ludelete snv_90_zfs

but I still see snv_90_zfs:
$ zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
rpool 52.9G  6.11G31K  /rpool
rpool/ROOT41.0G  6.11G19K  /rpool/ROOT
rpool/ROOT/snv_90_zfs 29.6G  6.11G  29.3G  /.alt.tmp.b-Ugf.mnt/
rpool/ROOT/[EMAIL PROTECTED]   319M  -  29.6G  -
rpool/ROOT/snv_93 11.4G  6.11G  26.4G  /
rpool/dump3.95G  6.11G  3.95G  -
rpool/swap8.00G  13.1G  1.05G  -


I also did some housecleaning and deleted about 10GB of data, but I don't 
see that reflected in zfs list or df -h.
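
One hedged guess about the missing 10GB (using the dataset names from the 
listing above): space freed by deleting files is still referenced by any 
snapshots taken before the deletion, and zfs list shows that, for example:

   # list the snapshots under the root pool and how much space each one holds
   zfs list -t snapshot -r rpool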

Should I be concerned before I do a lucreate/luupgrade?

Thanks
Karl






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hotplug issues on USB removable media.

2008-10-28 Thread Miles Nordin
 tf == Tim Foster [EMAIL PROTECTED] writes:

tf store flat zfs send-streams

I thought it was said over and over that 'zfs send' streams could
never be stored, only piped to 'zfs recv'.  If you store one and then
find it's corrupt, the answer is ``didn't let ZFS handle redundancy,''
``sysadmin's fault, not a bug,'' ``it was corrupted by weak TCP
checksums, USB gremlins, poor FLASH ECC, traces on your motherboard
without parity (it does happen!  be glad it didn't happen silently!
restore the pool from ba---oh, that was your backup.  shit.)''

Also I don't think it is currently safe to allow mounting of
stick-based USB ZFS filesystems on a multi-user machine because
someone could show up at a SunRay cluster with one of these
poison-sticks that panics on import.  I stumbled onto a bug number with
a wild idea for addressing this:

 http://bugs.opensolaris.org/view_bug.do?bug_id=4879357

The suggestion is scary that problems with one pool will restart ZFS
for all pools, and it seems like something that could loop.  but the
idea of a single bulletted-whitepaper-feechur addressing a whole class
of problems is pretty attractive.  

I guess using FUSE for all removable media is another path, but feels
like defeat---the hotplug stuff isn't always perfect on Macs but at
least they don't seem to panic from corrupt filesystems often, and
they do proper high-speed in-kernel filesystems.  I guess I'm asking
for something more drastic and beyond common-practice with the SunRay
reference though---to treat USB sticks as untrusted input analogous to
network packets, meaning if you can create a stick that makes the
kernel panic, you've potentially discovered a kernel-level
privilege-escalation exploit, not just a broken stick.  With this
whole power-saving theme of ``containers'' and so on, it's no longer
reasonable to punt and say, ``well he had physical access to the
machine anyway---he could have taken the cover off and done
whatever,'' because we'd like to allow people to introduce USB sticks
over the network.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-28 Thread Tano
So it's finally working; nothing special was done to get it working either, 
which is extremely vexing!

I disabled the I/OAT DMA feature in the BIOS that apparently assists the 
network card and enabled the TPGT option on the iSCSI target. I have two 
iSCSI targets, one 100G on a mirror on the internal SATA controller, and a 1TB 
block on a RAIDZ partition.

I have confirmed by disabling I/OAT DMA that I can READ/WRITE to the raidz via 
iSCSI. With I/OAT DMA enabled I can only read from the disks; writes will 
LAG/FAIL within 10 megabytes.


Based on the wiki, I/OAT DMA only provides a 10% speed improvement on the 
network card. It seems that the broadcom drivers supplied with Solaris may be 
the culprit?

I hope all those individuals who were experiencing this problem can try turning 
off I/OAT DMA or a similar option to see whether their problems go away.

Transferred 100 gigs of data from the local store to the iscsi target on open 
solaris in 26 minutes.

Local store = 1 SATA 1.5Gb/s drive pushing a 65MB/s read average; not too bad!

The I/OAT DMA feature works fine under Debian Linux and serves iscsi targets 
without any issues. 

Thanks, Nigel, for all your help and patience. I will post on this topic some more 
if I get anything new, (basically if I have been getting extremely lucky and 
the problem returns all of a sudden.)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] lu ZFS root questions

2008-10-28 Thread Lori Alt
Karl Rossing wrote:
 Currently running b93.

 I'd like to try out b101.

 I previously had b90 running on the system. I ran ludelete snv_90_zfs

 but I still see snv_90_zfs:
 $ zfs list
 NAME   USED  AVAIL  REFER  MOUNTPOINT
 rpool 52.9G  6.11G31K  /rpool
 rpool/ROOT41.0G  6.11G19K  /rpool/ROOT
 rpool/ROOT/snv_90_zfs 29.6G  6.11G  29.3G  /.alt.tmp.b-Ugf.mnt/
 rpool/ROOT/[EMAIL PROTECTED]   319M  -  29.6G  -
 rpool/ROOT/snv_93 11.4G  6.11G  26.4G  /
 rpool/dump3.95G  6.11G  3.95G  -
 rpool/swap8.00G  13.1G  1.05G  -


 I also did some house cleaning and delete about 10GB of data but I don't 
 see that reflected with zfs list or df -h.

 Should I be concerned before I do a lucreate/luupgrade?
   
Probably, because it looks like you might not
have enough space.

Was there any output from ludelete?

If you do a lustatus now, does the BE still show up?

lori
 Thanks
 Karl






   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] lu ZFS root questions

2008-10-28 Thread Karl Rossing
Lori,

Thanks for taking the time to reply. Please see below.

Karl

Lori Alt wrote:
 Karl Rossing wrote:
 Currently running b93.

 I'd like to try out b101.

 I previously had b90 running on the system. I ran ludelete snv_90_zfs

 but I still see snv_90_zfs:
 $ zfs list
 NAME   USED  AVAIL  REFER  MOUNTPOINT
 rpool 52.9G  6.11G31K  /rpool
 rpool/ROOT41.0G  6.11G19K  /rpool/ROOT
 rpool/ROOT/snv_90_zfs 29.6G  6.11G  29.3G  /.alt.tmp.b-Ugf.mnt/
 rpool/ROOT/[EMAIL PROTECTED]   319M  -  29.6G  -
 rpool/ROOT/snv_93 11.4G  6.11G  26.4G  /
 rpool/dump3.95G  6.11G  3.95G  -
 rpool/swap8.00G  13.1G  1.05G  -


 I also did some house cleaning and delete about 10GB of data but I 
 don't see that reflected with zfs list or df -h.

 Should I be concerned before I do a lucreate/luupgrade?
   
 Probably, because it looks like you might not
 have enough space.

do I need to zfs destroy rpool/ROOT/snv_90_zfs and 
rpool/ROOT/[EMAIL PROTECTED]?

 Was there any output from ludelete?
I didn't capture that :(
 If you do a lustatus now, does the BE still show up?

 lori
lustatus:
Boot Environment   Is   Active ActiveCanCopy 
Name   Complete NowOn Reboot Delete Status   
--  -- - -- --
snv_93 yes  yesyes   no -

The new be i'm calling snv_101
lucreate -c snv_93 -n snv_101
Checking GRUB menu...
System has findroot enabled GRUB
Analyzing system configuration.
Comparing source boot environment snv_93 file systems with the file
system(s) you specified for the new boot environment. Determining which
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment snv_101.
Source boot environment is snv_93.
Creating boot environment snv_101.
Cloning file systems from boot environment snv_93 to create boot 
environment snv_101.
Creating snapshot for rpool/ROOT/snv_93 on rpool/ROOT/[EMAIL PROTECTED].
Creating clone for rpool/ROOT/[EMAIL PROTECTED] on rpool/ROOT/snv_101.
Setting canmount=noauto for / in zone global on rpool/ROOT/snv_101.
Saving existing file /boot/grub/menu.lst in top level dataset for BE 
snv_101 as mount-point//boot/grub/menu.lst.prev.
File /boot/grub/menu.lst propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE snv_101 in GRUB menu
Population of boot environment snv_101 successful.
Creation of boot environment snv_101 successful.

lustatus
Boot Environment   Is   Active ActiveCanCopy 
Name   Complete NowOn Reboot Delete Status   
--  -- - -- --
snv_93 yes  yesyes   no -
snv_101yes  no noyes-

Then when I run luupgrade -u -n snv_101 -s /export/Sol11_x86_b101, the server 
freezes and reboots. I don't see any error messages on the console or in 
/var/adm/messages.

/var/crash/server_name has:
-rw-r--r--   1 root root 1689799 Oct 28 15:04 unix.0
-rw-r--r--   1 root root 2713427 Oct 28 15:20 unix.1
-rw-r--r--   1 root root 2713427 Oct 28 15:39 unix.2
-rw-r--r--   1 root root 1133973504 Oct 28 15:05 vmcore.0
-rw-r--r--   1 root root 518361088 Oct 28 15:21 vmcore.1
-rw-r--r--   1 root root 488185856 Oct 28 15:40 vmcore.2
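
If it helps, the crash dumps can usually be inspected with mdb; a hedged 
sketch using the file names from the listing above (the dcmds below just 
print the panic summary, the message buffer and the panicking stack):

   # mdb unix.0 vmcore.0
   > ::status
   > ::msgbuf
   > $C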







___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] COW updates [C1]

2008-10-28 Thread Marcelo Leal
 Because of one change to just one file, the MOS is a brand new one
 Yes, all writes in ZFS are done in transaction groups, so every time there 
is a commit (something actually written to disk) there is a new txg, and all 
the blocks written are related to that txg (even the uberblock).
 I don't know if I understood the other questions about updates to the MOS, 
the 128 uberblocks, and regular files... but the location of the active 
uberblock is in the vdev's labels (L0...L3), and the label update is the only 
one that is not COW in ZFS, because the locations of the labels are fixed on 
disk.
 The updates to the labels are done following a staged approach (L0/L2 after 
L1/L3).
 And the updates to an uberblock are done by writing a modified uberblock to 
another element of the uberblock array (128 entries).
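
To look at the labels directly, something like this works (a hedged sketch; 
the device name is made up, substitute one of the pool's own devices):

   # dump the four on-disk labels (L0-L3) of one pool device
   zdb -l /dev/dsk/c0t0d0s0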
peace.

 Leal
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot remove slog device from zpool

2008-10-28 Thread Neil Perrin
Ethan,

It is still not possible to remove a slog from a pool. This is bug:

6574286 removing a slog doesn't work

The error message:

cannot remove c4t15d0p0: only inactive hot spares or cache devices can be 
removed

is correct and this is the same as documented in the zpool man page:

 zpool remove pool device ...

 Removes the specified device from the pool. This command
 currently  only  supports  removing hot spares and cache
 devices.

It's actually relatively easy to implement removal of slogs. We simply flush the
outstanding transactions and start using the main pool for the Intent Logs.
Thus the vacated device can be removed.
However, we wanted to make sure it fit into the framework for
the removal of any device. This is a much harder problem, on which we
have made progress, but it's not there yet...

Neil.


On 10/26/08 11:41, Ethan Erchinger wrote:
 Sorry for the first incomplete send,  stupid Ctrl-Enter. :-)
 
 Hello,
 
 I've looked quickly through the archives and haven't found mention of 
 this issue.  I'm running SXCE (snv_99), which uses zfs version 13.  I 
 had an existing zpool:
 --
 [EMAIL PROTECTED] ~]$ zpool status -v data
   pool: data
  state: ONLINE
  scrub: none requested
 config:
 
 NAME   STATE READ WRITE CKSUM
 data   ONLINE   0 0 0
   mirror   ONLINE   0 0 0
 c4t1d0p0   ONLINE   0 0 0
 c4t9d0p0   ONLINE   0 0 0
   ...
 cache
   c4t15d0p0ONLINE   0 0 0
 
 errors: No known data errors
 
 --
 
 The cache device (c4t15d0p0) is an Intel SSD.  To test zil, I removed 
 the cache device, and added it as a log device:
 --
 [EMAIL PROTECTED] ~]$ pfexec zpool remove data c4t15d0p0
 [EMAIL PROTECTED] ~]$ pfexec zpool add data log c4t15d0p0
 [EMAIL PROTECTED] ~]$ zpool status -v data
   pool: data
  state: ONLINE
  scrub: none requested
 config:
 
 NAME   STATE READ WRITE CKSUM
 data   ONLINE   0 0 0
   mirror   ONLINE   0 0 0
 c4t1d0p0   ONLINE   0 0 0
 c4t9d0p0   ONLINE   0 0 0
   ...
 logs   ONLINE   0 0 0
   c4t15d0p0ONLINE   0 0 0
 
 errors: No known data errors
 --
 
 The device is working fine.  I then said, that was fun, time to remove 
 and add as cache device.  But that doesn't seem possible:
 --
 [EMAIL PROTECTED] ~]$ pfexec zpool remove data c4t15d0p0
 cannot remove c4t15d0p0: only inactive hot spares or cache devices can 
 be removed
 --
 
 I've also tried using detach, offline, each failing in other more 
 obvious ways.  The manpage does say that those devices should be 
 removable/replaceable.  At this point the only way to reclaim my SSD 
 device is to destroy the zpool.
 
 Just in-case you are wondering about versions:
 --
 [EMAIL PROTECTED] ~]$ zpool upgrade data
 This system is currently running ZFS pool version 13.
 
 Pool 'data' is already formatted using the current version.
 [EMAIL PROTECTED] ~]$ uname -a
 SunOS opensolaris 5.11 snv_99 i86pc i386 i86pc
 --
 
 Any ideas?
 
 Thanks,
 Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool CKSUM errors since drive replace

2008-10-28 Thread Matthew Angelo
Another update:

Last night, having already read many blog posts about si3124 chipset problems with
Solaris 10, I applied patch 138053-02, which updates the si3124 driver from 1.2
to 1.4 and fixes numerous performance and interrupt-related bugs.

And it appears to have helped.  Below is the zpool scrub after the new
driver, but I'm still not confident about the exact problem.

# zpool status -v
  pool: rzdata
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed with 1 errors on Wed Oct 29 05:32:16 2008
config:

NAMESTATE READ WRITE CKSUM
rzdata  ONLINE   0 0 2
  raidz1ONLINE   0 0 2
c3t0d0  ONLINE   0 0 0
c3t1d0  ONLINE   0 0 0
c3t2d0  ONLINE   0 0 0
c3t3d0  ONLINE   0 0 0
c4t0d0  ONLINE   0 0 0
c4t1d0  ONLINE   0 0 0
c4t2d0  ONLINE   0 0 3
c4t3d0  ONLINE   0 0 0

errors: Permanent errors have been detected in the following files:

/rzdata/downloads/linux/ubuntu-8.04.1-desktop-i386.iso

It still didn't clear the errored file I have, which I'm curious about, 
considering it's a RAID-Z.
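
For what it's worth, a hedged sketch of what is sometimes done once the 
affected file has been restored or deleted (not advice from this thread; the 
permanent-error list is only re-evaluated by a later scrub):

   # reset the per-device error counters, then re-verify the whole pool
   zpool clear rzdata
   zpool scrub rzdata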

On Mon, Oct 27, 2008 at 2:57 PM, Matthew Angelo [EMAIL PROTECTED] wrote:

 Another update.

 Weekly cron kicked in again this week, but this time is failed with a lot
 of CKSUM errors and now also complained about corrupted files.  The single
 file it complained about is a new one I recently copied into it.

 I'm stumped with this.  How do I verify the x86 hardware under the OS?

 I've run Memtest86 and it ran overnight without a problem.  Tonight I will
 be moving back to my old Motherboard/CPU/Memory.  Hopefully this is a simple
 hardware problems.

 But the question I'd like to pose to everyone is, how can we validate our
 x86 hardware?


 On Tue, Oct 21, 2008 at 8:23 AM, David Turnbull [EMAIL PROTECTED]wrote:

 I don't think it's normal, no.. it seems to occur when the resilver is
 interrupted and gets marked as done prematurely?


 On 20/10/2008, at 12:28 PM, Matthew Angelo wrote:

  Hi David,

 Thanks for the additional input.   This is the reason why I thought I'd
 start a thread about it.

 To continue my original topic, I have additional information to add.
 After last weeks initial replace/resilver/scrub -- my weekly cron scrub
 (runs Sunday morning) kicked off and all CKSUM errors have now cleared:


  pool: rzdata
  state: ONLINE
  scrub: scrub completed with 0 errors on Mon Oct 20 09:41:31 2008
 config:

NAMESTATE READ WRITE CKSUM
rzdata  ONLINE   0 0 0
  raidz1ONLINE   0 0 0
c3t0d0  ONLINE   0 0 0
c3t1d0  ONLINE   0 0 0
c3t2d0  ONLINE   0 0 0
c3t3d0  ONLINE   0 0 0
c4t0d0  ONLINE   0 0 0
c4t1d0  ONLINE   0 0 0
c4t2d0  ONLINE   0 0 0
c4t3d0  ONLINE   0 0 0

 errors: No known data errors


 Which requires me to ask -- is it standard for high Checksum (CKSUM)
 errors on a zpool when you replace a failed disk after it has resilvered?

 Is there anything I can feedback into the zfs community on this matter?

 Matt

 On Sun, Oct 19, 2008 at 9:26 AM, David Turnbull [EMAIL PROTECTED]
 wrote:
 Hi Matthew.

 I had a similar problem occur last week. One disk in the raidz had the
 first 4GB zeroed out (manually) before we then offlined it and replaced with
 a new disk.
 High checksum errors were occuring on the partially-zeroed disk, as you'd
 expect, but when the new disk was inserted, checksum errors occured on all
 disks.

 Not sure how relevant this is to your particular situation, but
 unexpected checksum errors on known-good hardware has definitely happened to
 me as well.

 -- Dave


 On 15/10/2008, at 10:50 PM, Matthew Angelo wrote:

 The original disk failure was very explicit.  High Read Errors and errors
 inside /var/adm/messages.

 When I replaced the disk however, these have all gone and the resilver
 was okay.  I am not seeing any read/write or /var/adm/messages errors -- but
 for some reason I am seeing errors inside the CKSUM column which I've never
 seen before.

 I hope you're right and it's a simple memory corruption problem.   I will
 be running memtest86 overnight and hopefully it fails so we can rule our
 zfs.


 On Wed, Oct 15, 2008 at 11:48 AM, Mark J Musante [EMAIL PROTECTED]
 wrote:
  So this is where I stand.  I'd like to ask zfs-discuss if they've seen
 any ZIL/Replay style bugs associated with u3/u5 x86?  Again, I'm confident
 in my 

Re: [zfs-discuss] diagnosing read performance problem

2008-10-28 Thread Nigel Smith
Hi Matt.
Ok, got the capture and successfully 'unzipped' it.
(Sorry, I guess I'm using old software to do this!)

I see 12840 packets. The capture is a TCP conversation 
between two hosts using the SMB aka CIFS protocol.

10.194.217.10 is the client - Presumably Windows?
10.194.217.3 is the server - Presumably OpenSolaris - CIFS server?

Using WireShark,
Menu: 'Statistics > Endpoints' shows:

The Client has transmitted 4849 packets, and
the Server has transmitted 7991 packets.

Menu: 'Analyze > Expert Info Composite':
The 'Errors' tab shows:
4849 packets with a 'Bad TCP checksum' error - These are all transmitted by the 
Client.

(Apply a filter of 'ip.src_host == 10.194.217.10' to confirm this.)

The 'Notes' tab shows:
..numerous 'Duplicate Ack's'
For example, for 60 different ACK packets, the exact same packet was 
re-transmitted 7 times!
Packet #3718 was duplicated 17 times.
Packet #8215 was duplicated 16 times.
packet #6421 was duplicated 15 times, etc.
These bursts of duplicate ACK packets are all coming from the client side.

This certainly looks strange to me - I've not seen anything like this before.
It's not going to help the speed to unnecessarily duplicate packets like
that, and these bursts are often closely followed by a short delay, ~0.2 seconds.
And as far as I can see, it looks to point towards the client as the source
of the problem.
If you are seeing the same problem with other client PCs, then I guess we need to 
suspect the switch that connects them.

OK, those are my thoughts & conclusions for now.
Maybe you could get some more snoop captures with other clients, and
with a different switch, and do a similar analysis.
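
A capture along those lines could be taken with snoop, for example (a hedged 
sketch; the interface name is an assumption, the client address is the one 
from your earlier capture):

   # capture all traffic to/from the client into a file Wireshark can open
   snoop -d e1000g0 -o /tmp/client2-capture.cap host 10.194.217.10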
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-28 Thread Richard Elling
I replied to Matt directly, but didn't hear back.  It may be a driver issue
with checksum offloading.  Certainly the symptoms are consistent.
To test with a workaround see
http://bugs.opensolaris.org/view_bug.do?bug_id=6686415
 -- richard

Nigel Smith wrote:
 Hi Matt.
 Ok, got the capture and successfully 'unzipped' it.
 (Sorry, I guess I'm using old software to do this!)

 I see 12840 packets. The capture is a TCP conversation 
 between two hosts using the SMB aka CIFS protocol.

 10.194.217.10 is the client - Presumably Windows?
 10.194.217.3 is the server - Presumably OpenSolaris - CIFS server?

 Using WireShark,
 Menu: 'Statistics  Endpoints' show:

 The Client has transmitted 4849 packets, and
 the Server has transmitted 7991 packets.

 Menu: 'Analyze  Expert info Composite':
 The 'Errors' tab shows:
 4849 packets with a 'Bad TCP checksum' error - These are all transmitted by 
 the Client.

 (Apply a filter of 'ip.src_host == 10.194.217.10' to confirm this.)

 The 'Notes' tab shows:
 ..numerous 'Duplicate Ack's'
 For example, for 60 different ACK packets, the exact same packet was 
 re-transmitted 7 times!
 Packet #3718 was duplicated 17 times.
 Packet #8215 was duplicated 16 times.
 packet #6421 was duplicated 15 times, etc.
 These bursts of duplicate ACK packets are all coming from the client side.

 This certainly looks strange to me - I've not seen anything like this before.
 It's not going to help the speed to unnecessarily duplicate packets like
 that, and these burst are often closely followed by a short delay, ~0.2 
 seconds.
 And as far as I can see, it looks to point towards the client as the source
 of the problem.
 If you are seeing the same problem with other client PC, then I guess we need 
 to 
 suspect the 'switch' that connects them.

 Ok, that's my thoughts  conclusion for now.
 Maybe you could get some more snoop captures with other clients, and
 with a different switch, and do a similar analysis.
 Regards
 Nigel Smith
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Dribbling checksums

2008-10-28 Thread Charles Menser
My home server is giving me fits.

I have seven disks, comprising three pools, on two multi-port SATA
controllers (one onboard the Asus M2A-VM motherboard, and one
Supermicro AOC-SAT2-MV8).

The disks range in age from many months to many days.

Two pools are mirrors, one is a raidz.

The machine is running opensolaris snv99.

Nearly every time I scrub a pool I get small numbers of checksum
errors on random drives on either controller.

I have replaced the power supply, suspecting bad power, to no avail.

I removed the AOC-SAT2-MV8 and all the drives, save the root mirror,
(to try ruling out some weird interaction with the AOC-SAT2-MV8) and
still take errors.

Has anyone had a similar problem?

Any ideas what may be happening?

Is there more data I can provide?
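
A couple of commands that are often useful for digging further (a hedged 
sketch, not specific to this setup):

   # recent FMA error telemetry, including ZFS checksum ereports
   fmdump -eV | more
   # per-device soft/hard/transport error counters
   iostat -En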

Many thanks,
Charles
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss