Re: [zfs-discuss] System crash on zpool attach object_count == usedobjs failed assertion

2010-03-03 Thread Nigel Smith
I've just run zdb against the two pools on my home OpenSolaris box,
and now both are showing this failed assertion, with the counts off by one.

  # zdb rpool /dev/null
  Assertion failed: object_count == usedobjs (0x18da2 == 0x18da3), file 
../zdb.c, line 1460
  Abort (core dumped)

  # zdb rz2pool /dev/null
  Assertion failed: object_count == usedobjs (0x2ba25 == 0x2ba26), file 
../zdb.c, line 1460
  Abort (core dumped)

The last time I checked them with zdb, probably a few months back,
they were fine.

And since the pools otherwise seem to be behaving without problem,
I've had no reason to run zdb.

'zpool status' looks fine, and the pools mount without problem.
'zpool scrub' works without problem.

I have been upgrading to most of the recent 'dev' version of OpenSolaris.
I wonder if there is some bug in the code that could cause this assertion.

Maybe one unusual thing is that I have not yet upgraded the 
versions of the pools.

  # uname -a
  SunOS opensolaris 5.11 snv_133 i86pc i386 i86pc  
  # zpool upgrade
  This system is currently running ZFS pool version 22.

  The following pools are out of date, and can be upgraded.  After being
  upgraded, these pools will no longer be accessible by older software versions.

  VER  POOL
  ---  
  13   rpool
  16   rz2pool

The assertion is being tracked by this bug:

  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6801840

..but in that report, the counts are not off by one.
Unfortunately, there is little indication of any progress being made.

Maybe some other 'zfs-discuss' readers who are running a recent dev build
could try zdb on their pools, and see if they get a similar problem...

Thanks
Nigel Smith


# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libzpool.so.1 libtopo.so.1 
libavl.so.1 libnvpair.so.1 ld.so.1 ]
> ::status
debugging core file of zdb (64-bit) from opensolaris
file: /usr/sbin/amd64/zdb
initial argv: zdb rpool
threading model: native threads
status: process terminated by SIGABRT (Abort), pid=883 uid=0 code=-1
panic message:
Assertion failed: object_count == usedobjs (0x18da2 == 0x18da3), file ../zdb.c,
line 1460
> $C
fd7fffdff090 libc.so.1`_lwp_kill+0xa()
fd7fffdff0b0 libc.so.1`raise+0x19()
fd7fffdff0f0 libc.so.1`abort+0xd9()
fd7fffdff320 libc.so.1`_assert+0x7d()
fd7fffdff810 dump_dir+0x35a()
fd7fffdff840 dump_one_dir+0x54()
fd7fffdff850 libzpool.so.1`findfunc+0xf()
fd7fffdff940 libzpool.so.1`dmu_objset_find_spa+0x39f()
fd7fffdffa30 libzpool.so.1`dmu_objset_find_spa+0x1d2()
fd7fffdffb20 libzpool.so.1`dmu_objset_find_spa+0x1d2()
fd7fffdffb40 libzpool.so.1`dmu_objset_find+0x2c()
fd7fffdffb70 dump_zpool+0x197()
fd7fffdffc10 main+0xa3d()
fd7fffdffc20 0x406e6c()
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] System crash on zpool attach object_count == usedobjs failed assertion

2010-03-03 Thread Nigel Smith
Hi Stephen 

If your system is crashing while attaching the new device,
are you getting a core dump file?

If so, it would be interesting to examine the file with mdb,
to see the stack backtrace, as this may give a clue to what's going wrong.
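If the crash left a kernel crash dump under /var/crash, a quick look with mdb
along these lines would show the panic stack (the '0' suffix is just an example
dump number - use whatever files are actually in that directory):

  # cd /var/crash/`hostname`
  # mdb unix.0 vmcore.0
  > ::status
  > ::stack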

What storage controller are you using for the disks?
And what device driver is the controller using?

Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] crashed zpool

2010-03-01 Thread Nigel Smith
Hello Carsten

Have you examined the core dump file with mdb '::stack'
to see if this gives a clue to what happened?
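For reference, a minimal session might look like this (the core file path is a
placeholder):

  # mdb /path/to/core
  > ::status
  > ::stack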

Regards
Nigel
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help with itadm commands

2010-02-23 Thread Nigel Smith
The iSCSI COMSTAR Port Provider is not installed by default.
What release of OpenSolaris are you running?
If pre snv_133 then:

  $ pfexec pkg install  SUNWiscsit

For snv_133, I think it will be:

  $ pfexec pkg install  network/iscsi/target

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-18 Thread Nigel Smith
Hi Matt
Are you seeing low speeds on writes only, or on both read AND write?

Are you seeing low speed just with iSCSI or also with NFS or CIFS?

 I've tried updating to COMSTAR 
 (although I'm not certain that I'm actually using it)

To check, do this:

  # svcs -a | grep iscsi

If 'svc:/system/iscsitgt:default' is online,
you are using the old & mature 'user mode' iscsi target.

If 'svc:/network/iscsi/target:default' is online,
then you are using the new 'kernel mode' comstar iscsi target.

For another good way to monitor disk i/o, try:

  # iostat -xndz 1

  http://docs.sun.com/app/docs/doc/819-2240/iostat-1m?a=view

Don't just assume that your Ethernet, IP & TCP layers
are performing to the optimum - check it.

I often use 'iperf' or 'netperf' to do this:

  http://blogs.sun.com/observatory/entry/netperf

(Iperf is available by installing the SUNWiperf package.
A package for netperf is in the contrib repository.)
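As a rough sketch, an iperf run looks something like this (the server address
is a placeholder; run the first command on one box and the second on the other):

  $ iperf -s
  $ iperf -c 192.168.1.10 -t 30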

The last time I checked, the default values used
in the OpenSolaris TCP stack are not optimum
for Gigabit speed, and need to be adjusted.
Here is some advice I found with Google, but
there are other sources:

  
http://serverfault.com/questions/13190/what-are-good-speeds-for-iscsi-and-nfs-over-1gb-ethernet
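As an illustration only (the values here are just examples, and changes made
with ndd do not survive a reboot), the send and receive buffers can be raised
like this:

  # ndd -set /dev/tcp tcp_xmit_hiwat 1048576
  # ndd -set /dev/tcp tcp_recv_hiwat 1048576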

BTW, what sort of network card are you using? This can make a difference.

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-18 Thread Nigel Smith
Hi Matt

 Haven't gotten NFS or CIFS to work properly.
 Maybe I'm just too dumb to figure it out,
 but I'm ending up with permissions errors that don't let me do much.
 All testing so far has been with iSCSI.

So until you can test NFS or CIFS, we don't know if it's a 
general performance problem, or just an iSCSI problem.

To get CIFS working, try this:

  
http://blogs.sun.com/observatory/entry/accessing_opensolaris_shares_from_windows

 Here's IOStat while doing writes : 
 Here's IOStat when doing reads : 

You're getting 1000 kr/s & kw/s, so add the iostat 'M' option
to display throughput in megabytes per second.
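Something like this, for example:

  # iostat -xnMdz 1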

 It'll sustain 10-12% gigabit for a few minutes, have a little dip,

I'd still be interested to see the size of the TCP buffers.
What does this report:

# ndd /dev/tcp  tcp_xmit_hiwat
# ndd /dev/tcp  tcp_recv_hiwat
# ndd /dev/tcp  tcp_conn_req_max_q
# ndd /dev/tcp  tcp_conn_req_max_q0

 Current NIC is an integrated NIC on an Abit Fatality motherboard.
 Just your generic fare gigabit network card.
 I can't imagine that it would be holding me back that much though.

Well there are sometimes bugs in the device drivers:

  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913756
  http://sigtar.com/2009/02/12/opensolaris-rtl81118168b-issues/

That's why I say don't just assume the network is performing to the optimum.

To do a local test, direct to the hard drives, you could try 'dd',
with various transfer sizes. Some advice from BenR, here:

  http://www.cuddletech.com/blog/pivot/entry.php?id=820
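For example, something along these lines (the pool name and file path are
placeholders; bear in mind the read-back may be partly served from the ARC cache):

  # time dd if=/dev/zero of=/tank/ddtest bs=1024k count=4096
  # time dd if=/tank/ddtest of=/dev/null bs=1024k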

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Abysmal ISCSI / ZFS Performance

2010-02-18 Thread Nigel Smith
Another thing you could check, which has been reported to
cause problems, is whether network or disk drivers share an interrupt
with a slow device, such as a USB device. So try:

# echo ::interrupts -d | mdb -k

... and look for multiple driver names on an INT#.
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Idiots Guide to Running a NAS with ZFS/OpenSolaris

2010-02-18 Thread Nigel Smith
Hi Robert 
Have a look at the links collected here:

  http://delicious.com/nwsmith/opensolaris-nas

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disk Issues

2010-02-16 Thread Nigel Smith
I have booted up an osol-dev-131 live CD on a Dell Precision T7500,
and the AHCI driver successfully loaded, to give access
to the two sata DVD drives in the machine.

(Unfortunately, I did not have the opportunity to attach
any hard drives, but I would expect that also to work.)

'scanpci' identified the southbridge as an
Intel 82801JI (ICH10 family)
Vendor 0x8086, device 0x3a22

AFAIK, as long as the SATA interface reports a PCI ID
class-code of 010601, then the AHCI device driver 
should load.

The mode of the SATA interface will need to be selected in the BIOS.
There are normally three modes: Native IDE, RAID or AHCI.

'scanpci' should report different class-codes depending
on the mode selected in the BIOS.

RAID mode should report a class-code of 010400
IDE mode should report a class-code of 0101xx

With OpenSolaris, you can see the class-code in the
output from 'prtconf -pv'.
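For example, something like this should pull out the class-code properties
(the grep is just a convenience):

  # prtconf -pv | grep class-code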

If Native IDE is selected, the ICH10 SATA interface should
appear as two controllers, the first for ports 0-3,
and the second for ports 4 & 5.

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Painfully slow RAIDZ2 as fibre channel COMSTAR export

2010-02-14 Thread Nigel Smith
Hi Dave
So which hard drives are connected to which controllers?
And what device drivers are those controllers using?

The output from 'format', 'cfgadm' and 'prtconf -D'
may help us to understand.

Strange that you say that there are two hard drives
per controller, but three drives are showing
high %b.

And strange that you have c7,c8,c9,c10,c11
which looks like FIVE controllers!

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ..and now ZFS send dedupe

2009-11-09 Thread Nigel Smith
More ZFS goodness putback before close of play for snv_128.

  http://mail.opensolaris.org/pipermail/onnv-notify/2009-November/010768.html

  http://hg.genunix.org/onnv-gate.hg/rev/216d8396182e

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + fsck

2009-11-09 Thread Nigel Smith
On Thu Nov 5 14:38:13 PST 2009, Gary Mills wrote:
 It would be nice to see this information at:
 http://hub.opensolaris.org/bin/view/Community+Group+on/126-130
 but it hasn't changed since 23 October.

Well it seems we have an answer:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-November/033672.html

On Mon Nov 9 14:26:54 PST 2009, James C. McPherson wrote:
 The flag days page has not been updated since the switch
 to XWiki, it's on my todo list but I don't have an ETA
 for when it'll be done.

Perhaps anyone interested in seeing the flag days page
resurrected can petition James to raise the priority on
his todo list.
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] marvell88sx2 driver build126

2009-11-08 Thread Nigel Smith
I think you can work out the files for the driver by looking here:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/pkgdefs/SUNWmv88sx/prototype_i386

So the 32 bit driver is:

 kernel/drv/marvell88sx

And the 64 bit driver is:

 kernel/drv/amd64/marvell88sx

It's a pity that the marvell driver is not open source.
For the sata drivers that are open source,

  ahci, nv_sata, si3124

..you can see the history of all the changes to the source code
of the drivers, all cross referenced to the bug numbers, using OpenGrok:

  
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/sata/adapters/

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Nigel Smith
Hi Robert
I think you mean snv_128 not 126 :-)

  6667683  need a way to rollback to an uberblock from a previous txg 
  http://bugs.opensolaris.org/view_bug.do?bug_id=6667683

  http://hg.genunix.org/onnv-gate.hg/rev/8aac17999e4d

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Nigel Smith
Hi Gary
I will let 'website-discuss' know about this problem.
They normally fix issues like that.
Those pages always seemed to just update automatically.
I guess it's related to the website transition.
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] dedupe is in

2009-11-02 Thread Nigel Smith
ZFS dedup will be in snv_128,
but putbacks to snv_128 will not likely close till the end of this week.

The OpenSolaris dev repository was updated to snv_126 last Thursday:
http://mail.opensolaris.org/pipermail/opensolaris-announce/2009-October/001317.html

So it looks like about 5 weeks before the dev
repository will be updated to snv_128.

Then we see if any bugs emerge as we all rush to test it out...
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Nigel Smith
Adam
The 'OpenSolaris Development Release Packaging Repository'
has recently been updated to release 121.

  
http://mail.opensolaris.org/pipermail/opensolaris-announce/2009-August/001253.html
  http://pkg.opensolaris.org/dev/en/index.shtml

Just to be totally clear, are you recommending that anyone
using raidz, raidz2 or raidz3 should not upgrade to that release?

For the people who have already upgraded, presumably the
recommendation is that they should revert to a pre 121 BE.

Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Shrinking a zpool?

2009-08-06 Thread Nigel Smith
Hi Matt
Thanks for this update, and the confirmation
to the outside world that this problem is being actively
worked on with significant resources.

But I would like to support Cyril's comment.

AFAIK, any updates you are making to bug 4852783 are not
available to the outside world via the normal bug URL.
It would be useful if we were able to see them.

I think it is frustrating for the outside world that
it cannot see Sun's internal source code repositories
for work in progress, and only see the code when it is
complete and pushed out.

And so there is no way to judge what progress is being made,
or to actively help with code reviews or testing.

Best Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Shrinking a zpool?

2009-08-06 Thread Nigel Smith
Bob Friesenhahn wrote:
 Sun has placed themselves in the interesting predicament that being 
 open about progress on certain high-profile enterprise features 
 (such as shrink and de-duplication) could cause them to lose sales to 
 a competitor.  Perhaps this is a reason why Sun is not nearly as open 
 as we would like them to be.

I agree that it is difficult for Sun, at this time, to 
be more 'open', especially for ZFS, as we still await the resolution
of Oracle purchasing Sun, the court case with NetApp over patents,
and now the GreenBytes issue!

But I would say they are more likely to avoid losing sales
by confirming what enhancements they are prioritising.
I think people will wait if they know work is being done,
and progress being made, although not indefinitely.

I guess it depends on the rate of progress of ZFS compared to say btrfs.

I would say that maybe Sun should have held back on
announcing the work on deduplication, as it just seems to 
have ramped up frustration now that no more news is forthcoming.
It's easy to be wise after the event, and time will tell.

Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

2009-07-27 Thread Nigel Smith
David Magda wrote:
This is also (theoretically) why a drive purchased from Sun is more  
expensive than a drive purchased from your neighbourhood computer  
shop: Sun (and presumably other manufacturers) takes the time and  
effort to test things to make sure that when a drive says 'I've synced  
the data', it actually has synced the data. This testing is what  
you're presumably paying for.

So how do you test a hard drive to check it does actually sync the data?
How would you do it in theory?
And in practice?

Now say we are talking about a virtual hard drive,
rather than a physical hard drive.
How would that affect the answer to the above questions?

Thanks
Nigel
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help recovering zfs filesystem

2008-11-07 Thread Nigel Smith
FYI, here is the link to the 'labelfix' utility.
It's an attachment to one of Jeff Bonwick's posts on this thread:

http://www.opensolaris.org/jive/thread.jspa?messageID=229969

or here:

http://mail.opensolaris.org/pipermail/zfs-discuss/2008-May/047267.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-May/047270.html

Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-30 Thread Nigel Smith
Hi Matt
Well this time you have filtered out any SSH traffic on port 22 successfully.

But I'm still only seeing half of the conversation!
I see packets sent from client to server.
That is from source: 10.194.217.12 to destination: 10.194.217.3
So a different client IP this time.

And the Duplicate ACK packets (often long bursts) are back in this capture.
I've looked at these a little bit more carefully this time,
and I now notice it's using the 'TCP selective acknowledgement' feature (SACK) 
on those packets.

Now this is not something I've come across before, so I need to do some
googling!  SACK is defined in RFC 2018.

 http://www.ietf.org/rfc/rfc2018.txt

I found this explanation of when SACK is used:

 http://thenetworkguy.typepad.com/nau/2007/10/one-of-the-most.html
 http://thenetworkguy.typepad.com/nau/2007/10/tcp-selective-a.html

This seems to indicate these 'SACK' packets are triggered as a result 
of 'lost packets'; in this case, it must be the packets sent back from
your server to the client during your video playback.

Of course I'm not seeing ANY of those packets in this capture
because there are none captured from server to client!  
I'm still not sure why you cannot seem to capture these packets!

Oh, by the way, I probably should advise you to run...

 # netstat -i

..on the OpenSolaris box, to see if any errors are being counted
on the network interface.

Are you still seeing the link going up/down in '/var/adm/messages'?
You are never going to do any good while that is happening.
I think you need to try a different network card in the server.
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-29 Thread Nigel Smith
Hi Tano
Great to hear that you've now got this working!!

I understand you are using a Broadcom network card,
from your previous posts I can see you are using the 'bnx' driver.

I will raise this as a bug, but first please would you run '/usr/X11/bin/scanpci'
to identify the exact 'vendor id' and 'device id' for the Broadcom network chipset,
and report that back here.

I must admit that this is the first I have heard of 'I/OAT DMA',
so I did some Googling on it, and found this link:

http://opensolaris.org/os/community/arc/caselog/2008/257/onepager/

To quote from that ARC case:

  All new Sun Intel based platforms have Intel I/OAT (I/O Acceleration
   Technology) hardware.

   The first such hardware is an on-systemboard asynchronous DMA engine
   code named Crystal Beach.

   Through a set of RFEs Solaris will use this hardware to implement
   TCP receive side zero CPU copy via a socket.

Ok, so I think that makes some sense, in the context of
the problem we were seeing. It's referring to how the network
adaptor transfers the data it has received, out of the buffer
and onto the rest of the operating system.

I've just looked to see if I can find the source code for 
the BNX driver, but I cannot find it.

Digging deeper we find on this page:
http://www.opensolaris.org/os/about/no_source/
..on the 'ON' tab, that:

  Components for which there are currently no plans to release source:
  bnx driver (B) - Broadcom NetXtreme II Gigabit Ethernet driver

So the bnx driver is closed source :-(
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-29 Thread Nigel Smith
Hi Matt
Can you just confirm whether that Ethernet capture file, that you made available,
was done on the client or on the server? I'm beginning to suspect you
did it on the client.

You can get a capture file on the server (OpenSolaris) using the 'snoop'
command, as per one of my previous emails.  You can still view the
capture file with WireShark as it supports the 'snoop' file format.
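Something along these lines on the server should do it (substitute your
interface name and the client's address):

  # snoop -d {device} -o /tmp/server.cap host {client-ip}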

Normally it would not be too important where the capture was obtained,
but here, where something strange is happening, it could be critical to 
understanding what is going wrong and where.

It would be interesting to do two separate captures - one on the client
and one on the server, at the same time, as this would show if the
switch was causing disruption.  Try to have the clocks on the client &
server synchronised as closely as possible.
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-29 Thread Nigel Smith
Hi Matt
In your previous capture (which you have now confirmed was done
on the Windows client), all those 'Bad TCP checksum' packets sent by the client
are explained, because you must be doing hardware TCP checksum offloading
on the client network adaptor.  WireShark will capture the packets before
that hardware calculation is done, so the checksums all appear to be wrong,
as they have not yet been calculated!

  http://wiki.wireshark.org/TCP_checksum_offload
  http://www.wireshark.org/docs/wsug_html_chunked/ChAdvChecksums.html

Ok, so let's look at the new capture, 'snoop'ed on the OpenSolaris box.

I was surprised how small that snoop capture file was
 - only 753400 bytes after unzipping.
I soon realized why...

The strange thing is that I'm only seeing half of the conversation!
I see packets sent from client to server.
That is from source: 10.194.217.10 to destination: 10.194.217.3

I can also see some packets from
source: 10.194.217.5 (your AD domain controller) to destination: 10.194.217.3

But you've not captured anything transmitted from your
OpenSolaris server - source: 10.194.217.3

(I checked, and I did not have any filters applied in WireShark
that would cause the missing half!)
Strange! I'm not sure how you did that.

The half of the conversation that I can see looks fine - there
does not seem to be any problem.  I'm not seeing any duplication
of ACK's from the client in this capture.  
(So again somewhat strange, unless you've fixed the problem!)

I'm assuming you're using a single network card in the Solaris server, 
but maybe you had better just confirm that.

Regarding not capturing SSH traffic and only capturing traffic from
(& hopefully to) the client, try this:

 # snoop -o test.cap -d rtls0 host 10.194.217.10 and not port 22

Regarding those 'link down', 'link up' messages in '/var/adm/messages':
I can tie up some of those events with your snoop capture file,
but it just shows that no packets are being received while the link is down,
which is exactly what you would expect.
But dropping the link for a second will surely disrupt your video playback!

If the switch is ok, and the cable from the switch is ok, then it does
now point towards the network card in the OpenSolaris box.  
Maybe it's as simple as a bad mechanical connection on the cable socket.

BTW, just run '/usr/X11/bin/scanpci' and identify the 'vendor id' and
'device id' for the network card, just in case it turns out to be a driver bug.
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-28 Thread Nigel Smith
Hi Matt.
Ok, got the capture and successfully 'unzipped' it.
(Sorry, I guess I'm using old software to do this!)

I see 12840 packets. The capture is a TCP conversation 
between two hosts using the SMB aka CIFS protocol.

10.194.217.10 is the client - Presumably Windows?
10.194.217.3 is the server - Presumably OpenSolaris - CIFS server?

Using WireShark,
Menu: 'Statistics > Endpoints' shows:

The Client has transmitted 4849 packets, and
the Server has transmitted 7991 packets.

Menu: 'Analyze > Expert Info Composite':
The 'Errors' tab shows:
4849 packets with a 'Bad TCP checksum' error - These are all transmitted by the 
Client.

(Apply a filter of 'ip.src_host == 10.194.217.10' to confirm this.)

The 'Notes' tab shows:
..numerous 'Duplicate Ack's'
For example, for 60 different ACK packets, the exact same packet was 
re-transmitted 7 times!
Packet #3718 was duplicated 17 times.
Packet #8215 was duplicated 16 times.
packet #6421 was duplicated 15 times, etc.
These bursts of duplicate ACK packets are all coming from the client side.

This certainly looks strange to me - I've not seen anything like this before.
It's not going to help the speed to unnecessarily duplicate packets like
that, and these burst are often closely followed by a short delay, ~0.2 seconds.
And as far as I can see, it looks to point towards the client as the source
of the problem.
If you are seeing the same problem with other client PC, then I guess we need 
to 
suspect the 'switch' that connects them.

Ok, those are my thoughts & conclusions for now.
Maybe you could get some more snoop captures with other clients, and
with a different switch, and do a similar analysis.
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] My 500-gig ZFS is gone: insufficient replicas, corrupted data

2008-10-27 Thread Nigel Smith
Hi Eugene
I'm delighted to hear you got your files back!

I've seen a few posts to this forum where people have
made some change to the hardware, and then found
that the ZFS pool has gone. And often you never
hear any more from them, so you assume they could
not recover it.

Thanks for reporting back your interesting story.
I wonder how many other people have been caught out
with this 'Host Protected Area' (HPA) and never
worked out that this was the cause...

Maybe one moral of this story is to make a note of
your hard drive and partition sizes now, while
you have a working system.

If you're using Solaris, maybe try 'prtvtoc'.
http://docs.sun.com/app/docs/doc/819-2240/prtvtoc-1m?a=view
(Unless someone knows a better way?)
Thanks
Nigel Smith


# prtvtoc /dev/rdsk/c1t1d0
* /dev/rdsk/c1t1d0 partition map
*
* Dimensions:
* 512 bytes/sector
* 1465149168 sectors
* 1465149101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First      Sector     Last
*       Sector     Count      Sector
*           34        222        255
*
*                            First      Sector     Last
* Partition  Tag  Flags      Sector     Count      Sector     Mount Directory
        0     4    00           256 1465132495 1465132750
        8    11    00    1465132751      16384 1465149134
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] My 500-gig ZFS is gone: insufficient replicas, corrupted data

2008-10-27 Thread Nigel Smith
...check out that link that Eugene provided.
It was a GigaByte GA-G31M-S2L motherboard.
http://www.gigabyte.com.tw/Products/Motherboard/Products_Spec.aspx?ProductID=2693

Some more info on 'Host Protected Area' (HPA), relating to OpenSolaris here:
http://opensolaris.org/os/community/arc/caselog/2007/660/onepager/
http://bugs.opensolaris.org/view_bug.do?bug_id=5044205

Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import problem

2008-10-27 Thread Nigel Smith
Hi Terry
Please could you post back to this forum the output from

 # zdb -l /dev/rdsk/...

... for each of the 5 drives in your raidz2.
(maybe best as an attachment)
Are you seeing labels with the error 'failed to unpack'?
What is the reported 'status' of your zpool?
(You have not provided a 'zpool status')
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool import: all devices online but: insufficient replicas

2008-10-27 Thread Nigel Smith
Hi Kristof 
Please could you post back to this forum the output from

# zdb -l /dev/rdsk/...

... for each of the storage devices in your pool,
while it is in a working condition on Server1.
(Maybe best as an attachment)
Then do the same again with the pool on Server2.

What is the reported 'status' of your zpool on Server2?
(You have not provided a 'zpool status')
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] My 500-gig ZFS is gone: insufficient replicas, corrupted data

2008-10-27 Thread Nigel Smith
Hi Miles
I think you make some very good points in your comments.
It would be nice to get some positive feedback on these from Sun.

And my thought also, on (quickly) looking at that bug & ARC case, was:
does this not also need to be factored into the SATA framework?

I really miss not having 'smartctl' (fully) working with PATA and 
SATA drives on x86 Solaris.

I've done a quick search on PSARC 2007/660: it was
closed approved fast-track on 11/28/2007, but I could not find
any code that had been committed to 'onnv-gate' that references this case.
Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-27 Thread Nigel Smith
Hi Tano
Please check out my post on the storage-forum for another idea
to try which may give further clues:
http://mail.opensolaris.org/pipermail/storage-discuss/2008-October/006458.html
Best Regards
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-27 Thread Nigel Smith
Hi Matt
Unfortunately, I'm having problems un-compressing that zip file.
I tried with 7-zip and WinZip reports this:

skipping _1_20081027010354.cap: this file was compressed using an unknown 
compression method.
   Please visit www.winzip.com/wz54.htm for more information.
   The compression method used for this file is 98.

Please can you check it out, and if necessary use a more standard
compression algorithm.
Download File Size was 8,782,584 bytes.
Thanks
Nigel Smith
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-26 Thread Nigel Smith
Ok on the answers to all my questions.
There's nothing that really stands out as being obviously wrong.
Just out of interest, what build of OpenSolaris are you using?

One thing you could try on the Ethernet capture file, is to set
the WireShark 'Time' column like this:
View > Time Display Format > Seconds Since Previous Displayed Packet

Then look down the time column for any unusually high time delays
between packets. Any unusually high delays during
a data transfer phase may indicate a problem.


Another thing you could try is measuring network performance
with a utility called 'iperf'.
It's not part of Solaris, so you would need to compile it.
Download the source from here:
http://sourceforge.net/projects/iperf/

I've just compiled the latest version 2.0.4 on snv_93
without problem, using the normal configure, make, make install.
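For anyone following along, the build is roughly this (assuming the 2.0.4
tarball):

  $ gzip -dc iperf-2.0.4.tar.gz | tar xf -
  $ cd iperf-2.0.4
  $ ./configure
  $ make
  # make install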

If you want to run 'iperf' on a windows box, you can
download a '.exe' of an older version here:
http://www.noc.ucf.edu/Tools/Iperf/

You can find tutorials on how to use it at these links:
http://www.openmaniak.com/iperf.php
http://www.enterprisenetworkingplanet.com/netos/article.php/3657236

I've just tried 'iperf' between my OpenSolaris pc & an old
Windows pc, both with low-cost Realtek gigabit cards and
linked via a low-cost NetGear switch. I measured a TCP
bandwidth of 196 Mbit/sec in one direction and
145 Mbit/sec in the opposite direction.
(On OpenSolaris, Iperf was not able to increase
the default TCP window size of 48K bytes.)
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! zfs with iscsitadm (on poweredge1900) and VMWare!

2008-10-26 Thread Nigel Smith
I asked Tano to use the 'snoop' command to capture the Ethernet
packets to a file, while he attempted VMware's 'VMotion'.

 # snoop -d {device} -o {filename} tcp port 3260

This file was made available to me on Tano's web server.
The file size was nearly 85 Mbytes, capturing over 100,000 packets.
I have downloaded the capture file, and been looking at it with 
Ethereal and WireShark.

I do not have a corresponding 'iscsisnoop.d' file, but from the pattern
of activity that I see, I can well imagine that it would show the same
pattern as the one we saw from Eugene, which I reported on here:
http://mail.opensolaris.org/pipermail/storage-discuss/2008-October/006444.html

(So here I'm looking at what's happening at the lower TCP level,
rather than at the iScsi level.)

In the Ethernet capture file, I can see the pattern of bursts of
writes from the initiator. The Target can accept so many of these,
and then needs to slow things down by reducing the TCP window size.
Eventually the target says the TCP Window size is zero, effectively
asking the initiator to stop.

Now to start with, the target only leaves the 'TCP ZeroWindow', in
place for a fraction of a second. Then it opens things up again
by sending a 'TCP Window Update', restoring the window to 65160 bytes,
and transfer resumes. This is normal and expected.

But eventually we get to a stage where the target sets the TCP 'ZeroWindow'
and leaves it there for an extended period of time. I'm talking about seconds here.
The initiator starts to send 'TCP ZeroWindowProbe' packets every 5 seconds.
The target promptly responds with a 'TCP ZeroWindowProbeAck' packet.
(Presumably, this is the initiator just confirming that the target is still alive.)
This cycle of Probes & Acks repeats for 50 seconds.
During this period the target shows no sign of wanting to accept any more data.
Then the initiator seems to decide it has had enough, and just cannot
be bothered to wait any longer, and it [RST,ACK]'s the TCP session, and
then starts a fresh iscsi login.
(And then we go around the whole cycle of the pattern again.)

The question is: why has the target refused to accept any more data for over 50 seconds?

The obvious conclusion would be that the OpenSolaris box is so busy that
it does not have any time left to empty the network stack buffers.
But this then just leads you to another question - why?

So the mystery deepens, and I am running out of ideas!

Tano, maybe you could check the network performance, with the 'iperf'
programs, as mentioned here:
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/052136.html

Does the OpenSolaris box give any indication of being busy with other things?
Try running 'prstat' to see if it gives any clues.

Presumably you are using ZFS as the backing store for iScsi, in
which case, maybe try with a UFS formatted disk to see if that is a factor.
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] diagnosing read performance problem

2008-10-25 Thread Nigel Smith
Hi Matt
What chipset is your PCI network card?
(obviously it's not Intel, but what is it?)
Do you know which driver the card is using?

You say '..The system was fine for a couple of weeks..'.
At that point did you change any software - do any updates or upgrades?
For instance, did you upgrade to a new build of OpenSolaris?

If not, then I would guess it's some sort of hardware problem.
Can you try different cables and a different switch - anything
in the path between client & server is suspect.

A mismatch of Ethernet duplex settings can cause problems - are
you sure this is OK?

To get an idea of how the network is running try this:

On the Solaris box, do an Ethernet capture with 'snoop' to a file.
http://docs.sun.com/app/docs/doc/819-2240/snoop-1m?a=view

 # snoop -d {device} -o {filename}

.. then while capturing, try to play your video file through the network.
Control-C to stop the capture.

You can then use Ethereal or WireShark to analyze the capture file.
On the 'Analyze' menu, select 'Expert Info'.
This will look through all the packets and will report
any warning or errors it sees.
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-22 Thread Nigel Smith
Well the '/var/svc/log/system-iscsitgt\:default.log'
is NOT showing any core dumps, which is good, but
means that we need to look & think deeper for the answer.

The 'iscsisnoop.d' output does look similar to that 
captured by Eugene over on the storage forum, but
Eugene only showed a short sequence.
http://mail.opensolaris.org/pipermail/storage-discuss/2008-October/006414.html

Here we have a longer sequence of 'iscsisnoop.d' output
clearly showing the looping, as the error occurs, 
causing the initiator and target to try to re-establish the session.

The question is - what is the root cause, & what is
just a consequential effect.

Tano, if you could also get some debug log messages
from the iscsi target (/tmp/target_log), that would help to
confirm that this is the same (or not) as what Eugene is seeing:
http://mail.opensolaris.org/pipermail/storage-discuss/2008-October/006428.html

It would be useful to modify 'iscsisnoop.d' to give
timestamps, as this would help to show if there are 
any unusual delays.
And the DTrace iscsi probes have an 'args[1]' which can
give further details on sequence numbers and tags.

Having seen your 'iscsisnoop.d' output, and the '/tmp/target_log'
from Eugene, I'm now going back to thinking this IS an iscsi issue,
with the initiator and target mis-interacting in some way,
and NOT a driver/hardware issue.

I know that SUN have recently been doing a lot of stress testing
with the iscsi target and various initiators, including Linux.
I have found the snv_93 and snv_97 iscsi target to work
well with the Vmware ESX and Microsoft initiators.
So it is a surprise to see these problems occurring.
Maybe some of the more recent builds, snv_98 and 99, have
'fixes' that have caused the problem...
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! zfs with iscsitadm (on poweredge1900) and VMWare!

2008-10-22 Thread Nigel Smith
Hi Tano
I will have a look at your snoop file.
(Tomorrow now, as it's late in the UK!)
I will send you my email address.
Thanks
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-21 Thread Nigel Smith
Well, my colleague & myself have recently had a basic VMware ESX cluster working,
with the Solaris iscsi target, in the lab at work, so I know it does work.

We used ESX 3.5i on two Dell Precision 390 workstations,
booted from USB memory sticks.
We used snv_97 and no special tweaks required.
We used Vmotion to move a running Windows XP guest
from one ESX host to the another.
Windows XP was playing a video feed at the time.  
It all worked fine.  We repeated the operation three times.
My colleague is the ESX expert, but I believe it was
update 2 with all latest patches applied.
But we only had a single iscsi target set up on the Solaris box.
The target size was 200GB, formatted with VMFS.

Ok, another thing you could try, which may give a clue
to what is going wrong, is to run the 'iscsisnoop.d'
script on the Solaris box.
http://www.solarisinternals.com/wiki/index.php/DTrace_Topics_iSCSI
This is a DTrace script which shows what iscsi target events are happening,
so it will be interesting to see if it shows anything unusual at the point of failure.

But, I'm beginning to think it could be one of your hardware components
that is playing up, though no clue so far. It could be anywhere on the path.
Maybe you could check that the Solaris iscsi target works ok under stress
from something other than ESX, like say the Windows iscsi initiator.
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-21 Thread Nigel Smith
Hi Tano
I hope you can try with the 'iscsisnoop.d' script, so 
we can see if your problem is the same as what Eugene is seeing.

Please can you also check the contents of the file:
/var/svc/log/system-iscsitgt\:default.log
.. just to make sure that the iscsi target is not core dumping & restarting.
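For example, something like this would show the recent log entries and the
service state (the '20' is arbitrary):

  # tail -20 /var/svc/log/system-iscsitgt\:default.log
  # svcs -xv iscsitgt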

I've also done a post on the storage-forum on how to
enable a debug log on the iscsi target, which may also give some clues.
http://mail.opensolaris.org/pipermail/storage-discuss/2008-October/006423.html

It may also be worth trying with a smaller target size,
just to see if that is a factor.
(There have in the past been bugs, now fixed, which triggered with 'large' 
targets.)
As I said, it worked ok for me with a 200Gb target.

Many thanks for all your testing. Please bear with us on this one.
If it is a problem with the Solaris iscsi target we need to get to 
the bottom of the root cause.
Following Eugene's report, I'm beginning to fear that some sort of regression
has been introduced into the iscsi target code...
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] My 500-gig ZFS is gone: insufficient replicas, corrupted data

2008-10-19 Thread Nigel Smith
With ZFS, there are 4 identical labels on each
physical vdev, in this case a single hard drive.
L0/L1 at the start of the vdev, and
L2/L3 at the end of the vdev.

As I understand it, part of the reason for having
four identical labels is to make it difficult
to completely lose the information in the labels.

In this case, labels L0 & L1 look ok, but
labels L2 & L3 'failed to unpack'.

And the status says '...the label is missing or invalid'.

Ok, my theory is that some setting in the bios
has got confused about the size of the hard drive,
and thinks it's smaller than it was originally.
Maybe it thinks the geometry has changed.

If it thinks the size of the hard drive has reduced,
then maybe that is why it cannot read the labels at
the end of the vdev.

And maybe it thinks the two readable labels
are invalid because now the 'asize' does 
not match what the bios is currently reporting.

I would switch back to your original bios and try
looking at settings for the hard drive geometry.

(BTW, this is the sort of situation, where it
would have been good to have noted the reported size
of the hard drive BEFORE the update.)

(And if the above theory is right, having a
mirrored pair of identical hard drives would not help,
as the bios update may cause an identical problem
with each drive.)

Good Luck
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-18 Thread Nigel Smith
According to the svccfg(1M) man page:
http://docs.sun.com/app/docs/doc/819-2240/svccfg-1m?a=view
...it should be just 'export' without a leading '-' or '--'.

I've been googling on NAA, and this is the 'Network Address Authority'.
It seems to be yet another way of uniquely identifying a target & LUN,
and is apparently compatible with the way that Fibre Channel &
SAS do this. For further details, see:
http://tools.ietf.org/html/rfc3980
T11 Network Address Authority (NAA) Naming Format for iSCSI Node Names

I also found this blog post:
http://timjacobs.blogspot.com/2008/08/matching-luns-between-esx-hosts-and-vcb.html
...which talks about Vmware ESX and NAA.

For anyone interested in the code fixes to the Solaris
iscsi target to support the VMware ESX server, take a look
at these links:
http://hg.genunix.org/onnv-gate.hg/rev/29862a7558ef
http://hg.genunix.org/onnv-gate.hg/rev/5b422642546a

Tano, based on the above, I would say you need
unique GUIDs for two separate targets/LUNs.
Best Regards
Nigel Smith
http://nwsmith.blogspot.com/
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HELP! SNV_97, 98, 99 zfs with iscsitadm and VMWare!

2008-10-16 Thread Nigel Smith
I googled on some sub-strings from your ESX logs
and found these threads on the VMware forum,
which list similar error messages
& suggest some actions to try on the ESX server:

http://communities.vmware.com/message/828207

Also, see this thread:

http://communities.vmware.com/thread/131923

Are you using multiple Ethernet connections between the OpenSolaris box
and the ESX server?
Your 'iscsitadm list target -v' is showing Connections: 0,
so run that command after the ESX server initiator has
successfully connected to the OpenSolaris iscsi target,
and post that output.
The log files seem to show the iscsi session has dropped out,
and the initiator is auto-retrying to connect to the target, 
but failing. It may help to get a packet capture at this stage
to try & see why the logon is failing.
Regards
Nigel Smith
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help! ZFS pool is UNAVAILABLE

2008-01-01 Thread Nigel Smith
It would be interesting to see the output from:

# zdb -v zbk

You can also use zdb to examine the labels on each of the disks.
Each disk has 4 copies of the labels, for redundancy:
two at the start, and two at the end of each disk.
Use a command similar to this:

# zdb -l /dev/dsk/c2d0p2

Presumably the labels are somehow confused,
especially for your USB drives :-(
Regards
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-31 Thread Nigel Smith
Ok, this is a strange problem!
You seem to have tried & eliminated all the possible issues
that the community has suggested!

I was hoping you would see some errors logged in
'/var/adm/messages' that would give a clue.

Your original 'zpool status' said 140 errors.
Over what time period are these occurring?
I'm wondering if the errors are occurring at a
constant steady rate or if there are bursts of errors?
Maybe you could monitor zpool status while generating
activity with dd or similar.
You could use 'zpool iostat' with an interval to monitor
bandwidth and see if it is reasonably steady or erratic.
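For example (pool name and interval are placeholders):

  # zpool iostat tank 5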

From your prtconf -D we see the 3114 card is using
the ata driver, as expected.
I believe the driver can talk to the disk drive
in either PIO or DMA mode, so you could try 
changing that in the ata.conf file. See here for details:
http://docs.sun.com/app/docs/doc/819-2254/ata-7d?a=view
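As a sketch only - check the property name and default against that ata(7D)
page before editing anything - disabling DMA for a test would look something
like this, followed by a reboot:

  # excerpt from /kernel/drv/ata.conf
  ata-dma-enabled=0;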

I've just had a quick look at the source code for
the ata driver, and there does seem to be specific support
for the Silicon Image chips in the drivers:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.c
and
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/dktp/controller/ata/sil3xxx.h
The file sil3xxx.h does mention:
  Errata Sil-AN-0109-B2 (Sil3114 Rev 0.3)
  To prevent erroneous ERR set for queued DMA transfers
  greater then 8k, FIS reception for FIS0cfg needs to be set
  to Accept FIS without Interlock
..which I read as meaning there have been some 'issues'
with this chip. And it sounds similar to the issue mentioned on
the link that Tomasz supplied:
http://home-tj.org/wiki/index.php/Sil_m15w

If you decide to try a different SATA controller card, possible options are:

1. The si3124 driver, which supports SiI-3132 (PCI-E)
   and SiI-3124 (PCI-X) devices.

2. The AHCI driver, which supports the Intel ICH6 and later devices, often
   found on motherboards.

3. The NV_SATA driver, which supports Nvidia ck804/mcp55 devices.

Regards
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Nigel Smith
First off, can we just confirm the exact version of the Silicon Image card
and which driver Solaris is using?

Use 'prtconf -pv' and '/usr/X11/bin/scanpci'
to get the PCI vendor & device ID information.

Use 'prtconf -D' to confirm which drivers are being used by which devices.

And 'modinfo' will tell you the version of the drivers.

The above commands will give details for all the devices
in the PC.  You may want to edit down the output before
posting it back here, or alternatively put the output into an
attached file.

See this link for an example of this sort of information
for a different hard disk controller card:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-September/003399.html

Regards
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs corruption w/ sil3114 sata controllers

2007-10-30 Thread Nigel Smith
And are you seeing any error messages in '/var/adm/messages'
indicating any failure on the disk controller card?
If so, please post a sample back here to the forum.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I/O write failures on non-replicated pool

2007-10-25 Thread Nigel Smith
Nice to see some progress, at last, on this bug:
http://bugs.opensolaris.org/view_bug.do?bug_id=6417779
ZFS: I/O failure (write on ...) -- need to reallocate writes

Commit to Fix:   snv_77

http://www.opensolaris.org/os/community/arc/caselog/2007/567/onepager/

http://mail.opensolaris.org/pipermail/onnv-notify/2007-October/012782.html

Regards
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible ZFS Bug - Causes OpenSolaris Crash

2007-10-15 Thread Nigel Smith
Hello Duff
Thanks for emailing me the source & binary for your test app.

My PC for testing has snv_60 installed. I was about to upgrade to snv_70,
but I thought it might be useful to test with the older version of OpenSolaris
first, in case the problem you are seeing is a regression.

And for the first test, I connected up a Samsung 400Gb sata-2 drive
to a pci-e x1 card which uses the Silicon Image SiI-3132 chip.
This uses the OpenSolaris 'si3124' driver.
So my ZFS pool is using a single drive.

Ok, I ran your test app, using the parameters you advised, with the addition
of '-c' to validate the created data with a read.
And the first run of the test has completed with no problems.
So no crash with this setup.

 # ./gbFileCreate -c -r /mytank -d 1000 -f 1000 -s 6:6

 CREATING DIRECTORIES AND FILES:
 In folder /mytank/greenbytes.1459,
 creating 1000 directories each containing 1000 files.
 Files range in size from 6 bytes to 6 bytes.

 CHECKING FILE DATA:
 Files Passed = 100, Files Failed = 0.
 Test complete.

For the next test, I am going to swap the Samsung drive over onto 
the motherboard Intel ICH7 sata chip, so then it will be using the 'ahci' 
driver.
But it's late now, so hopefully I will have the time to do that tomorrow.

I have had a look at the source code history for the 'sd' driver, and I see
that there have been quite a lot of changes recently.  So if there is a 
problem with that, then maybe I will not experience the problem until I
upgrade to snv_70 or later.
Regards,
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible ZFS Bug - Causes OpenSolaris Crash

2007-10-13 Thread Nigel Smith
Please can you provide the source code for your test app.
I would like to see if I can reproduce this 'crash'.
Thanks
Nigel
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] MS Exchange storage on ZFS?

2007-09-12 Thread Nigel Smith
Microsoft have a document you should read:
Optimizing Storage for Microsoft Exchange Server 2003
http://download.microsoft.com/download/b/e/0/be072b12-9c30-4e00-952d-c7d0d7bcea5f/StoragePerformance.doc

Microsoft also have a utility JetStress which you can use to verify
the performance of the storage system.
http://www.microsoft.com/downloads/details.aspx?familyid=94b9810b-670e-433a-b5ef-b47054595e9cdisplaylang=en
I think you can use JetStress on a non-Exchange server if you copy across
some of the Exchange DLLs. 

If you do any testing along these lines, please report success or failure
back to this forum, and on the 'Storage-discuss' forum where this sort
of question is more usually discussed.
Thanks
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Please help! ZFS crash burn in SXCE b70!

2007-08-31 Thread Nigel Smith
Yes, I'm not surprised. I thought it would be a RAM problem.
I always recommend a 'memtest' on any new hardware.
Murphy's law predicts that you only have RAM problems
on PCs that you don't test!
Regards
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Please help! ZFS crash burn in SXCE b70!

2007-08-31 Thread Nigel Smith
Richard, thanks for the pointer to the tests in '/usr/sunvts', as this
is the first I have heard of them. They look quite comprehensive.
I will give them a trial when I have some free time.
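For anyone else wanting to try them, the individual test binaries should all be
under the SunVTS install directory (I'm assuming the usual /usr/sunvts/bin
layout here), so a quick listing will show what's available:

  # ls /usr/sunvts/bin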
Thanks
Nigel Smith

pmemtest     - Physical Memory Test
ramtest      - Memory DIMMs (RAM) Test
vmemtest     - Virtual Memory Test
cddvdtest    - Optical Disk Drive Test
cputest      - CPU Test
disktest     - Disk and Floppy Drives Test
dtlbtest     - Data Translation Look-aside Buffer Test
fputest      - Floating Point Unit Test
l1dcachetest - Level 1 Data Cache Test
l2sramtest   - Level 2 Cache Test
netlbtest    - Net Loop Back Test
nettest      - Network Hardware Test
serialtest   - Serial Port Test
tapetest     - Tape Drive Test
usbtest      - USB Device Test
systest      - System Test
iobustest    - Test for the IO interconnects and the components on the IO bus on high-end machines
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] si3124 controller problem and fix (fwd)

2007-07-17 Thread Nigel Smith
You can see the status of the bug here:

http://bugs.opensolaris.org/view_bug.do?bug_id=6566207

Unfortunately, it's showing no progress since 20th June.

This fix really needs to be in place for S10u4 and snv_70.
Thanks
Nigel Smith
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: New zfs pr0n server :)))

2007-05-21 Thread Nigel Smith
I wanted to confirm the drivers I was using for the hard drives in my PC,
and here is the method I used. Maybe you can try something similar,
and see what you get.

I used the 'prtconf' command, with the device path from the 'format' command.
(Use bash as the shell, and use the tab key to expand the path)
My sata drive is using the 'ahci' driver, connecting to the
 ICH7 chipset on the motherboard.
And I have a SCSI drive on an Adaptec card, plugged into a PCI slot.
Thanks
Nigel Smith

# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
   0. c0t0d0 <DEFAULT cyl 2229 alt 2 hd 255 sec 63>
      /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci9005,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
   1. c2t0d0 <DEFAULT cyl 9723 alt 2 hd 255 sec 63>
      /[EMAIL PROTECTED],0/pci1028,[EMAIL PROTECTED],2/[EMAIL PROTECTED],0
Specify disk (enter its number): ^D

# prtconf -acD /devices/[EMAIL PROTECTED],0/pci1028\,[EMAIL PROTECTED],2
i86pc (driver name: rootnex)
    pci, instance #0 (driver name: npe)
        pci1028,1a8, instance #0 (driver name: ahci)
            disk, instance #0 (driver name: sd)

# prtconf -acD /devices/[EMAIL PROTECTED],0/pci8086\,[EMAIL PROTECTED]/pci9005\,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
i86pc (driver name: rootnex)
    pci, instance #0 (driver name: npe)
        pci8086,244e, instance #0 (driver name: pci_pci)
            pci9005,62a0, instance #0 (driver name: cadp160)
                sd, instance #2 (driver name: sd)

# modinfo | egrep -i 'root|npe|ahci|sd |pci_pci|cadp160'
 20 fbb309c8   4768   1   1  rootnex (i86pc root nexus 1.141)
 30 fbb970e0   6d20 183   1  npe (Host to PCIe nexus driver 1.7)
 34 fbba4aa0   77a8  58   1  ahci (ahci driver 1.1)
 36 fbbb9f70  25fb8  31   1  sd (SCSI Disk Driver 1.548)
146 f7c73000   1a58  84   1  pci_pci (PCI to PCI bridge nexus driver )
147 f7d2  275a0  32   1  cadp160 (Adaptec Ultra160 HBA d1.21)
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: snv63: kernel panic on import

2007-05-15 Thread Nigel Smith
I seem to have got the same core dump, in a different way.
I had a zpool setup on an iscsi 'disk'.  For details see:
http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/001162.html
But after a reboot the iscsi target was no longer available, so the iscsi
initiator could not provide the disk that the zpool was based on.
I did a 'zpool status', but the PC just rebooted, rather than handling it in a
graceful way.
After the reboot I discovered a core dump had been created - details below:

# cat /etc/release
Solaris Nevada snv_60 X86
   Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
 Assembled 12 March 2007
#
# cd /var/crash/solaris
# mdb -k 1
Loading modules: [ unix genunix specfs dtrace uppc pcplusmp scsi_vhci ufs ip hook neti sctp arp usba uhci qlc fctl nca lofs zfs random md cpc crypto fcip fcp logindmux ptm sppp emlxs ipc ]
> ::status
debugging crash dump vmcore.1 (64-bit) from solaris
operating system: 5.11 snv_60 (i86pc)
panic message:
ZFS: I/O failure (write on unknown off 0: zio fffec38cf340 [L0 packed nvlist] 4000L/600P DVA[0]=0:160225800:600 DVA[1]=0:9800:600 fletcher4 lzjb LE contiguous birth=192896 fill=1 cksum=6b28
dump content: kernel pages only
> *panic_thread::findstack -v
stack pointer for thread ff00025b2c80: ff00025b28f0
  ff00025b29e0 panic+0x9c()
  ff00025b2a40 zio_done+0x17c(fffec38cf340)
  ff00025b2a60 zio_next_stage+0xb3(fffec38cf340)
  ff00025b2ab0 zio_wait_for_children+0x5d(fffec38cf340, 11, fffec38cf598)
  ff00025b2ad0 zio_wait_children_done+0x20(fffec38cf340)
  ff00025b2af0 zio_next_stage+0xb3(fffec38cf340)
  ff00025b2b40 zio_vdev_io_assess+0x129(fffec38cf340)
  ff00025b2b60 zio_next_stage+0xb3(fffec38cf340)
  ff00025b2bb0 vdev_mirror_io_done+0x2af(fffec38cf340)
  ff00025b2bd0 zio_vdev_io_done+0x26(fffec38cf340)
  ff00025b2c60 taskq_thread+0x1a7(fffec154f018)
  ff00025b2c70 thread_start+8()
> ::cpuinfo -v
 ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  0 fbc31f80  1b20  99   nono t-0ff00025b2c80 sched
   ||
RUNNING --++--  PRI THREAD   PROC
  READY60 ff00022c9c80 sched
 EXISTS60 ff00020e9c80 sched
 ENABLE

 ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  1 fffec11ad000  1f30  59  yesno t-0fffec3dcbbc0 syslogd
   ||
RUNNING --++--  PRI THREAD   PROC
  READY60 ff000212bc80 sched
   QUIESCED59 fffec1e51360 syslogd
 EXISTS59 fffec1ec2180 syslogd
 ENABLE

> ::quit
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss