Re: [zfs-discuss] 1068E mpt driver issue

2010-07-07 Thread Daniel Bakken
Upgrade the HBA firmware to version 1.30. We had the same problem, but
upgrading solved it for us.
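
If you want to confirm which firmware the card is actually running before and
after the flash, one rough check (my assumption, not something from the
original thread: the mpt driver logs its firmware revision when it attaches)
is to grep the messages file:

grep -i firmware /var/adm/messages* | grep -i mpt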

Daniel Bakken


On Wed, Jul 7, 2010 at 1:57 PM, Joeri Vanthienen m...@joerivanthienen.be wrote:

 Hi,

 We're using the following components with snv134:
 - 1068E HBA (supermicro)
 - 3U SAS / SATA Expander Backplane with dual LSI SASX28 Expander Chips
 (supermicro)
 - WD RE3 disks

 We've got the following error messages:

 Jul  7 10:09:12 sanwv01 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0/s...@b,0 (sd2):

 Jul  7 10:09:12 sanwv01 incomplete read- retrying

 Jul  7 10:09:17 sanwv01 scsi: [ID 243001 kern.warning] WARNING: /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:17 sanwv01 mpt_handle_event_sync: IOCStatus=0x8000,
 IOCLogInfo=0x31123000

 Jul  7 10:09:17 sanwv01 scsi: [ID 243001 kern.warning] WARNING: /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:17 sanwv01 mpt_handle_event: IOCStatus=0x8000,
 IOCLogInfo=0x31123000

 Jul  7 10:09:19 sanwv01 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:19 sanwv01 Log info 0x31123000 received for target 21.

 Jul  7 10:09:19 sanwv01 scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc

 Jul  7 10:09:19 sanwv01 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:19 sanwv01 Log info 0x31123000 received for target 21.

 Jul  7 10:09:19 sanwv01 scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc

 Jul  7 10:09:19 sanwv01 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:19 sanwv01 Log info 0x31123000 received for target 21.

 Jul  7 10:09:19 sanwv01 scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc

 Jul  7 10:09:19 sanwv01 scsi: [ID 365881 kern.info] /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:19 sanwv01 Log info 0x31123000 received for target 21.

 Jul  7 10:09:19 sanwv01 scsi_status=0x0, ioc_status=0x804b,
 scsi_state=0xc

 Jul  7 10:09:21 sanwv01 smbsrv: [ID 138215 kern.notice] NOTICE:
 smbd[WINDVISION\franz]: testshare share not found

 Jul  7 10:09:26 sanwv01 scsi: [ID 243001 kern.warning] WARNING: /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:09:26 sanwv01 SAS Discovery Error on port 0.
 DiscoveryStatus is DiscoveryStatus is |SMP timeout|
 Jul  7 10:10:20 sanwv01 scsi: [ID 107833 kern.warning] WARNING: /p...@0
 ,0/pci8086,3...@7/pci15d9,a...@0 (mpt0):

 Jul  7 10:10:20 sanwv01 Disconnected command timeout for Target 21

 After this message, the pool is unavailable over iSCSI. We can't run the
 format or zpool status commands anymore; we have to reboot the server. This
 happens frequently, for different targets. The stock firmware on the HBA is
 1.26.01




Re: [zfs-discuss] 1068E mpt driver issue

2010-07-07 Thread Daniel Bakken
Here is the link to download LSIutil, which can be used to upgrade firmware:

http://www.lsi.com/cm/License.do?url=http://www.lsi.com/DistributionSystem/AssetDocument/LSIUtil_1.62.zip&prodName=LSI7104XP-LC&subType=Miscellaneous&locale=EN

And here again is the link to the 1.30 firmware:

http://www.lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/combo/sas3442e-r/index.html

I hope it works for you.
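
For reference, a rough outline of the flash procedure -- hedged, since I'm
going from memory and assuming the zip ships a Solaris binary simply named
lsiutil:

unzip LSIUtil_1.62.zip
./lsiutil
# select the port for the 1068E, then pick the firmware download/upgrade menu
# item and point it at the extracted 1.30 firmware image; reboot when done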

Daniel


On Wed, Jul 7, 2010 at 2:48 PM, Jacob Ritorto jacob.rito...@gmail.com wrote:

 Well, OK, but where do I find it?

 I'd still expect some problems with FCode vs. BIOS issues if it's not
 SPARC firmware.

 thx
 jake



 On 07/07/10 17:46, Garrett D'Amore wrote:

 On Wed, 2010-07-07 at 17:33 -0400, Jacob Ritorto wrote:

 Thank goodness!  Where, specifically, does one obtain this firmware for
 SPARC?




[zfs-discuss] zfs send hangs

2010-04-09 Thread Daniel Bakken
My ZFS filesystem hangs when transferring large filesystems (~500 GB)
with a couple dozen snapshots between servers using zfs send/receive
with netcat. The transfer hangs about halfway through and is
unkillable, freezing all I/O to the filesystem and requiring a hard
reboot. I have attempted this three times and failed every time.

On the destination server I use:
nc -l -p 8023 | zfs receive -vd sas

On the source server I use:
zfs send -vR promise1/rbac...@daily.1 | nc mothra 8023

The filesystems on both servers are the same (zfs version 3). The
source zpool is version 22 (build 129), and the destination zpool is
version 14 (build 111b).

Rsync does not have this problem and performs extremely well. However,
it will not transfer snapshots. Two other send/receives (234 GB and
451 GB) between the same servers have worked fine without hanging.
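
If it hangs again, one hedged way to see where things are stuck (assuming the
wedged box still gives you a root shell and has mdb) is to dump the kernel
thread stacks and look for the ZFS send/receive and txg threads:

echo "::threadlist -v" | mdb -k > /var/tmp/threads.txt
grep -i dmu_recv /var/tmp/threads.txt    # receive path (dmu_recv_stream etc.)
grep -i txg_sync /var/tmp/threads.txt    # transaction group sync thread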

Thanks,

Daniel Bakken


Re: [zfs-discuss] compression property not received

2010-04-07 Thread Daniel Bakken
I worked around the problem by first creating a filesystem of the same name
with compression=gzip on the target server. Like this:

zfs create sas/archive
zfs set compression=gzip sas/archive

Then I used zfs receive with the -F option:

zfs send -vR promise1/arch...@daily.1 | zfs receive -vFd sas

And now I have gzip compression enabled locally:

zfs get compression sas/archive
NAME         PROPERTY     VALUE  SOURCE
sas/archive  compression  gzip   local

Not pretty, but it works.
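
As a follow-up sanity check (just a suggestion, reusing the dataset name from
above), the compressratio property should climb as the received data is
written with gzip in effect:

zfs get compression,compressratio sas/archive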

Daniel Bakken


On Wed, Apr 7, 2010 at 12:51 PM, Cindy Swearingen 
cindy.swearin...@oracle.com wrote:

 Hi Daniel,

 I tried to reproduce this by sending from a b130 system to a s10u9 system,
 which vary in pool versions, but this shouldn't matter. I've
 been sending/receiving streams between the latest build systems and
 older s10 systems for a long time. The zfs send -R option to send a
 recursive snapshot and all properties integrated in b77, so that
 isn't your problem either.

 The above works as expected. See below.

 I also couldn't find any recent bugs related to this, but bug searching is
 not an exact science.

 Mystified as well...

 Cindy



Re: [zfs-discuss] compression property not received

2010-04-07 Thread Daniel Bakken
The receive side is running build 111b (2009.06), so I'm not sure if your
advice actually applies to my situation.

Daniel Bakken


On Tue, Apr 6, 2010 at 10:57 PM, Tom Erickson thomas.erick...@oracle.com wrote:


 After build 128, locally set properties override received properties, and
 this would be the expected behavior. In that case, the value was received
 and you can see it like this:

 % zfs get -o all compression tank
 NAME  PROPERTY     VALUE  RECEIVED  SOURCE
 tank  compression  on     gzip      local
 %

 You could make the received value the effective value (clearing the local
 value) like this:

 % zfs inherit -S compression tank
 % zfs get -o all compression tank
 NAME  PROPERTY     VALUE  RECEIVED  SOURCE
 tank  compression  gzip   gzip      received
 %

 If the receive side is below the version that supports received properties,
 then I would expect the receive to set compression=gzip.

 After build 128 'zfs receive' prints an error message for every property it
 fails to set. Before that version, 'zfs receive' is silent when it fails to
 set a property so long as everything else is successful. I might check
 whether I have permission to set compression with 'zfs allow'. You could
 pipe the send stream to zstreamdump to verify that compression=gzip is in
 the send stream, but I think before build 125 you will not have zstreamdump.
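
 (For reference, 'zfs allow' with just a dataset name displays whatever is
 delegated there -- dataset names assumed from earlier in this thread:

 zfs allow sas
 zfs allow sas/archive

 No output means nothing is delegated, which is fine if the receive runs as
 root.)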

 Tom




Re: [zfs-discuss] compression property not received

2010-04-07 Thread Daniel Bakken
Here is the info from zstreamdump -v on the sending side:

BEGIN record
hdrtype = 2
features = 0
magic = 2f5bacbac
creation_time = 0
type = 0
flags = 0x0
toguid = 0
fromguid = 0
toname = promise1/arch...@daily.1

nvlist version: 0
tosnap = daily.1
fss = (embedded nvlist)
nvlist version: 0
0xcfde021e56c8fc = (embedded nvlist)
nvlist version: 0
name = promise1/archive
parentfromsnap = 0x0
props = (embedded nvlist)
nvlist version: 0
mountpoint = /promise1/archive
compression = 0xa
dedup = 0x2
(end props)

I assume that compression = 0xa means gzip. I wonder if the dedup property
is causing the receiver (build 111b)  to disregard all other properties,
since the receiver doesn't support dedup. Dedup was enabled in the past on
the sending filesystem, but is now disabled for reasons of sanity.

I'd like to try the dtrace debugging, but it would destroy the progress I've
made so far transferring the filesystem.

Thanks,

Daniel

On Wed, Apr 7, 2010 at 12:52 AM, Tom Erickson thomas.erick...@oracle.com wrote:


 The advice regarding received vs local properties definitely does not
 apply. You could still confirm the presence of the compression property in
 the send stream with zstreamdump, since the send side is running build 129.
 To debug the receive side I might dtrace the zap_update() function with the
 fbt provider, something like

 zfs send -R promise1/arch...@daily.1 | dtrace -c 'zfs receive -vd sas' \
 -n 'fbt::zap_update:entry / stringof(args[2]) == "compression" ||  \
 stringof(args[2]) == "compression$recvd" / { self->trace = 1; }'  \
 -n 'fbt::zap_update:return / self->trace / { trace(args[1]); \
 self->trace = 0; }'

 and look for non-zero return values.

 I'd also redirect 'zdb -vvv poolname' to a file and search it for
 compression to check the value in the ZAP.
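
 (Spelled out, with the pool name assumed from this thread:

 zdb -vvv sas > /var/tmp/zdb-sas.txt
 grep -i compression /var/tmp/zdb-sas.txt

 zdb is read-only, but the -vvv output can be very large and may be slightly
 stale on a busy pool.)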

 I assume you have permission to set the compression property on the receive
 side, but I'd check anyway.

 Tom



Re: [zfs-discuss] compression property not received

2010-04-07 Thread Daniel Bakken
We have found the problem. The mountpoint property on the sender was at one
time changed from the default, then later changed back using zfs set instead
of zfs inherit. As a result, zfs send included mountpoint as a locally set
property in the stream, even though its value is identical to the default.
This caused the receiver to stop processing the subsequent properties in the
stream, because that mountpoint isn't valid on the receiver.

I tested this theory with a spare zpool. First I used "zfs inherit
mountpoint promise1/archive" to remove the local setting (which was
exactly the same value as the default). This time the compression=gzip
property was correctly received.
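
To make the distinction concrete, a small illustration (these exact commands
were not part of the test above; the dataset name is the one from this
thread):

zfs set mountpoint=/promise1/archive promise1/archive
zfs get -o name,property,value,source mountpoint promise1/archive
# SOURCE reads 'local', so 'zfs send -R' includes mountpoint in the stream
zfs inherit mountpoint promise1/archive
zfs get -o name,property,value,source mountpoint promise1/archive
# SOURCE reads 'default', and the property is no longer sent

The value is identical either way; only the SOURCE differs.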

It seems like a bug to me that one failed property in a stream prevents the
rest from being applied. I should have used zfs inherit, but it would be
best if zfs receive handled failures more gracefully, and attempted to set
as many properties as possible.

Thanks to Cindy and Tom for their help.

Daniel

On Wed, Apr 7, 2010 at 2:31 AM, Tom Erickson thomas.erick...@oracle.com wrote:


 Now I remember that 'zfs receive' used to give up after the first property
 it failed to set. If I'm remembering correctly, then, in this case, if the
 mountpoint was invalid on the receive side, 'zfs receive' would not even try
 to set the remaining properties.

 I'd try the following in the source dataset:

 zfs inherit mountpoint promise1/archive

 to clear the explicit mountpoint and prevent it from being included in the
 send stream. Later set it back the way it was. (Soon there will be an option
 to take care of that; see CR 6883722, "want 'zfs recv -o prop=value' to set
 initial property values of received dataset".) Then see if you receive the
 compression property successfully.



Re: [zfs-discuss] verging OT: how to buy J4500 w/o overpriced

2010-02-12 Thread Daniel Bakken
On Fri, Feb 12, 2010 at 12:11 PM, Al Hopper a...@logical-approach.com wrote:
 There's your first mistake.  You're probably eligible for a very nice
 Federal Systems discount.  My *guess* would be about 40%.

Promise JBOD and similar systems are often the only affordable choice
for those of us who can't get sweetheart discounts, don't work at
billion dollar corporations, or aren't bankrolled by the Federal
Leviathan.

Daniel Bakken
Systems Administrator
Economic Modeling Specialists Inc
1187 Alturas Drive
Moscow, Idaho 83843


Re: [zfs-discuss] verging OT: how to buy J4500 w/o overpriced

2010-02-11 Thread Daniel Bakken
On Thu, Feb 11, 2010 at 2:46 PM, Dan Pritts da...@internet2.edu wrote:
 I've been considering it, but I talked to a colleague at another
 institution who had some really awful tales to tell about promise
 FC arrays.  They were clearly not ready for prime time.

 OTOH a SAS jbod is a lot less complicated.

We have used two Promise Vtrak m500f fibre channel arrays in heavy I/O
applications for several years. They don't always handle failing disks
gracefully-- sometimes requiring a hard reboot to recover. This is
partly due to crappy disks with weird failure modes. But a RAID system
should never require a reboot to recover from a single disk failure.
That defeats the whole purpose of RAID, which is supposed to survive
disk failures through redundancy.

However, Promise iSCSI and JBOD systems (we own one of each) are more
stable. We use them with Linux (XFS) and OpenSolaris (ZFS), and haven't
experienced any problems to date. Their JBOD systems, when
filled with Western Digital RE3 disks, are an extremely reliable,
low-cost, high performance ZFS storage solution.

Daniel Bakken
Systems Administrator
Economic Modeling Specialists Inc
1187 Alturas Drive
Moscow, Idaho 83843


Re: [zfs-discuss] verging OT: how to buy J4500 w/o overpriced

2010-02-09 Thread Daniel Bakken
On Tue, Feb 9, 2010 at 1:38 PM, Erik Trimble erik.trim...@sun.com wrote:

 Bottom line here: if someone comes along and provides the same level of
 service for a better price, the market will flock to them.  Or if the market
 decides the current level of service is unnecessary, it will move to vendors
 providing the sufficient level of service at the new price point.   But for
 now, there is no indication that the current pricing/service level models
 aren't correct.  You may /want/ and /think/ that a BMW 528i should cost $30k
 (I mean, it's not really any different than an Accord, right?), but the
 market has said no, it's $45k.  Sorry, the market is correct.


From my perspective as an IT pro, Sun is selling BMWs at $200k. It's
a great car, but a Mercedes is half the cost. Most hardware consumers
have already flocked to the competition, which explains Sun's
staggering losses and the Oracle buyout. We can't afford to pay tens
of thousands of dollars extra for bragging rights over IBM, HP, or
Dell shops.

Daniel Bakken
Systems Administrator

Economic Modeling Specialists Inc
Moscow, Idaho


Re: [zfs-discuss] Non-Raid HBA

2010-01-29 Thread Daniel Bakken
I've used the LSI SAS3442E-R with great success in OpenSolaris 2009.06. It
is a non-RAID SAS HBA that costs $220 on NewEgg.


Daniel Bakken


On Fri, Jan 29, 2010 at 8:21 AM, A. Krijgsman a.krijgs...@draftsman.nl wrote:

 Hi All,

 Since I believe there are people on this list with more experience in
 this matter than I have,

 To point out what it is I am doing:
 I am building a (very) low-cost SAN setup as a proof of concept using ZFS.
 The goal of this project is to provide a highly available (or close to it),
 low-power solution.
 I am using the Supermicro 2.5" mobile racks (8-in-2, 5.25"), which have
 dual-port connectors, and connect both the primary host and the standby host
 to the same JBOD using the dual-port setup.

 Now I am looking for an HBA card for the host machine, preferably with a lot
 of external SAS ports (but ZFS style of course, so no RAID/cache HBA
 required).
 Like: http://www.areca.com.tw/products/sasnoneraid.htm
 I've seen a lot of cards with internal ports as well, but don't know if
 breaking those out is a good idea.
 Anyone care to share their insights on this?

 The JBOD is self-built, so I am looking to get the SAS cables into my JBOD
 mobile racks.
 I found a couple of external-to-internal SAS adapters, but they all seem
 quite expensive.
 Is there a simple (clean) way to connect an external chassis to my host?

 I hope someone can give me some feedback on my questions.
 Of course I am willing to write down my findings in a blog if people are
 interested.

 Kind Regards,
 Armand



