Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-23 Thread Mark Bennett
Well I do have a plan.

Thanks to the portability of ZFS boot disks, I'll build two new OS disks on 
another machine with the next Nexenta release, export the data pool, and swap 
in the new ones.
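
Roughly, the swap should go something like this (assuming the data pool is 
named tank):

  # on the current OS, before the boot disks are swapped
  zpool export tank

  # after booting from the new OS disks on the same chassis
  zpool import tank
  zpool status tank   # check all vdevs are ONLINE before putting load back on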

That way, I can at least manage a zfs scrub without killing performance, and 
get the Intel SSDs I have been testing to work properly.

On the other hand, I could just use the spare 7210 Appliance boot disk I have 
lying about.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-14 Thread Mark Bennett
That's a very good question actually. I would think that COMSTAR would
stay because it's used by the Fishworks appliance... however, COMSTAR is
a competitive advantage for DIY storage solutions. Maybe they will rip
it out of S11 and make it an add-on or something. That would suck.


I guess the only real reason you can't yank COMSTAR is because it's now
the basis for iSCSI Target support. But again, there is nothing saying
that Target support has to be part of the standard OS offering.

Scary to think about. :)

benr.

That would be the sensible commercial decision, and it would kill off the 
competition in the storage market that builds on OpenSolaris-based products.

I haven't found a Linux that can reliably spin the 100 TB I currently have 
behind OpenSolaris and ZFS.
Luckily, b134 doesn't seem to have any major issues, and I'm currently looking 
into a USB boot / raidz root combination for 1U storage.

I ran Red Hat 9 with updated packages for quite a few years.
As long as the kernel is stable and you can work through the hurdles, it can 
still do the job.


Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS development moving behind closed doors

2010-08-14 Thread Mark Bennett
On 8/13/10 8:56 PM -0600 Eric D. Mudama wrote:
 On Fri, Aug 13 at 19:06, Frank Cusack wrote:
 Interesting POV, and I agree. Most of the many distributions of
 OpenSolaris had very little value-add. Nexenta was the most interesting
 and why should Oracle enable them to build a business at their expense?

 These distributions are, in theory, the gateway drug where people
 can experiment inexpensively to try out new technologies (ZFS, dtrace,
 crossbow, comstar, etc.) and eventually step up to Oracle's big iron
 as their business grows.

I've never understood how OpenSolaris was supposed to get you to Solaris.
OpenSolaris is for enthusiasts and great folks like Nexenta.
Solaris lags so far behind that it's not really an upgrade path.

Fedora is a great beta-test arena for what eventually becomes a commercial 
enterprise offering. OpenSolaris was the Solaris equivalent.

Losing the free bleeding-edge testing community will no doubt impact Solaris 
code quality.

It is now even more likely that Solaris will revert to its niche on SPARC over 
the next few years.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-02-01 Thread Mark Bennett
I did see that, and confirmed the support has made it into the build 130 
release I'm testing with.

However, the WD10EARS does not expose its 4k sectors to the outside world, so 
it is not identified as supporting them.
Correct alignment, to get the best performance out of the drive's internal 
translation, seems to be the only option available.
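
One way to see what ZFS actually chose is to check the pool's ashift. A quick 
sketch, assuming a pool named tank (zdb output varies a little between builds):

  # ashift as recorded in the pool configuration
  zdb -C tank | grep ashift
  # ashift: 9  -> 512-byte allocation alignment (what the WD10EARS reports)
  # ashift: 12 -> 4k-aligned allocations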

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-01 Thread Mark Bennett
The results are in:

My timeout issue is definitely the WD10EARS disks.
Although differences in the error rate were seen with different LSI firmware 
revisions, the errors persisted. The more disks on the expander, the higher the 
number showing iostat errors.
This then causes zpool issues (disk failures, resilvering, etc.).

I replaced 24 of them with ST32000542NS drives (f/w CC34), and the problem 
departed with the WD disks.
A full scrub of 1.5 TB showed not one error anywhere.
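
For anyone repeating this, the check was essentially a scrub plus a look at the 
per-device error counters afterwards; roughly (the pool name tank is only an 
example):

  zpool scrub tank
  zpool status -v tank   # READ/WRITE/CKSUM counts per device during the scrub
  iostat -e -n           # soft/hard/transport error totals per device since boot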

WD has chosen to cripple their consumer-grade disks when used in quantities 
greater than one.

I'll now need to evaluate alternative suppliers of low-cost disks for low-end, 
high-volume storage.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-02-01 Thread Mark Bennett
The WD10EARS disks don't work well.

I had too many issues with timeouts, which disappeared when I replaced them 
with ST32000542AS drives.

My next challenge is to get the LSI 3081 to boot off the disk I want it to, and 
then to get multipath functional.

Has anyone else had issues with the LSI IT-mode firmware changing the disk 
order behind expanders, so the bootable disk is no longer the one booted from?
It boots with only two disks installed (a bootable ZFS mirror). Add some more 
and the target boot disk moves to one of them.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-31 Thread Mark Bennett
Update:

For the WD10EARS, the blocks appear to be aligned on the 4k boundary when ZFS 
uses the whole disk (whole disk as an EFI partition).

Part      Tag    Flag     First Sector        Size       Last Sector
  0       usr     wm               256     931.51Gb        1953508750

calc: 256 * 512 / 4096 = 32, so the first sector falls exactly on a 4k boundary.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-30 Thread Mark Bennett
I'm looking into the alignment implications for the WD10EARS disks.
It may explain my issues.

I seem to recall boot issues in some of the LSI release notes affecting other 
boot devices. I think the card takes over boot responsibility.
I've encountered this sort of issue over the years with many SCSI cards.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-28 Thread Mark Bennett
My experience was different again.
I have the same timeout issues with both the LSI and Supermicro cards in IT 
mode.
IR mode on the Supermicro card didn't solve the problem, but it seems to have 
reduced it.
The server has one 16-bay chassis and one 24-bay chassis (both use expanders).
The test pool has 24 x WD10EARS in 6-disk vdev sets: one set on the 16-bay and 
two on the 24-bay.

Mark
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Strange random errors getting automatically repaired

2010-01-27 Thread Mark Bennett
Hi Giovanni,

I have seen these while testing the mpt timeout issue, and on other systems 
during resilvering of failed disks and while running a scrub.

Once so far on this test scrub, and several on yesterday's.

I checked the iostat errors, and they weren't that high on that device 
compared to other disks.

  NAME     STATE    READ WRITE CKSUM
  c2t34d0  ONLINE      0     0     1  25.5K repaired

  ---- errors ---
  s/w h/w trn tot device
  0   8  61  69 c2t30d0
  0   2  17  19 c2t31d0
  0   5  41  46 c2t32d0
  0   5  33  38 c2t33d0
  0   3  31  34 c2t34d0 
  0  10  81  91 c2t35d0
  0   4  22  26 c2t36d0
  0   6  44  50 c2t37d0
  0   3  21  24 c2t38d0
  0   5  49  54 c2t39d0
  0   9  77  86 c2t40d0
  0   6  58  64 c2t41d0
  0   5  50  55 c2t42d0
  0   4  34  38 c2t43d0
  0   6  37  43 c2t44d0
  0   9  75  84 c2t45d0
  0  13  82  95 c2t46d0
  0   7  57  64 c2t47d0
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-01-25 Thread Mark Bennett
I can produce the timeout error on multiple, similar servers.
These are storage servers, so there are no zones or GUI running.
Hardware:
Supermicro X7DWN with AOC-USASLP-L8i controller
E1 (single port) backplanes, 16 and 24 bay
(LSILOGICSASX28 A.0 and LSILOGICSASX36 A.1)
up to 36 x 1 TB WD SATA disks

This server has 2 x quad-core Intel CPUs and 16 GB RAM.
Disks: WD 1 TB, c4t12d0 to c4t47d0, as a single raidz pool (6 disks per set).
Running dev build 131.
I see the problem on 2009.06 as well.

I note that the latest AOC-USASLP-L8i firmware is LSI Rev 1.26.00.00, which I 
believe does not support MSI (working on Supermicro to update the firmware).

I have an LSI controller with the latest firmware that I can swap in for the 
AOC-USASLP-L8i and retest with.

After a few hours of light load, no errors appear unless I initiate a scrub.

 iostat -X -e -n
  ---- errors ---
  s/w h/w trn tot device
  0   0   0   0 fd0
  0   9   0   9 c5t1d0
  0   0   0   0 c4t8d0
  0   0   0   0 c4t9d0
  0   0   0   0 c4t12d0
  0   0   0   0 c4t13d0
  0   0   0   0 c4t14d0
  0   0   0   0 c4t15d0
  0   0   0   0 c4t16d0
  0   0   0   0 c4t17d0
  0   0   0   0 c4t18d0
  0   0   0   0 c4t19d0
  0   0   0   0 c4t20d0
  0   0   0   0 c4t21d0
  0   0   0   0 c4t22d0
  0   0   0   0 c4t23d0
  0   0   0   0 c4t30d0
  0   1  10  11 c4t31d0
  0   2  20  22 c4t32d0
  0   0   0   0 c4t33d0
  0   0   0   0 c4t34d0
  0   0   0   0 c4t35d0
  0   0   0   0 c4t36d0
  0   0   0   0 c4t37d0
  0   0   0   0 c4t38d0
  0   0   0   0 c4t39d0
  0   0   0   0 c4t40d0
  0   0   0   0 c4t41d0
  0   0   0   0 c4t42d0
  0   1  10  11 c4t43d0
  0   3  31  34 c4t44d0
  0   1  10  11 c4t45d0
  0   2  20  22 c4t46d0
  0   1  10  11 c4t47d0
  0   0   0   0 c4t48d0
  0   0   0   0 c4t49d0
  0   0   0   0 c4t50d0
  0   0   0   0 c4t51d0
  0   0   0   0 c4t52d0

In this instance, all errors are on the same (24-bay) backplane.
I have also had them on the 16-bay backplane with this two-chassis configuration.

The problem becomes more of a pain when drives drop off for a short period, 
then reconnect and resilver, or occasionally just stop until a reboot or hot 
plug.
The robustness of ZFS certainly helps keep things running.
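
When the drives drop off like that, the retry/timeout events usually also show 
up in FMA, which is another place to look while reproducing this; a rough 
sketch (nothing here is specific to this particular hardware):

  fmadm faulty                # any faults FMA has already diagnosed
  fmdump -eV | less           # raw error reports, including transport/timeout events
  tail -f /var/adm/messages   # mpt warnings as they happen during a scrub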


Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] abusing zfs boot disk for fun and DR

2010-01-09 Thread Mark Bennett
Ben,
I have found that booting from CD-ROM and importing the pool on the new host, 
then booting from the hard disk, will prevent these issues.
That will reconfigure ZFS to use the new disk device.
Once it is running, zpool detach the missing mirror device and attach a new one.
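
The last step is plain zpool surgery; a rough sketch, where rpool and the 
device names are only placeholders for whatever the new host actually shows:

  zpool status rpool                    # find the mirror half that is UNAVAIL
  zpool detach rpool c1t0d0s0           # drop the missing device
  zpool attach rpool c1t1d0s0 c1t2d0s0  # mirror the surviving disk onto a new one
  # on x86, make the new disk bootable as well
  installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t2d0s0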

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Understanding SAS/SATA Backplanes and Connectivity

2010-01-07 Thread Mark Bennett
Thanks Will,

I thought it might be an i2c interface port to the PSU, but it is obviously 
much simpler.
I'll probably use a small PICAXE micro, since I have a few here and have used 
them before.
I used them to 'translate' the replacement fan's clock pulse into what the 
monitoring circuit needed in a few V240 PSUs.
Much cheaper than replacing the whole PSU due to poor fan lifespan.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how do i prevent changing device names? is this even a problem in ZFS

2010-01-06 Thread Mark Bennett
The earlier (2008) OpenSolaris drivers tended to crash the server if you pulled 
out an active drive. That may have improved in later releases.
In the case of the Sun Storage Appliances, the SATA (and SAS) drivers used are 
different from those in OpenSolaris and are considerably better featured.

My understanding is that the chipset on the SATA cards can't handle drive 
events as well as SAS controllers do.
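
For what it's worth, on the Solaris SATA framework a drive swap usually needs a 
manual cfgadm dance, while the SAS drivers handle most of it themselves; a 
sketch (the attachment point sata1/3 is just an example):

  cfgadm -al                       # list attachment points and their state
  cfgadm -c unconfigure sata1/3    # before pulling the drive
  cfgadm -c configure sata1/3      # after inserting the replacement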

After testing both options, I concluded that SAS was the better choice because 
of this, the cabling complexity (a 24-bay chassis isn't fun), and 
expandability.
The controller and backplane price difference diminishes once you get to 24 
bays.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Understanding SAS/SATA Backplanes and Connectivity

2010-01-06 Thread Mark Bennett
Will,

Sorry for picking up an old thread, but you mentioned a PSU monitor to 
supplement the CSE-PTJBOD-CB1.
I have two of these and am interested in your design.
Oddly, the LSI backplane chipset supports 2 x i2c buses that Supermicro didn't 
make use of for monitoring the PSUs.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how do i prevent changing device names? is this even a problem in ZFS

2010-01-04 Thread Mark Bennett
I'd recommend a SAS non-RAID controller (with a SAS backplane) over SATA.
It has better hot-plug support.

I use the Supermicro SC836E1 and an AOC-USAS-L4i with a UIO motherboard.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool import without mounting

2010-01-03 Thread Mark Bennett
Hi,

Is it possible to import a zpool and stop it from mounting the ZFS file 
systems, or to override the mount paths?
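
What I have found so far, which may or may not be supported on this build (the 
pool and dataset names below are only examples):

  zpool import -N tank            # import without mounting any file systems
  zpool import -R /mnt/alt tank   # import with an alternate root prefixed to every mountpoint
  zfs set mountpoint=/some/where tank/fs   # override individual mountpoints afterwards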

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Supermicro AOC-USAS-L8i

2010-01-03 Thread Mark Bennett
I have used these cards in several UIO-capable Supermicro systems running 
OpenSolaris, with the Supermicro storage chassis and up to 30 SATA 1 TB disks.

With IT-mode firmware (non-RAID) they are excellent. They usually ship with the 
hardware-assisted RAID firmware by default.

The card is designed for the UIO slot. The cards are a mirror image of a normal 
PCIe card and may overlap adjacent slots.

They may work in other servers, but I have found some Supermicro non-UIO 
servers that wouldn't run them.

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Identify cause when disk faulted

2009-08-25 Thread Mark Bennett
On an OpenSolaris 2009.06 system I have a zpool of 12 x WD10EACS disks plus 2 
spares.
One disk is reported as FAULTED due to corrupted data.
The drive tests ok, but ZFS won't let me reuse it.

The drive passes the manufacturer's diagnostic tests, and doesn't show issues 
with HDAT2 diagnostics or SMART.

Zeroing the drive and checking it after a replace attempt shows it is being 
partitioned during the replace attempt.

Obviously, corrupted data suggests a read/write integrity problem, but is there 
any way to get more detailed info or logs out of ZFS on the reason for 
rejecting it? e.g. the sector that is failing.
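
The most detail I have been able to dig out so far comes from the FMA error 
log, so pointers beyond this are welcome (the device name below is only an 
example):

  fmadm faulty                        # the diagnosed fault that marked the disk FAULTED
  fmdump -eV | grep -B2 -A20 c2t5d0   # raw ZFS/SCSI ereports with per-error details
  zpool status -v                     # any permanent (file-level) errors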

Mark.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss