Re: [zfs-discuss] TLER and ZFS

2010-10-13 Thread Travis Tabbal
As a home user, here are my thoughts. 

WD = ignore (TLER issues, parking issues, etc)

I recently built up a server on OpenSolaris running Samsung 1.5TB drives. They are 
green drives, but they don't seem to have the irritating features found on the WD 
green drives. They are 5400 RPM, but they transfer data plenty fast for a 
home setup. The current setup is 2x6-disk raidz2. Seek times obviously hurt, and 
the ZIL caused so many issues that I turned it off. Yes, I know I might lose some 
data doing that, and yes, I'm OK with the tradeoff. The ZFS devs say I won't lose 
filesystem consistency, only uncommitted writes, roughly the last 30 seconds of 
data in most cases. As the server is on a UPS and the rest of the network isn't, 
or is on small UPSes, it will be the last box online, so any clients will probably 
have their data saved before the server goes down. The next upgrade is a UPS 
that can tell the server power is out so it can shut down gracefully. I'll 
probably get an SSD for slog/L2ARC at some point and re-enable the ZIL, but for 
now this does the job, since SSDs that don't have similar issues when used as 
slog devices are rare and expensive. If even the X25-E won't do...
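
When a suitable SSD does turn up, attaching it is a one-liner against the existing 
pool; a rough sketch with a hypothetical device name:

# add a dedicated log (slog) device so the ZIL can go back on
zpool add raid log c5t0d0
zpool status raid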

This setup with the 5400 RPM drives is significantly faster than the same box 
was with the 7200 RPM 400GB Seagate drives. Of course, those 400GB drives are a few 
years old now, but I was pleasantly surprised by the speed I get out of the 
Samsungs.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] General help with understanding ZFS performance bottlenecks

2010-06-09 Thread Travis Tabbal
NFS write performance on ZFS blows chunks. The only way to increase the 
write speed is with an slog, and the problem is that a proper slog device 
(one that doesn't lose transactions) does not exist at a reasonable price. The 
least expensive SSD that will work is the Intel X25-E, and even then you have 
to disable its write cache, which kills performance. And if you lose 
transactions in the ZIL, you may as well not have one.

Switching to a pool configuration with mirrors might help some. You will still 
get hit with sync write penalties on NFS though. 

Before messing with that, try disabling the ZIL entirely and see if that's 
where your problem is. Note that running without a ZIL can cost you up to 
about 30 seconds of uncommitted data, and if the server crashes without the 
clients rebooting, you can get corrupted data (from the clients' perspective). 
However, it solved the performance issue for me. 
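
For reference, on current builds the ZIL can only be turned off globally; the usual 
approach looks roughly like this (double-check the tunable name against your build 
before relying on it):

# /etc/system -- disable the ZIL globally, takes effect on reboot
set zfs:zil_disable = 1

# or toggle it live for a quick test (remount or re-import the pool
# afterwards so it takes effect; reverts at reboot)
echo zil_disable/W0t1 | mdb -kw
echo zil_disable/W0t0 | mdb -kw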

If that works, you can then decide how important the ZIL is to you. Personally, 
I like things to be correct, but that doesn't help me if performance is in the 
toilet. In my case, the server is on a UPS and the clients aren't. And most of the 
clients netboot anyway, so they will crash and have to be rebooted if the 
server goes down. So for me the drawback is small while the performance gain 
is huge. That's not the case for everyone, and it's up to the admin to decide 
what they can live with. Thankfully, the next release of OpenSolaris will have 
the ability to turn the ZIL on/off per filesystem. 
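
Once that per-filesystem knob lands (it eventually shipped as the sync property in 
later builds), it should look roughly like this; a sketch with a made-up dataset 
name, not something available on the build discussed here:

# disable synchronous write semantics on just the NFS-exported dataset
zfs set sync=disabled raid/export
# put it back to the default
zfs set sync=standard raid/export
zfs get sync raid/export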

Note that the ZIL only affects sync write speed, so if your workload isn't sync 
heavy, it might not matter in your case. However, with NFS in the mix, it 
probably is. The ZFS on-disk data state is not affected by the ZIL being on or 
off, so your pool's data IS safe. You might lose some data that a client THINKS 
is safely written, but the ZFS pool will come back properly on reboot. So the 
client will be wrong about what is and is not written, hence the possible 
corruption from the client's perspective. 

I run ZFS with two 6-disk raidz2 vdevs in the same pool and performance is very 
good locally. With the ZIL enabled, NFS performance was so bad it was nearly 
unusable. With it disabled, I can saturate the single gigabit link, and 
performance in the Linux VM (xVM) running on that server improved 
significantly, to near-local speed, when using the NFS mounts to the main pool. 
My 5400 RPM drives were not up to the ZIL's needs, though they are plenty fast in 
general, and a working slog was out of budget for a home server.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Interesting experience with Nexenta - anyone seen it?

2010-05-20 Thread Travis Tabbal
Disable the ZIL and test again. NFS does a lot of sync writes, which kills 
performance. Disabling the ZIL (or using the synchronicity option, if a build with 
that ever comes out) will prevent that behavior and should get your NFS 
performance close to local. It's up to you whether you want to leave it that way; 
there are reasons not to as well. NFS clients can get a corrupted view of the 
filesystem should the server go down before a write flush is completed, and the 
ZIL prevents that problem. In my case, the clients aren't on a UPS while the server 
is, so it's not an issue. :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New SSD options

2010-05-20 Thread Travis Tabbal
 On May 19, 2010, at 2:29 PM, Don wrote:
 
 The data risk is a few moments of data loss. However, if the order of the
 uberblock updates is not preserved (which is why the caches are flushed)
 then recovery from a reboot may require manual intervention.  The amount
 of manual intervention could be significant for builds prior to b128.


This risk is mostly mitigated by UPS backup and auto-shutdown when the UPS 
detects power loss, correct? Outside of pulling the plug, that should solve 
power-related problems. Kernel panics should only be caused by hardware issues, 
which might corrupt the on-disk data anyway. Obviously software can and does fail, 
but the biggest problem I hear about with ZIL devices is their behavior in a sudden 
power-loss situation. It seems to me that a UPS, along with starting a 
shutdown cycle before complete power failure, should prevent most issues. 

It seems like that should also help with issues like the X25-E not honoring cache 
flush; the UPS would give it time to finish its writes. Barring a firmware issue 
in the drive itself, that should be about as good as a supercap anyway.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] New SSD options

2010-05-20 Thread Travis Tabbal
 use a slog at all if it's not durable?  You should disable the ZIL instead.


This is basically where I was going. There only seems to be one SSD that is 
considered to work properly, the Zeus IOPS, and even if I had the money, I can't 
buy one. As my application is a home server, not a datacenter, things like NFS 
breaking if I don't reboot the clients are a non-issue. As long as the on-disk 
data is consistent, so I don't have to worry about the entire pool going belly-up, 
I'm happy enough. I might lose 30 seconds of data, worst case, as a result of 
running without the ZIL. Considering that I can't buy a proper ZIL device at a 
cost I can afford, and an improper one is not worth much, I don't see a reason to 
bother with the ZIL at all. I'll just get a cheap large SSD for L2ARC, disable the 
ZIL, and call it a day. 
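
The cache-only plan is a single command; a rough sketch with a hypothetical device 
name:

# use the whole SSD as L2ARC; a failed cache device only costs cached reads,
# never pool integrity, so a cheap consumer drive is fine here
zpool add raid cache c6t0d0
zpool iostat -v raid 5    # watch the cache device warm up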

For my use, I'd want a device in the $200 range to even consider an slog 
device. As nothing even remotely close to that price range exists that will 
work properly at all, let alone with decent performance, I see no point in ZIL 
for my application. The performance hit is just too severe to continue using it 
without an slog, and there's no slog device I can afford that works properly, 
even if I ignore performance.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Strategies for expanding storage area of home storage-server

2010-05-17 Thread Travis Tabbal
When I did a similar upgrade a while back I went with #2: create a new 6-drive 
raidz2 pool, copy the data to it, verify the data, destroy the old pool, then 
combine the old drives with some new ones into a second 6-disk raidz2 vdev added 
to the new pool (a rough sketch of the commands follows the layout below). 
Performance has been quite good, and the migration was very smooth. 

The other nice thing about this arrangement for a home user is that I now only 
need to upgrade 6 drives to get more space, rather than 12 as with option #1. To be 
clear, this is my current config: 

NAME         STATE     READ WRITE CKSUM
raid         ONLINE       0     0     0
  raidz2-0   ONLINE       0     0     0
    c9t4d0   ONLINE       0     0     0
    c9t5d0   ONLINE       0     0     0
    c9t6d0   ONLINE       0     0     0
    c9t7d0   ONLINE       0     0     0
    c10t5d0  ONLINE       0     0     0
    c10t4d0  ONLINE       0     0     0
  raidz2-1   ONLINE       0     0     0
    c9t0d0   ONLINE       0     0     0
    c9t1d0   ONLINE       0     0     0
    c10t0d0  ONLINE       0     0     0
    c10t1d0  ONLINE       0     0     0
    c10t2d0  ONLINE       0     0     0
    c10t3d0  ONLINE       0     0     0
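
Roughly, the migration went like this; a sketch only, with the old pool name made 
up, and it assumes the six new drives can hold everything before the old pool is 
destroyed:

# build the new pool from the six new drives
zpool create raid raidz2 c9t4d0 c9t5d0 c9t6d0 c9t7d0 c10t5d0 c10t4d0
# copy everything over, then verify it
zfs snapshot -r oldpool@migrate
zfs send -R oldpool@migrate | zfs recv -dF raid
zpool scrub raid
# retire the old pool and reuse its drives as a second raidz2 vdev
zpool destroy oldpool
zpool add raid raidz2 c9t0d0 c9t1d0 c10t0d0 c10t1d0 c10t2d0 c10t3d0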
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Replacement brackets for Supermicro UIO SAS cards....

2010-05-04 Thread Travis Tabbal
Thanks! I might just have to order a few for the next time I take the server 
apart. Not that my bent up versions don't work, but I might as well have them 
be pretty too. :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?

2010-04-27 Thread Travis Tabbal
 I've got an OCZ Vertex 30gb drive with a 1GB stripe used for the slog
 and the rest used for the L2ARC, which for ~ $100 has been a nice
 boost to nfs writes.


What about the Intel X25-V? I know it will likely be fine for L2ARC, but what 
about ZIL/slog?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?

2010-04-26 Thread Travis Tabbal
 If your clients are mounting async don't bother. If the clients are
 mounting async, then all the writes are done asynchronously, fully
 accelerated, and never any data written to the ZIL log.


I've tried async. Things run well until you get to the end of the job, then the 
process hangs until the writes complete. This was just tar extracting to the NFS 
mount.
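
For context, async here is a client-side mount option; on a Linux client it looks 
something like this (server, export path, and mount point are made up):

# async lets the client buffer writes instead of waiting for the server to
# commit each one; the final close/fsync still blocks until everything lands
mount -t nfs -o vers=3,async nas:/raid/export /mnt/export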
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?

2010-04-26 Thread Travis Tabbal
  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Travis Tabbal
 
 Oh, one more thing.  Your subject says ZIL/L2ARC and your message says I
 want to speed up NFS writes.
 
 ZIL (log) is used for writes.
 L2ARC (cache) is used for reads.
 
 I'd recommend looking at the ZFS Best Practices Guide.

At the end of my OP I mentioned that I was interested in L2ARC for dedupe. It 
sounds like the DDT can get bigger than RAM and slow things to a crawl. Not 
that I expect a lot from using an HDD for that, but I thought it might help. 
I'd like to get a nice SSD or two for this stuff, but that's not in the budget 
right now.
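
One way I could gauge how big the DDT actually gets before throwing hardware at it 
(output format varies by build, so treat this as a sketch):

# dedup table statistics, including in-core and on-disk size per entry
zdb -DD raid
# or simulate the table a pool would need if dedup were turned on
zdb -S raid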
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Thoughts on drives for ZIL/L2ARC?

2010-04-25 Thread Travis Tabbal
I have a few old drives here that I thought might help me a little, though not 
as much as a nice SSD, for those uses. I'd like to speed up NFS writes, and 
there have been some mentions that even a decent HDD can do this, though not to 
the level a good SSD will.

The 3 drives are older LVD SCSI Cheetahs, ST318203LW. I have 2 controllers I 
could use; one appears to be a RAID controller with a memory module installed, 
an Adaptec AAA-131U2. The memory module comes up on Google as a 2MB EDO DIMM. 
Not sure that's worth anything to me. :) 

The other controller is an Adaptec 29160. Looks to be a 64-bit PCI card, but 
the machine it came from is only 32-bit PCI, as is my current machine. 

What say the pros here? I'm concerned that the max data rate is going to be 
somewhat low with them, but the seek time should be good as they are 10K RPM (I 
think). The only reason I thought to use one for L2ARC is for dedupe. It sounds 
like L2ARC helps a lot there. This is for a home server, so all I'm really 
looking to do is speed things up a bit while I save and look for a decent SSD 
option. However, if it's a waste of time, I'd rather find out before I install 
them.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Non-redundant zpool behavior?

2010-03-04 Thread Travis Tabbal
I have a small stack of disks that I was considering putting in a box to build 
a backup server. It would only store data that is duplicated elsewhere, so I 
wouldn't really need redundancy at the disk layer. The biggest issue is that 
the disks are not all the same size, so I can't really do a raidz or mirror 
with them anyway. I was therefore considering just putting them all in one pool. 
My question is: how does the pool behave if I lose one disk? Can I still 
access the data on the other disks, or is it like a traditional RAID-0 where I 
lose the whole pool? Is there a better way to deal with this using my old 
mismatched hardware? 

Yes, I could probably build a raidz by partitioning and such, but I'd like to 
avoid the complexity. I'd probably just use zfs send/recv to send snapshots 
over, or perhaps CrashPlan.
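
A minimal sketch of what I have in mind, with made-up device, host, and dataset 
names:

# on the backup box: concatenate the mismatched disks, no redundancy
zpool create backup c2t0d0 c2t1d0 c3t0d0
# on the main server: push a full snapshot over, then only changes later
zfs snapshot -r raid/laptops@2010-03-04
zfs send -R raid/laptops@2010-03-04 | ssh backupbox zfs recv -dF backup
zfs snapshot -r raid/laptops@2010-03-11
zfs send -R -i @2010-03-04 raid/laptops@2010-03-11 | ssh backupbox zfs recv -dF backup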
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Non-redundant zpool behavior?

2010-03-04 Thread Travis Tabbal
Thanks. That's what I expected the case to be. Any reasons this shouldn't work 
for strictly backup purposes? Obviously, one disk down kills the pool, but as I 
only ever need to care if I'm restoring, that doesn't seem to be such a big 
deal. It will be a secondary backup destination for local machines like laptops 
that don't have redundant storage. The primary backups will still be hosted on 
the main server with 2 raidz2 arrays. 

The only downside I can see to this idea is that I was expecting it to be used 
as an offsite backup as well, so in a real disaster I might have only a single 
non-redundant copy of the data. That alone might be enough reason for me not to 
do it. After getting used to redundancy, it's hard to go back to not having it. 
:)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-04 Thread Travis Tabbal
Supermicro USAS-L8i controllers. 

I agree with you; I'd much rather have the drives respond properly and promptly 
than save a little power, if saving power means I'm going to get strange errors 
from the array. And these are the green drives; they just don't seem to cause me 
any problems. The issues people have noted with WD have made me stay away from 
them, as just about every drive I own lives in some kind of RAID at some point in 
its life. I have a couple of laptop drives that are single, but all the desktops 
have at least a mirror. I'm a little nuts and would probably install mirrors in 
the laptops if there were somewhere to put them. :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-02-03 Thread Travis Tabbal
smartmontools doesn't work with my controllers. I can try it again when the 2 
new drives I've ordered arrive. I'll try connecting to the motherboard ports 
and see if that works with smartmontools. 

I haven't noticed any sleeping with the drives. I don't get any lag accessing 
the array or any error messages about them disappearing.
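
When I try the motherboard ports, I'm planning on something like this; the -d 
passthrough value is a guess that depends on the controller, and the device name 
is hypothetical:

# query SMART data through SAT passthrough on a Solaris raw device
smartctl -a -d sat,12 /dev/rdsk/c4t0d0s0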
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I can't seem to get the pool to export...

2010-01-18 Thread Travis Tabbal
On Sun, Jan 17, 2010 at 8:14 PM, Richard Elling richard.ell...@gmail.com wrote:

 On Jan 16, 2010, at 10:03 PM, Travis Tabbal wrote:

  Hmm... got it working after a reboot. Odd that it had problems before
 that. I was able to rename the pools and the system seems to be running well
 now. Irritatingly, the settings for sharenfs, sharesmb, quota, etc. didn't
 get copied over with the zfs send/recv. I didn't have that many filesystems
 though, so it wasn't too bad to reconfigure them.

 What OS or build?  I've had similar issues with b130 on all sorts of mounts
 besides ZFS.



OpenSolaris snv_129.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-01-17 Thread Travis Tabbal
I've been having good luck with the Samsung green 1.5TB drives. I have had 1 DOA, 
but I currently have 10 of them, so that's not so bad; in a purchase that size 
I'd expect one bad drive from just about any manufacturer. I've avoided WD for 
RAID because of the error-handling behavior kicking drives out of arrays. I don't 
know if that's currently an issue though. And with Seagate's recent record, I 
didn't feel confident in their larger drives. I was concerned about the 5400 RPM 
speed being a problem, but I can read over 100MB/s from the array, and 95% of my 
use is over a gigabit LAN, so they are more than fast enough for my needs. 

I just set up a new array with them, 6 in a raidz2. The resilver time on drives 
this size is high enough that I decided the extra parity was worth the cost, even 
for a home server. I need 2 more drives, then I'll migrate the other 4 from the 
older array into another 6-drive raidz2 and add it to the pool. 

I have decided to treat HDDs as completely untrustworthy. So when I get new 
drives, I test them by creating a temporary pool in a mirror config and filling 
it by copying data from the primary array, then running a scrub. If that finishes 
with no errors, and nothing shows up in dmesg, I wait a week or so and do another 
scrub. I found a bad SATA hotswap backplane and a bad drive this way. There are 
probably faster ways, but this works for me.
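
Roughly, the burn-in looks like this (device names and source path are made up):

# throwaway mirror of the two new drives
zpool create -f testpool mirror c7t0d0 c7t1d0
# fill it with real data, then verify every block end to end
rsync -a /raid/media/ /testpool/
zpool scrub testpool
zpool status -v testpool    # any READ/WRITE/CKSUM counts mean a bad drive or path
# repeat the scrub after a week or so, then tear it down
zpool destroy testpool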
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?

2010-01-17 Thread Travis Tabbal
HD154UI/1AG01118

They have been great drives for a home server. Enterprise users probably need 
faster drives for most uses, but they work great for me.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] I can't seem to get the pool to export...

2010-01-16 Thread Travis Tabbal
r...@nas:~# zpool export -f raid
cannot export 'raid': pool is busy

I've disabled all the services I could think of. I don't see anything accessing 
it. I also don't see any of the filesystems mounted with mount or zfs mount. 
What's the deal?  This is not the rpool, so I'm not booted off it or anything 
like that. I'm on snv_129. 

I'm attempting to move the main storage to a new pool. I created the new pool, 
used zfs send | zfs recv for the filesystems. That's all fine. The plan was 
to export both pools, and use the import to rename them. I've got the new pool 
exported, but the older one refuses to export. 

Is there some way to get the system to tell me what's using the pool?
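
For reference, the things I've already poked at or plan to check next; pool name 
as above, and the share checks assume the NFS/SMB shares I had configured:

# anything from the pool still mounted or shared?
zfs mount | grep '^raid'
zfs get -r sharenfs,sharesmb raid
# any process with open files or a working directory under the pool?
fuser -c /raid
# swap or dump devices living on a zvol in the pool would also hold it busy
swap -l
dumpadm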
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] I can't seem to get the pool to export...

2010-01-16 Thread Travis Tabbal
Hmm... got it working after a reboot. Odd that it had problems before that. I 
was able to rename the pools and the system seems to be running well now. 
Irritatingly, the settings for sharenfs, sharesmb, quota, etc. didn't get 
copied over with the zfs send/recv. I didn't have that many filesystems though, 
so it wasn't too bad to reconfigure them.
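
One way I could avoid re-typing them next time is to dump the locally set 
properties before the migration; a sketch against the old pool (and a full 
zfs send -R replication stream is documented to carry properties along, which 
would be worth testing too):

# record every property that was set locally rather than inherited/defaulted
zfs get -r -s local -H -o name,property,value all raid > raid-props.txt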
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raidz data loss stories?

2009-12-22 Thread Travis Tabbal
Interesting discussion. I know the bias here is generally toward enterprise 
users; I was wondering if the same recommendations hold for home users, who are 
generally more price sensitive. I'm currently running OpenSolaris on a system 
with 12 drives. I had split them into three 4-disk raidz1 vdevs. This made some 
sense at the time, as I could upgrade 4 disks at a time as new sizes came out. 
However, with 8 of the disks now being 1.5TB, I'm getting concerned about 
this strategy. While the important data is backed up, losing the server data 
would be very irritating. 

My next thought was to get more drives and run a single raidz3 vdev of 
12x1.5TB: more space than I need for quite a while (since I can't add just a 
few drives later), and triple parity for protection. I'd need a few extra drives 
to hold the data while I rebuild the main array, so I'd have cold spares available 
that I would use for backing up critical data from the server; they would see use 
and scrubs rather than just sitting on the shelf. Access is over a gigE network, 
so I don't need more performance than that. I have read that the overall random 
performance of a raidz vdev is approximately that of a single device in the vdev, 
and in this case that is more than fast enough. I'm curious what the experts here 
think of this new plan. I'm pretty sure I know what you all think of the old one. :) 
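
For concreteness, the layout I'm describing would be created in one shot, 
something like this (device names hypothetical):

# one 12-wide raidz3 vdev: 9 data drives, 3 parity, any three can fail
zpool create tank raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
    c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0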

Do you recommend swapping spare drives into the array periodically? It seems 
like it wouldn't really be any better than running a scrub over the same period, 
but I've heard of people doing it on hardware RAID controllers.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] raidz data loss stories?

2009-12-22 Thread Travis Tabbal
 Everything I've seen you should stay around 6-9 drives for raidz, so don't
 do a raidz3 with 12 drives.  Instead make two raidz3 with 6 drives each
 (which is (6-3)*1.5 * 2 = 9 TB array.)

So the question becomes, why? If it's performance, I can live with lower IOPS 
and max throughput. If it's reliability, I'd like to hear why. I would think 
that the number of acceptable devices in a raidz would scale somewhat with the 
number of drives used for parity. So I would expect to see a sliding scale 
somewhat like the one mentioned before regarding disk size vs. raidz level. 

For example: 

3-4 drives: raidz1
4-8 drives: raidz2
8+ drives: raidz3

In practice, I would expect to see some kind of chart with number of devices 
and size of devices used together to determine the proper raidz level. Perhaps 
I'm way off base though. Note that I don't really have a problem doing 2 
arrays, but I would think that perhaps raidz2 would be acceptable in that 
configuration. The benefit to that config for me would be that I could create a 
parallel array of 6 to copy my existing data to, then add the second array 
after the initial file copy/scrub. I would need fewer disks to complete the 
transition.
 
 As for whether or not to do raidz, for me the issue is performance.  I
 can't handle the raidz write penalty.  If I needed triple drive protection,
 a 3way mirror setup would be the only way I would go.  I don't yet quite
 understand why a 3+ drive raidz2 vdev is better than a 3 drive mirror vdev?
 Other than a 5 drive setup is 3 drives of space when a 6 drive setup using
 3 way mirror is only 2 drive space.

I've already stipulated that performance is not the primary concern. 100MB/sec 
with reasonable random I/O for a max of 5 clients is more than enough. My 
existing raidz is more than fast enough for my needs, and I have 5400RPM drives 
in there. 

I'd be very interested to hear an expert opinion on this. Given, say, 6 disks, 
what advantage in reliability, if any, would a raidz3 have vs. a striped pair 
of 3-way mirrors? Obviously the raidz3 has 1 disk's worth of extra space, but 
we're talking about reliability here. I would guess performance would be higher 
with the mirrors.
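
For reference, the two 6-disk layouts I'm comparing would be built like this 
(hypothetical device names); as I understand it, the raidz3 survives any three 
failed drives, while the mirror pair survives up to two failures per mirror but 
not three in the same mirror:

# 6-disk raidz3: 3 data + 3 parity, any three drives can fail
zpool create tank raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

# striped pair of 3-way mirrors: 2 disks of usable space
zpool create tank mirror c1t0d0 c1t1d0 c1t2d0 mirror c1t3d0 c1t4d0 c1t5d0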

With all of my comments, please keep in mind that I am not a huge enterprise 
customer with loads of money to spend on this. If I were, I'd just buy 
Thumpers. I'm a home user with a decent fileserver.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How can we help fix MPT driver post build 129

2009-12-07 Thread Travis Tabbal
To be fair, I think it's obvious that Sun people are looking into it and that 
users are willing to help diagnose and test. There were requests for particular 
data in those threads you linked to; have you sent yours? It might help them 
find a pattern in the errors. 

I understand the frustration that it hasn't been fixed in the couple of builds 
since they became aware of it, but it could be a very tricky problem. It also 
sounds like it's not reproducible on Sun hardware, so they have to get the cards 
and such as well. It's also less urgent now that they have identified a 
workaround that works for most of us. While disabling MSIs is not optimal, it 
does help a lot.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mpt errors on snv 127

2009-12-01 Thread Travis Tabbal
If someone from Sun will confirm that it should work to use the mpt driver from 
2009.06, I'd be willing to set up a BE and try it. I still have the snapshot 
from my 2009.06 install, so I should be able to mount that and grab the files 
easily enough.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-12-01 Thread Travis Tabbal
Just an update, my scrub completed without any timeout errors in the log. XVM 
with MSI disabled globally.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-12-01 Thread Travis Tabbal
Perhaps. As I noted though, it also occurs on the onboard NVidia SATA 
controller when MSI is enabled. I had already put a line in /etc/system to 
disable MSI for that controller per a forum thread, and it worked great. I'm now 
running with all MSI disabled via XVM, as the mpt controller is giving me the 
same problems. As it's happening on totally different controller types, cable 
types, and drive types, I have to conclude it's a software issue. I know for sure 
the NVidia issue didn't come up on 2009.06. It makes the system take forever to 
boot, so it's very noticeable. It happened when I first went to the dev builds; I 
want to say it was around b118. I updated for better XVM support for newer 
Linux kernels. 

The NVidia controller causes similar log messages. Command timeouts. Disabling 
MSIs fixes it as well. Motherboard is an Asus M4N82 Deluxe. NVIDIA nForce 980a 
SLI chipset.

I expect the root cause is the same, and I would guess that something is 
causing the drivers to miss or not receive some interrupts. However, my 
experience programming at this level is limited, so perhaps I'm misdiagnosing 
the issue.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-30 Thread Travis Tabbal
 o The problems are not seen with Sun's version of this card

Unable to comment as I don't have a Sun card here. If Sun would like to send me 
one, I would be willing to test it compared to the cards I do have. I'm running 
Supermicro USAS-L8i cards (LSI 1068e based). 

 o The problems are not seen with LSI's version of the driver

I haven't tried it as comments from Sun staff here have indicated that it's not 
a good idea. 

 o The problems are seen with the latest LSI firmware

Yes. When I checked, the LSI site was listing the version I see at boot. 

 o Errors still occur if MSIs are disabled.  

I haven't seen any command timeout errors since disabling MSIs. I tried using 
the command to disable MSI only for the mpt driver, but then I get a similar 
error from the NVidia driver, as it has my boot drives. It seems to me that the 
issue has more in common with MSIs than with the drivers themselves. I have a 
scrub scheduled for 12/1, so I can check the logs after that to see if it 
reappears. My other tests have not triggered the issue since disabling MSIs. 
I'm currently running with set xpv_psm:xen_support_msi = -1.

I am not using any jbod enclosures. My setup uses SAS-to-SATA breakout cables 
and connects directly to the drives. I have tried different cables and lengths. 
The timeouts affected drives in a seemingly random fashion; I would get 
timeouts on both controllers and on every drive over time.

I have never had command errors here. Just the timeouts.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-30 Thread Travis Tabbal
 (1) disabling MSI support in xVM makes the problem go away

Yes here.


 (6) mpt(7d) without MSI support is sloow.


That does seem to be the case. It's not so bad overall, and at least the 
performance is consistent. It would be nice if this were improved. 


 For those of you who have been running xVM without MSI support,
 could you please confirm whether the devices exhibiting the problem
 are internal to your host, or connected via jbod. And if via jbod,
 please confirm the model number and cables.


Direct connect. The drives are in hot-swap racks, but they are passive devices. 
No expanders or anything like that in there. In case it's interesting, the 
racks are StarTech HSB430SATBK devices. I'm using SAS to SATA breakout cables 
to connect them. I have tried different lengths with the same result.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-24 Thread Travis Tabbal
 
 On Nov 23, 2009, at 7:28 PM, Travis Tabbal wrote:
 
  I have a possible workaround. Mark Johnson mark.john...@sun.com has been
  emailing me today about this issue and he proposed the following:
 
  You can try adding the following to /etc/system, then rebooting...
  set xpv_psm:xen_support_msi = -1
 
 would this change affect systems not using XVM?  we are just using these
 as backup storage.

Probably not. Are you seeing the issue without XVM installed? We had one other 
user report that the issue went away when they removed XVM, so I had thought it 
wouldn't affect other users. If you are getting the same issue without XVM, 
there may be overlapping bugs in play. Someone at Sun might be able to tell you 
how to disable MSI on the controller. Someone told me how to do it for the 
NVidia SATA controller when there was a bug in that driver. I would think there 
is a way to do it for the MPT driver.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-24 Thread Travis Tabbal
 Travis Tabbal wrote:
  I have a possible workaround. Mark Johnson mark.john...@sun.com has
  been emailing me today about this issue and he proposed the following:
  
  You can try adding the following to /etc/system, then rebooting... 
  set xpv_psm:xen_support_msi = -1
 
 I am also running XVM, and after modifying /etc/system and rebooting, my
 zpool scrub test is running along merrily with no hangs so far, where
 usually I would expect to see several by now.
 
 Can the other folks who have seen this please test and report back? I'd
 hate to think we solved it only to discover there were overlapping bugs.
 
 Fingers crossed, and many thanks to those who have worked to track this
 down!


Nice to see we have one confirmed report that things are working. Hopefully we 
get a few more! Even if it's just a workaround until a real fix makes it in, it 
gets us running.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-23 Thread Travis Tabbal
 I will give you all of this information on monday.  This is great news :)


Indeed. I will also be posting this information when I get to the server 
tonight; perhaps it will help. I don't think I want to try using that old 
driver, though. It seems too risky for my taste. 

Is there a command to get the disk firmware rev from OpenSolaris while booted 
up? I know of some boot CDs that can get to it, but I'm unsure how to access 
it while the server is running.
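
One candidate I plan to try is iostat, which should report this on a live system 
without any downtime:

# prints vendor, product, firmware revision, and serial number for each disk
iostat -En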
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-23 Thread Travis Tabbal
I have a possible workaround. Mark Johnson mark.john...@sun.com has been 
emailing me today about this issue and he proposed the following: 

 You can try adding the following to /etc/system, then rebooting...
  set xpv_psm:xen_support_msi = -1

I have been able to format a ZVOL container from a VM 3 times while other 
activity is going on on the system, and it's working. I think performance is down 
a bit, but it's still acceptable. More importantly, it does so without killing 
the server; I would get the stall every time I tried this test before. So at 
least one case seems to be helped by doing this. I'll watch the server over 
the next few days to see if it stays improved. He mentioned that a fix for MSI 
handling in XVM is being worked on that might make it into b129 and could fix 
this problem.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-20 Thread Travis Tabbal
 The latter, we run these VMs over NFS anyway and had ESXi boxes under
 test already. we were already separating data exports from VM exports.
 We use an in-house developed configuration management/bare metal system
 which allows us to install new machines pretty easily. In this case we
 just provisioned the ESXi VMs to new VM exports on the Thor whilst
 re-using the data-exports as they were...


Thanks for the info. Unfortunately, I need this box to do double duty and run 
the VMs as well. The hardware is capable; this issue with XvM and/or the mpt 
driver just needs to get fixed. Other than that, things are running great with 
this server.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding

2009-11-12 Thread Travis Tabbal
On Wed, Nov 11, 2009 at 10:25 PM, James C. McPherson j...@opensolaris.org wrote:


 The first step towards acknowledging that there is a problem
 is you logging a bug in bugs.opensolaris.org. If you don't, we
 don't know that there might be a problem outside of the ones
 that we identify.



I apologize if I offended by not knowing the protocol. I thought that
posting in the forums was watched and the bug tracker updated by people at
Sun. I didn't think normal users had access to submit bugs. Thank you for
the reply. I have submitted a bug on the issue with all the information I
think might be useful. If someone at Sun would like more information, output
from commands, or testing, I would be happy to help.

I was not provided with a bug number by the system. I assume that those are
given out if the bug is deemed worthy of further consideration.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding

2009-11-12 Thread Travis Tabbal
I submitted a bug on this issue. It looks like you can reference other bugs when 
you submit one, so everyone having this issue could link mine and submit their 
own hardware config. It sounds like it's widespread though, so I'm not sure if 
that would help or hinder; I'd hate to bury the developers/QA team under a 
mountain of duplicate reports. 

CR 6900767
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding

2009-11-12 Thread Travis Tabbal
 What type of disks are you using?

I'm using SATA disks with SAS-SATA breakout cables. I've tried different cables 
as I have a couple spares. 

mpt0 has 4x1.5TB Samsung Green drives. 
mpt1 has 4x400GB Seagate 7200 RPM drives.

I get errors from both adapters. Each adapter has an unused SAS channel 
available. If I can get this fixed, I'm planning to populate those as well.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding

2009-11-12 Thread Travis Tabbal
 Have you tried wrapping your disks inside LVM
 metadevices and then used those for your ZFS pool?

I have not tried that. I could try it with my spare disks I suppose. I avoided 
LVM as it didn't seem to offer me anything ZFS/ZPOOL didn't.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-11-12 Thread Travis Tabbal
  I'm running nv126 XvM right now. I haven't tried it without XvM.
 
 Without XvM we do not see these issues. We're running the VMs through NFS
 now (using ESXi)...

Interesting. It sounds like it might be an XvM specific bug. I'm glad I 
mentioned that in my bug report to Sun. Hopefully they can duplicate it. I'd 
like to stick with XvM as I've spent a fair amount of time getting things 
working well under it. 

How did your migration to ESXi go? Are you using it on the same hardware or did 
you just switch that server to an NFS server and run the VMs on another box?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding

2009-11-11 Thread Travis Tabbal
 Hi, you could try LSI itmpt driver as well, it seems to handle this
 better, although I think it only supports 8 devices at once or so.
 
 You could also try more recent version of opensolaris (123 or even 126),
 as there seems to be a lot fixes regarding mpt-driver (which still seems
 to have issues).


I won't speak for the OP, but I've been seeing this same behaviour on 126 with 
LSI 1068E based cards (Supermicro USAS-L8i). 

For the LSI driver, how does one install it? I'm new to OpenSolaris and don't 
want to mess anything up. It looks to be very old; is Solaris backward 
compatibility that good? 

It would be really nice if Sun would at least acknowledge the bug and say whether 
they can reproduce it. I'm happy to supply information and test things if it 
will help. I have some spare disks I can attach to one of these cards to test 
driver updates and such. It sounds like people with Sun hardware are 
experiencing this as well.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding

2009-11-11 Thread Travis Tabbal
 Have you tried another SAS-cable?


I have. 2 identical SAS cards, different cables, different disks (brand, size, 
etc). I get the errors on random disks in the pool. I don't think it's hardware 
related as there have been a few reports of this issue already.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-31 Thread Travis Tabbal
I am also running 2 of the Supermicro cards. I just upgraded to b126 and it 
seems improved. When I run a large file copy locally, I get these warnings in 
the dmesg log, and when I do, I/O seems to stall for about 60 seconds. It comes 
back up fine, but it's very annoying. Any hints? I have 4 disks per controller 
right now, different brands, sizes, everything, with new SATA fanout cables and 
no expanders. 

The drives on mpt0 and mpt1 are completely different, 4x400GB Seagate drives, 
4x1.5TB Samsung drives. I get the problem from both controllers. I didn't 
notice this till about b124. I can reproduce it with rsync copying files 
locally between ZFS filesystems and with --bwlimit=1 (10MB/sec). Keeping 
the limit low does seem to help. 

---

Oct 31 23:05:32 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:05:32 nas Disconnected command timeout for Target 7
Oct 31 23:09:42 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@2/pci15d9,a...@0 (mpt0):
Oct 31 23:09:42 nas Disconnected command timeout for Target 1
Oct 31 23:16:23 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@2/pci15d9,a...@0 (mpt0):
Oct 31 23:16:23 nas Disconnected command timeout for Target 3
Oct 31 23:18:43 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:18:43 nas Disconnected command timeout for Target 6
Oct 31 23:27:24 nas scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:27:24 nas Disconnected command timeout for Target 7
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] bewailing of the n00b

2009-10-23 Thread Travis Tabbal
 - How can I effect OCE with ZFS? The traditional 'back up all the data
 somewhere, add a drive, re-establish the file system/pools/whatever, then
 copy the data back' is not going to work because there will be nowhere to
 temporarily 'put' the data.

Add devices to the pool, preferably as mirrors or raidz configurations; if you 
just add bare devices you are effectively running RAID-0, with no redundancy. 
You cannot add devices to an existing raidz, as mentioned, but you can add more 
raidz or mirror vdevs, and you can also replace devices with larger ones (a 
couple of concrete sketches below). It would be nice to be able to grow a raidz 
for home users like us; maybe we'll see it someday. For now, the capabilities we 
do have make it reasonable to deal with. 
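
For example, the two expansion paths look roughly like this (device names 
hypothetical):

# grow the pool by striping in another mirror vdev
zpool add tank mirror c2t0d0 c2t1d0
# or grow an existing vdev by replacing its drives with bigger ones, one at
# a time, letting each resilver finish; the extra space shows up once every
# member has been replaced (an export/import may be needed on some builds)
zpool replace tank c1t0d0 c3t0d0
zpool status tank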

 - Concordantly, Is ZFS affected by a RAID card that supports OCE? Or is
 this to no advantage?


Don't bother. Spend the money on more RAM and drives. :) Do get a nice 
controller though. Supermicro makes a few nice units; I'm using 2 AOC-USAS-L8i 
cards. They work great, though you do have to mod the mounting bracket to get 
them into a standard case. They are based on LSI cards; I just found them 
cheaper than the same LSI-branded card. Avoid the cheap $20 4-port jobs. 
I've had a couple of them die already. Thankfully, I didn't lose any data... I 
think... no ZFS on that box. 


 - RAID5/6 with ZFS: As I understand it, ZFS with raidz will provide the
 data/drive redundancy I seek [home network, with maybe two simultaneous
 users on at least a p...@1ghz/1Gb RAM storage server] so obtaining a RAID
 controller card is unnecessary/unhelpful. Yes?


Correct. Though I would increase the RAM personally; it's so cheap these days. 
My home fileserver has 8GB of ECC RAM. I'm also running Xen VMs though, so some 
of my RAM is used for running those. 

You can even do triple-parity raidz with ZFS now, so you could lose 3 drives 
without any data loss, for those that want really high availability or really 
big arrays. I'm running 4x1.5TB in a raidz1 with no problems. I do plan to keep 
a spare around though; I'll just use it to store backups to start with, and if 
a drive goes bad, I'll drop it in and do a zpool replace. 

Don't worry about the command line. The ZFS based commands are pretty short and 
simple. Read up on zpool and zfs. Those are the commands you use the most for 
managing ZFS. There's also the ZFS best practices guide if you haven't seen it. 
Useful advice in there.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool with very different sized vdevs?

2009-10-23 Thread Travis Tabbal
Hmm.. I expected people to jump on me yelling that it's a bad idea. :) 

How about this: can I remove a vdev from a pool if the pool still has enough 
space to hold the data? That way I could add it in and mess with it for a while 
without losing anything. I would expect the system to resilver the data onto 
the remaining vdevs, or tell me to go jump off a pier. :)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool with very different sized vdevs?

2009-10-22 Thread Travis Tabbal
I have a new array of 4x1.5TB drives running fine. I also have the old array of 
4x400GB drives in the box on a separate pool for testing. I was planning to 
have the old drives just be a backup file store, so I could keep snapshots and 
such over there for important files. 

I was wondering if it makes any sense to add the older drives to the new pool. 
Reliability might be lower as they are older drives, so if I were to lose 2 of 
them, things could get ugly. I'm just curious whether doing something like this 
would be worthwhile.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] White box server for OpenSolaris

2009-09-25 Thread Travis Tabbal
 I am after suggestions of motherboard, CPU and ram. Basically I want ECC
 ram and at least two PCI-E x4 channels.  As I want to run 2 x AOC-USAS_L8i
 cards for 16 drives.

Asus M4N82 Deluxe. I have one running with 2 USAS-L8i cards just fine. I don't 
have all the drives loaded in yet, but the cards are detected and they can use 
the drives I do have attached. I currently have 8GB of ECC RAM on the board and 
it's working fine; the ECC options in the BIOS are enabled and it reports that 
ECC is enabled at boot. It has 3 PCIe x16 slots; I have a graphics card in the 
other slot and an Intel e1000g card in the PCIe x1 slot. The onboard 
peripherals all work, with the exception of the onboard AHCI ports being buggy 
in b123 under xVM. I'm not sure what that's all about; I posted on the main 
discussion board but haven't heard whether it's a known bug or whether it will 
be fixed in the next version. It would be nice, as my boot drives are on that 
controller. 2009.06 works fine though. The CPU is a Phenom II X3 720, probably 
overkill for fileserver duties, but I also want to run some VMs for other 
things, which is how I ran into the xVM bug mentioned above.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss