Re: [zfs-discuss] TLER and ZFS
As a home user, here are my thoughts. WD = ignore (TLER issues, head-parking issues, etc.). I recently built up a server on OpenSolaris running Samsung 1.5TB drives. They are "green", but don't seem to have the irritating "features" found on the WD "green" drives. They are 5400RPM, but seem to transfer data plenty fast for a home setup. Current setup is 2x 6-disk raidz2.

Seek times obviously hurt, and the ZIL caused so many issues that I turned it off. Yes, I know I might lose some data doing that; yes, I'm OK with the tradeoff. The ZFS devs say I won't lose filesystem consistency, just that uncommitted writes could be lost, about 30 seconds of data in most cases. As the server is on a UPS and the rest of the network isn't (or is on small UPSes), it will be the last box online, so any clients will probably have their data saved before the server goes down. The next upgrade is a UPS that can tell the server power is out so it can shut down gracefully.

I'll probably get an SSD for slog/L2ARC at some point and re-enable the ZIL, but for now this does the job; SSDs that don't have similar issues when used as slog devices are rare and expensive. If the X25-E won't do...

This setup with the 5400RPM drives is significantly faster than the same box was with the 7200RPM 400GB Seagate drives. Of course, those 400GB drives are a few years old now, but I was pleasantly surprised by the speed I get out of the Samsungs.

--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] General help with understanding ZFS performance bottlenecks
NFS writes on ZFS blow chunks performance-wise. The only way to increase the write speed is with a slog, and the problem is that a "proper" slog device (one that doesn't lose transactions) does not exist at a reasonable price. The least expensive SSD that will work is the Intel X25-E, and even then you have to disable its write cache, which kills performance. And if you lose transactions in the ZIL, you may as well not have one.

Switching to a pool configuration with mirrors might help some, but you will still get hit with sync-write penalties on NFS. Before messing with that, try disabling the ZIL entirely and see if that's where your problem is. Note that running without a ZIL can cost you about 30 seconds of uncommitted data, and if the server crashes without the clients rebooting, the clients can end up with corrupted data (from their perspective). However, it solved the performance issue for me. If that works, you can then decide how important the ZIL is to you.

Personally, I like things to be correct, but that doesn't help me if performance is in the toilet. In my case, the server is on a UPS and the clients aren't, and most of the clients netboot anyway, so they will crash and have to be rebooted if the server goes down. So for me, the drawback is small while the performance gain is huge. That's not the case for everyone, and it's up to the admin to decide what they can live with. Thankfully, the next release of OpenSolaris will have the ability to set the ZIL on/off per filesystem.

Note that the ZIL only affects sync write speed, so if your workload isn't sync-heavy, it might not matter in your case; with NFS in the mix, though, it probably is. The ZFS on-disk data state is not affected by ZIL on/off, so your pool's data IS safe. You might lose some data that a client THINKS is safely written, but the ZFS pool will come back properly on reboot.
So the client will be wrong about what is and is not written, hence the possible "corruption" from the client's perspective. I run ZFS on two 6-disk raidz2 vdevs in the same pool and local performance is very good. With the ZIL enabled, NFS performance was so bad it was nearly unusable. With it disabled, I can saturate the single gigabit link, and performance in the Linux VM (xVM) running on that server improved significantly, to near-local speed, when using the NFS mounts to the main pool. My 5400RPM drives were not up to the ZIL's needs, though they are plenty fast in general, and a working slog was out of budget for a home server.
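For reference, here is roughly what the two approaches look like. The /etc/system tunable is the global switch available on the builds discussed in this thread; the per-filesystem `sync` property only exists in later builds, and the pool/filesystem names below are made up for illustration:

```shell
# Global ZIL disable (old builds): add this line to /etc/system
# and reboot. It affects every pool on the box.
#   set zfs:zil_disable = 1

# Later builds: disable synchronous write semantics per filesystem
# instead. "tank/nfs" is a placeholder for the NFS-exported dataset.
zfs set sync=disabled tank/nfs

# Check the setting, and put it back once a proper slog is in place:
zfs get sync tank/nfs
zfs set sync=standard tank/nfs
```

The per-filesystem route is the safer one since it limits the blast radius to the datasets where you've consciously accepted the ~30 seconds of data-loss exposure.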
Re: [zfs-discuss] New SSD options
> use a slog at all if it's not durable? You should disable the ZIL instead.

This is basically where I was going. There only seems to be one SSD that is considered "working", the Zeus IOPS. Even if I had the money, I can't buy it. As my application is a home server, not a datacenter, things like NFS breaking if I don't reboot the clients are a non-issue. As long as the on-disk data is consistent, so I don't have to worry about the entire pool going belly-up, I'm happy enough. I might lose 30 seconds of data, worst case, as a result of running without the ZIL.

Considering that I can't buy a proper ZIL device at a cost I can afford, and an improper one is not worth much, I don't see a reason to bother with the ZIL at all. I'll just get a cheap large SSD for L2ARC, disable the ZIL, and call it a day. For my use, I'd want a device in the $200 range to even consider a slog. As nothing even remotely close to that price range exists that works properly at all, let alone with decent performance, I see no point in the ZIL for my application. The performance hit is just too severe to keep it without a slog, and there's no slog device I can afford that works properly, even ignoring performance.
Re: [zfs-discuss] New SSD options
> On May 19, 2010, at 2:29 PM, Don wrote:
> The data risk is a few moments of data loss. However, if the order of the
> uberblock updates is not preserved (which is why the caches are flushed)
> then recovery from a reboot may require manual intervention. The amount
> of manual intervention could be significant for builds prior to b128.

This risk is mostly mitigated by UPS backup and auto-shutdown when the UPS detects power loss, correct? Outside of pulling the plug, that should solve power-related problems. Kernel panics should only be caused by hardware issues, which might corrupt the on-disk data anyway. Obviously software can and does fail, but the biggest problem I hear about with ZIL devices is their behavior in a sudden power-loss situation. It seems to me that UPS backup, along with starting a shutdown cycle before complete power failure, should prevent most issues. That should also help with issues like the X25-E not honoring cache flush; the UPS would give it time to finish the writes. Barring a firmware issue in the drive itself, that should work out about the same as a supercap.
Re: [zfs-discuss] Interesting experience with Nexenta - anyone seen it?
Disable the ZIL and test again. NFS does a lot of sync writes, which kills performance. Disabling the ZIL (or using the synchronicity option, if a build with that ever comes out) will prevent that behavior and should get your NFS performance close to local. It's up to you if you want to leave it that way; there are reasons not to. NFS clients can get a corrupted view of the filesystem if the server goes down before a write flush completes. The ZIL prevents that problem. In my case, the clients aren't on a UPS while the server is, so it's not an issue. :)
Re: [zfs-discuss] Strategies for expanding storage area of home storage-server
When I did a similar upgrade a while back I went with #2: create a new 6-drive raidz2 pool, copy the data to it, verify the data, delete the old pool, then combine the old drives plus some new drives into another 6-disk raidz2 vdev in the new pool. Performance has been quite good, and the migration was very smooth. The other nice thing about this arrangement for a home user is that I now only need to upgrade 6 drives to get more space, rather than 12 as with option #1. To be clear, this is my current config:

        NAME          STATE     READ WRITE CKSUM
        raid          ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c9t4d0    ONLINE       0     0     0
            c9t5d0    ONLINE       0     0     0
            c9t6d0    ONLINE       0     0     0
            c9t7d0    ONLINE       0     0     0
            c10t5d0   ONLINE       0     0     0
            c10t4d0   ONLINE       0     0     0
          raidz2-1    ONLINE       0     0     0
            c9t0d0    ONLINE       0     0     0
            c9t1d0    ONLINE       0     0     0
            c10t0d0   ONLINE       0     0     0
            c10t1d0   ONLINE       0     0     0
            c10t2d0   ONLINE       0     0     0
            c10t3d0   ONLINE       0     0     0
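For anyone wanting to repeat this, the migration is roughly the sequence below. Pool names are made up, device names are just examples, and you obviously want the scrub/verify step to come back clean before destroying the old pool:

```shell
# 1. Build the new 6-disk raidz2 pool on the new drives.
zpool create newpool raidz2 c9t4d0 c9t5d0 c9t6d0 c9t7d0 c10t5d0 c10t4d0

# 2. Copy everything over (snapshots and properties included with -R),
#    then verify with a scrub.
zfs snapshot -r oldpool@migrate
zfs send -R oldpool@migrate | zfs recv -F -d newpool
zpool scrub newpool
zpool status -v newpool     # must show 0 errors before proceeding

# 3. Retire the old pool and reuse its disks as a second
#    raidz2 vdev striped into the new pool.
zpool destroy oldpool
zpool add newpool raidz2 c9t0d0 c9t1d0 c10t0d0 c10t1d0 c10t2d0 c10t3d0
```

Note there's no way to undo the `zpool add`; once the second vdev is in, it's part of the pool for good.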
Re: [zfs-discuss] Replacement brackets for Supermicro UIO SAS cards....
Thanks! I might just have to order a few for the next time I take the server apart. Not that my bent up versions don't work, but I might as well have them be pretty too. :)
Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?
> I've got an OCZ Vertex 30gb drive with a 1GB stripe used for the slog
> and the rest used for the L2ARC, which for ~ $100 has been a nice
> boost to nfs writes.

What about the Intel X25-V? I know it will likely be fine for L2ARC, but what about ZIL/slog use?
Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?
> > From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> > boun...@opensolaris.org] On Behalf Of Travis Tabbal
>
> Oh, one more thing. Your subject says "ZIL/L2ARC" and your message says
> "I want to speed up NFS writes."
>
> ZIL (log) is used for writes.
> L2ARC (cache) is used for reads.
>
> I'd recommend looking at the ZFS Best Practices Guide.

At the end of my OP I mentioned that I was interested in L2ARC for dedupe. It sounds like the DDT can get bigger than RAM and slow things to a crawl. Not that I expect a lot from using an HDD for that, but I thought it might help. I'd like to get a nice SSD or two for this stuff, but that's not in the budget right now.
Re: [zfs-discuss] Thoughts on drives for ZIL/L2ARC?
> If your clients are mounting "async" don't bother. If the clients are
> mounting async, then all the writes are done asynchronously, fully
> accelerated, and never any data written to ZIL log.

I've tried async; things run well until you get to the end of the job, then the process hangs until the write is complete. This was just with tar extracting to the NFS mount.
[zfs-discuss] Thoughts on drives for ZIL/L2ARC?
I have a few old drives here that I thought might help me a little for those uses, though not as much as a nice SSD would. I'd like to speed up NFS writes, and there have been some mentions that even a decent HDD can do this, though not to the level a good SSD will. The 3 drives are older LVD SCSI Cheetah drives, ST318203LW. I have 2 controllers I could use. One appears to be a RAID controller with a memory module installed, an Adaptec AAA-131U2; the memory module comes up on Google as a 2MB EDO DIMM, so I'm not sure that's worth anything to me. :) The other controller is an Adaptec 29160. It looks to be a 64-bit PCI card, but the machine it came from is only 32-bit PCI, as is my current machine.

What say the pros here? I'm concerned that the max data rate is going to be somewhat low with them, but the seek time should be good as they are 10K RPM (I think). The only reason I thought to use one for L2ARC is for dedupe; it sounds like L2ARC helps a lot there. This is for a home server, so all I'm really looking to do is speed things up a bit while I save up and look for a decent SSD option. However, if it's a waste of time, I'd rather find out before I install them.
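If the Cheetahs do turn out to be worth trying, wiring them in is simple. Device names below are placeholders; mirroring the log vdev is worth considering since on these builds a failed or missing unmirrored slog could make a pool painful to import:

```shell
# Dedicated log (slog) on one 10K SCSI disk,
# cache (L2ARC) on another:
zpool add tank log c4t0d0
zpool add tank cache c4t1d0

# Or mirror the log across two disks for safety:
zpool add tank log mirror c4t0d0 c4t2d0

# Log and cache devices show up as their own sections here:
zpool status tank
```

Cache devices can be removed again with `zpool remove` if the experiment doesn't pan out; log devices could not be removed on older builds, so the mirror option is the conservative one.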
Re: [zfs-discuss] Non-redundant zpool behavior?
Thanks. That's what I expected the case to be. Any reasons this shouldn't work for strictly backup purposes? Obviously, one disk down kills the pool, but as I only ever need to care if I'm restoring, that doesn't seem to be such a big deal. It will be a secondary backup destination for local machines like laptops that don't have redundant storage. The primary backups will still be hosted on the main server with 2 raidz2 arrays. The only downside I can see to this idea is that I was expecting it to be used as an offsite backup as well, so in a real disaster I might have only a single non-redundant copy of the data. That alone might be enough reason for me not to do it. After getting used to redundancy, it's hard to go back to not having it. :)
[zfs-discuss] Non-redundant zpool behavior?
I have a small stack of disks that I was considering putting in a box to build a backup server. It would only store data that is duplicated elsewhere, so I wouldn't really need redundancy at the disk layer. The biggest issue is that the disks are not all the same size. So I can't really do a raidz or mirror with them anyway. So I was considering just putting them all in one pool. My question is how does zpool behave if I lose one disk in this pool? Can I still access the data on the other disks? Or is it like a traditional raid0 and I lose the whole pool? Is there a better way to deal with this, using my old mismatched hardware? Yes, I could probably build a raidz by partitioning and such, but I'd like to avoid the complexity. I'd probably just use zfs send/recv to send snapshots over or perhaps crashplan.
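For what it's worth, the layout being asked about is just a plain pool of top-level disks (a dynamic stripe), and it does behave like RAID-0 for failures: one dead disk faults the whole pool. A sketch with made-up device names:

```shell
# Mismatched disks as independent top-level vdevs:
# capacities add up, but there is no redundancy at all.
zpool create backup c1t0d0 c1t1d0 c2t0d0

# Checksums still catch corruption, and you can ask ZFS to keep
# two copies of the blocks in especially important datasets
# (helps against bad sectors, not against a whole-disk loss):
zfs set copies=2 backup/laptops
```

So for a strictly-secondary backup target this can be acceptable, as long as a single disk failure taking out the whole pool is an understood tradeoff.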
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
Supermicro USAS-L8i controllers. I agree with you: I'd much rather have the drives respond properly and promptly than save a little power, if saving power means I'm going to get strange errors from the array. And these are the "green" drives; they just don't seem to cause me any problems. The issues people have noted with WD have made me stay away from them, as just about every drive I own lives in some kind of RAID at some point in its life. I have a couple of laptop drives that are single; all desktops have at least a mirror. I'm a little nuts and would probably install mirrors in the laptops if there were somewhere to put them. :)
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
smartmontools doesn't work with my controllers. I can try it again when the 2 new drives I've ordered arrive. I'll try connecting to the motherboard ports and see if that works with smartmontools. I haven't noticed any sleeping with the drives. I don't get any lag accessing the array or any error messages about them disappearing.
Re: [zfs-discuss] I can't seem to get the pool to export...
On Sun, Jan 17, 2010 at 8:14 PM, Richard Elling wrote:
> On Jan 16, 2010, at 10:03 PM, Travis Tabbal wrote:
>
> > Hmm... got it working after a reboot. Odd that it had problems before
> > that. I was able to rename the pools and the system seems to be running
> > well now. Irritatingly, the settings for sharenfs, sharesmb, quota, etc.
> > didn't get copied over with the zfs send/recv. I didn't have that many
> > filesystems though, so it wasn't too bad to reconfigure them.
>
> What OS or build? I've had similar issues with b130 on all sorts of mounts
> besides ZFS.

OpenSolaris snv_129.
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
HD154UI/1AG01118 They have been great drives for a home server. Enterprise users probably need faster drives for most uses, but they work great for me.
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
I've been having good luck with Samsung "green" 1.5TB drives. I have had 1 DOA, but I currently have 10 of them, so that's not so bad; in a purchase that size I've had one bad drive from just about every manufacturer. I've avoided WD for RAID because of the error-handling stuff kicking drives out of arrays; I don't know if that's still an issue. And with Seagate's recent record, I didn't feel confident in their larger drives. I was concerned about the 5400RPM speed being a problem, but I can read over 100MB/s from the array, and 95% of my use is over a gigabit LAN, so they are more than fast enough for my needs.

I just set up a new array with them, 6 in raidz2. Rebuild times on drives this size are high enough that I decided the extra parity was worth the cost, even for a home server. I need 2 more drives, then I'll migrate my other 4 from the older array into another 6-drive raidz2 and add it to the pool.

I have decided to treat HDDs as completely untrustworthy. When I get new drives, I test them by creating a temporary pool in a mirror config and filling the drives up with data copied from the primary array, then running a scrub. If that reports no errors, and there are no other errors in dmesg, wait a week or so and do another scrub test. I found a bad SATA hot-swap backplane and a bad drive this way. There are probably faster ways, but this works for me.
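The burn-in procedure above, written out as commands. Pool, device, and path names are made up; substitute your own:

```shell
# Temporary mirror pool on the two new drives.
zpool create testpool mirror c5t0d0 c5t1d0

# Fill it with real data copied from the main pool, then scrub.
cp -r /raid/media /testpool/
zpool scrub testpool

# Any non-zero READ/WRITE/CKSUM counters mean a bad drive,
# cable, or backplane; also check dmesg for driver errors.
zpool status -v testpool

# Repeat the scrub a week later; if still clean, tear down and
# put the drives into the real array.
zpool destroy testpool
```

The nice part of using a mirror for the test is that a checksum error points at the specific disk, since the other side of the mirror shows which copy was bad.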
Re: [zfs-discuss] I can't seem to get the pool to export...
Hmm... got it working after a reboot. Odd that it had problems before that. I was able to rename the pools and the system seems to be running well now. Irritatingly, the settings for sharenfs, sharesmb, quota, etc. didn't get copied over with the zfs send/recv. I didn't have that many filesystems though, so it wasn't too bad to reconfigure them.
[zfs-discuss] I can't seem to get the pool to export...
r...@nas:~# zpool export -f raid
cannot export 'raid': pool is busy

I've disabled all the services I could think of. I don't see anything accessing it. I also don't see any of the filesystems mounted with mount or "zfs mount". What's the deal? This is not the rpool, so I'm not booted off it or anything like that. I'm on snv_129.

I'm attempting to move the main storage to a new pool. I created the new pool and used "zfs send | zfs recv" for the filesystems. That's all fine. The plan was to export both pools and use the import to rename them. I've got the new pool exported, but the older one refuses to export. Is there some way to get the system to tell me what's using the pool?
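For a "pool is busy" export, these are the usual things to check; fuser is the standard way to find the process holding files open. Paths are examples for this pool:

```shell
# Anything with files open under the pool's mountpoint?
# fuser -c reports PIDs using the mounted filesystem.
fuser -c /raid

# Any filesystems from the pool still mounted?
mount | grep raid

# Zvols in use can also hold a pool busy even when nothing
# shows as mounted; swap or dump on a zvol is a classic one.
swap -l
dumpadm

# Shares (NFS/SMB/iSCSI) should be torn down by export, but
# checking the properties can't hurt:
zfs get -r sharenfs,sharesmb,shareiscsi raid
```

If none of those turn anything up, a reboot clearing it (as happened later in this thread) suggests a stale hold in the kernel rather than a real user of the pool.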
Re: [zfs-discuss] raidz data loss stories?
> Everything I've seen says you should stay around 6-9 drives for raidz, so
> don't do a raidz3 with 12 drives. Instead make two raidz3s with 6 drives
> each (which is (6-3)*1.5 * 2 = 9 TB array.)

So the question becomes: why? If it's performance, I can live with lower IOPS and max throughput. If it's reliability, I'd like to hear why. I would think that the acceptable number of devices in a raidz would scale somewhat with the number of drives used for parity, so I would expect a sliding scale like the one mentioned before regarding disk size vs. raidz level. For example:

3-4 drives: raidz1
4-8 drives: raidz2
8+ drives: raidz3

In practice, I would expect some kind of chart using both the number of devices and the size of the devices to determine the proper raidz level. Perhaps I'm way off base though. Note that I don't really have a problem doing 2 vdevs, but I would think that raidz2 would be acceptable in that configuration. The benefit of that config for me is that I could create a parallel 6-disk vdev to copy my existing data to, then add the second vdev after the initial file copy/scrub. I would need fewer disks to complete the transition.

> As for whether or not to do raidz, for me the issue is performance. I
> can't handle the raidz write penalty. If I needed triple-drive protection,
> a 3-way mirror setup would be the only way I would go. I don't yet quite
> understand why a 3+ drive raidz2 vdev is better than a 3-drive mirror
> vdev? Other than a 5-drive setup is 3 drives of space when a 6-drive
> setup using 3-way mirrors is only 2 drives of space.

I've already stipulated that performance is not the primary concern. 100MB/sec with reasonable random I/O for a max of 5 clients is more than enough. My existing raidz is more than fast enough for my needs, and I have 5400RPM drives in there.

I'd be very interested to hear an expert opinion on this. Given, say, 6 disks: what advantage in reliability, if any, would a raidz3 have vs. a striped pair of 3-way mirrors? Obviously the raidz3 has 1 disk worth of extra space, but we're talking about reliability here. I would guess performance would be higher with the mirrors.

With all of my comments, please keep in mind that I am not a huge enterprise customer with loads of money to spend on this. If I were, I'd just buy Thumpers. I'm a home user with a decent fileserver.
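The capacity side of the comparison is easy to check. Integer math in tenths of a TB avoids floating point; drive counts are the ones from the quoted example:

```shell
# Two 6-disk raidz3 vdevs of 1.5 TB drives vs. the quoted 9 TB figure.
drives=6; parity=3; size_tenths=15            # 1.5 TB per drive, in tenths
vdev_tenths=$(( (drives - parity) * size_tenths ))
pool_tenths=$(( vdev_tenths * 2 ))            # two such vdevs in one pool
echo "per vdev: $(( vdev_tenths / 10 )).$(( vdev_tenths % 10 )) TB"
echo "pool:     $(( pool_tenths / 10 )).$(( pool_tenths % 10 )) TB"

# Same 6 disks as a striped pair of 3-way mirrors: 2 disks of space.
mirror_tenths=$(( 2 * size_tenths ))
echo "mirror pair: $(( mirror_tenths / 10 )).$(( mirror_tenths % 10 )) TB"
```

So 4.5 TB per raidz3 vdev (9 TB for two) against 3 TB for the mirror layout: the raidz3 really does buy exactly one disk's worth of extra space per 6 disks, and the rest of the argument is about reliability and IOPS.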
Re: [zfs-discuss] raidz data loss stories?
Interesting discussion. I know the bias here is generally toward enterprise users; I was wondering if the same recommendations hold for home users, who are generally more price-sensitive. I'm currently running OpenSolaris on a system with 12 drives. I had split them into three 4-disk raidz1 vdevs. This made some sense at the time, as I can upgrade 4 disks at a time as new sizes come out. However, with 8 of the disks currently being 1.5TB, I'm getting concerned about this strategy. While important data is backed up, a loss of the server data would be very irritating.

My next thought was to get more drives and run a single raidz3 vdev of 12x1.5TB. More space than I need for quite a while (since I can't add just a few drives), and triple parity for protection. I'd need a few extra drives to hold the data while I rebuild the main array, so I'd have cold spares available that I would use for backing up critical data from the server; they would see use and scrubs, not just sit on the shelf. Access is over a gigE network, so I don't need more performance than that. I have read that the overall speed of a vdev is approximately the speed of a single device in the vdev, and in this case that is more than fast enough.

I'm curious what the experts here think of this new plan. I'm pretty sure I know what you all think of the old one. :) Do you recommend swapping spare drives into the array periodically? It seems like it wouldn't really be any better than running a scrub over the same period, but I've heard of people doing it on hardware RAID controllers.
Re: [zfs-discuss] How can we help fix MPT driver post build 129
To be fair, I think it's obvious that Sun people are looking into it and that users are willing to help diagnose and test. There were requests for particular data in those threads you linked to; have you sent yours? It might help them find a pattern in the errors. I understand the frustration that it hasn't been fixed in the couple of builds since they became aware of it, but it could be a very tricky problem. It also sounds like it's not reproducible on Sun hardware, so they have to get cards and such as well. It's also less urgent now that they have identified a workaround that works for most of us. While disabling MSIs is not optimal, it does help a lot.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
Perhaps. As I noted, though, it also occurs on the onboard NVidia SATA controller when MSI is enabled. I had already put a line in /etc/system to disable MSIs for that controller, per a forum thread, and it worked great. I'm now running with all MSIs disabled via xVM, as the mpt controller is giving me the same problems. As it's happening on totally different controller types, cable types, and drive types, I have to go with a software issue. I know for sure the NVidia issue didn't come up on 2009.06; it makes the system take forever to boot, so it's very noticeable. It happened when I first went to dev builds, I want to say around b118. I updated for better xVM support for newer Linux kernels.

The NVidia controller causes similar log messages: command timeouts. Disabling MSIs fixes it as well. The motherboard is an Asus M4N82 Deluxe, NVIDIA nForce 980a SLI chipset. I expect the root cause is the same, and I would guess that something is causing the drivers to miss or not receive some interrupts. However, my programming experience at this level is limited, so perhaps I'm misdiagnosing the issue.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
Just an update, my scrub completed without any timeout errors in the log. XVM with MSI disabled globally.
Re: [zfs-discuss] mpt errors on snv 127
If someone from Sun will confirm that it should work to use the mpt driver from 2009.06, I'd be willing to set up a BE and try it. I still have the snapshot from my 2009.06 install, so I should be able to mount that and grab the files easily enough.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
> (1) disabling MSI support in xVM makes the problem go away

Yes here.

> (6) mpt(7d) without MSI support is sloow.

That does seem to be the case. It's not so bad overall, and at least the performance is consistent. It would be nice if this were improved.

> For those of you who have been running xVM without MSI support,
> could you please confirm whether the devices exhibiting the problem
> are internal to your host, or connected via jbod. And if via jbod,
> please confirm the model number and cables.

Direct connect. The drives are in hot-swap racks, but they are passive devices; no expanders or anything like that in there. In case it's interesting, the racks are StarTech HSB430SATBK units. I'm using SAS-to-SATA breakout cables to connect them. I have tried different lengths with the same result.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
> o The problems are not seen with Sun's version of this card

Unable to comment, as I don't have a Sun card here. If Sun would like to send me one, I would be willing to test it against the cards I do have. I'm running Supermicro USAS-L8i cards (LSI 1068e based).

> o The problems are not seen with LSI's version of the driver

I haven't tried it, as comments from Sun staff here have indicated that it's not a good idea.

> o The problems are seen with the latest LSI firmware

Yes. When I checked, the LSI site was listing the version I see at boot.

> o Errors still occur if MSIs are disabled.

I haven't seen any command timeout errors since disabling MSIs. I tried using the command to disable MSIs only for the mpt driver, but then I get a similar error from the NVidia driver, as it has my boot drives. It seems to me that the issue has more in common with MSIs than with the drivers themselves. I have a scrub scheduled for 12/1, so I can check the logs after that to see if the problem reappears. My other tests have not triggered the issue since disabling MSIs. I'm currently running with "set xpv_psm:xen_support_msi = -1".

I am not using any jbod enclosures. My setup uses SAS-to-SATA breakout cables connected directly to the drives. I have tried different cables and lengths. The timeouts affected drives in a seemingly random fashion; I would get timeouts on both controllers and on every drive over time. I have never had command errors here, just the timeouts.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
> Travis Tabbal wrote:
> > I have a possible workaround. Mark Johnson has been emailing me today
> > about this issue and he proposed the following:
> >
> >> You can try adding the following to /etc/system, then rebooting...
> >> set xpv_psm:xen_support_msi = -1
>
> I am also running XVM, and after modifying /etc/system and rebooting, my
> zpool scrub test is running along merrily with no hangs so far, where
> usually I would expect to see several by now.
>
> Can the other folks who have seen this please test and report back? I'd
> hate to think we solved it only to discover there were overlapping bugs.
>
> Fingers crossed, and many thanks to those who have worked to track this
> down!

Nice to see we have one confirmed report that things are working. Hopefully we get a few more! Even if it's just a workaround until a real fix makes it in, it gets us running.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
> > On Nov 23, 2009, at 7:28 PM, Travis Tabbal wrote:
> > > I have a possible workaround. Mark Johnson has been emailing me today
> > > about this issue and he proposed the following:
> > >
> > >> You can try adding the following to /etc/system, then rebooting...
> > >> set xpv_psm:xen_support_msi = -1
> >
> > would this change affect systems not using XVM? we are just using
> > these as backup storage.

Probably not. Are you seeing the issue without XVM installed? We had one other user report that the issue went away when they removed XVM, so I had thought it wouldn't affect other users. If you are getting the same issue without XVM, there may be overlapping bugs in play. Someone at Sun might be able to tell you how to disable MSIs on the controller; someone told me how to do it for the NVidia SATA controller when there was a bug in that driver, so I would think there is a way to do it for the mpt driver.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
I have a possible workaround. Mark Johnson has been emailing me today about this issue and he proposed the following:

> You can try adding the following to /etc/system, then rebooting...
> set xpv_psm:xen_support_msi = -1

I have been able to format a ZVOL container from a VM 3 times while other activity is going on the system, and it's working. I think performance is down a bit, but it's still acceptable. More importantly, it does so without killing the server; I would get the stall every time I tried this test before. So at least one case seems to be helped by doing this. I'll watch the server over the next few days to see if it stays improved. He mentioned that a fix for MSI handling in XVM is being worked on that might make it into b129, which could fix this problem.
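For anyone who wants to try the same thing, here is roughly what the change looks like (a sketch only; back up /etc/system first, and note the setting takes effect only after a reboot):

```shell
# Keep a copy of /etc/system in case the change needs to be backed out
cp /etc/system /etc/system.bak

# Append the XVM MSI workaround Mark suggested
echo 'set xpv_psm:xen_support_msi = -1' >> /etc/system

# Reboot so the new setting is read at boot
init 6
```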
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
> I will give you all of this information on monday.
> This is great news :)

Indeed. I will also be posting this information when I get to the server tonight; perhaps it will help. I don't think I want to try using that old driver, though; it seems too risky for my taste. Is there a command to get the disk firmware revision from OpenSolaris while it's booted? I know of some boot CDs that can get to it, but I'm unsure about accessing it while the server is running.
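One way this should be readable on a live system is via iostat's device-error report, which prints the SCSI inquiry data including the firmware revision (a sketch; the exact output format varies by release and disk):

```shell
# Print per-device error statistics; the inquiry data on each disk
# includes Vendor, Product, and the firmware Revision string
iostat -En

# Each disk gets an entry along these lines (format varies by release):
#   c1t0d0  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
#   Vendor: ATA  Product: <model>  Revision: <firmware rev>  ...
```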
Re: [zfs-discuss] SNV_125 MPT warning in logfile
> The latter, we run these VMs over NFS anyway and had ESXi boxes under
> test already. we were already separating "data" exports from "VM"
> exports. We use an in-house developed configuration management/bare
> metal system which allows us to install new machines pretty easily. In
> this case we just provisioned the ESXi VMs to new "VM" exports on the
> Thor whilst re-using the data exports as they were...

Thanks for the info. Unfortunately, I need this box to do double duty and run the VMs as well. The hardware is capable; this issue with XvM and/or the mpt driver just needs to get fixed. Other than that, things are running great with this server.
Re: [zfs-discuss] SNV_125 MPT warning in logfile
> > I'm running nv126 XvM right now. I haven't tried it without XvM.
>
> Without XvM we do not see these issues. We're running the VMs through
> NFS now (using ESXi)...

Interesting. It sounds like it might be an XvM-specific bug. I'm glad I mentioned that in my bug report to Sun; hopefully they can duplicate it. I'd like to stick with XvM, as I've spent a fair amount of time getting things working well under it. How did your migration to ESXi go? Are you using it on the same hardware, or did you switch that server to an NFS server and run the VMs on another box?
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> Have you tried wrapping your disks inside LVM metadevices and then used
> those for your ZFS pool?

I have not tried that. I could try it with my spare disks, I suppose. I avoided LVM as it didn't seem to offer me anything ZFS/zpool didn't.
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> What type of disks are you using?

I'm using SATA disks with SAS-to-SATA breakout cables. I've tried different cables, as I have a couple of spares. mpt0 has 4x 1.5TB Samsung "green" drives; mpt1 has 4x 400GB Seagate 7200 RPM drives. I get errors from both adapters. Each adapter has an unused SAS channel available; if I can get this fixed, I'm planning to populate those as well.
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
I submitted a bug on this issue. It looks like you can reference other bugs when you submit one, so everyone having this issue could link mine and submit their own hardware config. It sounds like it's widespread, though, so I'm not sure if that would help or hinder; I'd hate to bury the developers/QA team under a mountain of duplicate reports.

CR 6900767
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
On Wed, Nov 11, 2009 at 10:25 PM, James C. McPherson wrote:
>
> The first step towards "acknowledging" that there is a problem
> is you logging a bug in bugs.opensolaris.org. If you don't, we
> don't know that there might be a problem outside of the ones
> that we identify.
>

I apologize if I offended by not knowing the protocol. I thought the forums were watched and the bug tracker updated by people at Sun; I didn't think normal users had access to submit bugs. Thank you for the reply. I have submitted a bug on the issue with all the information I think might be useful. If someone at Sun would like more information, output from commands, or testing, I would be happy to help. I was not given a bug number by the system; I assume those are assigned if the bug is deemed worthy of further consideration.
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> Have you tried another SAS cable?

I have. Two identical SAS cards, different cables, different disks (brand, size, etc.). I get the errors on random disks in the pool. I don't think it's hardware related, as there have been a few reports of this issue already.
Re: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not responding
> Hi, you could try the LSI itmpt driver as well, it seems to handle this
> better, although I think it only supports 8 devices at once or so.
>
> You could also try a more recent version of OpenSolaris (123 or even
> 126), as there seem to be a lot of fixes regarding the mpt driver
> (which still seems to have issues).

I won't speak for the OP, but I've been seeing this same behaviour on 126 with LSI 1068E-based cards (Supermicro USAS-L8i). For the LSI driver, how does one install it? I'm new to OpenSolaris and don't want to mess it up. It looked to be very old; is Solaris backward compatibility that good? It would be really nice if Sun would at least acknowledge the bug and say whether they can reproduce it. I'm happy to supply information and test things if it will help; I have some spare disks I can attach to one of these cards to test driver updates and such. It sounds like people with Sun hardware are experiencing this as well.
Re: [zfs-discuss] SNV_125 MPT warning in logfile
I am also running 2 of the Supermicro cards. I just upgraded to b126 and it seems improved. I am running a large file copy locally and get these warnings in the dmesg log. When I do, I/O seems to stall for about 60 seconds. It comes back up fine, but it's very annoying. Any hints?

I have 4 disks per controller right now: different brands, sizes, everything. New SATA fanout cables and no expanders. The drives on mpt0 and mpt1 are completely different (4x 400GB Seagate drives, 4x 1.5TB Samsung drives), and I get the problem from both controllers. I didn't notice this until about b124. I can reproduce it with rsync copying files locally between ZFS filesystems, even with --bwlimit=1 (10MB/sec); keeping the limit low does seem to help.

---
Oct 31 23:05:32 nas scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:05:32 nas     Disconnected command timeout for Target 7
Oct 31 23:09:42 nas scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@2/pci15d9,a...@0 (mpt0):
Oct 31 23:09:42 nas     Disconnected command timeout for Target 1
Oct 31 23:16:23 nas scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@2/pci15d9,a...@0 (mpt0):
Oct 31 23:16:23 nas     Disconnected command timeout for Target 3
Oct 31 23:18:43 nas scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:18:43 nas     Disconnected command timeout for Target 6
Oct 31 23:27:24 nas scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci10de,7...@10/pci10de,5...@0/pci10de,5...@3/pci15d9,a...@0 (mpt1):
Oct 31 23:27:24 nas     Disconnected command timeout for Target 7
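If anyone wants to catch these as they happen rather than digging through dmesg afterwards, something like this should work (a sketch; /var/adm/messages is the usual syslog destination on OpenSolaris, but check your syslog.conf):

```shell
# Follow the system log and show only mpt warnings as they arrive,
# e.g. the "Disconnected command timeout for Target N" lines above
tail -f /var/adm/messages | grep mpt
```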
Re: [zfs-discuss] zpool with very different sized vdevs?
Hmm... I expected people to jump on me yelling that it's a bad idea. :) How about this: can I remove a vdev from a pool if the pool still has enough space to hold the data? Could I add it in and mess with it for a while without losing anything? I would expect the system to resilver the data onto the remaining vdevs, or tell me to go jump off a pier. :)
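One safe way to find out without risking real disks is to experiment on a throwaway pool built from file-backed vdevs (a sketch; pool and file names are made up, and as far as I know zpool remove at this point only handles spares, cache, and log devices, so I'd expect the remove to be refused):

```shell
# Create four small backing files and a raidz pool from three of them
mkfile 100m /tmp/v1 /tmp/v2 /tmp/v3 /tmp/v4
zpool create testpool raidz /tmp/v1 /tmp/v2 /tmp/v3

# Adding a bare vdev mismatches the raidz replication level,
# so zpool warns and wants -f before it will do it
zpool add -f testpool /tmp/v4

# This is the part I'd expect to fail: top-level data vdevs
# can't be removed, only spares/cache/log devices
zpool remove testpool /tmp/v4

# Clean up the experiment
zpool destroy testpool
rm /tmp/v1 /tmp/v2 /tmp/v3 /tmp/v4
```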
Re: [zfs-discuss] bewailing of the n00b
> - How can I effect OCE with ZFS? The traditional 'back up all the data
> somewhere, add a drive, re-establish the file system/pools/whatever,
> then copy the data back' is not going to work because there will be
> nowhere to temporarily 'put' the data.

Add devices to the pool, preferably in mirror or raidz configurations. If you just add bare devices to the pool, you are running RAID-0 with no redundancy. You cannot add devices to an existing raidz vdev, as mentioned, but you can add more raidz or mirror vdevs, and you can also replace devices with larger ones. It would be nice to be able to grow a raidz for home users like us; maybe we'll see it someday. For now, the capabilities we do have make it reasonable to deal with.

> - Concordantly, is ZFS affected by a RAID card that supports OCE? Or is
> this to no advantage?

Don't bother. Spend the money on more RAM and drives. :) Do get a nice controller, though. Supermicro makes a few nice units; I'm using 2 AOC-USAS-L8i cards. They work great, though you do have to mod the mounting bracket to get them into a standard case. These are based on LSI cards; I just found them cheaper than the same LSI-branded card. Avoid the cheap $20 4-port jobs. I've had a couple of them die already. Thankfully, I didn't lose any data... I think... no ZFS on that box.

> - RAID5/6 with ZFS: As I understand it, ZFS with raidz will provide the
> data/drive redundancy I seek [home network, with maybe two simultaneous
> users on at least a p...@1ghz/1Gb RAM storage server], so obtaining a
> RAID controller card is unnecessary/unhelpful. Yes?

Correct. Though I would increase the RAM personally; it's so cheap these days. My home fileserver has 8GB of ECC RAM, though I'm also running Xen VMs, so some of that is used for those. You can even do triple-parity raidz (raidz3) with ZFS now, so you could lose 3 drives without any data loss. That's for those who want really high availability, or really big arrays, I suppose.
I'm running 4x 1.5TB in a raidz1 with no problems, though I do plan to keep a spare around. I'll just use it to store backups to start with; if a drive goes bad, I'll drop it in and do a zpool replace. Don't worry about the command line. The ZFS commands are pretty short and simple; read up on zpool and zfs, the two commands you'll use the most for managing ZFS. There's also the ZFS Best Practices Guide if you haven't seen it, with useful advice in there.
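To make the growth options above concrete, the two common paths look something like this (a sketch only; "tank" and the cXtYd0 device names are placeholders for your own pool and disks):

```shell
# Option 1: add a whole second raidz vdev; the pool then
# stripes writes across both vdevs
zpool add tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

# Option 2: grow in place by swapping in bigger disks one at a
# time, letting each resilver finish before starting the next;
# the extra capacity shows up once every disk has been replaced
zpool replace tank c1t0d0 c3t0d0
zpool status tank   # watch resilver progress
```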
[zfs-discuss] zpool with very different sized vdevs?
I have a new array of 4x 1.5TB drives running fine. I also have the old array of 4x 400GB drives in the box on a separate pool for testing. I was planning to have the old drives just be a backup file store, so I could keep snapshots and such over there for important files. I was wondering if it makes any sense to add the older drives to the new pool instead. Reliability might be lower as they are older drives, so if I were to lose 2 of them, things could get ugly. I'm just curious whether it would make any sense to do something like this.
Re: [zfs-discuss] White box server for OpenSolaris
> I am after suggestions of motherboard, CPU and RAM. Basically I want
> ECC RAM and at least two PCI-E x4 channels, as I want to run 2 x
> AOC-USAS-L8i cards for 16 drives.

Asus M4N82 Deluxe. I have one running with 2 USAS-L8i cards just fine. I don't have all the drives loaded in yet, but the cards are detected and can use the drives I do have attached. I currently have 8GB of ECC RAM on the board and it's working fine; the ECC options in the BIOS are enabled, and it reports ECC enabled at boot. It has 3 PCIe x16 slots; I have a graphics card in the other slot and an Intel e1000g card in the PCIe x1 slot. The onboard peripherals all work, with the exception of the onboard AHCI ports being buggy in b123 under xVM. Not sure what that's about; I posted on the main discussion board but haven't heard whether it's a known bug or whether it will be fixed in the next version. It would be nice, as my boot drives are on that controller. 2009.06 works fine, though. The CPU is a Phenom II X3 720, probably overkill for fileserver duties, but I also want to run some VMs for other things, which is how I found the xVM bug.