Some further developments:

I noticed that in addition to BIOS updates, there was a separate firmware
update for the Marvell SE9230, which is one of the secondary controllers
on the motherboard.  So I grabbed this and updated everything.

Well, it certainly made a difference: a scrub hung within about 30s of
starting. Not quite the improvement I had been hoping for.

However, it pointed the finger of suspicion at this controller in
particular.  I rearranged the disks so that none of the ones from the
large pool were on this controller.  In this configuration, the pool
scrubs fine.  It also does so notably faster (perhaps by 25-30%), despite
4 of the disks now being on slower 3 Gbit/s ports.

The SSD pool wound up on that controller instead in this rearrangement.  I
tried scrubbing that and it was very, very slow.

So this controller is starting to smell pretty bad.

I've been looking through Google results, and this controller does seem to
have a collection of reported issues, with people getting highly variable
results depending on config options.  There was also an interesting Linux
kernel quirk added for running with the IOMMU enabled, because the
controller uses an undeclared second PCI function ID.  I now have a
collection of firmware versions and option settings to try.

Even if the controller is doing something wrong, it seems we're still
losing track of commands without reporting errors or warnings on the
device. I've uploaded a crash dump in case there are any clues there.


