David Wolfskill wrote:
From a quick look in the lists, I get the impression that the Dell PERC
5/i may be a bit problematic.  Since I hadn't any plans on using that
hardware, though, I've paid more attention to other things.


Not sure that this impression is entirely accurate.  The biggest problem
with MFI machines is online RAID management.  The storage driver itself
matured very quickly and has been very reliable.

Well, now a colleague is trying to run 6.2-R on one of these 2950s; dmesg
says the controller is:

mfi0: <Dell PERC 5/i> mem 0xd80f0000-0xd80fffff,0xfc4e0000-0xfc4fffff irq 78 at 
device 14.0 on pci2
mfi0: 817 (224963336s/0x0020/0) - Shutdown command received from host
mfi0: 818 (4278190080s/0x0020/0) - PCI 0x041028 0x0415 0x041028 0x041f03: 
Firmware initialization started (PCI ID 0015/1028/1f03/1028)
mfi0: 819 (4278190080s/0x0020/0) - Type 18: Firmware version 1.00.02-0157
mfi0: 820 (4278190096s/0x0008/0) - Battery Present
mfi0: 821 (4278190124s/0x0004/0) - PD 08(e1/s255) event: Enclosure (SES) 
discovered on PD 08(e1/s255)
mfi0: 822 (4278190124s/0x0002/0) - PD 08(e1/s255) event: Inserted: PD 
08(e1/s255)
mfi0: 823 (4278190124s/0x0002/0) - Type 29: Inserted: PD 08(e1/s255) Info: 
enclPd=08, scsiType=d, portMap=00, sasAddr=500180b04413ce00,0000000000000000
mfi0: 824 (4278190124s/0x0002/0) - PD 00(e1/s0) event: Inserted: PD 00(e1/s0)
mfi0: 825 (4278190124s/0x0002/0) - Type 29: Inserted: PD 00(e1/s0) Info: 
enclPd=08, scsiType=0, portMap=01, sasAddr=50010b900046038e,0000000000000000
mfi0: 826 (4278190124s/0x0002/0) - PD 01(e1/s1) event: Inserted: PD 01(e1/s1)
mfi0: 827 (4278190124s/0x0002/0) - Type 29: Inserted: PD 01(e1/s1) Info: 
enclPd=08, scsiType=0, portMap=02, sasAddr=50010b9000460376,0000000000000000
mfi0: 828 (4278190124s/0x0002/0) - PD 02(e1/s2) event: Inserted: PD 02(e1/s2)
mfi0: 829 (4278190124s/0x0002/0) - Type 29: Inserted: PD 02(e1/s2) Info: 
enclPd=08, scsiType=0, portMap=04, sasAddr=50010b900046035a,0000000000000000
mfi0: 830 (4278190124s/0x0002/0) - PD 03(e1/s3) event: Inserted: PD 03(e1/s3)
mfi0: 831 (4278190124s/0x0002/0) - Type 29: Inserted: PD 03(e1/s3) Info: 
enclPd=08, scsiType=0, portMap=08, sasAddr=50010b90004603be,0000000000000000
mfi0: 832 (4278190124s/0x0002/0) - PD 04(e1/s4) event: Inserted: PD 04(e1/s4)
mfi0: 833 (4278190124s/0x0002/0) - Type 29: Inserted: PD 04(e1/s4) Info: 
enclPd=08, scsiType=0, portMap=10, sasAddr=50010b900045f6d6,0000000000000000
mfi0: 834 (4278190124s/0x0002/0) - PD 05(e1/s5) event: Inserted: PD 05(e1/s5)
mfi0: 835 (4278190124s/0x0002/0) - Type 29: Inserted: PD 05(e1/s5) Info: 
enclPd=08, scsiType=0, portMap=20, sasAddr=50010b9000460246,0000000000000000
mfi0: 836 (224964238s/0x0020/0) - Adapter ticks 224964238 elapsed 45s: Time 
established as 02/16/07 18:03:58; (45 seconds since power on)

and the disk looks like:

mfid0: <MFI Logical Disk> on mfi0
mfid0: 418176MB (856424448 sectors) RAID volume '' is optimal


Looks A-OK to me.


The intended production workload involves creation and deletion of
a large number of files rather rapidly.

I recalled that for the first year or two with Soft Updates, there
were problems with that kind of workload, such that there was enough
hysteresis in making free blocks actually available for subsequent
allocation that processes that were trying to write to new blocks
on such file systems would often fail, reporting ENOSPC.  Un-mounting
and re-mounting the file system would clean things up, but that
doesn't tend to be a viable approach for keeping a long-running
application happy.  :-}


sysctl vfs.ffs.doasyncfree=0 might help.  Running the syncer more
frequently might also help, but I don't recall the sysctl node for
that.

I reminded my colleague of this, since she also reported that an
un-mount/re-mount sequence caused a lot of free space to show up
on the file system in question, and she responded that she had been
aware of this, and had been turning off Soft Updates on the file
systems for the application in question, but she had forgotten that
Soft Updates was on by default when she set up this (test) system.

She then turned off Soft Updates and started the test workload again.
And instead of failing with ENOSPC after 3 days, it only took 2.

Very strange.  No chance that it was due to files that were deleted but
still referenced by open apps?
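
To illustrate what I mean by that, here is a minimal Python sketch (not
taken from your application; the helper name and the 1 MiB size are
made up for the example).  It unlinks a file while a descriptor is still
open and shows that the name is gone, the link count is zero, and yet
the data is still there.  On a POSIX filesystem the blocks are only
returned to the free pool once the last descriptor is closed.

```python
import os
import tempfile


def unlinked_file_state(size=1 << 20):
    """Write `size` bytes to a temp file, unlink it while it is still
    open, and report what the kernel says about the open descriptor."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb+") as f:
        f.write(b"x" * size)
        f.flush()
        os.unlink(path)              # directory entry is removed here
        st = os.fstat(f.fileno())    # the inode is still alive via the fd
        return {
            "path_exists": os.path.exists(path),
            "link_count": st.st_nlink,
            "size_bytes": st.st_size,
        }
    # leaving the `with` block closes the descriptor; only then can the
    # filesystem reclaim the blocks


result = unlinked_file_state()
print(result)
```

If the application (or something else on the box) holds many such
descriptors open, df will show the space as used even though the files
no longer appear in any directory.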


Hmmm... well; that wasn't exactly what I had expected.

Any hints, here?  The machine is running the i386 arch, with a pair of
dual-core 2.33GHz Xeons.

I have a recent dmesg.boot, but I'd rather keep list messages fairly
short.

We have a local private mirror of the FreeBSD CVS repository, so we have
some flexibility in what we can do for testing, but the objective is to
put the box in production -- and I'd rather not run CURRENT as part of a
customer-visible production workload.  :-}  [My laptop is a different
matter, of course....]


This sounds purely like a filesystem issue, not an MFI driver issue.

Scott
