Dear JFS heads,

Recently I have had the misfortune of running into the following problem with
JFS.  Periodically, but at irregular intervals, my JFS partition remounts
itself read-only after about a dozen of these messages appear in dmesg:

        ERROR: (device sdc): diRead: i_ino != di_number

I can't tell for certain what I'm doing wrong, so I am asking for your expertise.
Let me describe the environment.

The head node is a 3.40GHz Xeon running Fedora Core 4's latest kernel,
2.6.17-1.2141_FC4smp.  (I installed this kernel 2 days ago to replace the much
older 2.6.11-1.1369_FC4smp; however, the inode problem persisted.)

The head node is attached to an Infortrend EonStore A16U-G2421
SCSI-320-to-SATA RAID-5 subsystem with a total capacity of 5TB, which is
divided into 3 large logical drives of 1.6TB each.  The SCSI card is an
Adaptec 39320A-R.  The EonStore's RAID-5 reports itself healthy, although I did
have to rebuild the logical drive recently due to a SATA drive failure (which
is not uncommon).

The system is under a certain amount of stress 24/7, as it hosts users' home
directories.  The problematic JFS partition is the home of one particularly
notorious user who has managed to generate a literally _enormous_ number of
tiny files.  I am afraid to find out how many, but the last time I tried,
`du -sh' took more than 24 hours and reported a total of 660GB!
It seems to me that the problem occurs when he is managing files.
For example, today he tried to delete a large directory right before the problem
occurred.  Also, fsck of this partition takes 7 hours, whereas the other two
partitions take about 2 hours each.
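
To put a number on "enormous", something along these lines might be cheaper
than `du', since it only walks directory listings and does not add up sizes.
This is a minimal sketch, assuming Python 3 is available on the head node;
count_entries is a hypothetical helper name, and whether it avoids a stat per
file depends on the file system reporting entry types in its directory listings.

```python
import os

def count_entries(root):
    """Count files and directories under root without following symlinks."""
    files = 0
    dirs = 0
    stack = [root]              # explicit stack, so deep trees cannot hit the recursion limit
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dirs += 1
                        stack.append(entry.path)
                    else:
                        files += 1   # regular files, symlinks, and anything else
        except OSError:
            # Unreadable or vanished directory: skip it and keep counting.
            continue
    return files, dirs
```

Calling count_entries on the user's home directory would return a
(files, directories) pair; the root directory itself is not counted.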

My question may seem rather general, but am I reaching the limits of what my
environment can do?  Also, is there a specific reason for this error to occur?
Am I overlooking something obvious, something that can be anticipated and
avoided?

Perhaps smaller partitions would help?  Can LVM2 work in this environment to
divide the large terabyte chunks into smaller pieces that can expand?  Will
LVM2, regardless of the file system, be able to handle that many files?

Also, what is the best strategy to stress-test my hardware and a file system
under conditions as close to my environment as possible: large partitions
under constant IO stress, holding several dozen users' homes exported via NFS?
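
To make the question concrete, the kind of workload I have in mind is roughly
the following: create a tree of tiny files, read them back, then delete the
whole tree, which mirrors the bulk delete that preceded today's failure.  This
is only a sketch under my own assumptions; churn_small_files, the "churn_test"
directory name, and the default counts are all hypothetical, and the read-back
will mostly be served from the page cache, so it checks little about the disks
themselves.

```python
import os
import shutil

def churn_small_files(base, n_dirs=10, files_per_dir=100, size=512):
    """Create, read back, and delete a tree of tiny files under base.

    Returns the number of files whose contents did not match what was written.
    """
    root = os.path.join(base, "churn_test")   # hypothetical scratch directory
    payload = b"x" * size
    mismatches = 0

    # Phase 1: create many small files spread over several directories.
    for d in range(n_dirs):
        dpath = os.path.join(root, "d%05d" % d)
        os.makedirs(dpath, exist_ok=True)
        for f in range(files_per_dir):
            with open(os.path.join(dpath, "f%06d" % f), "wb") as fh:
                fh.write(payload)

    # Phase 2: read every file back and compare with what was written.
    for d in range(n_dirs):
        dpath = os.path.join(root, "d%05d" % d)
        for name in os.listdir(dpath):
            with open(os.path.join(dpath, name), "rb") as fh:
                if fh.read() != payload:
                    mismatches += 1

    # Phase 3: delete the whole tree in one go.
    shutil.rmtree(root)
    return mismatches
```

Run in a loop against a scratch directory on the affected partition, with much
larger counts and dmesg watched alongside, this might show whether the error
follows the create/delete pattern; it does not exercise the NFS path unless it
is run from a client.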

Any feedback is appreciated.

Thanks,

Alex Lisker

University of California


_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion
