1) The old partition (LUN0) was ext2, but because of the NFS hangs we had to do a couple of hard resets, so we upgraded it to ext3 - it takes quite a long time to fsck 1TB of data. Basically we just added the journal using tune2fs (a real life saver!). The second partition (LUN1) was created with the new kernel and it is ext3.
Moreover, initially the second partition was not exported, so we had problems only with the first one, which was ext2 at that time.
Can you please point me to some references about the problems with NFS and "more recent filesystems"?
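For anyone repeating the ext2-to-ext3 conversion: a journal added with `tune2fs -j <device>` only helps if the partition is then actually mounted as ext3, since the same filesystem can still be mounted as ext2. Below is a minimal sketch of how one might confirm that, by reading the filesystem type the kernel reports in /proc/mounts. The function name `fs_types` and the device and mount-point names in the example are hypothetical, not taken from our machine.

```python
def fs_types(mounts_text):
    """Map each mount point to the filesystem type the kernel reports.

    mounts_text is the content of /proc/mounts: one mount per line,
    whitespace-separated fields (device, mount point, type, options, ...).
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type = fields[:3]
        result[mount_point] = fs_type  # a later mount on the same point wins
    return result


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is not mounted
    for mount_point, fs_type in sorted(fs_types(text).items()):
        print(mount_point, fs_type)
```

A converted partition should show up with type ext3 here; if it still shows ext2, the journal is not being used.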
2) This is exactly what I had in mind, but I cannot afford to keep the machine off-line for too long right now. A reset plus a quick fs recovery using the journal (a few minutes at most) is already bad, but I can live with it for a while. Changing just the kernel will take about the same time or less, provided that I compile it elsewhere. So I prefer to wait for a while - maybe I'll find a simpler solution ...
Thanks again,
-iulian
Dan Stromberg wrote:
These are completely shots in the dark, but:
1) If you are using something other than ext2, you might want to try ext2. There were (and may remain) problems with more recent filesystems and NFS
2) You might want to downgrade to 7.3, and use a new kernel

On Fri, Feb 07, 2003 at 10:02:50AM -0800, Iulian Musat wrote:
Hi everybody!
After we installed a fresh RedHat 8.0 on a dual-processor machine, the NFS daemon hangs from time to time - really badly, since I cannot kill it even with kill -9. The problem occurs every couple of days.
I cannot even reboot, because the reboot process will hang trying to kill the NFS daemon. The problem is that after I try to kill nfsd, the exported partitions are not accessible, even locally - every access just hangs. Also, umount will report that the partition is busy. Just to make it clear:
- the nfsd hangs (I cannot access the exported partition from any client)
- the partition is still accessible locally
- try (unsuccessfully) to kill nfsd
- the partition is not accessible anymore
There is no processor activity when it hangs - both processors are idle. It also looks like there is no disk activity.
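A process that ignores kill -9 while using no CPU is usually stuck in uninterruptible sleep (state D), waiting inside the kernel. The following is a minimal sketch of one way to check whether nfsd is in that state, by reading the state field from /proc/<pid>/stat. The function names `parse_stat` and `uninterruptible` are hypothetical helpers, and this only identifies the stuck processes; it does not explain why they are stuck.

```python
import os


def parse_stat(stat_line):
    """Return (pid, command, state) from the content of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    command = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, command, state


def uninterruptible(stat_lines):
    """Return (pid, command) for every process in state D."""
    found = []
    for line in stat_lines:
        pid, command, state = parse_stat(line)
        if state == "D":
            found.append((pid, command))
    return found


if __name__ == "__main__":
    lines = []
    entries = os.listdir("/proc") if os.path.isdir("/proc") else []
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join("/proc", entry, "stat")) as f:
                lines.append(f.read())
        except OSError:
            continue  # the process exited while we were scanning
    for pid, command in uninterruptible(lines):
        print(pid, command)
```

If the nfsd threads appear in the output during a hang, that would be consistent with them blocking in the kernel rather than spinning.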
Now, the partitions are on some big (1TB) RAID disks. The RAID system is a separate box, accessible through a SCSI interface, so the computer just sees one big disk. Actually there are two of them, on different LUNs: 0 and 1. It worked fine before the upgrade, but because of an old kernel only LUN0 was accessible.
Here are the details:
processor: two Pentium III (Coppermine) at 800MHz, 256KB cache
memory: 2GB
kernel version: 2.4.18-19.8.0smp (upgraded with up2date)
I think there is no need to say that any suggestions to solve this problem are extremely welcome :-)
Cheers,
iulian
--
redhat-list mailing list
unsubscribe mailto:[EMAIL PROTECTED]?subject=unsubscribe
https://listman.redhat.com/mailman/listinfo/redhat-list