Hi all,

Our OSS has started panicking in the last couple of days. It seems to be related to NFS4, but I'm not sure, so I'm asking the group for pointers.
Firstly, a couple of screen grabs are at: http://penguin.stats.warwick.ac.uk/~stsxab/Lustre/

The OSS server is currently running Ubuntu 10.04 LTS with an alien (Red Hat, I believe) kernel installed. The running kernel is:

    2.6.32-131.6.1.el6_lustre.g65156ed.x86_64

I believe that it is running Lustre 1.6.x. The MDS is also set up in a similar manner. The clients are a mixture of Ubuntu 10.04 LTS with Lustre 1.6.x, plus the 3 most recent nodes, which run Ubuntu 12.04 LTS with Lustre 2.5.x that I built recently.

The OSS has 2 RAID arrays. The first is on the onboard SAS controller and holds two of the Lustre volumes (/home and /scratch), along with the NFS-exported file system on a separate XFS partition. The second is on an external PCIe RAID controller with an external disk array, and holds the other Lustre filesystem on two virtual disks.

The OSS also has a couple of NFS4 shares:

    /export 192.168.0.0/24(rw,async,fsid=0,crossmnt,no_root_squash,no_subtree_check) 192.168.1.0/24(rw,sync,fsid=0,no_root_squash,crossmnt,no_subtree_check)
    /export/software/packages-x86_64-linux-gnu 192.168.0.0/24(rw,async,no_subtree_check,no_root_squash)

These are on a separate disk.

If I disable the NFS shares, the OSS seems to stay up and client machines can access the Lustre file systems. But once I enable the NFS shares, the OSS panics within a few minutes, which is why I suspect some interaction with NFS.

The odd thing is that the machine only started doing this yesterday. I have replaced or re-seated the RAM, CPUs and cards (Ethernet and SAS), but this doesn't seem to have changed anything.

I am aware that this setup is not a supported architecture (I inherited custody of the cluster from a previous admin), and I am planning on re-installing both the OSS and MDS with (probably) CentOS, as that is supported for the servers. Is there anything I need to be aware of in planning this upgrade?
Does anyone have any clue as to what I might try? Also, is there an easy way I can check the integrity of the Lustre volumes?

Cheers.

Phill.
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss