Hi all, I used the method described below to build client rpms with the source kit lustre-1.8.5.tar.gz.
Only one error was reported during "make rpms", relating to lustre-iokit-1.2-root, but the rpms were still built under /usr/src/packages/RPMS/x86_64. The rpms lustre-modules, lustre and lustre-tests then installed without any complaints. However, the subsequent "modprobe lustre" returns a "Killed" message, with no lustre module loaded. dmesg also reveals "BUG: unable to handle kernel NULL pointer dereference at 0000000000000008". A second "modprobe lustre" command then hangs, again with no module loaded. As a result, the client is not able to mount the lustre storage.

Can anyone shed some light on what has gone wrong here, please?

Many thanks.

Regards,

Peter Chiu
STFC Rutherford Appleton Laboratory
Space Science & Technology Department
Building R25, Room 2.02
Chilton
Didcot
OXON OX11 0QX
UK
Phone: 01235-446699
Fax: 01235-445848
Email: peter.c...@stfc.ac.uk

Details:
===========================================================

Client host cmip-proc8:

cat /etc/issue:

    Welcome to SUSE Linux Enterprise Server 11 SP1 (x86_64) - Kernel \r (\l).

    cmip-proc8:~ # uname -a
    Linux cmip-proc8.badc.rl.ac.uk 2.6.32.29-0.3-xen #1 SMP 2011-02-25 13:36:59 +0100 x86_64 x86_64 x86_64 GNU/Linux

Install kit from:

    cd /usr/local/kits/lustre-1.8.5
    ls -ls /usr/src/
    4 drwxr-xr-x  3 root root 4096 2011-05-09 08:31 debug
    0 lrwxrwxrwx  1 root root   19 2011-03-20 15:54 linux -> linux-2.6.32.29-0.3
    4 drwxr-xr-x 25 root root 4096 2011-05-09 08:49 linux-2.6.32.29-0.3
    4 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-2.6.32.29-0.3-obj
    4 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-obj
    4 drwxr-xr-x 10 root root 4096 2011-05-09 08:31 lustre-1.8.5
    4 drwxr-xr-x  7 root root 4096 2011-03-20 14:58 packages

Install command:

    ./configure --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen
    make rpms

One error recorded:

    + ./configure --prefix=/usr
    configure: error: cannot find install-sh or install.sh in . ./.. ./../..
    error: Bad exit status from /var/tmp/rpm-tmp.51316 (%build)

    RPM build errors:
        Bad exit status from /var/tmp/rpm-tmp.51316 (%build)
    make[1]: *** [rpms] Error 1
    make[1]: Leaving directory `/usr/local/kits/lustre-1.8.5/lustre-iokit'

By trial and error, I found that this error can be avoided if I rsync /usr/local/kits/lustre-1.8.5/lustre-iokit /usr/src/packages/BUILD/lustre-iokit-1.2

Anyway, the rpms are built under:

    cmip-proc8:/usr/local/kits/lustre-1.8.5 # ls /usr/src/packages/RPMS//x86_64/*1.8.5*
    /usr/src/packages/RPMS//x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
    /usr/src/packages/RPMS//x86_64/lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
    /usr/src/packages/RPMS//x86_64/lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
    /usr/src/packages/RPMS//x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
    /usr/src/packages/RPMS//x86_64/lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
    /usr/src/packages/RPMS//x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm

No error when installing these rpms:

    cmip-proc8:/usr/local/kits/lustre-1.8.5 # rpm -qa | grep lustre
    lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815
    lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815
    lustre-1.8.5-2.6.32.29_0.3_xen_201105090815
    lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815
    lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815
    lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815

Checking for and loading the lustre module (none is loaded):

    cmip-proc8:~ # lsmod | grep lustre
    cmip-proc8:~ # modprobe lustre
    Killed
    cmip-proc8:~ # lsmod | grep lustre
    cmip-proc8:~ # modprobe lustre &
    [1] 3454
    cmip-proc8:~ #
    cmip-proc8:~ # ps auxw | grep lustre
    root 3454 0.0 0.0 3940 624 pts/1 S 18:04 0:00 modprobe lustre

dmesg records this error after the first "modprobe lustre" command:

    cmip-proc8:/usr/local/kits/lustre-1.8.5 # diff /tmp/d1 /tmp/d2
    195a196,250
    > [ 168.647996] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    > [ 168.648006] IP: [<ffffffff8002c3d2>] task_rq_lock+0x42/0xa0
    > [ 168.648018] PGD 7fac4067 PUD 7ef4c067 PMD 0
    > [ 168.648023] Oops: 0000 [#1] SMP
    > [ 168.648026] last sysfs file: /sys/module/ip_tables/initstate
    > [ 168.648028] CPU 0
    > [ 168.648030] Modules linked in: lnet(N+) lvfs(N) libcfs(N) iptable_nat nf_nat xt_tcpudp xt_pkttype ipt_LOG xt_limit autofs4 binfmt_misc microcode xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6_tables x_tables fuse loop dm_mod joydev rtc_core rtc_lib xennet ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
    > [ 168.648063] Supported: Yes
    > [ 168.648066] Pid: 3445, comm: modprobe Tainted: G N 2.6.32.29-0.3-xen #1
    > [ 168.648069] RIP: e030:[<ffffffff8002c3d2>] [<ffffffff8002c3d2>] task_rq_lock+0x42/0xa0
    > [ 168.648074] RSP: e02b:ffff88007efa5e38 EFLAGS: 00010082
    > [ 168.648077] RAX: 0000000000000001 RBX: 0000000000009700 RCX: dead000000100100
    > [ 168.648080] RDX: 0000000000000000 RSI: ffff88007efa5e88 RDI: 0000000000000000
    > [ 168.648083] RBP: ffff88007efa5e58 R08: ffffffffa0252fb6 R09: 0000000000000000
    > [ 168.648086] R10: 0000000000000001 R11: 0000000000000061 R12: 0000000000009700
    > [ 168.648089] R13: 0000000000000000 R14: ffff88007efa5e88 R15: 000000000000000f
    > [ 168.648095] FS: 00007f3f41030700(0000) GS:ffff8800013c1000(0000) knlGS:0000000000000000
    > [ 168.648098] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
    > [ 168.648101] CR2: 0000000000000008 CR3: 000000007ef7d000 CR4: 0000000000002660
    > [ 168.648104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > [ 168.648107] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    > [ 168.648110] Process modprobe (pid: 3445, threadinfo ffff88007efa4000, task ffff88007e9100c0)
    > [ 168.648113] Stack:
    > [ 168.648115] ffffffffa02579f8 0000000000000000 0000000000623da0 0000000000623d30
    > [ 168.648118] <0> ffff88007efa5eb8 ffffffff80038588 000000007ef8ef00 00000000a02579f8
    > [ 168.648123] <0> 00000000a0243060 0000000000000000 0000000000000001 ffffffffa02579f8
    > [ 168.648129] Call Trace:
    > [ 168.648138] [<ffffffff80038588>] try_to_wake_up+0x48/0x420
    > [ 168.648143] [<ffffffff8005b2e8>] up+0x48/0x50
    > [ 168.648153] [<ffffffffa0230d92>] LNetInit+0x92/0xc0 [lnet]
    > [ 168.648167] [<ffffffffa02430ac>] init_lnet+0x4c/0x280 [lnet]
    > [ 168.648178] [<ffffffff80004045>] do_one_initcall+0x35/0x1b0
    > [ 168.648184] [<ffffffff8006d154>] sys_init_module+0xe4/0x270
    > [ 168.648189] [<ffffffff80007458>] system_call_fastpath+0x16/0x1b
    > [ 168.648194] [<00007f3f40bc9f7a>] 0x7f3f40bc9f7a
    > [ 168.648196] Code: 1c 24 49 89 f6 4c 89 64 24 08 49 c7 c4 00 97 00 00 65 8a 04 25 c1 67 00 00 65 c6 04 25 c1 67 00 00 01 0f b6 c0 4c 89 e3 49 89 06 <49> 8b 45 08 8b 40 18 48 03 1c c5 80 ae 62 80 48 89 df e8 f7 87
    > [ 168.648230] RIP [<ffffffff8002c3d2>] task_rq_lock+0x42/0xa0
    > [ 168.648234] RSP <ffff88007efa5e38>
    > [ 168.648236] CR2: 0000000000000008
    > [ 168.648239] ---[ end trace 57429513f7001015 ]---
    cmip-proc8:/usr/local/kits/lustre-1.8.5 #

I have tried Lustre-1.8.4, but got the same result.

I have also tried to follow the 1.8 Operations Manual to locate the diagnostic tools, but the link wiki.lustre.org is no longer valid.
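For anyone who wants to read the trace quickly, here is a minimal Python sketch that pulls the frames out of the "Call Trace:" section of the dmesg output and labels which ones belong to a loadable module. It is not part of Lustre or any kernel tool: the names `FRAME_RE`, `parse_call_trace` and `SAMPLE` are made up for illustration, and the regular expression only assumes the frame layout seen in the oops above. On that trace it shows the fault being reached from LNetInit and init_lnet in the lnet module, through up() and try_to_wake_up().

```python
import re

# One frame of the "Call Trace:" section, in the layout seen in the oops, e.g.
#   [<ffffffffa0230d92>] LNetInit+0x92/0xc0 [lnet]
# The trailing "[module]" part is optional; frames without it are kernel code.
# (Hypothetical helper names; this is an illustrative sketch only.)
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_call_trace(dmesg_text):
    """Return (symbol, owner) pairs for the frames after 'Call Trace:'.

    owner is the module name in square brackets, or "kernel" if absent.
    Parsing stops at the first line that is not a symbol+offset frame.
    """
    frames = []
    in_trace = False
    for line in dmesg_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break
        frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


# Call trace lines copied from the dmesg diff in this post.
SAMPLE = """\
> [ 168.648129] Call Trace:
> [ 168.648138] [<ffffffff80038588>] try_to_wake_up+0x48/0x420
> [ 168.648143] [<ffffffff8005b2e8>] up+0x48/0x50
> [ 168.648153] [<ffffffffa0230d92>] LNetInit+0x92/0xc0 [lnet]
> [ 168.648167] [<ffffffffa02430ac>] init_lnet+0x4c/0x280 [lnet]
> [ 168.648178] [<ffffffff80004045>] do_one_initcall+0x35/0x1b0
> [ 168.648184] [<ffffffff8006d154>] sys_init_module+0xe4/0x270
> [ 168.648189] [<ffffffff80007458>] system_call_fastpath+0x16/0x1b
> [ 168.648194] [<00007f3f40bc9f7a>] 0x7f3f40bc9f7a
"""

if __name__ == "__main__":
    for symbol, owner in parse_call_trace(SAMPLE):
        print(f"{owner:8s} {symbol}")
```

The raw user-space address on the last line has no symbol, so the sketch stops there rather than guessing at it.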
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss