Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working
Peter,

Sorry for the late response. I don't know if this will help you or not, but below are the commands I ran to build the lustre client rpms on one of our SLES systems:

nautilus:~ # cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1

nautilus:~ # uname -a
Linux nautilus 2.6.32.29-0.3.1.2687.3.PTF.607050.iommu-default #1 SMP 2011-02-25 13:36:59 +0100 x86_64 x86_64 x86_64 GNU/Linux

nautilus:~ # cd /usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu

nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # make cloneconfig
Cloning configuration file /proc/config.gz ...

nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # make prepare
scripts/kconfig/conf -s arch/x86/Kconfig
  CHK     include/linux/version.h
  UPD     include/linux/version.h

nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # make scripts
  HOSTCC  scripts/genksyms/genksyms.o
  SHIPPED scripts/genksyms/lex.c

nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # cd /root/lustre-1.8.5

nautilus:~/lustre-1.8.5 # ./configure --disable-server --with-linux=/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu \
    --with-linux-obj=/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu-obj/x86_64/default \
    --with-linux-config=/boot/config-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu-default
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu

nautilus:~/lustre-1.8.5 # make rpms

-- 
Rick Mohr
HPC Systems Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu/

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
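The three paths in the configure line above follow a pattern that can be derived from the kernel release string. The following is only a minimal sketch, assuming the SLES layout seen in this thread (source under /usr/src/linux-VERSION, objects under /usr/src/linux-VERSION-obj/ARCH/FLAVOR, config under /boot/config-RELEASE). The function name kernel_paths is hypothetical and is not part of Lustre or SLES.

```shell
#!/bin/sh
# Hypothetical helper: print the three Lustre configure options for a given
# kernel release, ASSUMING the SLES directory layout shown in this thread.
kernel_paths() {
    release="$1"               # e.g. 2.6.32.29-0.3-xen, as printed by uname -r
    arch="${2:-x86_64}"        # architecture directory under the -obj tree
    flavor="${release##*-}"    # text after the last hyphen, e.g. xen
    version="${release%-*}"    # text before the last hyphen, e.g. 2.6.32.29-0.3
    echo "--with-linux=/usr/src/linux-${version}"
    echo "--with-linux-obj=/usr/src/linux-${version}-obj/${arch}/${flavor}"
    echo "--with-linux-config=/boot/config-${release}"
}
```

Running kernel_paths "$(uname -r)" prints the options for the running kernel. It is worth checking that each directory actually exists before passing the result to ./configure, since the layout is an assumption.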
Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working
Dear Andreas,

I wonder if there is any further advice you can kindly offer as to how to troubleshoot the failure in bringing up the lustre module?

Many thanks.
Peter

-----Original Message-----
From: Chiu, Peter (STFC,RAL,RALSP)
Sent: 11 May 2011 11:50
To: Andreas Dilger
Cc: lustre-discuss@lists.lustre.org; Chiu, Peter (STFC,RAL,RALSP)
Subject: RE: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

Understood, Andreas,

Just to supplement: the same approach works for SLES 11 using a xen kernel (2.6.27.54-0.2-xen). The Lustre client rpms work okay:

cmip-proc1:~ # cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 (x86_64) - Kernel \r (\l).

cmip-proc1:~ # uname -a
Linux cmip-proc1 2.6.27.54-0.2-xen #1 SMP 2010-10-19 18:40:07 +0200 x86_64 x86_64 x86_64 GNU/Linux

cmip-proc1:~ # df -h /disks/ceda1
Filesystem            Size  Used Avail Use% Mounted on
130.246.191.64:130.246.191.65@tcp0:/ceda1
                       51T  130G   48T   1% /disks/ceda1

SLES 11 SP1 is a service pack update to SLES 11 (now on 2.6.32.29-0.3-xen). Is it possible to find out what the problem is?

Regards,
Peter

-----Original Message-----
From: Andreas Dilger [mailto:adil...@whamcloud.com]
Sent: 11 May 2011 10:11
To: Chiu, Peter (STFC,RAL,RALSP)
Cc: lustre-discuss@lists.lustre.org; Chiu, Peter (STFC,RAL,RALSP)
Subject: Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

The only other potential problem I see is that you are using a xen kernel and this is somehow causing problems.

Cheers, Andreas

On 2011-05-11, at 1:33 AM, peter.c...@stfc.ac.uk wrote:

Dear Andreas,

Many thanks for your response. Below are further details on this. I shall be grateful for your advice on this.

Regards,
Peter

The system is:

cmip-proc8:/etc # uname -a
Linux cmip-proc8.badc.rl.ac.uk 2.6.32.29-0.3-xen #1 SMP 2011-02-25 13:36:59 +0100 x86_64 x86_64 x86_64 GNU/Linux

/usr/src/linux is a symlink pointing to the source corresponding to linux-2.6.32.29-0.3-obj:

cmip-proc8:/etc # ls -l /usr/src
total 24
drwxr-xr-x  3 root root 4096 2011-05-09 08:31 debug
lrwxrwxrwx  1 root root   19 2011-03-20 15:54 linux -> linux-2.6.32.29-0.3
drwxr-xr-x 25 root root 4096 2011-05-09 08:49 linux-2.6.32.29-0.3
drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-2.6.32.29-0.3-obj
drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-obj
drwxr-xr-x 10 root root 4096 2011-05-09 08:31 lustre-1.8.5
drwxr-xr-x  7 root root 4096 2011-03-20 14:58 packages
cmip-proc8:/etc #

cmip-proc8:~ # ls /usr/local/kits/lustre-1.8.5
aclocal.m4       config.h.in    install-sh           Makefile
autoMakefile     config.log     ldiskfs              Makefile.in
autoMakefile.am  config.status  libsysio             missing
autoMakefile.in  config.sub     lnet                 mkinstalldirs
build            configure      lustre               README
ChangeLog        configure.ac   lustre-1.8.5.tar.gz  Rules
compile          COPYING        lustre-iokit         snmp
config.guess     debian         lustre.spec          stamp-h1
config.h         depcomp        lustre.spec.in       tree_status
cmip-proc8:~ #

The build with ./configure and make rpms produced rpms that are installable:

cmip-proc8:/etc # ls -ls /usr/src/packages/RPMS/x86_64/*1.8.5*
 4024 -rw-r--r-- 1 root root  4112883 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
15532 -rw-r--r-- 1 root root 15881360 2011-05-09 08:54 /usr/src/packages/RPMS/x86_64/lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 1332 -rw-r--r-- 1 root root  1358924 2011-05-09 08:54 /usr/src/packages/RPMS/x86_64/lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 1416 -rw-r--r-- 1 root root  1441937 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 3524 -rw-r--r-- 1 root root  3602163 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 2600 -rw-r--r-- 1 root root  2656393 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm

cmip-proc8:/etc # rpm -e lustre-tests
cmip-proc8:/etc # rpm -e lustre
cmip-proc8:/etc # rpm -e lustre-modules
cmip-proc8:/etc # rpm -ivh /usr/src/packages/RPMS/x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
Preparing...        ### [100%]
   1:lustre-modules ### [100%]
Congratulations on finishing your Lustre installation! To register your copy of Lustre and find out more about Lustre Support, Service, and Training offerings please visit

http://www.sun.com/software/products/lustre/lustre_reg.jsp

cmip-proc8:/etc # rpm -ivh /usr/src/packages/RPMS
Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working
NULL pointer dereference at 0008
[ 104.171317] IP: [8002c3d2] task_rq_lock+0x42/0xa0
[ 104.171328] PGD 7d9d0067 PUD 7d94c067 PMD 0
[ 104.171333] Oops: [#1] SMP
[ 104.171336] last sysfs file: /sys/module/ip_tables/initstate
[ 104.171339] CPU 0
[ 104.171341] Modules linked in: lnet(N+) lvfs(N) libcfs(N) iptable_nat nf_nat xt_tcpudp xt_pkttype ipt_LOG xt_limit autofs4 binfmt_misc microcode xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6_tables x_tables fuse loop dm_mod joydev rtc_core rtc_lib xennet ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
[ 104.171373] Supported: Yes
[ 104.171376] Pid: 3441, comm: modprobe Tainted: G N 2.6.32.29-0.3-xen #1
[ 104.171379] RIP: e030:[8002c3d2] [8002c3d2] task_rq_lock+0x42/0xa0
[ 104.171384] RSP: e02b:88007edade38 EFLAGS: 00010082
[ 104.171387] RAX: 0001 RBX: 9700 RCX: dead00100100
[ 104.171390] RDX: RSI: 88007edade88 RDI:
[ 104.171393] RBP: 88007edade58 R08: a0252fb6 R09:
[ 104.171396] R10: 0001 R11: 805f4200 R12: 9700
[ 104.171399] R13: R14: 88007edade88 R15: 000f
[ 104.171406] FS: 7f541715a700() GS:8800013c1000() knlGS:
[ 104.171409] CS: e033 DS: ES: CR0: 8005003b
[ 104.171412] CR2: 0008 CR3: 7d905000 CR4: 2660
[ 104.171415] DR0: DR1: DR2:
[ 104.171418] DR3: DR6: 0ff0 DR7: 0400
[ 104.171421] Process modprobe (pid: 3441, threadinfo 88007edac000, task 88007df8a400)
[ 104.171424] Stack:
[ 104.171426] a02579f8 00623da0 00623d30
[ 104.171430] 0 88007edadeb8 80038588 7fc11fa0 a02579f8
[ 104.171435] 0 a0243060 0001 a02579f8
[ 104.171441] Call Trace:
[ 104.171449] [80038588] try_to_wake_up+0x48/0x420
[ 104.171455] [8005b2e8] up+0x48/0x50
[ 104.171464] [a0230d92] LNetInit+0x92/0xc0 [lnet]
[ 104.171478] [a02430ac] init_lnet+0x4c/0x280 [lnet]
[ 104.171489] [80004045] do_one_initcall+0x35/0x1b0
[ 104.171495] [8006d154] sys_init_module+0xe4/0x270
[ 104.171500] [80007458] system_call_fastpath+0x16/0x1b
[ 104.171506] [7f5416cf3f7a] 0x7f5416cf3f7a
[ 104.171508] Code: 1c 24 49 89 f6 4c 89 64 24 08 49 c7 c4 00 97 00 00 65 8a 04 25 c1 67 00 00 65 c6 04 25 c1 67 00 00 01 0f b6 c0 4c 89 e3 49 89 06 49 8b 45 08 8b 40 18 48 03 1c c5 80 ae 62 80 48 89 df e8 f7 87
[ 104.171544] RIP [8002c3d2] task_rq_lock+0x42/0xa0
[ 104.171548] RSP 88007edade38
[ 104.171550] CR2: 0008
[ 104.171553] ---[ end trace 34c6e019e0aea7d2 ]---
[ 106.380129] SFW2-INext-DROP-DEFLT IN=eth0 OUT= MAC=01:00:5e:00:00:01:00:17:f2:0e:c4:a1:08:00 SRC=130.246.188.58 DST=224.0.0.1 LEN=44 TOS=0x00 PREC=0x00 TTL=1 ID=27534 PROTO=UDP SPT=54228 DPT=8612 LEN=24
cmip-proc8:~ #

-----Original Message-----
From: Andreas Dilger [mailto:adil...@whamcloud.com]
Sent: 10 May 2011 21:48
To: Chiu, Peter (STFC,RAL,RALSP)
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

On May 9, 2011, at 11:38, peter.c...@stfc.ac.uk peter.c...@stfc.ac.uk wrote:

The rpms lustre-modules, lustre and lustre-tests were then installed smoothly without any complaints. But the subsequent modprobe lustre will return a Killed message, with no lustre module loaded. dmesg also reveals

BUG: unable to handle kernel NULL pointer dereference at 0008

A second modprobe lustre command will then hang, again with no module loaded. Subsequently the client is not able to mount the lustre storage.

Can anyone shed some light as to what has gone wrong here please?

./configure --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen

Are you sure that /usr/src/linux points to the same source as /usr/src/linux-2.6.32.29-0.3-obj? Is that a symlink? Normally the source and -obj files have a very similar pathname (i.e. just with -obj suffix difference).

[ 168.647996] BUG: unable to handle kernel NULL pointer dereference at 0008
[ 168.648066] Pid: 3445, comm: modprobe Tainted: G N 2.6.32.29-0.3-xen #1 0400
[ 168.648110] Process modprobe (pid: 3445, threadinfo 88007efa4000, task 88007e9100c0)
[ 168.648129] Call Trace:
[ 168.648138] [80038588] try_to_wake_up+0x48/0x420
[ 168.648143] [8005b2e8] up+0x48/0x50
[ 168.648153
Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working
Understood, Andreas,

Just to supplement: the same approach works for SLES 11 using a xen kernel (2.6.27.54-0.2-xen). The Lustre client rpms work okay:

cmip-proc1:~ # cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 (x86_64) - Kernel \r (\l).

cmip-proc1:~ # uname -a
Linux cmip-proc1 2.6.27.54-0.2-xen #1 SMP 2010-10-19 18:40:07 +0200 x86_64 x86_64 x86_64 GNU/Linux

cmip-proc1:~ # df -h /disks/ceda1
Filesystem            Size  Used Avail Use% Mounted on
130.246.191.64:130.246.191.65@tcp0:/ceda1
                       51T  130G   48T   1% /disks/ceda1

SLES 11 SP1 is a service pack update to SLES 11 (now on 2.6.32.29-0.3-xen). Is it possible to find out what the problem is?

Regards,
Peter

-----Original Message-----
From: Andreas Dilger [mailto:adil...@whamcloud.com]
Sent: 11 May 2011 10:11
To: Chiu, Peter (STFC,RAL,RALSP)
Cc: lustre-discuss@lists.lustre.org; Chiu, Peter (STFC,RAL,RALSP)
Subject: Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

The only other potential problem I see is that you are using a xen kernel and this is somehow causing problems.

Cheers, Andreas

On 2011-05-11, at 1:33 AM, peter.c...@stfc.ac.uk wrote:

Dear Andreas,

Many thanks for your response. Below are further details on this. I shall be grateful for your advice on this.

Regards,
Peter

The system is:

cmip-proc8:/etc # uname -a
Linux cmip-proc8.badc.rl.ac.uk 2.6.32.29-0.3-xen #1 SMP 2011-02-25 13:36:59 +0100 x86_64 x86_64 x86_64 GNU/Linux

/usr/src/linux is a symlink pointing to the source corresponding to linux-2.6.32.29-0.3-obj:

cmip-proc8:/etc # ls -l /usr/src
total 24
drwxr-xr-x  3 root root 4096 2011-05-09 08:31 debug
lrwxrwxrwx  1 root root   19 2011-03-20 15:54 linux -> linux-2.6.32.29-0.3
drwxr-xr-x 25 root root 4096 2011-05-09 08:49 linux-2.6.32.29-0.3
drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-2.6.32.29-0.3-obj
drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-obj
drwxr-xr-x 10 root root 4096 2011-05-09 08:31 lustre-1.8.5
drwxr-xr-x  7 root root 4096 2011-03-20 14:58 packages
cmip-proc8:/etc #

cmip-proc8:~ # ls /usr/local/kits/lustre-1.8.5
aclocal.m4       config.h.in    install-sh           Makefile
autoMakefile     config.log     ldiskfs              Makefile.in
autoMakefile.am  config.status  libsysio             missing
autoMakefile.in  config.sub     lnet                 mkinstalldirs
build            configure      lustre               README
ChangeLog        configure.ac   lustre-1.8.5.tar.gz  Rules
compile          COPYING        lustre-iokit         snmp
config.guess     debian         lustre.spec          stamp-h1
config.h         depcomp        lustre.spec.in       tree_status
cmip-proc8:~ #

The build with ./configure and make rpms produced rpms that are installable:

cmip-proc8:/etc # ls -ls /usr/src/packages/RPMS/x86_64/*1.8.5*
 4024 -rw-r--r-- 1 root root  4112883 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
15532 -rw-r--r-- 1 root root 15881360 2011-05-09 08:54 /usr/src/packages/RPMS/x86_64/lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 1332 -rw-r--r-- 1 root root  1358924 2011-05-09 08:54 /usr/src/packages/RPMS/x86_64/lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 1416 -rw-r--r-- 1 root root  1441937 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 3524 -rw-r--r-- 1 root root  3602163 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 2600 -rw-r--r-- 1 root root  2656393 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm

cmip-proc8:/etc # rpm -e lustre-tests
cmip-proc8:/etc # rpm -e lustre
cmip-proc8:/etc # rpm -e lustre-modules
cmip-proc8:/etc # rpm -ivh /usr/src/packages/RPMS/x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
Preparing...        ### [100%]
   1:lustre-modules ### [100%]
Congratulations on finishing your Lustre installation! To register your copy of Lustre and find out more about Lustre Support, Service, and Training offerings please visit

http://www.sun.com/software/products/lustre/lustre_reg.jsp

cmip-proc8:/etc # rpm -ivh /usr/src/packages/RPMS/x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
Preparing...        ### [100%]
   1:lustre         ### [100%]
cmip-proc8:/etc # rpm -ivh /usr/src/packages/RPMS/x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
Preparing
Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working
On May 9, 2011, at 11:38, peter.c...@stfc.ac.uk peter.c...@stfc.ac.uk wrote:

The rpms lustre-modules, lustre and lustre-tests were then installed smoothly without any complaints. But the subsequent “modprobe lustre” will return a “Killed” message, with no lustre module loaded. dmesg also reveals

“BUG: unable to handle kernel NULL pointer dereference at 0008”

A second modprobe lustre command will then hang, again with no module loaded. Subsequently the client is not able to mount the lustre storage.

Can anyone shed some light as to what has gone wrong here please?

./configure --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen

Are you sure that /usr/src/linux points to the same source as /usr/src/linux-2.6.32.29-0.3-obj? Is that a symlink? Normally the source and -obj files have a very similar pathname (i.e. just with -obj suffix difference).

[ 168.647996] BUG: unable to handle kernel NULL pointer dereference at 0008
[ 168.648066] Pid: 3445, comm: modprobe Tainted: G N 2.6.32.29-0.3-xen #1 0400
[ 168.648110] Process modprobe (pid: 3445, threadinfo 88007efa4000, task 88007e9100c0)
[ 168.648129] Call Trace:
[ 168.648138] [80038588] try_to_wake_up+0x48/0x420
[ 168.648143] [8005b2e8] up+0x48/0x50
[ 168.648153] [a0230d92] LNetInit+0x92/0xc0 [lnet]
[ 168.648167] [a02430ac] init_lnet+0x4c/0x280 [lnet]
[ 168.648178] [80004045] do_one_initcall+0x35/0x1b0
[ 168.648184] [8006d154] sys_init_module+0xe4/0x270
[ 168.648189] [80007458] system_call_fastpath+0x16/0x1b
[ 168.648194] [7f3f40bc9f7a] 0x7f3f40bc9f7a

I have tried Lustre-1.8.4, but got the same result. I have also tried to follow the 1.8 Operations Manual to locate the diagnostic tools, but the link wiki.lustre.org is no longer valid.

This looks like a pretty serious error to oops during module insertion, and I'd suspect the build environment before any particular Lustre code.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
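One way to check the build environment is to compare the running kernel's configuration with the .config in the object tree that was passed to --with-linux-obj. The following is only a minimal sketch: compare_configs is a hypothetical helper, and the location of .config inside the -obj directory is an assumption about the SLES layout that should be verified on the system in question. A clean comparison does not prove the build tree is right; it only rules out one kind of mismatch.

```shell
#!/bin/sh
# Hypothetical helper: compare the CONFIG_ options of two kernel config files.
# Prints differing options in diff format; returns 0 only if they are identical.
compare_configs() {
    tmp_a="$(mktemp)" || return 2
    tmp_b="$(mktemp)" || { rm -f "$tmp_a"; return 2; }
    # Keep only option assignments so comments and ordering do not matter.
    grep '^CONFIG_' "$1" | sort > "$tmp_a"
    grep '^CONFIG_' "$2" | sort > "$tmp_b"
    status=0
    diff "$tmp_a" "$tmp_b" || status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}

# Example use (paths are assumptions based on this thread):
#   zcat /proc/config.gz > /tmp/running.config
#   compare_configs /tmp/running.config \
#       /usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen/.config
```

Any output means the modules were built against a configuration that differs from the running kernel, which would be worth resolving before looking at Lustre itself.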
[Lustre-discuss] SLES 11 SP1 Client rpms built but not working
Hi all,

I used the method described below to build client rpms with the source kit lustre-1.8.5.tar.gz. There was only one error reported during the make rpms, relating to lustre-iokit-1.2-root, but the rpms were built under /usr/src/packages/RPMS/x86_64.

The rpms lustre-modules, lustre and lustre-tests were then installed smoothly without any complaints. But the subsequent modprobe lustre will return a Killed message, with no lustre module loaded. dmesg also reveals

BUG: unable to handle kernel NULL pointer dereference at 0008

A second modprobe lustre command will then hang, again with no module loaded. Subsequently the client is not able to mount the lustre storage.

Can anyone shed some light as to what has gone wrong here please?

Many thanks.

Regards,
Peter Chiu
STFC Rutherford Appleton Laboratory
Space Science Technology Department
Building R25, Room 2.02
Chilton
Didcot
OXON OX11 0QX
UK
Phone: 01235-446699
Fax: 01235-445848
Email: peter.c...@stfc.ac.uk

Details:
===

Client host cmip-proc8:

cat /etc/issue:
Welcome to SUSE Linux Enterprise Server 11 SP1 (x86_64) - Kernel \r (\l).

cmip-proc8:~ # uname -a
Linux cmip-proc8.badc.rl.ac.uk 2.6.32.29-0.3-xen #1 SMP 2011-02-25 13:36:59 +0100 x86_64 x86_64 x86_64 GNU/Linux

Install kit from:

cd /usr/local/kits/lustre-1.8.5

ls -ls /usr/src/
4 drwxr-xr-x  3 root root 4096 2011-05-09 08:31 debug
0 lrwxrwxrwx  1 root root   19 2011-03-20 15:54 linux -> linux-2.6.32.29-0.3
4 drwxr-xr-x 25 root root 4096 2011-05-09 08:49 linux-2.6.32.29-0.3
4 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-2.6.32.29-0.3-obj
4 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-obj
4 drwxr-xr-x 10 root root 4096 2011-05-09 08:31 lustre-1.8.5
4 drwxr-xr-x  7 root root 4096 2011-03-20 14:58 packages

Install command:

./configure --with-linux=/usr/src/linux --with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen
make rpms

One error recorded:

+ ./configure --prefix=/usr
configure: error: cannot find install-sh or install.sh in . ./.. ./../..
error: Bad exit status from /var/tmp/rpm-tmp.51316 (%build)
RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.51316 (%build)
make[1]: *** [rpms] Error 1
make[1]: Leaving directory `/usr/local/kits/lustre-1.8.5/lustre-iokit'

By trial and error, this error can be avoided if I rsync /usr/local/kits/lustre-1.8.5/lustre-iokit /usr/src/packages/BUILD/lustre-iokit-1.2

Anyway, rpms are built under:

cmip-proc8:/usr/local/kits/lustre-1.8.5 # ls /usr/src/packages/RPMS//x86_64/*1.8.5*
/usr/src/packages/RPMS//x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm

No error when installing these rpms:

cmip-proc8:/usr/local/kits/lustre-1.8.5 # rpm -qa | grep lustre
lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815

To check and load lustre module - none found

cmip-proc8:~ # lsmod | grep lustre
cmip-proc8:~ # modprobe lustre
Killed
cmip-proc8:~ # lsmod | grep lustre
cmip-proc8:~ # modprobe lustre
[1] 3454
cmip-proc8:~ #
cmip-proc8:~ # ps auxw | grep lustre
root 3454 0.0 0.0 3940 624 pts/1 S 18:04 0:00 modprobe lustre

Dmesg records this error after the first modprobe lustre command:

cmip-proc8:/usr/local/kits/lustre-1.8.5 # diff /tmp/d1 /tmp/d2
195a196,250
[ 168.647996] BUG: unable to handle kernel NULL pointer dereference at 0008
[ 168.648006] IP: [8002c3d2] task_rq_lock+0x42/0xa0
[ 168.648018] PGD 7fac4067 PUD 7ef4c067 PMD 0
[ 168.648023] Oops: [#1] SMP
[ 168.648026] last sysfs file: /sys/module/ip_tables/initstate
[ 168.648028] CPU 0
[ 168.648030] Modules linked in: lnet(N+) lvfs(N) libcfs(N) iptable_nat nf_nat xt_tcpudp xt_pkttype ipt_LOG xt_limit autofs4 binfmt_misc microcode xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6_tables x_tables fuse loop dm_mod joydev rtc_core rtc_lib xennet ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
[ 168.648063] Supported: Yes
[ 168.648066] Pid: 3445, comm: modprobe Tainted: G N 2.6.32.29-0.3-xen #1
[ 168.648069] RIP: e030:[8002c3d2] [8002c3d2]