Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-24 Thread Rick Mohr
Peter,

Sorry for the late response.  I don't know if this will help you or not,
but below are the commands I ran to build the lustre client rpms on one
of our SLES systems:


nautilus:~ # cat /etc/SuSE-release 
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1

nautilus:~ # uname -a
Linux nautilus 2.6.32.29-0.3.1.2687.3.PTF.607050.iommu-default #1 SMP 2011-02-25 13:36:59 +0100 x86_64 x86_64 x86_64 GNU/Linux

nautilus:~ # cd /usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu

nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # make cloneconfig
Cloning configuration file /proc/config.gz
...

nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # make prepare
scripts/kconfig/conf -s arch/x86/Kconfig
  CHK include/linux/version.h
  UPD include/linux/version.h


nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # make scripts
  HOSTCC  scripts/genksyms/genksyms.o
  SHIPPED scripts/genksyms/lex.c


nautilus:/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu # cd /root/lustre-1.8.5

nautilus:~/lustre-1.8.5 # ./configure --disable-server \
--with-linux=/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu \
--with-linux-obj=/usr/src/linux-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu-obj/x86_64/default \
--with-linux-config=/boot/config-2.6.32.29-0.3.1.2687.3.PTF.607050.iommu-default
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu


nautilus:~/lustre-1.8.5 # make rpms
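
The source and -obj paths above follow a pattern that can be derived from the kernel release string. Below is a rough sketch of that derivation, not something I ran as part of the build. The function names (kernel_flavor, kernel_src_dir, kernel_obj_dir) are made up for illustration, and the sketch assumes the flavor is whatever follows the last "-" in `uname -r`, which matches the two kernels shown in this thread but may not hold elsewhere. It only prints a configure command; it does not run it.

```shell
#!/bin/sh
# Sketch: derive SLES-style kernel source and object directories from a
# kernel release string such as "2.6.32.29-0.3-xen".
# Assumption: the flavor is the text after the last "-" in the release.

kernel_flavor() {
    # "2.6.32.29-0.3-xen" -> "xen"
    printf '%s\n' "${1##*-}"
}

kernel_src_dir() {
    # "2.6.32.29-0.3-xen" -> "/usr/src/linux-2.6.32.29-0.3"
    printf '/usr/src/linux-%s\n' "${1%-*}"
}

kernel_obj_dir() {
    # Arguments: kernel release, architecture (for example x86_64).
    printf '%s-obj/%s/%s\n' "$(kernel_src_dir "$1")" "$2" "$(kernel_flavor "$1")"
}

# Print (rather than run) a client-only configure command for the
# running kernel, so the paths can be inspected first.
release=$(uname -r)
arch=$(uname -m)
echo ./configure --disable-server \
    --with-linux="$(kernel_src_dir "$release")" \
    --with-linux-obj="$(kernel_obj_dir "$release" "$arch")"
```

If the printed directories do not exist on the system, the assumption about the naming pattern does not apply and the paths need to be set by hand.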

-- 
Rick Mohr
HPC Systems Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu/

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-13 Thread peter.chiu
Dear Andreas,

I wonder if there is any further advice you can kindly offer as to how to 
troubleshoot the failure in bringing up lustre module?

Many thanks.

Peter


Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-11 Thread peter.chiu
 [  104.171341] Modules linked in: lnet(N+) lvfs(N) libcfs(N) iptable_nat 
 nf_nat xt_tcpudp xt_pkttype ipt_LOG xt_limit autofs4 binfmt_misc microcode 
 xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter 
 nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 
 ip_tables ip6_tables x_tables fuse loop dm_mod joydev rtc_core rtc_lib xennet 
 ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
 [  104.171373] Supported: Yes
 [  104.171376] Pid: 3441, comm: modprobe Tainted: G  N  
 2.6.32.29-0.3-xen #1 
 [  104.171379] RIP: e030:[8002c3d2]  [8002c3d2] 
 task_rq_lock+0x42/0xa0
 [  104.171384] RSP: e02b:88007edade38  EFLAGS: 00010082
 [  104.171387] RAX: 0001 RBX: 9700 RCX: 
 dead00100100
 [  104.171390] RDX:  RSI: 88007edade88 RDI: 
 
 [  104.171393] RBP: 88007edade58 R08: a0252fb6 R09: 
 
 [  104.171396] R10: 0001 R11: 805f4200 R12: 
 9700
 [  104.171399] R13:  R14: 88007edade88 R15: 
 000f
 [  104.171406] FS:  7f541715a700() GS:8800013c1000() 
 knlGS:
 [  104.171409] CS:  e033 DS:  ES:  CR0: 8005003b
 [  104.171412] CR2: 0008 CR3: 7d905000 CR4: 
 2660
 [  104.171415] DR0:  DR1:  DR2: 
 
 [  104.171418] DR3:  DR6: 0ff0 DR7: 
 0400
 [  104.171421] Process modprobe (pid: 3441, threadinfo 88007edac000, task 
 88007df8a400)
 [  104.171424] Stack:
 [  104.171426]  a02579f8  00623da0 
 00623d30
 [  104.171430] 0 88007edadeb8 80038588 7fc11fa0 
 a02579f8
 [  104.171435] 0 a0243060  0001 
 a02579f8
 [  104.171441] Call Trace:
 [  104.171449]  [80038588] try_to_wake_up+0x48/0x420
 [  104.171455]  [8005b2e8] up+0x48/0x50
 [  104.171464]  [a0230d92] LNetInit+0x92/0xc0 [lnet]
 [  104.171478]  [a02430ac] init_lnet+0x4c/0x280 [lnet]
 [  104.171489]  [80004045] do_one_initcall+0x35/0x1b0
 [  104.171495]  [8006d154] sys_init_module+0xe4/0x270
 [  104.171500]  [80007458] system_call_fastpath+0x16/0x1b
 [  104.171506]  [7f5416cf3f7a] 0x7f5416cf3f7a
 [  104.171508] Code: 1c 24 49 89 f6 4c 89 64 24 08 49 c7 c4 00 97 00 00 65 8a 
 04 25 c1 67 00 00 65 c6 04 25 c1 67 00 00 01 0f b6 c0 4c 89 e3 49 89 06 49 
 8b 45 08 8b 40 18 48 03 1c c5 80 ae 62 80 48 89 df e8 f7 87 
 [  104.171544] RIP  [8002c3d2] task_rq_lock+0x42/0xa0
 [  104.171548]  RSP 88007edade38
 [  104.171550] CR2: 0008
 [  104.171553] ---[ end trace 34c6e019e0aea7d2 ]---
 [  106.380129] SFW2-INext-DROP-DEFLT IN=eth0 OUT= 
 MAC=01:00:5e:00:00:01:00:17:f2:0e:c4:a1:08:00 SRC=130.246.188.58 
 DST=224.0.0.1 LEN=44 TOS=0x00 PREC=0x00 TTL=1 ID=27534 PROTO=UDP SPT=54228 
 DPT=8612 LEN=24 
cmip-proc8:~ #



Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-11 Thread Andreas Dilger
 NULL pointer dereference at 
 0008
 [  104.171317] IP: [8002c3d2] task_rq_lock+0x42/0xa0
 [  104.171328] PGD 7d9d0067 PUD 7d94c067 PMD 0 
 [  104.171333] Oops:  [#1] SMP 
 [  104.171336] last sysfs file: /sys/module/ip_tables/initstate
 [  104.171339] CPU 0
 [  104.171341] Modules linked in: lnet(N+) lvfs(N) libcfs(N) iptable_nat 
 nf_nat xt_tcpudp xt_pkttype ipt_LOG xt_limit autofs4 binfmt_misc microcode 
 xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter 
 nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 
 ip_tables ip6_tables x_tables fuse loop dm_mod joydev rtc_core rtc_lib 
 xennet ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
 [  104.171373] Supported: Yes
 [  104.171376] Pid: 3441, comm: modprobe Tainted: G  N  
 2.6.32.29-0.3-xen #1 
 [  104.171379] RIP: e030:[8002c3d2]  [8002c3d2] 
 task_rq_lock+0x42/0xa0
 [  104.171384] RSP: e02b:88007edade38  EFLAGS: 00010082
 [  104.171387] RAX: 0001 RBX: 9700 RCX: 
 dead00100100
 [  104.171390] RDX:  RSI: 88007edade88 RDI: 
 
 [  104.171393] RBP: 88007edade58 R08: a0252fb6 R09: 
 
 [  104.171396] R10: 0001 R11: 805f4200 R12: 
 9700
 [  104.171399] R13:  R14: 88007edade88 R15: 
 000f
 [  104.171406] FS:  7f541715a700() GS:8800013c1000() 
 knlGS:
 [  104.171409] CS:  e033 DS:  ES:  CR0: 8005003b
 [  104.171412] CR2: 0008 CR3: 7d905000 CR4: 
 2660
 [  104.171415] DR0:  DR1:  DR2: 
 
 [  104.171418] DR3:  DR6: 0ff0 DR7: 
 0400
 [  104.171421] Process modprobe (pid: 3441, threadinfo 88007edac000, 
 task 88007df8a400)
 [  104.171424] Stack:
 [  104.171426]  a02579f8  00623da0 
 00623d30
 [  104.171430] 0 88007edadeb8 80038588 7fc11fa0 
 a02579f8
 [  104.171435] 0 a0243060  0001 
 a02579f8
 [  104.171441] Call Trace:
 [  104.171449]  [80038588] try_to_wake_up+0x48/0x420
 [  104.171455]  [8005b2e8] up+0x48/0x50
 [  104.171464]  [a0230d92] LNetInit+0x92/0xc0 [lnet]
 [  104.171478]  [a02430ac] init_lnet+0x4c/0x280 [lnet]
 [  104.171489]  [80004045] do_one_initcall+0x35/0x1b0
 [  104.171495]  [8006d154] sys_init_module+0xe4/0x270
 [  104.171500]  [80007458] system_call_fastpath+0x16/0x1b
 [  104.171506]  [7f5416cf3f7a] 0x7f5416cf3f7a
 [  104.171508] Code: 1c 24 49 89 f6 4c 89 64 24 08 49 c7 c4 00 97 00 00 65 
 8a 04 25 c1 67 00 00 65 c6 04 25 c1 67 00 00 01 0f b6 c0 4c 89 e3 49 89 06 
 49 8b 45 08 8b 40 18 48 03 1c c5 80 ae 62 80 48 89 df e8 f7 87 
 [  104.171544] RIP  [8002c3d2] task_rq_lock+0x42/0xa0
 [  104.171548]  RSP 88007edade38
 [  104.171550] CR2: 0008
 [  104.171553] ---[ end trace 34c6e019e0aea7d2 ]---
 [  106.380129] SFW2-INext-DROP-DEFLT IN=eth0 OUT= 
 MAC=01:00:5e:00:00:01:00:17:f2:0e:c4:a1:08:00 SRC=130.246.188.58 
 DST=224.0.0.1 LEN=44 TOS=0x00 PREC=0x00 TTL=1 ID=27534 PROTO=UDP SPT=54228 
 DPT=8612 LEN=24 
 cmip-proc8:~ #
 
 

Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-11 Thread peter.chiu
Understood, Andreas,

Just to add that the same approach works for SLES 11 using a xen 
kernel (2.6.27.54-0.2-xen).
The Lustre Client rpms work okay:

cmip-proc1:~ # cat /etc/issue

Welcome to SUSE Linux Enterprise Server 11 (x86_64) - Kernel \r (\l).

cmip-proc1:~ # uname -a
Linux cmip-proc1 2.6.27.54-0.2-xen #1 SMP 2010-10-19 18:40:07 +0200 x86_64 
x86_64 x86_64 GNU/Linux
cmip-proc1:~ # df -h /disks/ceda1
Filesystem            Size  Used Avail Use% Mounted on
130.246.191.64:130.246.191.65@tcp0:/ceda1
                       51T  130G   48T   1% /disks/ceda1


SLES 11 SP1 is a service pack update to SLES 11 (now on 2.6.32.29-0.3-xen).

Is it possible to find out what the problem is? 

Regards,
Peter


-Original Message-
From: Andreas Dilger [mailto:adil...@whamcloud.com] 
Sent: 11 May 2011 10:11
To: Chiu, Peter (STFC,RAL,RALSP)
Cc: lustre-discuss@lists.lustre.org; Chiu, Peter (STFC,RAL,RALSP)
Subject: Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

The only other potential problem I see is that you are using a xen kernel and 
this is somehow causing problems. 

Cheers, Andreas

On 2011-05-11, at 1:33 AM, peter.c...@stfc.ac.uk wrote:

 Dear Andreas,
 
 Many thanks for your response.
 
 Below are further details on this.
 
 I shall be grateful for your advice on this.
 
 Regards,
 
 Peter
 
 
 The system is:
 
 cmip-proc8:/etc # uname -a
 Linux cmip-proc8.badc.rl.ac.uk 2.6.32.29-0.3-xen #1 SMP 2011-02-25 13:36:59 
 +0100 x86_64 x86_64 x86_64 GNU/Linux
 
 /usr/src/linux is a symlink pointing to the source corresponding to 
 linux-2.6.32.29-0.3-obj:
 
 cmip-proc8:/etc # ls -l /usr/src
 total 24
 drwxr-xr-x  3 root root 4096 2011-05-09 08:31 debug
 lrwxrwxrwx  1 root root   19 2011-03-20 15:54 linux -> linux-2.6.32.29-0.3
 drwxr-xr-x 25 root root 4096 2011-05-09 08:49 linux-2.6.32.29-0.3
 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-2.6.32.29-0.3-obj
 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-obj
 drwxr-xr-x 10 root root 4096 2011-05-09 08:31 lustre-1.8.5
 drwxr-xr-x  7 root root 4096 2011-03-20 14:58 packages
 cmip-proc8:/etc #
 
 cmip-proc8:~ # ls /usr/local/kits/lustre-1.8.5
 
 aclocal.m4       config.h.in    install-sh           Makefile
 autoMakefile     config.log     ldiskfs              Makefile.in
 autoMakefile.am  config.status  libsysio             missing
 autoMakefile.in  config.sub     lnet                 mkinstalldirs
 build            configure      lustre               README
 ChangeLog        configure.ac   lustre-1.8.5.tar.gz  Rules
 compile          COPYING        lustre-iokit         snmp
 config.guess     debian         lustre.spec          stamp-h1
 config.h         depcomp        lustre.spec.in       tree_status
 cmip-proc8:~ #
 
 The build with ./configure and make rpms produced rpms that are installable:
 
 cmip-proc8:/etc # ls -ls /usr/src/packages/RPMS/x86_64/*1.8.5*
 4024 -rw-r--r-- 1 root root  4112883 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 15532 -rw-r--r-- 1 root root 15881360 2011-05-09 08:54 /usr/src/packages/RPMS/x86_64/lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 1332 -rw-r--r-- 1 root root  1358924 2011-05-09 08:54 /usr/src/packages/RPMS/x86_64/lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 1416 -rw-r--r-- 1 root root  1441937 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 3524 -rw-r--r-- 1 root root  3602163 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 2600 -rw-r--r-- 1 root root  2656393 2011-05-09 08:53 /usr/src/packages/RPMS/x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 
 
 cmip-proc8:/etc # rpm -e lustre-tests
 cmip-proc8:/etc # rpm -e lustre
 cmip-proc8:/etc # rpm -e lustre-modules
 cmip-proc8:/etc # rpm -ivh 
 /usr/src/packages/RPMS/x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 Preparing...### [100%]
   1:lustre-modules ### [100%]
 Congratulations on finishing your Lustre installation!  To register
 your copy of Lustre and find out more about Lustre Support, Service,
 and Training offerings please visit
 
 http://www.sun.com/software/products/lustre/lustre_reg.jsp
 cmip-proc8:/etc # rpm -ivh 
 /usr/src/packages/RPMS/x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 Preparing...### [100%]
   1:lustre ### [100%]
 cmip-proc8:/etc # rpm -ivh 
 /usr/src/packages/RPMS/x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
 Preparing

Re: [Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-10 Thread Andreas Dilger
On May 9, 2011, at 11:38, peter.c...@stfc.ac.uk peter.c...@stfc.ac.uk wrote:
 The rpms lustre-modules, lustre and lustre-tests were then installed smoothly 
 without any complaints.
  
 But the subsequent “modprobe lustre” will return a “Killed” message, with no 
 lustre module loaded.
  
 dmesg also reveals  “BUG: unable to handle kernel NULL pointer dereference at 
 0008”
  
 A second modprobe lustre command will then hang, again with no module loaded.
 Subsequently the client is not able to mount the lustre storage.
  
 Can anyone shed some light as to what has gone wrong here please?
  
 ./configure --with-linux=/usr/src/linux 
 --with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen

Are you sure that /usr/src/linux points to the same source as 
/usr/src/linux-2.6.32.29-0.3-obj?  Is that a symlink?  Normally the source 
and -obj files have a very similar pathname (i.e. just with -obj suffix 
difference).
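
One way to check this from the shell, as a rough sketch only: resolve where /usr/src/linux really points and confirm that the -obj path passed to configure begins with that same directory plus "-obj". The helper names (resolve_dir, obj_matches_src) are made up for illustration, and the "-obj" suffix rule is the SLES naming described above, taken as an assumption. The object path is the one quoted in this thread.

```shell
#!/bin/sh
# Sketch: check that a kernel source directory and an object directory
# belong together, i.e. the object path starts with "<source>-obj".
# Assumption: SLES naming, where the two differ only by the -obj suffix.

resolve_dir() {
    # Follow symlinks and print the physical directory; fails if missing.
    (cd -P "$1" 2>/dev/null && pwd -P)
}

obj_matches_src() {
    # Arguments: resolved source directory, object directory.
    case "$2" in
        "$1"-obj|"$1"-obj/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Paths as quoted in this thread; adjust for the system being checked.
src=$(resolve_dir /usr/src/linux) || src=""
obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen

if [ -n "$src" ] && obj_matches_src "$src" "$obj"; then
    echo "source and object trees match: $src"
else
    echo "mismatch or missing source tree: src='$src' obj='$obj'"
fi
```

A match here only says the two paths are named consistently. It does not prove the trees were configured for the running kernel.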

  [  168.647996] BUG: unable to handle kernel NULL pointer dereference at 
  0008
  [  168.648066] Pid: 3445, comm: modprobe Tainted: G  N  
  2.6.32.29-0.3-xen #1
 0400
  [  168.648110] Process modprobe (pid: 3445, threadinfo 88007efa4000, 
  task 88007e9100c0)
  [  168.648129] Call Trace:
  [  168.648138]  [80038588] try_to_wake_up+0x48/0x420
  [  168.648143]  [8005b2e8] up+0x48/0x50
  [  168.648153]  [a0230d92] LNetInit+0x92/0xc0 [lnet]
  [  168.648167]  [a02430ac] init_lnet+0x4c/0x280 [lnet]
  [  168.648178]  [80004045] do_one_initcall+0x35/0x1b0
  [  168.648184]  [8006d154] sys_init_module+0xe4/0x270
  [  168.648189]  [80007458] system_call_fastpath+0x16/0x1b
  [  168.648194]  [7f3f40bc9f7a] 0x7f3f40bc9f7a
  
 I have tried Lustre-1.8.4, but got the same result.
 I have also tried to follow the 1.8 Operations Manual to locate the 
 diagnostic tools, but the link wiki.lustre.org is no longer valid.

This looks like a pretty serious error to oops during module insertion, and I'd 
suspect the build environment before any particular Lustre code.

Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.





[Lustre-discuss] SLES 11 SP1 Client rpms built but not working

2011-05-09 Thread peter.chiu
Hi all,

I used the method described below to build client rpms with the source kit 
lustre-1.8.5.tar.gz.

There was only one error reported during the make rpms, relating to 
lustre-iokit-1.2-root,
but the rpms were built under /usr/src/packages/RPMS/x86_64.

The rpms lustre-modules, lustre and lustre-tests were then installed smoothly 
without any complaints.

But the subsequent "modprobe lustre" will return a "Killed" message, with no 
lustre module loaded.

dmesg also reveals "BUG: unable to handle kernel NULL pointer dereference at 
0008"

A second modprobe lustre command will then hang, again with no module loaded.
Subsequently the client is not able to mount the lustre storage.

Can anyone shed some light as to what has gone wrong here please?

Many thanks.

Regards,

Peter Chiu
STFC Rutherford Appleton Laboratory
Space Science & Technology Department
Building R25, Room 2.02
Chilton
Didcot
OXON
OX11 0QX
UK

Phone:  01235-446699
Fax:  01235-445848
Email:   peter.c...@stfc.ac.uk

Details:
===

Client host cmip-proc8:  cat /etc/issue:

Welcome to SUSE Linux Enterprise Server 11 SP1  (x86_64) - Kernel \r (\l).

cmip-proc8:~ # uname -a
Linux cmip-proc8.badc.rl.ac.uk 2.6.32.29-0.3-xen #1 SMP 2011-02-25 13:36:59 
+0100 x86_64 x86_64 x86_64 GNU/Linux

Install kit from:
cd /usr/local/kits/lustre-1.8.5

ls -ls /usr/src/
4 drwxr-xr-x  3 root root 4096 2011-05-09 08:31 debug
0 lrwxrwxrwx  1 root root   19 2011-03-20 15:54 linux -> linux-2.6.32.29-0.3
4 drwxr-xr-x 25 root root 4096 2011-05-09 08:49 linux-2.6.32.29-0.3
4 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-2.6.32.29-0.3-obj
4 drwxr-xr-x  3 root root 4096 2011-03-20 15:54 linux-obj
4 drwxr-xr-x 10 root root 4096 2011-05-09 08:31 lustre-1.8.5
4 drwxr-xr-x  7 root root 4096 2011-03-20 14:58 packages

Install command:

./configure --with-linux=/usr/src/linux \
--with-linux-obj=/usr/src/linux-2.6.32.29-0.3-obj/x86_64/xen
make rpms

One error recorded:
+ ./configure --prefix=/usr
configure: error: cannot find install-sh or install.sh in . ./.. ./../..
error: Bad exit status from /var/tmp/rpm-tmp.51316 (%build)

RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.51316 (%build)
make[1]: *** [rpms] Error 1
make[1]: Leaving directory `/usr/local/kits/lustre-1.8.5/lustre-iokit'

By trial and error, this error can be avoided if I rsync 
/usr/local/kits/lustre-1.8.5/lustre-iokit 
/usr/src/packages/BUILD/lustre-iokit-1.2

Anyway, rpms are built under:

cmip-proc8:/usr/local/kits/lustre-1.8.5 # ls /usr/src/packages/RPMS//x86_64/*1.8.5*
/usr/src/packages/RPMS//x86_64/lustre-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm
/usr/src/packages/RPMS//x86_64/lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815.x86_64.rpm

No errors when installing these rpms:

cmip-proc8:/usr/local/kits/lustre-1.8.5 # rpm -qa | grep lustre
lustre-debuginfo-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-modules-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-debugsource-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-tests-1.8.5-2.6.32.29_0.3_xen_201105090815
lustre-source-1.8.5-2.6.32.29_0.3_xen_201105090815


To check and load lustre module - none found

cmip-proc8:~ # lsmod | grep lustre
cmip-proc8:~ # modprobe lustre
Killed
cmip-proc8:~ # lsmod | grep lustre
cmip-proc8:~ # modprobe lustre &
[1] 3454
cmip-proc8:~ #
cmip-proc8:~ # ps auxw | grep lustre
root      3454  0.0  0.0   3940   624 pts/1    S    18:04   0:00 modprobe lustre

Dmesg records this error after the first modprobe lustre command:

cmip-proc8:/usr/local/kits/lustre-1.8.5 # diff /tmp/d1 /tmp/d2
195a196,250
 [  168.647996] BUG: unable to handle kernel NULL pointer dereference at 
 0008
 [  168.648006] IP: [8002c3d2] task_rq_lock+0x42/0xa0
 [  168.648018] PGD 7fac4067 PUD 7ef4c067 PMD 0
 [  168.648023] Oops:  [#1] SMP
 [  168.648026] last sysfs file: /sys/module/ip_tables/initstate
 [  168.648028] CPU 0
 [  168.648030] Modules linked in: lnet(N+) lvfs(N) libcfs(N) iptable_nat 
 nf_nat xt_tcpudp xt_pkttype ipt_LOG xt_limit autofs4 binfmt_misc microcode 
 xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter 
 nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 
 ip_tables ip6_tables x_tables fuse loop dm_mod joydev rtc_core rtc_lib xennet 
 ext3 mbcache jbd processor thermal_sys hwmon xenblk cdrom
 [  168.648063] Supported: Yes
 [  168.648066] Pid: 3445, comm: modprobe Tainted: G  N  
 2.6.32.29-0.3-xen #1
 [  168.648069] RIP: e030:[8002c3d2]  [8002c3d2]