Re: [lustre-discuss] Issue updating lustre from 2.10.6 to 2.10.7

2019-04-12 Thread Kurt Strosahl
Thanks, that was exactly what I needed to dig out the bad modules!






Re: [lustre-discuss] Issue updating lustre from 2.10.6 to 2.10.7

2019-04-12 Thread Jeff Johnson
Kurt,

I see you're using dkms. Take a look at `dkms status` and make sure there
are no lingering installs from the previous version. Sometimes the older
version's modules don't get removed from /lib/modules/`uname -r`/ and the
dkms install process for the new version doesn't overwrite them.
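
A rough sketch of that check, in case it helps anyone finding this thread
later. It assumes the dkms module is registered as "lustre-zfs" (a guess based
on the lustre-zfs-dkms package name) and that `dkms status` prints lines of
the form "name, version, kernel, arch: state" or "name/version, kernel, arch:
state". The function name stale_dkms_entries is made up for illustration.

```shell
#!/bin/sh
# Sketch only: report dkms entries for a module whose version is not the
# one we expect to be installed. Reads `dkms status` output on stdin.
#   $1 = dkms module name (assumed here to be "lustre-zfs")
#   $2 = the version that should be the only one present
stale_dkms_entries() {
    awk -v mod="$1" -v want="$2" '
        {
            line = $0
            sub(/:.*/, "", line)      # drop the ": installed" style suffix
            gsub(/\//, ", ", line)    # treat "name/version" like "name, version"
            n = split(line, f, /, */)
            if (n >= 2 && f[1] == mod && f[2] != want)
                print f[1], f[2]
        }'
}

# Only query dkms when it is actually present on this host.
if command -v dkms >/dev/null 2>&1; then
    dkms status | stale_dkms_entries lustre-zfs 2.10.7
fi
```

If this prints an older version, removing it with something like
`dkms remove lustre-zfs/2.10.6 --all` and then checking for leftover copies
with `find /lib/modules/$(uname -r) -name 'fld.ko*'` would be the next things
to try. Adjust the module name and versions to whatever `dkms status` really
shows on your system.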

--Jeff




-- 
--
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite C - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Issue updating lustre from 2.10.6 to 2.10.7

2019-04-12 Thread Kurt Strosahl
Good Morning,


I've encountered an issue updating Lustre from 2.10.6 to 2.10.7 on my
metadata system.


I installed the updated RPMs using Whamcloud's yum repository, but when I run
`modprobe -v lustre` I get the following errors:

insmod /lib/modules/3.10.0-957.1.3.el7_lustre.x86_64/extra/libcfs.ko.xz
insmod /lib/modules/3.10.0-957.1.3.el7_lustre.x86_64/extra/lnet.ko.xz networks=o2ib0(bond0)
insmod /lib/modules/3.10.0-957.1.3.el7_lustre.x86_64/extra/obdclass.ko.xz
insmod /lib/modules/3.10.0-957.1.3.el7_lustre.x86_64/extra/ptlrpc.ko.xz
insmod /lib/modules/3.10.0-957.1.3.el7_lustre.x86_64/extra/fld.ko.xz
modprobe: ERROR: could not insert 'lustre': Invalid argument

dmesg shows the following:

[ 377.981515] fld: disagrees about version of symbol class_export_put
[ 377.981518] fld: Unknown symbol class_export_put (err -22)
[ 377.981529] fld: disagrees about version of symbol req_capsule_server_pack
[ 377.981531] fld: Unknown symbol req_capsule_server_pack (err -22)
[ 377.981537] fld: disagrees about version of symbol req_capsule_set_size
[ 377.981538] fld: Unknown symbol req_capsule_set_size (err -22)
[ 377.981542] fld: disagrees about version of symbol req_capsule_client_get
[ 377.981543] fld: Unknown symbol req_capsule_client_get (err -22)
[ 377.981555] fld: disagrees about version of symbol lu_env_init
[ 377.981556] fld: Unknown symbol lu_env_init (err -22)
[ 377.981562] fld: disagrees about version of symbol ptlrpc_queue_wait
[ 377.981563] fld: Unknown symbol ptlrpc_queue_wait (err -22)
[ 377.981571] fld: disagrees about version of symbol lu_context_key_get
[ 377.981573] fld: Unknown symbol lu_context_key_get (err -22)
[ 377.981604] fld: disagrees about version of symbol lu_env_fini
[ 377.981606] fld: Unknown symbol lu_env_fini (err -22)
[ 377.981611] fld: disagrees about version of symbol lu_context_key_degister
[ 377.981612] fld: Unknown symbol lu_context_key_degister (err -22)
[ 377.981623] fld: disagrees about version of symbol class_exp2cliimp
[ 377.981625] fld: Unknown symbol class_exp2cliimp (err -22)
[ 377.981631] fld: disagrees about version of symbol req_capsule_set
[ 377.981632] fld: Unknown symbol req_capsule_set (err -22)


I've tried uninstalling and reinstalling the RPMs, but this is always the state
I end up in.


I'm using ZFS for the back end, and those modules work after the upgrade.


Here are the RPMs currently installed.

perf-3.10.0-957.1.3.el7_lustre.x86_64
lustre-resource-agents-2.10.7-1.el7.x86_64
lustre-osd-zfs-mount-2.10.7-1.el7.x86_64
kernel-headers-3.10.0-957.1.3.el7_lustre.x86_64
lustre-2.10.7-1.el7.x86_64
kernel-3.10.0-957.1.3.el7_lustre.x86_64
perf-debuginfo-3.10.0-957.el7_lustre.x86_64
lustre-zfs-dkms-2.10.7-1.el7.noarch
kernel-debuginfo-common-x86_64-3.10.0-957.el7_lustre.x86_64
kernel-devel-3.10.0-957.1.3.el7_lustre.x86_64
bpftool-3.10.0-957.el7_lustre.x86_64



Thank you,

Kurt J. Strosahl
System Administrator: Lustre, HPC
Scientific Computing Group, Thomas Jefferson National Accelerator Facility