Sorry for the long email; I wanted to share enough detail for the community to provide guidance. I am building all Lustre packages for Oracle Linux 7.9 (RHCK) and MOFED 5.3-1.0.0.1 using the steps described here: https://wiki.lustre.org/Compiling_Lustre
Oracle Linux 7.9 – Kernel: 3.10.0-1160.15.2.el7.x86_64

I was able to create the RPM packages below successfully on a node that has the same OS, kernel version, MOFED version, and MLNX CX-5 card, but when I try to install them on my Lustre nodes, I get a dependency failure related to ksym/MOFED packages (more details below).

1. LDISKFS and Patching the Linux Kernel
2. MOFED rpms
3. Lustre server rpms
4. Lustre client rpms

After all RPMs were created, I created a local repo and added it to all Lustre nodes:

cat > /etc/yum.repos.d/lustre.repo << EOF
[hpddLustreserver]
name=OL-Lustre-Server
baseurl=file:///home/opc/releases/lustre-server/
gpgcheck=0

[e2fsprogs]
name=CentOS- - Ldiskfs
baseurl=https://downloads.whamcloud.com/public/e2fsprogs/latest/el7/
gpgcheck=0

[hpddLustreclient]
name=OL-Lustre-Client
baseurl=file:///home/opc/releases/lustre-client/
gpgcheck=0

[LustreKernel]
name=LustreKernel
baseurl=file:///home/opc/releases/lustre-kernel/
gpgcheck=0

[MOFED]
name=MOFED
baseurl=file:///home/opc/releases/mofed/
gpgcheck=0
EOF

MOFED is installed and configured on those nodes, and I was able to validate it using the IMB-MPI1 pingpong test.
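One possible sanity check for the repo file above, offered only as a sketch and not something that was run for this report: yum can only see packages in a file:// repo if the directory holds metadata at repodata/repomd.xml (typically generated with createrepo). The helper name check_local_repos is hypothetical, and it inspects only baseurl=file:// lines, so the remote e2fsprogs repo is skipped.

```shell
# Hypothetical helper (not from the original post): for each
# "baseurl=file://..." line in a .repo file, report whether the directory
# contains yum metadata (repodata/repomd.xml).
# Returns 1 if any local repo directory lacks metadata, 0 otherwise.
check_local_repos() {
    clr_file=$1
    clr_status=0
    while IFS= read -r clr_dir; do
        # The here-document yields one empty line when nothing matched.
        [ -n "$clr_dir" ] || continue
        if [ -f "$clr_dir/repodata/repomd.xml" ]; then
            echo "ok $clr_dir"
        else
            echo "missing $clr_dir"
            clr_status=1
        fi
    done <<EOF
$(sed -n 's|^baseurl=file://||p' "$clr_file")
EOF
    return "$clr_status"
}

# Example call (path taken from the repo file above):
#   check_local_repos /etc/yum.repos.d/lustre.repo
```

A "missing" line would only mean that the directory has no repository metadata yet; it says nothing about whether the RPMs inside satisfy each other's dependencies.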
show_gids
mlx5_0 1 2 0000:0000:0000:0000:0000:ffff:c0a8:a985 192.168.169.133 v1 enp94s0f0

Dependency failure:

On OSS nodes, I ran the below to install all Lustre packages:

sudo yum install lustre-tests -y

[opc@inst-dwnv3-topical-goblin ~]$ sudo yum install -y lustre-tests
Loaded plugins: langpacks, ulninfo
LustreKernel | 2.9 kB 00:00:00
MOFED | 2.9 kB 00:00:00
e2fsprogs | 2.9 kB 00:00:00
hpddLustreclient | 2.9 kB 00:00:00
hpddLustreserver | 2.9 kB 00:00:00
MOFED/primary_db
--> Running transaction check
---> Package lustre-tests.x86_64 0:2.14.51-1.el7 will be installed
--> Processing Dependency: lustre-devel = 2.14.51 for package: lustre-tests-2.14.51-1.el7.x86_64
--> Processing Dependency: kmod-lustre-tests = 2.14.51 for package: lustre-tests-2.14.51-1.el7.x86_64
--> Processing Dependency: kmod-lustre = 2.14.51 for package: lustre-tests-2.14.51-1.el7.x86_64
--> Processing Dependency: lustre-iokit for package: lustre-tests-2.14.51-1.el7.x86_64
--> Processing Dependency: liblustreapi.so.1()(64bit) for package: lustre-tests-2.14.51-1.el7.x86_64
--> Processing Dependency: liblnetconfig.so.4()(64bit) for package: lustre-tests-2.14.51-1.el7.x86_64
--> Running transaction check
---> Package kmod-lustre.x86_64 0:2.14.51-1.el7 will be installed
….. ….
---> Package libcom_err.x86_64 0:1.45.4-3.0.5.el7 will be updated
---> Package libcom_err.x86_64 0:1.46.2.wc1-0.el7 will be an update
---> Package libss.x86_64 0:1.45.4-3.0.5.el7 will be updated
---> Package libss.x86_64 0:1.46.2.wc1-0.el7 will be an update
--> Finished Dependency Resolution
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(ib_map_mr_sg) = 0xcd1ffb73
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(rdma_resolve_route) = 0xc2064869
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(ib_unregister_event_handler) = 0xc58881d0
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(ib_query_port) = 0x6889b87f
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(rdma_disconnect) = 0x49262e62
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(rdma_connect_locked) = 0x7eaa4a8a
…. ….
All ib/rdma related errors similar to above for kmod-lustre.x
….
Error: Package: kmod-lustre-2.14.51-1.el7.x86_64 (hpddLustreserver)
           Requires: ksym(ib_destroy_cq_user) = 0x5671830b
 You could try using --skip-broken to work around the problem
** Found 3 pre-existing rpmdb problem(s), 'yum check' output follows:
oracle-cloud-agent-1.11.1-5104.el7.x86_64 is a duplicate with oracle-cloud-agent-1.8.2-3843.el7.x86_64
rdma-core-devel-52mlnx1-1.53100.x86_64 has missing requires of pkgconfig(libnl-3.0)
rdma-core-devel-52mlnx1-1.53100.x86_64 has missing requires of pkgconfig(libnl-route-3.0)
[opc@inst-dwnv3-topical-goblin ~]$

RPMS from: LDISKFS and Patching the Linux Kernel

ls lustre-kernel/RPMS/

* bpftool-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* bpftool-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debug-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debug-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debug-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-debuginfo-common-x86_64-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-headers-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-libs-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* kernel-tools-libs-devel-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* perf-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* perf-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* python-perf-3.10.0-1160.15.2.el7_lustre.x86_64.rpm
* python-perf-debuginfo-3.10.0-1160.15.2.el7_lustre.x86_64.rpm

MOFED rpms

Steps followed: download the source from the MLNX site (MLNX_OFED_SRC-5.3-1.0.0.1.tgz), then:

tar -zvxf $HOME/MLNX_OFED_SRC-5.3-1.0.0.1.tgz
cd MLNX_OFED_SRC-5.3-1.0.0.1/
./install.pl --build-only --kernel-only \
    --kernel 3.10.0-1160.15.2.el7.x86_64 \
    --kernel-sources /usr/src/kernels/3.10.0-1160.15.2.el7.x86_64
cp RPMS/*/*/*.rpm $HOME/releases/mofed

Question: I am passing the regular kernel (3.10.0-1160.15.2.el7.x86_64) and its source (not the Lustre-patched kernel) as input to the MOFED install command above. I hope that is correct.

* kernel-mft-4.16.3-12.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* knem-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.ol7u9.x86_64.rpm
* knem-modules-1.1.4.90mlnx1-OFED.5.1.2.5.0.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* mlnx-nfsrdma-5.3-OFED.5.3.0.3.8.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* mlnx-nfsrdma-debuginfo-5.3-OFED.5.3.0.3.8.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* mlnx-ofa_kernel-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
* mlnx-ofa_kernel-debuginfo-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
* mlnx-ofa_kernel-devel-5.3-OFED.5.3.1.0.0.1.ol7u9.x86_64.rpm
* mlnx-ofa_kernel-modules-5.3-OFED.5.3.1.0.0.1.kver.3.10.0_1160.15.2.el7.x86_64.x86_64.rpm
* ofed-scripts-5.3-OFED.5.3.1.0.0.x86_64.rpm

Lustre Server packages

./configure --enable-server \
    --with-linux=/usr/src/kernels/*_lustre.x86_64 \
    --with-o2ib=/usr/src/ofa_kernel/default
make rpms

* kmod-lustre-2.14.51-1.el7.x86_64.rpm
* kmod-lustre-osd-ldiskfs-2.14.51-1.el7.x86_64.rpm
* kmod-lustre-tests-2.14.51-1.el7.x86_64.rpm
* lustre-2.14.51-1.el7.x86_64.rpm
* lustre-2.14.51-1.src.rpm
* lustre-debuginfo-2.14.51-1.el7.x86_64.rpm
* lustre-devel-2.14.51-1.el7.x86_64.rpm
* lustre-iokit-2.14.51-1.el7.x86_64.rpm
* lustre-osd-ldiskfs-mount-2.14.51-1.el7.x86_64.rpm
* lustre-resource-agents-2.14.51-1.el7.x86_64.rpm
* lustre-tests-2.14.51-1.el7.x86_64.rpm

Lustre Client packages

./configure --disable-server --enable-client \
    --with-linux=/usr/src/kernels/*_lustre.x86_64 \
    --with-o2ib=/usr/src/ofa_kernel/default
make rpms

* kmod-lustre-client-2.14.51-1.el7.x86_64.rpm
* kmod-lustre-client-tests-2.14.51-1.el7.x86_64.rpm
* lustre-2.14.51-1.src.rpm
* lustre-client-2.14.51-1.el7.x86_64.rpm
* lustre-client-debuginfo-2.14.51-1.el7.x86_64.rpm
* lustre-client-devel-2.14.51-1.el7.x86_64.rpm
* lustre-client-tests-2.14.51-1.el7.x86_64.rpm
* lustre-iokit-2.14.51-1.el7.x86_64.rpm

Thanks,
Pinkesh Valdria
Principal Solutions Architect – HPC
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA
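
To help narrow down the ksym errors above, a comparison along the following lines may be useful. It is a minimal sketch and was not run for this report. It assumes the ksym requirements of kmod-lustre and the symbols provided on the node have each been dumped to a text file with rpm's query options (--requires and --provides). The function name missing_ksyms and the file names required.txt and provided.txt are hypothetical.

```shell
# Hypothetical helper (not from the original post): print every
# "ksym(name) = 0xCRC" line from the required list that has no exact
# match (same symbol and same checksum) in the provided list.
# Trailing whitespace is ignored on both sides.
missing_ksyms() {
    # $1 = file with required entries, $2 = file with provided entries
    awk '
        FILENAME == ARGV[1] { sub(/[ \t]+$/, ""); have[$0] = 1; next }
        /^ksym\(/           { sub(/[ \t]+$/, ""); if (!($0 in have)) print }
    ' "$2" "$1"
}

# Possible way to produce the two input files (file names are illustrative):
#   rpm -qp --requires kmod-lustre-2.14.51-1.el7.x86_64.rpm > required.txt
#   rpm -qa --provides > provided.txt
#   missing_ksyms required.txt provided.txt
```

If a symbol is listed as missing while the same symbol name appears in provided.txt with a different checksum, that would suggest the Lustre modules were built against a different set of OFED kernel modules than the one installed on the node. If the symbol name does not appear at all, no installed package declares it.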
