Kumar

Self-signup on the Whamcloud JIRA had to be deactivated about ten years ago
after repeated hacking attempts via that option. You are by no means the first
person to be confused by the default UI around this, but there is a message on
the front page that tells you to email [email protected] to get an account.

Peter

From: lustre-discuss <[email protected]> on behalf of 
Kumar Nachiketa via lustre-discuss <[email protected]>
Date: Sunday, May 10, 2026 at 7:05 AM
To: [email protected] <[email protected]>
Subject: [lustre-discuss] [BUG] osd-zfs configure writes wrong 
KBUILD_EXTRA_SYMBOLS path against OpenZFS 2.4+ (Module.symvers moved)

Hi all,

First post to the list. I'm Kumar Nachiketa, a storage practitioner
running a small personal lab on NVIDIA DGX Spark hardware (Grace UMA
workstation, ARM64, Ubuntu 24.04 LTS, kernel 6.17.0-1014-nvidia). This
is a personal-capacity project on my own hardware, not affiliated with
my employer.

I tried jira.whamcloud.com self-signup but the registration page
redirects to a contact-administrators form that isn't configured, so
I'm filing on the list. Happy to refile in Jira if a maintainer can
help me get an account, or if a triager prefers to relay it there.

This is the first of three bugs I'll send separately.


Summary
-------

Lustre master ./configure --with-zfs=<zfs-source-root> writes
KBUILD_EXTRA_SYMBOLS pointing at <zfs-source-root>/Module.symvers, but
OpenZFS 2.4+ relocated this file to <zfs-source-root>/module/Module.symvers
as part of its build-tree reorganization. The osd-zfs build then proceeds
with no Module.symvers reference; kbuild emits

WARNING: Symbol version dump "<path>/Module.symvers" is missing

and the resulting osd_zfs.ko has unresolved symbols and fails to load
(Unknown symbol against zfs/spl symbols at insmod time).


Environment
-----------

  Kernel:    6.17.0-1014-nvidia (NVIDIA-signed Ubuntu kernel)
  OS:        Ubuntu 24.04.1 LTS ARM64
  OpenZFS:   2.4.1 (release tag)
  Lustre:    master @ 805cece6747f442449f32a1d25a8b8a03b230875
  Hardware:  NVIDIA DGX Spark (Grace UMA workstation, ARM64)


Reproduction
------------

Step 1 — Build OpenZFS 2.4:

  git clone https://github.com/openzfs/zfs.git && cd zfs
  git checkout zfs-2.4.1
  ./autogen.sh
  ./configure --with-linux=/lib/modules/$(uname -r)/build \
              --with-linux-obj=/lib/modules/$(uname -r)/build
  make -j$(nproc)

At this point Module.symvers is at module/Module.symvers, NOT at the
source root:

  ls Module.symvers          # -> ENOENT
  ls module/Module.symvers   # -> present

Step 2 — Build Lustre against it:

  cd ../
  git clone https://git.whamcloud.com/fs/lustre-release.git
  cd lustre-release
  git checkout 805cece6747f442449f32a1d25a8b8a03b230875
  sh autogen.sh
  ./configure --with-linux=/lib/modules/$(uname -r)/build \
              --with-zfs=/path/to/zfs \
              --disable-ldiskfs \
              --with-o2ib=/lib/modules/$(uname -r)/build
  make -j$(nproc)

The osd-zfs build emits the Symbol-version-dump warning; the resulting
osd_zfs.ko has unresolved zfs/spl symbols and fails to load.
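
In a parallel build the warning scrolls past quickly, so it can help to
capture the build output and search it afterwards. A minimal sketch,
assuming the output was saved to a file; the function name
has_symvers_warning and the file name build.log are hypothetical, and
the pattern is taken from the warning text quoted in the Summary:

```shell
# Hypothetical helper: succeed (exit 0) if the saved build log contains
# the kbuild "Symbol version dump ... is missing" warning.
# $1 is the path to the captured build output.
has_symvers_warning() {
    grep -q 'Symbol version dump ".*Module\.symvers" is missing' "$1"
}

# Example use (not run here):
#   make -j"$(nproc)" 2>&1 | tee build.log
#   has_symvers_warning build.log && echo "osd-zfs built without ZFS symbol versions"
```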


Expected behavior
-----------------

./configure --with-zfs=<path> resolves the OpenZFS symbol-versions file
regardless of whether the OpenZFS source tree uses the pre-2.4 layout
(<path>/Module.symvers) or the 2.4+ layout (<path>/module/Module.symvers).


Actual behavior
---------------

KBUILD_EXTRA_SYMBOLS is written with the pre-2.4 path unconditionally.
osd-zfs builds without symbol-version data; the resulting osd_zfs.ko
has unresolved zfs/spl symbols and fails to load.


Workaround (measured working)
-----------------------------

After the OpenZFS build completes and before Lustre ./configure, create
a symlink at the legacy location:

  ln -sf module/Module.symvers <zfs-source-root>/Module.symvers

Lustre ./configure then resolves Module.symvers correctly, osd-zfs
builds clean, and osd_zfs.ko loads. Validated end-to-end (Lustre
filesystem mounted, IO measured) in the linked reproduce kit.
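
The same workaround can be wrapped in a guarded shell function so that
it is safe to re-run and does nothing on a pre-2.4 tree. This is only a
sketch: the function name link_legacy_symvers is hypothetical, and its
argument is assumed to be the same path that is passed to --with-zfs.

```shell
# Hypothetical helper: create the legacy-location symlink only when needed.
# $1 is the OpenZFS source root (the path given to --with-zfs).
link_legacy_symvers() {
    zfs_root=$1
    # Pre-2.4 layout, or the link was already created: nothing to do.
    if [ -e "$zfs_root/Module.symvers" ]; then
        echo "already present: $zfs_root/Module.symvers"
        return 0
    fi
    # The 2.4+ file only exists after the OpenZFS build has completed.
    if [ ! -f "$zfs_root/module/Module.symvers" ]; then
        echo "missing: $zfs_root/module/Module.symvers (build OpenZFS first)" >&2
        return 1
    fi
    # Relative target, resolved from the directory that holds the link.
    ln -s module/Module.symvers "$zfs_root/Module.symvers"
    echo "linked: $zfs_root/Module.symvers -> module/Module.symvers"
}
```

Calling link_legacy_symvers /path/to/zfs between the OpenZFS build and
the Lustre ./configure step matches the ordering described above.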


Suggested fix
-------------

./configure should probe both candidate paths and use whichever exists:

  if test -f "$with_zfs/module/Module.symvers"; then
      ZFS_SYMBOLS_PATH="$with_zfs/module/Module.symvers"
  elif test -f "$with_zfs/Module.symvers"; then
      ZFS_SYMBOLS_PATH="$with_zfs/Module.symvers"
  else
      AC_MSG_ERROR([cannot locate ZFS Module.symvers under $with_zfs])
  fi

(Or equivalently in the osd-zfs Kbuild that consumes the variable.)
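
For trying the probe order outside of autoconf, the same logic can be
written as a plain shell function. This is an illustrative sketch, not
the Lustre configure code: the function name find_zfs_symvers is
hypothetical, and the 2.4+ location is checked first, as in the snippet
above.

```shell
# Hypothetical standalone probe: print the Module.symvers path found
# under the given OpenZFS source root, or fail with a message.
# $1 is the OpenZFS source root (the path given to --with-zfs).
find_zfs_symvers() {
    zfs_root=$1
    # Check the 2.4+ layout first, then the pre-2.4 layout.
    for candidate in "$zfs_root/module/Module.symvers" \
                     "$zfs_root/Module.symvers"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "cannot locate ZFS Module.symvers under $zfs_root" >&2
    return 1
}
```

Running find_zfs_symvers /path/to/zfs against both an old and a new
OpenZFS tree shows which path a fixed ./configure would be expected to
pick up.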


Reference
---------

Public reproduce kit (build cascade documented end-to-end):

https://github.com/knachiketa04/aihomelab/tree/main/artifacts/training/lustre-on-uma-workstations/reproduce

Thanks,
Kumar
[email protected]
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org