Public bug reported:

SRU Justification:

[Impact]

Backport RDMA DMABUF functionality

NVIDIA is developing a high-performance networking solution with real
customers. The solution is built on the Ubuntu 22.04 LTS release and its
distro kernel (lowlatency flavour). This "dma_buf" patchset consists of
patches, upstreamed in kernels 5.16 and 5.17, that allow buffers to be
shared between drivers, improving performance by reducing data copying.

The new functionality is isolated such that existing users will not
execute these new code paths.

* The first three patches add a new API to the RDMA subsystem that allows
drivers to get a pinned dmabuf memory region without requiring an
implementation of the move_notify callback:

https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
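
For illustration, a driver would consume the new helper roughly as in the
minimal sketch below. The helper ib_umem_dmabuf_get_pinned() comes from the
linked series; the wrapper function around it is hypothetical:

#include <linux/err.h>
#include <rdma/ib_umem.h>

/* Sketch: import a dmabuf fd as a pinned RDMA memory region.
 * Because the pages are pinned, the exporter can never move them,
 * so the driver does not have to implement a move_notify callback. */
static struct ib_umem_dmabuf *example_get_pinned_umem(struct ib_device *dev,
						      unsigned long offset,
						      size_t size, int fd,
						      int access_flags)
{
	struct ib_umem_dmabuf *umem_dmabuf;

	umem_dmabuf = ib_umem_dmabuf_get_pinned(dev, offset, size, fd,
						access_flags);
	if (IS_ERR(umem_dmabuf))
		return umem_dmabuf;	/* e.g. -EOPNOTSUPP, -EINVAL */

	/* The dmabuf is now mapped and pinned; the umem can be handed
	 * to the device's normal MR/umem setup code. */
	return umem_dmabuf;
}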

* The remaining patches add support for DMABUF when creating a devx umem.
devx umems are quite similar to MRs, except they cannot be revoked, so this
uses the dmabuf pinned memory flow. Several mlx5dv flows require a umem and
cannot work with an MR:

https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
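
On the userspace side this surfaces through rdma-core's
mlx5dv_devx_umem_reg_ex(). A minimal sketch follows; the field and flag
names are as introduced in the accompanying rdma-core change, so treat this
as illustrative, and the dmabuf fd is assumed to come from an exporter such
as a GPU driver:

#include <stdint.h>
#include <infiniband/mlx5dv.h>

/* Sketch: create a devx umem backed by a dmabuf rather than
 * process virtual memory. */
static struct mlx5dv_devx_umem *
example_reg_dmabuf_umem(struct ibv_context *ctx, int dmabuf_fd, size_t size)
{
	struct mlx5dv_devx_umem_in umem_in = {
		.addr = NULL,		/* offset 0 within the dmabuf */
		.size = size,
		.access = IBV_ACCESS_LOCAL_WRITE,
		.pgsz_bitmap = UINT64_MAX, /* any page size the HW supports */
		.comp_mask = MLX5DV_UMEM_MASK_DMABUF,
		.dmabuf_fd = dmabuf_fd,
	};

	return mlx5dv_devx_umem_reg_ex(ctx, &umem_in);
}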

[Test Plan]

SW Configuration:
• Download the CUDA 12.2 runfile (https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local)
• Install with the open kernel modules, i.e. # sh ./cuda_12.2.2_535.104.05_linux.run -m=kernel-open
• Clone perftest from https://github.com/linux-rdma/perftest
• cd perftest
• export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
• export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
• run: ./autogen.sh ; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h ; make

# Start Server
$ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf

# Start Client
$ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost
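
Under the hood, perftest's --use_cuda_dmabuf path registers the CUDA
allocation through the standard verbs call ibv_reg_dmabuf_mr(), so a quick
standalone smoke test can be written against that API as well. A minimal
sketch, assuming a dmabuf fd has already been obtained for the GPU
allocation (e.g. via the CUDA driver API):

#include <errno.h>
#include <infiniband/verbs.h>

/* Sketch: on a kernel with the backport, registering a dmabuf-backed MR
 * succeeds; without it, the call fails (typically errno == EOPNOTSUPP). */
static int check_dmabuf_mr(struct ibv_pd *pd, int dmabuf_fd, size_t len)
{
	struct ibv_mr *mr = ibv_reg_dmabuf_mr(pd, 0 /* offset */, len,
					      0 /* iova */, dmabuf_fd,
					      IBV_ACCESS_LOCAL_WRITE);
	if (!mr)
		return -errno;
	ibv_dereg_mr(mr);
	return 0;
}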

[Where problems could occur]

The new functionality is isolated to new code paths; existing users will
not execute them, so regression risk is limited to applications that
explicitly opt into DMABUF-backed registrations.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete
