This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:
apport-collect 2040526
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.
** Changed in: linux (Ubuntu)
Status: New => Incomplete
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526
Title:
Backport DMABUF functionality
Status in linux package in Ubuntu:
Incomplete
Bug description:
SRU Justification:
[Impact]
Backport RDMA DMABUF functionality
Nvidia is developing a high-performance networking solution for real
customers. That solution is being developed on the Ubuntu 22.04 LTS
distro release using the distro kernel (lowlatency flavour).
The "dma_buf" patchset consists of patches that were merged into the
upstream kernel in 5.16 and 5.17. They allow buffers to be shared
between drivers, which improves performance by reducing copying of
data.
The new functionality is isolated such that existing users will not
execute these new code paths.
* The first 3 patches add a new API to the RDMA subsystem that allows
drivers to get a pinned dmabuf memory region without requiring an
implementation of the move_notify callback.
https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
* The remaining patches add support for DMABUF when creating a devx
umem. devx umems are quite similar to MRs except they cannot be
revoked, so this uses the dmabuf pinned memory flow. Several mlx5dv
flows require umem and cannot work with MR.
https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
[Test Plan]
SW Configuration:
• Download CUDA 12.2 run file
(https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local)
• Install using kernel-open, i.e. # sh ./cuda_12.2.2_535.104.05_linux.run -m=kernel-open
• Clone perftest from https://github.com/linux-rdma/perftest.
• cd perftest
• export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
• export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
• run: ./autogen.sh ; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h ; make
# Start Server
$ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf
# Start Client
$ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost
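The server and client invocations above differ only in the device name,
the CUDA device index and the trailing server host. A minimal sketch
that builds both command lines from those parameters is shown here. The
helper names server_cmd and client_cmd are hypothetical and not part of
perftest; the flags are only the ones listed in this test plan. The
script prints the commands rather than running them, so it can be
checked on a machine without the hardware.

```shell
#!/bin/sh
# Sketch only: server_cmd and client_cmd are hypothetical helpers.
# They print the ib_write_bw command lines from the test plan above.

# Usage: server_cmd DEVICE CUDA_INDEX
server_cmd() {
    printf './ib_write_bw -d %s -F --use_cuda=%s --use_cuda_dmabuf\n' "$1" "$2"
}

# Usage: client_cmd DEVICE CUDA_INDEX SERVER_HOST
# Same flags as the server, with the server host appended.
client_cmd() {
    printf '%s %s\n' "$(server_cmd "$1" "$2")" "$3"
}

# Device names and CUDA indices as given in the test plan.
server_cmd mlx5_2 0
client_cmd mlx5_3 1 localhost
```

To run the test instead of printing it, the output of each helper could
be passed to sh in two separate terminals from the perftest directory,
server first.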
[Where problems could occur?]
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions