Public bug reported:

SRU Justification:

[Impact]

Backport RDMA DMABUF functionality

NVIDIA is developing a high-performance networking solution with real
customers. The solution is built on the Ubuntu 22.04 LTS release and its
distro kernel (lowlatency flavour). This "dma_buf" patchset consists of
patches, upstreamed in kernels 5.16 and 5.17, that allow buffers to be
shared between drivers, improving performance by reducing data copying.

The new functionality is isolated such that existing users will not
execute these new code paths.

* The first three patches add a new API to the RDMA subsystem that allows
drivers to get a pinned dmabuf memory region without requiring an
implementation of the move_notify callback:

https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
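
For illustration, a driver would consume the new helper roughly as in the
minimal sketch below. The helper ib_umem_dmabuf_get_pinned() comes from the
linked series; the wrapper function around it is hypothetical:

#include <linux/err.h>
#include <rdma/ib_umem.h>

/* Sketch: import a dmabuf fd as a pinned RDMA memory region.
 * Because the pages are pinned, the exporter can never move them,
 * so the driver does not have to implement a move_notify callback. */
static struct ib_umem_dmabuf *example_get_pinned_umem(struct ib_device *dev,
						      unsigned long offset,
						      size_t size, int fd,
						      int access_flags)
{
	struct ib_umem_dmabuf *umem_dmabuf;

	umem_dmabuf = ib_umem_dmabuf_get_pinned(dev, offset, size, fd,
						access_flags);
	if (IS_ERR(umem_dmabuf))
		return umem_dmabuf;	/* e.g. -EOPNOTSUPP, -EINVAL */

	/* The dmabuf is now mapped and pinned; the umem can be handed
	 * to the device's normal MR/umem setup code. */
	return umem_dmabuf;
}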

* The remaining patches add support for DMABUF when creating a devx umem.
devx umems are quite similar to MRs, except they cannot be revoked, so this
uses the dmabuf pinned memory flow. Several mlx5dv flows require a umem and
cannot work with an MR:

https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
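
On the userspace side this surfaces through rdma-core's
mlx5dv_devx_umem_reg_ex(). A minimal sketch follows; the field and flag
names are as introduced in the accompanying rdma-core change, so treat this
as illustrative, and the dmabuf fd is assumed to come from an exporter such
as a GPU driver:

#include <stdint.h>
#include <infiniband/mlx5dv.h>

/* Sketch: create a devx umem backed by a dmabuf rather than
 * process virtual memory. */
static struct mlx5dv_devx_umem *
example_reg_dmabuf_umem(struct ibv_context *ctx, int dmabuf_fd, size_t size)
{
	struct mlx5dv_devx_umem_in umem_in = {
		.addr = NULL,		/* offset 0 within the dmabuf */
		.size = size,
		.access = IBV_ACCESS_LOCAL_WRITE,
		.pgsz_bitmap = UINT64_MAX, /* any page size the HW supports */
		.comp_mask = MLX5DV_UMEM_MASK_DMABUF,
		.dmabuf_fd = dmabuf_fd,
	};

	return mlx5dv_devx_umem_reg_ex(ctx, &umem_in);
}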

[Test Plan]

SW Configuration:
• Download the CUDA 12.2 runfile (https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local)
• Install with the open kernel modules, i.e. # sh ./cuda_12.2.2_535.104.05_linux.run -m=kernel-open
• Clone perftest from https://github.com/linux-rdma/perftest
• cd perftest
• export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
• export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
• run: ./autogen.sh ; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h ; make

# Start Server
$ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf

# Start Client
$ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost
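
Under the hood, perftest's --use_cuda_dmabuf path registers the CUDA
allocation through the standard verbs call ibv_reg_dmabuf_mr(), so a quick
standalone smoke test can be written against that API as well. A minimal
sketch, assuming a dmabuf fd has already been obtained for the GPU
allocation (e.g. via the CUDA driver API):

#include <errno.h>
#include <infiniband/verbs.h>

/* Sketch: on a kernel with the backport, registering a dmabuf-backed MR
 * succeeds; without it, the call fails (typically errno == EOPNOTSUPP). */
static int check_dmabuf_mr(struct ibv_pd *pd, int dmabuf_fd, size_t len)
{
	struct ibv_mr *mr = ibv_reg_dmabuf_mr(pd, 0 /* offset */, len,
					      0 /* iova */, dmabuf_fd,
					      IBV_ACCESS_LOCAL_WRITE);
	if (!mr)
		return -errno;
	ibv_dereg_mr(mr);
	return 0;
}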

[Where problems could occur]

The new functionality is isolated to new code paths; existing users will
not execute them, so regression risk is limited to applications that
explicitly opt into DMABUF-backed registrations.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete
