#11694: Multi-GPU: NVDec can't find CUDA device when used inside a Docker
container and targeting a specific GPU ID
-------------------------------------+-------------------------------------
             Reporter:  baudneo      |                    Owner:  (none)
                 Type:  defect       |                   Status:  new
             Priority:  normal       |                Component:
                                     |  undetermined
              Version:  7.0          |               Resolution:
             Keywords:  cuda NVDEC   |               Blocked By:
  docker nvidia-container-toolkit    |
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
Description changed by baudneo:

New description:

 What I was trying to accomplish:
 A multi-GPU NVIDIA setup using a Frigate TensorRT image and
 nvidia-container-toolkit. Forcing a specific GPU ID (one that is not the
 default index 0) by any of several methods results in an error:

 {{{
 [h264 @ 0x557b6d7f3900] decoder->cvdl->cuvidGetDecoderCaps(&caps) failed
 -> CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
 }}}

 As far as I can tell, only ffmpeg is affected. Other libraries and apps in
 the container that target a specific GPU work: loading ONNX models onto
 the GPU works when targeting a device_id other than the default index 0.

 If I target the default index 0, everything works. The failure occurs only
 when targeting a non-default index. I've also tried specifying GPU UUIDs,
 with the same result. In my case, I can't physically move the GPUs around
 to force which device gets index 0, so I am left with forcing it via the
 NVIDIA_VISIBLE_DEVICES env var or docker compose deploy options.
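
The container-side selection methods mentioned above can be sketched roughly as follows. This is only an illustration: the helper `gpu_check_cmd` and the image name `example/cuda-ffmpeg` are hypothetical, and `--gpus` is used here as the assumed `docker run` counterpart of the compose deploy options. The script only prints the commands; running `nvidia-smi -L` inside the container is one way to see which GPUs the container is actually given.

```shell
#!/bin/sh
# Hypothetical helper: print a docker command that lists the GPUs visible
# inside a container when only host GPU $1 is exposed.
# $2 selects the method: "env" (NVIDIA_VISIBLE_DEVICES) or "gpus" (--gpus).
gpu_check_cmd() {
    gpu="$1"
    method="${2:-env}"
    image="example/cuda-ffmpeg"   # placeholder image name, not a real image
    case "$method" in
        env)
            printf '%s\n' "docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=${gpu} ${image} nvidia-smi -L"
            ;;
        gpus)
            printf '%s\n' "docker run --rm --gpus device=${gpu} ${image} nvidia-smi -L"
            ;;
        *)
            printf '%s\n' "unknown method: ${method}" >&2
            return 1
            ;;
    esac
}

# Print (do not run) the check for host GPU 1 with both methods.
gpu_check_cmd 1 env
gpu_check_cmd 1 gpus
```

Printing instead of executing keeps the sketch usable on machines without Docker or a GPU; the printed lines can be pasted into a shell on the affected host.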

 Adding -hwaccel_device 1 to the global args ('-hide_banner -loglevel
 warning -threads 2 -hwaccel_device 1') does not seem to help either.
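
For reference, a minimal sketch of where the device selection options sit on the command line. `-hwaccel_device` is an input option in ffmpeg, so it has to appear before the `-i` it applies to. The helper name `cuda_hwaccel_args` is hypothetical, and the sketch makes no claim about whether index 1 is valid inside a given container; it only prints the resulting command.

```shell
#!/bin/sh
# Hypothetical helper: print the CUDA hwaccel input options for a GPU index
# (defaults to index 0 when no argument is given).
cuda_hwaccel_args() {
    idx="${1:-0}"
    printf '%s\n' "-hwaccel cuda -hwaccel_device ${idx} -hwaccel_output_format cuda"
}

# Show (do not run) a decode-only command targeting GPU index 1.
# The options are placed before -i because they apply to that input.
printf '%s\n' "ffmpeg -hide_banner $(cuda_hwaccel_args 1) -i rtsp://127.0.0.1:8554/living_room -f null -"
```

Decoding to the null muxer, as in the printed command, is a quick way to exercise only the decoder without the segment and filter parts of the full command.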

 This issue has been reproduced by other Frigate users and in Ubuntu CUDA
 images with ffmpeg 7, so it is not specific to my setup.

 Here are a couple of related Frigate discussions:
 - [https://github.com/blakeblackshear/frigate/discussions/18018]
 - [https://github.com/blakeblackshear/frigate/discussions/18722]

 Here is an issue I opened in the nvidia-container-toolkit repo:
 [https://github.com/NVIDIA/nvidia-container-toolkit/issues/1209]

 Version:
 {{{
 ffmpeg version n7.0.2-18-g3e6cec1286-20240919 Copyright (c) 2000-2024 the FFmpeg developers
   built with gcc 14.2.0 (crosstool-NG 1.26.0.106_ed12fa6)
   configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static
 --pkg-config=pkg-config --cross-prefix=x86_64-ffbuild-linux-gnu-
 --arch=x86_64 --target-os=linux --enable-gpl --enable-version3
 --disable-debug --enable-iconv --enable-zlib --enable-libfreetype
 --enable-libfribidi --enable-gmp --enable-libxml2 --enable-openssl
 --enable-lzma --enable-fontconfig --enable-libharfbuzz --enable-libvorbis
 --enable-opencl --enable-libpulse --enable-libvmaf --enable-libxcb
 --enable-xlib --enable-amf --enable-libaom --enable-libaribb24
 --enable-avisynth --enable-chromaprint --enable-libdav1d --enable-libdavs2
 --enable-libdvdread --enable-libdvdnav --disable-libfdk-aac
 --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme
 --enable-libkvazaar --enable-libaribcaption --enable-libass
 --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus
 --enable-librist --enable-libssh --enable-libtheora --enable-libvpx
 --enable-libwebp --enable-libzmq --enable-lv2 --enable-libvpl
 --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb
 --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt
 --enable-librav1e --enable-librubberband --disable-schannel --enable-sdl2
 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame
 --enable-libuavs3d --enable-libdrm --enable-vaapi --enable-libvidstab
 --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264
 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg
 --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags=
 --extra-libs='-ldl -lgomp' --extra-ldflags=-pthread
 --extra-ldexeflags=-pie --cc=x86_64-ffbuild-linux-gnu-gcc
 --cxx=x86_64-ffbuild-linux-gnu-g++ --ar=x86_64-ffbuild-linux-gnu-gcc-ar
 --ranlib=x86_64-ffbuild-linux-gnu-gcc-ranlib
 --nm=x86_64-ffbuild-linux-gnu-gcc-nm --extra-version=20240919
   libavutil      59.  8.100 / 59.  8.100
   libavcodec     61.  3.100 / 61.  3.100
   libavformat    61.  1.100 / 61.  1.100
   libavdevice    61.  1.100 / 61.  1.100
   libavfilter    10.  1.100 / 10.  1.100
   libswscale      8.  1.100 /  8.  1.100
   libswresample   5.  1.100 /  5.  1.100
   libpostproc    58.  1.100 / 58.  1.100
 }}}

 Command used:
 {{{
 /usr/lib/ffmpeg/7.0/bin/ffmpeg -hide_banner -loglevel warning -threads 2
 -hwaccel cuda -hwaccel_output_format cuda
 -user_agent "FFmpeg Frigate/0.16.0-0b7a33d" -rtsp_transport tcp
 -timeout 10000000 -fflags nobuffer -flags low_delay
 -i rtsp://127.0.0.1:8554/living_room -f segment -segment_time 10
 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c:v copy -c:a aac
 "/tmp/cache/living_room@%Y%m%d%H%M%S%z.mp4" -r 5
 -vf "fps=5,scale_cuda=w=1920:h=1080,hwdownload,format=nv12,eq=gamma=1.4:gamma_weight=0.5"
 -threads 2 -f rawvideo -pix_fmt yuv420p
 }}}

 FFREPORT log file: [https://pastebin.com/qKedrL6W]

-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/11694#comment:1>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker
_______________________________________________
FFmpeg-trac mailing list
FFmpeg-trac@avcodec.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-trac