On 2018-02-28 19:53, Marcin Woźniak wrote:
Try the same command but remove overlay filter and check.
I removed the filter and saw a slight increase in NVENC usage (33%). I then ran the following checks with a minimal set of options.
-------------------
Full HW transcoding
-------------------
/usr/local/ffmpeg-dev/bin/ffmpeg -hwaccel cuvid -c:v mpeg4_cuvid -i input.avi -map 0:v:0 -c:v h264_nvenc -b:v 1024k -f null -
...frame=69703 fps=3086 q=19.0 Lsize=N/A time=00:46:28.32 bitrate=N/A speed= 123x
CPU usage:
top - 22:37:15 up 22:26, 11 users, load average: 0.18, 0.35, 0.42
Threads: 1188 total, 2 running, 1185 sleeping, 0 stopped, 1 zombie
%Cpu0 : 21.7 us, 18.4 sy, 0.0 ni, 59.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 19.9 us, 17.8 sy, 0.0 ni, 62.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 1.3 us, 0.3 sy, 0.0 ni, 98.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 7.3 us, 5.6 sy, 0.0 ni, 87.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.0 us, 0.0 sy, 0.0 ni, 97.4 id, 2.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 1.0 us, 0.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.0 us, 8.6 sy, 0.0 ni, 91.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.3 us, 1.0 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 12261384 total, 11810564 used, 450820 free, 1303564 buffers
KiB Swap: 7999484 total, 560488 used, 7438996 free. 5455608 cached Mem

 PID USER PR NI    VIRT    RES    SHR S %CPU %MEM   TIME+ COMMAND   P
6495 root 20  0 12.955g 116208 101472 R 83.9  0.9 0:06.99 `- ffmpeg 3
6498 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6499 root 20  0 12.955g 116208 101472 S  7.6  0.9 0:00.59 `- ffmpeg 4
6500 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6501 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6502 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6503 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6504 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6505 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6506 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6507 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4
6508 root 20  0 12.955g 116208 101472 S  0.0  0.9 0:00.00 `- ffmpeg 4

NVDEC/NVENC usage:
# nvidia-smi dmon
# gpu pwr temp sm mem enc dec mclk pclk
# Idx W C % % % % MHz MHz
0 49 49 10 14 99 63 3802 2012
0 49 50 9 14 99 59 3802 2012
0 49 50 14 14 99 62 3802 2012
0 49 50 10 14 99 62 3802 2012
0 48 50 12 14 97 62 3802 2012
0 48 51 9 14 99 63 3802 2012
------------------------
Partially HW transcoding
------------------------
/usr/local/ffmpeg-dev/bin/ffmpeg -c:v mpeg4_cuvid -i input.avi -map 0:v:0 -c:v h264_nvenc -b:v 1024k -f null -
...frame=69703 fps=2136 q=19.0 Lsize=N/A time=00:46:28.32 bitrate=N/A speed=85.4x
CPU usage:
top - 22:42:04 up 22:31, 11 users, load average: 0.23, 0.29, 0.39
Threads: 1185 total, 3 running, 1181 sleeping, 0 stopped, 1 zombie
%Cpu0 : 6.6 us, 2.4 sy, 0.0 ni, 91.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 9.1 us, 2.4 sy, 0.0 ni, 88.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 24.2 us, 2.0 sy, 0.0 ni, 73.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 6.0 us, 1.7 sy, 0.0 ni, 92.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 3.0 us, 0.7 sy, 0.0 ni, 96.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 12.2 us, 3.7 sy, 0.0 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.7 us, 15.4 sy, 0.0 ni, 84.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 28.7 us, 2.0 sy, 0.0 ni, 69.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 12261384 total, 11916268 used, 345116 free, 1276148 buffers
KiB Swap: 7999484 total, 561988 used, 7437496 free. 5450584 cached Mem

 PID USER PR NI    VIRT    RES    SHR S %CPU %MEM   TIME+ COMMAND   P
6620 root 20  0 17.235g 211408 186860 R 74.2  1.7 0:08.31 `- ffmpeg 4
6623 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6624 root 20  0 17.235g 211408 186860 S 13.1  1.7 0:01.36 `- ffmpeg 5
6625 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 0
6626 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6627 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 4
6628 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6629 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6630 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6631 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6632 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6633 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 7
6634 root 20  0 17.235g 211408 186860 S  7.6  1.7 0:00.81 `- ffmpeg 2
6635 root 20  0 17.235g 211408 186860 S  0.0  1.7 0:00.00 `- ffmpeg 0

NVDEC/NVENC usage:
# nvidia-smi dmon
# gpu pwr temp sm mem enc dec mclk pclk
# Idx W C % % % % MHz MHz
0 46 49 34 10 69 42 3802 2012
0 46 50 34 10 70 42 3802 2012
0 46 50 34 10 70 42 3802 2012
0 46 50 34 10 72 42 3802 2012
0 46 50 34 10 72 42 3802 2012
0 46 50 34 10 70 44 3802 2012
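
For anyone who wants to compare the two runs numerically, here is a minimal Python sketch that averages the enc and dec columns of the two `nvidia-smi dmon` listings above. The sample rows are copied from this message; the names `FULL_HW`, `PARTIAL_HW`, `COLUMNS` and `column_means` are made up for illustration and are not part of any tool.

```python
from statistics import mean

# Column order as printed in the "# gpu pwr temp sm mem enc dec mclk pclk" header.
COLUMNS = ["gpu", "pwr", "temp", "sm", "mem", "enc", "dec", "mclk", "pclk"]

# Samples copied from the "Full HW transcoding" run.
FULL_HW = """\
0 49 49 10 14 99 63 3802 2012
0 49 50 9 14 99 59 3802 2012
0 49 50 14 14 99 62 3802 2012
0 49 50 10 14 99 62 3802 2012
0 48 50 12 14 97 62 3802 2012
0 48 51 9 14 99 63 3802 2012
"""

# Samples copied from the "Partially HW transcoding" run.
PARTIAL_HW = """\
0 46 49 34 10 69 42 3802 2012
0 46 50 34 10 70 42 3802 2012
0 46 50 34 10 70 42 3802 2012
0 46 50 34 10 72 42 3802 2012
0 46 50 34 10 72 42 3802 2012
0 46 50 34 10 70 44 3802 2012
"""


def column_means(dmon_text, names=("enc", "dec")):
    """Average the named columns, skipping blank lines and '#' header lines."""
    rows = [
        line.split()
        for line in dmon_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return {
        name: mean(float(row[COLUMNS.index(name)]) for row in rows)
        for name in names
    }


if __name__ == "__main__":
    full = column_means(FULL_HW)
    partial = column_means(PARTIAL_HW)
    for name in ("enc", "dec"):
        drop = 1 - partial[name] / full[name]
        print(f"{name}: full={full[name]:.1f}%  partial={partial[name]:.1f}%  drop={drop:.1%}")
```

Six samples per run is a very small window, so the averages are only a rough indication.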
I see a 30% NVENC performance loss when the transcoding path is NVDEC - RAM - NVENC. Does this mean that system memory bandwidth is the bottleneck in this case, or am I facing other unavoidable overheads?
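
As a cross-check of that figure, here is a small Python sketch that derives the throughput loss from the fps values in the two progress lines above and compares the implied speed factors with the ones ffmpeg printed. All numbers are copied from this message; the helper `hms_to_seconds` and the variable names are illustrative only.

```python
def hms_to_seconds(timestamp):
    """Convert an ffmpeg-style HH:MM:SS.ss timestamp to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values taken from the ffmpeg progress lines in this message.
frames = 69703
duration = hms_to_seconds("00:46:28.32")
full_fps = 3086      # -hwaccel cuvid run (frames stay on the GPU)
partial_fps = 2136   # run without -hwaccel cuvid (frames pass through RAM)

# Frame rate of the source, derived from frame count and duration.
source_fps = frames / duration

# Speed factor is the transcoding rate relative to the source frame rate.
full_speed = full_fps / source_fps
partial_speed = partial_fps / source_fps

# Relative throughput loss of the partial-HW path.
loss = 1 - partial_fps / full_fps

if __name__ == "__main__":
    print(f"source fps      : {source_fps:.2f}")
    print(f"full HW speed   : {full_speed:.1f}x")
    print(f"partial HW speed: {partial_speed:.1f}x")
    print(f"throughput loss : {loss:.1%}")
```

The fps drop works out to about 31%, which is in line with the roughly 30% drop in the enc column, and the derived speed factors agree with the 123x and 85.4x that ffmpeg reported.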
Thanks.
Garri
On 28.02.2018 at 14:44, Garri Djavadyan wrote:

Hello FFmpeg community,

I have run into a problem with NVDEC/NVENC underutilization while running a single ffmpeg instance. We use ffmpeg to convert videos in various formats to MP4 (h264/aac) and to apply a logo overlay. Hardware decoding and encoding are performed by the NVDEC and NVENC chips. For example, our command line is:

/usr/local/ffmpeg-dev/bin/ffmpeg -y -c:v mpeg4_cuvid \
  -i input.avi -i logo.png \
  -filter_complex [0:v:0][1:v:0]overlay=10:10[out1] \
  -map [out1] -map 0:a:0 -map_metadata -1 -map_chapters -1 \
  -c:v h264_nvenc -b:v 1024k -r 25 \
  -c:a libfdk_aac -b:a 128k \
  -movflags faststart out.mp4

The overall transcoding process is greatly accelerated, but I see that the NVDEC/NVENC cycles are not fully utilized. For example:

# nvidia-smi dmon
# gpu pwr temp sm mem enc dec mclk pclk
# Idx W C % % % % MHz MHz
0 32 50 11 3 22 16 3802 1632
0 32 50 11 3 23 14 3802 1632
0 32 50 12 3 22 15 3802 1632
0 33 51 13 3 18 15 3802 1632
0 32 51 12 3 17 11 3802 1632
0 32 51 10 2 20 13 3802 1632

I tried to find a bottleneck, but all system resources look OK. For example, CPU (top output, each core about 60% idle or more):

Tasks: 296 total, 3 running, 292 sleeping, 0 stopped, 1 zombie
%Cpu0 : 14,6 us, 0,7 sy, 0,0 ni, 84,7 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu1 : 11,2 us, 1,4 sy, 0,0 ni, 87,5 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu2 : 14,5 us, 1,3 sy, 0,0 ni, 84,2 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu3 : 4,4 us, 0,7 sy, 0,0 ni, 94,9 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu4 : 39,6 us, 1,3 sy, 0,0 ni, 59,1 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu5 : 2,4 us, 0,7 sy, 0,0 ni, 96,6 id, 0,3 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu6 : 2,7 us, 4,3 sy, 0,0 ni, 93,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu7 : 21,0 us, 0,7 sy, 0,0 ni, 78,3 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
КiB Mem: 12261384 total, 10578352 used, 1683032 free, 4809968 buffers
КiB Swap: 7999484 total, 584072 used, 7415412 free. 1431448 cached Mem

  PID USER PR  NI    VIRT    RES    SHR S  %CPU %MEM    TIME+ COMMAND
21245 user 20   0 17,516g 235792 187228 R 101,3  1,9  1:21.15 ffmpeg
 1512 root -51  0       0      0      0 S   7,0  0,0 17:29.39 irq/47-nvidia

-----------------
Memory:

# free -m
                   total  used  free  shared  buffers  cached
Mem:               11974  8619  3354     133     3758     653
-/+ buffers/cache:        4207  7766
Swap:               7811   570  7241

-----------------
Storage I/O:

# iostat -x 1 3
Linux 3.13.0-142-generic (user-desktop) 28.02.2018 _x86_64_ (8 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle
         10,86  3,03    1,43    4,75   0,00 79,93

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0,49 0,00 0,01 0,00 0,29 0,00 45,90 0,00 47,45 47,45 0,00 14,55 0,02
sdb 849,21 769,16 47,81 22,18 3859,21 3935,51 222,74 9,18 131,22 23,62 363,20 4,05 28,38

avg-cpu: %user %nice %system %iowait %steal %idle
         14,65  0,00    1,78    0,00   0,00 83,57

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
sdb 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00

avg-cpu: %user %nice %system %iowait %steal %idle
         21,24  0,00    2,53    0,51   0,00 75,73

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
sdb 0,00 7,00 0,00 3,00 0,00 88,00 58,67 0,04 13,33 0,00 13,33 13,33 4,00

--------------------
FFmpeg version and configuration options:

# /usr/local/ffmpeg-dev/bin/ffmpeg -version
ffmpeg version N-90054-g474194a Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4)
configuration: --prefix=/usr/local/ffmpeg-dev --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libx264 --enable-nvenc --enable-libnpp
libavutil      56.  7.101 / 56.  7.101
libavcodec     58. 11.101 / 58. 11.101
libavformat    58.  9.100 / 58.  9.100
libavdevice    58.  1.100 / 58.  1.100
libavfilter     7. 12.100 /  7. 12.100
libswscale      5.  0.101 /  5.  0.101
libswresample   3.  0.101 /  3.  0.101
libpostproc    55.  0.100 / 55.  0.100

---------------------
NVIDIA driver and card information:

# nvidia-smi
Wed Feb 28 17:20:13 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   53C    P2    32W / 150W |    939MiB /  6071MiB |     11%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1423      G   /usr/bin/X                                   443MiB |
|    0      2555      G   compiz                                       224MiB |
|    0     14879      G   ...-token=ACEXXXXXXXXX XX2E9DDXXXXXXXE41     107MiB |
|    0     21458      C   /usr/local/ffmpeg-dev/bin/ffmpeg             159MiB |
+-----------------------------------------------------------------------------+

I believe I have overlooked something, or maybe there are some limitations, so I would kindly ask for your suggestions.

Many thanks in advance!
Garri

_______________________________________________
ffmpeg-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".
