On 2018-02-28 19:53, Marcin Woźniak wrote:
Try the same command but remove overlay filter and check.

I removed the filter and saw a slight increase in NVENC usage (33%). I then ran the following checks with a minimal set of options.

-------------------
Full HW transcoding
-------------------

/usr/local/ffmpeg-dev/bin/ffmpeg -hwaccel cuvid -c:v mpeg4_cuvid -i input.avi -map 0:v:0 -c:v h264_nvenc -b:v 1024k -f null -
...
frame=69703 fps=3086 q=19.0 Lsize=N/A time=00:46:28.32 bitrate=N/A speed= 123x


CPU usage:

top - 22:37:15 up 22:26, 11 users,  load average: 0.18, 0.35, 0.42
Threads: 1188 total,   2 running, 1185 sleeping,   0 stopped,   1 zombie
%Cpu0 : 21.7 us, 18.4 sy, 0.0 ni, 59.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 19.9 us, 17.8 sy, 0.0 ni, 62.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 1.3 us, 0.3 sy, 0.0 ni, 98.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 7.3 us, 5.6 sy, 0.0 ni, 87.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 0.0 us, 0.0 sy, 0.0 ni, 97.4 id, 2.6 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 1.0 us, 0.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.0 us, 8.6 sy, 0.0 ni, 91.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.3 us, 1.0 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem:  12261384 total, 11810564 used,   450820 free,  1303564 buffers
KiB Swap: 7999484 total, 560488 used, 7438996 free. 5455608 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND    P
 6495 root      20   0 12.955g 116208 101472 R  83.9  0.9   0:06.99 `- ffmpeg  3
 6498 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6499 root      20   0 12.955g 116208 101472 S   7.6  0.9   0:00.59 `- ffmpeg  4
 6500 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6501 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6502 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6503 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6504 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6505 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6506 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6507 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4
 6508 root      20   0 12.955g 116208 101472 S   0.0  0.9   0:00.00 `- ffmpeg  4


NVDEC/NVENC usage:

# nvidia-smi dmon
# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
    0    49    49    10    14    99    63  3802  2012
    0    49    50     9    14    99    59  3802  2012
    0    49    50    14    14    99    62  3802  2012
    0    49    50    10    14    99    62  3802  2012
    0    48    50    12    14    97    62  3802  2012
    0    48    51     9    14    99    63  3802  2012


----------------------
Partial HW transcoding
----------------------

/usr/local/ffmpeg-dev/bin/ffmpeg -c:v mpeg4_cuvid -i input.avi -map 0:v:0 -c:v h264_nvenc -b:v 1024k -f null -
...
frame=69703 fps=2136 q=19.0 Lsize=N/A time=00:46:28.32 bitrate=N/A speed=85.4x


CPU usage:

top - 22:42:04 up 22:31, 11 users,  load average: 0.23, 0.29, 0.39
Threads: 1185 total,   3 running, 1181 sleeping,   0 stopped,   1 zombie
%Cpu0 : 6.6 us, 2.4 sy, 0.0 ni, 91.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 9.1 us, 2.4 sy, 0.0 ni, 88.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 24.2 us, 2.0 sy, 0.0 ni, 73.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 6.0 us, 1.7 sy, 0.0 ni, 92.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 3.0 us, 0.7 sy, 0.0 ni, 96.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 12.2 us, 3.7 sy, 0.0 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 0.7 us, 15.4 sy, 0.0 ni, 84.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 28.7 us, 2.0 sy, 0.0 ni, 69.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem:  12261384 total, 11916268 used,   345116 free,  1276148 buffers
KiB Swap: 7999484 total, 561988 used, 7437496 free. 5450584 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND    P
 6620 root      20   0 17.235g 211408 186860 R  74.2  1.7   0:08.31 `- ffmpeg  4
 6623 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6624 root      20   0 17.235g 211408 186860 S  13.1  1.7   0:01.36 `- ffmpeg  5
 6625 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  0
 6626 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6627 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  4
 6628 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6629 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6630 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6631 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6632 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6633 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  7
 6634 root      20   0 17.235g 211408 186860 S   7.6  1.7   0:00.81 `- ffmpeg  2
 6635 root      20   0 17.235g 211408 186860 S   0.0  1.7   0:00.00 `- ffmpeg  0


NVDEC/NVENC usage:

# nvidia-smi dmon
# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
    0    46    49    34    10    69    42  3802  2012
    0    46    50    34    10    70    42  3802  2012
    0    46    50    34    10    70    42  3802  2012
    0    46    50    34    10    72    42  3802  2012
    0    46    50    34    10    72    42  3802  2012
    0    46    50    34    10    70    44  3802  2012


I see roughly a 30% NVENC performance loss when the transcoding path is NVDEC -> RAM -> NVENC. Does this mean system memory bandwidth is the bottleneck in this case? Or am I facing other unavoidable overheads?
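
As a rough cross-check of the 30% figure, the arithmetic can be sketched in Python from the numbers reported above. The fps values and the "enc" samples are copied from the two runs; the helper names are made up for illustration. The copy-traffic estimate additionally assumes uncompressed NV12 frames (1.5 bytes per pixel) with one download and one upload per frame, and since the input resolution is not stated anywhere in this thread, 720x576 is only a placeholder.

```python
# Figures copied from the two runs reported in this message.
FULL_HW_FPS = 3086      # with -hwaccel cuvid (frames stay in GPU memory)
PARTIAL_HW_FPS = 2136   # without -hwaccel cuvid (frames pass through system RAM)

FULL_HW_ENC = [99, 99, 99, 99, 97, 99]     # "enc" column, first dmon table
PARTIAL_HW_ENC = [69, 70, 70, 72, 72, 70]  # "enc" column, second dmon table


def relative_drop(before, after):
    """Return the drop from `before` to `after` as a fraction of `before`."""
    return (before - after) / before


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def host_copy_bytes_per_second(width, height, fps, bytes_per_pixel=1.5):
    """Estimate GPU<->RAM traffic for one download plus one upload per frame.

    Assumption: frames cross the bus as uncompressed NV12 (1.5 bytes/pixel).
    The frame size is not given in the thread, so the caller must supply it.
    """
    frame_bytes = width * height * bytes_per_pixel
    return 2 * frame_bytes * fps


fps_drop = relative_drop(FULL_HW_FPS, PARTIAL_HW_FPS)
enc_drop = relative_drop(mean(FULL_HW_ENC), mean(PARTIAL_HW_ENC))
print(f"fps drop: {fps_drop:.1%}")
print(f"mean enc utilization drop: {enc_drop:.1%}")

# Hypothetical 720x576 input, used only to show the order of magnitude.
traffic = host_copy_bytes_per_second(720, 576, PARTIAL_HW_FPS)
print(f"estimated copy traffic: {traffic / 1e9:.2f} GB/s")
```

Both ratios come out close to 30%, so the fps and the encoder utilization tell the same story. The traffic estimate only becomes meaningful once the real frame size is plugged in and the result is compared against the measured PCIe and memory bandwidth of the machine.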

Thanks.


Garri


On 28.02.2018 at 14:44, Garri Djavadyan wrote:
Hello FFmpeg community,


I have run into a problem with NVDEC/NVENC underutilization while
running a single ffmpeg instance.

We use ffmpeg to convert videos in various formats to MP4 (H.264/AAC)
and to apply a logo overlay. Decoding and encoding are performed in
hardware by the NVDEC and NVENC engines. For example, our command line is:

/usr/local/ffmpeg-dev/bin/ffmpeg -y -c:v mpeg4_cuvid \
   -i input.avi -i logo.png \
   -filter_complex '[0:v:0][1:v:0]overlay=10:10[out1]' \
   -map '[out1]' -map 0:a:0 -map_metadata -1 -map_chapters -1 \
   -c:v h264_nvenc -b:v 1024k -r 25 \
   -c:a libfdk_aac -b:a 128k \
   -movflags faststart out.mp4


The overall transcoding process is greatly accelerated, but I see that
the NVDEC/NVENC engines are not fully utilized. For example:

# nvidia-smi dmon
# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
     0    32    50    11     3    22    16  3802  1632
     0    32    50    11     3    23    14  3802  1632
     0    32    50    12     3    22    15  3802  1632
     0    33    51    13     3    18    15  3802  1632
     0    32    51    12     3    17    11  3802  1632
     0    32    51    10     2    20    13  3802  1632
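
To compare runs without eyeballing the table, the dmon samples can be averaged with a few lines of Python. This is only a sketch: it assumes the text layout shown above (a `# gpu ...` comment line naming the columns, followed by whitespace-separated integer rows), and `parse_dmon`, `to_int` and `column_mean` are illustrative helper names, not part of any NVIDIA tool.

```python
# Sample copied from the `nvidia-smi dmon` output above.
DMON_SAMPLE = """\
# gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     %     %     %     %   MHz   MHz
     0    32    50    11     3    22    16  3802  1632
     0    32    50    11     3    23    14  3802  1632
     0    32    50    12     3    22    15  3802  1632
     0    33    51    13     3    18    15  3802  1632
     0    32    51    12     3    17    11  3802  1632
     0    32    51    10     2    20    13  3802  1632
"""


def to_int(value):
    """Convert a field to int; non-numeric fields (such as "-") become None."""
    return int(value) if value.isdigit() else None


def parse_dmon(text):
    """Parse dmon text into a list of {column name: value} dicts."""
    columns = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("#").split()
            # Only the comment line starting with "gpu" names the columns;
            # the units line and any other comment lines are skipped.
            if fields and fields[0] == "gpu":
                columns = fields
            continue
        values = line.split()
        if columns is None or len(values) != len(columns):
            continue
        rows.append({name: to_int(v) for name, v in zip(columns, values)})
    return rows


def column_mean(rows, name):
    """Mean of one column, ignoring samples that had no numeric value."""
    values = [row[name] for row in rows if row[name] is not None]
    return sum(values) / len(values) if values else None


rows = parse_dmon(DMON_SAMPLE)
print(f"mean enc: {column_mean(rows, 'enc'):.1f}%")
print(f"mean dec: {column_mean(rows, 'dec'):.1f}%")
```

For the six samples above this gives a mean of about 20% for enc and 14% for dec.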


I tried to find a bottleneck, but all system resources look fine. For
example, CPU (top output; every core is roughly 60% idle or more):

Tasks: 296 total,   3 running, 292 sleeping,   0 stopped,   1 zombie
%Cpu0  : 14,6 us,  0,7 sy,  0,0 ni, 84,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu1  : 11,2 us,  1,4 sy,  0,0 ni, 87,5 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu2  : 14,5 us,  1,3 sy,  0,0 ni, 84,2 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu3  :  4,4 us,  0,7 sy,  0,0 ni, 94,9 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu4  : 39,6 us,  1,3 sy,  0,0 ni, 59,1 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu5  :  2,4 us,  0,7 sy,  0,0 ni, 96,6 id,  0,3 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu6  :  2,7 us,  4,3 sy,  0,0 ni, 93,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
%Cpu7  : 21,0 us,  0,7 sy,  0,0 ni, 78,3 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem:  12261384 total, 10578352 used,  1683032 free,  4809968 buffers
KiB Swap: 7999484 total, 584072 used, 7415412 free. 1431448 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
21245 user      20   0 17,516g 235792 187228 R 101,3  1,9   1:21.15 ffmpeg
 1512 root     -51   0       0      0      0 S   7,0  0,0  17:29.39 irq/47-nvidia

-----------------
Memory:

# free -m
          total       used       free     shared    buffers     cached
Mem:      11974       8619       3354        133       3758        653
-/+ buffers/cache:       4207       7766
Swap:       7811        570       7241

-----------------
Storage I/O:

# iostat -x 1 3
Linux 3.13.0-142-generic (user-desktop)         28.02.2018      _x86_64_    (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           10,86    3,03    1,43    4,75    0,00   79,93

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,49     0,00    0,01    0,00     0,29     0,00    45,90     0,00   47,45   47,45    0,00  14,55   0,02
sdb             849,21   769,16   47,81   22,18  3859,21  3935,51   222,74     9,18  131,22   23,62  363,20   4,05  28,38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           14,65    0,00    1,78    0,00    0,00   83,57

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sdb               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           21,24    0,00    2,53    0,51    0,00   75,73

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sdb               0,00     7,00    0,00    3,00     0,00    88,00    58,67     0,04   13,33    0,00   13,33  13,33   4,00


--------------------
FFmpeg version and configuration options:

# /usr/local/ffmpeg-dev/bin/ffmpeg -version
ffmpeg version N-90054-g474194a Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4)
configuration: --prefix=/usr/local/ffmpeg-dev --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libx264 --enable-nvenc --enable-libnpp
libavutil      56.  7.101 / 56.  7.101
libavcodec     58. 11.101 / 58. 11.101
libavformat    58.  9.100 / 58.  9.100
libavdevice    58.  1.100 / 58.  1.100
libavfilter     7. 12.100 /  7. 12.100
libswscale      5.  0.101 /  5.  0.101
libswresample   3.  0.101 /  3.  0.101
libpostproc    55.  0.100 / 55.  0.100


---------------------
NVIDIA driver and card information:

# nvidia-smi
Wed Feb 28 17:20:13 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   53C    P2    32W / 150W |    939MiB /  6071MiB |     11%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1423      G   /usr/bin/X                                   443MiB |
|    0      2555      G   compiz                                       224MiB |
|    0     14879      G   ...-token=ACEXXXXXXXXXXX2E9DDXXXXXXXE41      107MiB |
|    0     21458      C   /usr/local/ffmpeg-dev/bin/ffmpeg             159MiB |
+-----------------------------------------------------------------------------+


I believe I have overlooked something, or perhaps there are limitations
I am not aware of. I would appreciate any suggestions. Many thanks in advance!


Garri
_______________________________________________
ffmpeg-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

