Also, see this: https://www.x.org/wiki/RadeonFeature/
For term disambiguation: https://www.reddit.com/r/archlinux/comments/6la6n5/trying_to_understand_drm_dri_mesa_radeon_gallium/

On 25 July 2018 at 05:16, Dennis Mungai <[email protected]> wrote:

> From what I can gather, AMD's driver implementation for VAAPI (gallium?
> through mesa) is a work in progress, and compared to i915 (Intel's), is
> quite behind.
>
> On your system, are you able to build FFmpeg to utilize OMX IL? AMD has
> support for it via the VCE block. See this for an example on enabling it:
> https://github.com/legotheboss/YouTube-files/wiki/(RPi)-Compile-FFmpeg-with-the-OpenMAX-H.264-GPU-acceleration
>
> The guide was written for the RPi, but what we're interested in is OpenMAX
> Bellagio and the configuration switches that enable OpenMAX IL encoders.
>
> On 24 July 2018 at 12:01, Lukas Obermann <[email protected]> wrote:
>
>> Hello Dennis,
>>
>> thank you for your help! Much appreciated.
>>
>> Using your command I get a 1.9x speed. So a slight improvement, but not
>> much.
>> I pasted the debug output here, maybe you can see something useful?
>> https://pastebin.com/W0KKjZbN
>>
>> ad 1. Yes, there is the onboard Intel device and 6 of those RX570 that in
>> the end I want to have all transcode stuff in parallel.
>>
>> lukas@transcoder:~$ vainfo --display drm --device /dev/dri/card1
>> libva info: VA-API version 1.1.0
>> libva info: va_getDriverName() returns 0
>> libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
>> libva info: Found init function __vaDriverInit_1_1
>> libva info: va_openDriver() returns 0
>> vainfo: VA-API version: 1.1 (libva 2.1.0)
>> vainfo: Driver version: mesa gallium vaapi
>> vainfo: Supported profile and entrypoints
>>       VAProfileMPEG2Simple            : VAEntrypointVLD
>>       VAProfileMPEG2Main              : VAEntrypointVLD
>>       VAProfileVC1Simple              : VAEntrypointVLD
>>       VAProfileVC1Main                : VAEntrypointVLD
>>       VAProfileVC1Advanced            : VAEntrypointVLD
>>       VAProfileH264ConstrainedBaseline: VAEntrypointVLD
>>       VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
>>       VAProfileH264Main               : VAEntrypointVLD
>>       VAProfileH264Main               : VAEntrypointEncSlice
>>       VAProfileH264High               : VAEntrypointVLD
>>       VAProfileH264High               : VAEntrypointEncSlice
>>       VAProfileHEVCMain               : VAEntrypointVLD
>>       VAProfileHEVCMain10             : VAEntrypointVLD
>>       VAProfileJPEGBaseline           : VAEntrypointVLD
>>       VAProfileNone                   : VAEntrypointVideoProc
>>
>> lukas@transcoder:~$ ls -la /dev/dri/
>> total 0
>> drwxr-xr-x   3 root root       340 Jul 23 15:47 .
>> drwxr-xr-x  19 root root      5020 Jul 23 15:47 ..
>> drwxr-xr-x   2 root root       320 Jul 23 15:47 by-path
>> crw-rw----+  1 root video 226,   0 Jul 23 15:47 card0
>> crw-rw----+  1 root video 226,   1 Jul 23 15:47 card1
>> crw-rw----+  1 root video 226,   2 Jul 23 15:47 card2
>> crw-rw----+  1 root video 226,   3 Jul 23 15:47 card3
>> crw-rw----+  1 root video 226,   4 Jul 23 15:47 card4
>> crw-rw----+  1 root video 226,   5 Jul 23 15:47 card5
>> crw-rw----+  1 root video 226,   6 Jul 23 15:47 card6
>> crw-rw----+  1 root video 226, 128 Jul 23 15:47 renderD128
>> crw-rw----+  1 root video 226, 129 Jul 23 15:47 renderD129
>> crw-rw----+  1 root video 226, 130 Jul 23 15:47 renderD130
>> crw-rw----+  1 root video 226, 131 Jul 23 15:47 renderD131
>> crw-rw----+  1 root video 226, 132 Jul 23 15:47 renderD132
>> crw-rw----+  1 root video 226, 133 Jul 23 15:47 renderD133
>> crw-rw----+  1 root video 226, 134 Jul 23 15:47 renderD134
>>
>>
>> ad 2. OK, understood. Is there a benefit of doing it that way?
>>
>> ad 3. I have now done two tests with only the decoder running, and they
>> confuse me even more.
>>
>> So running the following command:
>> ffmpeg -init_hw_device vaapi=amd:/dev/dri/renderD129 -hwaccel vaapi
>> -hwaccel_output_format vaapi -hwaccel_device amd -filter_hw_device amd -i
>> fs_experiental_method.avi -f null -
>>
>> Results in ~ 12x speed
>> frame= 5824 fps=340 q=-0.0 Lsize=N/A time=00:03:14.47 bitrate=N/A speed=11.4x
>>
>> But, using the CPU (a dual-core Pentium from last year)
>> ffmpeg -i fs_experiental_method.avi -f null -
>>
>> Results in ~ 14x speed
>> frame=10570 fps=408 q=-0.0 Lsize=N/A time=00:05:52.90 bitrate=N/A speed=13.6x
>>
>> Of course the vaapi one uses only about 10% of CPU while the CPU one uses
>> 100%.
>>
>> The graph looks like this for vaapi:
>>
>> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'time_base' to value '1/32000'
>> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'sample_rate' to value '32000'
>> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'sample_fmt' to value 'fltp'
>> [graph_1_in_0_1 @ 0x557f5d1cd5c0] Setting 'channel_layout' to value '0x4'
>> [graph_1_in_0_1 @ 0x557f5d1cd5c0] tb:1/32000 samplefmt:fltp samplerate:32000 chlayout:0x4
>> [format_out_0_1 @ 0x557f5d29fe40] Setting 'sample_fmts' to value 's16'
>> [format_out_0_1 @ 0x557f5d29fe40] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_1'
>> [AVFilterGraph @ 0x557f5d1ceec0] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
>> [auto_resampler_0 @ 0x557f5d2be600] [SWR @ 0x557f5d1d8200] Using fltp internally between filters
>> [auto_resampler_0 @ 0x557f5d2be600] ch:1 chl:mono fmt:fltp r:32000Hz -> ch:1 chl:mono fmt:s16 r:32000Hz
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'video_size' to value '1920x1080'
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'pix_fmt' to value '46'
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'time_base' to value '1/30'
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'pixel_aspect' to value '0/1'
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'sws_param' to value 'flags=2'
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] Setting 'frame_rate' to value '30/1'
>> [graph 0 input from stream 0:0 @ 0x557f5d2c1e40] w:1920 h:1080 pixfmt:vaapi_vld tb:1/30 fr:30/1 sar:0/1 sws_param:flags=2
>> [AVFilterGraph @ 0x557f5d17e480] query_formats: 3 queried, 2 merged, 0 already done, 0 delayed
>>
>> and like this for the cpu:
>>
>> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'time_base' to value '1/32000'
>> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'sample_rate' to value '32000'
>> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'sample_fmt' to value 'fltp'
>> [graph_1_in_0_1 @ 0x55986d1cab40] Setting 'channel_layout' to value '0x4'
>> [graph_1_in_0_1 @ 0x55986d1cab40] tb:1/32000 samplefmt:fltp samplerate:32000 chlayout:0x4
>> [format_out_0_1 @ 0x55986d19c740] Setting 'sample_fmts' to value 's16'
>> [format_out_0_1 @ 0x55986d19c740] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_1'
>> [AVFilterGraph @ 0x55986d090fc0] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
>> [auto_resampler_0 @ 0x55986d1ab680] [SWR @ 0x55986d0cd740] Using fltp internally between filters
>> [auto_resampler_0 @ 0x55986d1ab680] ch:1 chl:mono fmt:fltp r:32000Hz -> ch:1 chl:mono fmt:s16 r:32000Hz
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'video_size' to value '1920x1080'
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'pix_fmt' to value '0'
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'time_base' to value '1/30'
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'pixel_aspect' to value '0/1'
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'sws_param' to value 'flags=2'
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] Setting 'frame_rate' to value '30/1'
>> [graph 0 input from stream 0:0 @ 0x55986d15f2c0] w:1920 h:1080 pixfmt:yuv420p tb:1/30 fr:30/1 sar:0/1 sws_param:flags=2
>> [AVFilterGraph @ 0x55986d196c40] query_formats: 3 queried, 2 merged, 0 already done, 0 delayed
>>
>>
>> I find it very strange that CPU decoding is faster than GPU decoding. Or
>> maybe there is a bottleneck? I am a bit lost right now, I have to say.
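
One way to read the numbers quoted above, offered as a rough sketch only: for a constant-frame-rate source, the speed= factor that ffmpeg prints is approximately the reported fps divided by the source frame rate (30 fps for this file, per the 'frame_rate' value in the graph output). The helper name speed_from_fps is made up for illustration; it just redoes that arithmetic with awk.

```shell
#!/bin/sh
# speed_from_fps FPS SOURCE_RATE
# Approximates ffmpeg's "speed=" factor as fps / source frame rate.
# Hypothetical helper, not part of ffmpeg.
speed_from_fps() {
    awk -v fps="$1" -v rate="$2" 'BEGIN { printf "%.1fx\n", fps / rate }'
}

speed_from_fps 340 30   # VAAPI decode-only run
speed_from_fps 408 30   # CPU decode-only run
speed_from_fps 52 30    # original VAAPI transcode
```

By the same arithmetic, the 6x to 8x that was hoped for would correspond to roughly 180 to 240 fps on this file.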
>>
>>
>>
>> > On 23.07.2018, at 22:51, Dennis Mungai <[email protected]> wrote:
>> >
>> > Hello there,
>> >
>> > Here's something you can try:
>> >
>> > ffmpeg -init_hw_device vaapi=amd:/dev/dri/renderD129 -hwaccel vaapi
>> > -hwaccel_output_format vaapi -hwaccel_device amd -filter_hw_device amd -i
>> > fs_experiental_method.avi -vf 'format=nv12|vaapi,hwupload' -y -c:v
>> > h264_vaapi -qp:v 21 -sei +identifier+timing+recovery_point -profile:v main
>> > -level 4 output.avi
>> >
>> > Assumptions made:
>> >
>> > 1. You have another GPU on the system. The DRI device you highlighted
>> > (/dev/dri/card1) is implied to be the second render node, because the first
>> > ordinal device would have been /dev/dri/card0, mapped to
>> > /dev/dri/renderD128.
>> >
>> > Confirm this by providing the output of:
>> >
>> > (a). vainfo
>> > (b). ls -al /dev/dri/
>> >
>> > 2. We explicitly initialize the hardware device (/dev/dri/renderD129),
>> > name it 'amd', and pass it to the decoder, the encoder and the video
>> > filtergraph.
>> >
>> > 3. Observe the video filter graph. Here's what it does: The decoder will
>> > output either vaapi surfaces (if the hwaccel is usable) or software frames
>> > (if it isn't). In the first case, it matches the vaapi format and hwupload
>> > does nothing (it passes through hardware frames unchanged). In the second
>> > case, it matches the nv12 format and converts whatever the input is to
>> > that, then uploads.
>> >
>> > This is done for safety reasons: Either way, the encoder will run. However,
>> > depending on the path chosen (upload to memory vs native VAAPI hwdec), your
>> > performance may vary.
>> >
>> > References used:
>> >
>> > 1. The VAAPI entry on FFmpeg wiki:
>> > https://trac.ffmpeg.org/wiki/Hardware/VAAPI
>> >
>> > 2. The VAAPI encoders entry in the docs:
>> > http://www.ffmpeg.org/ffmpeg-codecs.html#VAAPI-encoders
>> >
>> > On 23 July 2018 at 22:30, Lukas Obermann <[email protected]> wrote:
>> >
>> >> Hi all,
>> >>
>> >> I want to use an RX570 for transcoding with ffmpeg. I have been looking
>> >> into this for some time now and testing various things.
>> >> I use Ubuntu 18.04 and I have it running with VAAPI. But the performance
>> >> is not good imo. For a 1080p file I only get about 1.8x speed. I was
>> >> expecting something around 6x to 8x.
>> >> Is VAAPI the right way to go here? I see that AMF is not yet ready for
>> >> Linux and VDPAU only supports decoding, not encoding.
>> >>
>> >> Following is the command:
>> >> ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/card1 -hwaccel_output_format
>> >> vaapi -i fs_experiental_method.avi -y -c:v h264_vaapi -profile:v main
>> >> output.avi
>> >>
>> >> ffmpeg version n4.0.2
>> >> mesa 18
>> >> amdgpu-pro-18.20-606296
>> >> libva: VA-API version 1.1.0
>> >>
>> >> And here below is the non-debug output of the command, to show the formats.
>> >> I would appreciate any help on this.
>> >>
>> >> Thanks!
>> >> Lukas
>> >>
>> >>
>> >> ffmpeg version n4.0.2-2 Copyright (c) 2000-2018 the FFmpeg developers
>> >> built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
>> >> configuration: --prefix=/usr --extra-version=2 --toolchain=hardened
>> >> --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu
>> >> --extra-cflags=-I/usr/local/include --extra-ldflags=-L/usr/local/lib
>> >> --enable-gpl --disable-stripping --enable-avresample --enable-avisynth
>> >> --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray
>> >> --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite
>> >> --enable-libfontconfig --enable-libfreetype --enable-libfribidi
>> >> --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa
>> >> --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse
>> >> --enable-librubberband --enable-librsvg --enable-libshine
>> >> --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh
>> >> --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx
>> >> --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2
>> >> --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx
>> >> --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394
>> >> --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r
>> >> --enable-libx264 --enable-shared --enable-vaapi --enable-vdpau
>> >> libavutil      56. 14.100 / 56. 14.100
>> >> libavcodec     58. 18.100 / 58. 18.100
>> >> libavformat    58. 12.100 / 58. 12.100
>> >> libavdevice    58.  3.100 / 58.  3.100
>> >> libavfilter     7. 16.100 /  7. 16.100
>> >> libavresample   4.  0.  0 /  4.  0.  0
>> >> libswscale      5.  1.100 /  5.  1.100
>> >> libswresample   3.  1.100 /  3.  1.100
>> >> libpostproc    55.  1.100 / 55.  1.100
>> >> Input #0, avi, from 'fs_experiental_method.avi':
>> >>   Metadata:
>> >>     encoder : Lavf57.83.100
>> >>   Duration: 00:33:38.10, start: 0.000000, bitrate: 8133 kb/s
>> >>     Stream #0:0: Video: h264 (Constrained Baseline) (H264 / 0x34363248), yuv420p(progressive), 1920x1080, 8057 kb/s, 30 fps, 30 tbr, 30 tbn, 60 tbc
>> >>     Stream #0:1: Audio: mp3 (U[0][0][0] / 0x0055), 32000 Hz, mono, fltp, 64 kb/s
>> >> Stream mapping:
>> >>   Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_vaapi))
>> >>   Stream #0:1 -> #0:1 (mp3 (mp3float) -> mp3 (libmp3lame))
>> >> Press [q] to stop, [?] for help
>> >> [h264_vaapi @ 0x55fcc47055c0] B frames are not supported (0x1) by the underlying driver.
>> >> [h264_vaapi @ 0x55fcc47055c0] Warning: some packed headers are not supported (want 0xd, got 0).
>> >> Output #0, avi, to 'output.avi':
>> >>   Metadata:
>> >>     ISFT : Lavf58.12.100
>> >>     Stream #0:0: Video: h264 (h264_vaapi) (Main) (H264 / 0x34363248), vaapi_vld, 1920x1080, q=0-31, 30 fps, 30 tbn, 30 tbc
>> >>     Metadata:
>> >>       encoder : Lavc58.18.100 h264_vaapi
>> >>     Stream #0:1: Audio: mp3 (libmp3lame) (U[0][0][0] / 0x0055), 32000 Hz, mono, fltp
>> >>     Metadata:
>> >>       encoder : Lavc58.18.100 libmp3lame
>> >> frame= 202 fps= 52 q=-0.0 Lsize= 4309kB time=00:00:06.80 bitrate=5187.5kbits/s speed=1.74x
>> >> video:4249kB audio:40kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.444606%
>> >> _______________________________________________
>> >> ffmpeg-user mailing list
>> >> [email protected]
>> >> http://ffmpeg.org/mailman/listinfo/ffmpeg-user
>> >>
>> >> To unsubscribe, visit link above, or email
>> >> [email protected] with subject "unsubscribe".
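
Since the end goal mentioned in this thread is to have all six RX570s transcoding in parallel, here is a minimal dry-run sketch of how one command per render node could be generated. It reuses a trimmed version of the flags from the command quoted above (the -sei and -level options are left out for brevity). Assumptions: render node minors start at 128 and follow the card order, so card1..card6 map to renderD129..renderD134 as in the ls output quoted above (this ordering is not guaranteed, and /dev/dri/by-path is the safer thing to check); index 0 is taken to be the onboard Intel device; the helper names and the input_N.avi / output_N.avi file names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: print one VAAPI transcode command per render node.
# Nothing here invokes ffmpeg; the commands are only echoed.

# Map a card index to its render node path.
# ASSUMPTION: renderD(128 + N) belongs to cardN on this machine.
render_node_for_index() {
    echo "/dev/dri/renderD$((128 + $1))"
}

# Print the ffmpeg command line for one render node, input and output file.
# The flags are a trimmed copy of the command quoted in this thread.
vaapi_transcode_cmd() {
    node="$1"; input="$2"; output="$3"
    echo "ffmpeg -init_hw_device vaapi=amd:${node} -hwaccel vaapi" \
         "-hwaccel_output_format vaapi -hwaccel_device amd -filter_hw_device amd" \
         "-i ${input} -vf 'format=nv12|vaapi,hwupload'" \
         "-c:v h264_vaapi -qp:v 21 -profile:v main -y ${output}"
}

# Indices 1..6 are assumed to be the six RX570s (index 0 = Intel iGPU).
for i in 1 2 3 4 5 6; do
    vaapi_transcode_cmd "$(render_node_for_index "$i")" \
        "input_${i}.avi" "output_${i}.avi"
done
```

The script only prints the command lines. Once they look right, each one could be started in the background from a shell and collected with wait, which would also show whether the cards scale independently of each other.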
