On 28/05/17 12:22, Andy Furniss wrote:
> ==13089== HEAP SUMMARY:
> ==13089==     in use at exit: 1,641,516,251 bytes in 201,131 blocks
> ==13089==   total heap usage: 2,075,668 allocs, 1,874,537 frees, 
> 68,823,339,225 bytes allocated
> ==13089==
> ==13089== 368 bytes in 1 blocks are possibly lost in loss record 908 of 932
> ==13089==    at 0x4C28B06: malloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:299)
> ==13089==    by 0xFBED2DE: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:56)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2C21: ff_vaapi_decode_make_param_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:40)
> ==13089==    by 0xAFDA07: vaapi_h264_start_frame 
> (/ffmpeg/libavcodec/vaapi_h264.c:286)
> ==13089==    by 0x853E0B: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:690)
> ==13089==    by 0x853E0B: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== 22,968 bytes in 11 blocks are possibly lost in loss record 922 of 
> 932
> ==13089==    at 0x4C28B06: malloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:299)
> ==13089==    by 0xFBED2DE: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:56)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2D46: ff_vaapi_decode_make_slice_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:86)
> ==13089==    by 0xAFCCBB: vaapi_h264_decode_slice 
> (/ffmpeg/libavcodec/vaapi_h264.c:380)
> ==13089==    by 0x853CCD: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:703)
> ==13089==    by 0x853CCD: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== 3,042,432 (800,640 direct, 2,241,792 indirect) bytes in 10,008 
> blocks are definitely lost in loss record 925 of 932
> ==13089==    at 0x4C2A898: calloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:711)
> ==13089==    by 0xFBED2C0: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:49)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2C21: ff_vaapi_decode_make_param_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:40)
> ==13089==    by 0xAFDBA9: vaapi_h264_start_frame 
> (/ffmpeg/libavcodec/vaapi_h264.c:299)
> ==13089==    by 0x853E0B: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:690)
> ==13089==    by 0x853E0B: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== 4,483,216 (800,640 direct, 3,682,576 indirect) bytes in 10,008 
> blocks are definitely lost in loss record 927 of 932
> ==13089==    at 0x4C2A898: calloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:711)
> ==13089==    by 0xFBED2C0: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:49)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2C21: ff_vaapi_decode_make_param_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:40)
> ==13089==    by 0xAFDA07: vaapi_h264_start_frame 
> (/ffmpeg/libavcodec/vaapi_h264.c:286)
> ==13089==    by 0x853E0B: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:690)
> ==13089==    by 0x853E0B: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== 6,872,207 bytes in 150 blocks are possibly lost in loss record 928 
> of 932
> ==13089==    at 0x4C28B06: malloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:299)
> ==13089==    by 0xFBED2DE: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:56)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2DA5: ff_vaapi_decode_make_slice_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:100)
> ==13089==    by 0xAFCCBB: vaapi_h264_decode_slice 
> (/ffmpeg/libavcodec/vaapi_h264.c:380)
> ==13089==    by 0x853CCD: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:703)
> ==13089==    by 0x853CCD: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== 86,766,408 (3,202,560 direct, 83,563,848 indirect) bytes in 40,032 
> blocks are definitely lost in loss record 930 of 932
> ==13089==    at 0x4C2A898: calloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:711)
> ==13089==    by 0xFBED2C0: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:49)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2D46: ff_vaapi_decode_make_slice_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:86)
> ==13089==    by 0xAFCCBB: vaapi_h264_decode_slice 
> (/ffmpeg/libavcodec/vaapi_h264.c:380)
> ==13089==    by 0x853CCD: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:703)
> ==13089==    by 0x853CCD: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== 1,540,168,859 (3,202,560 direct, 1,536,966,299 indirect) bytes in 
> 40,032 blocks are definitely lost in loss record 932 of 932
> ==13089==    at 0x4C2A898: calloc 
> (/home/andy/Src2016/valgrind-3.12.0/coregrind/m_replacemalloc/vg_replace_malloc.c:711)
> ==13089==    by 0xFBED2C0: vlVaCreateBuffer 
> (/mesa/src/gallium/state_trackers/va/buffer.c:49)
> ==13089==    by 0x603BB82: vaCreateBuffer (/libva/va/va.c:1116)
> ==13089==    by 0xED2DA5: ff_vaapi_decode_make_slice_buffer 
> (/ffmpeg/libavcodec/vaapi_decode.c:100)
> ==13089==    by 0xAFCCBB: vaapi_h264_decode_slice 
> (/ffmpeg/libavcodec/vaapi_h264.c:380)
> ==13089==    by 0x853CCD: decode_nal_units (/ffmpeg/libavcodec/h264dec.c:703)
> ==13089==    by 0x853CCD: h264_decode_frame 
> (/ffmpeg/libavcodec/h264dec.c:1008)
> ==13089==    by 0xA0A54F: frame_worker_thread 
> (/ffmpeg/libavcodec/pthread_frame.c:199)
> ==13089==    by 0x8D12433: start_thread (in /lib/libpthread-2.22.so)
> ==13089==    by 0x901106C: clone (in /lib/libc-2.22.so)
> ==13089==
> ==13089== LEAK SUMMARY:
> ==13089==    definitely lost: 8,006,400 bytes in 100,080 blocks
> ==13089==    indirectly lost: 1,626,454,515 bytes in 99,918 blocks
> ==13089==      possibly lost: 6,895,543 bytes in 162 blocks
> ==13089==    still reachable: 159,793 bytes in 971 blocks
> ==13089==         suppressed: 0 bytes in 0 blocks
> ==13089== Reachable blocks (those to which a pointer was found) are not shown.
> ==13089== To see them, rerun with: --leak-check=full --show-leak-kinds=all
> ==13089==
> ==13089== For counts of detected and suppressed errors, rerun with: -v
> ==13089== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 0 from 0)

Hmph, looks like the parameter buffer confusion strikes again.


The original VAAPI specification contained the following functions:

"""
/**
 * After this call, the buffer is deleted and this buffer_id is no longer valid
 * Only call this if the buffer is not going to be passed to vaRenderBuffer
 */
VAStatus vaDestroyBuffer (
...
/**
 * Send decode buffers to the server.
 * Buffers are automatically destroyed afterwards
 */
VAStatus vaRenderPicture (
"""

The newer lavc implementation follows this, and does not call vaDestroyBuffer() 
on buffers passed to vaRenderPicture().  Not doing this would cause random 
crashes in multithreaded programs, because another buffer with the same ID 
could be created between the two calls.

However, the Intel implementation never actually followed the specification - 
it leaked the buffers passed to vaRenderPicture().  So, a special driver quirk 
was added to detect that driver and additionally destroy the buffers in that 
case: 
<http://git.videolan.org/?p=ffmpeg.git;a=commit;h=4926fa9a4aa03f3b751f52e900b9efb87fea0591>.
  This restored same behaviour as old lavc with the Intel driver without 
breaking other implementations.

That worked for a bit.

Unfortunately, Intel got tired of people complaining that their driver was 
broken.  Rather than fixing it, they decided to change the specification: 
<https://github.com/01org/libva/commit/3eb038aa13bdd785808286c0a4995bd7a1ef07e9>
 (this is an ABI and API break, and the SONAME was not changed).

Thus we have the state now:

VAAPI < 0.40:
A)  Destroy buffers: wrong, will crash in multithreaded programs.
B)  Don't destroy buffers: correct, driver quirk restores behaviour for Intel.

VAAPI >= 0.40:
A)  Destroy buffers: correct, but may crash old drivers which haven't been 
updated (not detectable, because the library allows loading old drivers without 
telling the user).
B)  Don't destroy buffers: wrong, driver quirk restores correct behaviour for 
Intel, others may leak memory.

lavc currently chooses (B) in both cases on the grounds that leaking memory is 
preferable to crashing randomly.  There is some thought of switching to (A) for 
VAAPI >= 0.40, but given that we can't detect the version at runtime and the 
libraries consider everything to be compatible this seems dangerous.


For Mesa, I feel like I remember it having a correct < 0.40 implementation, but 
looking again it doesn't seem to and there isn't any obvious change (maybe I am 
just wrong).  Do you know if this changed?  Adding the driver quirk like the 
Intel driver is probably the right thing to do if it has always been like that, 
if it hasn't then we need more magic to be able to distinguish versions to 
prevent older ones from crashing.

Thanks,

- Mark
_______________________________________________
ffmpeg-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

Reply via email to