On Wed, Nov 13, 2024 at 02:06:58PM +0100, Anton Khirnov wrote: > Propagating decoder state between per-thread contexts with frame > threading currently works as follows: > 0) Every frame thread has its own "child" decoder context, > 1) Frame thread T0 decodes the frame header and updates its context > accordingly. At most one frame thread can be in this stage at any > given time. > 2) Frame thread T0 calls ff_thread_finish_setup() to indicate that > header decoding is done. > 3a) Frame thread T0 proceeds with decoding frame data. > 3b) The main thread calls the decoder's update_thread_context() > callback, transferring T0's state to the next thread T1. > > Since 3a) and 3b) run concurrently, during 3a) T0 must not write to any > context variables accessed by update_thread_context(), otherwise a data > race occurs. This approach turns out to be highly fragile in practice, > as developers are either not aware of this constraint, or fail to keep > it in mind while modifying decoders. > > This commit aims to eliminate the possibility of such races by changing > the logic in the folowing way: > * child decoders are no longer permanently bound to worker threads, but > are instead assigned dynamically only while decoding a single frame; a > different decoder may be assigned to the same thread for a later frame > * with N frame threads, N+1 child decoder contexts are allocated > (instead of N, as before), so at any time at least one decoder is > idle (unassigned) > * when a frame thread calls ff_thread_finish_setup(), its context state > is immediately and synchronously transferred to the idle context > * the idle context is then assigned to the next frame thread, whose > previous decoder now becomes idle > > With this approach, improperly updating a decoder context after > ff_thread_finish_setup() transforms from a race into a deterministic > failure to propagate the relevant variables to following frame threads, > which > * should be much easier to debug > * is no longer UB > --- > libavcodec/pthread_frame.c | 142 +++++++++++++++++++++---------------- > 1 file changed, 81 insertions(+), 61 deletions(-)
Segfaults: [h264 @ 0xa213ac0] ==3430921== Thread 2 av:h264:df0: ==3430921== Invalid read of size 4 ==3430921== at 0xC64D63: ff_thread_report_progress (pthread_frame.c:666) ==3430921== by 0x1100CD3: h264_field_start (h264_slice.c:1472) ==3430921== by 0x11038FE: ff_h264_queue_decode_slice (h264_slice.c:2153) ==3430921== by 0xA6AA24: decode_nal_units (h264dec.c:657) ==3430921== by 0xA6C03F: h264_decode_frame (h264dec.c:1070) ==3430921== by 0x99FC82: decode_simple_internal (decode.c:443) ==3430921== by 0x9A01E1: decode_simple_receive_frame (decode.c:613) ==3430921== by 0x9A038B: ff_decode_receive_frame_internal (decode.c:649) ==3430921== by 0xC63BD4: frame_worker_thread (pthread_frame.c:317) ==3430921== by 0x692C608: start_thread (pthread_create.c:477) ==3430921== by 0x6A66352: clone (clone.S:95) ==3430921== Address 0x134 is not stack'd, malloc'd or (recently) free'd ==3430921== ==3430921== ==3430921== Process terminating with default action of signal 11 (SIGSEGV) ==3430921== Access not within mapped region at address 0x134 ==3430921== at 0xC64D63: ff_thread_report_progress (pthread_frame.c:666) ==3430921== by 0x1100CD3: h264_field_start (h264_slice.c:1472) ==3430921== by 0x11038FE: ff_h264_queue_decode_slice (h264_slice.c:2153) ==3430921== by 0xA6AA24: decode_nal_units (h264dec.c:657) ==3430921== by 0xA6C03F: h264_decode_frame (h264dec.c:1070) ==3430921== by 0x99FC82: decode_simple_internal (decode.c:443) ==3430921== by 0x9A01E1: decode_simple_receive_frame (decode.c:613) ==3430921== by 0x9A038B: ff_decode_receive_frame_internal (decode.c:649) ==3430921== by 0xC63BD4: frame_worker_thread (pthread_frame.c:317) ==3430921== by 0x692C608: start_thread (pthread_create.c:477) ==3430921== by 0x6A66352: clone (clone.S:95) ==3430921== If you believe this happened as a result of a stack ==3430921== overflow in your program's main thread (unlikely but ==3430921== possible), you can try to increase the size of the ==3430921== main thread stack using the --main-stacksize= flag. ==3430921== The main thread stack size used in this run was 8388608. Testcase: -threads 2 -i ~/tickets/2927/h264_dead.avi -threads 1 -f null - (if sample is not on trac or server, i can provide it) thx [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB I do not agree with what you have to say, but I'll defend to the death your right to say it. -- Voltaire
signature.asc
Description: PGP signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".