Hi Luca,

I will update the patch with the longer description.

On Wed, Dec 7, 2016 at 5:42 PM, Luca Barbato <[email protected]> wrote:
>
> Assuming it is not causing noticeable slowdowns/speedups (a benchmark
> would be welcome), I'm all for consistency.

I assume that initializing progress[0] and progress[1] with
atomic_init() instead of atomic_store() is not controversial. The
changes to ff_thread_report_progress() are unlikely to cause
noticeable slowdowns/speedups. In this kind of situation I often
inspect the code generated by the compiler to evaluate the changes.

In this analysis I use clang 3.8.0, targeting x86_64, at optimization
level -O3. (I admit a compiler targeting ARM would be better.)

The first change in ff_thread_report_progress(), to the
atomic_load_explicit() call, results in no change in the generated
code.

The second change in ff_thread_report_progress(), to the
atomic_store(), changes an xchgl instruction to a movl instruction.

Here is the diff between the generated code (original vs. modified) of
the end of ff_thread_report_progress(). The original C source code is:
469     pthread_mutex_lock(&p->progress_mutex);
470
471     atomic_store(&progress[field], n);
472
473     pthread_cond_broadcast(&p->progress_cond);
474     pthread_mutex_unlock(&p->progress_mutex);
475 }

I change line 471 to:

471     atomic_store_explicit(&progress[field], n, memory_order_release);

The diff between generated code is:

==========
 .Ltmp154:
 .LBB2_5:
        #DEBUG_VALUE: ff_thread_report_progress:field <- %R9D
        #DEBUG_VALUE: ff_thread_report_progress:n <- %EBP
        #DEBUG_VALUE: ff_thread_report_progress:progress <- %RBX
        #DEBUG_VALUE: ff_thread_report_progress:p <- %R14
        .loc    6 469 28                # libavcodec/pthread_frame.c:469:28
        leaq    208(%r14), %r15
        .loc    6 469 5 is_stmt 0       # libavcodec/pthread_frame.c:469:5
        movq    %r15, %rdi
        callq   pthread_mutex_lock
        .loc    6 471 5 is_stmt 1       # libavcodec/pthread_frame.c:471:5
-       xchgl   %ebp, (%rbx,%r12,4)
-.Ltmp155:
+       movl    %ebp, (%rbx,%r12,4)
        .loc    6 473 32                # libavcodec/pthread_frame.c:473:32
        addq    $72, %r14
-.Ltmp156:
+.Ltmp155:
        .loc    6 473 5 is_stmt 0       # libavcodec/pthread_frame.c:473:5
        movq    %r14, %rdi
        callq   pthread_cond_broadcast
        .loc    6 474 5 is_stmt 1       # libavcodec/pthread_frame.c:474:5
        movq    %r15, %rdi
        popq    %rbx
-.Ltmp157:
+.Ltmp156:
        popq    %r12
        popq    %r14
        popq    %r15
        popq    %rbp
+.Ltmp157:
        jmp     pthread_mutex_unlock    # TAILCALL
 .Ltmp158:
 .LBB2_6:                                # %.thread
        #DEBUG_VALUE: ff_thread_report_progress:n <- %EBP
        #DEBUG_VALUE: ff_thread_report_progress:field <- %R9D
        #DEBUG_VALUE: ff_thread_report_progress:f <- %RDI
        .loc    6 475 1 discriminator 2 # libavcodec/pthread_frame.c:475:1
        popq    %rbx
        popq    %r12
        popq    %r14
        popq    %r15
        popq    %rbp
 .Ltmp159:
        retq
 .Ltmp160:
 .Lfunc_end2:
        .size   ff_thread_report_progress, .Lfunc_end2-ff_thread_report_progress
        .cfi_endproc
==========

Wan-Teh Chang
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to