Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-12-14 Thread Anton Khirnov
Quoting Wan-Teh Chang (2016-12-09 18:54:47)
> This improves commit 59c70227405c214b29971e6272f3a3ff6fcce3d0.
> 
> In ff_thread_report_progress(), the fast code path can load
> progress[field] with the relaxed memory order, and the slow code path
> can store progress[field] with the release memory order. These changes
> are mainly intended to avoid confusion when one inspects the source code.
> They are unlikely to have measurable performance improvement.
> 
> ff_thread_report_progress() and ff_thread_await_progress() form a pair.
> ff_thread_await_progress() reads progress[field] with the acquire memory
> order (in the fast code path). Therefore, one expects to see
> ff_thread_report_progress() write progress[field] with the matching
> release memory order.
> 
> In the fast code path in ff_thread_report_progress(), the atomic load of
> progress[field] doesn't need the acquire memory order because the
> calling thread is trying to make the data it just decoded visible to the
> other threads, rather than trying to read the data decoded by other
> threads.
> 
> In ff_thread_get_buffer(), initialize progress[0] and progress[1] using
> atomic_init().
> 
> Signed-off-by: Wan-Teh Chang 
> ---
>  libavcodec/pthread_frame.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

Pushed, thanks.

-- 
Anton Khirnov
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-12-09 Thread Wan-Teh Chang
On Fri, Dec 9, 2016 at 10:22 AM, Sean McGovern  wrote:
> Hi Wan-Teh,
>
[...]
> Does this by any chance fix https://bugzilla.libav.org/show_bug.cgi?id=930 ?

Hi Sean,

No. My patch is completely unrelated to that bug.

That bug is a crash caused by a null pointer dereference. The crash
occurs when preparing the first argument for
ff_thread_report_progress(). I believe the process didn't enter the
ff_thread_report_progress() function, even though
ff_thread_report_progress() is in the callback.

Wan-Teh Chang
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-12-09 Thread Sean McGovern
Hi Wan-Teh,

On Dec 9, 2016 12:55, "Wan-Teh Chang"  wrote:

This improves commit 59c70227405c214b29971e6272f3a3ff6fcce3d0.

In ff_thread_report_progress(), the fast code path can load
progress[field] with the relaxed memory order, and the slow code path
can store progress[field] with the release memory order. These changes
are mainly intended to avoid confusion when one inspects the source code.
They are unlikely to have measurable performance improvement.

ff_thread_report_progress() and ff_thread_await_progress() form a pair.
ff_thread_await_progress() reads progress[field] with the acquire memory
order (in the fast code path). Therefore, one expects to see
ff_thread_report_progress() write progress[field] with the matching
release memory order.

In the fast code path in ff_thread_report_progress(), the atomic load of
progress[field] doesn't need the acquire memory order because the
calling thread is trying to make the data it just decoded visible to the
other threads, rather than trying to read the data decoded by other
threads.

In ff_thread_get_buffer(), initialize progress[0] and progress[1] using
atomic_init().

Signed-off-by: Wan-Teh Chang 
---
 libavcodec/pthread_frame.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/pthread_frame.c b/libavcodec/pthread_frame.c
index 2736a81..142eaa5 100644
--- a/libavcodec/pthread_frame.c
+++ b/libavcodec/pthread_frame.c
@@ -458,7 +458,7 @@ void ff_thread_report_progress(ThreadFrame *f, int n,
int field)
 atomic_int *progress = f->progress ? (atomic_int*)f->progress->data :
NULL;

 if (!progress ||
-atomic_load_explicit([field], memory_order_acquire) >= n)
+atomic_load_explicit([field], memory_order_relaxed) >= n)
 return;

 p = f->owner->internal->thread_ctx;
@@ -468,7 +468,7 @@ void ff_thread_report_progress(ThreadFrame *f, int n,
int field)

 pthread_mutex_lock(>progress_mutex);

-atomic_store([field], n);
+atomic_store_explicit([field], n, memory_order_release);

 pthread_cond_broadcast(>progress_cond);
 pthread_mutex_unlock(>progress_mutex);
@@ -745,8 +745,8 @@ int ff_thread_get_buffer(AVCodecContext *avctx,
ThreadFrame *f, int flags)
 }
 progress = (atomic_int*)f->progress->data;

-atomic_store([0], -1);
-atomic_store([1], -1);
+atomic_init([0], -1);
+atomic_init([1], -1);
 }

 pthread_mutex_lock(>parent->buffer_mutex);
--
2.8.0.rc3.226.g39d4020

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Does this by any chance fix https://bugzilla.libav.org/show_bug.cgi?id=930 ?

Sean McGovern -  +1 613 862 6507
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-12-08 Thread Wan-Teh Chang
Hi Luca,

I will update the patch with the longer description.

On Wed, Dec 7, 2016 at 5:42 PM, Luca Barbato  wrote:
>
> Assuming it is not causing noticeable slowdowns/speedups (a benchmark
> would be welcome), I'm all for consistency.

I assume that initializing progress[0] and progress[1] with
atomic_init() instead of atomic_store() is not controversial. The
changes to ff_thread_report_progress() are unlikely to cause
noticeable slowdowns/speedups. In this kind of situation I often
inspect the code generated by the compiler to evaluate the changes.

In this analysis I use clang 3.8.0, targeting x86_64, at optimization
level -O3. (I admit a compiler targeting ARM would be better.)

The first change in ff_thread_report_progress(), to the
atomic_load_explicit() call, results in no change in the generated
code.

The second change in ff_thread_report_progress(), to the
atomic_store(), changes an xchgl instruction to a movl instruction.

Here is the diff between the generated code (original vs. modified) of
the end of ff_thread_report_progress(). The original C source code is:
469 pthread_mutex_lock(>progress_mutex);
470
471 atomic_store([field], n);
472
473 pthread_cond_broadcast(>progress_cond);
474 pthread_mutex_unlock(>progress_mutex);
475 }

I change line 471 to:

471 atomic_store_explicit([field], n, memory_order_release);

The diff between generated code is:

==
 .Ltmp154:
 .LBB2_5:
#DEBUG_VALUE: ff_thread_report_progress:field <- %R9D
#DEBUG_VALUE: ff_thread_report_progress:n <- %EBP
#DEBUG_VALUE: ff_thread_report_progress:progress <- %RBX
#DEBUG_VALUE: ff_thread_report_progress:p <- %R14
.loc6 469 28# libavcodec/pthread_frame.c:469:28
leaq208(%r14), %r15
.loc6 469 5 is_stmt 0   # libavcodec/pthread_frame.c:469:5
movq%r15, %rdi
callq   pthread_mutex_lock
.loc6 471 5 is_stmt 1   # libavcodec/pthread_frame.c:471:5
-   xchgl   %ebp, (%rbx,%r12,4)
-.Ltmp155:
+   movl%ebp, (%rbx,%r12,4)
.loc6 473 32# libavcodec/pthread_frame.c:473:32
addq$72, %r14
-.Ltmp156:
+.Ltmp155:
.loc6 473 5 is_stmt 0   # libavcodec/pthread_frame.c:473:5
movq%r14, %rdi
callq   pthread_cond_broadcast
.loc6 474 5 is_stmt 1   # libavcodec/pthread_frame.c:474:5
movq%r15, %rdi
popq%rbx
-.Ltmp157:
+.Ltmp156:
popq%r12
popq%r14
popq%r15
popq%rbp
+.Ltmp157:
jmp pthread_mutex_unlock# TAILCALL
 .Ltmp158:
 .LBB2_6:# %.thread
#DEBUG_VALUE: ff_thread_report_progress:n <- %EBP
#DEBUG_VALUE: ff_thread_report_progress:field <- %R9D
#DEBUG_VALUE: ff_thread_report_progress:f <- %RDI
.loc6 475 1 discriminator 2 # libavcodec/pthread_frame.c:475:1
popq%rbx
popq%r12
popq%r14
popq%r15
popq%rbp
 .Ltmp159:
retq
 .Ltmp160:
 .Lfunc_end2:
.size   ff_thread_report_progress, .Lfunc_end2-ff_thread_report_progress
.cfi_endproc
==

Wan-Teh Chang
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-12-07 Thread Luca Barbato
On 30/11/2016 23:13, Wan-Teh Chang wrote:
> This patch is intended to avoid confusion when one inspects the source code.
> 
> ff_thread_report_progress() and ff_thread_await_progress() form a
> pair. ff_thread_await_progress() reads progress[field] with the
> "acquire" memory order (in the fast code path). Therefore, one expects
> to see ff_thread_report_progress() write progress[field] with the
> matching "release" memory order.
> 
> In the fast code path in  ff_thread_report_progress(), the atomic load
> of progress[field] doesn't need the "acquire" memory order because the
> calling thread is trying to make the data it just decoded visible to
> the other threads, rather than trying to read the data decoded by
> other threads.
> 
> Please let me know if you have other questions.

Assuming it is not causing noticeable slowdowns/speedups (a benchmark
would be welcome), I'm all for consistency.

Would be nice if you update the patch with this longer description though.

Anton, do you agree?

lu

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-11-30 Thread Wan-Teh Chang
Hi Anton,

On Wed, Nov 30, 2016 at 1:03 PM, Anton Khirnov  wrote:
>
> Did you see any measurable change in performance after this patch?

This patch is intended to avoid confusion when one inspects the source code.

ff_thread_report_progress() and ff_thread_await_progress() form a
pair. ff_thread_await_progress() reads progress[field] with the
"acquire" memory order (in the fast code path). Therefore, one expects
to see ff_thread_report_progress() write progress[field] with the
matching "release" memory order.

In the fast code path in  ff_thread_report_progress(), the atomic load
of progress[field] doesn't need the "acquire" memory order because the
calling thread is trying to make the data it just decoded visible to
the other threads, rather than trying to read the data decoded by
other threads.

Please let me know if you have other questions.

Thanks,
Wan-Teh Chang
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] pthread_frame: use better memory orders for frame progress

2016-11-30 Thread Anton Khirnov
Quoting Wan-Teh Chang (2016-11-30 21:33:37)
> This improves commit 59c70227405c214b29971e6272f3a3ff6fcce3d0.
> 
> In ff_thread_report_progress(), the fast code path can load
> progress[field] with the relaxed memory order, and the slow code path
> can store progress[field] with the release memory order.
> 
> In ff_thread_get_buffer(), initialize progress[0] and progress[1] using
> atomic_init().
> 
> Signed-off-by: Wan-Teh Chang 
> ---
>  libavcodec/pthread_frame.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/libavcodec/pthread_frame.c b/libavcodec/pthread_frame.c
> index 2736a81..142eaa5 100644
> --- a/libavcodec/pthread_frame.c
> +++ b/libavcodec/pthread_frame.c
> @@ -458,7 +458,7 @@ void ff_thread_report_progress(ThreadFrame *f, int n, int 
> field)
>  atomic_int *progress = f->progress ? (atomic_int*)f->progress->data : 
> NULL;
>  
>  if (!progress ||
> -atomic_load_explicit([field], memory_order_acquire) >= n)
> +atomic_load_explicit([field], memory_order_relaxed) >= n)
>  return;
>  
>  p = f->owner->internal->thread_ctx;
> @@ -468,7 +468,7 @@ void ff_thread_report_progress(ThreadFrame *f, int n, int 
> field)
>  
>  pthread_mutex_lock(>progress_mutex);
>  
> -atomic_store([field], n);
> +atomic_store_explicit([field], n, memory_order_release);
>  
>  pthread_cond_broadcast(>progress_cond);
>  pthread_mutex_unlock(>progress_mutex);
> @@ -745,8 +745,8 @@ int ff_thread_get_buffer(AVCodecContext *avctx, 
> ThreadFrame *f, int flags)
>  }
>  progress = (atomic_int*)f->progress->data;
>  
> -atomic_store([0], -1);
> -atomic_store([1], -1);
> +atomic_init([0], -1);
> +atomic_init([1], -1);
>  }
>  
>  pthread_mutex_lock(>parent->buffer_mutex);
> -- 
> 2.8.0.rc3.226.g39d4020

Did you see any measurable change in performance after this patch?

I used to have lot more explicit memory orders in my original version of
the patch, but then I dropped most of them, since I thought they were
unlikely to have any significant effect and just added a chance of hard
to detect racy bugs.

-- 
Anton Khirnov
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel