On Wed, May 29, 2019 at 05:06:40PM +0100, David Howells wrote:

> Looking at the perf ring buffer, there appears to be a missing barrier in
> perf_aux_output_end():
> 
>       rb->user_page->aux_head = rb->aux_head;
> 
> should be:
> 
>       smp_store_release(&rb->user_page->aux_head, rb->aux_head);

I've answered that in another email; the aux bit is 'magic'.

> It should also be using smp_load_acquire().  See
> Documentation/core-api/circular-buffers.rst

We use the control dependency instead, as described in the comment of
perf_output_put_handle():

         *   kernel                             user
         *
         *   if (LOAD ->data_tail) {            LOAD ->data_head
         *                      (A)             smp_rmb()       (C)
         *      STORE $data                     LOAD $data
         *      smp_wmb()       (B)             smp_mb()        (D)
         *      STORE ->data_head               STORE ->data_tail
         *   }
         *
         * Where A pairs with D, and B pairs with C.
         *
         * In our case (A) is a control dependency that separates the load of
         * the ->data_tail and the stores of $data. In case ->data_tail
         * indicates there is no room in the buffer to store $data we do not.
         *
         * D needs to be a full barrier since it separates the data READ
         * from the tail WRITE.
         *
         * For B a WMB is sufficient since it separates two WRITEs, and for C
         * an RMB is sufficient since it separates two READs.

Userspace can choose to use smp_load_acquire() over the first smp_rmb()
if that is efficient for the architecture (for w ahole bunch of archs
load-acquire would end up using mb() while rmb() is adequate and
cheaper).

Reply via email to