On 06/19/2015 04:33 PM, Andi Kleen wrote:
>> > I *think* we can avoid taking the srcu_read_lock() for the
>> > common case where there are no actual marks on the file
>> > being modified *or* the vfsmount.
> What is so expensive in it? Just the memory barrier in it?

The profile doesn't attribute any samples to the mfence itself, but I
assume that the overhead is coming from there.  The "mov    0x8(%rdi),%rcx"
is identical before and after the barrier, but it appears much more
expensive _after_ (7.51% vs. 0.58%).  That makes no sense unless the
barrier is the thing causing it.

Here's how the annotation mode of 'perf top' breaks it down:

>        │    ffffffff810fb480 <load0>:
>        │      nop
>        │      mov    (%rdi),%rax
>   0.58 │      push   %rbp
>        │      incl   %gs:0x7ef0f488(%rip)
>   1.73 │      mov    %rsp,%rbp
>        │      and    $0x1,%eax
>        │      movslq %eax,%rdx
>   0.58 │      mov    0x8(%rdi),%rcx
>        │      incq   %gs:(%rcx,%rdx,8)
>        │      mfence
>  69.94 │      add    $0x2,%rdx
>   7.51 │      mov    0x8(%rdi),%rcx
>   4.05 │      incq   %gs:(%rcx,%rdx,8)
>  13.87 │      decl   %gs:0x7ef0f45f(%rip)
>        │      pop    %rbp
>   1.73 │    ← retq
