On Mon, Mar 14, 2016 at 09:59:43AM +0000, Wang Nan wrote:
> Convert perf_output_begin to __perf_output_begin and make the later
> function able to write records from the end of the ring buffer.
> Following commits will utilize the 'backward' flag.
>
> This patch doesn't introduce any extra performance overhead since we
> use always_inline.
So while I agree that with __always_inline and constant propagation we
_should_ end up with the same code, we have:

$ size defconfig-build/kernel/events/ring_buffer.o.{pre,post}
   text    data     bss     dec     hex filename
   3785       2       0    3787     ecb defconfig-build/kernel/events/ring_buffer.o.pre
   3673       2       0    3675     e5b defconfig-build/kernel/events/ring_buffer.o.post

The patch actually makes the file shrink. So I think we still want to
have some actual performance numbers.