On Tue, 12 Dec 2023 11:16:17 -0500
Steven Rostedt <rost...@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rost...@goodmis.org>
> 
> The maximum ring buffer data size is the maximum size of data that can be
> recorded on the ring buffer. Events must be smaller than the sub buffer
> data size minus any meta data. This size is checked before trying to
> allocate from the ring buffer because the allocation assumes that the size
> will fit on the sub buffer.
> 
> The maximum size was calculated as the size of a sub buffer page (which is
> currently PAGE_SIZE minus the sub buffer header) minus the size of the
> meta data of an individual event. But it missed the possible adding of a
> time stamp for events that are added long enough apart that the event meta
> data can't hold the time delta.
> 
> When an event is added that is greater than the current BUF_MAX_DATA_SIZE
> minus the size of a time stamp, but still less than or equal to
> BUF_MAX_DATA_SIZE, the ring buffer would go into an infinite loop, looking
> for a page that can hold the event. Luckily, there's a check for this loop
> and after 1000 iterations a warning is emitted and the ring buffer is
> disabled. But this should never happen.
> 
> This can happen when a large event is added first, or after a long period
> where an absolute timestamp is prefixed to the event, increasing its size
> by 8 bytes. This passes the check and then goes into the algorithm that
> causes the infinite loop.
> 
> An event that is the first event on the sub-buffer does not need an
> added timestamp, because the sub-buffer itself contains an absolute
> timestamp, and adding one is redundant.
> 
> The fix is to check if the event is to be the first event on the
> sub-buffer, and if it is, then do not add a timestamp.
> 
> This also fixes 32-bit architectures adding a timestamp when a read of
> before_stamp or write_stamp is interrupted. There's still no need to add
> that timestamp if the event is going to be the first event on the sub buffer.
> 
> Also, if the buffer has "time_stamp_abs" set, then also check if the
> length plus the timestamp is greater than the BUF_MAX_DATA_SIZE.
> 
> Link: https://lore.kernel.org/all/20231212104549.58863...@gandalf.local.home/
> Link: https://lore.kernel.org/linux-trace-kernel/20231212071837.5fdd6...@gandalf.local.home
> 
> Cc: sta...@vger.kernel.org
> Fixes: a4543a2fa9ef3 ("ring-buffer: Get timestamp after event is allocated")
> Fixes: 58fbc3c63275c ("ring-buffer: Consolidate add_timestamp to remove some branches")
> Reported-by: Kent Overstreet <kent.overstr...@linux.dev> # (on IRC)
> Signed-off-by: Steven Rostedt (Google) <rost...@goodmis.org>

This looks good to me :)

Acked-by: Masami Hiramatsu (Google) <mhira...@kernel.org>

Thank you!
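
For anyone following the thread, the two checks the patch adds can be
sketched in user-space C. This is only a simplified model under stated
assumptions, not the kernel code: the model_* names and the value of
MODEL_MAX_DATA_SIZE are hypothetical, and the only figure taken from the
commit message is the 8-byte timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BUF_MAX_DATA_SIZE; the real value depends on
 * PAGE_SIZE and the kernel's header sizes. */
#define MODEL_MAX_DATA_SIZE   4080u
/* An added timestamp grows the event by 8 bytes, per the commit message. */
#define MODEL_LEN_TIME_EXTEND 8u

/* Hypothetical, reduced version of the reservation state. */
struct model_info {
	size_t length;       /* bytes the event will occupy */
	bool add_timestamp;  /* whether an extra timestamp is prefixed */
};

/*
 * Models the shape of the patched branch in __rb_reserve_next():
 * test w first, and only then look at the other conditions.
 */
static void model_stamp_decision(struct model_info *info, size_t w,
				 bool a_ok, bool b_ok,
				 unsigned long long before,
				 unsigned long long after)
{
	assert(info != NULL);

	if (w == 0) {
		/* First event on the sub-buffer: the sub-buffer's own
		 * timestamp is used, so the event does not grow. */
		info->add_timestamp = false;
	} else if (!a_ok || !b_ok || before != after) {
		/* Interrupted stamp read or mismatched stamps: force a
		 * timestamp and account for its size. */
		info->add_timestamp = true;
		info->length += MODEL_LEN_TIME_EXTEND;
	}
}

/*
 * Models the check added to rb_reserve_next_event() for the
 * "time_stamp_abs" case: the event plus its timestamp must still fit.
 */
static bool model_abs_event_fits(size_t length)
{
	return length + MODEL_LEN_TIME_EXTEND <= MODEL_MAX_DATA_SIZE;
}
```

In this model, an event of exactly MODEL_MAX_DATA_SIZE bytes reserved at
w == 0 keeps its length, whereas the old condition would have grown it past
the limit whenever a_ok or b_ok was false.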

> ---
> Changes since v2: https://lore.kernel.org/linux-trace-kernel/20231212065922.05f28...@gandalf.local.home
> 
> - Just test 'w' first, and then do the rest of the checks.
> 
>  kernel/trace/ring_buffer.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 8d2a4f00eca9..b8986f82eccf 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3579,7 +3579,10 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>                * absolute timestamp.
>                * Don't bother if this is the start of a new page (w == 0).
>                */
> -             if (unlikely(!a_ok || !b_ok || (info->before != info->after && w))) {
> +             if (!w) {
> +                     /* Use the sub-buffer timestamp */
> +                     info->delta = 0;
> +             } else if (unlikely(!a_ok || !b_ok || info->before != info->after)) {
>                       info->add_timestamp |= RB_ADD_STAMP_FORCE | RB_ADD_STAMP_EXTEND;
>                       info->length += RB_LEN_TIME_EXTEND;
>               } else {
> @@ -3737,6 +3740,8 @@ rb_reserve_next_event(struct trace_buffer *buffer,
>       if (ring_buffer_time_stamp_abs(cpu_buffer->buffer)) {
>               add_ts_default = RB_ADD_STAMP_ABSOLUTE;
>               info.length += RB_LEN_TIME_EXTEND;
> +             if (info.length > BUF_MAX_DATA_SIZE)
> +                     goto out_fail;
>       } else {
>               add_ts_default = RB_ADD_STAMP_NONE;
>       }
> -- 
> 2.42.0
> 


-- 
Masami Hiramatsu (Google) <mhira...@kernel.org>
