[
Replying to Sashiko from v19:
https://sashiko.dev/?list=org.kernel.vger.linux-trace-kernel#/patchset/177751968499.2136606.17388366710182662849.stgit%40mhiramat.tok.corp.google.com
]
> > @@ -1892,9 +1895,27 @@ static int rb_validate_buffer(struct
> > buffer_data_page *dpage, int cpu,
> > * subbuf_size is considered invalid.
> > */
> > tail = local_read(&dpage->commit) & ~RB_MISSED_MASK;
> > - if (tail > meta->subbuf_size - BUF_PAGE_HDR_SIZE)
> > - return -1;
> > - return rb_read_data_buffer(dpage, tail, cpu, &ts, &delta);
> > + if (tail <= meta->subbuf_size - BUF_PAGE_HDR_SIZE)
> > + ret = rb_read_data_buffer(dpage, tail, cpu, &ts, &delta);
> Since tail is read from untrusted persistent memory, does this call evaluating
The persistent memory is not untrusted.
> event headers verify enough bytes remain in the sub-buffer?
> If tail is set to the maximum allowed value, and preceding events are crafted
> so that the last event starts at tail - 1, the event pointer inside
> rb_read_data_buffer() will point to the very last byte.
> Calling rb_event_length(event) will evaluate event->type_len via a 32-bit
> read which could read past the page boundary and cause a regression if the
> adjacent page is unmapped.
> Should the parsing loop verify tail - e >= sizeof(struct ring_buffer_event)
> before evaluating the event?
> > + else
> > + ret = -1;
> > +
> > + /*
> > + * The timestamp must be greater than @prev_ts and smaller than
> > @next_ts.
> > + * Since this function works in both forward (verify) and reverse
> > (unwind)
> > + * loop, we don't know both @prev_ts and @next_ts at the same time.
> > + * So use the known boundary as the boundary.
> > + */
> > + if (ret < 0 || (prev_ts && prev_ts > ts) || (next_ts && ts > next_ts)) {
> Does this timestamp validation properly enforce strictly monotonic page
> ordering?
> Inside rb_validate_buffer(), ts is updated by rb_read_data_buffer() to
> the timestamp of the last event in the current page. The condition
> (prev_ts && prev_ts > ts) only checks that the previous page's start time
> is not greater than the current page's end time.
> Could a corrupted page with a start timestamp earlier than the previous
> page incorrectly pass validation if its final event timestamp is large enough?
> Would it prevent this regression to enforce that the current page starts after
> the previous page by checking prev_ts > dpage->time_stamp?
This may be an enhancement but shouldn't be critical for what is happening now.
> > + local_set(&bpage->entries, 0);
> [ ... ]
> > @@ -1915,25 +1936,29 @@ static void rb_meta_validate_events(struct
> > ring_buffer_per_cpu *cpu_buffer)
> > orig_head = head_page = cpu_buffer->head_page;
> > orig_reader = cpu_buffer->reader_page;
> >
> > - /* Do the reader page first */
> > - ret = rb_validate_buffer(orig_reader->page, cpu_buffer->cpu, meta);
> > + /* Do the head page first */
> > + ret = rb_validate_buffer(head_page, cpu_buffer->cpu, meta, 0, 0);
> > + if (ret < 0) {
> > + pr_info("Ring buffer meta [%d] invalid head page detected\n",
> > + cpu_buffer->cpu);
> > + goto skip_rewind;
> > + }
> > + ts = head_page->page->time_stamp;
> > +
> > + /* Do the reader page - reader must be previous to head. */
> > + ret = rb_validate_buffer(orig_reader, cpu_buffer->cpu, meta, 0, ts);
> If rb_validate_buffer() for the head page fails, we take the goto skip_rewind
> path. Since skip_rewind jumps past this orig_reader validation, and the
> iteration loop explicitly skips orig_reader:
> /* Iterate until finding the commit page */
> for (i = 0; i < meta->nr_subbufs + 1; i++, rb_inc_page(&head_page)) {
> /* The original reader page has already been checked/counted. */
> if (head_page == orig_reader)
> continue;
> does this mean the reader page escapes validation entirely, introducing a
> regression?
This is a bug and needs to be fixed. If the head page is invalid it
shouldn't skip the rewind but instead jump to the "invalid:" label.
> If an attacker provides a corrupted persistent ring buffer with an invalid
> head page, would the kernel blindly trust the unvalidated commit length
> and event data of the reader page?
Not worried about attackers but I am worried about corrupted memory.
-- Steve