On Mon 05 Feb 2024 at 14:38, Brian Foster <[email protected]>
wrote:
On Mon, Feb 05, 2024 at 02:15:36PM -0500, Kent Overstreet wrote:
On Mon, Feb 05, 2024 at 10:48:14AM -0500, Brian Foster wrote:
> bch2_direct_IO_read() checks the request offset and size for
> sector
> alignment and then falls through to a couple calculations to
> shrink
> the size of the request based on the inode size. The problem
> is that
> these checks round up to the fs block size, which runs the
> risk of
> underflowing iter->count if the block size happens to be
> large
> enough. This is triggered by fstest generic/361 with a 4k
> block
> size, which subsequently leads to a crash.
>
> After some discussion, the original purpose of the shorten
> logic in this
> path is not totally clear. It appears to be intended as an
> optimization
> of limited value, so simplify things and avoid the underflow
> problem by
> removing it.
Thinking about this a bit more...
All that we're avoiding with the shorten is zeroing out the
rest of the
read, which will happen in bch2_read_extent(); we won't be
issuing any
IO, since that would require real data extents above i_size,
which is of
course not allowed.
But zeroing out the rest of the read is actual work that's
being
avoided, since if we return a short read (which is what we're
doing
here) it's undefined what happens to the rest of the buffer.
So... maybe we should keep it, for the crazy but inevitable
application
that uses a 1MB read buffer for everything, including O_DIRECT
reads to
4k files?
I had another change around that basically just protected
against the
underflow. I.e., something along the lines of the following, to
also
cover the additional use of shorten later in the function:
if (shorten >= iter->count)
shorten = 0;
The change is a clean fix without breaking logic in
bch2_direct_IO_read.
If without shroten, the next loop counting on iter->count must be
altered.
I'd like the simpler way.
--
Su
... and IIRC that seemed to work as well. But I'll probably have
to take
a closer look at that large buffer case to grok what's happening
and the
original purpose of this logic.
Brian