Thanks everyone! I'll get going on making this change in the near future.

On Fri, Jun 2, 2023 at 9:37 AM Nick Dimiduk <ndimi...@apache.org> wrote:

> Hi Bryan,
>
> Based on your observations, I'm in favor of changing the default for new
> minor releases going forward, and maybe adding some comments about this to
> the online book. I'm also in favor of exposing configurable readahead for
> STREAM reads, enabling easier experimentation for those keen to do so.
>
> Thanks,
> Nick
>
> On Fri, Jun 2, 2023 at 9:42 AM Xiaolin Ha <summer.he...@gmail.com> wrote:
>
> > +1 for disabling readahead for PREAD
> >
> > On Wed, May 31, 2023 at 20:44, Bryan Beaudreault <bbeaudrea...@apache.org> wrote:
> >
> > > Hello team,
> > >
> > > I recently discovered "hbase.store.reader.no-readahead", which defaults to
> > > false (so readahead is enabled). This only applies to PREAD reads, not
> > > STREAM reads, which always use readahead. When readahead is enabled, the
> > > default readahead amount in the DFSClient is 4 MB. In my opinion this is
> > > far too large for HBase's use case.
> > >
> > > Further, reads in HBase are always for a block at a time, and blocks
> > > typically have more than one row in them, so we are already reading ahead
> > > a bit via block reads. Lastly, readahead is typically useful for
> > > sequential read scenarios. It's unlikely that someone would do sequential
> > > IO via PREAD; instead they would use Scans (thus STREAM). In the case where
> > > someone is doing sequential IO via PREAD, they'd get some natural readahead
> > > from our reading a block at a time.
> > >
> > > I disabled readahead on about 50 servers across various clusters in our
> > > production environment, and saw a massive (10x or more) drop in disk IO for
> > > random read and mixed read cases. Scan workloads were mostly unaffected
> > > because they do not use this setting. I also did a targeted load test of a
> > > cluster, with and without readahead, and was able to get double the random
> > > read throughput with it disabled.
> > >
> > > I'd like to update the default for this config to "true", thus disabling
> > > readahead for PREAD by default. I also think it's worth investigating
> > > making readahead configurable for STREAM reads, perhaps based on the scan's
> > > max result size or blockBytesScanned of the last next() call.
> > >
> > > Any objections to changing the default?
> > >
> > > https://issues.apache.org/jira/browse/HBASE-27896
> > >
> >
>
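
For anyone who wants to try this before the default changes, here is a minimal
sketch of the override. It assumes the property is set in hbase-site.xml on the
region servers; that placement is an assumption, and only the property name and
the value "true" come from the thread above.

```xml
<!-- hbase-site.xml: sketch only. Property name and value are taken from the
     thread above; where the file lives depends on your deployment. -->
<property>
  <name>hbase.store.reader.no-readahead</name>
  <!-- "true" disables readahead for PREAD reads; STREAM reads are unaffected -->
  <value>true</value>
</property>
```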
