Unless the potential payoff is significant (yes, this might be hard to guess), I would vote for dropping a complex and incomplete (IMHO) disabled-by-default 'feature' that is, I would estimate, rarely used if at all, and probably not at all.

On Wed, Dec 1, 2021 at 8:05 AM Bryan Beaudreault <[email protected]> wrote:

> hbase.storescanner.parallel.seek.enable was added a few years ago in
> https://issues.apache.org/jira/browse/HBASE-7495, but still defaults to
> disabled. The description says "Enables StoreFileScanner parallel-seeking
> in StoreScanner, a feature which can reduce response latency under special
> conditions".
>
> It's not very clear what "special conditions" means. Reading through the
> entire comment history on that issue seems to indicate it can help when you
> have "high random read, low cache hit rate, many store files".
>
> We have a bunch of clusters with this shape, and in fact we use SSDs for
> all storage so I figured this might help a lot. I tried setting this to
> true on one RegionServer of one of our highest QPS clusters hoping I'd see
> some clear improvement. This very simple test was pretty much a wash, so I
> need to do more methodical testing.
>
> In the test one thing became clear though – is the default thread pool size
> of 10 good enough for my use-case? I have no way of knowing, as there is no
> logging or metrics that I can find around thread pool saturation. What I
> ended up doing was spamming refresh of the /dump endpoint of the RS, and
> noticed that there were sometimes 1-5 tasks queued for the RS_PARALLEL_SEEK
> executor. This indicates maybe I should scale the thread pool, but
> use-cases change over time so this seems like not a great way to determine
> that.
>
> Task queuing seems not great for a feature which is aimed at reducing
> latencies. I wonder if we should consider some changes to make this more
> easy to deploy in production. Here are some ideas I had:
>
>    - Can we generate a better default value for the thread pool size, maybe
>    based on number of RS handler threads or some other heuristic?
>    - Should we consider eliminating queuing for this feature? Instead, if
>    the threadpool is saturated run the seek in-line in the current thread
>    (i.e. revert to normal). This would be more similar to how hedged reads
>    work in HDFS.
>    - Can we expose a metric or logging to help operators know when to scale
>    up the thread pool? If we implemented the 2nd option above we could
>    expose "seeksInCurrentThread" counter to track this, again similar to
>    how hedged reads report on saturation.
>
> But with all of this said, I wonder if anyone is running this in production
> and has any updated guidance on when to use this? Does it still make sense
> given the last 8 years of development in HBase? Would it ever make sense to
> make it enabled by default?
>

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
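
For illustration only, the second and third ideas in the quoted message (no queuing; when the pool is saturated, run the seek in the calling thread and count how often that happens) could be sketched roughly as below. This is not HBase code and does not reflect how the RS_PARALLEL_SEEK executor is actually wired up. The class name `InlineFallbackSeekPool` and the `seek` method are hypothetical, and only the counter name `seeksInCurrentThread` comes from the thread. The sketch relies on the standard `java.util.concurrent.ThreadPoolExecutor` with a `SynchronousQueue`, so a task is either handed to an idle worker immediately or rejected.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch (not HBase code): a fixed-size pool that never queues.
final class InlineFallbackSeekPool {
    // Counts seeks that ran in the caller's thread because every worker was busy.
    private final LongAdder seeksInCurrentThread = new LongAdder();
    private final ThreadPoolExecutor pool;

    InlineFallbackSeekPool(int threads) {
        pool = new ThreadPoolExecutor(
            threads, threads, 60L, TimeUnit.SECONDS,
            // SynchronousQueue holds no tasks: the hand-off succeeds only if a worker is idle.
            new SynchronousQueue<>(),
            // Called when no worker can take the task: count it and run it inline.
            (task, executor) -> {
                if (executor.isShutdown()) {
                    throw new RejectedExecutionException("seek pool is shut down");
                }
                seeksInCurrentThread.increment();
                task.run();
            });
    }

    // Submits a seek; the returned Future is already complete if the seek ran inline.
    <T> Future<T> seek(Callable<T> seekTask) {
        return pool.submit(seekTask);
    }

    long seeksInCurrentThread() {
        return seeksInCurrentThread.sum();
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        InlineFallbackSeekPool seekPool = new InlineFallbackSeekPool(1);
        CountDownLatch release = new CountDownLatch(1);

        // Occupy the single worker until the latch is released.
        Future<String> first = seekPool.seek(() -> { release.await(); return "pooled"; });
        // The pool is saturated, so this seek runs in the main thread.
        Future<String> second = seekPool.seek(() -> "inline");

        System.out.println(second.get() + ", seeksInCurrentThread=" + seekPool.seeksInCurrentThread());
        release.countDown();
        first.get();
        seekPool.shutdown();
    }
}
```

A counter that climbs steadily would suggest the pool is undersized for the workload, which is the signal the quoted message asks for. The JDK's built-in `ThreadPoolExecutor.CallerRunsPolicy` gives the same inline behaviour without the counter.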
