Bryan Beaudreault created HBASE-27227:
-----------------------------------------
Summary: Long running heavily filtered scans hold up too many
ByteBuffAllocator buffers
Key: HBASE-27227
URL: https://issues.apache.org/jira/browse/HBASE-27227
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
We have a workload that launches long-running, needle-in-a-haystack scans. They
have a timeout of 60s, so they are allowed to run on the server for 30s. Most
rows are filtered out, and the final result is usually only a few KB.
While these scans are running, our ByteBuffAllocator pool usage goes to 100%
and we start seeing 100+ MB/s of heap allocations. When the scans finish, pool
usage returns to normal and the heap allocations go away.
My working theory is that we only release ByteBuffs once we call
{{shipper.shipped()}}, which only happens once a response is returned to the
client. This works fine for normal scans, which are likely to quickly find
enough results to return, but for long-running scans in which most of the
results are filtered, we end up holding on to more and more buffers until the
scan finally returns.
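
A minimal sketch of that lifecycle, using hypothetical types rather than the
actual HBase classes: pooled block buffers retained by a scan are only released
once {{shipped()}} runs, i.e. after a response goes back to the client.

{code:java}
// Hypothetical sketch, not actual HBase code: blocks checked out of a pooled
// allocator are retained by the scan and only returned on shipped().
import java.util.ArrayList;
import java.util.List;

interface RefCountedBlock {
  void release(); // return the backing ByteBuff to the pool
}

class ScanShipperSketch {
  private final List<RefCountedBlock> retainedBlocks = new ArrayList<>();

  // Every block read while scanning is retained, even if all of its cells are
  // filtered out and contribute nothing to the eventual response.
  void onBlockRead(RefCountedBlock block) {
    retainedBlocks.add(block);
  }

  // Only invoked after the RPC response is written, so a heavily filtered scan
  // running for its full 30s can pin a large number of pooled buffers.
  void shipped() {
    for (RefCountedBlock block : retainedBlocks) {
      block.release();
    }
    retainedBlocks.clear();
  }
}
{code}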
We should consider whether it's possible to release buffers for blocks whose
cells have been completely skipped by a scan.
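
A hypothetical sketch of what that could look like (illustrative names only,
not actual HBase classes): release a block's buffer as soon as the scan has
moved past all of its cells without including any, instead of holding it until
{{shipped()}}.

{code:java}
// Hypothetical sketch of the proposed improvement: eagerly release blocks
// whose cells were all filtered, while blocks backing returned cells still
// wait for shipped().
import java.util.ArrayDeque;
import java.util.Deque;

interface RefCountedBlock {
  void release(); // return the backing ByteBuff to the pool
}

class EagerReleaseScanSketch {
  private final Deque<RefCountedBlock> blocksBackingResults = new ArrayDeque<>();
  private RefCountedBlock currentBlock;
  private boolean currentBlockContributedCells;

  // Called when the scan moves on to a new block.
  void onBlockRead(RefCountedBlock block) {
    finishCurrentBlock();
    currentBlock = block;
    currentBlockContributedCells = false;
  }

  // Called when a cell from the current block passes the filters and will be
  // referenced by the response.
  void onCellIncluded() {
    currentBlockContributedCells = true;
  }

  // If every cell in the previous block was filtered out, nothing in the
  // response references its buffer, so it can go back to the pool immediately.
  private void finishCurrentBlock() {
    if (currentBlock == null) {
      return;
    }
    if (currentBlockContributedCells) {
      blocksBackingResults.add(currentBlock); // must live until shipped()
    } else {
      currentBlock.release();                 // eager release for skipped block
    }
    currentBlock = null;
  }

  // Blocks that back returned cells are still released only once shipped.
  void shipped() {
    finishCurrentBlock();
    while (!blocksBackingResults.isEmpty()) {
      blocksBackingResults.pop().release();
    }
  }
}
{code}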
--
This message was sent by Atlassian Jira
(v8.20.10#820010)