[
https://issues.apache.org/jira/browse/HBASE-26812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509099#comment-17509099
]
Lars Hofhansl edited comment on HBASE-26812 at 3/18/22, 10:32 PM:
------------------------------------------------------------------
As an example of the severity: With PHOENIX-6501 and scanners that need at
least two roundtrips the query in question did not finish after 10 minutes.
When I just replace Connection in Phoenix with the standard one retrieved from
{{org.apache.hadoop.hbase.client.ConnectionFactory}} the same query - without
any other changes - finishes in 6s.
In that case the scan calls already have the rpc context set to null.
> ShortCircuitingClusterConnection fails to close RegionScanners when making
> short-circuited calls
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-26812
> URL: https://issues.apache.org/jira/browse/HBASE-26812
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.4.9
> Reporter: Lars Hofhansl
> Priority: Critical
>
> Just ran into this on the Phoenix side.
> We retrieve a Connection via
> {{RegionCoprocessorEnvironment.createConnection... getTable(...)}}. And
> then call get on that table. The Get's key happens to be local. Now each call
> to table.get() leaves an open StoreScanner around forever (verified with a
> memory profiler).
> These references are held via
> RegionScannerImpl.storeHeap.scannersForDelayedClose. Eventually the
> RegionServer goes into a GC of death and can only be ended with kill -9.
> The reason appears to be that in this case there is no currentCall context.
> Some time in 2.x the Rpc handler/call was made responsible for closing open
> region scanners, but we forgot to handle {{ShortCircuitingClusterConnection}}.
> It's not immediately clear how to fix this. But it does make
> ShortCircuitingClusterConnection useless and dangerous. If you use it, you
> *will* create a giant memory leak.
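
To illustrate the mechanism described in the issue, here is a minimal toy model in plain Java. It is not HBase code: {{ToyScanner}}, {{ToyCallContext}}, {{ToyRegion}} and {{simulate}} are hypothetical stand-ins, and the real bookkeeping in HBase is more involved. The sketch only assumes what the description states, namely that scanners parked for delayed close are released by the call context, so a null context means nothing ever releases them.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types are HBase classes. */
public class DelayedCloseSketch {

    /** Hypothetical stand-in for a scanner that holds memory until closed. */
    static final class ToyScanner {
        private boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    /** Hypothetical stand-in for a per-call context that runs cleanup on completion. */
    static final class ToyCallContext {
        private final List<Runnable> callbacks = new ArrayList<>();
        void onComplete(Runnable callback) { callbacks.add(callback); }
        void complete() {
            callbacks.forEach(Runnable::run);
            callbacks.clear();
        }
    }

    /** Hypothetical stand-in for a region that defers scanner cleanup to the call context. */
    static final class ToyRegion {
        private final List<ToyScanner> scannersForDelayedClose = new ArrayList<>();

        /** Serves one get; the context is null on the short-circuited path. */
        void get(ToyCallContext context) {
            ToyScanner scanner = new ToyScanner();
            scannersForDelayedClose.add(scanner);
            if (context != null) {
                // Cleanup is registered only when a call context exists.
                context.onComplete(() -> {
                    scanner.close();
                    scannersForDelayedClose.remove(scanner);
                });
            }
            // Without a context, nothing ever closes or releases the scanner.
        }

        int pendingScanners() { return scannersForDelayedClose.size(); }
    }

    /** Runs the given number of gets and returns how many scanners are still held. */
    public static int simulate(int calls, boolean withCallContext) {
        ToyRegion region = new ToyRegion();
        for (int i = 0; i < calls; i++) {
            ToyCallContext context = withCallContext ? new ToyCallContext() : null;
            region.get(context);
            if (context != null) {
                context.complete();
            }
        }
        return region.pendingScanners();
    }

    public static void main(String[] args) {
        System.out.println("with call context:    " + simulate(1000, true));  // prints 0
        System.out.println("without call context: " + simulate(1000, false)); // prints 1000
    }
}
```

In this model the number of retained scanners grows by one per call when no context is present, which matches the unbounded growth seen in the memory profiler.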