[ 
https://issues.apache.org/jira/browse/HBASE-26812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509099#comment-17509099
 ] 

Lars Hofhansl edited comment on HBASE-26812 at 3/18/22, 10:32 PM:
------------------------------------------------------------------

As an example of the severity: with PHOENIX-6501 and scanners that need at 
least two roundtrips, the query in question did not finish after 10 minutes. 
When I simply replace the Connection in Phoenix with the standard one retrieved 
from {{org.apache.hadoop.hbase.client.ConnectionFactory}}, the same query, 
without any other changes, finishes in 6s.

In that case the scan calls already have the RPC context set to null.




> ShortCircuitingClusterConnection fails to close RegionScanners when making 
> short-circuited calls
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26812
>                 URL: https://issues.apache.org/jira/browse/HBASE-26812
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.4.9
>            Reporter: Lars Hofhansl
>            Priority: Critical
>
> Just ran into this on the Phoenix side.
> We retrieve a Connection via 
> {{RegionCoprocessorEnvironment.createConnection... getTable(...)}} and 
> then call get on that table. The Get's key happens to be local. Now each call 
> to table.get() leaves an open StoreScanner around forever (verified with a 
> memory profiler).
> These references are held via 
> RegionScannerImpl.storeHeap.scannersForDelayedClose. Eventually the 
> RegionServer goes into a GC of death and can only be ended with kill -9.
> The reason appears to be that in this case there is no currentCall context. 
> Some time in 2.x the RPC handler/call was made responsible for closing open 
> region scanners, but we forgot to handle {{ShortCircuitingClusterConnection}}.
> It's not immediately clear how to fix this, but it does make 
> ShortCircuitingClusterConnection useless and dangerous. If you use it, you 
> *will* create a giant memory leak.
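
To make the suspected mechanism concrete, here is a minimal toy model in plain Java. It is not HBase code: ToyScanner, ToyCall and ToyRegion are hypothetical stand-ins, and only the field name scannersForDelayedClose is borrowed from the description above. It assumes, as the description suggests, that scanners are closed by a callback registered on the current call, so a call made without any call context never closes them.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes are HBase's. They mimic the ownership
// pattern described in the issue, where the RPC call (not the caller) closes scanners.
public class DelayedCloseSketch {

    /** Hypothetical stand-in for a scanner that pins memory until closed. */
    static final class ToyScanner {
        boolean closed = false;
        void close() { closed = true; }
    }

    /** Hypothetical stand-in for the per-RPC call context. */
    static final class ToyCall {
        private final List<Runnable> callbacks = new ArrayList<>();
        void onComplete(Runnable callback) { callbacks.add(callback); }
        void complete() {
            callbacks.forEach(Runnable::run);
            callbacks.clear();
        }
    }

    /** Hypothetical stand-in for a region serving gets. */
    static final class ToyRegion {
        final List<ToyScanner> scannersForDelayedClose = new ArrayList<>();

        void get(ToyCall currentCall) {
            ToyScanner scanner = new ToyScanner();
            scannersForDelayedClose.add(scanner);
            if (currentCall != null) {
                // Normal RPC path: the call closes the scanner when it completes.
                currentCall.onComplete(() -> {
                    scanner.close();
                    scannersForDelayedClose.remove(scanner);
                });
            }
            // Short-circuited path: no call context, so nothing closes the scanner.
        }
    }

    /** Returns how many scanners are still held after the given number of gets. */
    public static int leakedAfter(int gets, boolean withCallContext) {
        ToyRegion region = new ToyRegion();
        for (int i = 0; i < gets; i++) {
            ToyCall call = withCallContext ? new ToyCall() : null;
            region.get(call);
            if (call != null) {
                call.complete();
            }
        }
        return region.scannersForDelayedClose.size();
    }

    public static void main(String[] args) {
        System.out.println("with call context: " + leakedAfter(1000, true));
        System.out.println("short-circuited:   " + leakedAfter(1000, false));
    }
}
```

In this model the count of retained scanners grows by one per get on the short-circuited path, which matches the unbounded growth seen in the memory profiler. It says nothing about how the real fix should look.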


