[
https://issues.apache.org/jira/browse/PHOENIX-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315655#comment-16315655
]
Karan Mehta commented on PHOENIX-4489:
--------------------------------------
[~vincentpoon]
Technically, as we discussed, it shouldn't be a problem, since the connection
object goes out of scope soon after the generateSplits() method returns and
should then be garbage collected. However, if you check out PHOENIX-4503, the
client is trying to read multiple Spark dataframes inside a loop (almost 50
times). Such code executes quickly and creates many HConnections and
ZKConnections in a short span of time. I suspect that even though GC will
eventually clear them, that may not happen until the JVM feels the need to
collect. This can cause issues for the application; I see many issues filed
in this regard.
Also, since the connections are not instantiated via a factory, it is
difficult to track how many are open or to limit resources through a custom
implementation. What do you think?
FYI, [~aertoria]
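To make the concern concrete, here is a minimal sketch of the difference between relying on GC and closing explicitly. It does not use Phoenix or HBase code: {{FakeConnection}}, {{generateSplitsLeaky()}}, {{generateSplitsClosed()}} and the open-connection counter are hypothetical stand-ins, and the counter only models "resources still held", not real sockets or ZooKeeper sessions.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins only; none of these are Phoenix or HBase classes.
final class ConnectionLeakSketch {
    // Number of connections opened but not yet closed.
    static final AtomicInteger OPEN = new AtomicInteger();

    static final class FakeConnection implements AutoCloseable {
        FakeConnection() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    // Pattern described in the comment: open a connection, never close it,
    // and leave cleanup to whenever the garbage collector gets around to it.
    static void generateSplitsLeaky() {
        FakeConnection conn = new FakeConnection();
        // ... compute splits using conn ...
    }

    // Same work, but the connection is released as soon as the block exits.
    static void generateSplitsClosed() {
        try (FakeConnection conn = new FakeConnection()) {
            // ... compute splits using conn ...
        }
    }

    // Runs one of the two variants in a loop and reports what is still open.
    static int openAfter(int iterations, boolean close) {
        OPEN.set(0);
        for (int i = 0; i < iterations; i++) {
            if (close) {
                generateSplitsClosed();
            } else {
                generateSplitsLeaky();
            }
        }
        return OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("still open without close(): " + openAfter(50, false));
        System.out.println("still open with close():    " + openAfter(50, true));
    }
}
```

With 50 iterations, the unclosed variant leaves 50 connections outstanding in this model, while the try-with-resources variant leaves none, regardless of when GC runs.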
> HBase Connection leak in Phoenix MR Jobs
> ----------------------------------------
>
> Key: PHOENIX-4489
> URL: https://issues.apache.org/jira/browse/PHOENIX-4489
> Project: Phoenix
> Issue Type: Bug
> Reporter: Karan Mehta
> Assignee: Karan Mehta
> Attachments: PHOENIX-4489.001.patch
>
>
> Phoenix MR jobs use a custom class {{PhoenixInputFormat}} to determine the
> splits and the parallelism of the work. The class directly opens an HBase
> connection, which is not closed after use. Independently running MR jobs
> should not be affected; however, jobs that run through Phoenix-Spark can
> leak connections if this is left unclosed (since those jobs run as part of
> the same JVM).
> Apart from this, the connection should be instantiated with
> {{HBaseFactoryProvider.getHConnectionFactory()}} instead of the default one.
> This is useful if a separate client is trying to run jobs and wants to
> provide a custom implementation of {{HConnection}}.
> [~jmahonin] Any ideas?
> [~jamestaylor] [~vincentpoon] Any concerns around this?
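As an illustration of why routing connection creation through a factory helps, the sketch below shows a custom factory that counts open connections and refuses to exceed a cap. Everything in it is an assumption made for illustration: {{Conn}}, {{CappedFactory}} and {{openCount()}} are hypothetical names, and it does not reproduce the actual interface returned by {{HBaseFactoryProvider.getHConnectionFactory()}}.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch; not the Phoenix HConnectionFactory interface.
final class CappedFactorySketch {
    // Stand-in for a connection whose close() throws no checked exception.
    interface Conn extends AutoCloseable {
        @Override void close();
    }

    // A factory that tracks how many connections are open and enforces a limit.
    static final class CappedFactory implements Supplier<Conn> {
        private final int maxOpen;
        private final AtomicInteger open = new AtomicInteger();

        CappedFactory(int maxOpen) { this.maxOpen = maxOpen; }

        @Override public Conn get() {
            if (open.incrementAndGet() > maxOpen) {
                open.decrementAndGet(); // undo the reservation before failing
                throw new IllegalStateException(
                        "too many open connections (limit " + maxOpen + ")");
            }
            return new Conn() {
                private final AtomicBoolean closed = new AtomicBoolean();
                @Override public void close() {
                    // Only the first close() releases the slot.
                    if (closed.compareAndSet(false, true)) {
                        open.decrementAndGet();
                    }
                }
            };
        }

        int openCount() { return open.get(); }
    }

    public static void main(String[] args) {
        CappedFactory factory = new CappedFactory(2);
        try (Conn a = factory.get(); Conn b = factory.get()) {
            System.out.println("open inside block: " + factory.openCount());
        }
        System.out.println("open after block:  " + factory.openCount());
    }
}
```

Because every connection passes through one place, a client can observe the count or reject excess connections, which is not possible when the input format constructs connections directly.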