[
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671971#comment-16671971
]
Sankar Hariappan commented on HIVE-20682:
-----------------------------------------
[~maheshk114], [~pvary]
I tried to solve the discussed use cases with the reference-count solution, but
it has a race condition which can cause an NPE.
* The async thread is created with BackgroundWork but is not yet "scheduled",
which means Hive.set(sessionHive) has not yet been invoked in the async thread,
though it will be at some point.
* In the meantime, the sessionHive is closed by the master thread due to a
change in sessionConf.
* If the async thread gets the HMS connection using sessionHive.getMSC() and
the master thread closes it afterwards, the async thread might be referring to
an invalid HMS client object. The client can also be null if the close happens
first.
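The interleaving above can be replayed in a small sketch. This is not Hive code: RefCountedHive, FakeMetaStoreClient and closeIfUnreferenced are hypothetical stand-ins, and the steps run sequentially on one thread so that the outcome is deterministic. It only illustrates why a reference count that is incremented inside the async thread cannot protect an object the master thread is about to close.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a metastore client; not Hive's real class.
final class FakeMetaStoreClient {
}

// Hypothetical stand-in for a reference-counted shared Hive object.
final class RefCountedHive {
    private final AtomicInteger refs = new AtomicInteger();
    private volatile FakeMetaStoreClient client = new FakeMetaStoreClient();

    // Models the async thread registering itself, e.g. via Hive.set(sessionHive).
    void acquire() {
        refs.incrementAndGet();
    }

    // Master thread: closes only when no thread has registered a reference yet.
    boolean closeIfUnreferenced() {
        if (refs.get() == 0) {
            client = null;
            return true;
        }
        return false;
    }

    FakeMetaStoreClient getMSC() {
        return client;
    }
}

final class RefCountRaceDemo {
    // Replays the problematic ordering step by step.
    static boolean asyncSeesClosedClient() {
        RefCountedHive sessionHive = new RefCountedHive();
        // 1. Async work is created but not yet scheduled: acquire() has not run.
        Runnable asyncWork = sessionHive::acquire;
        // 2. Master thread sees a count of zero and closes the shared object.
        boolean closed = sessionHive.closeIfUnreferenced();
        // 3. The async work finally runs and registers its reference, too late.
        asyncWork.run();
        // 4. The async thread now gets a null client and would hit an NPE on use.
        return closed && sessionHive.getMSC() == null;
    }
}
```

Registering the reference in the master thread before handing off the work would close this particular window, but the sketch does not cover the third bullet, where the client is fetched first and closed afterwards.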
So, I'm stepping back to the "allowClose" flag solution (patch.04), where the
only drawback was that each query creates a new Hive object if sessionConf is
changed. This can be fixed as follows.
* The sessionHive object will be reset by the master thread when it is found to
mismatch the thread-local Hive. This scenario can happen if a previous query
execution from the master thread found that sessionConf had changed for
MS-related configs and so re-created the Hive connection, but couldn't close
the sessionHive object because the allowClose flag is false. Resetting the
sessionHive object ensures that it is created only once if sessionConf is
changed.
* As it is unknown whether any async thread is still referring to the
sessionHive object, we cannot close it from any async thread. So, we can
override the "finalize" method of the Hive object to forcefully close it when
it is garbage collected.
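A minimal sketch of the first bullet, under stated assumptions: SketchHive and SketchSession are hypothetical stand-ins rather than the real Hive and HiveSessionImpl classes, the confChanged parameter stands for Hive.isCompatible returning false, and "reset" is read here as adopting the thread-local Hive as the new sessionHive. The "finalize" override from the second bullet is left out because garbage-collection timing is not deterministic.

```java
// Hypothetical stand-in for the Hive object; it only models the allowClose flag.
final class SketchHive {
    private boolean allowClose = true;
    private boolean closed = false;

    void setAllowClose(boolean allowClose) {
        this.allowClose = allowClose;
    }

    // force == true models the session being closed or released.
    void close(boolean force) {
        if (force || allowClose) {
            closed = true;
        }
    }

    boolean isClosed() {
        return closed;
    }
}

// Hypothetical stand-in for the session that owns the shared Hive object.
final class SketchSession {
    private static final ThreadLocal<SketchHive> THREAD_HIVE = new ThreadLocal<>();
    private SketchHive sessionHive;
    int hiveObjectsCreated = 0;

    SketchSession() {
        sessionHive = create();
        sessionHive.setAllowClose(false); // shared object: queries must not close it
        THREAD_HIVE.set(sessionHive);
    }

    private SketchHive create() {
        hiveObjectsCreated++;
        return new SketchHive();
    }

    SketchHive sessionHive() {
        return sessionHive;
    }

    // Models one synchronous query on the master thread.
    void runSyncQuery(boolean confChanged) {
        SketchHive local = THREAD_HIVE.get();
        if (local != null && local != sessionHive) {
            // A previous query re-created the Hive object: adopt it as the shared
            // one. The old sessionHive is only dropped here, never closed.
            local.setAllowClose(false);
            sessionHive = local;
        }
        THREAD_HIVE.set(sessionHive);
        if (confChanged) {
            sessionHive.close(false); // no-op because allowClose is false
            THREAD_HIVE.set(create());
        }
    }
}
```

With this shape, one configuration change followed by any number of queries creates exactly one additional Hive object, and the dropped sessionHive stays open for any async thread that may still hold it until something else closes it.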
I don't see any drawbacks with this approach. Please share your thoughts.
cc [~daijy]
> Async query execution can potentially fail if shared sessionHive is closed by
> master thread.
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-20682
> URL: https://issues.apache.org/jira/browse/HIVE-20682
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.1.0, 4.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch,
> HIVE-20682.03.patch, HIVE-20682.04.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in *HiveSessionImpl*
> class when we open a new session for a client connection and by default all
> queries from this connection share the same sessionHive object.
> If the master thread executes a *synchronous* query, it closes the
> sessionHive object (referred via thread local hiveDb) if
> {{Hive.isCompatible}} returns false and sets a new Hive object in the thread
> local hiveDb, but doesn't change the sessionHive object in the session. In
> contrast, *asynchronous* query execution via async threads never closes the
> sessionHive object; it just creates a new one if needed and sets it as the
> thread local hiveDb.
> So, the problem can happen in the case where an *asynchronous* query being
> executed by async threads refers to the sessionHive object and the master
> thread receives a *synchronous* query that closes the same sessionHive object.
> Also, each query execution overwrites the thread local hiveDb object with the
> sessionHive object, which potentially leaks a metastore connection if the
> previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object could be shared by multiple threads and so it
> shouldn't be allowed to be closed by any query execution threads when they
> re-create the Hive object due to changes in Hive configurations. But the Hive
> objects created by query execution threads should be closed when the thread
> exits.
> So, it is proposed to have an *isAllowClose* flag (default: *true*) in the
> Hive object, which should be set to *false* for *sessionHive*. The sessionHive
> would then be forcefully closed when the session is closed or released.
> cc [~pvary]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)