[
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173846#comment-16173846
]
Thai Bui commented on HIVE-17502:
---------------------------------
I see what you mean. So unless HiveSession(Impl) and SessionState also support
reuse of sessions by the same user, the patch is not complete.
Interestingly, we have been using this patch for a couple of weeks with 3-5
concurrent users, each making and reusing sessions according to this logic, and
things were fine. I did observe a couple of weird problems, but I don't think
they are related.
Anyhow, I'm happy to contribute to and/or solidify HiveSession(Impl) and/or
SessionState to make them immutable and/or stateless, and so more suitable for
reuse by multiple sessions. For example, the current SessionState is bound to a
thread-local static object; if it were bound to a unique SessionID locked in a
database (Hive metadata store?), ZooKeeper, or HDFS files, things could be
different.
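To make the contrast concrete, here is a minimal sketch, not Hive code. All
class names (StateSketch, ThreadBoundStates, IdKeyedStates) are hypothetical,
and the in-memory map only stands in for whatever external store (database,
ZooKeeper, HDFS) would actually hold the state. It shows why state attached to
a thread is invisible from another thread, while state looked up by session ID
is not.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for per-session state; not Hive's SessionState.
final class StateSketch {
    final String sessionId;
    StateSketch(String sessionId) { this.sessionId = sessionId; }
}

// Thread-bound variant: the state is only reachable from the thread
// that attached it.
final class ThreadBoundStates {
    private static final ThreadLocal<StateSketch> CURRENT = new ThreadLocal<>();
    static void attach(StateSketch s) { CURRENT.set(s); }
    static StateSketch current() { return CURRENT.get(); }
}

// ID-keyed variant: any thread that knows the session ID gets the same
// state object. The map is an assumption standing in for an external store.
final class IdKeyedStates {
    private final Map<String, StateSketch> byId = new ConcurrentHashMap<>();
    StateSketch getOrCreate(String sessionId) {
        return byId.computeIfAbsent(sessionId, StateSketch::new);
    }
}
```

With the thread-bound variant, a second worker thread serving the same session
sees no state unless someone attaches it again; with the ID-keyed variant the
lookup works from any thread, at the cost of having to make the shared state
safe for concurrent use.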
It is also possible to change Hue to only issue 1 query at a time per user but
the point was to go beyond that to allow a much better user experience using
Hive 2 w/ LLAP + Hue 4.
Let me know what you guys think; we (our big data & analytics group at
Bazaarvoice) are happy to contribute. Currently we have to build a custom
hive-exec jar with this logic and deploy this jar specifically in Ambari for
this to work. Either way works for us, but I would prefer to push the patch
upstream to make it official and potentially solidify the HiveSession
implementation incrementally. For example, I think adding a new HS2 config
option `hive.sessions.default-session.reuse=false` could work for this patch.
If the option is false (the default), the logic stays the same and an
exception is thrown; if true, the new patch logic applies, allowing avid users
to have multiple sessions per user. Understandably, having too many options is
confusing. If that's the case we'll just close this ticket, but thanks for the
discussion either way!
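A rough sketch of how such a flag could gate the behavior, assuming a
simplified pool. PoolSession and SessionPoolSketch are hypothetical names and
this is not the TezSessionPoolManager code; the boolean only models the
proposed `hive.sessions.default-session.reuse` option.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pool entry; not a Hive class.
final class PoolSession {
    final String id;
    boolean inUse;
    PoolSession(String id) { this.id = id; }
}

// Hypothetical, simplified pool of pre-initialized sessions.
final class SessionPoolSketch {
    private final Deque<PoolSession> idle = new ArrayDeque<>();
    private final boolean reuseEnabled; // models the proposed option

    SessionPoolSketch(boolean reuseEnabled, int size) {
        this.reuseEnabled = reuseEnabled;
        for (int i = 0; i < size; i++) {
            idle.add(new PoolSession("s" + i));
        }
    }

    // 'requested' is the session the client asked to reuse, or null.
    synchronized PoolSession getSession(PoolSession requested) {
        if (requested != null && !requested.inUse) {
            idle.remove(requested);   // hand back the same idle session
            requested.inUse = true;
            return requested;
        }
        if (requested != null && !reuseEnabled) {
            // Option off: keep the existing behavior and fail.
            throw new IllegalStateException(
                "session " + requested.id + " should have been returned to the pool");
        }
        // Option on (or nothing requested): skip the busy session and
        // take an unused one from the pool.
        PoolSession next = idle.poll();
        if (next == null) {
            throw new IllegalStateException("no idle session available");
        }
        next.inUse = true;
        return next;
    }

    synchronized void returnSession(PoolSession s) {
        if (s.inUse) {
            s.inUse = false;
            idle.add(s);
        }
    }
}
```

With the flag off, asking to reuse a busy session fails as it does today; with
it on, the caller gets a different idle session, or an error when the pool is
exhausted.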
> Reuse of default session should not throw an exception in LLAP w/ Tez
> ---------------------------------------------------------------------
>
> Key: HIVE-17502
> URL: https://issues.apache.org/jira/browse/HIVE-17502
> Project: Hive
> Issue Type: Bug
> Components: llap, Tez
> Affects Versions: 2.1.1, 2.2.0
> Environment: HDP 2.6.1.0-129, Hue 4
> Reporter: Thai Bui
> Assignee: Thai Bui
>
> Hive2 w/ LLAP on Tez doesn't allow a currently used default session to be
> skipped, mostly because of this line:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365.
> However, some clients, such as Hue 4, allow multiple sessions to be used per
> user. Under this configuration, a Thrift client will send a request to either
> reuse a session or open a new one. The reuse request could include the
> session id of a snippet currently being executed in Hue, which causes HS2 to
> throw an exception:
> {noformat}
> 2017-09-10T17:51:36,548 INFO [Thread-89]: tez.TezSessionPoolManager
> (TezSessionPoolManager.java:canWorkWithSameSession(512)) - The current user:
> hive, session user: hive
> 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task
> (TezTask.java:execute(232)) - Failed to execute tez graph.
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session
> sessionId=5b61a578-6336-41c5-860d-9838166f97fe, queueName=llap, user=hive,
> doAs=false, isOpen=true, isDefault=true, expires in 591015330ms should have
> been returned to the pool
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534)
> ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> at
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544)
> ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147)
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79)
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> {noformat}
> Note that every query is issued as a single 'hive' user to share the LLAP
> daemon pool, and a pre-determined number of AMs is initialized at setup
> time. Thus, HS2 should allow new sessions from a Thrift client to be taken
> from the pool, or an in-use session to be skipped and an unused session
> from the pool to be returned instead. The logic that throws an exception in
> `canWorkWithSameSession` doesn't make sense to me.
> I have a fix for this issue in my local branch at
> https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70.
> With it applied, the log looks like this:
> {noformat}
> 2017-09-10T09:15:33,578 INFO [Thread-239]: tez.TezSessionPoolManager
> (TezSessionPoolManager.java:canWorkWithSameSession(533)) - Skipping default
> session sessionId=6638b1da-0f8a-405e-85f0-9586f484e6de, queueName=llap,
> user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591868732ms
> since it is being used.
> {noformat}
> A test case is provided in my branch to demonstrate how it works. If possible
> I would like this patch to be applied to version 2.1, 2.2 and master. Since
> we are using 2.1 LLAP in production with Hue 4, this patch is critical to our
> success.
> Alternatively, if this patch is too broad in scope, I propose adding an
> option to allow "skipping of currently used default sessions". With this new
> option defaulting to "false", existing behavior won't change unless the
> option is turned on.
> I will prepare an official patch if this change to master and/or the other
> branches is acceptable. I'm not a contributor or committer; this will be my
> first time contributing to Hive and the Apache foundation. Any early review
> is greatly appreciated, thanks!
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)