[ 
https://issues.apache.org/jira/browse/HIVE-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826628#comment-15826628
 ] 

Sergey Shelukhin edited comment on HIVE-15645 at 1/17/17 7:11 PM:
------------------------------------------------------------------

We reproduced this on a cluster, and the repro indicates that the patch will 
fix the problem.
The cause is the config getting out of sync with the property. The first 
session gets both the config and the property right, but something (I am 
pretty sure it's the unset in the open path) then resets the config. The 2nd 
session (after expiration) gets the property right but the config is not set, 
so it logs as if it is going to the correct queue but actually goes to the 
wrong (default) queue, which is what we observed for a specific session on the 
cluster. After the log statement about the queue, the field is also reset to 
null from the conf (in the place where I added the warn log). The 3rd session 
(after the 2nd expiration) logs a null queue (because the field is now null as 
well) and goes to the wrong queue, as does every session after that. So, for 
pool sessions we now set the queue for the session into the conf every time. I 
also added a warn log for the future, and a null check, because we never 
expect a null queue for pool sessions. Fixing this properly requires 
completing the separation of pool and non-pool sessions that was started at 
some point, but that's a major refactoring, not a bugfix.
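
To make the sequence above easier to follow, here is a minimal, self-contained 
sketch of the drift between the field and the config. It is not the Hive code 
and not the patch: the class QueueDriftSketch, the PoolSession type, the 
applyFix flag, and the queue name "etl" are all hypothetical, and the order of 
steps inside open() is an assumption based only on the description above. The 
only real name used is the Tez queue key, tez.queue.name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the bug described above; not Hive's actual classes.
public class QueueDriftSketch {
    static final String QUEUE_KEY = "tez.queue.name";

    public static class PoolSession {
        private final Map<String, String> conf = new HashMap<>();
        private String queueName;        // the "property"/field on the session
        private final boolean applyFix;  // hypothetical switch for the sketch

        public PoolSession(String queueName, boolean applyFix) {
            this.queueName = queueName;
            this.applyFix = applyFix;
            conf.put(QUEUE_KEY, queueName); // initially both are in sync
        }

        // Simulates one (re)open; reports the logged queue and the actual one.
        public String open() {
            if (applyFix) {
                // Pool sessions never expect a null queue.
                if (queueName == null) {
                    throw new IllegalStateException("null queue for pool session");
                }
                conf.put(QUEUE_KEY, queueName); // re-set the conf on every open
            }
            String logged = queueName; // the log statement reads the field
            String actual = conf.getOrDefault(QUEUE_KEY, "default");
            queueName = conf.get(QUEUE_KEY); // field re-read from conf after the log
            conf.remove(QUEUE_KEY);          // the suspected unset in the open path
            return "logged=" + logged + ", actual=" + actual;
        }
    }

    public static void main(String[] args) {
        PoolSession broken = new PoolSession("etl", false);
        System.out.println(broken.open()); // logged=etl, actual=etl
        System.out.println(broken.open()); // logged=etl, actual=default
        System.out.println(broken.open()); // logged=null, actual=default

        PoolSession patched = new PoolSession("etl", true);
        for (int i = 0; i < 3; i++) {
            System.out.println(patched.open()); // logged=etl, actual=etl
        }
    }
}
```

In this model, re-setting the conf from the field before each open keeps the 
two in sync no matter what the open path unsets afterwards, which is the idea 
behind setting it every time for pool sessions.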



> Tez session pool may restart sessions in a wrong queue
> ------------------------------------------------------
>
>                 Key: HIVE-15645
>                 URL: https://issues.apache.org/jira/browse/HIVE-15645
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Carter Shanklin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-15645.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
