[ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227311#comment-16227311
 ] 

Chris Drome commented on HIVE-17853:
------------------------------------

[~vihangk1], as per the description, consider the case of the Oozie super-user 
{{oozie}} impersonating a different user {{mithun}}. The {{oozie}} user creates 
a client and opens the connection to the metastore within the doAs clause, 
which means that all operations during this session are performed as {{mithun}}.

A retry/reconnect can occur when an operation exceeds the read timeout or the 
connection exceeds its lifetime. At that point, {{close}} is called 
explicitly, followed by a call to {{open}} to establish a new connection. 
However, the reconnect is not performed in a doAs context, so the new 
connection to the metastore is created as {{oozie}}.
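
The sequence above can be modeled in a few lines. This is a toy sketch, not Hive 
code: a thread-local stands in for the {{UGI.doAs()}} context, and the names 
{{ImpersonationSketch}}, {{Client}} and {{ContextPreservingClient}} are 
hypothetical. It assumes the connection's identity is fixed at the moment 
{{open}} runs, and shows one way a client might remember its creator and wrap 
the reconnect in the same context.

```java
import java.util.function.Supplier;

/** Toy model of the impersonation bug; none of these types are Hadoop or Hive classes. */
final class ImpersonationSketch {
    static final String LOGIN_USER = "oozie";

    // Stand-in for the effective user that a doAs block establishes.
    private static final ThreadLocal<String> EFFECTIVE =
            ThreadLocal.withInitial(() -> LOGIN_USER);

    static String currentUser() {
        return EFFECTIVE.get();
    }

    /** Runs the action as the given user, then restores the previous user. */
    static <T> T doAs(String user, Supplier<T> action) {
        String previous = EFFECTIVE.get();
        EFFECTIVE.set(user);
        try {
            return action.get();
        } finally {
            EFFECTIVE.set(previous);
        }
    }

    /** Hypothetical client: the connection's identity is whoever calls open(). */
    static class Client {
        String connectedAs;

        void open() { connectedAs = currentUser(); }
        void close() { connectedAs = null; }

        // Naive reconnect: runs in whatever context the caller happens to be in.
        void reconnect() { close(); open(); }
    }

    /** Variant that captures the effective user at construction time. */
    static class ContextPreservingClient extends Client {
        private final String owner = currentUser();

        @Override
        void reconnect() {
            // Re-enter the creator's context before closing and reopening.
            doAs(owner, () -> { close(); open(); return null; });
        }
    }

    static String naiveUserAfterReconnect() {
        Client c = doAs("mithun", () -> { Client x = new Client(); x.open(); return x; });
        c.reconnect(); // happens outside any doAs block
        return c.connectedAs;
    }

    static String fixedUserAfterReconnect() {
        Client c = doAs("mithun", () -> {
            Client x = new ContextPreservingClient();
            x.open();
            return x;
        });
        c.reconnect(); // also outside doAs, but the client restores the context
        return c.connectedAs;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + naiveUserAfterReconnect());
        System.out.println("fixed: " + fixedUserAfterReconnect());
    }
}
```

Running {{main}} prints the user each client ends up connected as: the naive 
reconnect falls back to {{oozie}}, while the context-preserving one stays 
{{mithun}}.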

There is no specific stack trace to attach here as it depends on the operations 
executed after the reconnect, and typically manifests as a failure caused by 
insufficient privileges. Worst case, if {{oozie}} has more privileges than 
{{mithun}}, it will successfully perform operations that {{mithun}} is not 
allowed to perform.

According to the API, fetching the UserGroupInformation object can throw an 
IOException. I'm not familiar with the cases in which this would occur. 
However, I didn't want to fail immediately: if the connection was initially 
established within a doAs, the calling code should already have been able to 
establish a proper identity. So I let as much work as possible complete before 
a reconnect fails, which shouldn't be a problem because most metastore 
sessions are not long-lived.
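
One possible reading of the deferred failure described above, again as a 
hypothetical sketch rather than the patch itself: the identity lookup is 
attempted once at construction, a failure is recorded instead of thrown, and 
only a later reconnect refuses to proceed. {{IdentitySource}} is an assumed 
stand-in for whatever fetches the UGI, and the unchecked exception raised on 
reconnect is likewise an assumption.

```java
import java.io.IOException;

/** Toy model of deferring an identity-lookup failure; not Hive or Hadoop code. */
final class DeferredIdentityFailureSketch {

    /** Hypothetical stand-in for a lookup that, like fetching the UGI, may throw IOException. */
    interface IdentitySource {
        String currentUser() throws IOException;
    }

    static final class Client {
        private final String owner;            // null if the lookup failed
        private final IOException lookupError; // kept so the later failure can explain itself

        Client(IdentitySource source) {
            String user = null;
            IOException error = null;
            try {
                user = source.currentUser();
            } catch (IOException e) {
                error = e; // do not fail yet: the already-open connection is still usable
            }
            this.owner = user;
            this.lookupError = error;
        }

        /** Work on the existing connection does not need the captured identity. */
        String call(String operation) {
            return operation + ": ok";
        }

        /** Returns the user the new connection would be opened as, or fails if unknown. */
        String reconnect() {
            if (owner == null) {
                throw new IllegalStateException(
                        "cannot reconnect: caller identity unknown", lookupError);
            }
            return owner;
        }
    }

    public static void main(String[] args) {
        Client broken = new Client(() -> { throw new IOException("lookup failed"); });
        System.out.println(broken.call("get_table"));
        try {
            broken.reconnect();
        } catch (IllegalStateException e) {
            System.out.println("reconnect failed: " + e.getMessage());
        }
    }
}
```

The trade-off in this sketch is that a session which never reconnects is 
unaffected by the lookup failure, while a session that does reconnect fails 
rather than silently continuing as the login user.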

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-17853
>                 URL: https://issues.apache.org/jira/browse/HIVE-17853
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 3.0.0, 2.4.0, 2.2.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Chris Drome
>            Priority: Critical
>         Attachments: HIVE-17853.01-branch-2.2.patch, 
> HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore after a client timeout, transparently to the user.
> With user impersonation (e.g. the Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}} to run a workflow), we find that a reconnect after 
> a timeout causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.



