[jira] [Commented] (DRILL-5496) Must restart drillbits whenever a secure Hive metastore is restarted

Paul Rogers (JIRA) Fri, 12 May 2017 11:06:23 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008494#comment-16008494
 ]


Paul Rogers commented on DRILL-5496:
------------------------------------

As it turns out, the Hive client in the Hive storage plugin is not designed to 
handle security.

* When we start the Hive storage plugin, we create a single instance of the 
{{HiveSchemaFactory}}.
* {{HiveSchemaFactory}} holds on to a {{DrillHiveMetaStoreClient}} connection. 
In the secure case, we use this connection to get the security certificates 
needed to create secure connections.
* {{HiveSchemaFactory}} has a Guava loading cache of user-specific, secure 
connections.

When the Hive metastore goes down, all connections become invalid: both the 
non-secure connection and all the secure connections. We try to handle the 
problem as follows.

If a secure connection times out:

* Use the (now-invalid) insecure connection to get another ticket. But, since 
that connection is no longer valid, we can't get a ticket, and so the reconnect 
always fails.

If we try to use a cached secure connection before timeout, then this happens:

* Try to send a message.
* When that fails, try to reconnect (using the old certificate for the prior 
session).
* When that fails, give up.

What we really need to do is:

* Recreate both the insecure *and* secure connections.

But, since the secure connection cache is held on the insecure connection, we 
can't easily recreate that connection: we'd get a new object that no longer 
holds the cache.

So, we have to make some changes.

* Hold the secure connection cache on an object other than a connection.
* Use a connection proxy instead of the connection as key to the cache. The 
proxy allows maintaining the cache entry, but replacing the secure connection 
with a new one. (The proxy is just a wrapper around a replaceable secure 
connection.)
* Similarly, provide a thread-safe way to reconnect the non-secure connection 
used to get tickets for the secure connection.
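
To make the proposed changes concrete, here is a rough sketch of the proxy 
idea. It is not Drill code: {{MetaStoreConnection}}, {{ConnectionProxy}}, 
{{SecureConnectionCache}} and the {{connector}} function are all hypothetical 
names, the cache is a plain {{ConcurrentHashMap}} rather than the Guava loading 
cache, and the sketch assumes the proxy is stored as the cached entry for each 
user name. The cache lives on its own object, and the proxy swaps in a new 
connection under a lock when a call fails.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical stand-in for a metastore client connection (not Drill's API).
interface MetaStoreConnection {
    String showSchemas();
}

// A stable wrapper around a replaceable connection. Callers keep the proxy;
// only the wrapped connection changes after a metastore restart.
final class ConnectionProxy {
    private final String user;
    private final Function<String, MetaStoreConnection> connector;
    private MetaStoreConnection delegate;

    ConnectionProxy(String user, Function<String, MetaStoreConnection> connector) {
        this.user = user;
        this.connector = connector;
        this.delegate = connector.apply(user);
    }

    // synchronized so that only one thread at a time replaces the connection.
    synchronized String showSchemas() {
        try {
            return delegate.showSchemas();
        } catch (RuntimeException e) {
            // Assumption: any failure means the connection is stale.
            // Build a brand-new connection and retry once.
            delegate = connector.apply(user);
            return delegate.showSchemas();
        }
    }
}

// The cache is held on its own object, not on a connection, so replacing a
// connection never discards the cache.
final class SecureConnectionCache {
    private final ConcurrentMap<String, ConnectionProxy> proxies = new ConcurrentHashMap<>();
    private final Function<String, MetaStoreConnection> connector;

    SecureConnectionCache(Function<String, MetaStoreConnection> connector) {
        this.connector = connector;
    }

    ConnectionProxy forUser(String user) {
        return proxies.computeIfAbsent(user, u -> new ConnectionProxy(u, connector));
    }
}
```

In a real fix, {{connector}} would itself have to go through a reconnectable 
non-secure connection to fetch a fresh ticket, which is the third change listed 
above.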

All this is not a huge project, but it is more than can be done in the context 
of a quick fix for this ticket. So, for this ticket I used a bit of a hack: 
just throw away the entire schema builder and create a new one. But, that 
solution requires synchronizing all requests and is far from ideal.
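
For illustration only, the shape of that hack is roughly as follows. This is 
not the actual patch: {{RebuildOnFailure}} and its {{builder}} are hypothetical 
names, and the sketch assumes that any runtime failure means the whole object 
is stale. Every request goes through one lock, which is why the approach is 
far from ideal.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Holds one shared object (for example, a schema factory) and rebuilds it
// from scratch when a request against it fails.
final class RebuildOnFailure<F> {
    private final Supplier<F> builder;
    private F current;

    RebuildOnFailure(Supplier<F> builder) {
        this.builder = builder;
        this.current = builder.get();
    }

    // All requests are serialized on this lock so that no thread can use
    // the old object while another thread is replacing it.
    synchronized <R> R call(Function<F, R> request) {
        try {
            return request.apply(current);
        } catch (RuntimeException e) {
            // Throw away the old object, build a new one, retry once.
            current = builder.get();
            return request.apply(current);
        }
    }
}
```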

> Must restart drillbits whenever a secure Hive metastore is restarted
> --------------------------------------------------------------------
>
>                 Key: DRILL-5496
>                 URL: https://issues.apache.org/jira/browse/DRILL-5496
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-4964: "Drill fails to connect to hive metastore after hive metastore is 
> restarted unless drillbits are restarted also" attempted to fix a bug in 
> Drill in which Drill hangs if Hive is restarted. Now, we see that all 
> subsequent "show schemas" queries fail.
> Steps to repro:
> 1. Build a secure cluster (we used MapR)
> 2. Install Hive and Drill services
> 3. Configure drill impersonation and authentication
> 4. Restart hivemeta service
> 5. Connect to drill and execute query involving hive storage, issue occurs
> 6. Restart the drill-bits services and execute the query, issue is no longer 
> hit
> The problem occurs in the same place as the earlier fix, but might represent 
> a slightly different use case: in this case the connection is secure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
