Hi all,

While investigating a customer incident, I recently filed
HIVE-28977 <https://issues.apache.org/jira/browse/HIVE-28977> to optimize
the cleanup of the Hive delegation tokens (DTs).

Currently we can have 3 types of token stores: memory, DB, ZK.
Initially Hive kept the DTs only in memory, so the original design ran the
cleanup as a local Java thread. Today we typically have a standalone
metastore and the tokens are stored centrally (in the HMS database or in
ZK), so that design is inefficient: every HMS and HS2 instance runs the
same cleanup against the same store.
The point of the Hive Jira ticket is to do the DT cleanup from a single
place, the leader HMS instance.

Q1.
Per my understanding we can optimize only the DBTokenstore cleanup, but if
you agree we could include the ZKTokenstore too. However, I see some
risk in including the ZK tokenstore, as one can configure different ZNodes
for different (HS2/HMS) instances. Do you think anyone does that, or can
we assume that the same type of tokenstore and the same ZNode are used
within a single cluster?

Q2.
As we still need to keep the old behavior for the memory tokenstore, the
existing code should stay largely untouched; we cannot simply move it
under the metastore maintenance threads.
Because of that, I am thinking that the new code (a new HMS
maintenance "remote" thread) should simply "allow" or "disallow" the
run of the existing cleanup thread, so no major refactor would be needed.
Do you think that is OK?
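
To make the Q2 idea concrete, here is a rough sketch of the "allow / disallow"
gate. All class and method names are made up for illustration and are not
existing Hive code, and treating ZK the same as DB is only an assumption that
depends on the answer to Q1.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch only: none of these types exist in Hive.
public class TokenCleanupGateSketch {

    /** The three token store kinds mentioned above. */
    enum StoreType { MEMORY, DB, ZK }

    /** Flag the existing cleanup thread would check before each pass. */
    static final class CleanupGate {
        private final AtomicBoolean allowed = new AtomicBoolean(true);
        void set(boolean value) { allowed.set(value); }
        boolean isAllowed() { return allowed.get(); }
    }

    /** Stand-in for the existing local remover; only the guard is new. */
    static final class ExpiredTokenRemover {
        private final CleanupGate gate;
        final AtomicInteger passes = new AtomicInteger();

        ExpiredTokenRemover(CleanupGate gate) { this.gate = gate; }

        void runOnce() {
            if (!gate.isAllowed()) {
                return; // cleanup is owned by another instance (the leader HMS)
            }
            passes.incrementAndGet(); // real code would remove expired tokens here
        }
    }

    /** Decision the new HMS maintenance task would make on each run. */
    static boolean shouldCleanLocally(StoreType type, boolean isLeaderHms) {
        // The memory store is per process, so every instance keeps cleaning its own tokens.
        if (type == StoreType.MEMORY) {
            return true;
        }
        // Shared stores (DB, and ZK if we decide to include it): only the leader HMS cleans.
        return isLeaderHms;
    }

    public static void main(String[] args) {
        CleanupGate gate = new CleanupGate();
        ExpiredTokenRemover remover = new ExpiredTokenRemover(gate);

        gate.set(shouldCleanLocally(StoreType.DB, false)); // non-leader instance
        remover.runOnce();
        System.out.println("non-leader passes: " + remover.passes.get());

        gate.set(shouldCleanLocally(StoreType.DB, true)); // leader HMS
        remover.runOnce();
        System.out.println("leader passes: " + remover.passes.get());
    }
}
```

With this shape the existing thread body only gains one guard at the top, and
the memory tokenstore path behaves exactly as it does today.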

Thanks for sharing your thoughts on this.
 Miklos
