[ https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990098#comment-16990098 ]
Michael Ho commented on IMPALA-3189:
------------------------------------
Hi [~tlipcon], we still saw this on cold startup even with KRPC, once the
cluster was large enough (e.g. 300+ nodes). It manifests as some kind of
negotiation error, and we had to work around it by increasing a timeout
(see IMPALA-5901).
> Address scalability issue with N^2 KDC requests on cluster startup
> ------------------------------------------------------------------
>
> Key: IMPALA-3189
> URL: https://issues.apache.org/jira/browse/IMPALA-3189
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec, Security
> Affects Versions: Impala 2.5.0
> Reporter: Henry Robinson
> Priority: Critical
> Labels: kerberos, scalability
>
> When Impala runs a query that shuffles data amongst all nodes in a
> Kerberos-secured cluster, every node needs to acquire a service ticket from
> the TGS for every other node. In a cluster of 100 nodes or more, this can
> overwhelm the KDC, and queries can exit with an error
> ("Could not contact KDC for realm").
> A simple workaround is to run a warm-up query until it succeeds (which can
> take a few minutes after cluster startup). The KDC can also be scaled (e.g.
> with secondary KDC nodes).
> Impala could also either force TGS requests on start-up in a staggered
> fashion, or move to recommending SSL + client certificates for
> server<->server communication.
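
For a rough sense of the scale described above, here is a minimal sketch, not Impala code. It assumes each ordered pair of nodes needs exactly one ticket request on a cold start, and it models the "staggered" idea as uniformly random delays over a window. The function names and the 60-second window are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Optional


def cold_start_ticket_requests(num_nodes: int) -> int:
    """Ticket requests hitting the KDC if every node contacts every other node once.

    Assumption: one request per ordered (source, destination) pair.
    """
    return num_nodes * (num_nodes - 1)


def staggered_delays(num_peers: int, window_seconds: float,
                     seed: Optional[int] = None) -> List[float]:
    """Hypothetical staggering: one random delay per peer, spread over a window.

    Returns the delays (in seconds) in ascending order.
    """
    rng = random.Random(seed)
    return sorted(rng.uniform(0.0, window_seconds) for _ in range(num_peers))


if __name__ == "__main__":
    for nodes in (100, 300):
        print(nodes, "nodes ->", cold_start_ticket_requests(nodes), "requests")

    # Delays one node might apply before requesting tickets for its 299 peers.
    delays = staggered_delays(num_peers=299, window_seconds=60.0, seed=1)
    print("first delay: %.2fs, last delay: %.2fs" % (delays[0], delays[-1]))
```

Under these assumptions a 100-node cluster issues 9,900 requests and a 300-node cluster 89,700, which is why spreading the requests over time (or scaling the KDC) matters more as the cluster grows.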
--
This message was sent by Atlassian Jira
(v8.3.4#803005)