[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup

2019-12-06 Thread Michael Ho (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990098#comment-16990098
 ] 

Michael Ho commented on IMPALA-3189:


Hi [~tlipcon], we still saw this in the cold startup case even with KRPC at a 
large enough scale (e.g. 300+ nodes). It manifests as a negotiation error, 
and we had to work around it by increasing the timeout or similar (see 
IMPALA-5901).

> Address scalability issue with N^2 KDC requests on cluster startup
> ------------------------------------------------------------------
>
> Key: IMPALA-3189
> URL: https://issues.apache.org/jira/browse/IMPALA-3189
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec, Security
> Affects Versions: Impala 2.5.0
> Reporter: Henry Robinson
> Priority: Critical
> Labels: kerberos, scalability
>
> When Impala runs a query that shuffles data amongst all nodes in a 
> Kerberos-secured cluster, every node needs to acquire a service ticket 
> from the TGS for every other node. In a cluster of 100 nodes or more, this 
> can overwhelm the KDC, and queries can fail with an error 
> ("Could not contact KDC for realm").
> A simple workaround is to run a warm-up query until it succeeds (which can 
> take a few minutes after cluster startup). The KDC can also be scaled (e.g. 
> with secondary KDC nodes). 
> Impala could also either force a TGS request on start-up in a staggered 
> fashion, or move to recommending SSL + client certificates for 
> server<->server communication.
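
For a rough sense of scale, the request count described above can be worked out directly. The sketch below is illustrative only and is not Impala code: the function names, the 60-second window, and the uniform-random staggering are all assumptions made for this example. It counts one ticket request per ordered pair of nodes and shows one hypothetical way to spread start-up requests over a time window.

```python
import random
from typing import List


def ticket_requests(num_nodes: int) -> int:
    """Ticket requests if every node needs one ticket for every other node."""
    return num_nodes * (num_nodes - 1)


def staggered_delays(num_nodes: int, window_seconds: float,
                     seed: int = 0) -> List[float]:
    """Hypothetical per-node start delays, drawn uniformly from a window.

    This only illustrates the idea of staggering requests; it is an
    assumption for this example, not how Impala schedules anything.
    """
    rng = random.Random(seed)
    return [rng.uniform(0.0, window_seconds) for _ in range(num_nodes)]


if __name__ == "__main__":
    for n in (100, 300):
        print(n, "nodes ->", ticket_requests(n), "ticket requests")
    delays = staggered_delays(300, window_seconds=60.0)
    print("first node waits", round(delays[0], 2), "seconds")
```

Under these assumptions, 100 nodes issue 9,900 requests and 300 nodes issue 89,700, which is why the load on the KDC grows so quickly with cluster size.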



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup

2019-12-06 Thread Todd Lipcon (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990091#comment-16990091
 ] 

Todd Lipcon commented on IMPALA-3189:

This should be much better with KRPC, since we maintain long-running 
connections between nodes. Do people still see this issue on the first query 
after startup?
