[jira] [Created] (IMPALA-9221) Optimize HashRing's map implementation

2019-12-06 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-9221:
-

 Summary: Optimize HashRing's map implementation
 Key: IMPALA-9221
 URL: https://issues.apache.org/jira/browse/IMPALA-9221
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Affects Versions: Impala 3.4.0
Reporter: Joe McDonnell


The hash ring used for consistent scheduling currently uses a std::map for the 
hash-to-IpAddr lookup. HashRing is heavy on reads, with writes only happening 
when executors come and go. There are some cases where we copy the HashRing.

std::map is a node-based container, so it makes one small heap allocation per 
entry. This hurts cache locality, adds per-node overhead, and increases the 
cost of copying the structure. Something like Boost's flat_map or Abseil's 
btree_map is likely to be more efficient.
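
To make the trade-off concrete, here is a minimal sketch of a read-mostly ring kept in one sorted, contiguous vector, which is roughly the layout a flat map uses internally. It is not Impala's HashRing: the class name FlatHashRing, its methods, and the use of std::string as a stand-in for IpAddr are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Impala's IpAddr type.
using IpAddr = std::string;

// Illustrative ring: all entries live in a single sorted vector, so lookups
// are a binary search over contiguous memory and a copy is one vector copy.
class FlatHashRing {
 public:
  using Entry = std::pair<uint32_t, IpAddr>;

  // Inserts or overwrites the executor at a ring position. This is O(n)
  // because later entries shift, which is tolerable when writes are rare.
  void Add(uint32_t hash, IpAddr addr) {
    auto it = std::lower_bound(ring_.begin(), ring_.end(), hash, KeyLess);
    if (it != ring_.end() && it->first == hash) {
      it->second = std::move(addr);
    } else {
      ring_.emplace(it, hash, std::move(addr));
    }
    // Debug-only check that the ring stays sorted by hash.
    assert(std::is_sorted(ring_.begin(), ring_.end(), EntryLess));
  }

  // Removes every ring position owned by the given executor.
  void Remove(const IpAddr& addr) {
    ring_.erase(std::remove_if(ring_.begin(), ring_.end(),
                               [&](const Entry& e) { return e.second == addr; }),
                ring_.end());
  }

  // Returns the executor at the first position >= hash, wrapping around to
  // the start of the ring. Returns nullptr if the ring is empty.
  const IpAddr* Lookup(uint32_t hash) const {
    if (ring_.empty()) return nullptr;
    auto it = std::lower_bound(ring_.begin(), ring_.end(), hash, KeyLess);
    if (it == ring_.end()) it = ring_.begin();
    return &it->second;
  }

  std::size_t size() const { return ring_.size(); }

 private:
  static bool KeyLess(const Entry& e, uint32_t h) { return e.first < h; }
  static bool EntryLess(const Entry& a, const Entry& b) { return a.first < b.first; }

  std::vector<Entry> ring_;
};
```

With entries at positions 100 and 200, Lookup(150) returns the executor at 200 and Lookup(250) wraps around to the executor at 100. Whether this layout actually beats std::map for HashRing would need to be measured.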



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9103) Impala Doc: Document the new SHOW EXTENDED TABLE statement

2019-12-06 Thread Alexandra Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandra Rodoni updated IMPALA-9103:
-
Labels: future_release_doc  (was: future_release_doc in_34)

> Impala Doc: Document the new SHOW EXTENDED TABLE statement
> --
>
> Key: IMPALA-9103
> URL: https://issues.apache.org/jira/browse/IMPALA-9103
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alexandra Rodoni
>Assignee: Alexandra Rodoni
>Priority: Major
>  Labels: future_release_doc
>







[jira] [Updated] (IMPALA-8976) Impala Doc: Document Z-Ordering

2019-12-06 Thread Alexandra Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandra Rodoni updated IMPALA-8976:
-
Labels: future_release_doc  (was: future_release_doc in_34)

> Impala Doc: Document Z-Ordering
> ---
>
> Key: IMPALA-8976
> URL: https://issues.apache.org/jira/browse/IMPALA-8976
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alexandra Rodoni
>Assignee: Alexandra Rodoni
>Priority: Major
>  Labels: future_release_doc
>







[jira] [Updated] (IMPALA-9006) Consolidate the Statestore subscriber's retry logic

2019-12-06 Thread Michael Ho (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-9006:
---
Attachment: 76c83e9.diff

> Consolidate the Statestore subscriber's retry logic
> ---
>
> Key: IMPALA-9006
> URL: https://issues.apache.org/jira/browse/IMPALA-9006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 3.4.0
>Reporter: Michael Ho
>Assignee: Michael Ho
>Priority: Minor
> Attachments: 76c83e9.diff
>
>
> Currently, a Statestore subscriber starts a separate thread after the initial 
> registration with Statestore to periodically check whether the Statestore may 
> have failed and to re-register with Statestore if necessary. Similarly, the 
> function {{StatestoreSubscriber::Register()}} also relies on the old Thrift 
> client's retry logic to retry failed RPC attempts to Statestore. This is 
> needed because the initial registration relies on this retry logic to wait 
> for Statestore to start up in case an Impala daemon starts before the 
> Statestore. These two retry paths may be consolidated. 
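
One possible shape for a consolidated path is a single retry helper that both the initial registration and the later re-registration call. The sketch below is an assumption about how that could look, not the attached patch or Impala's code: RetryPolicy, RetryWithBackoff, and the exponential backoff are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical retry policy; none of these names come from Impala.
struct RetryPolicy {
  int max_attempts;                        // must be >= 1
  std::chrono::milliseconds initial_wait;  // wait after the first failure
  std::chrono::milliseconds max_wait;      // cap on the doubled wait
};

// Calls attempt() until it returns true or the attempts are used up,
// sleeping between failures and doubling the wait up to max_wait.
// Returns the number of attempts made on success, or 0 on failure.
inline int RetryWithBackoff(const RetryPolicy& policy,
                            const std::function<bool()>& attempt) {
  assert(policy.max_attempts >= 1);
  std::chrono::milliseconds wait = policy.initial_wait;
  for (int n = 1; n <= policy.max_attempts; ++n) {
    if (attempt()) return n;
    if (n == policy.max_attempts) break;  // no sleep after the last failure
    std::this_thread::sleep_for(wait);
    wait = std::min(wait * 2, policy.max_wait);
  }
  return 0;
}
```

Under this assumption, the initial registration would pass a policy with many attempts so that a daemon started before the Statestore keeps waiting, while the periodic failure check would pass a shorter one, and both would wrap the same registration call.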






[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup

2019-12-06 Thread Michael Ho (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990098#comment-16990098
 ] 

Michael Ho commented on IMPALA-3189:


Hi [~tlipcon], we still saw that in the cold startup case even with KRPC at 
a large enough scale (e.g. 300+ nodes). It manifests as some sort of 
negotiation error, and we had to work around it, e.g. by increasing the 
timeout (see IMPALA-5901).

> Address scalability issue with N^2 KDC requests on cluster startup
> --
>
> Key: IMPALA-3189
> URL: https://issues.apache.org/jira/browse/IMPALA-3189
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec, Security
>Affects Versions: Impala 2.5.0
>Reporter: Henry Robinson
>Priority: Critical
>  Labels: kerberos, scalability
>
> When Impala runs a query that shuffles data amongst all nodes in a 
> Kerberos-secured cluster, every node will need to acquire a TGS for every 
> other node. In a cluster of 100 nodes or more, this can overwhelm the KDC, 
> and queries can exit with an error ("Could not contact KDC for realm").
> A simple workaround is to run a warm-up query until it succeeds (which can 
> take a few minutes after cluster startup). The KDC can also be scaled (e.g. 
> with secondary KDC nodes). 
> Impala can also consider either forcing a TGS request on start-up in a 
> staggered fashion, or we can move to recommending SSL + client certificates 
> for server<->server communication.
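
The quadratic growth is easy to check: if each of n nodes needs a ticket for every other node, a full shuffle causes n * (n - 1) requests, which is 9,900 for 100 nodes and 89,700 for 300 nodes. The sketch below computes that count and shows one hypothetical way to stagger start-up requests by spreading nodes evenly over a time window. The function names and the even-spacing scheme are assumptions, not anything Impala implements.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Number of ticket requests when every node needs one ticket for every
// other node: n * (n - 1).
constexpr int64_t TgsRequestsForFullShuffle(int64_t num_nodes) {
  return num_nodes <= 1 ? 0 : num_nodes * (num_nodes - 1);
}

// Hypothetical staggering scheme: node i of n waits i/n of the window before
// issuing its start-up requests, so the KDC sees a steady trickle rather than
// one burst from every node at once.
inline std::chrono::milliseconds StaggerDelay(int64_t node_index,
                                              int64_t num_nodes,
                                              std::chrono::milliseconds window) {
  assert(num_nodes > 0);
  assert(node_index >= 0 && node_index < num_nodes);
  const int64_t window_ms = window.count();
  return std::chrono::milliseconds(window_ms * node_index / num_nodes);
}
```

For example, with 100 nodes and a 60-second window, node 50 would wait 30 seconds. Staggering spreads the load over time but does not reduce the total number of requests.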






[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup

2019-12-06 Thread Todd Lipcon (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990091#comment-16990091
 ] 

Todd Lipcon commented on IMPALA-3189:
-

This should be largely better with KRPC since we maintain long-running 
connections between nodes. Do people still see this issue on the first query 
after startup?







[jira] [Assigned] (IMPALA-9042) Support reading full-ACID ORC tables

2019-12-06 Thread Boglarka Egyed (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boglarka Egyed reassigned IMPALA-9042:
--

Assignee: Zoltán Borók-Nagy

> Support reading full-ACID ORC tables
> 
>
> Key: IMPALA-9042
> URL: https://issues.apache.org/jira/browse/IMPALA-9042
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Quanlong Huang
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>



