[jira] [Created] (IMPALA-9221) Optimize HashRing's map implementation
Joe McDonnell created IMPALA-9221:

Summary: Optimize HashRing's map implementation
Key: IMPALA-9221
URL: https://issues.apache.org/jira/browse/IMPALA-9221
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 3.4.0
Reporter: Joe McDonnell

The hash ring used for consistent scheduling currently uses a std::map for the hash-to-IpAddr lookup. HashRing is heavy on reads, with writes only happening when executors come and go. There are some cases where we copy the HashRing.

The standard map uses a large number of small allocations. This hurts cache performance, adds overhead, and also increases the cost of copying the structure. Something like boost's flat_map or Abseil's btree_map is likely to be more efficient.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
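The trade-off described in IMPALA-9221 can be illustrated with a minimal sketch of a sorted-vector ring, which is roughly the layout a flat_map provides. This is not Impala's HashRing: the class name FlatHashRing, the uint32_t hash type, the use of std::string for addresses, and the wrap-around lookup rule are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a hash ring; names and types are illustrative only.
class FlatHashRing {
 private:
  using Entry = std::pair<uint32_t, std::string>;  // (hash, IP address)

  // Orders an entry against a bare hash value for std::lower_bound.
  static bool HashLess(const Entry& e, uint32_t h) { return e.first < h; }

  // All entries live in one contiguous array, sorted by hash.
  std::vector<Entry> entries_;

 public:
  // Rare write path: insert while keeping the vector sorted (O(n) shift).
  void AddNode(uint32_t hash, std::string ip) {
    auto it = std::lower_bound(entries_.begin(), entries_.end(), hash, HashLess);
    entries_.insert(it, Entry(hash, std::move(ip)));
  }

  // Hot read path: binary search for the first entry whose hash is >= key.
  // Assumed ring semantics: wrap to the first entry when key is past the end.
  const std::string& Lookup(uint32_t key) const {
    assert(!entries_.empty());
    auto it = std::lower_bound(entries_.begin(), entries_.end(), key, HashLess);
    if (it == entries_.end()) it = entries_.begin();
    return it->second;
  }

  std::size_t size() const { return entries_.size(); }
};
```

Lookups stay O(log n) as with std::map, but they walk a contiguous array instead of chasing per-node pointers, and copying the ring copies one array rather than rebuilding a tree node by node. The cost is O(n) insertion, which fits a structure that only changes when executors come and go.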
[jira] [Updated] (IMPALA-9103) Impala Doc: Document the new SHOW EXTENDED TABLE statement
[ https://issues.apache.org/jira/browse/IMPALA-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexandra Rodoni updated IMPALA-9103:
Labels: future_release_doc (was: future_release_doc in_34)

> Impala Doc: Document the new SHOW EXTENDED TABLE statement
>
> Key: IMPALA-9103
> URL: https://issues.apache.org/jira/browse/IMPALA-9103
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alexandra Rodoni
> Assignee: Alexandra Rodoni
> Priority: Major
> Labels: future_release_doc
[jira] [Updated] (IMPALA-8976) Impala Doc: Document Z-Ordering
[ https://issues.apache.org/jira/browse/IMPALA-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexandra Rodoni updated IMPALA-8976:
Labels: future_release_doc (was: future_release_doc in_34)

> Impala Doc: Document Z-Ordering
>
> Key: IMPALA-8976
> URL: https://issues.apache.org/jira/browse/IMPALA-8976
> Project: IMPALA
> Issue Type: Sub-task
> Components: Docs
> Reporter: Alexandra Rodoni
> Assignee: Alexandra Rodoni
> Priority: Major
> Labels: future_release_doc
[jira] [Updated] (IMPALA-9006) Consolidate the Statestore subscriber's retry logic
[ https://issues.apache.org/jira/browse/IMPALA-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho updated IMPALA-9006:
Attachment: 76c83e9.diff

> Consolidate the Statestore subscriber's retry logic
>
> Key: IMPALA-9006
> URL: https://issues.apache.org/jira/browse/IMPALA-9006
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec
> Affects Versions: Impala 3.4.0
> Reporter: Michael Ho
> Assignee: Michael Ho
> Priority: Minor
> Attachments: 76c83e9.diff
>
> Currently, a Statestore subscriber starts a separate thread after the initial
> registration with the Statestore. The thread periodically checks whether the
> Statestore may have failed and re-registers with it if necessary. Similarly,
> the function {{StatestoreSubscriber::Register()}} relies on the old Thrift
> client's retry logic to retry failed RPC attempts to the Statestore. The
> initial registration needs this retry logic to wait for the Statestore to
> start up in case an Impala daemon starts before the Statestore.
> These two retry paths may be consolidated.
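One way to picture the consolidation proposed in IMPALA-9006 is a single retry helper that both the initial registration and the periodic re-registration check could call. The sketch below is hypothetical: RetryPolicy, RetryUntilSuccess, and the injected sleep callback are illustrative names, not Impala or Thrift APIs, and the attached diff may take a different approach.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical retry settings shared by every caller.
struct RetryPolicy {
  int max_attempts;                 // Total attempts, including the first one.
  std::chrono::milliseconds delay;  // Fixed pause between attempts.
};

// Runs `attempt` until it returns true or the attempt budget is exhausted.
// `sleep_fn` is injected so that callers (and tests) control how waiting is done.
bool RetryUntilSuccess(
    const RetryPolicy& policy, const std::function<bool()>& attempt,
    const std::function<void(std::chrono::milliseconds)>& sleep_fn) {
  assert(policy.max_attempts > 0);
  for (int i = 0; i < policy.max_attempts; ++i) {
    if (attempt()) return true;
    // Do not sleep after the final failed attempt.
    if (i + 1 < policy.max_attempts) sleep_fn(policy.delay);
  }
  return false;
}
```

With a helper of this shape, the startup path would pass a registration attempt, and the failure-detection thread would pass the same attempt when it decides re-registration is needed, so both paths share one retry policy.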
[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup
[ https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990098#comment-16990098 ]

Michael Ho commented on IMPALA-3189:

Hi [~tlipcon], we still saw that in the cold startup case even with KRPC under a large enough scale (e.g. 300+ nodes). It will manifest as some sort of negotiation error and we had to increase the timeout or something to work around it (see IMPALA-5901)

> Address scalability issue with N^2 KDC requests on cluster startup
>
> Key: IMPALA-3189
> URL: https://issues.apache.org/jira/browse/IMPALA-3189
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec, Security
> Affects Versions: Impala 2.5.0
> Reporter: Henry Robinson
> Priority: Critical
> Labels: kerberos, scalability
>
> When Impala runs a query that shuffles data amongst all nodes in a
> Kerberos-secured cluster, every node will need to acquire a TGS for every
> other node. In a cluster of 100 nodes or more, this can overwhelm the KDC,
> and queries can exit with an error ("Could not contact KDC for realm").
> A simple workaround is to run a warm-up query until it succeeds (which can
> take a few minutes after cluster startup). The KDC can also be scaled (e.g.
> with secondary KDC nodes).
> Impala can also consider either forcing a TGS request on start-up in a
> staggered fashion, or we can move to recommending SSL + client certificates
> for server<->server communication.
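The N^2 growth in the issue description follows directly from every node needing a ticket for every other node. The sketch below works through that arithmetic and shows one possible way to stagger start-up requests. Both function names and the even-spacing scheme are assumptions for illustration, not something Impala implements.

```cpp
#include <cassert>
#include <cstdint>

// Each of the N nodes requests a service ticket for each of the other N - 1.
inline uint64_t TgsRequestCount(uint64_t num_nodes) {
  return num_nodes == 0 ? 0 : num_nodes * (num_nodes - 1);
}

// Hypothetical staggering: node i waits i/N of the window before its first
// TGS request, so start-up requests are spread evenly across `window_ms`.
inline uint64_t StaggerDelayMs(uint64_t node_index, uint64_t num_nodes,
                               uint64_t window_ms) {
  assert(num_nodes > 0);
  return (node_index % num_nodes) * window_ms / num_nodes;
}
```

For the cluster sizes mentioned in this thread, TgsRequestCount(100) is 9,900 and TgsRequestCount(300) is 89,700, so tripling the cluster size multiplies the burst on the KDC roughly ninefold.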
[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup
[ https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990091#comment-16990091 ]

Todd Lipcon commented on IMPALA-3189:

This should be largely better with KRPC since we maintain long-running connections between nodes. Do people still see this issue on the first query after startup?
[jira] [Assigned] (IMPALA-9042) Support reading full-ACID ORC tables
[ https://issues.apache.org/jira/browse/IMPALA-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boglarka Egyed reassigned IMPALA-9042:
Assignee: Zoltán Borók-Nagy

> Support reading full-ACID ORC tables
>
> Key: IMPALA-9042
> URL: https://issues.apache.org/jira/browse/IMPALA-9042
> Project: IMPALA
> Issue Type: New Feature
> Reporter: Quanlong Huang
> Assignee: Zoltán Borók-Nagy
> Priority: Major