[jira] [Updated] (IMPALA-8730) TestExplain.test_explain_validate_cardinality_estimates flakiness
[ https://issues.apache.org/jira/browse/IMPALA-8730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Mantripragada updated IMPALA-8730: - Description: This could be a data loading issue; the cardinality estimates are off. Below is the error message:
{code:java}
metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
    check_cardinality(result.data, '7.30K')
metadata/test_explain.py:98: in check_cardinality
    query_result, expected_cardinality=expected_cardinality)
metadata/test_explain.py:86: in check_row_size_and_cardinality
    assert m.groups()[1] == expected_cardinality
E   assert '7.00K' == '7.30K'
E   - 7.00K
E   ?   ^
E   + 7.30K
E   ?   ^
{code}
was: Another flaky test uncovered by IMPALA-7608. The test relies on exact cardinality numbers, but after IMPALA-7608 these could be nondeterministic. Below is the error message:
{code:java}
metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
    check_cardinality(result.data, '7.30K')
metadata/test_explain.py:98: in check_cardinality
    query_result, expected_cardinality=expected_cardinality)
metadata/test_explain.py:86: in check_row_size_and_cardinality
    assert m.groups()[1] == expected_cardinality
E   assert '7.00K' == '7.30K'
E   - 7.00K
E   ?   ^
E   + 7.30K
E   ?   ^
{code}
> TestExplain.test_explain_validate_cardinality_estimates flakiness
> --
>
> Key: IMPALA-8730
> URL: https://issues.apache.org/jira/browse/IMPALA-8730
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.3.0
> Reporter: Anurag Mantripragada
> Assignee: Fang-Yu Rao
> Priority: Critical
> Labels: broken-build
>
> This could be a data loading issue; the cardinality estimates are off.
> Below is the error message:
> {code:java}
> metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
>     check_cardinality(result.data, '7.30K')
> metadata/test_explain.py:98: in check_cardinality
>     query_result, expected_cardinality=expected_cardinality)
> metadata/test_explain.py:86: in check_row_size_and_cardinality
>     assert m.groups()[1] == expected_cardinality
> E   assert '7.00K' == '7.30K'
> E   - 7.00K
> E   ?   ^
> E   + 7.30K
> E   ?   ^
> {code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
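Since the flakiness comes from asserting an exact cardinality string, one possible mitigation is to compare within a relative tolerance instead of requiring an exact match. The sketch below is illustrative only: the regex and the abbreviation parser are simplified stand-ins, not the actual helpers in metadata/test_explain.py.

```python
import re

def check_cardinality(query_result, expected, rel_tol=0.1):
    """Assert that the EXPLAIN cardinality is within rel_tol of expected.

    Sketch only: the regex and abbreviation parser are simplified
    stand-ins for the real helpers in metadata/test_explain.py.
    """
    def parse_abbrev(s):
        # Expand Impala's abbreviated row counts ('7.30K', '1.2M') to floats.
        units = {'K': 1e3, 'M': 1e6, 'B': 1e9}
        if s and s[-1] in units:
            return float(s[:-1]) * units[s[-1]]
        return float(s)

    m = re.search(r'cardinality=([0-9.]+[KMB]?)', '\n'.join(query_result))
    assert m, 'no cardinality found in EXPLAIN output'
    actual, target = parse_abbrev(m.group(1)), parse_abbrev(expected)
    assert abs(actual - target) <= rel_tol * target, (
        '%s not within %.0f%% of %s' % (m.group(1), rel_tol * 100, expected))

# '7.00K' vs '7.30K' differs by about 4%, so a 10% tolerance accepts it.
check_cardinality(['00:SCAN HDFS ... cardinality=7.00K'], '7.30K', rel_tol=0.1)
```

A tolerance like this would make the test robust to small estimate drift while still catching grossly wrong cardinalities.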
[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875278#comment-16875278 ] Michael Ho edited comment on IMPALA-8712 at 6/28/19 11:00 PM: -- We may be able to work around some of the serialization overhead by serializing some of the immutable Thrift-based RPC parameters once and sending them as a sidecar. This should avoid having to serialize them once per backend.
was (Author: kwho): We may be able to work around some of the serialization overhead by sending some of the currently Thrift-based RPC parameters as a sidecar or something.
> Convert ExecQueryFInstance() RPC to become asynchronous
> ---
>
> Key: IMPALA-8712
> URL: https://issues.apache.org/jira/browse/IMPALA-8712
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Affects Versions: Impala 3.3.0
> Reporter: Michael Ho
> Assignee: Thomas Tauber-Marshall
> Priority: Major
>
> Now that IMPALA-7467 is fixed, ExecQueryFInstance() can utilize the async RPC
> capabilities of KRPC instead of relying on the half-baked way of using
> {{ExecEnv::exec_rpc_thread_pool_}} to start query fragment instances. We
> already have a reactor thread pool in KRPC to handle sending client RPCs
> asynchronously. Various tasks under IMPALA-5486 can also benefit from
> making ExecQueryFInstance() asynchronous so that the RPCs can be cancelled.
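The serialize-once idea can be shown with a small sketch. This is not Impala code: the real implementation would use Thrift serialization and KRPC sidecars in C++; here pickle and the send_rpc callback are hypothetical stand-ins for the serializer and the RPC layer.

```python
import pickle  # stand-in for Thrift serialization

def dispatch_per_backend(shared_params, per_backend_params, backends, send_rpc):
    # Naive approach: re-serialize the shared (immutable) parameters for
    # every backend, paying the serialization cost once per backend:
    #   for b in backends:
    #       send_rpc(b, pickle.dumps((shared_params, per_backend_params[b])))
    #
    # Sidecar approach: serialize the immutable parameters exactly once and
    # attach the same byte blob to every RPC as a sidecar.
    shared_blob = pickle.dumps(shared_params)          # serialized once
    for b in backends:
        unique_blob = pickle.dumps(per_backend_params[b])
        send_rpc(b, unique_blob, sidecar=shared_blob)  # blob reused, not re-encoded

sent = []
dispatch_per_backend(
    {'query_id': 'q1', 'plan': '...'},
    {'b1': {'scan_ranges': [1]}, 'b2': {'scan_ranges': [2]}},
    ['b1', 'b2'],
    lambda b, blob, sidecar: sent.append((b, sidecar)),
)
# Both RPCs carry the identical sidecar object; it was serialized only once.
```

With N backends, the shared portion is serialized once instead of N times, which is the overhead reduction the comment describes.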
[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873765#comment-16873765 ] Michael Ho edited comment on IMPALA-8712 at 6/28/19 10:57 PM: -- On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result. Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}. was (Author: kwho): On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result. Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}. 
-If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack.-
> Convert ExecQueryFInstance() RPC to become asynchronous
> ---
>
> Key: IMPALA-8712
> URL: https://issues.apache.org/jira/browse/IMPALA-8712
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Affects Versions: Impala 3.3.0
> Reporter: Michael Ho
> Assignee: Thomas Tauber-Marshall
> Priority: Major
>
> Now that IMPALA-7467 is fixed, ExecQueryFInstance() can utilize the async RPC
> capabilities of KRPC instead of relying on the half-baked way of using
> {{ExecEnv::exec_rpc_thread_pool_}} to start query fragment instances. We
> already have a reactor thread pool in KRPC to handle sending client RPCs
> asynchronously. Various tasks under IMPALA-5486 can also benefit from
> making ExecQueryFInstance() asynchronous so that the RPCs can be cancelled.
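The parallel-serialization concern above can be sketched as follows. This is an illustration only, not Impala's C++ code: concurrent.futures plays the role of {{exec_rpc_thread_pool_}}, and note that in CPython the GIL limits true parallelism for pure-Python serialization, unlike the C++ thread pool.

```python
import pickle  # stand-in for Thrift serialization
from concurrent.futures import ThreadPoolExecutor

def serialize_all(param_list, pool_size=8):
    """Serialize each backend's RPC parameters on a thread pool, mirroring
    how exec_rpc_thread_pool_ spreads serialization work across threads."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # map() preserves input order, so blob i corresponds to backend i.
        return list(pool.map(pickle.dumps, param_list))

# With huge parameters (e.g. many scan ranges per backend), serializing on a
# single thread before each async RPC could regress versus this fan-out.
blobs = serialize_all(
    [{'backend': i, 'scan_ranges': list(range(100))} for i in range(4)])
assert len(blobs) == 4
```

This is the trade-off the comment raises: a plain conversion to async RPC would move all serialization onto one thread unless the reactor threads (or another pool) pick it up.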
[jira] [Updated] (IMPALA-8730) TestExplain.test_explain_validate_cardinality_estimates flakiness
[ https://issues.apache.org/jira/browse/IMPALA-8730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Mantripragada updated IMPALA-8730: - Summary: TestExplain.test_explain_validate_cardinality_estimates flakiness (was: TestExplain.test_explain_validate_cardinality_estimates flakiness due to reliance on exact cardinality numbers.)
> TestExplain.test_explain_validate_cardinality_estimates flakiness
> --
>
> Key: IMPALA-8730
> URL: https://issues.apache.org/jira/browse/IMPALA-8730
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.3.0
> Reporter: Anurag Mantripragada
> Assignee: Fang-Yu Rao
> Priority: Critical
> Labels: broken-build
>
> Another flaky test uncovered by IMPALA-7608. The test relies on exact
> cardinality numbers, but after IMPALA-7608 these could be nondeterministic.
>
> Below is the error message:
> {code:java}
> metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
>     check_cardinality(result.data, '7.30K')
> metadata/test_explain.py:98: in check_cardinality
>     query_result, expected_cardinality=expected_cardinality)
> metadata/test_explain.py:86: in check_row_size_and_cardinality
>     assert m.groups()[1] == expected_cardinality
> E   assert '7.00K' == '7.30K'
> E   - 7.00K
> E   ?   ^
> E   + 7.30K
> E   ?   ^
> {code}
[jira] [Comment Edited] (IMPALA-8730) TestExplain.test_explain_validate_cardinality_estimates flakiness due to reliance on exact cardinality numbers.
[ https://issues.apache.org/jira/browse/IMPALA-8730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875266#comment-16875266 ] Fang-Yu Rao edited comment on IMPALA-8730 at 6/28/19 10:38 PM: --- Thanks for pointing this out, Anurag. After executing "SHOW TABLE STATS functional.alltypes;" from the Impala shell on my local dev box, I found that the table functional.alltypes DOES have statistics, i.e., #Rows is not "-1" for any partition. That is, the cardinality derived from these stats should be deterministic. However, the patch set for IMPALA-7608 ([https://gerrit.cloudera.org/c/12974/]) should not affect the code path when an HDFS table already has statistics, and thus I suspect IMPALA-7608 might not be the cause of this problem. For easy reference, the output of the statement follows.
{code:java}
+------+-------+-------+--------+---------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------------+
| year | month | #Rows | #Files | Size    | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                                          |
+------+-------+-------+--------+---------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------------+
| 2009 | 1     | 310   | 1      | 19.95KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=1  |
| 2009 | 2     | 280   | 1      | 18.12KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=2  |
| 2009 | 3     | 310   | 1      | 20.06KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=3  |
| 2009 | 4     | 300   | 1      | 19.61KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=4  |
| 2009 | 5     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=5  |
| 2009 | 6     | 300   | 1      | 19.71KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=6  |
| 2009 | 7     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=7  |
| 2009 | 8     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=8  |
| 2009 | 9     | 300   | 1      | 19.71KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=9  |
| 2009 | 10    | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=10 |
| 2009 | 11    | 300   | 1      | 19.71KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=11 |
| 2009 | 12    | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=12 |
| 2010 | 1     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=1  |
| 2010 | 2     | 280   | 1      | 18.39KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=2  |
| 2010 | 3     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=3  |
| 2010 | 4     | 300   | 1      | 19.71KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=4  |
| 2010 | 5     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=5  |
| 2010 | 6     | 300   | 1      | 19.71KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=6  |
| 2010 | 7     | 310   | 1      | 20.36KB | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/alltypes/year=2010/month=7  |
| 2010
[jira] [Created] (IMPALA-8731) Balance queries between executor groups
Lars Volker created IMPALA-8731:
---
Summary: Balance queries between executor groups
Key: IMPALA-8731
URL: https://issues.apache.org/jira/browse/IMPALA-8731
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 3.3.0
Reporter: Lars Volker

After IMPALA-8484, we should revisit the assignment policy that we use to distribute queries to executor groups. In particular, we should implement a policy that balances queries across executor groups instead of filling them up one by one.
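The difference between the current fill-first behavior and the proposed balancing policy can be sketched as below. The group fields and the selection criterion (lowest fractional load) are assumptions for illustration, not Impala's actual admission-control structures.

```python
def pick_group_fill_first(groups):
    # Current behavior described in the ticket: take the first group that
    # still has capacity, filling groups up one by one.
    for g in groups:
        if g['running'] < g['capacity']:
            return g['name']
    return None  # no group has capacity

def pick_group_balanced(groups):
    # Proposed behavior: send the query to the group with the lowest
    # fractional load, spreading work across all groups.
    candidates = [g for g in groups if g['running'] < g['capacity']]
    if not candidates:
        return None
    return min(candidates, key=lambda g: g['running'] / g['capacity'])['name']

groups = [
    {'name': 'group-1', 'running': 3, 'capacity': 10},
    {'name': 'group-2', 'running': 0, 'capacity': 10},
]
print(pick_group_fill_first(groups))  # group-1 (still has room, so it keeps filling)
print(pick_group_balanced(groups))    # group-2 (least loaded)
```

A least-loaded policy like this avoids hot-spotting one group while others sit idle, at the cost of waking more groups for a light workload.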
[jira] [Created] (IMPALA-8730) TestExplain.test_explain_validate_cardinality_estimates flakiness due to reliance on exact cardinality numbers.
Anurag Mantripragada created IMPALA-8730:
---
Summary: TestExplain.test_explain_validate_cardinality_estimates flakiness due to reliance on exact cardinality numbers.
Key: IMPALA-8730
URL: https://issues.apache.org/jira/browse/IMPALA-8730
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 3.3.0
Reporter: Anurag Mantripragada
Assignee: Fang-Yu Rao

Another flaky test uncovered by IMPALA-7608. The test relies on exact cardinality numbers, but after IMPALA-7608 these could be nondeterministic. Below is the error message:
{code:java}
metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
    check_cardinality(result.data, '7.30K')
metadata/test_explain.py:98: in check_cardinality
    query_result, expected_cardinality=expected_cardinality)
metadata/test_explain.py:86: in check_row_size_and_cardinality
    assert m.groups()[1] == expected_cardinality
E   assert '7.00K' == '7.30K'
E   - 7.00K
E   ?   ^
E   + 7.30K
E   ?   ^
{code}
[jira] [Closed] (IMPALA-8728) Impala Doc: Remove --pull_incremental_statistics flag from the doc
[ https://issues.apache.org/jira/browse/IMPALA-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni closed IMPALA-8728. --- Resolution: Duplicate Already removed in IMPALA-8667 > Impala Doc: Remove --pull_incremental_statistics flag from the doc > -- > > Key: IMPALA-8728 > URL: https://issues.apache.org/jira/browse/IMPALA-8728 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Alex Rodoni >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc, in_33 > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8729) Impala Doc: Describe how to enable Metadata V2
Alex Rodoni created IMPALA-8729: --- Summary: Impala Doc: Describe how to enable Metadata V2 Key: IMPALA-8729 URL: https://issues.apache.org/jira/browse/IMPALA-8729 Project: IMPALA Issue Type: Task Components: Docs Reporter: Alex Rodoni Assignee: Alex Rodoni -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8728) Impala Doc: Remove --pull_incremental_statistics flag from the doc
Alex Rodoni created IMPALA-8728: --- Summary: Impala Doc: Remove --pull_incremental_statistics flag from the doc Key: IMPALA-8728 URL: https://issues.apache.org/jira/browse/IMPALA-8728 Project: IMPALA Issue Type: Sub-task Components: Docs Reporter: Alex Rodoni Assignee: Alex Rodoni -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8427) Document the behavior change in IMPALA-7800
[ https://issues.apache.org/jira/browse/IMPALA-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875200#comment-16875200 ] Alex Rodoni commented on IMPALA-8427: - https://gerrit.cloudera.org/#/c/13762/ > Document the behavior change in IMPALA-7800 > --- > > Key: IMPALA-8427 > URL: https://issues.apache.org/jira/browse/IMPALA-8427 > Project: IMPALA > Issue Type: Task > Components: Clients >Affects Versions: Impala 3.3.0 >Reporter: Michael Ho >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc, in_33 > > IMPALA-7800 changes the default behavior of client connection timeout with > HS2 and Beeswax Thrift servers. Quote from the commit message: > {noformat} > The current implementation of the FE thrift server waits > indefinitely to open the new session, if the maximum number of > FE service threads specified by --fe_service_threads has been > allocated. > This patch introduces a startup flag to control how the server > should treat new connection requests if we have run out of the > configured number of server threads. > If --accepted_client_cnxn_timeout > 0, new connection requests are > rejected by the server if we can't get a server thread within > the specified timeout. > We set the default timeout to be 5 minutes. The old behavior > can be restored by setting --accepted_client_cnxn_timeout=0, > i.e., no timeout. The timeout applies only to client facing thrift > servers, i.e., HS2 and Beeswax servers. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
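The timeout semantics described in the quoted commit message can be sketched as follows. This is an illustrative sketch only, not Impala's C++ Thrift-server code; the semaphore stands in for the pool of `--fe_service_threads` service threads.

```python
# A new connection waits up to `timeout_s` for a free service thread and is
# rejected if none frees up in time; a timeout of 0 restores the old
# behavior of waiting indefinitely.
import threading

def try_accept(pool: threading.Semaphore, timeout_s: float):
    """Return True if a service thread was acquired, False if rejected."""
    if timeout_s == 0:           # 0 disables the timeout: block indefinitely
        pool.acquire()
        return True
    return pool.acquire(timeout=timeout_s)
```

With a pool of one thread, the first connection is accepted and a second connection arriving while the pool is exhausted is rejected once its timeout elapses.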
[jira] [Updated] (IMPALA-8399) Batch the authorization requests to Ranger
[ https://issues.apache.org/jira/browse/IMPALA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8399: - Issue Type: Sub-task (was: Improvement) Parent: IMPALA-7916 > Batch the authorization requests to Ranger > -- > > Key: IMPALA-8399 > URL: https://issues.apache.org/jira/browse/IMPALA-8399 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Priority: Major > > To reduce the network round trip we should consider batching authorization > requests to Ranger. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
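The batching idea above can be sketched as follows. This is an illustrative sketch, not Ranger's actual API; `authorize_batch` stands in for a hypothetical bulk-check endpoint, and the allow/deny rule is made up for the example.

```python
# Collect N per-resource authorization checks and send them in a single
# round trip, instead of issuing one network request per resource.

def authorize_batch(requests):
    """Hypothetical server-side bulk check: allow anything under 'db1'."""
    return [r["resource"].startswith("db1.") for r in requests]

def check_all(resources, user):
    """Authorize every resource for `user` in one batched call."""
    requests = [{"user": user, "resource": r} for r in resources]
    results = authorize_batch(requests)   # one round trip for N checks
    return dict(zip(resources, results))
```

The point of the design is visible in `check_all`: however many resources a statement touches, the authorization cost is one network round trip rather than one per resource.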
[jira] [Updated] (IMPALA-8651) Update grammar for Ranger revoke grant option statement
[ https://issues.apache.org/jira/browse/IMPALA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8651: - Issue Type: Sub-task (was: Improvement) Parent: IMPALA-7916 > Update grammar for Ranger revoke grant option statement > --- > > Key: IMPALA-8651 > URL: https://issues.apache.org/jira/browse/IMPALA-8651 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Austin Nobis >Priority: Major > > *REVOKE GRANT OPTION FOR SELECT ON DATABASE DB FROM USER USR;* > In Ranger, it is not possible to *REVOKE GRANT OPTION* for a specific > privilege (*SELECT*). The *GRANT OPTION* must be revoked for the entire > *DATABASE DB* resource. The Impala grammar should be updated to support > omitting the *FOR SELECT* if the authorization provider is set to Ranger. If > that grammar is used for Sentry, it should throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8399) Batch the authorization requests to Ranger
[ https://issues.apache.org/jira/browse/IMPALA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8399: - Priority: Major (was: Critical) > Batch the authorization requests to Ranger > -- > > Key: IMPALA-8399 > URL: https://issues.apache.org/jira/browse/IMPALA-8399 > Project: IMPALA > Issue Type: Improvement > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Priority: Major > > To reduce the network round trip we should consider batching authorization > requests to Ranger. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8651) Update grammar for Ranger revoke grant option statement
[ https://issues.apache.org/jira/browse/IMPALA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8651: - Fix Version/s: (was: Impala 3.3.0) > Update grammar for Ranger revoke grant option statement > --- > > Key: IMPALA-8651 > URL: https://issues.apache.org/jira/browse/IMPALA-8651 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Reporter: Austin Nobis >Priority: Major > > *REVOKE GRANT OPTION FOR SELECT ON DATABASE DB FROM USER USR;* > In Ranger, it is not possible to *REVOKE GRANT OPTION* for a specific > privilege (*SELECT*). The *GRANT OPTION* must be revoked for the entire > *DATABASE DB* resource. The Impala grammar should be updated to support > omitting the *FOR SELECT* if the authorization provider is set to Ranger. If > that grammar is used for Sentry, it should throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8651) Update grammar for Ranger revoke grant option statement
[ https://issues.apache.org/jira/browse/IMPALA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8651: - Issue Type: Improvement (was: Sub-task) Parent: (was: IMPALA-7916) > Update grammar for Ranger revoke grant option statement > --- > > Key: IMPALA-8651 > URL: https://issues.apache.org/jira/browse/IMPALA-8651 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Reporter: Austin Nobis >Priority: Major > Fix For: Impala 3.3.0 > > > *REVOKE GRANT OPTION FOR SELECT ON DATABASE DB FROM USER USR;* > In Ranger, it is not possible to *REVOKE GRANT OPTION* for a specific > privilege (*SELECT*). The *GRANT OPTION* must be revoked for the entire > *DATABASE DB* resource. The Impala grammar should be updated to support > omitting the *FOR SELECT* if the authorization provider is set to Ranger. If > that grammar is used for Sentry, it should throw an exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8399) Batch the authorization requests to Ranger
[ https://issues.apache.org/jira/browse/IMPALA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8399: - Issue Type: Improvement (was: Sub-task) Parent: (was: IMPALA-7916) > Batch the authorization requests to Ranger > -- > > Key: IMPALA-8399 > URL: https://issues.apache.org/jira/browse/IMPALA-8399 > Project: IMPALA > Issue Type: Improvement > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Priority: Critical > > To reduce the network round trip we should consider batching authorization > requests to Ranger. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8727) Impala Doc: DDL Docs Update for Kudu / HMS Integration
Alex Rodoni created IMPALA-8727: --- Summary: Impala Doc: DDL Docs Update for Kudu / HMS Integration Key: IMPALA-8727 URL: https://issues.apache.org/jira/browse/IMPALA-8727 Project: IMPALA Issue Type: Sub-task Components: Docs Reporter: Alex Rodoni Assignee: Alex Rodoni -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-5392) Stack depth for threads printed in the Catalog UI under JVM Threads is not deep enough
[ https://issues.apache.org/jira/browse/IMPALA-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875072#comment-16875072 ] Abhiroj Panwar commented on IMPALA-5392: Can I take this up? > Stack depth for threads printed in the Catalog UI under JVM Threads is not > deep enough > -- > > Key: IMPALA-5392 > URL: https://issues.apache.org/jira/browse/IMPALA-5392 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 2.9.0 >Reporter: Mostafa Mokhtar >Priority: Major > Labels: newbie, supportability > Attachments: CatalogUI_FixedStackTrace.png, StackFrame_len_1.png, > StackFrame_len_2.png > > > The depth of the call stack is not sufficient to understand the status of the > system. > |Summary||CPU time (s)||User time (s)||Blocked time (ms)||Blocked > times||Native| > |"Thread-11" Id=39 RUNNABLE (in native) at > java.net.SocketInputStream.socketRead0(Native Method) at > java.net.SocketInputStream.read(SocketInputStream.java:152) at > java.net.SocketInputStream.read(SocketInputStream.java:122) at > java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at > java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at > java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked > java.io.BufferedInputStream@5172f7b7 at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) ... 
> Number of locked synchronizers = 1 - > java.util.concurrent.locks.ReentrantLock$NonfairSync@52e53927|72.09|58.61|-1|1|true| > |"Thread-34" Id=72 RUNNABLE (in native)|48.0821|39.3|-1|0|true| > |"Thread-7" Id=35 WAITING on > java.util.concurrent.locks.ReentrantLock$NonfairSync@52e53927 owned by > "Thread-11" Id=39 at sun.misc.Unsafe.park(Native Method) - waiting on > java.util.concurrent.locks.ReentrantLock$NonfairSync@52e53927 at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) > at > java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214) > at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290) at > org.apache.impala.catalog.CatalogServiceCatalog.getCatalogObjects(CatalogServiceCatalog.java:331) > ...|38.1586|17.27|-1|0|false| > |"Thread-20" Id=53 RUNNABLE (in native)|34.9055|28.29|-1|1|true| > |"pool-3-thread-4" Id=63 WAITING on > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@612caa8c > at sun.misc.Unsafe.park(Native Method) - waiting on > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@612caa8c > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at 
java.lang.Thread.run(Thread.java:745)|6.85982|5.19|-1|7314|false| > |"pool-3-thread-8" Id=88 WAITING on > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@612caa8c > at sun.misc.Unsafe.park(Native Method) - waiting on > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@612caa8c > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745)|5.09183|3.35|-1|8022|false| > |"pool-3-thread-10" Id=107 WAITING on > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@612caa8c > at sun.misc.Unsafe.park(Native Method) - waiting on > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@612caa8c > at
[jira] [Updated] (IMPALA-8726) Autovectorisation leads to worse performance in bit unpacking
[ https://issues.apache.org/jira/browse/IMPALA-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker updated IMPALA-8726: -- Attachment: no_vector.png > Autovectorisation leads to worse performance in bit unpacking > - > > Key: IMPALA-8726 > URL: https://issues.apache.org/jira/browse/IMPALA-8726 > Project: IMPALA > Issue Type: Improvement >Reporter: Daniel Becker >Priority: Minor > Attachments: no_vector.png > > > The compiler (GCC 4.9.2) autovectorises bit unpacking for bit widths 1, 2, 4 > and 8 (function BitPacking::UnpackValues), but this leads to actually worse > performance (see the attached graph). We should consider whether it is worth > disabling autovectorisation for bit unpacking, but future compiler versions > may do a better job. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8726) Autovectorisation leads to worse performance in bit unpacking
[ https://issues.apache.org/jira/browse/IMPALA-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker updated IMPALA-8726: -- Attachment: (was: no_vector.png) > Autovectorisation leads to worse performance in bit unpacking > - > > Key: IMPALA-8726 > URL: https://issues.apache.org/jira/browse/IMPALA-8726 > Project: IMPALA > Issue Type: Improvement >Reporter: Daniel Becker >Priority: Minor > > The compiler (GCC 4.9.2) autovectorises bit unpacking for bit widths 1, 2, 4 > and 8 (function BitPacking::UnpackValues), but this leads to actually worse > performance (see the attached graph). We should consider whether it is worth > disabling autovectorisation for bit unpacking, but future compiler versions > may do a better job. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8726) Autovectorisation leads to worse performance in bit unpacking
Daniel Becker created IMPALA-8726: - Summary: Autovectorisation leads to worse performance in bit unpacking Key: IMPALA-8726 URL: https://issues.apache.org/jira/browse/IMPALA-8726 Project: IMPALA Issue Type: Improvement Reporter: Daniel Becker Attachments: no_vector.png The compiler (GCC 4.9.2) autovectorises bit unpacking for bit widths 1, 2, 4 and 8 (function BitPacking::UnpackValues), but this leads to actually worse performance (see the attached graph). We should consider whether it is worth disabling autovectorisation for bit unpacking, but future compiler versions may do a better job. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-8712) Convert ExecQueryFInstance() RPC to become asynchronous
[ https://issues.apache.org/jira/browse/IMPALA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873765#comment-16873765 ] Michael Ho edited comment on IMPALA-8712 at 6/28/19 7:22 AM: - On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result. Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}. -If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack.- was (Author: kwho): On the other hand, {{exec_rpc_thread_pool_}} allows serialization of the RPC parameters to happen in parallel so it may not strictly be a simple conversion to asynchronous RPC without regression. So careful evaluation with huge RPC parameters (e.g. a large number of scan ranges) may be needed to see if there may be regression as a result. Some of the serialization overhead with ExecQueryFInstance() RPC even after IMPALA-7467 is still Thrift related as we just serialize a bunch of Thrift structures into a binary blob and send them via KRPC sidecar. The serialization is done in parallel by threads in {{exec_rpc_thread_pool_}}. If we convert those Thrift structures into Protobuf, then the serialization can be done in parallel by reactor threads in the KRPC stack. 
> Convert ExecQueryFInstance() RPC to become asynchronous > --- > > Key: IMPALA-8712 > URL: https://issues.apache.org/jira/browse/IMPALA-8712 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 3.3.0 >Reporter: Michael Ho >Assignee: Thomas Tauber-Marshall >Priority: Major > > Now that IMPALA-7467 is fixed, ExecQueryFInstance() can utilize the async RPC > capabilities of KRPC instead of relying on the half-baked way of using > {{ExecEnv::exec_rpc_thread_pool_}} to start query fragment instances. We > already have a reactor thread pool in KRPC to handle sending client RPCs > asynchronously. Also various tasks under IMPALA-5486 can also benefit from > making ExecQueryFInstance() asynchronous so the RPCs can be cancelled. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
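The asynchronous-exec idea above can be sketched as follows. This is an illustrative sketch only, not KRPC or Impala coordinator code; a thread pool plays the role of KRPC's async machinery, and all names are hypothetical.

```python
# Start one Exec-style RPC per backend without blocking the caller, and hand
# back futures so the coordinator can later wait on or cancel the calls --
# the cancellation capability is what the issue highlights for IMPALA-5486.
from concurrent.futures import ThreadPoolExecutor

def exec_on_backends(backends, send_rpc):
    """Issue `send_rpc(backend)` asynchronously for every backend.

    Returns a dict of backend -> Future; callers can .result(), .cancel(),
    or inspect each outstanding call.
    """
    pool = ThreadPoolExecutor(max_workers=len(backends))
    futures = {b: pool.submit(send_rpc, b) for b in backends}
    pool.shutdown(wait=False)   # don't block; already-submitted calls run
    return futures
```

The caller regains control immediately after submission, which is the behavioral change the issue proposes over the synchronous `exec_rpc_thread_pool_` approach.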