[Impala-ASF-CR] IMPALA-10648: Invalidate catalogd table cache for hms ddl apis which modify tables and partitions.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17298 ) Change subject: IMPALA-10648: Invalidate catalogd table cache for hms ddl apis which modify tables and partitions. .. Patch Set 3: (13 comments) The solution looks good to me. Please adjust the code style to match existing codes. Thanks! http://gerrit.cloudera.org:8080/#/c/17298/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17298/3//COMMIT_MSG@9 PS3, Line 9: For non transactional tables Could you explain why we don't do this for transactional tables in the commit message? http://gerrit.cloudera.org:8080/#/c/17298/3//COMMIT_MSG@13 PS3, Line 13: (since table loading in cache takes time) but ensures consistency. This change is behind catalogd nit: please adjust the commit message body to fit into at-most 72 chars per line. http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java File fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java: http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@141 PS3, Line 141: import org.apache.hadoop.hive.metastore.api.InvalidPartitionException; Is this used somewhere? http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@222 PS3, Line 222: UnknownPartitionException Is this used somewhere? http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@347 PS3, Line 347: nit: We use 4 spaces indention for multi-line statement. However, many changes in this patch use 8 spaces. Could you configure your IDE to adjust this? http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@349 PS3, Line 349: nit: 4 spaces indention http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@915 PS3, Line 915: partitionList Is it possible that these partitions belong to different tables? * If no, let's call invalidateNonTransactionalTableIfExists() once * If yes, let's de-duplicate the table names first and then call invalidateNonTransactionalTableIfExists() for each table. http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@929 PS3, Line 929: for (PartitionSpec partitionSpec : list) { Same question as above http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1086 PS3, Line 1086: nit: 4 spaces indention http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@1087 PS3, Line 1087: , nit: 8 spaces indention and move the comma above http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@2849 PS3, Line 2849: if nit: In our code style, we need one space after "if". Many changes in this patch don't have it. Please adjust them. Thanks! http://gerrit.cloudera.org:8080/#/c/17298/3/fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java@2903 PS3, Line 2903: // return immediately if flag invalidateCacheOnDDLs_ is false : if(!invalidateCacheOnDDLs_) { : LOG.debug("Not removing table {}.{} from catalogd cache because " + : "invalidateCacheOnDDLs_ flag is set to {} ", dbNameWithCatalog, : tableName, invalidateCacheOnDDLs_); : return; : } : // Parse db name. Throw error if parsing fails. : String dbName = dbNameWithCatalog; : try { : dbName = MetaStoreUtils.parseDbName(dbNameWithCatalog, serverConf_)[1]; : } catch (MetaException ex) { : LOG.error("Successfully executed HMS api: {} but encountered error " + : "when parsing dbName {} to invalidate/remove table from cache with error message: {}", : apiName, dbNameWithCatalog, ex.getMessage()); : throw ex; : } : Db db = catalog_.getDb(dbName); : if(db == null) { : LOG.debug("Not removing table {}.{} because db {} does not exist in catalogd cache", : dbName, tableName, dbName); : return; : } : if(!db.containsTable(tableName)) { : LOG.debug("Not removing table {}.{} becau
[Impala-ASF-CR] IMPALA-10652: Optimize the checking of the size of incremental stats
liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17299 ) Change subject: IMPALA-10652: Optimize the checking of the size of incremental stats .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/17299/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/17299/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@448 PS2, Line 448: numOfAllIncStats > May rename the variable to numOfAllIncStatsPartitions Done http://gerrit.cloudera.org:8080/#/c/17299/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@452 PS2, Line 452: if (partitionSet_ == null) { : numOfAllIncStatsPartitions = allPartitio > We may not need to verify the size limit when the partition set for increme It is a pre-check on the size of incremental stats to prevent the incremental stats from occupying too much memory after calculation. If no partition is specified, all partitions are calculated. Need to check whether the incremental stats of all partitions exceeds the threshold after calculation. -- To view, visit http://gerrit.cloudera.org:8080/17299 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4f35ea936445015a3b8b8102b1891db29751b5ee Gerrit-Change-Number: 17299 Gerrit-PatchSet: 3 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: liuyao Gerrit-Comment-Date: Wed, 14 Apr 2021 06:18:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10652: Optimize the checking of the size of incremental stats
Hello Aman Sinha, Qifan Chen, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17299 to look at the new patch set (#3). Change subject: IMPALA-10652: Optimize the checking of the size of incremental stats .. IMPALA-10652: Optimize the checking of the size of incremental stats Modify the estimation method of incremental statistics size: incremental statistics size = Existing partition statistics + This time calculation partition stats - Repeated calculation partition stats Testing: All partitions of a table have no incremental stats. --Calculate the incremental stats of all partitions, the incremental stats size exceeds the threshold, an error is reported. --Calculate the incremental stats of one partition, no error is reported. Change-Id: I4f35ea936445015a3b8b8102b1891db29751b5ee --- M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java 2 files changed, 45 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17299/3 -- To view, visit http://gerrit.cloudera.org:8080/17299 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4f35ea936445015a3b8b8102b1891db29751b5ee Gerrit-Change-Number: 17299 Gerrit-PatchSet: 3 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
fifteencai has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. IMPALA-10445: Adjust NDV's scale with query option This is a new way to control NDV's scale. Since IMPALA-2658, we can trade memory for more accurate estimation by setting larger `scale` in SQL function NDV(, ). However the use of larger NDV scale requires the modification of SQL queries which may not be practical in certain applications: - Firstly, SQL writers are reluctant to lower that scale. They prone to fill up the scale, which will make the cluster unstable. Especially when there are `group by`s with high cardinalities. So it is wiser to let cluster admin other than sql writer choose appropriate scale. - Secondly, In some application scenarios, queries are stored in DBs. In a BI system, for example, rewriting thousands of SQLs is risky. In this commit, we introduced a new Query Option `DEFAULT_NDV_SCALE` with the following semantics: 1. The allowed value is in the range [1..10]; 2. Previously, the scale used in NDV() functions was fixed at 2. Now the scale is provided by the newly added query options. 3. It does not influence the NDV scale for SQL function NDV(, ) in which the NDV scale is provided by the 2nd argument . We also refactored method `Analyze` to make sure APPX_COUNT_DISTINCT can work with this query option. After this, cluster admins can substitute `count(distinct )` with `ndv(, scale)`. Implementation details: - The default value of DEFAULT_NDV_SCALE is 2, so we won't change the default ndv behavior. - We port `CountDistinctToNdv` transform logic from `SelectStmt.analyze()` to `ExprRewriter`, making it compatible with further rewrite rules. - The newly added rewrite rule `DefaultNdvScaleRule` is applied after `CountDistinctToNdvRule`. Usage: To set a default ndv scale: ``` SET DEFAULT_NDV_SCALE = 10; ``` To unset: ``` SET DEFAULT_NDV_SCALE = 2; ``` Here are test results of a typical workload (cardinality=40,090,650): ++ | Metric| Count Distinct |NDV2|NDV5|NDV10 | ++ | Memory(GB) | 3.83 |1.84|1.85| 1.89 | | Duration(s) | 182.89| 30.22|29.72 | 29.24 | | ErrorRate |0% |1.8%|1.17% | 0.06% | ++ Testing: 1) Added 3 unit test cases in `ExprRewriteRulesTest`. 2) Added 5 unit test cases in `ExprRewriterTest`. 3) Ran all front-end unit test, passed. 4) Added a new query-option test. Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java A fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java A fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java 11 files changed, 298 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17306/12 -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 12 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: fifteencai
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. Patch Set 11: (2 comments) Looks good! http://gerrit.cloudera.org:8080/#/c/17306/11//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17306/11//COMMIT_MSG@14 PS11, Line 14: y nit: queries http://gerrit.cloudera.org:8080/#/c/17306/11/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java: http://gerrit.cloudera.org:8080/#/c/17306/11/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@539 PS11, Line 539: } The above tests look good! Can we add several more to test the combination of the two new rules: 1. SELECT count(DISTINCT id), ndv(id) FROM functional.alltypes 2. setDefault_ndv_scale(9) and then SELECT count(DISTINCT id), ndv(id) FROM functional.alltypes 3. setDefault_ndv_scale(5) and then SELECT count(DISTINCT id), ndv(id), ndv(id, 10) FROM functional.alltypes 3. setDefault_ndv_scale(5) and then SELECT ndv(id, 10), count(DISTINCT id), ndv(id) FROM functional.alltypes 4. setAppx_count_distinct(true);setDefault_ndv_scale(9) and then SELECT count(DISTINCT id), ndv(id) FROM functional.alltypes -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 11 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: fifteencai Gerrit-Comment-Date: Wed, 14 Apr 2021 02:08:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR](asf-site) Add slack channel in the community page
Quanlong Huang has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17311 ) Change subject: Add slack channel in the community page .. Add slack channel in the community page This patch adds links to join the apache-impala slack channel. Change-Id: I26112aff63cabf6f4ca437f12d71928e80106687 Reviewed-on: http://gerrit.cloudera.org:8080/17311 Reviewed-by: Jim Apple Tested-by: Quanlong Huang --- M community.html 1 file changed, 1 insertion(+), 0 deletions(-) Approvals: Jim Apple: Looks good to me, approved Quanlong Huang: Verified -- To view, visit http://gerrit.cloudera.org:8080/17311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: merged Gerrit-Change-Id: I26112aff63cabf6f4ca437f12d71928e80106687 Gerrit-Change-Number: 17311 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR](asf-site) Add slack channel in the community page
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17311 ) Change subject: Add slack channel in the community page .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: asf-site Gerrit-MessageType: comment Gerrit-Change-Id: I26112aff63cabf6f4ca437f12d71928e80106687 Gerrit-Change-Number: 17311 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 14 Apr 2021 00:58:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
fifteencai has posted comments on this change. ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. Patch Set 11: Thanks a lot to Quanlong and Qifan !!! I have reworded the commit msg and added unit tests. Please take a look. -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 11 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: fifteencai Gerrit-Comment-Date: Wed, 14 Apr 2021 00:56:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
fifteencai has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. IMPALA-10445: Adjust NDV's scale with query option This is a new way to control NDV's scale. Since IMPALA-2658, we can trade memory for more accurate estimation by setting larger `scale` in SQL function NDV(, ). However the use of larger NDV scale requires the modification of SQL query which may not be practical in certain applications: - Firstly, SQL writers are reluctant to lower that scale. They prone to fill up the scale, which will make the cluster unstable. Especially when there are `group by`s with high cardinalities. So it is wiser to let cluster admin other than sql writer choose appropriate scale. - Secondly, In some application scenarios, queries are stored in DBs. In a BI system, for example, rewriting thousands of SQLs is risky. In this commit, we introduced a new Query Option `DEFAULT_NDV_SCALE` with the following semantics: 1. The allowed value is in the range [1..10]; 2. Previously, the scale used in NDV() functions was fixed at 2. Now the scale is provided by the newly added query options. 3. It does not influence the NDV scale for SQL function NDV(, ) in which the NDV scale is provided by the 2nd argument . We also refactored method `Analyze` to make sure APPX_COUNT_DISTINCT can work with this query option. After this, cluster admins can substitute `count(distinct )` with `ndv(, scale)`. Implementation details: - The default value of DEFAULT_NDV_SCALE is 2, so we won't change the default ndv behavior. - We port `CountDistinctToNdv` transform logic from `SelectStmt.analyze()` to `ExprRewriter`, making it compatible with further rewrite rules. - The newly added rewrite rule `DefaultNdvScaleRule` is applied after `CountDistinctToNdvRule`. Usage: To set a default ndv scale: ``` SET DEFAULT_NDV_SCALE = 10; ``` To unset: ``` SET DEFAULT_NDV_SCALE = 2; ``` Here are test results of a typical workload (cardinality=40,090,650): ++ | Metric| Count Distinct |NDV2|NDV5|NDV10 | ++ | Memory(GB) | 3.83 |1.84|1.85| 1.89 | | Duration(s) | 182.89| 30.22|29.72 | 29.24 | | ErrorRate |0% |1.8%|1.17% | 0.06% | ++ Testing: 1) Added 3 unit test cases in `ExprRewriteRulesTest`. 2) Added 5 unit test cases in `ExprRewriterTest`. 3) Ran all front-end unit test, passed. 4) Added a new query-option test. Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java A fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java A fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java 11 files changed, 252 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17306/11 -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 11 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP: IMPALA-10656: Fire insert events before commit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17313 ) Change subject: WIP: IMPALA-10656: Fire insert events before commit .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7066/ -- To view, visit http://gerrit.cloudera.org:8080/17313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 Gerrit-Change-Number: 17313 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 14 Apr 2021 00:45:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test overlap min max filters seems flaky
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17289 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7069/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f Gerrit-Change-Number: 17289 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 23:37:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test overlap min max filters seems flaky
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17289 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f Gerrit-Change-Number: 17289 Gerrit-PatchSet: 5 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 23:37:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test overlap min max filters seems flaky
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17289 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f Gerrit-Change-Number: 17289 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 23:36:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10657: Remove accidental usage of shaded imports
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17314 ) Change subject: IMPALA-10657: Remove accidental usage of shaded imports .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa67aec96539db7861416f3e71cca83607e3d8c9 Gerrit-Change-Number: 17314 Gerrit-PatchSet: 2 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 13 Apr 2021 23:24:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10657: Remove accidental usage of shaded imports
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17314 ) Change subject: IMPALA-10657: Remove accidental usage of shaded imports .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7068/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa67aec96539db7861416f3e71cca83607e3d8c9 Gerrit-Change-Number: 17314 Gerrit-PatchSet: 2 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 13 Apr 2021 23:24:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10657: Remove accidental usage of shaded imports
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17314 ) Change subject: IMPALA-10657: Remove accidental usage of shaded imports .. Patch Set 1: Code-Review+2 This makes sense, and looking through the history, I agree that these don't seem to have any purpose. -- To view, visit http://gerrit.cloudera.org:8080/17314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa67aec96539db7861416f3e71cca83607e3d8c9 Gerrit-Change-Number: 17314 Gerrit-PatchSet: 1 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 13 Apr 2021 22:47:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10657: Remove accidental usage of shaded imports
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17314 ) Change subject: IMPALA-10657: Remove accidental usage of shaded imports .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8569/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifa67aec96539db7861416f3e71cca83607e3d8c9 Gerrit-Change-Number: 17314 Gerrit-PatchSet: 1 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 13 Apr 2021 21:42:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10657: Remove accidental usage of shaded imports
John Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17314 Change subject: IMPALA-10657: Remove accidental usage of shaded imports .. IMPALA-10657: Remove accidental usage of shaded imports - This changes seemingly accidental usages of shaded imports with the direct dependency - This is helpful to reduce confusion and possibly reduce the required jars for certain usages of the frontend jars Change-Id: Ifa67aec96539db7861416f3e71cca83607e3d8c9 --- M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java 4 files changed, 6 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/17314/1 -- To view, visit http://gerrit.cloudera.org:8080/17314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifa67aec96539db7861416f3e71cca83607e3d8c9 Gerrit-Change-Number: 17314 Gerrit-PatchSet: 1 Gerrit-Owner: John Sherman
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java File fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java: http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java@38 PS9, Line 38: DefaultNdvScaleRule > I wonder if this rule can be merged to the CountDistinctToNdvRule. That is, It seems this rule by itself is also important to transform NDV() to NDV(, ) under the table when transform count(distinct) to NDV() is not needed. I am OK with keeping this rule. -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 9 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 13 Apr 2021 21:18:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8568/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 20:59:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc@733 PS4, Line 733: dvancing_read_page = true; > This DCHECK was part of my debugging when investigating the bug. Done http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc@750 PS4, Line 750: > Ack Done. Reword it a bit. -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 20:41:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Hello Quanlong Huang, Qifan Chen, Joe McDonnell, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17195 to look at the new patch set (#5). Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. IMPALA-10584: Defer advancing read page when reservation is tight. TestScratchLimit::test_with_unlimited_scratch_limit has been intermittently crashing in ubuntu-16.04-dockerised-tests environment after result spooling is enabled by default in IMPALA-9856. DCHECK violation occurs in ReservationTracker::CheckConsistency() due to BufferedTupleStream wrongly tries to reclaim memory reservation while unpinning the stream. For this bug to surface, all of the following needs to happen: - Stream is in pinned mode. - There are only 2 pages in the stream: 1 read and 1 write. - Stream can not increase reservation anymore either due to memory pressure or low buffer/memory limit. - The stream read page has been fully read and is attached to output RowBatch. But the output RowBatch has not cleaned up yet. - BufferedTupleStream::UnpinStream is invoked. The memory accounting bug happens because UnpinStream proceeds to NextReadPage where the read page buffer was mistakenly assumed as released. default_page_len_ bytes were added into write_page_reservation_ and subsequently violates the total memory reservation. This patch fixes the bug by deferring advancement of the read iterator in UnpinStream if the read page is attached to output RowBatch and there is no more unused reservation. This is OK because after UnpinStream finished, the stream is now in unpinned mode and has_read_write_page is false. The next AddRow operation is then allowed to unpin the previous write page first before reusing the reservation to allocate a new write page. Testing: - Add be test DeferAdvancingReadPage. - Loop the TestScratchLimit::test_with_unlimited_scratch_limit in my local dev machine and verify that each test passed without triggering the DCHECK violation. - Reenable result spooling in TestScratchLimit that was disabled in IMPALA-10559. - Pass core tests. Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a --- M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h M tests/query_test/test_scratch_limit.py 4 files changed, 96 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17195/5 -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 21: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7067/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 21 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 20:13:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 21: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 21 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 20:13:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 20: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 20:12:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc@733 PS4, Line 733: read_it_.read_page_ == pages_.begin() > I wonder if this assertion is always true, since in pinned mode, there can This DCHECK was part of my debugging when investigating the bug. The intent is to match assertion in NextReadPage (buffered-tuple-stream.cc:529). Agree that this DCHECK most likely will always be true since we check earlier that attached_to_output_batch == true. Will remove this DCHECK in next patch set. http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc@750 PS4, Line 750: unpin this first page > May add a comment on when this currently unpinned page is finally unpinned. Ack -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 20:06:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10654: Fix precision loss in DecimalValue to double conversion.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17303 ) Change subject: IMPALA-10654: Fix precision loss in DecimalValue to double conversion. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8567/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1 Gerrit-Change-Number: 17303 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 19:58:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc@733 PS4, Line 733: read_it_.read_page_ == pages_.begin() I wonder if this assertion is always true, since in pinned mode, there can be multiple pages in the stream in memory. The current read page (i.e. read_it_.read_page_) may be any of these pages? http://gerrit.cloudera.org:8080/#/c/17195/4/be/src/runtime/buffered-tuple-stream.cc@750 PS4, Line 750: unpin this first page May add a comment on when this currently unpinned page is finally unpinned. -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 19:53:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10654: Fix precision loss in DecimalValue to double conversion.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17303 ) Change subject: IMPALA-10654: Fix precision loss in DecimalValue to double conversion. .. Patch Set 5: (20 comments) http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h File be/src/thirdparty/fast_double_parser/fast_double_parser.h: http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@13 PS5, Line 13: #if (defined(sun) || defined(__sun)) line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@17 PS5, Line 17: #if defined(__CYGWIN__) || defined(__MINGW32__) || defined(__MINGW64__) line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@22 PS5, Line 22: * Determining whether we should import xlocale.h or not is line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@25 PS5, Line 25: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || defined(FAST_DOUBLE_PARSER_CYGWIN) line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@67 PS5, Line 67: #endif // defined(FAST_DOUBLE_PARSER_SOLARIS) || defined(FAST_DOUBLE_PARSER_CYGWIN) line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@84 PS5, Line 84: * However, we have that line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@86 PS5, Line 86: * Thus it is possible for a number of the form w * 10^-342 where line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@94 PS5, Line 94: * Any number of form w * 10^309 where w>= 1 is going to be line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@153 PS5, Line 153: // credit: https://stackoverflow.com/questions/28868367/getting-the-high-part-of-64-bit-integer-multiplication line too long (110 > 90) http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@154 PS5, Line 154: really_inline uint64_t Emulate64x64to128(uint64_t& r_hi, const uint64_t x, const uint64_t y) { line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@159 PS5, Line 159: line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@224 PS5, Line 224: */ line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@958 PS5, Line 958: line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@960 PS5, Line 960: // The exponent is 1024 + 63 + power line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@976 PS5, Line 976: // The 65536 is (1<<16) and corresponds to line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@979 PS5, Line 979: // ((152170 * power ) >> 16) is equal to line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@980 PS5, Line 980: // floor(log(5**power)/log(2)) line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@982 PS5, Line 982: // Note that this is not magic: 152170/(1<<16) is line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@984 PS5, Line 984: // The 1<<16 value is a power of two; we could use a line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/17303/5/be/src/thirdparty/fast_double_parser/fast_double_parser.h@1097 PS5, Line 1097: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || defined(FAST_DOUBLE_PARSER_CYGWIN) line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/17303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1 Gerrit-Change-Number: 17303 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 19:38:33 + Gerrit-HasComme
[Impala-ASF-CR] IMPALA-10654: Fix precision loss in DecimalValue to double conversion.
Amogh Margoor has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17303 ) Change subject: IMPALA-10654: Fix precision loss in DecimalValue to double conversion. .. IMPALA-10654: Fix precision loss in DecimalValue to double conversion. Original approach to convert DecimalValue(internal representation of decimals) to double was not accurate. It was: static_cast(value_) / pow(10.0, scale). However only integers from −2^53 to 2^53 can be represented accurately by double precision without any loss. Hence, it would not work for numbers like -0.43149576573887316. For DecimalValue representing -0.43149576573887316, value_ would be -43149576573887316 and scale would be 17. As value_ < -2^53, result would not be accurate. In newer approach we are using third party library https://github.com/lemire/fast_double_parser, which handles above scenario in a performant manner. Testing: 1. Added End to End Tests covering following scenarios: a. Test to show precision limitation of 16 in the write path b. DecimalValue's value_ between -2^53 and 2^53. b. value_ outside above range but abs(value_) < UINT64_MAX c. abs(value_) > UINT64_MAX -covers DecimalValue<__int128_t> 2. Ran existing backend and end-to-end tests completely Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1 --- M be/src/runtime/decimal-value.inline.h A be/src/thirdparty/fast_double_parser/LICENSE A be/src/thirdparty/fast_double_parser/LICENSE.BSL A be/src/thirdparty/fast_double_parser/README.md A be/src/thirdparty/fast_double_parser/fast_double_parser.h M bin/rat_exclude_files.txt M testdata/workloads/functional-query/queries/QueryTest/values.test 7 files changed, 1,568 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/17303/5 -- To view, visit http://gerrit.cloudera.org:8080/17303 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1 Gerrit-Change-Number: 17303 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8566/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 19:22:40 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-10656: Fire insert events before commit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17313 ) Change subject: WIP: IMPALA-10656: Fire insert events before commit .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8565/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 Gerrit-Change-Number: 17313 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 19:17:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 4: (2 comments) Patch set 4 submitted. http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream-test.cc File be/src/runtime/buffered-tuple-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream-test.cc@1237 PS3, Line 1237: TEST_F(SimpleTupleStreamTest, UnpinReadPage) { > It is possible to reproduce the bug with some changes in this test? Added DeferAdvancingReadPage to test this corner case. http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream.cc@723 PS3, Line 723: > Does this assume the size of the read page is default_page_len_? Should we That is a good point, thanks! However, replacing it with read_it_.read_page_->len() caught this DCHECK failure: "buffer-pool.cc:93] Check failed: is_open()". I think this because the the page handle has been closed in AttachBufferToBatch(). I made a work around by memorizing the page len in last_attached_page_len_ variable. -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 19:07:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Hello Quanlong Huang, Joe McDonnell, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17195 to look at the new patch set (#4). Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. IMPALA-10584: Defer advancing read page when reservation is tight. TestScratchLimit::test_with_unlimited_scratch_limit has been intermittently crashing in ubuntu-16.04-dockerised-tests environment after result spooling is enabled by default in IMPALA-9856. DCHECK violation occurs in ReservationTracker::CheckConsistency() due to BufferedTupleStream wrongly tries to reclaim memory reservation while unpinning the stream. For this bug to surface, all of the following needs to happen: - Stream is in pinned mode. - There are only 2 pages in the stream: 1 read and 1 write. - Stream can not increase reservation anymore either due to memory pressure or low buffer/memory limit. - The stream read page has been fully read and is attached to output RowBatch. But the output RowBatch has not cleaned up yet. - BufferedTupleStream::UnpinStream is invoked. The memory accounting bug happens because UnpinStream proceeds to NextReadPage where the read page buffer was mistakenly assumed as released. default_page_len_ bytes were added into write_page_reservation_ and subsequently violates the total memory reservation. This patch fixes the bug by deferring advancement of the read iterator in UnpinStream if the read page is attached to output RowBatch and there is no more unused reservation. This is OK because after UnpinStream finished, the stream is now in unpinned mode and has_read_write_page is false. The next AddRow operation is then allowed to unpin the previous write page first before reusing the reservation to allocate a new write page. Testing: - Add be test DeferAdvancingReadPage. - Loop the TestScratchLimit::test_with_unlimited_scratch_limit in my local dev machine and verify that each test passed without triggering the DCHECK violation. - Reenable result spooling in TestScratchLimit that was disabled in IMPALA-10559. - Pass core tests. Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a --- M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h M tests/query_test/test_scratch_limit.py 4 files changed, 97 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17195/4 -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] WIP: IMPALA-10656: Fire insert events before commit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17313 ) Change subject: WIP: IMPALA-10656: Fire insert events before commit .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17313/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/17313/3/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@879 PS3, Line 879: unpartTable.getFileSystem(), new Path(unpartTable.getHdfsBaseDir()), overwrite, line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 Gerrit-Change-Number: 17313 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 18:57:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-10656: Fire insert events before commit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17313 ) Change subject: WIP: IMPALA-10656: Fire insert events before commit .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7066/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 Gerrit-Change-Number: 17313 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 18:57:16 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-10656: Fire insert events before commit
Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17313 Change subject: WIP: IMPALA-10656: Fire insert events before commit .. WIP: IMPALA-10656: Fire insert events before commit Before this fix Impala committed an insert first, then reloaded the table from HMS, and generated the insert events based on the difference between the two snapshots. (e.g. which file was not present in the old snapshot but are there in the new one). Hive replication expects the insert events before the commit, so this may potentially lead to issues there. The solution is to collect the new files during the insert in the backend, and send the insert events based on this file set. This wasn't very hard to do as we were alrady collecting the files in some cases: - to move them from staging dir to their final location in case of non-partitioned tables - to write the file list to snapshot files in case of Icebert tables This patch unifies the paths above and collects all information about the created files regardless of the table type. Testing: - no new tests - I wasn't able to run EE tests yet, this is the reason behind the poc state - done some basic manual testing Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 --- M be/src/exec/hbase-table-sink.cc M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-text-table-writer.cc M be/src/exec/output-partition.h M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java 12 files changed, 225 insertions(+), 216 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/17313/3 -- To view, visit http://gerrit.cloudera.org:8080/17313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2ed812dbcb5f55efff3a910a3daeeb76cd3295b9 Gerrit-Change-Number: 17313 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17300 ) Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 17:59:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17300 ) Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats This change updates the exception that is thrown when the user tries to insert into a partition which has unsupported format. The information to make this decision is available during analysis, therefore this commit also moves the check from the planner to the analyzer to have an earlier result. In the analyzer only the FeFsTables have to be checked therefore Kudu tables are not related. Also, there is a difference between static and dynamic partition clauses, for static partition clauses the partition format is available during compile, for dynaminc partition clauses it is only avaialble during runtime. Testing: - Added unit tests - Ran exhaustive tests successfully Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Reviewed-on: http://gerrit.cloudera.org:8080/17300 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/datasets/functional/functional_schema_template.sql 4 files changed, 78 insertions(+), 21 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 4 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 20: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 17:45:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. Patch Set 9: (6 comments) Looks good to me! http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG@12 PS9, Line 12: estimation by setting larger `scale`. That scale is decided by SQL in SQL function NDV(, ) http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG@11 PS9, Line 11: Since IMPALA-2658, we can trade memory for more accurate : estimation by setting larger `scale`. That scale is decided by SQL : writers. However, it is a bumpy road for cluster admins to allow for : larger scales. Here lies 2 reasons: May rewrite the sentence as follows. Since IMPALA-2658, we can trade memory for more accurate estimation by setting larger NDV scale in SQL function NDV(, ). However, the use of larger NDV scale requires the modification of the NDV function in queries which may not be practical in certain applications. http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG@24 PS9, Line 24: In this commit, we introduced a new Query Option `DEFAULT_NDV_SCALE`. : During to the advantage of query option, Cluster admins can either tune : 1 desired query, or influence upcoming queries by placing a default : query option in a dynamic resource pool. This paragraph can be rewritten as follows. In this commit, we introduced a new Query Option `DEFAULT_NDV_SCALE` with the following semantics. 1. The allowed value is in the range [2..10]; 2. The value of the query option is the default NDV scale for the SQL function of form NDV(). Previously, the NDV scale used in such functions is fixed at 2; 3. It does not influence the NDV scale for SQL function NDV(, ) in which the NDV scale is provided by the 2nd argument . http://gerrit.cloudera.org:8080/#/c/17306/9/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/17306/9/common/thrift/ImpalaService.thrift@655 PS9, Line 655: , make it easier to change NDV's scale may change to " to SQL NDV function NDV(. http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java File fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java: http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@58 PS9, Line 58: Do function name substitution Probably should say "Create the substituted form". http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java File fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java: http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java@38 PS9, Line 38: DefaultNdvScaleRule I wonder if this rule can be merged to the CountDistinctToNdvRule. That is, if APPX_COUNT_DISTINCT query option is set, we fire that rule. Within that rule, we check for query option DEFAULT_NDV_SCALE and form NDV() function in two ways: NDV() if the value of DEFAULT_NDV_SCALE is 2, or NDV(, ) otherwise. -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 9 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 13 Apr 2021 17:40:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test overlap min max filters seems flaky
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17289 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/17289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f Gerrit-Change-Number: 17289 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 17:28:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 20: Code-Review+1 (4 comments) Thanks Qifan for addressing my previous comments! I do not have any addition suggestion. http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG@23 PS15, Line 23: receive > Please see my comment in scan-node.cc. Done http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG@27 PS15, Line 27: lling > Done Done http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/exec/scan-node.cc@239 PS15, Line 239: end > Good question. Thanks Qifan for the detailed explanation! I do not have any more comment. http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/runtime/coordinator.cc@652 PS15, Line 652: // Also add the min/max value for the accumulated filter as follows. : // 'PartialUpdates' - The min and the max are partial > Reword the comment as follows and try to avoid describe how the accumulated Thanks Qifan for addressing my comment. I do not have any addition suggestion. -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 17:17:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10652: Optimize the checking of the size of incremental stats
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17299 ) Change subject: IMPALA-10652: Optimize the checking of the size of incremental stats .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/17299/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/17299/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@448 PS2, Line 448: incPartitionSize May rename the variable to numOfAllIncStatsPartitions http://gerrit.cloudera.org:8080/#/c/17299/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@452 PS2, Line 452: if (partitionSet_ == null) { : incPartitionSize = allPartitions.size(); We may not need to verify the size limit when the partition set for incremental stats update is empty. The reason is that any incremental stats update done before has already passed the size test. -- To view, visit http://gerrit.cloudera.org:8080/17299 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4f35ea936445015a3b8b8102b1891db29751b5ee Gerrit-Change-Number: 17299 Gerrit-PatchSet: 2 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Apr 2021 16:49:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 20: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 16:01:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test overlap min max filters seems flaky
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17289 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8564/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f Gerrit-Change-Number: 17289 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 15:57:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10532: TestOverlapMinMaxFilters.test overlap min max filters seems flaky
Qifan Chen has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17289 ) Change subject: IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky .. IMPALA-10532: TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky This change disables the overlap min/max filter test for hdfs in erasure coding, due to the query plan change (from 3-node scan to 2-node scan) which splits the row groups among scan nodes differently. The SkipIfEC class in test harness skip.py is enhanced with a new skip reason 'different_scan_split' to facilitate this action. Testing: 1. Ran unit tests; 2. Ran core tests. Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f --- M tests/common/skip.py M tests/query_test/test_runtime_filters.py 2 files changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/17289/4 -- To view, visit http://gerrit.cloudera.org:8080/17289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I527de530f7db1ce959e7ef2ae3ced18677221c9f Gerrit-Change-Number: 17289 Gerrit-PatchSet: 4 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10631: Upgrade DataSketches to version 3.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17294 ) Change subject: IMPALA-10631: Upgrade DataSketches to version 3.0.0 .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7064/ -- To view, visit http://gerrit.cloudera.org:8080/17294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37622a7643d015b80f55b802421eae826aa7a4f9 Gerrit-Change-Number: 17294 Gerrit-PatchSet: 4 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 13:42:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17312 ) Change subject: IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8563/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Gerrit-Change-Number: 17312 Gerrit-PatchSet: 2 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 13:41:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17312 ) Change subject: IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8562/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Gerrit-Change-Number: 17312 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 13:34:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 20: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8561/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 13:25:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8560/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 13 Apr 2021 13:23:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Hello Aman Sinha, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17312 to look at the new patch set (#2). Change subject: IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends .. IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends This patch adds a new interface that returns an initialized TQueryCtx for use in requests submitted by external frontends. This is necessary to ensure that the query context in an externally generated TExecRequest has appropriate coordinator metadata. Testing: External frontend regression tests Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Reviewed-by: Aman Sinha --- M be/src/rpc/hs2-http-test.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.h M common/thrift/ImpalaService.thrift M tests/hs2/test_hs2.py M tests/run-tests.py 6 files changed, 35 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17312/2 -- To view, visit http://gerrit.cloudera.org:8080/17312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Gerrit-Change-Number: 17312 Gerrit-PatchSet: 2 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17312 ) Change subject: IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/17312/1/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/17312/1/be/src/service/impala-hs2-server.cc@575 PS1, Line 575: const ThriftServer::ConnectionContext* connection_context = ThriftServer::GetThreadConnectionContext(); line too long (105 > 90) http://gerrit.cloudera.org:8080/#/c/17312/1/be/src/service/impala-hs2-server.cc@577 PS1, Line 577: HS2_RETURN_ERROR(return_val, "Unsupported operation", SQLSTATE_OPTIONAL_FEATURE_NOT_IMPLEMENTED); line too long (101 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Gerrit-Change-Number: 17312 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 13:18:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends
Hello Aman Sinha, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/17312 to review the following change. Change subject: IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends .. IMPALA-10655: Add ImpalaServer interface to Initialize TQueryCtx for external frontends This patch adds a new interface that returns an initialized TQueryCtx for use in requests submitted by external frontends. This is necessary to ensure that the query context in an externally generated TExecRequest has appropriate coordinator metadata. Testing: External frontend regression tests Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Reviewed-by: Aman Sinha --- M be/src/rpc/hs2-http-test.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.h M common/thrift/ImpalaService.thrift M tests/hs2/test_hs2.py M tests/run-tests.py 6 files changed, 33 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/17312/1 -- To view, visit http://gerrit.cloudera.org:8080/17312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I59cebcf087e703a4ab49fb44f6f5ba1044f26546 Gerrit-Change-Number: 17312 Gerrit-PatchSet: 1 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Qifan Chen has uploaded a new patch set (#20). ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. IMPALA-10647 Improve always-true min/max filter handling in coordinator The change improves how a coordinator behaves when a just arriving min/max filter is always true. A new member 'always_true_filter_received_' is introduced to record such a fact. Similarily, the new member always_false_flipped_to_false_ is added to indicate that the always false flag is flipped from 'true' to 'false'. These two members only influence how the min and max columns in "Filter routing table" and "Final filter table" in profile are displayed as follows. 1. 'PartialUpdates' - The min and the max are partially updated; 2. 'AlwaysTrue' - One received filter is AlwaysTrue; 3. 'AlwaysFalse'- No filter is received or all received filters are empty; 4. 'Real values'- The final accumulated min/max from all received filters. A second change introduced is to record, in scan node, the arrival time of min/max filters (as a timestamp since the system is rebooted, obtained by calling MonotonicMillis()). A timestamp of similar nature is recorded for hdfs parquet scanners when a row group is processed. By comparing these two timestamps, one can easily diagnose issues related to late arrival of min/max filters. Testing: 1. Added three new tests in overlap_min_max_filters.test to verify that the min/max are displayed correctly when the min/max filter in hash join builder is set to always true, always false, or a pair of meaningful min and max values. 2. Ran unit tests; 3. Ran core tests successfully. Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test 8 files changed, 221 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/20 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 20 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator
Qifan Chen has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/17252 ) Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator .. IMPALA-10647 Improve always-true min/max filter handling in coordinator The change improves how a coordinator behaves when a just arriving min/max filter is always true. A new member 'always_true_filter_received_' is introduced to record such a fact. Similarily, the new member always_false_flipped_to_false_ is added to indicate that the always false flag is flipped from 'true' to 'false'. These two members only influence how the min and max columns in "Filter routing table" and "Final filter table" in profile are displayed as follows. 1. 'PartialUpdates' - The min and the max are partially updated; 2. 'AlwaysTrue' - One received filter is AlwaysTrue; 3. 'AlwaysFalse'- No filter is received or all received filters are empty; 4. 'Real values'- The final accumulated min/max from all received filters. A second change introduced is to record, in scan node, the arrival time of min/max filters (as a timestamp since the system is rebooted, obtained by calling MonotonicMillis()). A timestamp of similar nature is recorded for hdfs parquet scanners when a row group is processed. By comparing these two timestamps, one can easily diagnose issues related to late arrival of min/max filters. Testing: 1. Added three new tests in overlap_min_max_filters.test to verify that the min/max are displayed corrected when the min/max filter in hash join builder is set to always true, always false, or a pair of meaningful min and max values. 2. Ran unit tests; 3. Ran core tests. Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test 8 files changed, 221 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/19 -- To view, visit http://gerrit.cloudera.org:8080/17252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964 Gerrit-Change-Number: 17252 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10584: Defer advancing read page when reservation is tight.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17195 ) Change subject: IMPALA-10584: Defer advancing read page when reservation is tight. .. Patch Set 3: (2 comments) Thanks for the investigation! Still digesting the context. Left some questions first. http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream-test.cc File be/src/runtime/buffered-tuple-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream-test.cc@1237 PS3, Line 1237: TEST_F(SimpleTupleStreamTest, UnpinReadPage) { It is possible to reproduce the bug with some changes in this test? http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: http://gerrit.cloudera.org:8080/#/c/17195/3/be/src/runtime/buffered-tuple-stream.cc@723 PS3, Line 723: default_page_len_ Does this assume the size of the read page is default_page_len_? Should we handle large read page (size=max_page_len_) here and use read_it_.read_page_->len() instead? -- To view, visit http://gerrit.cloudera.org:8080/17195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I16137b6e423f190f60c3115a06ccd0f77e9f585a Gerrit-Change-Number: 17195 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 13 Apr 2021 12:57:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17300 ) Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 12:12:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17300 ) Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7065/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 3 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 12:12:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17300 ) Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. Patch Set 2: Code-Review+2 Thanks for the changes! -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 12:11:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8559/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 9 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 13 Apr 2021 10:22:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's scale with query option
fifteencai has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's scale with query option .. IMPALA-10445: Adjust NDV's scale with query option This is a new way to control NDV's scale. Since IMPALA-2658, we can trade memory for more accurate estimation by setting larger `scale`. That scale is decided by SQL writers. However, it is a bumpy road for cluster admins to allow for larger scales. Here lies 2 reasons: - Firstly, SQL writers are reluctant to low the scale. They prone to fill up the scale, which will make the cluster unstable, especially when there are `group by`s with high cardinalities. So it is wiser to let cluster admin instead of sql writer choose appropriate scale. - Secondly, In some application scenarios, queries are stored in DBs. In a BI system, for example, rewriting thousands of SQLs is risky. In this commit, we introduced a new Query Option `DEFAULT_NDV_SCALE`. During to the advantage of query option, Cluster admins can either tune 1 desired query, or influence upcoming queries by placing a default query option in a dynamic resource pool. We also refactored method `Analyze` to make sure APPX_COUNT_DISTINCT can work with this query option. After this, cluster admins can degrade service level by transforming `count(distinct id)` to `ndv(id, scale)`. Implementation details: - The default value of DEFAULT_NDV_SCALE is 2, so we won't change the default ndv behavior. - We port `CountDistinctToNdv` transform logic from `SelectStmt.analyze()` to `ExprRewriter`, making it compatible with further rewrite rules. - The newly added rewrite rule `DefaultNdvScaleRule` is applied after `CountDistinctToNdvRule`. Usage: To set a default ndv scale: ``` SET DEFAULT_NDV_SCALE = 10; -- ranges from 1 to 10, both inclusive. ``` To unset: ``` SET DEFAULT_NDV_SCALE = 2; ``` Here are test results of a typical workload (cardinality=40,090,650): ++ | Metric| Count Distinct |NDV2|NDV5|NDV10 | ++ | Memory(GB) | 3.83 |1.84|1.85| 1.89 | | Duration(s) | 182.89| 30.22|29.72 | 29.24 | | ErrorRate |0% |1.8%|1.17% | 0.06% | ++ Testing: 1) Added 3 unit test cases in `ExprRewriteRulesTest`. 2) Added 5 unit test cases in `ExprRewriterTest`. 3) Ran all front-end unit test, passed. 4) Added a new query-option test. Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java A fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java A fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java 11 files changed, 251 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17306/9 -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 9 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-10631: Upgrade DataSketches to version 3.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17294 ) Change subject: IMPALA-10631: Upgrade DataSketches to version 3.0.0 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7064/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37622a7643d015b80f55b802421eae826aa7a4f9 Gerrit-Change-Number: 17294 Gerrit-PatchSet: 4 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 07:59:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10631: Upgrade DataSketches to version 3.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17294 ) Change subject: IMPALA-10631: Upgrade DataSketches to version 3.0.0 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37622a7643d015b80f55b802421eae826aa7a4f9 Gerrit-Change-Number: 17294 Gerrit-PatchSet: 4 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 07:59:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10445: Adjust NDV's precision/scale by query option
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17306 ) Change subject: IMPALA-10445: Adjust NDV's precision/scale by query option .. Patch Set 8: Code-Review+1 (8 comments) Thanks for contributing this feature! The solution LGTM. I only have comments on code style. +Aman and Qifan if they want to have a look. http://gerrit.cloudera.org:8080/#/c/17306/8/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/17306/8/be/src/service/query-options.cc@1074 PS8, Line 1074: value nit: interger value http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java File fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java: http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@18 PS8, Line 18: package org.apache.impala.rewrite; nit: need a blank line after this http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@22 PS8, Line 22: import org.apache.impala.analysis.Analyzer; nit: need a blank line after this http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@25 PS8, Line 25: * : * */ nit: correct this to */ http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@27 PS8, Line 27: { nit: need a whitespace here http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@41 PS8, Line 41: if (!analyzer.getQueryCtx().client_request.query_options.appx_count_distinct) : return expr; nit: multiline if-statement needs brackets http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@790 PS8, Line 790: session.options().setDefault_ndv_scale(10); For test coverage, could you add a test without this option? i.e. add the following test before this line RewritesOk("ndv(bool_col)", rules, "ndv(bool_col)"); http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java: http://gerrit.cloudera.org:8080/#/c/17306/8/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@539 PS8, Line 539: nit: remove the blank line at the end -- To view, visit http://gerrit.cloudera.org:8080/17306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50 Gerrit-Change-Number: 17306 Gerrit-PatchSet: 8 Gerrit-Owner: fifteencai Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 13 Apr 2021 07:59:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10631: Upgrade DataSketches to version 3.0.0
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17294 ) Change subject: IMPALA-10631: Upgrade DataSketches to version 3.0.0 .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37622a7643d015b80f55b802421eae826aa7a4f9 Gerrit-Change-Number: 17294 Gerrit-PatchSet: 3 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 13 Apr 2021 07:57:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/17300 ) Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. Patch Set 2: (3 comments) Hi Zoltan, thank you for the review. the change has been updated. http://gerrit.cloudera.org:8080/#/c/17300/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17300/1//COMMIT_MSG@13 PS1, Line 13: earlier > earlier Done http://gerrit.cloudera.org:8080/#/c/17300/1//COMMIT_MSG@15 PS1, Line 15: Kudu > Actually Iceberg tables are related because they might use Parquet or ORC u Done http://gerrit.cloudera.org:8080/#/c/17300/1//COMMIT_MSG@16 PS1, Line 16: related > related Done -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Apr 2021 07:50:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
Hello Zoltan Borok-Nagy, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17300 to look at the new patch set (#2). Change subject: IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats .. IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats This change updates the exception that is thrown when the user tries to insert into a partition which has unsupported format. The information to make this decision is available during analysis, therefore this commit also moves the check from the planner to the analyzer to have an earlier result. In the analyzer only the FeFsTables have to be checked therefore Kudu tables are not related. Also, there is a difference between static and dynamic partition clauses, for static partition clauses the partition format is available during compile, for dynaminc partition clauses it is only avaialble during runtime. Testing: - Added unit tests - Ran exhaustive tests successfully Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/datasets/functional/functional_schema_template.sql 4 files changed, 78 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/17300/2 -- To view, visit http://gerrit.cloudera.org:8080/17300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec Gerrit-Change-Number: 17300 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy