[Impala-ASF-CR] IMPALA-9962: Implement ds_kll_quantiles() function
Hello Adam Tamas, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16324 to look at the new patch set (#4). Change subject: IMPALA-9962: Implement ds_kll_quantiles() function .. IMPALA-9962: Implement ds_kll_quantiles() function This function is very similar to ds_kll_quantile() but this one can receive any number of rank parameters and returns a comma separated string that holds the results for all of the given ranks. For more details about ds_kll_quantile() see IMPALA-9959. Note, this function is meant to return an Array of floats as the result but with that we have to wait for the complex type support. Tracking Jira is IMPALA-9520. Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/datasketches-common.cc M be/src/exprs/datasketches-common.h M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test 7 files changed, 155 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/16324/4 -- To view, visit http://gerrit.cloudera.org:8080/16324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f Gerrit-Change-Number: 16324 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
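As a rough illustration of the behaviour described in the patch set above (any number of ranks in, one comma-separated string out), the sketch below uses an exact nearest-rank lookup over a sorted list as a stand-in for the KLL sketch. The function name `quantiles_as_string` and the nearest-rank rule are assumptions made for this example and are not taken from the Impala change.

```python
import math


def quantiles_as_string(values, *ranks):
    """Return the values at the given normalized ranks as one comma-separated string.

    Illustration only: this is an exact lookup on sorted data, not a KLL sketch.
    """
    if not values:
        return ""
    ordered = sorted(values)
    results = []
    for rank in ranks:
        if not 0.0 <= rank <= 1.0:
            raise ValueError("rank must be in [0, 1]")
        # Nearest-rank index, clamped so that rank == 1.0 maps to the last element.
        index = min(int(math.floor(rank * len(ordered))), len(ordered) - 1)
        results.append(str(float(ordered[index])))
    return ",".join(results)
```

Returning a string mirrors the interim design mentioned in the message; an array of floats would be the natural return type once complex types are supported.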
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 42: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6283/ -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 42 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 04:35:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 20: Thanks for the update - I'll do another pass over the code tomorrow. I think we can probably do more optimisations for partition pruning later, but this will be simpler for now. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 20 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Thu, 13 Aug 2020 03:25:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 42: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 42 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 03:23:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 42: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6283/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 42 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 03:23:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 41: Code-Review+2 (2 comments) Thanks for all the code cleanup. I think it's not ideal that the files are slightly different from Kudu but I understand the difference now and I think we can move forward.
http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/kudu/util/block_bloom_filter.cc File be/src/kudu/util/block_bloom_filter.cc: PS40:
> Hi, Tim. This file is same with kudu's, I already compare them.
I compared with commit 1c5a04f7774f348b3c9fc4fd39de98c858c87c5f of kudu and it seems to be different?
tarmstrong@tarmstrong-box2:~/Impala/impala$ diff -r {be/src/kudu/util/,~/repos/kudu/src/kudu/util/}block_bloom_filter.cc
21,24c21,24
< #include "sse2neon.h"
< #else
< #include
< #include
---
> #include "kudu/util/sse2neon.h"
> #else //__aarch64__
> #include
> #include
189a190,195
> #ifdef __aarch64__
> // IWYU pragma: no_include
> uint8x16_t new_bucket_neon = vreinterpretq_u8_u32(vld1q_u32(new_bucket +
> 4 * i));
> uint8x16_t* existing_bucket =
> reinterpret_cast(&directory_[bucket_idx][4 * i]);
> *existing_bucket = vorrq_u8(*existing_bucket, new_bucket_neon);
> #else
193a200
> #endif
http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/kudu/util/block_bloom_filter.cc File be/src/kudu/util/block_bloom_filter.cc: http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/kudu/util/block_bloom_filter.cc@189 PS41, Line 189: for (int i = 0; i < 2; ++i) {
Kudu has some neon-specific code here. I guess the newer version of sse2neon has all the functions that are needed? If so, this is OK, but can you file a Kudu JIRA to clean this up?
-- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 41 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 03:23:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10077: Increase timeout for test_concurrent_invalidate_metadata
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16333 ) Change subject: IMPALA-10077: Increase timeout for test_concurrent_invalidate_metadata .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6903/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16333 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I47e9f4793117b9a726fde165adea68ce31f539a8 Gerrit-Change-Number: 16333 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Aug 2020 03:14:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 40: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 40 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 03:10:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans This work addresses the current limitation in computing the total row count for a Hive table in a scan. The row count can be incorrectly computed as 0, even though there exists data in the Hive table. This is the stats corruption at table level. Similar stats corruption exists for a partition. The row count of a table or a partition can sometimes also be -1, which indicates a missing stats situation. In the fix, as long as no partition in a Hive table exhibits any missing or corrupt stats, the total row count for the table is computed from the row counts in all partitions. Otherwise, Impala looks at the table level stats, particularly the table row count. In addition, if the table stats are missing or corrupted, Impala estimates a row count for the table, if feasible. This row count is the sum of the row count from the partitions with good stats, and an estimation of the number of rows in the partitions with missing or corrupt stats. Such estimation also applies when some partition has corrupt stats. One way to observe the fix is through the explain of queries scanning Hive tables with missing or corrupted stats. The cardinality for any full scan should be a positive value (i.e. the estimated row count), instead of 'unavailable'. At the beginning of the explain output, that table is still listed in the WARNING section for potentially corrupt table statistics. Testing: 1. Ran unit tests with queries documented in the case against Hive tables with the following configurations: a. No stats corruption in any partitions b. Stats corruption in some partitions c. Stats corruption in all partitions 2. Added two new tests in test_compute_stats.py: a. test_corrupted_stats_in_partitioned_Hive_tables b. test_corrupted_stats_in_unpartitioned_Hive_tables 3. Fixed failures in corrupt-stats.test 4. Ran "core" test Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Reviewed-on: http://gerrit.cloudera.org:8080/16098 Reviewed-by: Sahil Takiar Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test M tests/metadata/test_compute_stats.py M tests/metadata/test_explain.py 16 files changed, 311 insertions(+), 97 deletions(-) Approvals: Sahil Takiar: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 41 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
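The row-count rules in the IMPALA-9744 commit message can be sketched roughly as follows. This is not the code from HdfsScanNode.java: the `Partition` class, the `has_bad_stats` helper, and the `bytes_per_row_guess` fallback used to estimate rows from file size are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    num_rows: int    # -1 means the stats are missing
    size_bytes: int  # on-disk size of the partition's data


def has_bad_stats(num_rows: int, size_bytes: int) -> bool:
    # Missing stats: negative row count. Corrupt stats: 0 rows although data exists.
    return num_rows < 0 or (num_rows == 0 and size_bytes > 0)


def table_cardinality(partitions: List[Partition], table_num_rows: int,
                      table_size_bytes: int, bytes_per_row_guess: int = 100) -> int:
    good = [p for p in partitions if not has_bad_stats(p.num_rows, p.size_bytes)]
    bad = [p for p in partitions if has_bad_stats(p.num_rows, p.size_bytes)]

    # 1. Every partition has usable stats: sum the partition row counts.
    if partitions and not bad:
        return sum(p.num_rows for p in good)

    # 2. Otherwise fall back to the table-level row count, if it is usable.
    if not has_bad_stats(table_num_rows, table_size_bytes):
        return table_num_rows

    # 3. Table stats are unusable too: estimate (hypothetical size-based guess).
    if not partitions:
        return max(1, table_size_bytes // bytes_per_row_guess)
    estimated = sum(max(1, p.size_bytes // bytes_per_row_guess) for p in bad)
    return sum(p.num_rows for p in good) + estimated
```

In this sketch the estimate is always positive, which matches the message's remark that a full scan should show an estimated cardinality rather than 'unavailable'.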
[Impala-ASF-CR] WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16318 ) Change subject: WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6902/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89 Gerrit-Change-Number: 16318 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 13 Aug 2020 03:05:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10077: Increase timeout for test_concurrent_invalidate_metadata
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16333 ) Change subject: IMPALA-10077: Increase timeout for test_concurrent_invalidate_metadata .. IMPALA-10077: Increase timeout for test_concurrent_invalidate_metadata test_concurrent_invalidate_metadata runs 20 iterations for concurrent invalidate metadata commands. Each iteration could take more than 6s. So it's easy to hit the current timeout limit, 120s. The main purpose of this test is to detect metadata bugs that could cause invalidate metadata to hang. It's not for performance. So this patch increases the timeout limit to 300s to fix the flakiness. Change-Id: I47e9f4793117b9a726fde165adea68ce31f539a8 --- M tests/custom_cluster/test_concurrent_ddls.py 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/16333/1 -- To view, visit http://gerrit.cloudera.org:8080/16333 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I47e9f4793117b9a726fde165adea68ce31f539a8 Gerrit-Change-Number: 16333 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang
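The arithmetic behind the timeout change can be checked with a few lines. The numbers (20 iterations, 6s per iteration, 120s and 300s limits) come from the commit message; the helper names are made up for this sketch.

```python
ITERATIONS = 20
OLD_TIMEOUT_S = 120
NEW_TIMEOUT_S = 300


def total_runtime_s(seconds_per_iteration, iterations=ITERATIONS):
    """Total wall time if every iteration takes the same number of seconds."""
    return seconds_per_iteration * iterations


def max_seconds_per_iteration(timeout_s, iterations=ITERATIONS):
    """Largest average iteration time that still fits within the timeout."""
    return timeout_s / iterations
```

With 6s per iteration the test already reaches the old 120s limit, so any iteration slower than that causes a failure; the 300s limit leaves room for an average of 15s per iteration.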
[Impala-ASF-CR] WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
Yida Wu has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16318 ) Change subject: WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 .. WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 Major Features 1) Local files as buffers for spilling to S3. 2) Async Upload and Sync Fetching of remote files. 3) Sync remote files deletion after query ends. 4) Local buffer files management. 5) Compatibility of spilling to local and remote. 6) All the errors from hdfs/s3 should terminate the query. Implementation Details: 1) A new enum type is added to specify the function of local files: LocalFileMode::BUFFER and LocalFileMode::FILE. LocalFileMode::BUFFER indicates that the local file is used as a buffer for remote operations. LocalFileMode::FILE indicates the local file is used for spilling to local. Also, startup option remote_tmp_file_local_buff_mode is added to specify how pages are read from the remote. If set to true, the entire file would be fetched to the local buffer during reading (pinning) if it was evicted. If set to false, only a page is read for each read. 2) Two disk queues have been added to do the file operation jobs. Queue name: RemoteS3DiskFileOper/RemoteDfsDiskFileOper File operations on the remote disk like upload and fetch should be done in these queues. The purpose of the queues is to separate long-running operations from short ones, and also to control more accurately the number of threads working on these file operation jobs; sometimes we might not want too many upload and fetch jobs running at the same time. RemoteOperRange is the new type to carry the file operation jobs. Previously, we had request types READ and WRITE. Now FETCH/UPLOAD/EVICT have been added. 3) The tmp files are deleted when the tmp file group is destructed. 4) The local buffer files management is to control the total size of local buffer files and evict files if needed. 
There are basically six statuses of a remote tmp file, IN_WRITING/DUMPED/IN_DUMPING/UPLOADED/DUMPED_UPLOADED/DELETED. A local buffer file can be evicted if it is in status REMOTE or it has been all pinned. An EVICT job is sent to the local disk queue if a file is chosen to be evicted. There are two modes to decide the sequence of choosing files to be evicted. Default is LIFO, the other is FIFO. It can be decided by startup option remote_tmp_files_avail_pool_lifo. 5) Spilling to local has higher priority than spilling to remote. If no local scratch space is available, temporary data will be spilled to remote. Remote scratch space uses the highest priority local scratch dir as its buffer. If no local scratch space or only one has been configured, a default local buffer should be used. The purpose of the design is to simplify the implementation in milestone 1 with fewer changes to the configuration. Limitations: * Only one remote scratch dir is supported. * The highest priority local scratch dir is used for the buffer of remote scratch space if remote scratch dir exists. Testcases: * Ran Unit Tests: $IMPALA_HOME/be/build/debug/runtime/buffered-tuple-stream-test $IMPALA_HOME/be/build/debug/runtime/tmp-file-mgr-test $IMPALA_HOME/be/build/debug/runtime/bufferpool/buffer-pool-test $IMPALA_HOME/be/build/debug/runtime/io/disk-io-mgr-test * Some new testcases have been added to tmp-file-mgr-test. TODO: - New Testcases for Spilling to S3. - Upper and lower bounds of new options related to size. - Preserve memory buffer for block buffers on file upload and fetch. - Add some new metrics, like the rate of accessing local buffer. 
Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89 --- M be/src/runtime/io/CMakeLists.txt M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h A be/src/runtime/io/file-writer.h M be/src/runtime/io/hdfs-file-reader.cc A be/src/runtime/io/hdfs-file-writer.cc A be/src/runtime/io/hdfs-file-writer.h M be/src/runtime/io/local-file-system.cc M be/src/runtime/io/local-file-system.h A be/src/runtime/io/local-file-writer.cc A be/src/runtime/io/local-file-writer.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr-internal.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/util/hdfs-util.cc M be/src/util/hdfs-util.h M common/thrift/metrics.json M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java 24 files changed, 2,680 insertions(+), 215 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/16318/2 -- To view, visit http://gerrit.cloudera.org:8080/16318 To unsubscribe, visit http://gerrit.cloudera.org:8080/se
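The two eviction orders mentioned in point 4 of the patch description (LIFO by default, FIFO otherwise, selected by remote_tmp_files_avail_pool_lifo) can be illustrated with a small sketch. The class `AvailablePool` and its methods are hypothetical and do not correspond to names in tmp-file-mgr.cc; the sketch only shows how the flag changes which buffered file is picked next.

```python
from collections import deque


class AvailablePool:
    """Hypothetical pool of local buffer files that are eligible for eviction."""

    def __init__(self, lifo=True):
        self._files = deque()
        self._lifo = lifo  # stands in for the remote_tmp_files_avail_pool_lifo option

    def add(self, file_name):
        """Register a file as evictable, in arrival order."""
        self._files.append(file_name)

    def pick_for_eviction(self):
        """Return the next file to evict, or None if the pool is empty."""
        if not self._files:
            return None
        # LIFO evicts the most recently added file; FIFO evicts the oldest one.
        return self._files.pop() if self._lifo else self._files.popleft()
```

For example, after adding `a`, `b`, `c` in that order, a LIFO pool evicts `c` first while a FIFO pool evicts `a` first.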
[Impala-ASF-CR] WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
Yida Wu has abandoned this change. ( http://gerrit.cloudera.org:8080/16264 ) Change subject: WIP: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/16264 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ia5aa4036b4c72656b4297f9fbe42e21d2796a495 Gerrit-Change-Number: 16264 Gerrit-PatchSet: 1 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9676 Add aarch64 compile options for clang
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15755 ) Change subject: IMPALA-9676 Add aarch64 compile options for clang .. IMPALA-9676 Add aarch64 compile options for clang Add signed-char and armv8a and crc compile options to clang Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 Reviewed-on: http://gerrit.cloudera.org:8080/15755 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M be/CMakeLists.txt 1 file changed, 5 insertions(+), 1 deletion(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15755 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 Gerrit-Change-Number: 15755 Gerrit-PatchSet: 19 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9904 Fix bad cipher test failed case on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16172 ) Change subject: IMPALA-9904 Fix bad cipher test failed case on aarch64 .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 Gerrit-Change-Number: 16172 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 02:22:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9926 base64decode % will not return error when in newer OS
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16175 ) Change subject: IMPALA-9926 base64decode % will not return error when in newer OS .. IMPALA-9926 base64decode % will not return error when in newer OS For example, base64decode('YWxwaGE%') returns 'alpha\377' on a newer OS that has a newer sasl library. I tested it on Ubuntu 18.04 aarch64. Change-Id: Ib9bd9e03d5f744c18c957cdaf2064fa918086004 Reviewed-on: http://gerrit.cloudera.org:8080/16175 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M be/src/exprs/expr-test.cc M testdata/workloads/functional-query/queries/QueryTest/exprs.test 2 files changed, 7 insertions(+), 5 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib9bd9e03d5f744c18c957cdaf2064fa918086004 Gerrit-Change-Number: 16175 Gerrit-PatchSet: 9 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9995 Fix test_alloc_fail failed case on aarch64
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. IMPALA-9995 Fix test_alloc_fail failed case on aarch64 Length of Json object '{"a": 1}", '$.a' is 32 bytes on x86, but is 48 bytes on aarch64 Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Reviewed-on: http://gerrit.cloudera.org:8080/16307 Tested-by: Tim Armstrong Reviewed-by: Tim Armstrong --- M testdata/workloads/functional-query/queries/QueryTest/alloc-fail-init.test 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 5 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9904 Fix bad cipher test failed case on aarch64
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16172 ) Change subject: IMPALA-9904 Fix bad cipher test failed case on aarch64 .. IMPALA-9904 Fix bad cipher test failed case on aarch64 On aarch64 with Ubuntu 18.04, the openssl version is 1.1.1, and a server that uses openssl can start successfully even if the ciphers are bad. So this change skips the bad cipher test cases on aarch64. On x86, the server cannot start successfully because of the lower openssl version, not because of the bad cipher. Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 Reviewed-on: http://gerrit.cloudera.org:8080/16172 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/thrift-server-test.cc M be/src/util/webserver-test.cc 3 files changed, 10 insertions(+), 5 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 Gerrit-Change-Number: 16172 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16174 ) Change subject: IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16174 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Gerrit-Change-Number: 16174 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Aug 2020 02:22:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16174 ) Change subject: IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 .. IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 cast(pow(2, 31) as int) returns 2147483647 on aarch64 but returns 2147483648 on x86. I think aarch64 is correct, so here I will not convert it and just use aarch64's value. Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Reviewed-on: http://gerrit.cloudera.org:8080/16174 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M be/src/exprs/expr-test.cc 1 file changed, 5 insertions(+), 0 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16174 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Gerrit-Change-Number: 16174 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
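One way to read the aarch64 result is as a saturating float-to-int conversion, where an out-of-range value is clamped to the nearest representable 32-bit integer. The sketch below models that behaviour in Python; `saturating_cast_to_int32` is a hypothetical helper for illustration and is not how Impala or the hardware implements the cast (NaN and infinity are not handled here).

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def saturating_cast_to_int32(value):
    """Truncate toward zero, then clamp into the signed 32-bit range."""
    truncated = int(value)  # int() on a finite float drops the fractional part
    return max(INT32_MIN, min(INT32_MAX, truncated))
```

Since 2^31 is one past the largest signed 32-bit value, clamping yields 2147483647, the value the commit message reports for aarch64.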
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. IMPALA-9906 Fix thread-pool-test failed case on aarch64 Thread switching on aarch64 is not as fast as on x86, so this change increases the sleep task time from 100ms to 500ms. Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Reviewed-on: http://gerrit.cloudera.org:8080/16173 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M be/src/util/thread-pool-test.cc 1 file changed, 9 insertions(+), 3 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9676 Add aarch64 compile options for clang
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15755 ) Change subject: IMPALA-9676 Add aarch64 compile options for clang .. Patch Set 18: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15755 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 Gerrit-Change-Number: 15755 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 02:22:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 02:22:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9995 Fix test_alloc_fail failed case on aarch64
Tim Armstrong has removed a vote on this change. Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9926 base64decode % will not return error when in newer OS
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16175 ) Change subject: IMPALA-9926 base64decode % will not return error when in newer OS .. Patch Set 8: Verified+1 Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9bd9e03d5f744c18c957cdaf2064fa918086004 Gerrit-Change-Number: 16175 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 02:22:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9995 Fix test_alloc_fail failed case on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 02:21:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9995 Fix test_alloc_fail failed case on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. Patch Set 4: Verified+1 This hit https://issues.apache.org/jira/browse/IMPALA-10054?jql=text%20~%20%22test_multiple_sort_run_bytes_limits%22. The fix for that is merged. I'm going to override in this case. -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 02:21:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Support querying Iceberg table by impala .. Patch Set 20: (9 comments) > (10 comments) > > One thing I didn't understand about the patch is the support for > partition pruning. I don't see enough code here to handle all the > edge cases - partition evolution, different partition transforms, > etc. > > I'm OK with deferring optimizations to a later patch, but want to > make sure that this patch is correct and that we document what the > limitations are. Hi Tim, thanks for your patient review. This patch prunes partitions mainly by pushing predicates down to Iceberg to filter data files. I've already added more comments about predicate pushdown. All test cases passed: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11658/ http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@611 PS18, Line 611: // Iceberg table only has one partition spec now > What does this mean? We don't support partition evolution yet? When we create an Iceberg table with several partition columns, such as 'id identity' and 'register_time day', these partition columns are treated as partition fields, all in one partition spec, which is different from an HDFS table. So I added a comment here: Iceberg table only has one partition spec. If we alter an Iceberg table's partitions, Iceberg will generate a snapshot with a new partition spec. We can select a different snapshot to query through the Iceberg API, which is not supported in this patch. 
We only get the latest partition spec for the latest data now, so in Impala we treat an Iceberg table as having only one partition spec (the latest partition spec). http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@615 PS18, Line 615: fieldName > nit: fieldName Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1889 PS18, Line 1889: protec > Can you make this protected instead of public, since it's only being used w Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@372 PS18, Line 372: protec > I think this can be protected Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@61 PS18, Line 61: > nit: icebergConjuncts_ Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@94 PS18, Line 94: > nit: Iceberg Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@123 PS18, Line 123: ListIterator it = conjuncts_.listIterator(); > I think this predicate pushdown needs a bit more explanation. Yes, you are right, Tim. In case 1 and case 2, we need to evaluate predicates in the scan as well. 
After discussing this with my colleague who works on Iceberg, we plan to push down all predicates to Iceberg and evaluate all predicates in the scan as well. For more details, see the comment on IcebergScanNode.extractIcebergConjuncts http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/data/README File testdata/data/README: http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/data/README@508 PS18, Line 508: hll_sketches_from_impala.parquet: > nit: remove this merge marker Done http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test: http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test@3 PS18, Line 3: Iceberg > nit: Iceberg Done -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 20 Gerrit-Owner: wangsheng Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public
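The reply above describes pushing predicates down to Iceberg to filter data files and then evaluating the same predicates again in the scan. A minimal Python sketch of why both steps are needed is given here. It is not the patch's Java code: `DataFile`, `prune_files` and `scan` are hypothetical names, and the per-file min/max statistics are an assumption about how file-level filtering could work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataFile:
    """Hypothetical data file with min/max stats for an integer `id` column."""
    path: str
    min_id: int
    max_id: int
    rows: List[int]


def prune_files(files: List[DataFile], lower_bound: int) -> List[DataFile]:
    """Pushdown step: keep only files that may contain a row with id >= lower_bound."""
    return [f for f in files if f.max_id >= lower_bound]


def scan(files: List[DataFile], lower_bound: int) -> List[int]:
    """Scan step: re-evaluate the predicate on every row of the surviving files."""
    result = []
    for f in prune_files(files, lower_bound):
        result.extend(r for r in f.rows if r >= lower_bound)
    return result
```

File-level pruning is coarse: a file that survives it can still hold rows that fail the predicate, which is why the sketch applies the predicate a second time per row.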
[Impala-ASF-CR] IMPALA-9995 Fix test_alloc_fail failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6281/ -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 01:45:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15798 ) Change subject: IMPALA-9382: part 1: transposed profile prototype .. Patch Set 16: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6901/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 01:06:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype
Hello Kurt Deschler, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15798 to look at the new patch set (#16). Change subject: IMPALA-9382: part 1: transposed profile prototype .. IMPALA-9382: part 1: transposed profile prototype This adds an experimental profile representation that is denser than the traditional representation. Counters, info strings and other information for all instances of a fragment are merged into a single tree. Descriptive stats (min, max, mean) are shown for each counter, along with the values for each instance. It can be enabled by setting --gen_experimental_profile=true. The default behaviour is unchanged, aside from including a few extra counters in existing profiles. An example of the pretty-printed profile is attached to the JIRA. The thrift representation of the profile is extended so that all instances of a fragment can be merged together into a single "aggregated" fragment, with vectors of counters. The in-memory representation is transformed in a similar way. The RuntimeProfile class is restructured so that there is a common RuntimeProfileBase class, with RuntimeProfile and AggregatedRuntimeProfile subclasses. Execution fills in counters in RuntimeProfile for each instance, then these are aggregated together into an AggregatedRuntimeProfile on the coordinator. This replaces the "averaged" profile concept with an abstraction that more clearly distinguishes what operations apply to aggregated and unaggregated profiles. In a future change, we could use AggregatedRuntimeProfile for status reports so that less data needs to be sent to the coordinator, and the coordinator needs to do less processing. The new profile drops the bad practice of including aggregated stats as strings. These stats can now be computed automatically as aggregations of counters. 
The legacy uses of InfoString are preserved so as to not lose information but can be removed when we switch to the transposed profile. Also make TotalTime and InactiveTime behave like other counters - they are pretty-printed the same as other counters. Inactive time is also included in the local time calculation, fixing IMPALA-2794. TODO in later patches for IMPALA-9382: These will need to be fixed before this can be considered production ready. * The JSON profile generation is not fully implemented for aggregated profiles. * Not all counter types are included in the aggregated profile, e.g. time series counters. * The pretty-printing of the various profile counters will need to be improved to be more readable, e.g. grouping by host, improving formatting. * The aggregated profile is only updated at the end of the query. We need to support live updating. * Consider how to show local time per instance - make it a first-class counter in the profile? Possible extensions: * We could better highlight outliers when pretty-printing the profile. Testing: * I diffed the text profile of TPC-DS Q1 to make sure there were no unexpected changes. * Added unit test for stats computation in AveragedCounter. * Passed core tests. 
* exhaustive tests * ASAN tests * Ran some tests locally with TSAN Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a --- M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc M be/src/util/dummy-runtime-profile.h M be/src/util/pretty-printer-test.cc M be/src/util/pretty-printer.h M be/src/util/runtime-profile-counters.h M be/src/util/runtime-profile-test.cc M be/src/util/runtime-profile.cc M be/src/util/runtime-profile.h M common/thrift/RuntimeProfile.thrift 13 files changed, 1,543 insertions(+), 619 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/15798/16 -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Tim Armstrong
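The commit message above describes merging the counters of all fragment instances into one aggregated profile that shows min, max and mean per counter next to the per-instance values. A rough Python sketch of that idea follows. It is not Impala's C++ implementation: `AggregatedCounter` and `aggregate` are hypothetical names, and the sketch assumes every instance reports every counter.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AggregatedCounter:
    """One counter merged across instances; values[i] belongs to instance i."""
    name: str
    values: List[int]

    def stats(self) -> Dict[str, float]:
        """Descriptive stats derived from the per-instance values."""
        return {
            "min": min(self.values),
            "max": max(self.values),
            "mean": sum(self.values) / len(self.values),
        }


def aggregate(instance_profiles: List[Dict[str, int]]) -> Dict[str, AggregatedCounter]:
    """Transpose a list of per-instance counter maps into per-counter vectors."""
    merged: Dict[str, AggregatedCounter] = {}
    for profile in instance_profiles:
        for name, value in profile.items():
            merged.setdefault(name, AggregatedCounter(name, [])).values.append(value)
    return merged
```

Because the stats are computed from the counter vectors on demand, nothing has to be stored as a pre-formatted string, which matches the motivation given in the commit message.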
[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15798 ) Change subject: IMPALA-9382: part 1: transposed profile prototype .. Patch Set 15: (4 comments) http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift File common/thrift/RuntimeProfile.thrift: http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@62 PS15, Line 62: 3: required list has_value : 4: required list values > It would be good to have explicit comments about the cardinality here and h I added comments to the TAgg* structs to make it more explicit. http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@113 PS15, Line 113: 3: required list has_value I'm realising that I'm a little inconsistent about naming these lists, and particularly about whether they're plural or not. I fixed the has_value/have_values inconsistency in favour of the former, because it sounds less awkward to me. I didn't see anything else that would obviously improve clarity, but let me know if you think I could clean any of this up. http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@142 PS15, Line 142: The first map key is the info string : // key. The second map key is a distinct value of that key. > Just for clarity, maybe include an example key and example distinct value. This is a good point. I added a realistic example that motivates it. http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@144 PS15, Line 144: This means that the common case, where all > Missing the rest of this sentence. 
Done -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 00:54:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9955,IMPALA-9957: Fix not enough reservation for large pages in GroupingAggregator
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16240 ) Change subject: IMPALA-9955,IMPALA-9957: Fix not enough reservation for large pages in GroupingAggregator .. Patch Set 8: (5 comments) I think this makes sense. It does add complexity unfortunately but I didn't see a way to make it significantly simpler http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator-partition.cc File be/src/exec/grouping-aggregator-partition.cc: http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator-partition.cc@130 PS8, Line 130: rowIsAdded nit: row_is_added http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator-partition.cc@144 PS8, Line 144: reinterpret_cast(&tuple nit: probably more readable to make this a local variable instead of duplicating the cast http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator.h File be/src/exec/grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator.h@286 PS8, Line 286: std::unique_ptr large_write_page_reservation_; Do we need the unique_ptr indirection? Seems like we could just have the SubReservation and check is_closed() to see if it's initialised or not. http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator.h@546 PS8, Line 546: HashTableCtx* ht_ctx, bool has_more_rows) WARN_UNUSED_RESULT; Can you add comments to explain this argument. http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator.cc File be/src/exec/grouping-aggregator.cc: http://gerrit.cloudera.org:8080/#/c/16240/8/be/src/exec/grouping-aggregator.cc@639 PS8, Line 639: Status status; : if (LIKELY(stream->AddRow(row, &status))) return Status::OK(); : RETURN_IF_ERROR(status); : : // We fail to add a large row due to run out of unused reservation and fail to increase : // the reservation. 
If we don't have the serialize stream, spilling partitions don't : // need extra reservation so we can restore the large write page reservation (if we : // haven't done it) before spilling any partitions. : if (!needs_serialize_ && large_write_page_reservation_->GetReservation() > 0) { : if (LIKELY(AddRowWithExtraReservation(stream, row, &status))) return Status::OK(); : RETURN_IF_ERROR(status); : } Can we factor out this code (i.e. AddRow(), then fall back to AddRowWithExtraReservation) into a function. It is repeated 2x -- To view, visit http://gerrit.cloudera.org:8080/16240 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775 Gerrit-Change-Number: 16240 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 00:34:55 + Gerrit-HasComments: Yes
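The last comment above asks for the repeated "AddRow(), then fall back to AddRowWithExtraReservation" sequence to be factored into one function. A toy Python sketch of that refactoring is shown here. It is not the Impala C++ code: `ToyStream`, `ToyReservation` and `try_add_row` are hypothetical stand-ins, reservations are counted in rows rather than bytes, and error propagation through `Status` is left out.

```python
class ToyStream:
    """Stand-in for a tuple stream with a fixed row budget."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.rows = []

    def add_row(self, row) -> bool:
        """Append the row if the budget allows it; return False otherwise."""
        if len(self.rows) >= self.capacity:
            return False
        self.rows.append(row)
        return True


class ToyReservation:
    """Stand-in for the saved large-write-page reservation."""

    def __init__(self, amount: int):
        self.amount = amount


def add_row_with_extra_reservation(stream, row, reservation) -> bool:
    """Hand the saved reservation to the stream, then retry the append."""
    stream.capacity += reservation.amount
    reservation.amount = 0
    return stream.add_row(row)


def try_add_row(stream, row, reservation, needs_serialize: bool) -> bool:
    """Single helper for both call sites: plain append first, fallback second."""
    if stream.add_row(row):
        return True
    if not needs_serialize and reservation.amount > 0:
        return add_row_with_extra_reservation(stream, row, reservation)
    return False  # caller has to spill a partition and retry
```

With a helper of this shape, each call site only has to decide what to do when it returns False.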
[Impala-ASF-CR] IMPALA-10066: Fix test_cancellation_mid_command fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16322 ) Change subject: IMPALA-10066: Fix test_cancellation_mid_command fails .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6280/ -- To view, visit http://gerrit.cloudera.org:8080/16322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib80706d52a85d2c19b13fbbe5695934658c0bf7e Gerrit-Change-Number: 16322 Gerrit-PatchSet: 5 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 12 Aug 2020 23:05:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15798 ) Change subject: IMPALA-9382: part 1: transposed profile prototype .. Patch Set 15: (3 comments) Looked at the thrift changes, and it makes sense to me. I'm looking at the rest and I'll post comments once I make it through runtime-profile.h http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift File common/thrift/RuntimeProfile.thrift: http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@62 PS15, Line 62: 3: required list has_value : 4: required list values It would be good to have explicit comments about the cardinality here and how the indices line up with the instances that are included. This applies in several places, so the comment may not be right here. If you wanted to back out the value for a specific instance, what would you do? i.e. the i'th index here corresponds to the i'th index of input_profiles, etc. http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@142 PS15, Line 142: The first map key is the info string : // key. The second map key is a distinct value of that key. Just for clarity, maybe include an example key and example distinct value. http://gerrit.cloudera.org:8080/#/c/15798/15/common/thrift/RuntimeProfile.thrift@144 PS15, Line 144: This means that the common case, where all Missing the rest of this sentence. -- To view, visit http://gerrit.cloudera.org:8080/15798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a Gerrit-Change-Number: 15798 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 22:45:21 + Gerrit-HasComments: Yes
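The first comment above asks how to back out the value for one instance from the parallel `has_value` and `values` lists. A small Python sketch of one possible convention follows, assuming (as the comment suggests) that index i in both lists refers to the i'th input profile. The helper names and the 0 placeholder for absent values are assumptions, not something stated in the change.

```python
from typing import List, Optional, Tuple


def to_parallel_lists(per_instance: List[Optional[int]]) -> Tuple[List[bool], List[int]]:
    """Encode optional per-instance values as two lists of equal length."""
    has_value = [v is not None for v in per_instance]
    values = [v if v is not None else 0 for v in per_instance]  # 0 is a placeholder
    return has_value, values


def value_for_instance(has_value: List[bool], values: List[int], i: int) -> Optional[int]:
    """Recover the value for instance i, or None if that instance had no value."""
    return values[i] if has_value[i] else None
```

Keeping both lists the same length as the list of input profiles is what makes the index lookup unambiguous.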
[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16311 ) Change subject: IMPALA-10030: Remove unnecessary jar dependencies .. Patch Set 2: (2 comments) Thanks for the cleanup! Just had a couple of comments. http://gerrit.cloudera.org:8080/#/c/16311/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16311/2//COMMIT_MSG@34 PS2, Line 34: * Ran core tests Can we test in some kind of environment with ranger just to confirm it wasn't depending incidentally on the kafka jars? http://gerrit.cloudera.org:8080/#/c/16311/2/fe/pom.xml File fe/pom.xml: http://gerrit.cloudera.org:8080/#/c/16311/2/fe/pom.xml@960 PS2, Line 960: I think this ozone exclusion deserves a comment (what you said in the commit message was a good explanation). -- To view, visit http://gerrit.cloudera.org:8080/16311 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc Gerrit-Change-Number: 16311 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 22:37:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 39: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6900/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 39 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 22:19:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 40: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6282/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 40 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 22:02:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 40: Code-Review+2 Carrying +2. -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 40 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 22:02:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Qifan Chen has uploaded a new patch set (#39). ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans This work addresses the current limitation in computing the total row count for a Hive table in a scan. The row count can be incorrectly computed as 0, even though there exists data in the Hive table. This is stats corruption at the table level. Similar stats corruption exists for a partition. The row count of a table or a partition can sometimes also be -1, which indicates missing stats. In the fix, as long as no partition in a Hive table exhibits any missing or corrupt stats, the total row count for the table is computed from the row counts in all partitions. Otherwise, Impala looks at the table-level stats, particularly the table row count. In addition, if the table stats are missing or corrupted, Impala estimates a row count for the table, if feasible. This row count is the sum of the row count from the partitions with good stats, and an estimation of the number of rows in the partitions with missing or corrupt stats. Such estimation also applies when some partition has corrupt stats. One way to observe the fix is through the explain of queries scanning Hive tables with missing or corrupted stats. The cardinality for any full scan should be a positive value (i.e. the estimated row count), instead of 'unavailable'. At the beginning of the explain output, that table is still listed in the WARNING section for potentially corrupt table statistics. Testing: 1. Ran unit tests with queries documented in the case against Hive tables with the following configurations: a. No stats corruption in any partitions b. Stats corruption in some partitions c. Stats corruption in all partitions 2. Added two new tests in test_compute_stats.py: a. test_corrupted_stats_in_partitioned_Hive_tables b. 
test_corrupted_stats_in_unpartitioned_Hive_tables 3. Fixed failures in corrupt-stats.test 4. Ran "core" test Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test M tests/metadata/test_compute_stats.py M tests/metadata/test_explain.py 16 files changed, 311 insertions(+), 97 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/39 -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 39 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
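The decision order described in the commit message above (partition stats first, then table stats, then an estimate) can be sketched in Python as follows. This is not the Java code in HdfsScanNode: `Partition`, `has_bad_stats` and `estimate_table_rows` are hypothetical names, "corrupt" is assumed to mean a row count of 0 with a non-zero data size, the bytes-per-row estimate is an assumed stand-in for whatever estimation Impala applies, and the table is assumed to have at least one partition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
    num_rows: int   # -1 means the stats are missing
    num_bytes: int  # size of the partition's data files


def has_bad_stats(num_rows: int, num_bytes: int) -> bool:
    """Missing (-1) or, by assumption, corrupt (0 rows although data exists)."""
    return num_rows < 0 or (num_rows == 0 and num_bytes > 0)


def estimate_table_rows(partitions: List[Partition], table_rows: int,
                        table_bytes: int, bytes_per_row: int) -> int:
    bad = [p for p in partitions if has_bad_stats(p.num_rows, p.num_bytes)]
    # 1. Every partition has good stats: sum the partition row counts.
    if not bad:
        return sum(p.num_rows for p in partitions)
    # 2. Otherwise fall back to the table-level row count if it is usable.
    if not has_bad_stats(table_rows, table_bytes):
        return table_rows
    # 3. Otherwise: good partitions contribute their stats, bad ones an estimate.
    good_rows = sum(p.num_rows for p in partitions
                    if not has_bad_stats(p.num_rows, p.num_bytes))
    estimated_rows = sum(p.num_bytes // bytes_per_row for p in bad)
    return good_rows + estimated_rows
```

For example, with one partition reporting 100 rows and another reporting 0 rows over 500 bytes of data, a missing table row count and an assumed 10 bytes per row, the sketch returns 100 + 50 = 150 instead of 'unavailable'.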
[Impala-ASF-CR] WIP IMPALA-7779 Parquet Scanner can write binary data into profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16331 ) Change subject: WIP IMPALA-7779 Parquet Scanner can write binary data into profile .. Patch Set 1: (1 comment) overall, looks correct. would be nice to have a test for this as well. http://gerrit.cloudera.org:8080/#/c/16331/1/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/16331/1/be/src/exec/parquet/hdfs-parquet-scanner.cc@1337 PS1, Line 1337: char hex[2 * sizeof(PARQUET_VERSION_NUMBER) + 2]; : hex[0] = '0'; : hex[1] = 'x'; : auto hex_ptr = hex + 2; : for (int i = 0; i < sizeof(PARQUET_VERSION_NUMBER); i++) { : sprintf(hex_ptr + i * 2, "%02x", magic_number_ptr[i]); : } : return Status(TErrorCode::PARQUET_BAD_VERSION_NUMBER, filename(), : string(hex, sizeof(hex)), scan_node_->hdfs_table()->fully_qualified_name()); you can just use "ReadWriteUtil::HexDump" - we do something very similar in HdfsAvroScanner::ReadFileHeader -- To view, visit http://gerrit.cloudera.org:8080/16331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I281d6fa7cb2f88f04588110943e3e768678b9cf1 Gerrit-Change-Number: 16331 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 12 Aug 2020 21:27:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-7779 Parquet Scanner can write binary data into profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16331 ) Change subject: WIP IMPALA-7779 Parquet Scanner can write binary data into profile .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6899/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I281d6fa7cb2f88f04588110943e3e768678b9cf1 Gerrit-Change-Number: 16331 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Wed, 12 Aug 2020 21:10:50 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-7779 Parquet Scanner can write binary data into profile
Qifan Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16331 ) Change subject: WIP IMPALA-7779 Parquet Scanner can write binary data into profile .. WIP IMPALA-7779 Parquet Scanner can write binary data into profile This fix addresses a current limitation: an ill-formed Parquet version string is not converted to a printable form before it appears in an error message or impalad.INFO. With the fix, any such string is converted to a hex string prefixed with 0x, in which each character is represented by two hex digits. Testing: "core" test (TBD). Change-Id: I281d6fa7cb2f88f04588110943e3e768678b9cf1 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M common/thrift/generate_error_codes.py 2 files changed, 12 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/16331/1 -- To view, visit http://gerrit.cloudera.org:8080/16331 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I281d6fa7cb2f88f04588110943e3e768678b9cf1 Gerrit-Change-Number: 16331 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9995 Fix test_alloc_fail failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6281/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 20:37:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Tim Armstrong has removed a vote on this change. Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 37: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 37 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 20:26:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6279/ -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 20:25:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6278/ -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 20:25:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9225: Add query option for retryable queries to spool all results before returning any to the client
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16323 ) Change subject: IMPALA-9225: Add query option for retryable queries to spool all results before returning any to the client .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/16323/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16323/3//COMMIT_MSG@25 PS3, Line 25: The coordinator fragment instance will only update its : opened_promise_ when the sender is blocked by a full queue or all : results are spooled, or when any errors happen. The DataSink::Send() : interface is extended to pass in a reference of the promise. So the sink : can update it to signal the fetch() request which is waiting for the : opened_promise_ to continue and actually fetch results. conceptually I think this approach makes sense, but I think it would make more sense to move some of the code around. see my other comments, but I think we probably don't want to change what opened_promise_ means. I think this change will effectively change the meaning of opened_promise_. instead of being set when Open() completes, it will be set when all results are spooled, which would probably be after Exec() completes, which is a bit confusing. conceptually, I think the approach is correct. you basically want Coordinator::Wait() to block until all results have been spooled, which makes sense. I just think we can move the logic directly into Coordinator::Wait() because it will be cleaner. http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/exec/data-sink.h File be/src/exec/data-sink.h: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/exec/data-sink.h@136 PS3, Line 136: virtual Status Send(RuntimeState* state, RowBatch* batch, : Promise* sender_blocked=nullptr) = 0; See my comment in fragment-instance-state.cc, but I think this can be moved into the BufferedPlanRootSink class. You might have to do a cast from the PlanRootSink to BufferedPlanRootSink, but that should be fine. 
BufferedPlanRootSink is always used when result spooling is enabled. It doesn't look like this is used anywhere else but BufferedPlanRootSink anyway. You probably can't add it to the Send method in BufferedPlanRootSink since it inherits from DataSink::Send, so you might just have to add a new public method, something like IsBufferFull(). http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/fragment-instance-state.cc@430 PS3, Line 430: Promise* sender_blocked_promise = !opened_promise_.IsSet() ? : &opened_promise_ : nullptr; : RETURN_IF_ERROR(sink_->Send(runtime_state_, row_batch_.get(), : sender_blocked_promise)); I think it might make more sense to move all this logic into coordinator.cc There are a few benefits: (1) Most of this logic is really Coordinator specific. The opened_promise_ is actually only used by the Coordinator in Coordinator::Wait() (2) The Coordinator has a pointer to the PlanRootSink (see PlanRootSink* coord_sink_). So you can potentially avoid making the change to the Send() method in DataSink, which thus avoids all the refactoring of unrelated classes like KuduTableSink http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/query-driver.cc@312 PS3, Line 312: if (query_ctx.client_request.query_options.safely_retry_queries) { : // Reset this flag in the retry query since we won't retry again, so results can be : // returned immediately. 
: query_ctx.client_request.query_options.__set_safely_retry_queries(false); : VLOG_QUERY << "Unset safely_retry_queries when retrying query " : << PrintId(client_request_state_->query_id()); yeah this makes sense, nice catch http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/spillable-row-batch-queue.h File be/src/runtime/spillable-row-batch-queue.h: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/spillable-row-batch-queue.h@133 PS3, Line 133: MAX_SPILLED_RESULT_SPOOLING_MEM thanks for catching this http://gerrit.cloudera.org:8080/#/c/16323/3/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/16323/3/common/thrift/ImpalaService.thrift@561 PS3, Line 561: SAFELY_RETRY_QUERIES I do like the idea of making this configurable. some users might find the overhead of spooling all results too high - yet may still want queries to be retried. I think we might want to change the name though. If I'm a user, I would probably always want
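The review above suggests exposing something like IsBufferFull() on the buffered sink and letting the coordinator's wait path block until either all results are spooled or the buffer is full. A minimal sketch of that idea follows, assuming a standalone class built on the standard library; SpoolingSinkSketch and all of its members are hypothetical names for illustration and are not Impala's BufferedPlanRootSink or Coordinator code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a buffered root sink; none of these names are
// taken from Impala.
class SpoolingSinkSketch {
 public:
  explicit SpoolingSinkSketch(int capacity) : capacity_(capacity) {}

  // Producer side: buffer one batch. Returns false if the buffer is full.
  bool AddBatch() {
    std::lock_guard<std::mutex> l(lock_);
    if (buffered_ >= capacity_) return false;
    ++buffered_;
    if (buffered_ == capacity_) cv_.notify_all();  // wake any waiter
    return true;
  }

  // Producer side: signal that no more batches will arrive.
  void MarkAllSpooled() {
    {
      std::lock_guard<std::mutex> l(lock_);
      all_spooled_ = true;
    }
    cv_.notify_all();
  }

  bool IsBufferFull() {
    std::lock_guard<std::mutex> l(lock_);
    return buffered_ >= capacity_;
  }

  // Consumer side: block until every result is spooled or the buffer is full.
  void WaitForSpoolingOrFull() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return all_spooled_ || buffered_ >= capacity_; });
  }

 private:
  const int capacity_;
  int buffered_ = 0;
  bool all_spooled_ = false;
  std::mutex lock_;
  std::condition_variable cv_;
};
```

Keeping the wait condition inside the sink means the generic Send() interface does not need an extra promise parameter, which is the refactoring the reviewer wanted to avoid.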
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries Strip debug symbols from libkudu_client.so and libstdc++.so. The same technique used to strip debug symbols from impalad binaries is used. This decreases the Docker image sizes by about 100 MB. Test: * Ran Dockerized tests Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Reviewed-on: http://gerrit.cloudera.org:8080/16263 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M docker/setup_build_context.py 1 file changed, 28 insertions(+), 7 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 10 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 9 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 20:20:32 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16291 ) Change subject: WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6898/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5fa83c8009590124dded4783f77ef70fa30119e6 Gerrit-Change-Number: 16291 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 12 Aug 2020 18:26:34 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16291 ) Change subject: WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/16291/3/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/16291/3/be/src/service/control-service.cc@76 PS3, Line 76: mem_tracker_.get()->Close(); > Why was this included with this change? Hit a DCHECK failure in the MemTracker destructor because Close() was not called before releasing it. It happened in the core tests on Jenkins. http://gerrit.cloudera.org:8080/#/c/16291/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/16291/3/be/src/service/impala-server.cc@555 PS3, Line 555: TNetworkAddress ImpalaServer::GetConfiguredBackendAddress() { > I think we can just get rid of this function and use 'exec_env_' directly i Agree, will fix it. http://gerrit.cloudera.org:8080/#/c/16291/3/be/src/service/impala-server.cc@1031 PS3, Line 1031: ExecEnv::GetInstance() > I think you can use 'exec_env_' Right, will fix it. http://gerrit.cloudera.org:8080/#/c/16291/3/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/16291/3/common/thrift/ImpalaInternalService.thrift@a519 PS3, Line 519: > I think this TODO is probably worth leaving Will keep the TODO. http://gerrit.cloudera.org:8080/#/c/16291/3/tests/custom_cluster/test_restart_services.py File tests/custom_cluster/test_restart_services.py: http://gerrit.cloudera.org:8080/#/c/16291/3/tests/custom_cluster/test_restart_services.py@281 PS3, Line 281: for port in thrift_ports: > Might as well get rid of the loop since there's only one now. Will remove the loop. 
-- To view, visit http://gerrit.cloudera.org:8080/16291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5fa83c8009590124dded4783f77ef70fa30119e6 Gerrit-Change-Number: 16291 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 12 Aug 2020 18:02:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16291 ) Change subject: WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService .. WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService The legacy Thrift-based Impala internal service has been deprecated and can be removed now. The port 22000 can also be freed up. This patch removes ImpalaInternalService-related code. The flag be_port is made a REMOVED_FLAG and all infrastructure around it is cleaned up. TQueryCtx.coord_address is changed to TQueryCtx.coord_hostname since the port in TQueryCtx.coord_address is set as be_port and is unused now. Rename TQueryCtx.coord_krpc_address to TQueryCtx.coord_ip_address. Testing: - TODO: Pass the exhaustive tests. Change-Id: I5fa83c8009590124dded4783f77ef70fa30119e6 --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/benchmarks/expr-benchmark.cc M be/src/common/global-flags.cc M be/src/exprs/expr-test.cc M be/src/exprs/utility-functions-ir.cc M be/src/rpc/thrift-server-test.cc D be/src/runtime/backend-client.h M be/src/runtime/client-cache-types.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/initial-reservations.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/test-env.cc M be/src/scheduling/executor-blacklist.cc M be/src/scheduling/scheduler-test-util.h M be/src/service/CMakeLists.txt M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h D be/src/service/impala-internal-service.cc D be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M 
be/src/service/impalad-main.cc M be/src/service/session-expiry-test.cc M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M be/src/util/debug-util.cc M bin/generate_minidump_collection_testdata.py M bin/start-impala-cluster.py M common/thrift/ImpalaInternalService.thrift M infra/deploy/deploy.py M tests/common/impala_cluster.py M tests/common/impala_service.py M tests/custom_cluster/test_blacklist.py M tests/custom_cluster/test_process_failures.py M tests/custom_cluster/test_query_retries.py M tests/custom_cluster/test_restart_services.py M tests/shell/test_shell_interactive.py M tests/webserver/test_web_pages.py 48 files changed, 112 insertions(+), 352 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/16291/3 -- To view, visit http://gerrit.cloudera.org:8080/16291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5fa83c8009590124dded4783f77ef70fa30119e6 Gerrit-Change-Number: 16291 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10066: Fix test_cancellation_mid_command fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16322 ) Change subject: IMPALA-10066: Fix test_cancellation_mid_command fails .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib80706d52a85d2c19b13fbbe5695934658c0bf7e Gerrit-Change-Number: 16322 Gerrit-PatchSet: 5 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 12 Aug 2020 17:55:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10066: Fix test_cancellation_mid_command fails
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16322 ) Change subject: IMPALA-10066: Fix test_cancellation_mid_command fails .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6280/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib80706d52a85d2c19b13fbbe5695934658c0bf7e Gerrit-Change-Number: 16322 Gerrit-PatchSet: 5 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 12 Aug 2020 17:55:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

This implements scanning full ACID tables that contain complex types. The same technique works that we use for primitive types. I.e. we add a LEFT ANTI JOIN on top of the Hdfs scan node in order to subtract the deleted rows from the inserted rows.

However, there were some types of queries where we couldn't do that. These are the queries that scan the nested collection items directly. E.g.:

SELECT item FROM complextypestbl.int_array;

The above query only creates a single tuple descriptor that holds the collection items. Since this tuple descriptor is not at the table-level, we cannot add slot references to the hidden ACID column which are at the top level of the table schema.

To resolve this I added a statement rewriter that rewrites the above statement to the following:

SELECT item FROM complextypestbl $a$1, $a$1.int_array;

Now in this example we'll have two tuple descriptors, one for the table-level, and one for the collection item. So we can add the ACID slot refs to the table-level tuple descriptor. The rewrite is implemented by the new AcidRewriter class.

Performance

I executed the following query with num_nodes=1 on a non-transactional table (without the rewrite), and on an ACID table (with the rewrite):

select count(*) from customer_nested.c_orders.o_lineitems;

Without the rewrite:
Fetched 1 row(s) in 0.41s
+--------------+--------+-------+----------+----------+-------+------------+----------+---------------+---------------------------------------------------+
| Operator     | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                                            |
+--------------+--------+-------+----------+----------+-------+------------+----------+---------------+---------------------------------------------------+
| F00:ROOT     | 1      | 1     | 13.61us  | 13.61us  |       |            | 0 B      | 0 B           |                                                   |
| 01:AGGREGATE | 1      | 1     | 3.68ms   | 3.68ms   | 1     | 1          | 16.00 KB | 10.00 MB      | FINALIZE                                          |
| 00:SCAN HDFS | 1      | 1     | 280.47ms | 280.47ms | 6.00M | 15.00M     | 56.98 MB | 8.00 MB       | tpch_nested_orc_def.customer.c_orders.o_lineitems |
+--------------+--------+-------+----------+----------+-------+------------+----------+---------------+---------------------------------------------------+

With the rewrite:
Fetched 1 row(s) in 0.42s
+---------------------------+--------+-------+----------+----------+---------+------------+----------+---------------+---------------------------------------+
| Operator                  | #Hosts | #Inst | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                                |
+---------------------------+--------+-------+----------+----------+---------+------------+----------+---------------+---------------------------------------+
| F00:ROOT                  | 1      | 1     | 25.16us  | 25.16us  |         |            | 0 B      | 0 B           |                                       |
| 05:AGGREGATE              | 1      | 1     | 3.44ms   | 3.44ms   | 1       | 1          | 63.00 KB | 10.00 MB      | FINALIZE                              |
| 01:SUBPLAN                | 1      | 1     | 16.52ms  | 16.52ms  | 6.00M   | 125.92M    | 47.00 KB | 0 B           |                                       |
| |--04:NESTED LOOP JOIN    | 1      | 1     | 188.47ms | 188.47ms | 0       | 10         | 24.00 KB | 12 B          | CROSS JOIN                            |
| |  |--02:SINGULAR ROW SRC | 1      | 1     | 0ns      | 0ns      | 0       | 1          | 0 B      | 0 B           |                                       |
| |  03:UNNEST              | 1      | 1     | 25.37ms  | 25.37ms  | 0       | 10         | 0 B      | 0 B           | $a$1.c_orders.o_lineitems o_lineitems |
| 00:SCAN HDFS              | 1      | 1     | 96.26ms  | 96.26ms  | 100.00K | 12.59M     | 38.19 MB | 72.00 MB      | default.customer_nested $a$1          |
+---------------------------+--------+-------+----------+----------+---------+------------+----------+---------------+---------------------------------------+

So the overhead is very little.

Testing
* Added planner tests to PlannerTest/acid-scans.test
* E2E query tests to QueryTest/full-acid-complex-type-scans.test
* E2E tests for rowid-generation: QueryTest/full-acid-rowid.test

Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Reviewed-on: http://gerrit.cloudera.org:8080/16228 Reviewed-by: Zoltan Borok-Nagy Tested-by: Impala Public Jenki
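The statement rewrite described in the commit message above turns a direct reference to a nested collection into a table reference with a generated alias plus a collection reference relative to that alias. The sketch below only illustrates the shape of the rewritten FROM clause at the string level. RewriteCollectionRef is a hypothetical helper, and it assumes the caller already knows where the table name ends and the collection path begins. The actual AcidRewriter is a frontend class whose implementation is not shown in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the AcidRewriter itself: builds the rewritten
// FROM-clause text "<table> $a$<n>, $a$<n>.<collection_path>".
std::string RewriteCollectionRef(const std::string& table,
                                 const std::string& collection_path,
                                 int alias_id) {
  // Generated alias in the "$a$1" style used by the commit message.
  const std::string alias = "$a$" + std::to_string(alias_id);
  return table + " " + alias + ", " + alias + "." + collection_path;
}
```

With two table references in the FROM clause there are two tuple descriptors, which is what lets the ACID slot references attach to the table-level one.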
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 17:45:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 8: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/16228/7/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: http://gerrit.cloudera.org:8080/#/c/16228/7/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@502 PS7, Line 502: } > There used to be log messages on the other rewrites, seems like now they ar Yeah I complained about those on a different review :) -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 12 Aug 2020 17:45:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6897/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:43:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6279/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:23:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9676 Add aarch64 compile options for clang
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15755 ) Change subject: IMPALA-9676 Add aarch64 compile options for clang .. Patch Set 18: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15755 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 Gerrit-Change-Number: 15755 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:23:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6278/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:23:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9904 Fix bad cipher test failed case on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16172 ) Change subject: IMPALA-9904 Fix bad cipher test failed case on aarch64 .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 Gerrit-Change-Number: 16172 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:23:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16174 ) Change subject: IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16174 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Gerrit-Change-Number: 16174 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 12 Aug 2020 16:23:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/16314/5/be/src/exprs/hive-udf-call.cc File be/src/exprs/hive-udf-call.cc: http://gerrit.cloudera.org:8080/#/c/16314/5/be/src/exprs/hive-udf-call.cc@486 PS5, Line 486: /// call_java:; preds = %child_not_null14, %child_null15 line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:20:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Daniel Becker has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. IMPALA-7658: Proper codegen for HiveUdfCall Implementing codegen for HiveUdfCall. TODO: Testing TODO: Benchmarks Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/hive-udf-call-ir.cc M be/src/exprs/hive-udf-call.cc M be/src/exprs/hive-udf-call.h 8 files changed, 469 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/5 -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/16314 ) Change subject: IMPALA-7658: Proper codegen for HiveUdfCall .. Patch Set 4: (1 comment) Added benchmark results on Jira. http://gerrit.cloudera.org:8080/#/c/16314/4/be/src/exprs/hive-udf-call.cc File be/src/exprs/hive-udf-call.cc: http://gerrit.cloudera.org:8080/#/c/16314/4/be/src/exprs/hive-udf-call.cc@288 PS4, Line 288: return builder->CreateCall( > I think you should be able to use ToNativePtr() here instead of calling out Done -- To view, visit http://gerrit.cloudera.org:8080/16314 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747 Gerrit-Change-Number: 16314 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:19:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16252 ) Change subject: IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user .. Patch Set 6: Code-Review+2 merge failure due to IMPALA-10054, rebased -- To view, visit http://gerrit.cloudera.org:8080/16252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070 Gerrit-Change-Number: 16252 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 16:18:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 37: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6277/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 37 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 15:23:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6276/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 9 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 15:08:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6275/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 9 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 14:35:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16263 ) Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 9 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 14:34:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries
Sahil Takiar has removed a vote on this change. Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/16263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294 Gerrit-Change-Number: 16263 Gerrit-PatchSet: 8 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 12 Aug 2020 14:32:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 36: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6896/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 36 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 13:46:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Qifan Chen has uploaded a new patch set (#36). ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans This work addresses the current limitation in computing the total row count for a Hive table in a scan. The row count can be incorrectly computed as 0, even though data exists in the Hive table. This is stats corruption at the table level. Similar stats corruption exists for a partition. The row count of a table or a partition can sometimes also be -1, which indicates missing stats. In the fix, as long as no partition in a Hive table exhibits any missing or corrupt stats, the total row count for the table is computed from the row counts in all partitions. Otherwise, Impala looks at the table-level stats, particularly the table row count. In addition, if the table stats are missing or corrupted, Impala estimates a row count for the table, if feasible. This row count is the sum of the row count from the partitions with good stats, and an estimation of the number of rows in the partitions with missing or corrupt stats. Such estimation also applies when some partition has corrupt stats. One way to observe the fix is through the explain of queries scanning Hive tables with missing or corrupted stats. The cardinality for any full scan should be a positive value (i.e. the estimated row count), instead of 'unavailable'. At the beginning of the explain output, that table is still listed in the WARNING section for potentially corrupt table statistics. Testing: 1. Ran unit tests with queries documented in the case against Hive tables with the following configurations: a. No stats corruption in any partitions b. Stats corruption in some partitions c. Stats corruption in all partitions 2. Added two new tests in test_compute_stats.py: a. test_corrupted_stats_in_partitioned_Hive_tables b. test_corrupted_stats_in_unpartitioned_Hive_tables 3. Fixed failures in corrupt-stats.test, stats-extrapolation.test and test_compute_stats.py 4. Introduced a new filter ReplaceValueFilter in planner tests and updated several test result files. 5. Ran "core" test Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test M tests/metadata/test_compute_stats.py M tests/metadata/test_explain.py 16 files changed, 311 insertions(+), 97 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/36 -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 36 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
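The row-count fallback described in the IMPALA-9744 commit message can be sketched roughly as below. This is only a minimal C++ illustration of the three steps (sum the partitions, fall back to table-level stats, estimate); the actual change lives in the Java planner (HdfsScanNode.java). PartitionStats, IsGoodStats, EstimateTableRowCount and the bytes-per-row heuristic are all assumptions made for this example and are not Impala APIs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical per-partition stats; not an Impala type.
struct PartitionStats {
  int64_t num_rows;     // -1 means the stats are missing
  int64_t total_bytes;  // size of the partition's data files
};

// Stats are usable when a row count is present and is not a suspicious 0
// for something that actually holds data.
bool IsGoodStats(int64_t num_rows, int64_t total_bytes) {
  if (num_rows < 0) return false;                      // missing
  if (num_rows == 0 && total_bytes > 0) return false;  // corrupt
  return true;
}

// Returns the row count to plan with, or -1 if no estimate is feasible.
// assumed_bytes_per_row is an illustrative heuristic, not Impala's estimator.
int64_t EstimateTableRowCount(const std::vector<PartitionStats>& parts,
                              int64_t table_num_rows, int64_t table_bytes,
                              double assumed_bytes_per_row) {
  int64_t good_rows = 0;
  int64_t bad_bytes = 0;
  bool all_good = !parts.empty();
  for (const PartitionStats& p : parts) {
    if (IsGoodStats(p.num_rows, p.total_bytes)) {
      good_rows += p.num_rows;
    } else {
      all_good = false;
      bad_bytes += p.total_bytes;
    }
  }
  // Step 1: every partition has good stats, so the sum is trusted.
  if (all_good) return good_rows;
  // Step 2: otherwise look at the table-level row count.
  if (IsGoodStats(table_num_rows, table_bytes)) return table_num_rows;
  // Step 3: table stats are unusable too; estimate the missing part.
  if (assumed_bytes_per_row <= 0) return -1;  // estimation not feasible
  if (parts.empty()) bad_bytes = table_bytes;  // unpartitioned table
  double estimated =
      std::ceil(static_cast<double>(bad_bytes) / assumed_bytes_per_row);
  return good_rows + static_cast<int64_t>(estimated);
}
```

With two partitions of which one reports 0 rows for 500 bytes of data, and an assumed 10 bytes per row, the sketch plans with the good partition's rows plus 50 estimated rows when the table-level count is missing too.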
[Impala-ASF-CR] IMPALA-9225: Add query option for retryable queries to spool all results before returning any to the client
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16323 ) Change subject: IMPALA-9225: Add query option for retryable queries to spool all results before returning any to the client .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6273/ -- To view, visit http://gerrit.cloudera.org:8080/16323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I462dbfef9ddab9060b30a6937fca9122484a24a5 Gerrit-Change-Number: 16323 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 12 Aug 2020 13:14:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9962: Implement ds kll quantiles() function
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/16324 ) Change subject: IMPALA-9962: Implement ds_kll_quantiles() function .. Patch Set 3: Code-Review+1 LGTM! -- To view, visit http://gerrit.cloudera.org:8080/16324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f Gerrit-Change-Number: 16324 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 12 Aug 2020 10:06:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16174 ) Change subject: IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6893/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16174 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Gerrit-Change-Number: 16174 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 12 Aug 2020 09:59:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9995 Fix test alloc fail failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6895/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:58:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9926 base64decode % will not return error when in newer OS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16175 ) Change subject: IMPALA-9926 base64decode % will not return error when in newer OS .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6894/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib9bd9e03d5f744c18c957cdaf2064fa918086004 Gerrit-Change-Number: 16175 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:57:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6892/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:55:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9904 Fix bad cipher test failed case on aarch64
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16172 ) Change subject: IMPALA-9904 Fix bad cipher test failed case on aarch64 .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6891/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 Gerrit-Change-Number: 16172 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:45:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9676 Add aarch64 compile options for clang
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15755 ) Change subject: IMPALA-9676 Add aarch64 compile options for clang .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6890/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15755 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 Gerrit-Change-Number: 15755 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:44:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 41: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6889/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 41 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:43:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9995 Fix test alloc fail failed case on aarch64
zhaoren...@hotmail.com has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16307 ) Change subject: IMPALA-9995 Fix test_alloc_fail failed case on aarch64 .. IMPALA-9995 Fix test_alloc_fail failed case on aarch64 Length of Json object '{"a": 1}", '$.a' is 32 bytes on x86, but is 48 bytes on aarch64 Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd --- M testdata/workloads/functional-query/queries/QueryTest/alloc-fail-init.test 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16307/4 -- To view, visit http://gerrit.cloudera.org:8080/16307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9a5a4ba19b225bdb4f18a68d6d9cb2c2d16f91fd Gerrit-Change-Number: 16307 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9926 base64decode % will not return error when in newer OS
zhaoren...@hotmail.com has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/16175 ) Change subject: IMPALA-9926 base64decode % will not return error when in newer OS .. IMPALA-9926 base64decode % will not return error when in newer OS For example, base64decode('YWxwaGE%') returns 'alpha\377' on a newer OS that has a newer sasl library. I tested it on the aarch64 version of Ubuntu 18.04. Change-Id: Ib9bd9e03d5f744c18c957cdaf2064fa918086004 --- M be/src/exprs/expr-test.cc M testdata/workloads/functional-query/queries/QueryTest/exprs.test 2 files changed, 7 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/16175/8 -- To view, visit http://gerrit.cloudera.org:8080/16175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib9bd9e03d5f744c18c957cdaf2064fa918086004 Gerrit-Change-Number: 16175 Gerrit-PatchSet: 8 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
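The IMPALA-9926 patch only touches test files. As a hedged illustration of why an input such as 'YWxwaGE%' is problematic, a strict pre-check that rejects any character outside the standard base64 alphabet could look like the sketch below. IsBase64Char and IsStrictBase64 are hypothetical helpers written for this example; they are not Impala or sasl functions, and the patch does not add such a check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True for the 64 characters of the standard base64 alphabet.
bool IsBase64Char(char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '+' || c == '/';
}

// Accepts only well-formed input: length is a multiple of 4, at most two
// '=' padding characters at the very end, alphabet characters elsewhere.
bool IsStrictBase64(const std::string& in) {
  const std::size_t n = in.size();
  if (n % 4 != 0) return false;
  std::size_t pad = 0;
  while (pad < 2 && pad < n && in[n - 1 - pad] == '=') ++pad;
  for (std::size_t i = 0; i < n - pad; ++i) {
    if (!IsBase64Char(in[i])) return false;
  }
  return true;
}
```

A decoder guarded by such a check would reject 'YWxwaGE%' regardless of how the underlying library treats the stray '%'.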
[Impala-ASF-CR] IMPALA-9906 Fix thread-pool-test failed case on aarch64
zhaoren...@hotmail.com has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16173 ) Change subject: IMPALA-9906 Fix thread-pool-test failed case on aarch64 .. IMPALA-9906 Fix thread-pool-test failed case on aarch64 Thread switching on aarch64 is not as fast as on x86, so this change increases the sleep task time from 100ms to 500ms. Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d --- M be/src/util/thread-pool-test.cc 1 file changed, 9 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/16173/7 -- To view, visit http://gerrit.cloudera.org:8080/16173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7b353f7eb9662995d9a8ae460bb1631933873d5d Gerrit-Change-Number: 16173 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64
zhaoren...@hotmail.com has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16174 ) Change subject: IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 .. IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 cast(pow(2, 31) as int) returns 2147483647 on aarch64 but returns 2147483648 on x86. I think aarch64 is correct, so here I do not convert it and just use aarch64's value. Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf --- M be/src/exprs/expr-test.cc 1 file changed, 5 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16174/7 -- To view, visit http://gerrit.cloudera.org:8080/16174 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Gerrit-Change-Number: 16174 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
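Some background on the IMPALA-9925 discrepancy: in C++, converting a double whose truncated value does not fit in the target integer type is undefined behavior, so different architectures are free to produce different results for 2^31. The patch itself only adjusts the test expectation. The sketch below is an alternative shown purely for illustration: a range-checked conversion that yields the same result on every platform. SaturatingCastToInt32 is a hypothetical helper and is not part of Impala.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Converts a double to int32_t without relying on undefined behavior.
// Out-of-range values clamp to the nearest representable int32_t and
// NaN maps to 0 (both choices are assumptions made for this sketch).
int32_t SaturatingCastToInt32(double v) {
  constexpr int32_t kMax = std::numeric_limits<int32_t>::max();
  constexpr int32_t kMin = std::numeric_limits<int32_t>::min();
  if (std::isnan(v)) return 0;
  if (v >= 2147483648.0) return kMax;              // 2^31 and above
  if (v <= static_cast<double>(kMin)) return kMin;  // -2^31 and below
  return static_cast<int32_t>(v);  // in range: truncates toward zero
}
```

Under these assumptions, SaturatingCastToInt32(2147483648.0) yields 2147483647 on both x86 and aarch64, which matches the value the commit message reports for aarch64.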
[Impala-ASF-CR] IMPALA-9904 Fix bad cipher test failed case on aarch64
zhaoren...@hotmail.com has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16172 ) Change subject: IMPALA-9904 Fix bad cipher test failed case on aarch64 .. IMPALA-9904 Fix bad cipher test failed case on aarch64 On aarch64 with Ubuntu 18.04, the openssl version is 1.1.1, and a server that uses openssl can start successfully even if the ciphers are bad, so the bad cipher cases are not tested on aarch64. On x86, the server cannot start successfully because of the lower openssl version, not because of the bad cipher. Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 --- M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/thrift-server-test.cc M be/src/util/webserver-test.cc 3 files changed, 10 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/16172/7 -- To view, visit http://gerrit.cloudera.org:8080/16172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I19751b6bf1045fd6d901c5a67f74e8bdd6bf65d3 Gerrit-Change-Number: 16172 Gerrit-PatchSet: 7 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9676 Add aarch64 compile options for clang
zhaoren...@hotmail.com has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/15755 ) Change subject: IMPALA-9676 Add aarch64 compile options for clang .. IMPALA-9676 Add aarch64 compile options for clang Add signed-char and armv8a and crc compile options to clang Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 --- M be/CMakeLists.txt 1 file changed, 5 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/15755/18 -- To view, visit http://gerrit.cloudera.org:8080/15755 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I69a5ff64bbd4427dd87ec6e884251e76d6a73122 Gerrit-Change-Number: 15755 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 41: (8 comments) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h File be/src/util/sse2neon.h: http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@213 PS41, Line 213: // https://msdn.microsoft.com/en-us/library/bb514059%28v=vs.120%29.aspx?f=255&MSPPError=-2147217396 line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@406 PS41, Line 406: // https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/whtfzhzk(v=vs.100) line too long (104 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@413 PS41, Line 413: // https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_set1_epi64x&expand=4961 line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@1054 PS41, Line 1054: // https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_shuffle_epi8&expand=5146 line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@1199 PS41, Line 1199: // https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/y41dkk37(v=vs.100) line too long (104 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@1645 PS41, Line 1645: // https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_test_all_zeros&expand=5871 line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@3581 PS41, Line 3581: // https://github.com/ColinIanKing/linux-next-mirror/blob/b5f466091e130caaf0735976648f72bd5e09aa84/crypto/aegis128-neon-inner.c#L52 line too long (131 > 90) http://gerrit.cloudera.org:8080/#/c/15531/41/be/src/util/sse2neon.h@3681 PS41, Line 3681: // 
cpp-compiler-developer-guide-and-reference-allocating-and-freeing-aligned-memory-blocks line too long (98 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 41 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:22:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
zhaoren...@hotmail.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 41: (13 comments) http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/codegen/llvm-codegen-test.cc File be/src/codegen/llvm-codegen-test.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/codegen/llvm-codegen-test.cc@537 PS40, Line 537: // state. > Can you modify this so that less of this code is within the ifdef. I think Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/exec/delimited-text-parser.inline.h File be/src/exec/delimited-text-parser.inline.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/exec/delimited-text-parser.inline.h@242 PS40, Line 242: column_idx_ = num_partition_keys_; > I think you can move this into the #ifndef. Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/kudu/util/block_bloom_filter.cc File be/src/kudu/util/block_bloom_filter.cc: PS40: > For all the files under kudu/util, we need to pull in the changes from Kudu Hi, Tim. This file is the same as Kudu's; I have already compared them. http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util-test.cc File be/src/util/bit-util-test.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util-test.cc@137 PS40, Line 137: int buf_size = 0; > Does this need to be an ifdef? Can we use base::IsAArch64()? I prefer regul Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util-test.cc@187 PS40, Line 187: } > Same things here and below - use regular if statements where possible. Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.h File be/src/util/bit-util.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.h@132 PS40, Line 132: static inline int Popcount(uint64_t x) { > This is really ugly. 
It would be better just to define a separate version o Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc File be/src/util/bit-util.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc@144 PS40, Line 144: #ifndef __aarch64__ > Can you group together the aarch64 implementations after the x86 implementa Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc@196 PS40, Line 196: bswap_fptr = SimdByteSwap::ByteSwap128; > The #ifdefs mixed with control flow is not readable, can you factor out int Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/bit-util.cc@237 PS40, Line 237: else if (len >= 32) { > Same here, too many #ifdefs mixed in with the control flow. Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/cpu-info.cc File be/src/util/cpu-info.cc: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/cpu-info.cc@160 PS40, Line 160: if (num_cores > 0) { > This seems confusing. I think we should define a new constant for neon to m Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/hash-util.h@42 PS40, Line 42: DCHECK(CpuInfo::IsSupported(CpuInfo::SSE4_2) || base::IsAarch64()); > We need to find a better solution than all these #ifdefs. Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/sse-util.h File be/src/util/sse-util.h: http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/sse-util.h@86 PS40, Line 86: /// support (e.g. IMPALA-6882). > Can you add a comment here like: Done http://gerrit.cloudera.org:8080/#/c/15531/40/be/src/util/sse2neon.h File be/src/util/sse2neon.h: PS40: > Where did this version of the file come from? 
I would like to know so I can Hi, Tim, the file from here :https://github.com/DLTcollab/sse2neon -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 41 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:22:08 + Gerrit-HasComments: Yes
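Several of the review comments above ask for regular if statements instead of #ifdefs mixed into control flow. A minimal sketch of that idea follows, assuming a hypothetical local IsAarch64() helper (not Impala's base::IsAarch64(), whose definition is not shown in this thread) and hypothetical stand-in kernels that simply call a scalar loop so the example builds on any platform. It is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the only place that consults the preprocessor.
constexpr bool IsAarch64() {
#ifdef __aarch64__
  return true;
#else
  return false;
#endif
}

using ByteSwapFn = void (*)(const uint8_t* src, size_t len, uint8_t* dst);

// Portable reference: writes the bytes of 'src' into 'dst' in reverse order.
inline void ByteSwapScalar(const uint8_t* src, size_t len, uint8_t* dst) {
  for (size_t i = 0; i < len; ++i) dst[i] = src[len - 1 - i];
}

// Hypothetical stand-ins for architecture-specific kernels. Real versions
// would use NEON or SSE intrinsics; here both defer to the scalar loop.
inline void ByteSwapNeon(const uint8_t* src, size_t len, uint8_t* dst) {
  ByteSwapScalar(src, len, dst);
}
inline void ByteSwapSse(const uint8_t* src, size_t len, uint8_t* dst) {
  ByteSwapScalar(src, len, dst);
}

// Kernel selection written as ordinary control flow, with no #ifdef inside.
inline ByteSwapFn ChooseByteSwap(size_t len) {
  if (len < 16) return ByteSwapScalar;
  if (IsAarch64()) return ByteSwapNeon;
  return ByteSwapSse;
}

inline void ByteSwap(const uint8_t* src, size_t len, uint8_t* dst) {
  assert(src != dst);  // The sketch does not support in-place swapping.
  ChooseByteSwap(len)(src, len, dst);
}
```

Because IsAarch64() is constexpr, a compiler can typically drop the branch that does not apply, so the readable form need not cost anything at run time.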
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
zhaoren...@hotmail.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. Patch Set 41: Hi, Tim, this has been fixed as you requested. -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 41 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 09:22:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions
zhaoren...@hotmail.com has uploaded a new patch set (#41). ( http://gerrit.cloudera.org:8080/15531 ) Change subject: IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions .. IMPALA-9544 Replace Intel's SSE instructions with ARM's NEON instructions Replace Intel's SSE instructions with ARM's NEON instructions Replace Intel's crc32 instructions with ARM's instructions Replace Intel's popcntq instruction with ARM's mechanism Replace Intel's pcmpestri and pcmpestrm instructions with ARM mechanism Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 --- M CMakeLists.txt M be/CMakeLists.txt M be/src/benchmarks/bswap-benchmark.cc M be/src/benchmarks/int-hash-benchmark.cc M be/src/codegen/CMakeLists.txt M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/exec/delimited-text-parser.inline.h M be/src/kudu/util/block_bloom_filter.cc M be/src/kudu/util/group_varint-inl.h M be/src/kudu/util/group_varint-test.cc A be/src/kudu/util/sse2neon.h M be/src/util/bit-util-test.cc M be/src/util/bit-util.cc M be/src/util/bit-util.h M be/src/util/bloom-filter.cc M be/src/util/bloom-filter.h M be/src/util/hash-util-ir.cc M be/src/util/hash-util.h M be/src/util/sse-util.h A be/src/util/sse2neon.h 21 files changed, 3,969 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/15531/41 -- To view, visit http://gerrit.cloudera.org:8080/15531 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id7dfe17125b2910ece54e7dd18b4e4b25d7de8b9 Gerrit-Change-Number: 15531 Gerrit-PatchSet: 41 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
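The commit message above mentions replacing Intel's popcntq instruction with an ARM mechanism. One way such a replacement could look is sketched below, assuming the GCC/Clang builtin __builtin_popcountll is acceptable where available and a portable bit-twiddling fallback elsewhere. The names Popcount and PopcountPortable are illustrative and are not taken from the patch.

```cpp
#include <cstdint>

// Portable fallback: counts set bits with the classic SWAR technique
// (sum adjacent 1-bit, 2-bit and 4-bit fields, then add up the bytes).
inline int PopcountPortable(uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return static_cast<int>((x * 0x0101010101010101ULL) >> 56);
}

// Architecture-neutral entry point. With GCC or Clang the builtin lets the
// compiler pick a suitable instruction sequence for the target (x86-64 or
// AArch64); other compilers use the portable fallback.
inline int Popcount(uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
  return __builtin_popcountll(x);
#else
  return PopcountPortable(x);
#endif
}
```

Keeping the preprocessor check inside one small function means callers never need their own architecture checks.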
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16228 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types) .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6274/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f Gerrit-Change-Number: 16228 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 12 Aug 2020 09:21:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9962: Implement ds_kll_quantiles() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16324 ) Change subject: IMPALA-9962: Implement ds_kll_quantiles() function .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6888/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f Gerrit-Change-Number: 16324 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 12 Aug 2020 08:57:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9962: Implement ds_kll_quantiles() function
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16324 ) Change subject: IMPALA-9962: Implement ds_kll_quantiles() function .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/16324/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test File testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test: http://gerrit.cloudera.org:8080/#/c/16324/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test@357 PS2, Line 357: : QUERY : select : ds_kll_quantiles(ds_kll_sketch(float_col), 0, 0.2, NULL, 0.8, 1) : from functional_parquet.alltypessmall; : CATCH : UDF ERROR: NULL provided in the input list. : : QUERY : select : ds_kll_quantiles(ds_kll_sketch(float_col), 0, 0.2, 0.5, 0.8, NULL) : from functional_parquet.alltypessmall; : CATCH : UDF ERROR: NULL provided in the input list. : : QUERY : select : ds_kll_quantiles(ds_kll_sketch(float_col), NULL) : from > Is this alright that if there is a NULL argument given, then the result is Good catch! I checked Hive and it gives an error in case of a null input. I followed that approach. Done -- To view, visit http://gerrit.cloudera.org:8080/16324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f Gerrit-Change-Number: 16324 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 12 Aug 2020 08:41:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9962: Implement ds_kll_quantiles() function
Hello Adam Tamas, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16324 to look at the new patch set (#3). Change subject: IMPALA-9962: Implement ds_kll_quantiles() function .. IMPALA-9962: Implement ds_kll_quantiles() function This function is very similar to ds_kll_quantile() but this one can receive any number of rank parameters and returns a comma separated string that holds the results for all of the given ranks. For more details about ds_kll_quantile() see IMPALA-9959. Note, this function is meant to return an Array of floats as the result but with that we have to wait for the complex type support. Tracking Jira is IMPALA-9520. Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/datasketches-common.cc M be/src/exprs/datasketches-common.h M be/src/exprs/datasketches-functions-ir.cc M be/src/exprs/datasketches-functions.h M common/function-registry/impala_functions.py M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test 7 files changed, 155 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/16324/3 -- To view, visit http://gerrit.cloudera.org:8080/16324 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I76f6039977f4e14ded89a3ee4bc4e6ff855f5e7f Gerrit-Change-Number: 16324 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
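The behaviour described in this thread (any number of rank parameters, one comma-separated string as the result, and an error when a NULL appears in the input list) can be illustrated with a rough sketch. It assumes an exact lookup on a sorted vector as a stand-in for the KLL sketch, so it does not use the DataSketches library or Impala's UDF interfaces. QuantileAt and QuantilesToString are hypothetical names, std::optional (C++17) models a NULL argument, and the empty-input and out-of-range behaviour is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a sketch's quantile lookup: exact, not approximate.
// 'sorted' must be non-empty and in ascending order.
inline float QuantileAt(const std::vector<float>& sorted, double rank) {
  assert(!sorted.empty());
  if (rank < 0.0 || rank > 1.0) {
    // Hypothetical message; the real function's wording is not shown here.
    throw std::invalid_argument("rank must be within [0, 1]");
  }
  size_t idx = static_cast<size_t>(rank * (sorted.size() - 1));
  return sorted[idx];
}

// Returns the quantiles for all 'ranks' as one comma-separated string.
// A missing rank (modelling SQL NULL) is rejected with the error text quoted
// in the test file discussed in this thread.
inline std::string QuantilesToString(
    std::vector<float> values, const std::vector<std::optional<double>>& ranks) {
  for (const auto& rank : ranks) {
    if (!rank.has_value()) {
      throw std::invalid_argument("NULL provided in the input list.");
    }
  }
  if (values.empty()) return "";  // Assumption: no input rows, empty result.
  std::sort(values.begin(), values.end());
  std::ostringstream out;
  for (size_t i = 0; i < ranks.size(); ++i) {
    if (i > 0) out << ",";
    out << QuantileAt(values, *ranks[i]);
  }
  return out.str();
}
```

For example, calling QuantilesToString with the values 3, 1, 2, 5, 4 and the ranks 0, 0.5 and 1 yields the string "1,3,5", while any std::nullopt among the ranks raises the NULL error before any quantile is computed.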