[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )

Change subject: WIP IMPALA-6636: Use async IO in ORC scanner

Patch Set 13:

(4 comments)

Hi All,

David has run several benchmark runs. The perf numbers seem to improve with this async IO prototype. I will proceed with cleaning up the code and adding a proper commit message. Here are some items that I plan to address next.

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.h
File be/src/exec/hdfs-orc-scanner.h:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.h@123
PS13, Line 123: // ExecEnv::GetInstance()->disk_io_mgr()->max_buffer_size();
Can be removed?

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc@300
PS13, Line 300: // stream_->ReleaseCompletedResources(true);
              : stream_->ReleaseCompletedResources(false);
Calling 'ReleaseCompletedResources(true)' seems to be OK here?

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-scan-node-base.cc@821
PS13, Line 821: // DCHECK_LE(offset + len, GetFileDesc(metadata->partition_id, file)->file_length)
              : //<< "Scan range beyond end of file (offset=" << offset << ", len=" << len << ")";
Can be removed?

http://gerrit.cloudera.org:8080/#/c/15370/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15370/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2134
PS13, Line 2134: for (SlotDescriptor slot: desc_.getSlots()) {
Just a note for ourselves: we found a corner case here for "select count(*)" kinds of queries over ORC. Somehow, desc_.getSlots() is empty in this corner case, but HdfsOrcScanner::StartColumnReading actually sees a couple of streams that are eligible for async read. Patch set 12 already adds a workaround within HdfsOrcScanner::StartColumnReading to TryIncreaseReservation by 8KB (min_buffer_size) for each eligible stream. If it can't increase the reservation, then the rest of the streams will be read synchronously. I will file a follow-up JIRA to document this situation.

-- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Tue, 30 Nov 2021 23:17:22 + Gerrit-HasComments: Yes
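
Editor's sketch of the fallback described in the patch set 13 comment above: try to reserve 8KB per eligible stream, and once a reservation attempt fails, leave the remaining streams on the synchronous path. This is not the code from the patch. FakeReservation, ReadMode, kMinBufferSize and PlanStreamReads are hypothetical names made up for illustration, the TryIncreaseReservation signature here is an assumption that only borrows the name used in the comment, and reading "the rest" as "all remaining streams" is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a buffer reservation tracker (not Impala's class).
class FakeReservation {
 public:
  explicit FakeReservation(int64_t limit) : limit_(limit) {}

  // Grows the reservation by 'bytes' and returns true if that stays within the
  // limit; otherwise leaves the reservation unchanged and returns false.
  bool TryIncreaseReservation(int64_t bytes) {
    if (used_ + bytes > limit_) return false;
    used_ += bytes;
    return true;
  }

  int64_t used() const { return used_; }

 private:
  int64_t limit_;
  int64_t used_ = 0;
};

enum class ReadMode { ASYNC, SYNC };

// 8KB, the min_buffer_size value mentioned in the review comment.
constexpr int64_t kMinBufferSize = 8 * 1024;

// Decides a read mode for each of 'num_streams' eligible streams. Streams get
// ASYNC while the reservation can still grow by kMinBufferSize; after the
// first failed attempt, every remaining stream stays SYNC.
std::vector<ReadMode> PlanStreamReads(size_t num_streams, FakeReservation* reservation) {
  assert(reservation != nullptr);
  std::vector<ReadMode> modes(num_streams, ReadMode::SYNC);
  for (size_t i = 0; i < num_streams; ++i) {
    if (!reservation->TryIncreaseReservation(kMinBufferSize)) break;
    modes[i] = ReadMode::ASYNC;
  }
  return modes;
}
```

With a 20KB limit and three eligible streams, for example, the first two attempts fit and the third does not, so only the last stream would fall back to the synchronous path in this sketch.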
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9804/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 18 Nov 2021 17:22:04 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )

Change subject: WIP IMPALA-6636: Use async IO in ORC scanner

Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-columnar-scanner.cc@239
PS13, Line 239: columnar_scanner_actual_reservation_counter_->UpdateCounter(context_->total_reservation());
line too long (93 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc@208
PS13, Line 208: columnar_scanner_actual_reservation_counter_->UpdateCounter(context_->total_reservation());
line too long (95 > 90)

-- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 18 Nov 2021 17:02:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has uploaded a new patch set (#13) to the change originally created by Csaba Ringhofer. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/scanner-reservation.test 14 files changed, 475 insertions(+), 213 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/13 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 13 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9799/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 17 Nov 2021 22:12:24 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )

Change subject: WIP IMPALA-6636: Use async IO in ORC scanner

Patch Set 12:

Patch set 12 has the following changes:
1. Remove several debugging logs.
2. Adjust resource-requirements.test (from PlannerTest.testResourceRequirements).
3. Add a workaround to increase memory reservation for certain select count cases.

The following queries failed without the workaround from point 3:
select count(*) from complextypes_partitioned.int_array;
select count(*) from complextypes_partitioned.nested_struct.c.d.item inner_array;

-- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 17 Nov 2021 21:54:56 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has uploaded a new patch set (#12) to the change originally created by Csaba Ringhofer. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-query/queries/QueryTest/scanner-reservation.test 14 files changed, 475 insertions(+), 213 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/12 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9794/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 11 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 17 Nov 2021 16:44:18 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9793/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 17 Nov 2021 16:34:58 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 11: Patch set 11 fixes a stream positioning bug and resolves some e2e test failures. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 11 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 17 Nov 2021 16:24:48 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has uploaded a new patch set (#11) to the change originally created by Csaba Ringhofer. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-query/queries/QueryTest/scanner-reservation.test 13 files changed, 517 insertions(+), 201 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/11 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 11 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 10: Patch set 10 is a rebase of patch set 9 onto a recent master branch. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 17 Nov 2021 16:14:16 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Riza Suminto has uploaded a new patch set (#10) to the change originally created by Csaba Ringhofer. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 12 files changed, 494 insertions(+), 195 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/10 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 9: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7464/ -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 10 Sep 2021 04:50:18 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9446/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 22:56:46 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )

Change subject: WIP IMPALA-6636: Use async IO in ORC scanner

Patch Set 9:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@153
PS9, Line 153: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << "reserved: " << min_buffer_size * col_range_lengths.size();
line too long (138 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@158
PS9, Line 158:LOG(INFO) << "col_range_lengths: " << col_range_lengths[i];
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@160
PS9, Line 160:
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@172
PS9, Line 172: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute <<
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@172
PS9, Line 172: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute <<
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@203
PS9, Line 203: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << " bytes to add " << bytes_to_add;
line too long (115 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@203
PS9, Line 203: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << " bytes to add " << bytes_to_add;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-columnar-scanner.cc@209
PS9, Line 209:LOG(INFO) << "column reservation: " << tmp_reservation.second;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@112
PS9, Line 112:
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@113
PS9, Line 113:LOG(INFO) << "Read random from orc. offset: " << offset << " length: " << length;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@124
PS9, Line 124:LOG(INFO) << "Read async orc. offset: " << offset << " length: " << length;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@147
PS9, Line 147:
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@203
PS9, Line 203: unique_ptr stream = stripe.getStreamInformation(stream_id);
line too long (91 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@278
PS9, Line 278: DCHECK(false);
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@289
PS9, Line 289:return status;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@290
PS9, Line 290:}
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@291
PS9, Line 291://LOG(INFO) << "HdfsOrcScanner::ColumnRange::read skipping: " << (offset - position_);
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-orc-scanner.cc@304
PS9, Line 304: //LOG(INFO) << "HdfsOrcScanner::ColumnRange::read stream finished: ";
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/15370/9/be/src/exec/hdfs-scan-node-base.cc@823
PS9, Line 823: if (offset + len > GetFileDesc(metadata->partition_id, file)->file_length) return nullptr;
line too long (92 > 90)

-- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 22:04:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7464/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 22:04:38 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15370 to look at the new patch set (#9). Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 12 files changed, 496 insertions(+), 194 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/9 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 9 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7462/ -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 20:28:40 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9444/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 19:08:46 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7462/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 18:50:24 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7461/ -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 18:47:54 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )

Change subject: WIP IMPALA-6636: Use async IO in ORC scanner

Patch Set 8:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@153
PS8, Line 153: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << "reserved: " << min_buffer_size * col_range_lengths.size();
line too long (138 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@158
PS8, Line 158:LOG(INFO) << "col_range_lengths: " << col_range_lengths[i];
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@160
PS8, Line 160:
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@172
PS8, Line 172: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute <<
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@172
PS8, Line 172: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute <<
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@203
PS8, Line 203: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << " bytes to add " << bytes_to_add;
line too long (115 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@203
PS8, Line 203: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << " bytes to add " << bytes_to_add;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-columnar-scanner.cc@209
PS8, Line 209:LOG(INFO) << "column reservation: " << tmp_reservation.second;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@112
PS8, Line 112:
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@113
PS8, Line 113://LOG(INFO) << "Read random from orc. offset: " << offset << " length: " << length;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@124
PS8, Line 124://LOG(INFO) << "Read async orc. offset: " << offset << " length: " << length;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@145
PS8, Line 145:
line has trailing whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@198
PS8, Line 198: unique_ptr stream = stripe.getStreamInformation(stream_id);
line too long (91 > 90)

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@269
PS8, Line 269: DCHECK(false);
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@280
PS8, Line 280:return status;
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@281
PS8, Line 281:}
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@282
PS8, Line 282://LOG(INFO) << "HdfsOrcScanner::ColumnRange::read skipping: " << (offset - position_);
tab used for whitespace

http://gerrit.cloudera.org:8080/#/c/15370/8/be/src/exec/hdfs-orc-scanner.cc@295
PS8, Line 295: //LOG(INFO) << "HdfsOrcScanner::ColumnRange::read stream finished: ";
tab used for whitespace

-- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 18:45:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15370 to look at the new patch set (#8). Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 483 insertions(+), 191 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/8 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9441/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 13:03:11 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7461/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 12:42:29 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 7: (16 comments) http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc File be/src/exec/hdfs-columnar-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@153 PS7, Line 153: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << "reserved: " << min_buffer_size * col_range_lengths.size(); line too long (138 > 90) http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@158 PS7, Line 158:LOG(INFO) << "col_range_lengths: " << col_range_lengths[i]; tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@160 PS7, Line 160: line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@172 PS7, Line 172: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@172 PS7, Line 172: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@203 PS7, Line 203: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << " bytes to add " << bytes_to_add; line too long (115 > 90) http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@203 PS7, Line 203: //LOG(INFO) << "reservation_to_distribute: " << reservation_to_distribute << " bytes to add " << bytes_to_add; tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-columnar-scanner.cc@209 PS7, Line 209:LOG(INFO) << "column reservation: " << tmp_reservation.second; tab used for whitespace 
http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@112 PS7, Line 112://LOG(INFO) << "Read random from orc. offset: " << offset << " length: " << length; tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@123 PS7, Line 123://LOG(INFO) << "Read async orc. offset: " << offset << " length: " << length; tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@191 PS7, Line 191: unique_ptr stream = stripe.getStreamInformation(stream_id); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@262 PS7, Line 262: DCHECK(false); tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@273 PS7, Line 273:return status; tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@274 PS7, Line 274:} tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@275 PS7, Line 275://LOG(INFO) << "HdfsOrcScanner::ColumnRange::read skipping: " << (offset - position_); tab used for whitespace http://gerrit.cloudera.org:8080/#/c/15370/7/be/src/exec/hdfs-orc-scanner.cc@288 PS7, Line 288: //LOG(INFO) << "HdfsOrcScanner::ColumnRange::read stream finished: "; tab used for whitespace -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 09 Sep 2021 12:40:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15370 to look at the new patch set (#7). Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 476 insertions(+), 191 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/7 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7455/ -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 07 Sep 2021 17:00:55 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/9425/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 07 Sep 2021 11:49:21 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7455/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 07 Sep 2021 11:27:10 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15370 to look at the new patch set (#6). Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 395 insertions(+), 184 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/6 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7454/ -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 06 Sep 2021 16:53:51 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/9423/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 06 Sep 2021 13:09:20 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7454/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 06 Sep 2021 12:47:22 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15370 to look at the new patch set (#5). Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 395 insertions(+), 181 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/5 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7448/ -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 31 Aug 2021 16:15:55 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/9411/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 31 Aug 2021 14:41:56 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7448/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 31 Aug 2021 14:30:17 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7447/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 31 Aug 2021 14:30:08 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15370 to look at the new patch set (#3). Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 395 insertions(+), 181 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/3 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/9257/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 09 Aug 2021 14:58:29 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. Patch Set 2: Note that this was a quite hacky implementation. The problem is that when the ORC lib reads from the file, it only gives us an offset and a length, so we do not know which column (or stream) it is trying to read. So we build a map of ranges beforehand (HdfsOrcScanner::StartColumnReading), try to guess which range to advance during every individual read call, and fall back to sync IO if the read is not what we expected (HdfsOrcScanner::ScanRangeInputStream::read). This seems to work, but changes in the ORC lib can easily lead to "disabling" async scanning by reading in unexpected patterns. The best solution would be to move most of the logic to ORC, so that it would return the ranges to us and identify the given range in every read call. -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 09 Aug 2021 14:44:27 + Gerrit-HasComments: No
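For readers following the thread, the range-guessing idea in the Patch Set 2 comment above can be roughly sketched as follows. This is only an illustration under assumptions, not the code in the patch: RangeTracker, ColumnRange, ReadPath, AddRange and Classify are hypothetical names. The rules shown here (a read must fall entirely inside one pre-registered range and must not go backwards; skipping forward is allowed) are a guess at what "not what we expected" could mean.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical stand-in for one pre-registered column stream range.
struct ColumnRange {
  uint64_t offset;    // file offset where the stream starts
  uint64_t length;    // total length of the stream in bytes
  uint64_t position;  // lowest file offset that may still be read
};

enum class ReadPath { kAsync, kSync };

// Illustrative only: maps (offset, length) reads back to known ranges.
class RangeTracker {
 public:
  // Registered up front, similar in spirit to building the range map
  // before reading starts.
  void AddRange(uint64_t offset, uint64_t length) {
    ranges_[offset] = ColumnRange{offset, length, offset};
  }

  // Called for every read request. Returns kAsync only when the read
  // lies entirely inside one known range and does not move backwards
  // within it. Anything else falls back to the synchronous path.
  ReadPath Classify(uint64_t offset, uint64_t length) {
    // Find the last range whose start is <= offset.
    auto it = ranges_.upper_bound(offset);
    if (it == ranges_.begin()) return ReadPath::kSync;
    ColumnRange& range = std::prev(it)->second;

    const uint64_t range_end = range.offset + range.length;
    if (offset + length > range_end) return ReadPath::kSync;  // outside range
    if (offset < range.position) return ReadPath::kSync;      // backwards read

    // Forward skips are accepted; advance the expected position.
    range.position = offset + length;
    return ReadPath::kAsync;
  }

 private:
  std::map<uint64_t, ColumnRange> ranges_;  // keyed by start offset
};
```

The sketch also shows why the approach is fragile, as the comment notes: any read pattern that crosses a range boundary or rereads earlier bytes silently lands on the sync path.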
[Impala-ASF-CR] WIP IMPALA-6636: Use async IO in ORC scanner
Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15370 ) Change subject: WIP IMPALA-6636: Use async IO in ORC scanner .. WIP IMPALA-6636: Use async IO in ORC scanner Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 --- M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-orc-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-page-reader.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/io/disk-io-mgr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 11 files changed, 386 insertions(+), 188 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/15370/2 -- To view, visit http://gerrit.cloudera.org:8080/15370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074 Gerrit-Change-Number: 15370 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer