[Impala-ASF-CR] IMPALA-5448: fix invalid number of splits reported in Parquet scan node
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8147 ) Change subject: IMPALA-5448: fix invalid number of splits reported in Parquet scan node .. Patch Set 1: Passed Jenkins pre-review-test: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/342/console -- To view, visit http://gerrit.cloudera.org:8080/8147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaacc2d775032f5707061e704f12e0a63cde695d1 Gerrit-Change-Number: 8147 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 28 Sep 2017 06:20:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Alex Behm has abandoned this change. ( http://gerrit.cloudera.org:8080/8150 ) Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. Abandoned Will rework to simplify. -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8150 ) Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. Patch Set 1: I had looked at that library but concluded we would not want to pull it in as-is because it includes a lot of stuff we don't need and also the LZ4 code - but we want to use the LZ4 code from our toolchain. Reimplementing the JNI wrappers we would need seems so simple that it's not worth the trouble pulling the library and integrating it into our build/toolchain. However, looking at the library code more carefully, I think we make the implementation I posted here much simpler, so let me try and rework that. -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 28 Sep 2017 05:36:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: > > I totally agree about doing anything fancy as a followup. In > > particular, I did not find anything at all useful in AVX to help. > > There is no vectored multiply large enough for this, and FMA > > operations won't help either as they basically only help with > > precision loss on floating point types. > > > > Considering the perf difference is so extreme in some cases, I > > think we should strongly consider either living with the broken > > behavior until we can come up with a solution, or else verifying > > with users making use of the DECIMAL_V2 feature that this will > not > > be a problem. > > AVX2 has 32x8 times 32x8 multiplication. I have elsewhere simulated > 64-bit multiplication with this, combines with additions and > shifts, and received a speedup on 64x4 times 64x4 operations. Oh, and AVX512 has vpmullq for multiplying vectors of 64-bit values. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 28 Sep 2017 01:56:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 9: Pre-review-test run of EE tests using IMPALA-5986 fix pass with blue. -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 27 Sep 2017 21:05:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/7438 ) Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication .. Patch Set 6: > I totally agree about doing anything fancy as a followup. In > particular, I did not find anything at all useful in AVX to help. > There is no vectored multiply large enough for this, and FMA > operations won't help either as they basically only help with > precision loss on floating point types. > > Considering the perf difference is so extreme in some cases, I > think we should strongly consider either living with the broken > behavior until we can come up with a solution, or else verifying > with users making use of the DECIMAL_V2 feature that this will not > be a problem. AVX2 has 32x8 times 32x8 multiplication. I have elsewhere simulated 64-bit multiplication with this, combines with additions and shifts, and received a speedup on 64x4 times 64x4 operations. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-Change-Number: 7438 Gerrit-PatchSet: 6 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 27 Sep 2017 20:31:02 + Gerrit-HasComments: No
[Impala-ASF-CR] Move tests related to the old join node.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8153 ) Change subject: Move tests related to the old join node. .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1271/ -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 27 Sep 2017 19:20:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5870: Improve runtime profile for partial sort
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8123 ) Change subject: IMPALA-5870: Improve runtime profile for partial sort .. IMPALA-5870: Improve runtime profile for partial sort A recent change (IMPALA-5498) added the ability to do partial sorts, which divide their input up into runs each of which is sorted individually, avoiding the need to spill. Some of the debug output wasn't updated vs. regular sorts, leading to confusion. This patch removes the counters 'SpilledRuns' and 'MergesPerformed' since they will always be 0, and it renames the 'IntialRunsCreated' counter to 'RunsCreated' since the 'Initial' refers to the fact that in a regular sort those runs may be spilled or merged. It also adds a profile info string 'SortType' that can take the values 'Total', 'TopN', or 'Partial' to reflect the type of exec node being used. Example profile snippet for a partial sort: SORT_NODE (id=2):(Total: 403.261us, non-child: 382.029us, % non-child: 94.73%) SortType: Partial ExecOption: Codegen Enabled - NumRowsPerRun: (Avg: 44 (44) ; Min: 44 (44) ; Max: 44 (44) ; Number of samples: 1) - InMemorySortTime: 34.201us - PeakMemoryUsage: 2.02 MB (2117632) - RowsReturned: 44 (44) - RowsReturnedRate: 109.11 K/sec - RunsCreated: 1 (1) - SortDataSize: 572.00 B (572) Testing: - Manually ran several sorting queries and inspected their profiles - Updated a kudu_insert test that relied on the 'SpilledRuns' counter to be 0 for a partial sort. Change-Id: I2b15af78d8299db8edc44ff820c85db1cbe0be1b Reviewed-on: http://gerrit.cloudera.org:8080/8123 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/exec/partial-sort-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/runtime/sorter.cc M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 5 files changed, 11 insertions(+), 4 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I2b15af78d8299db8edc44ff820c85db1cbe0be1b Gerrit-Change-Number: 8123 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5870: Improve runtime profile for partial sort
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8123 ) Change subject: IMPALA-5870: Improve runtime profile for partial sort .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2b15af78d8299db8edc44ff820c85db1cbe0be1b Gerrit-Change-Number: 8123 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 18:55:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 9: I've started a run of the full EE suite in pre_review_test to verify framework change does not affect other tests. -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 27 Sep 2017 17:53:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 9: > Uploaded patch set 9. Pre-review build https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/344/ shows no flappers! -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Wood Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 27 Sep 2017 17:47:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Hello Matthew Mulder, Michael Brown, David Knupp, Alex Behm, Mostafa Mokhtar, Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8102 to look at the new patch set (#9). Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. Main source for TPCDS query and result definitions: https://github.com/gregrahn/tpcds-kit. TPC-DS v2.5.0 qualification queries from G. Rahn, Cloudera, Inc. Data set constructed in mini-cluster using $IMPALA_HOME/buildall.sh -testdata This commit continues previous work on IMPALA-5376 in the ASF Impala repo and the Cloudera Gerrit service. This commit splits multi-query tests in the TPC-DS suite definition into one query and result set per test file, as the test framework requires. Names for such files have -1, -2... inner suffixes. The complete TPC-DS test suite runs with passes, skips and xfails, but no failures, as reflected by runs of $IMPALA_HOME/tests/run-tests.py query_test/test_tpcds_queries.py ... Expected result sets come from the TPC-DS kit. Some TPC-DS test cases in this commit have been modified in sematically-neutral ways so as to pass on Impala; others are marked to skip or xfail due to bugs. Tests that flap are marked to skip, with a bug ID, since they don't reliably pass or xfail. The tests/query_test/test_tpcds_queries.py driver file is authoritative for the active/skip/xfail status for each case and a brief reason. The following list describes the current status as: --- test-name deviance from TPC-DS spec changes made --- tpcds-q22a.test RESULT MISMATCH in LSD of AVG() values Fixed AVG()s --- tpcds-q30.test UNRECOGNIZED CHARACTER MARKED XFAIL, IMPALA-5961. --- tpcds-q31.test RESULT MISMATCH in LSD of DECIMAL values ADDED TRUNCATE(2)s AROUND LAST 4 COLUMNS. MARKED SKIP, IMPALA-5956 --- tpcds-q35a.test RESULT MISMATCH MARKED XFAIL, IMPALA-5950. --- tpcds-q36a.test RESULT MISMATCH MARKED XFAIL, IMPALA-4741 --- tpcds-q39.test MULTIPLE RESULT SET not recognized by test framework MARKED XFAIL. --- tpcds-q47.test RESULT MISMATCH in LSD of DECIMAL values ADDED TRUNCATE(2) TO 8th COLUMN OF WITH TABLE, TAKE ACTUAL RESULT AS EXPECTED. --- tpcds-q49.test RESULT MISMATCH in LSD of DECIMAL values MARKED XFAIL, IMPALA-5945 --- tpcds-q57.test RESULT MISMATCH, excess scale in DECIMAL values FIXED, ADDED TRUNCATE(2) AROUND 6th COLUMN. --- tpcds-q58.test RESULT MISMATCH in DECIMAL values MARKED XFAIL. IMPALA-5946 --- tpcds-q59.test RESULT MISMATCH, excess scale in DECIMAL values FIXED, ADDED TRUNCATE(2) AROUND 4th-10th COLUMNS. --- tpcds-q61.test RESULT MISMATCH in DECIMAL value FIXED. CAST RESULT QUOTIENT TO DECIMAL(15, 4), TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q63.test RESULT MISMATCH, excess scale in DECIMAL values ADDED TRUNCATE(2) TO 3rd COLUMN --- tpcds-q64.test RESULT MISMATCH ADDED ORDER BY COLUMNS. --- tpcds-q66.test RESULT MISMATCH MARKED XFAIL, IMPALA-4741 --- tpcds-q77a.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q78.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q83.test RESULT MISMATCH MARKED XFAIL. IMPALA-5945. --- tpcds-q85.test MISSING TABLE "reason" MARKED XFAIL, IMPALA-5960 --- tpcds-q86a.test RESULT MISMATCH FIXED. TAKE ACTUAL RESULT AS EXPECTED --- tpcds-q89.test RESULT MISMATCH, DECIMAL values flap MARKED XFAIL. ADDED ROUND(2) TO 8th COLUMN, TAKE ACTUAL RESULTS AS EXPECTED, IMPALA-5956. --- tpcds-q90.test RESULT MISMATCH MARKED XFAIL, IMPALA-5945. --- tpcds-q93.test MISSING TABLE "reason" MARKED XFAIL, IMPALA-5960 --- tpcds-q98.test RESULT MISMATCH FIXED, ADDED ROUND() TO LAST COLUMN IMPALA-5986: Allow SET option names to contain digits when resetting them between queries. Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 --- A testdata/workloads/tpcds/queries/tpcds-q10.test A testdata/workloads/tpcds/queries/tpcds-q10a.test A testdata/workloads/tpcds/queries/tpcds-q11.test A testdata/workloads/tpcds/queries/tpcds-q12.test A testdata/workloads/tpcds/queries/tpcds-q13.test A testdata/workloads/tpcds/queries/tpcds-q14-1.test A testdata/workloads/tpcds/queries/tpcds-q14-2.test A testdata/workloads/tpcds/queries/tpcds-q14a-1.test A testdata/workloads/tpcds/queries/tpcds-q14a-2.test A testdata/workloads/tpcds/queries/tpcds-q15.test A testdata/workloads/tpcds/queries/tpcds-q16.test A testdata/workloads/tpcds/queries/tpcds-q17.test A testdata/workloads/tpcds/queries/tpcds-q18.test A testdata/workloads/tpcds/queries/tpcds-q18a.test A testdata/workloads/tpcds/queries/tpcds-q20.test A testdata/workloads/tpcds/queries/tpcds-q21.test A testdata/workloads/tpcds/queries/tpcds-q22.test A testdata/workloads/tpcds/queries/tpcds-q22a.test M testdata/workloads/tpcds/queries/tpcds-q23-1.test M testdata/workloads/tpcds/queries/tpcds-q23-2.test A testdata/workloads/tpcds/queries/tpcds-q24-1.test A
[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 ) Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala .. Patch Set 5: Ping? -- To view, visit http://gerrit.cloudera.org:8080/7954 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff Gerrit-Change-Number: 7954 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 27 Sep 2017 17:20:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5983: Fix crash in to/from utc timestamp("10:00:00", 'MSK')
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8139 to look at the new patch set (#3). Change subject: IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') .. IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') Moscow timezone is handled differrently than other timezones, and it was possible to cause unhandled boost exception by trying to convert "dateless" timestamps like "10:00:00". These timestamps cannot be handled by timestamp conversion generally, because daylight saving time logic needs month and day to work correctly. The condition HasDateOrTime() incorrectly suggested that these timestamps can be handled, so I made the condition stricter. Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d --- M be/src/exprs/timestamp-functions.cc M tests/query_test/test_timezones.py 2 files changed, 9 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/8139/3 -- To view, visit http://gerrit.cloudera.org:8080/8139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d Gerrit-Change-Number: 8139 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44 PS6, Line 44: There is a significant regression (50% increase in runtime) in > I'm pretty sure it's the memory copying - making a memory allocation should Makes sense. Might still be worth doing a quick experiment with MT_DOP to see if the regression remains the same (MT_DOP is easier to reason about). I'm definitely in favor of this change, but uncompressed Parquet with a lot of string columns is unfortunately very common, despite our recommendations. -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 16:25:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 6: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/hdfs-parquet-scanner.cc@245 PS5, Line 245: context_->ReleaseCompletedResources(nullptr, true); > I can obviously fix it if you feel strongly but it just doesn't seem that i Ah, I understand. http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/8085/5/be/src/exec/parquet-column-readers.h@477 PS5, Line 477: int64_t size, const char* err_ctx, uint8_t** buffer); > Done. Also switch to const char* so we don't need to create a std::string f Thx. ctx somewhat reminds me of the various context objects that we use elsewhere, but I don't feel strongly about it. -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 16:20:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5990: Part 1: JNI-based LZ4 de/compression
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/8150 ) Change subject: IMPALA-5990: Part 1: JNI-based LZ4 de/compression .. Patch Set 1: It looks like libraries like https://github.com/lz4/lz4-java exist to do this sort of thing. What's our thinking on using them instead of implementing our own wrappers? -- To view, visit http://gerrit.cloudera.org:8080/8150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I237802944875b07080db0159ff8ec548150fd95e Gerrit-Change-Number: 8150 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 27 Sep 2017 16:20:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44 PS6, Line 44: There is a significant regression (50% increase in runtime) in > Just to clarify my point. My hope is that most of the overhead may ultimate I'm pretty sure it's the memory copying - making a memory allocation should at least an order of magnitude cheaper than doing a pass over a data page. Unsure if the difference is due to the extra instructions executed or the increase in cache pressure from having two copies of the data. I haven't measured but I'm also not convinced that the Disk IO Mgr's buffer caching is necessarily more efficient or scalable than TCMalloc. The IO mgr just has a global lock whereas TCMalloc has locks per size class plus batching via the thread cache. The buffer pool should be more scalable for large allocations than either in any case. The queries that regressed are close to the worst possible case since they don't do any work aside from materialising the strings and evaluating a conjunct. Plus the data is already present in the buffer cache. -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 16:09:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44 PS6, Line 44: There is a significant regression (50% increase in runtime) in > Maybe the main difference is that copying into a mempool will ultimately fr Just to clarify my point. My hope is that most of the overhead may ultimately go away when backing the mem pool by the buffer pool (which I presume can recycle buffers?). -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 15:42:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44 PS6, Line 44: There is a significant regression (50% increase in runtime) in > Do we understand where the regression comes from exactly? Somehow I find it Maybe the main difference is that copying into a mempool will ultimately free the copied memory, but I/O buffers may be recycled and would not take the tcmalloc hit from freeing from another thread? -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 15:38:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8085 ) Change subject: IMPALA-5307: part 1: don't transfer disk I/O buffers out of parquet .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8085/6//COMMIT_MSG@44 PS6, Line 44: There is a significant regression (50% increase in runtime) in Do we understand where the regression comes from exactly? Somehow I find it hard to believe a 50% overhead just from copying. Is it maybe another tcmalloc issue where different threads allocate/free the memory? Have you tried a similar experiment with MT_DOP? -- To view, visit http://gerrit.cloudera.org:8080/8085 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I767c1e2dabde7d5bd7a4d5c1ec6d14801b8260d2 Gerrit-Change-Number: 8085 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 15:34:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 3: remove TODO from RCFile
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8151 ) Change subject: IMPALA-5307: Part 3: remove TODO from RCFile .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I594bb246cab64e15de750114890881a2ad9f504d Gerrit-Change-Number: 8151 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Wed, 27 Sep 2017 15:27:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 3: remove TODO from RCFile
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8151 Change subject: IMPALA-5307: Part 3: remove TODO from RCFile .. IMPALA-5307: Part 3: remove TODO from RCFile Our RCFile implementations already copies out data (it sets set_contains_tuple_data to false). Remove a TODO that suggests undoing this. The current implementation is suboptimal but improving RCFile performance is not a priority. Change-Id: I594bb246cab64e15de750114890881a2ad9f504d --- M be/src/exec/hdfs-rcfile-scanner.cc 1 file changed, 0 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8151/1 -- To view, visit http://gerrit.cloudera.org:8080/8151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I594bb246cab64e15de750114890881a2ad9f504d Gerrit-Change-Number: 8151 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro
Tim Armstrong has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/8146 ) Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro .. IMPALA-5307: Part 2: copy out strings in uncompressed Avro The approach is to re-materialize strings in those tuples that survive conjunct evaluation and may reference disk I/O buffers directly. This means that perf should not regress for the following cases: * Compressed Avro files. * Non-string columns. * Selective scans where the majority of tuples are filtered out. This approach will also work for the Sequence and Text scanners. Includes some improvements to Avro codegen to replace more constants to help win back some performance (with limited success): replaced InitTuple() with an optimised version and substituted tuple_byte_size() with a constant. Removes dead code for handling CHAR(n) - CHAR(n) is now always fixed length. Perf: Did microbenchmarks on uncompressed Avro files, one with all columns from lineitem and one with only l_comment. Tests were run with: set num_scanner_threads=1; I ran the query 5 times and extracted MaterializeTupleTime from the profile to measure CPU cost of materialization. Overall string materialization got significantly slower, mainly because of the extra memcpy() calls required. Selecting one string from a table with multiple columns: select min(l_comment) from biglineitem_avro 1.814 -> 2.096 Selecting one string from a table with one column: select min(l_comment) from biglineitem_comment; profile; 1.708 -> 3.7 Selecting one string from a table with one column with predicate: select min(l_comment) from biglineitem_comment where length(l_comment) > 1; 1.691 -> 1.449 Selecting all columns: select min(l_orderkey), min(l_partkey), min(l_suppkey), min(l_linenumber), min(l_quantity), min(l_extendedprice), min(l_discount), min(l_tax), min(l_returnflag), min(l_linestatus), min(l_shipdate), min(l_commitdate), min(l_receiptdate), min(l_shipinstruct), min(l_shipmode), min(l_comment) from biglineitem_avro; profile; 2.335 -> 3.711 Selecting an int column (no strings): select min(l_linenumber) from biglineitem_avro 1.806 -> 1.819 Testing: Ran exhaustive tests. Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/common/status.cc M be/src/common/status.h M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/runtime/CMakeLists.txt M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h A be/src/runtime/tuple-ir.cc M be/src/runtime/tuple.cc M be/src/runtime/tuple.h 19 files changed, 329 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/8146/7 -- To view, visit http://gerrit.cloudera.org:8080/8146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233 Gerrit-Change-Number: 8146 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-5988: optimise MemPool::TryAllocate()
Tim Armstrong has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8145 ) Change subject: IMPALA-5988: optimise MemPool::TryAllocate() .. IMPALA-5988: optimise MemPool::TryAllocate() Testing: Ran core tests. Perf: Experiments using this on top of a WIP Avro patch for IMPALA-5307 showed noticable improvements in CPU efficiency - up to 10% Change-Id: I088012084fe535a67a3cd1103ced5a02027f961a --- M be/src/exec/analytic-eval-node.cc M be/src/exec/data-source-scan-node.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/text-converter.inline.h M be/src/exprs/scalar-expr-evaluator.cc M be/src/runtime/mem-pool.h 6 files changed, 30 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/8145/2 -- To view, visit http://gerrit.cloudera.org:8080/8145 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I088012084fe535a67a3cd1103ced5a02027f961a Gerrit-Change-Number: 8145 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4252: Move runtime filters to ScanNode
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8148 ) Change subject: IMPALA-4252: Move runtime filters to ScanNode .. Patch Set 1: (4 comments) Looks fine, just very minor comments. It looks like there weren't really any changes to the code aside from the move and method signature change. http://gerrit.cloudera.org:8080/#/c/8148/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8148/1//COMMIT_MSG@12 PS1, Line 12: Was the move totally mechanical? It would be good to mention if there were any changes required to the logic that need closer scrutiny. http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.h File be/src/exec/scan-node.h: http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.h@195 PS1, Line 195: time_ms Comment needs an update - this is no longer an argument. http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.h@198 PS1, Line 198: RuntimeState* state Both HdfsScanNodeBase and KuduScanNodeBase have a "RuntimeState* runtime_state_" member. Might be cleaner to just move that member to ScanNode and avoid the inconsistency of passing RuntimeState* via arguments some of the time. http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/8148/1/be/src/exec/scan-node.cc@79 PS1, Line 79: // TODO: Move this to Prepare() Does this comment still make sense? I wish whoever added it had mentioned why it wasn't already moved to Prepare(). -- To view, visit http://gerrit.cloudera.org:8080/8148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17bdc869046dc2cd837d02f333441fa6324ff086 Gerrit-Change-Number: 8148 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 15:19:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Move tests related to the old join node.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8153 ) Change subject: Move tests related to the old join node. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1271/ -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 27 Sep 2017 15:16:57 + Gerrit-HasComments: No
[Impala-ASF-CR] Move tests related to the old join node.
Hello anujphadke, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8153 to look at the new patch set (#2). Change subject: Move tests related to the old join node. .. Move tests related to the old join node. No tests were added/dropped or modified. They are consolidated into fewer .test files. Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c --- M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/inline-view.test D testdata/workloads/functional-query/queries/QueryTest/joins-partitioned.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test D testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test M tests/query_test/test_join_queries.py M tests/query_test/test_runtime_filters.py 7 files changed, 126 insertions(+), 136 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/8153/2 -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] Move tests related to the old join node.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8153 ) Change subject: Move tests related to the old join node. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 27 Sep 2017 15:16:05 + Gerrit-HasComments: No
[Impala-ASF-CR] Move tests related to the old join node.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8153 ) Change subject: Move tests related to the old join node. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8153/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8153/1//COMMIT_MSG@8 PS1, Line 8: > It would be good to mention that the tests were only moved. Done -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 27 Sep 2017 15:15:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5951: test catalogd timeout fails to cause expected exception
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8154 Change subject: IMPALA-5951: test_catalogd_timeout fails to cause expected exception .. IMPALA-5951: test_catalogd_timeout fails to cause expected exception test_catalogd_timeout sets a Kudu operation timeout of 1ms and then performs various Kudu operations which it expects to fail due to a timeout. Since the test was written, things have sped up - for example, Impala used to create a new Kudu client for each operation, but that was changed in IMPALA-5167, such that the operations now occasionally complete quickly enough that they don't timeout. This patch disables all but the first test. The remaining tests can likely be reactivated once we have a way of invalidating clients (IMPALA-5685). Change-Id: I29fd67d0acc0ee15943c416f2179ad716d2cac05 --- M testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test 1 file changed, 13 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/8154/1 -- To view, visit http://gerrit.cloudera.org:8080/8154 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I29fd67d0acc0ee15943c416f2179ad716d2cac05 Gerrit-Change-Number: 8154 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] Test cleanup related to the old join node.
anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/8153 ) Change subject: Test cleanup related to the old join node. .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Wed, 27 Sep 2017 15:07:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5983: Fix crash in to/from utc timestamp("10:00:00", 'MSK')
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/8139 ) Change subject: IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') .. Patch Set 2: > Thanks for fixing this. Can you add some tests? I have added a new check an end-to-end tests. -- To view, visit http://gerrit.cloudera.org:8080/8139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d Gerrit-Change-Number: 8139 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Wed, 27 Sep 2017 15:05:31 + Gerrit-HasComments: No
[Impala-ASF-CR] Test cleanup related to the old join node.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8153 ) Change subject: Test cleanup related to the old join node. .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8153/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8153/1//COMMIT_MSG@8 PS1, Line 8: It would be good to mention that the tests were only moved. -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 15:06:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Test cleanup related to the old join node.
Alex Behm has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8153 Change subject: Test cleanup related to the old join node. .. Test cleanup related to the old join node. Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c --- M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/inline-view.test D testdata/workloads/functional-query/queries/QueryTest/joins-partitioned.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test D testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test M tests/query_test/test_join_queries.py M tests/query_test/test_runtime_filters.py 7 files changed, 126 insertions(+), 136 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/8153/1 -- To view, visit http://gerrit.cloudera.org:8080/8153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Idda4b34b5e6e9b5012b177a4c00077aa7fec394c Gerrit-Change-Number: 8153 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm
[Impala-ASF-CR] IMPALA-5983: Fix crash in to/from utc timestamp("10:00:00", 'MSK')
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8139 to look at the new patch set (#2). Change subject: IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') .. IMPALA-5983: Fix crash in to/from_utc_timestamp("10:00:00", 'MSK') Moscow timezone is handled differrently than other timezones, and it was possible to cause unhandled boost exception by trying to convert "dateless" timestamps like "10:00:00". These timestamps cannot be handled by timestamp conversion generally, because daylight saving time logic needs month and day to work correctly. The condition HasDateOrTime() incorrectly suggested that these timestamps can be handled, so I made the condition stricter. Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d --- M be/src/exprs/timestamp-functions.cc M tests/query_test/test_timezones.py 2 files changed, 9 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/8139/2 -- To view, visit http://gerrit.cloudera.org:8080/8139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I592389333a16a15a680beed389e2702dfc16fe1d Gerrit-Change-Number: 8139 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-5870: Improve runtime profile for partial sort
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8123 ) Change subject: IMPALA-5870: Improve runtime profile for partial sort .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1270/ -- To view, visit http://gerrit.cloudera.org:8080/8123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2b15af78d8299db8edc44ff820c85db1cbe0be1b Gerrit-Change-Number: 8123 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 27 Sep 2017 14:55:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5310: Add COMPUTE STATS TABLESAMPLE.
Balazs Jeszenszky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8136 ) Change subject: IMPALA-5310: Add COMPUTE STATS TABLESAMPLE. .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/8136/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8136/1//COMMIT_MSG@24 PS1, Line 24: Limitations add: only for HDFS tables http://gerrit.cloudera.org:8080/#/c/8136/1//COMMIT_MSG@25 PS1, Line 25: --enable_stats_extrapolation=true Is this really required to be a startup option, wouldn't a query option (SET) work? I expect it will be enabled most of the time, and disabling it would be a troubleshooting step or workaround for some issues, particularly in planning. It would be better not to have to restart to do that. http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@67 PS1, Line 67: Stats extrapolation disabled How would Impala behave if stats are computed with TABLESAMPLE, and then stats extrapolation is disabled? http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/8136/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@777 PS1, Line 777: if (totalBytes_ == 0) { totalBytes_ doesn't get a new value in the above section, why not make this check at the beginning, as it was? -- To view, visit http://gerrit.cloudera.org:8080/8136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f3e72471ac563adada4a4156033a85852b7c8b7 Gerrit-Change-Number: 8136 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 27 Sep 2017 09:58:19 + Gerrit-HasComments: Yes