[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/336/ -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Alex Behm has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Jim Apple has posted comments on this change. Change subject: IMPALA-2020: Inline big number strings .. Patch Set 12: Code-Review+1 Thanks for simplifying so much. -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 256: // actually fail once we have variable-length pages. > I just kept this logic from the old BufferedTupleStream. How would the stream use more reservation after UnpinStream()? Wouldn't we spend a reservation on the read and write pages both when the stream is pinned and then unpinned? I need to read the code through some more, but it seems to that a fair amount of complexity comes from sharing the reservation when they overlap. If we do end up sticking with this scheme, how about verifying the pin count is exactly 1 for the read/write page(s)? http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS6, Line 112: DCHECK_LE(read_ptr_, read_end_ptr); would be nice to write this and line 106 the same to make it a little faster to read Line 252: if (!read_page_->is_pinned()) { why does this not have to check if read and write page are the same? Line 276: if (!pinned_ && write_page_ != _.front()) { this could use a comment: // pages_.front() will become the read_page_. Line 332: && ( == write_page_ || (read_write_ && == &*read_page_))) { why is this right to unpin the read_page_ if !read_write_? http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS6, Line 517: CalcNumPinned this looks dead -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Dan Hecht has posted comments on this change. Change subject: IMPALA-2020: Inline big number strings .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/5902/12/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: Line 42: // This means the multiplication can cause an unwanted decimal overflow. > What does this mean in terms of regressions when decimal_v2=false? Will we incorrectly overflow too soon now? Or was the old code not overflowing soon enough? Or something different? -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Zach Amsden has posted comments on this change. Change subject: IMPALA-2020: Inline big number strings .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/5902/12/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: Line 42: // This means the multiplication can cause an unwanted decimal overflow. This comment was incorrect. The utility I was using to convert large numbers to binary apparently went through a double conversion itself somewhere, and truncated all binary numbers to the leading 53 bit representation, in turn leading me to believe we had precision available that was not utilized. -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5015: Run parquet stats test.py with mt dop > 0
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5015: Run parquet_stats_test.py with mt_dop > 0 .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/335/ -- To view, visit http://gerrit.cloudera.org:8080/6232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4f1a17708ba3aa5dd0bf2090e2ac71eff436b86c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 5: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/334/ -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Lars Volker has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Lars Volker has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 3: Rebased. Carrying Tim's +2. -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Lars Volker has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 3: (2 comments) Thanks for your review. I addressed the comments in PS4. Will rebase next. http://gerrit.cloudera.org:8080/#/c/6226/3/be/src/exec/parquet-column-stats.cc File be/src/exec/parquet-column-stats.cc: Line 39: return ret; > return true? Done Line 51: return ret; > return true? Done -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6226 to look at the new patch set (#4). Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. IMPALA-5008: Fix reading stats for TINYINT and SMALLINT TINYINT and SMALLINT types use 1 and 2 byte slots respectively. However, statistics for the corresponding INT_8 and INT_16 Parquet types are encoded using 4 bytes. When reading back we were missing a conversion to the smaller types, thus overwriting the memory behind them. This was caught by the address sanitizer. The fix is to perform the necessary conversion. Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 --- M be/src/exec/parquet-column-stats.cc M be/src/runtime/coordinator.cc 2 files changed, 26 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6226/4 -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage .. Patch Set 11: After talking to Alex, I completely rewrote the patch and reran the benchmarks. The state class no longer has a trailing var-len array. The samples array is stored in a separate memory allocation. I think this approach is much more straightforward and is simpler to understand. -- To view, visit http://gerrit.cloudera.org:8080/6025 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage
Taras Bobrovytsky has uploaded a new patch set (#11). Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage .. IMPALA-4787: Optimize APPX_MEDIAN() memory usage Before this change, ReservoirSample functions (such as APPX_MEDIAN()) allocated memory for 20,000 elements up front per grouping key. This caused inefficient memory usage for aggregations with many grouping keys. This patch fixes this by initially allocating memory for 16 elements. Once the buffer becomes full, we reallocate a new buffer with double capacity and copy the original buffer into the new one. We continue doubling the buffer size until the buffer has room for 20,000 elements as before. Testing: Added some EE APPX_MEDIAN() tests on larger datasets that exercise the resize code path. Perf Benchrmark (about 35,000 elements per bucket): SELECT MAX(a) from ( SELECT c1, appx_median(c2) as a FROM benchmark GROUP BY c1) t BEFORE: 11s067ms Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail - 06:AGGREGATE1 124.726us 124.726us 1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 29.544us 29.544us 3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 86.406us 120.372us 3 1 44.00 KB 10.00 MB 04:AGGREGATE31s840ms2s824ms 2.00K -1 1.02 GB 128.00 MB FINALIZE 03:EXCHANGE 31s163ms1s989ms 6.00K -1 0 0 HASH(c1) 01:AGGREGATE33s356ms3s416ms 6.00K -1 1.95 GB 128.00 MB STREAMING 00:SCAN HDFS3 64.962ms 65.490ms 65.54M -1 25.97 MB 64.00 MB tpcds_10_parquet.benchmark AFTER: 9s465ms Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail 06:AGGREGATE1 73.961us 73.961us 1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 18.101us 18.101us 3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 75.795us 83.969us 3 1 44.00 KB 10.00 MB 04:AGGREGATE31s608ms 2s683ms 2.00K -1 1.02 GB 128.00 MB FINALIZE 03:EXCHANGE 3 826.683ms 1s322ms 6.00K -1 0 0 HASH(c1) 01:AGGREGATE32s457ms 2s672ms 6.00K -1 3.14 GB 128.00 MB STREAMING 00:SCAN HDFS3 81.514ms 89.056ms 65.54M -1 25.94 MB 64.00 MB tpcds_10_parquet.benchmark Memory Benchmark (about 12 elements per bucket): SELECT MAX(a) FROM ( SELECT ss_customer_sk, APPX_MEDIAN(ss_sold_date_sk) as a FROM tpcds_parquet.store_sales GROUP BY ss_customer_sk) t BEFORE: 7s477ms Operator #Hosts Avg Time Max Time#Rows Est. #Rows Peak Mem Est. Peak Mem Detail - 06:AGGREGATE1 114.686us 114.686us1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 18.214us 18.214us3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 147.055us 165.464us3 1 28.00 KB 10.00 MB 04:AGGREGATE32s043ms2s147ms 14.82K -1 4.94 GB 128.00 MB FINALIZE 03:EXCHANGE 3 840.528ms 943.254ms 15.61K -1 0 0 HASH(ss_customer_sk) 01:AGGREGATE31s769ms1s869ms 15.61K -1 5.32 GB 128.00 MB STREAMING 00:SCAN HDFS3 17.941ms 37.109ms 183.59K -1 1.94 MB 16.00 MB tpcds_parquet.store_sales AFTER: 434ms Operator #Hosts Avg Time Max Time#Rows Est. #Rows Peak Mem Est. Peak Mem Detail - 06:AGGREGATE1 125.915us 125.915us1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 72.179us 72.179us3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 79.054us 83.385us3 1 28.00 KB 10.00 MB 04:AGGREGATE36.559ms7.669ms 14.82K -1 17.32 MB 128.00 MB FINALIZE 03:EXCHANGE 3 67.370us 85.068us 15.60K -1 0 0 HASH(ss_customer_sk) 01:AGGREGATE3 19.245ms 24.472ms 15.60K -1 9.48 MB 128.00 MB STREAMING 00:SCAN HDFS3 53.173ms 55.844ms 183.59K -1 1.18 MB 16.00 MB tpcds_parquet.store_sales
[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage
Hello Matthew Jacobs, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6025 to look at the new patch set (#11). Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage .. IMPALA-4787: Optimize APPX_MEDIAN() memory usage Before this change, ReservoirSample functions (such as APPX_MEDIAN()) allocated memory for 20,000 elements up front per grouping key. This caused inefficient memory usage for aggregations with many grouping keys. This patch fixes this by initially allocating memory for 16 elements. Once the buffer becomes full, we reallocate a new buffer with double capacity and copy the original buffer into the new one. We continue doubling the buffer size until the buffer has room for 20,000 elements as before. Testing: Added some EE APPX_MEDIAN() tests on larger datasets that exercise the resize code path. Perf Benchrmark (about 35,000 elements per bucket): SELECT MAX(a) from ( SELECT c1, appx_median(c2) as a FROM benchmark GROUP BY c1) t BEFORE: 11s067ms Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail - 06:AGGREGATE1 124.726us 124.726us 1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 29.544us 29.544us 3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 86.406us 120.372us 3 1 44.00 KB 10.00 MB 04:AGGREGATE31s840ms2s824ms 2.00K -1 1.02 GB 128.00 MB FINALIZE 03:EXCHANGE 31s163ms1s989ms 6.00K -1 0 0 HASH(c1) 01:AGGREGATE33s356ms3s416ms 6.00K -1 1.95 GB 128.00 MB STREAMING 00:SCAN HDFS3 64.962ms 65.490ms 65.54M -1 25.97 MB 64.00 MB tpcds_10_parquet.benchmark AFTER: 9s465ms Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail 06:AGGREGATE1 73.961us 73.961us 1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 18.101us 18.101us 3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 75.795us 83.969us 3 1 44.00 KB 10.00 MB 04:AGGREGATE31s608ms 2s683ms 2.00K -1 1.02 GB 128.00 MB FINALIZE 03:EXCHANGE 3 826.683ms 1s322ms 6.00K -1 0 0 HASH(c1) 01:AGGREGATE32s457ms 2s672ms 6.00K -1 3.14 GB 128.00 MB STREAMING 00:SCAN HDFS3 81.514ms 89.056ms 65.54M -1 25.94 MB 64.00 MB tpcds_10_parquet.benchmark Memory Benchmark (about 12 elements per bucket): SELECT MAX(a) FROM ( SELECT ss_customer_sk, APPX_MEDIAN(ss_sold_date_sk) as a FROM tpcds_parquet.store_sales GROUP BY ss_customer_sk) t BEFORE: 7s477ms Operator #Hosts Avg Time Max Time#Rows Est. #Rows Peak Mem Est. Peak Mem Detail - 06:AGGREGATE1 114.686us 114.686us1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 18.214us 18.214us3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 147.055us 165.464us3 1 28.00 KB 10.00 MB 04:AGGREGATE32s043ms2s147ms 14.82K -1 4.94 GB 128.00 MB FINALIZE 03:EXCHANGE 3 840.528ms 943.254ms 15.61K -1 0 0 HASH(ss_customer_sk) 01:AGGREGATE31s769ms1s869ms 15.61K -1 5.32 GB 128.00 MB STREAMING 00:SCAN HDFS3 17.941ms 37.109ms 183.59K -1 1.94 MB 16.00 MB tpcds_parquet.store_sales AFTER: 434ms Operator #Hosts Avg Time Max Time#Rows Est. #Rows Peak Mem Est. Peak Mem Detail - 06:AGGREGATE1 125.915us 125.915us1 1 28.00 KB -1.00 B FINALIZE 05:EXCHANGE 1 72.179us 72.179us3 1 0 -1.00 B UNPARTITIONED 02:AGGREGATE3 79.054us 83.385us3 1 28.00 KB 10.00 MB 04:AGGREGATE36.559ms7.669ms 14.82K -1 17.32 MB 128.00 MB FINALIZE 03:EXCHANGE 3 67.370us 85.068us 15.60K -1 0 0 HASH(ss_customer_sk) 01:AGGREGATE3 19.245ms 24.472ms 15.60K -1 9.48 MB 128.00 MB STREAMING 00:SCAN
[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage .. Patch Set 10: (3 comments) http://gerrit.cloudera.org:8080/#/c/6025/10/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: Line 957: // We allocate a contiguous chunk of memory for ReservoirSampleState and an array of > the initial class comment should describe what it is. also, the array is re Completely rewrote patch. I think it's clearer now and more encapsulated. Line 1018: void increment_source_size() { source_size_++; } > bad formatting: these functions aren't getters/setters. Removed these Line 1040: // Trailing var-len array of ReservoirSamples. > "The actual size is ." This is not a trailing var-len array any more -- To view, visit http://gerrit.cloudera.org:8080/6025 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4946: fix hang in BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4946: fix hang in BufferPool .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I13fc95b5a664544dee789c4107fccf81d2077347 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4946: fix hang in BufferPool
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4946: fix hang in BufferPool .. IMPALA-4946: fix hang in BufferPool Once the write is removed from the "in flight" list, both the Client and Page may be destroyed by a different thread. The fix is to signal condition variables before inside the critical section that removes the write from the in flight list. Also fix a potential pitfall with DiskIoMgr::CancelContext() where concurrent calls to the method, which can be called asynchronously with other methods, could result in a hang in DiskIoMgr::CancelContext(). I do not believe any Impala code calls it concurrently from multiple threads, so the bug was only latent. Testing: I was able to reproduce reliably by inserting a 1s sleep before the NotifyAll() calls. After the fix, the hang didn't reproduce with sleeps inside or outside the critical section. I could not come up with a unit test that had a higher reproduction rate than the current tests - the window for the race is very small. I considered adding a debug stress option to insert these delays, but with all the code moved into the critical section it wouldn't be useful. Change-Id: I13fc95b5a664544dee789c4107fccf81d2077347 Reviewed-on: http://gerrit.cloudera.org:8080/6224 Reviewed-by: Dan HechtTested-by: Impala Public Jenkins --- M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/disk-io-mgr-internal.h 2 files changed, 5 insertions(+), 3 deletions(-) Approvals: Impala Public Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I13fc95b5a664544dee789c4107fccf81d2077347 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables
Matthew Jacobs has posted comments on this change. Change subject: PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables .. Patch Set 3: Thanks, Thomas. I think the overall approach makes sense. I haven't gone through and left detailed comments yet, but I think it's reasonable to keep going along this path, e.g. supporting other DML statements, commenting, tests, etc. Will this end up adding an exchange node if there wasn't one before (i.e. that's what we expect)? I guess that might happen because the partitioning exprs get set, though it wasn't obvious in DistributedPlanner. Updating the planner tests would help. -- To view, visit http://gerrit.cloudera.org:8080/6037 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic10b3295159354888efcde3df76b0edb24161515 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-2328: Address additional comments .. IMPALA-2328: Address additional comments - test_parquet_stats.py was missing and the tests weren't run during GVO. - The tests in parquet_stats.test assume that the queries were executed in a single fragment, so they now run with 'num_nodes = 1'. - Parquet columns are now resolved correctly. - Parquet files with missing columns are now handled correctly. - Predicates with implicit casts can now be evaluated against parquet::Statistics. - This change also cleans up some old friend declarations I came across. Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Reviewed-on: http://gerrit.cloudera.org:8080/6147 Reviewed-by: Lars VolkerTested-by: Impala Public Jenkins --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exprs/case-expr.h M be/src/exprs/expr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test A tests/query_test/test_parquet_stats.py 6 files changed, 129 insertions(+), 17 deletions(-) Approvals: Impala Public Jenkins: Verified Lars Volker: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-2328: Address additional comments .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Hello Lars Volker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5891 to look at the new patch set (#5). Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. IMPALA-4675: Case-insensitive matching of Parquet fields. The query option PARQUET_FALLBACK_SCHEMA_RESOLUTION allows matching of Parquet fields by name instead of by index (the default). Parquet column names are case sensitive, but Impala treats db/table/column/field names as case-insensitive. Today, there is no way today to select Parquet columns with mixed casing via SQL using the name-based field resolution policy. This patch changes the matching of Parquet fields to be case-insensitive. Testing: - Modified the data files backing complextypestbl to contain fields with mixed casing. - Several existing tests run against this table, including the test for name-based resolution. - I confirmed that without this fix, the existing name-based resolution tests fail on the modified data files. - I locally ran test_scanners.py and test_nested_types.py on exhaustive with this fix. Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 --- M be/src/exec/parquet-metadata-utils.cc M be/src/exec/parquet-metadata-utils.h M testdata/ComplexTypesTbl/nonnullable.avsc M testdata/ComplexTypesTbl/nonnullable.json M testdata/ComplexTypesTbl/nonnullable.parq M testdata/ComplexTypesTbl/nullable.avsc M testdata/ComplexTypesTbl/nullable.json M testdata/ComplexTypesTbl/nullable.parq M testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test M tests/query_test/test_scanners.py 10 files changed, 71 insertions(+), 76 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/5891/5 -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Alex Behm has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5891/4/be/src/exec/parquet-metadata-utils.cc File be/src/exec/parquet-metadata-utils.cc: Line 36: using boost::algorithm::iequals; > why boost and not something like strcasecmp? Works for me. Done. -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: Yes
[Impala-ASF-CR] Update parameter order in test mt dop.py
Impala Public Jenkins has submitted this change and it was merged. Change subject: Update parameter order in test_mt_dop.py .. Update parameter order in test_mt_dop.py The rest of the code base uses them in a different order. This is purely for consistency. Maybe it'll help prevent a bug in the future. Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Reviewed-on: http://gerrit.cloudera.org:8080/6228 Reviewed-by: Lars VolkerTested-by: Impala Public Jenkins --- M tests/query_test/test_mt_dop.py 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Lars Volker: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] Update parameter order in test mt dop.py
Impala Public Jenkins has posted comments on this change. Change subject: Update parameter order in test_mt_dop.py .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4810: add DECIMAL test case to strict mode tests
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4810: add DECIMAL test case to strict_mode tests .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4810: add DECIMAL test case to strict mode tests
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4810: add DECIMAL test case to strict_mode tests .. IMPALA-4810: add DECIMAL test case to strict_mode tests The string parsing code already errors if the decimal column either overflows or underflows (i.e. loses scale). Let's just add a test case. Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Reviewed-on: http://gerrit.cloudera.org:8080/6150 Reviewed-by: Dan HechtTested-by: Impala Public Jenkins --- M testdata/data/overflow.txt M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/overflow.test M testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test 5 files changed, 35 insertions(+), 9 deletions(-) Approvals: Impala Public Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan Hecht Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Zach Amsden has posted comments on this change. Change subject: IMPALA-2020: Inline big number strings .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/5902/5//COMMIT_MSG Commit Message: PS5, Line 38: 0.83s > I mean run the test a few times and provide something like https://en.wikip Do we have any nice scripts to do that? I'd rather not go to that trouble when all that is left is the obvious win of getting the constants declared in the header and not requiring DecimalUtil::InitMaxUnscaledDecimal() http://gerrit.cloudera.org:8080/#/c/5902/5/be/src/util/decimal-util.h File be/src/util/decimal-util.h: Line 57: template > Neither looks done in PS9: private or rename Moot point now - killed the code PS5, Line 58: T > I'm not so sure: it would lose precision there in the low bits. Sadly, other uses GetScaleMultiplier below for exactly this - to get a power of 10 in double. PS5, Line 59: boost:: > It helps me keep track of comments on patches when they are responded to on Duly noted Line 60: return scale == 0 ? 1 : 10 * GetConstScaleMultiplier(scale-1); > Why not just another '?:' operator? moot Line 136: if (__builtin_const_p(scale) && p < 10) return GetConstScaleMultiplier(scale); > Please post the numbers here when you have decided. Wash. The only win was from getting the MAX_UNSCALED_DECIMAL constants inline. -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4616: Add missing Kudu column options
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4616: Add missing Kudu column options .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I96a0fce7f6bc0c086b259d3119daa72c94b0af7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4616: Add missing Kudu column options
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4616: Add missing Kudu column options .. IMPALA-4616: Add missing Kudu column options Adds support for missing Kudu column options in ALTER TABLE (it was there in CREATE TABLE already): * encoding * compression * block_size Also adds support for adding nullable columns with default values. All of the above was not originally implemented due to limitations in the Kudu client, but have since been fixed: KUDU-1746, KUDU-1747 Testing: Updates and adds relevant test cases. Change-Id: I96a0fce7f6bc0c086b259d3119daa72c94b0af7b Reviewed-on: http://gerrit.cloudera.org:8080/6220 Reviewed-by: Marcel KornackerTested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test 6 files changed, 78 insertions(+), 100 deletions(-) Approvals: Marcel Kornacker: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I96a0fce7f6bc0c086b259d3119daa72c94b0af7b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4899: Fix parquet table writer dictionary leak
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4899: Fix parquet table writer dictionary leak .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4899: Fix parquet table writer dictionary leak
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4899: Fix parquet table writer dictionary leak .. IMPALA-4899: Fix parquet table writer dictionary leak Currently, in HdfsTableSink, OutputPartitions are added to the RuntimeState object pool to be freed at the end of the query. However, for clustered inserts into a partitioned table, the OutputPartitions are only used one at a time. They can be immediately freed once done writing to that partition. In addition, the HdfsParquetTableWriter's ColumnWriters are also added to this object pool. These constitute a significant amount of memory, as they contain the dictionaries for Parquet encoding. This change makes HdfsParquetTableWriter's ColumnWriters use unique_ptrs so that they are cleaned up when the HdfsParquetTableWriter is deleted. It also uses a unique_ptr on the PartitionPair for the OutputPartition. The table writers maintain a pointer to the OutputPartition. This remains a raw pointer. This is safe, because OutputPartition has a scoped_ptr to the table writer. The table writer will never outlive the OutputPartition. Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f Reviewed-on: http://gerrit.cloudera.org:8080/6181 Reviewed-by: Marcel KornackerTested-by: Impala Public Jenkins --- M be/src/exec/hdfs-parquet-table-writer.cc M be/src/exec/hdfs-parquet-table-writer.h M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/exec/hdfs-table-writer.h 5 files changed, 26 insertions(+), 14 deletions(-) Approvals: Marcel Kornacker: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 3: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/6226/3/be/src/exec/parquet-column-stats.cc File be/src/exec/parquet-column-stats.cc: Line 39: return ret; return true? Line 51: return ret; return true? -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[native-toolchain-CR] Add a script to build Kudu using existing toolchain artifacts
Matthew Jacobs has posted comments on this change. Change subject: Add a script to build Kudu using existing toolchain artifacts .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6167 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[native-toolchain-CR] Add a script to build Kudu using existing toolchain artifacts
Matthew Jacobs has submitted this change and it was merged. Change subject: Add a script to build Kudu using existing toolchain artifacts .. Add a script to build Kudu using existing toolchain artifacts Adds 'build-kudu-only.sh' to allow building Kudu with pre-existing toolchain artifacts. This is necessary because new Kudu builds are needed much more frequently than other toolchain components. The script works by downloading the existing packages needed for the Kudu build from a specified toolchain stored in s3. The packages are extracted to the toolchain build directory, and the toolchain is set up to perform just the Kudu build. Refactors init.sh to separate out environment settings and compiler/tooling related bootstrapping and configuration. Testing: Set up a jenkins job to run this; completed successfully on all supported platforms. Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e --- A build-kudu-only.sh M build.sh M buildall.sh M functions.sh A init-compiler.sh M init.sh M source/autoconf/build.sh M source/automake/build.sh M source/gcc/build.sh M source/libtool/build.sh 10 files changed, 296 insertions(+), 122 deletions(-) Approvals: Matthew Jacobs: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6167 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Add a script to build Kudu using existing toolchain artifacts
Tim Armstrong has posted comments on this change. Change subject: Add a script to build Kudu using existing toolchain artifacts .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6167 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014 .. IMPALA-4546: Fix Moscow timezone conversion after 2014 In 2014 Moscow timezone rules changed from UTC+3 with no DST to UTC+4 with no DST. A special case has been added to timestamp functions to handle this. Testing: Added BE Expr tests. Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Reviewed-on: http://gerrit.cloudera.org:8080/5969 Reviewed-by: Taras BobrovytskyTested-by: Impala Public Jenkins --- M be/src/exprs/expr-test.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timezone_db.cc M be/src/exprs/timezone_db.h 4 files changed, 115 insertions(+), 45 deletions(-) Approvals: Impala Public Jenkins: Verified Taras Bobrovytsky: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014 .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4946: fix hang in BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4946: fix hang in BufferPool .. Patch Set 1: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/333/ -- To view, visit http://gerrit.cloudera.org:8080/6224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I13fc95b5a664544dee789c4107fccf81d2077347 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[native-toolchain-CR] Add a script to build Kudu using existing toolchain artifacts
Matthew Jacobs has uploaded a new patch set (#3). Change subject: Add a script to build Kudu using existing toolchain artifacts .. Add a script to build Kudu using existing toolchain artifacts Adds 'build-kudu-only.sh' to allow building Kudu with pre-existing toolchain artifacts. This is necessary because new Kudu builds are needed much more frequently than other toolchain components. The script works by downloading the existing packages needed for the Kudu build from a specified toolchain stored in s3. The packages are extracted to the toolchain build directory, and the toolchain is set up to perform just the Kudu build. Refactors init.sh to separate out environment settings and compiler/tooling related bootstrapping and configuration. Testing: Set up a jenkins job to run this; completed successfully on all supported platforms. Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e --- A build-kudu-only.sh M build.sh M buildall.sh M functions.sh A init-compiler.sh M init.sh M source/autoconf/build.sh M source/automake/build.sh M source/gcc/build.sh M source/libtool/build.sh 10 files changed, 296 insertions(+), 122 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/67/6167/3 -- To view, visit http://gerrit.cloudera.org:8080/6167 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Add a script to build Kudu using existing toolchain artifacts
Matthew Jacobs has posted comments on this change. Change subject: Add a script to build Kudu using existing toolchain artifacts .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/6167/2/build-kudu-single.sh File build-kudu-single.sh: Line 1: #!/usr/bin/env bash > Doesn't matter too much but build-kudu-only seems clearer to me Done Line 15: > Can you add a short comment up the top explaining how to use it and what en Done Line 35: function download_toolchain_dependency() { > Does this download all versions of the package? Whatever is in that directory, I suppose that would be unnecessary if the toolchain build was 'historical', though I'm pretty sure we never do that. (In fact I was thinking we should remove 'historical' anyway.) Line 110: pkg_name=${dir%*/} > Can we call this something instead of pkg_name? Like pkg_string? pkg_name u Done http://gerrit.cloudera.org:8080/#/c/6167/2/init-compiler.sh File init-compiler.sh: PS2, Line 20: > trailing space Done -- To view, visit http://gerrit.cloudera.org:8080/6167 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I237580e1545033467a92285ca8bb8db1cf8c804e Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5891/4/be/src/exec/parquet-metadata-utils.cc File be/src/exec/parquet-metadata-utils.cc: Line 36: using boost::algorithm::iequals; why boost and not something like strcasecmp? -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5902 to look at the new patch set (#10). Change subject: IMPALA-2020: Inline big number strings .. IMPALA-2020: Inline big number strings We can directly spcecify these in the header instead of keeping the constants out in a .o. This lets the compiler inline them in a lot more places, potentially giving better behavior than a table lookup by directly inserting the constant. This in turn allows divides by now known compile time constants to be converted into multiply. This is likely to become even more imporant with big integer divide, which requires additional computation for DECIMAL_V2. Testing: Compare the binary output of decimal-util.cc before and after; ran expr test suite. Using the following query I observe a small but statistically present win of ~3% over an unpatched build. select sum(l_extendedprice / l_discount) from tpch10_parquet.lineitem; Before: +---+ | sum(l_extendedprice / l_discount) | +---+ | 61076151920731.010714279183910| +---+ Fetched 1 row(s) in 9.05s After: +---+ | sum(l_extendedprice / l_discount) | +---+ | 61076151920731.010714279183910| +---+ Fetched 1 row(s) in 8.79s Will do some additional benchmarking after the V2 stuff and rounding changes land. Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 --- M be/src/benchmarks/overflow-benchmark.cc M be/src/common/init.cc M be/src/util/decimal-util.cc M be/src/util/decimal-util.h 4 files changed, 8 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5902/10 -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Jim Apple has posted comments on this change. Change subject: IMPALA-2020: Inline big number strings .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/5902/5//COMMIT_MSG Commit Message: PS5, Line 38: 0.83s > what do you mean exactly? I mean run the test a few times and provide something like https://en.wikipedia.org/wiki/Interquartile_range so we know if it sped things up or slowed them down. http://gerrit.cloudera.org:8080/#/c/5902/5/be/src/util/decimal-util.h File be/src/util/decimal-util.h: Line 57: template > Done Neither looks done in PS9: private or rename PS5, Line 58: T > Not necessarily, this would work for float / double as well. I'm not so sure: it would lose precision there in the low bits. PS5, Line 59: boost:: > I'll try it It helps me keep track of comments on patches when they are responded to only when the ball is back in my court. Line 60: return scale == 0 ? 1 : 10 * GetConstScaleMultiplier(scale-1); > I think it can be done, not sure how simply. Since this is constexpr, we a Why not just another '?:' operator? Line 136: if (__builtin_const_p(scale) && p < 10) return GetConstScaleMultiplier(scale); > Let's remove it and find out. Please post the numbers here when you have decided. -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] Update parameter order in test mt dop.py
Alex Behm has posted comments on this change. Change subject: Update parameter order in test_mt_dop.py .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6228/2/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: Line 46: def test_compute_stats(self, vector, unique_database): I don't think the order matters, but by convention we put the vector first. It's not a bug, but an aesthetic thing -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide.
Ambreen Kazi has posted comments on this change. Change subject: IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6231/1/docs/topics/impala_auditing.xml File docs/topics/impala_auditing.xml: PS1, Line 59: in the impalad startup options. > Let's put this pair of links back the way they were - I'll fix the conditio Done -- To view, visit http://gerrit.cloudera.org:8080/6231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I65a2aa96d7d45f6c1b348105727ddb5c4a6be6d2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Ambreen KaziGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: John Russell Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide.
Ambreen Kazi has uploaded a new patch set (#2). Change subject: IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide. .. IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide. The scope of this gerrit is dealing with chunks of CM content still remaining in the security guide. Some of work on the SSL, LDAP and Sentry topics was done by John in a separate gerrit. Change-Id: I65a2aa96d7d45f6c1b348105727ddb5c4a6be6d2 --- M docs/topics/impala_auditing.xml M docs/topics/impala_kerberos.xml M docs/topics/impala_lineage.xml M docs/topics/impala_security_guidelines.xml 4 files changed, 5 insertions(+), 81 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/6231/2 -- To view, visit http://gerrit.cloudera.org:8080/6231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I65a2aa96d7d45f6c1b348105727ddb5c4a6be6d2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Ambreen KaziGerrit-Reviewer: John Russell
[Impala-ASF-CR] IMPALA-5015: Run parquet stats test.py with mt dop > 0
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-5015: Run parquet_stats_test.py with mt_dop > 0 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4f1a17708ba3aa5dd0bf2090e2ac71eff436b86c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[native-toolchain-CR] Bump breakpad version
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6213 to look at the new patch set (#3). Change subject: Bump breakpad version .. Bump breakpad version This will give us better support for DWARF version 4, support for redaction of sensitive data, and a multitude of bug fixes. Our previous version has been from June 12th 2015, this change bumps it to Feb 28th 2017. It also includes a script from Kudu that can be used to create tarballs from the Breakpad Git repository. Breakpad itself does not do releases. Change-Id: I8859ced5048a9f3098c9ec3f24e7331700a1 --- M buildall.sh A source/breakpad/make-breakpad-src-archive.sh 2 files changed, 63 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/13/6213/3 -- To view, visit http://gerrit.cloudera.org:8080/6213 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8859ced5048a9f3098c9ec3f24e7331700a1 Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Henry Robinson
[Impala-ASF-CR] PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables
Thomas Tauber-Marshall has posted comments on this change. Change subject: PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/6037/1/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: PS1, Line 457: } else { : RETURN_IF_ERROR(channels_[partition % num_channels]->AddRow(current_row)); : } : } : } : COUNTER_ADD(total_sent_rows_counter_, batch->num_rows()); : RETURN_IF_ERROR(state->CheckQueryState()); > we'll need to find a way to avoid doing this for every row batch Done PS1, Line 466: : Status DataStreamSender::FlushFinal(RuntimeState* state) { : DCHECK(!flushed_); : DCHECK(!closed_); : flushed_ = true; : for (int i = 0; i < channels_.size(); ++i) { : // If we hit an error here, we can return without closing the remaining channels as : // the error is propagated back to the coordinator, which in turn cancels the query, : // which will cause the remaining open channels to be closed. : RETURN_IF_ERROR(channels_[i]->FlushAndSendEos(state)); : } : return Status::OK(); : } : : void DataStreamSender::Close(RuntimeState* state) { : if (closed_) return; : for (int i = 0; i < channels_.size(); ++i) { : channels_[i]->Teardown(state); : } : if (partitioner_ != NULL) { : partitioner_->Close(state); : } : DataSink::Close(state); : closed_ = true; : } : : Status DataStreamSender::SerializeBatch(RowBatch* src, TRowBatch* dest, int num_receivers) { : VLOG_ROW << "serializing " << src->num_rows() << " rows"; : { : SCOPED_TIMER(profile_->total_time_counter()); : SCOPED_TIMER(serialize_batch_timer_); : RETURN_IF_ERROR(src->Serialize(dest)); : int bytes = RowBatch::GetBatchSize(*dest); : int uncompressed_bytes = bytes - dest->tuple_data.size() + dest->uncompressed_size; : // The size output_batch would be if we didn't compress tuple_data (will be equal to : // actual batch size if tuple_data isn't compressed) : : COUNTER_ADD(bytes_sent_counter_, bytes * num_receivers); : COUNTER_ADD(uncompressed_bytes_counter_, uncompressed_bytes * num_receivers); : } : return Status::OK(); : } : : int64_t DataStreamSender::GetNumDataBytesSent() const { : // TODO: do we need synchronization here or are reads & writes to 8-byte ints : // atomic? : int64_t result = 0; : for (int i = 0; i < channels_.size(); ++i) { : result += channels_[i]->num_data_bytes_sent(); : } : return result; : } : : } : : > let's see if we can share some code with kudu-table-sink, at least the swit Done http://gerrit.cloudera.org:8080/#/c/6037/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: PS1, Line 634: boolean isKuduTable = table_ instanceof KuduTable; : Set kuduPartitionByColumnNames = null; : if (isKuduTable) { : > should this be a Set? Could be duplicates. Also I'm not sure if we should u I believe only the partition columns are needed here, based on testing and on the comment in kudu/client/client.h which says ' May return a bad Status if the provided row does not have all partitions of the partition key set' -- To view, visit http://gerrit.cloudera.org:8080/6037 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic10b3295159354888efcde3df76b0edb24161515 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables
Thomas Tauber-Marshall has uploaded a new patch set (#3). Change subject: PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables .. PREVIEW: IMPALA-3742: partitions INSERTs into Kudu tables Bulk inserts into Kudu are currently painful because we just send rows randomly, which creates a lot of work for Kudu since it partitions and sorts data before writing, causing writes to be slow. We can alleviate this by sending the rows to Kudu already partitioned and sorted. This patch partitions the rows to insert according to Kudu's partitioning scheme. A followup patch will deal with sorting. It accomplishes this by inserting an exchange node into the plan before the insert. The DataStreamSender then uses a new abstraction, DataStreamPartitioner, that calls into the Kudu client to determine the partition for each row. In the future, DataStreamPartitioner can be extended to other partitioning types. This patch is a PREVIEW so we can decide if we're happy with the partitioning API Kudu has proposed and get that in on the Kudu side. It does not have any tests, and has not been tested for performance. Change-Id: Ic10b3295159354888efcde3df76b0edb24161515 --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/runtime/CMakeLists.txt M be/src/runtime/coordinator.cc A be/src/runtime/data-stream-partitioner.cc A be/src/runtime/data-stream-partitioner.h M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/scheduling/simple-scheduler.cc M bin/impala-config.sh M common/thrift/Partitions.thrift M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/TableSink.java 17 files changed, 343 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/6037/3 -- To view, visit http://gerrit.cloudera.org:8080/6037 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic10b3295159354888efcde3df76b0edb24161515 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Matthew Jacobs
[native-toolchain-CR] Bump breakpad version
Lars Volker has submitted this change and it was merged. Change subject: Bump breakpad version .. Bump breakpad version This will give us better support for DWARF version 4, support for redaction of sensitive data, and a multitude of bug fixes. Our previous version has been from June 12th 2015, this change bumps it to Feb 28th 2017. It also includes a script from Kudu that can be used to create tarballs from the Breakpad Git repository. Breakpad itself does not do releases. Change-Id: I8859ced5048a9f3098c9ec3f24e7331700a1 --- M buildall.sh A source/breakpad/make-breakpad-src-archive.sh 2 files changed, 63 insertions(+), 1 deletion(-) Approvals: Lars Volker: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/6213 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8859ced5048a9f3098c9ec3f24e7331700a1 Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker
[native-toolchain-CR] Bump breakpad version
Lars Volker has posted comments on this change. Change subject: Bump breakpad version .. Patch Set 3: Code-Review+2 Verified+1 (2 comments) Thank you for the review. I addressed all your comments and ran a full build. http://gerrit.cloudera.org:8080/#/c/6213/2//COMMIT_MSG Commit Message: PS2, Line 7: version > remove, not sure where 'upstream' is in this context. Done http://gerrit.cloudera.org:8080/#/c/6213/2/source/breakpad/make-breakpad-src-archive.sh File source/breakpad/make-breakpad-src-archive.sh: Line 21: # releases. This script was taken from the Apache Kudu project at > Mention where the original is? Done -- To view, visit http://gerrit.cloudera.org:8080/6213 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8859ced5048a9f3098c9ec3f24e7331700a1 Gerrit-PatchSet: 3 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal.
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/6070 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4967fae275a8822274aece14a15107237445aba5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel HaleGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal.
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. .. IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. Most of these fixes entailed hiding the paragraph where there are mentions of CM and writing a replacement paragraph for the upstream docs. This removes the CM references from the rendered docs. A subsequent cleanup project will remove occurrences of CM from the XML. This patch includes fixes to files in response to John Russell's first set of review comments. Change-Id: I4967fae275a8822274aece14a15107237445aba5 Reviewed-on: http://gerrit.cloudera.org:8080/6070 Reviewed-by: John RussellTested-by: Impala Public Jenkins --- M docs/topics/impala_isilon.xml M docs/topics/impala_logging.xml M docs/topics/impala_s3.xml M docs/topics/impala_webui.xml 4 files changed, 35 insertions(+), 27 deletions(-) Approvals: Impala Public Jenkins: Verified John Russell: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6070 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I4967fae275a8822274aece14a15107237445aba5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide.
John Russell has posted comments on this change. Change subject: IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6231/1/docs/topics/impala_auditing.xml File docs/topics/impala_auditing.xml: PS1, Line 59: in the impalad startup options. Let's put this pair of links back the way they were - I'll fix the conditionalized links as a separate work item. -- To view, visit http://gerrit.cloudera.org:8080/6231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I65a2aa96d7d45f6c1b348105727ddb5c4a6be6d2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Ambreen KaziGerrit-Reviewer: John Russell Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal.
John Russell has posted comments on this change. Change subject: IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6070 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4967fae275a8822274aece14a15107237445aba5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel HaleGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Lars Volker has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5015: Run parquet stats test.py with mt dop > 0
Lars Volker has posted comments on this change. Change subject: IMPALA-5015: Run parquet_stats_test.py with mt_dop > 0 .. Patch Set 1: This has merge conflicts because it depends on "IMPALA-2328: Address additional comments", which is currently running the submit job. -- To view, visit http://gerrit.cloudera.org:8080/6232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4f1a17708ba3aa5dd0bf2090e2ac71eff436b86c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5015: Run parquet stats test.py with mt dop > 0
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/6232 Change subject: IMPALA-5015: Run parquet_stats_test.py with mt_dop > 0 .. IMPALA-5015: Run parquet_stats_test.py with mt_dop > 0 Change-Id: I4f1a17708ba3aa5dd0bf2090e2ac71eff436b86c --- M tests/query_test/test_mt_dop.py M tests/query_test/test_parquet_stats.py 2 files changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/6232/1 -- To view, visit http://gerrit.cloudera.org:8080/6232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I4f1a17708ba3aa5dd0bf2090e2ac71eff436b86c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. Most of these fixes entailed hiding the paragraph where there are mentions of CM and writing a replacement paragraph for the u
Michael Brown has posted comments on this change. Change subject: IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. Most of these fixes entailed hiding the paragraph where there are mentions of CM and writing a replacement paragraph for the upstream docs. This removes the CM references from the rendered docs. A .. Patch Set 3: This needs a re-work of the commit message. -- To view, visit http://gerrit.cloudera.org:8080/6070 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4967fae275a8822274aece14a15107237445aba5 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel HaleGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide.
Ambreen Kazi has uploaded a new change for review. http://gerrit.cloudera.org:8080/6231 Change subject: IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide. .. IMPALA-5006 [DOCS] Remove chunks of Cloudera-specific content from the Impala Security Guide. The scope of this gerrit is dealing with chunks of CM content still remaining in the security guide. Some of work on the SSL, LDAP and Sentry topics was done by John in a separate gerrit. Change-Id: I65a2aa96d7d45f6c1b348105727ddb5c4a6be6d2 --- M docs/topics/impala_auditing.xml M docs/topics/impala_kerberos.xml M docs/topics/impala_lineage.xml M docs/topics/impala_security_guidelines.xml 4 files changed, 4 insertions(+), 81 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/6231/1 -- To view, visit http://gerrit.cloudera.org:8080/6231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I65a2aa96d7d45f6c1b348105727ddb5c4a6be6d2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Ambreen Kazi
[Impala-ASF-CR] IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. Most of these fixes entailed hiding the paragraph where there are mentions of CM and writing a replacement paragraph for the u
John Russell has posted comments on this change. Change subject: IMPALA-3401 [DOCS] Phase 5 of "Cloudera Manager" removal. Most of these fixes entailed hiding the paragraph where there are mentions of CM and writing a replacement paragraph for the upstream docs. This removes the CM references from the rendered docs. A .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6070 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4967fae275a8822274aece14a15107237445aba5 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel HaleGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Lars Volker has posted comments on this change. Change subject: IMPALA-2328: Address additional comments .. Patch Set 6: Code-Review+2 Rebased, carrying Marcel's +2. -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Update parameter order in test mt dop.py
Impala Public Jenkins has posted comments on this change. Change subject: Update parameter order in test_mt_dop.py .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/331/ -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] Update parameter order in test mt dop.py
Lars Volker has posted comments on this change. Change subject: Update parameter order in test_mt_dop.py .. Patch Set 2: Code-Review+2 Carrying MJ's +2. -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] Update parameter order in test mt dop.py
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6228 to look at the new patch set (#2). Change subject: Update parameter order in test_mt_dop.py .. Update parameter order in test_mt_dop.py The rest of the code base uses them in a different order. This is purely for consistency. Maybe it'll help prevent a bug in the future. Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c --- M tests/query_test/test_mt_dop.py 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/6228/2 -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4810: add DECIMAL test case to strict mode tests
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4810: add DECIMAL test case to strict_mode tests .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/330/ -- To view, visit http://gerrit.cloudera.org:8080/6150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] Fix parameter order in test mt dop.py
David Knupp has posted comments on this change. Change subject: Fix parameter order in test_mt_dop.py .. Patch Set 1: Sure, for the sake of consistency, I'm OK with this change. There are actually two places where this occurs. tests/query_test/test_mt_dop.py: def test_compute_stats(self, unique_database, vector): tests/query_test/test_scanners.py: def test_resolution_by_name(self, unique_database, vector): -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4810: add DECIMAL test case to strict mode tests
Dan Hecht has posted comments on this change. Change subject: IMPALA-4810: add DECIMAL test case to strict_mode tests .. Patch Set 6: Code-Review+2 Rebase -- To view, visit http://gerrit.cloudera.org:8080/6150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4810: add DECIMAL test case to strict mode tests
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4810: add DECIMAL test case to strict_mode tests .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4810: add DECIMAL test case to strict mode tests
Hello Impala Public Jenkins, Matthew Jacobs, Zach Amsden, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6150 to look at the new patch set (#5). Change subject: IMPALA-4810: add DECIMAL test case to strict_mode tests .. IMPALA-4810: add DECIMAL test case to strict_mode tests The string parsing code already errors if the decimal column either overflows or underflows (i.e. loses scale). Let's just add a test case. Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 --- M testdata/data/overflow.txt M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/overflow.test M testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test M testdata/workloads/functional-query/queries/QueryTest/strict-mode.test 5 files changed, 35 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/6150/5 -- To view, visit http://gerrit.cloudera.org:8080/6150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idd66c0fb5a4d201919d39f73dea08b87339d6469 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4899: Fix parquet table writer dictionary leak
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4899: Fix parquet table writer dictionary leak .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/329/ -- To view, visit http://gerrit.cloudera.org:8080/6181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4899: Fix parquet table writer dictionary leak
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4899: Fix parquet table writer dictionary leak .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4946: fix hang in BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4946: fix hang in BufferPool .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I13fc95b5a664544dee789c4107fccf81d2077347 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Fix parameter order in test mt dop.py
Matthew Jacobs has posted comments on this change. Change subject: Fix parameter order in test_mt_dop.py .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6228/1/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: Line 46: def test_compute_stats(self, vector, unique_database): > The order doesn't matter, since the calls use named parameters: test...(vec ok, it's kind of misleading to call this a fix then. can you change the text to just say this is for consistency? -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2328: Address additional comments .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Fix parameter order in test mt dop.py
Lars Volker has posted comments on this change. Change subject: Fix parameter order in test_mt_dop.py .. Patch Set 1: (1 comment) Thanks for having a look. David, please see my reply to MJ's comment. http://gerrit.cloudera.org:8080/#/c/6228/1/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: Line 46: def test_compute_stats(self, vector, unique_database): > how do we know what is the 'right' order? what happens when it's not in the The order doesn't matter, since the calls use named parameters: test...(vector=..., unique_database=...). However this is the only place in our code base where the order is different, so I thought we may as well fix it. Maybe it'll prevent a bug in the future when someone calls the method from another test or so. -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4616: Add missing Kudu column options
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4616: Add missing Kudu column options .. Patch Set 1: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/328/ -- To view, visit http://gerrit.cloudera.org:8080/6220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I96a0fce7f6bc0c086b259d3119daa72c94b0af7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4616: Add missing Kudu column options
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4616: Add missing Kudu column options .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I96a0fce7f6bc0c086b259d3119daa72c94b0af7b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Lars Volker has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 2: (3 comments) Thanks for the review. Please see PS 3. http://gerrit.cloudera.org:8080/#/c/6226/2/be/src/exec/parquet-column-stats.cc File be/src/exec/parquet-column-stats.cc: Line 34: if (col_stats < std::numeric_limits::min() || > col_stats isn't valid if ret is false, right? yes, fixed. Line 39: DCHECK(false); > I don't think we should DCHECK here - users do sometimes run DEBUG builds o I wasn't aware. Removed it. Line 50: if (col_stats < std::numeric_limits::min() || > See above comments. Done -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix parameter order in test mt dop.py
David Knupp has posted comments on this change. Change subject: Fix parameter order in test_mt_dop.py .. Patch Set 1: With pytest fixtures, I don't think order matters. -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Lars Volker has uploaded a new patch set (#3). Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. IMPALA-5008: Fix reading stats for TINYINT and SMALLINT TINYINT and SMALLINT types use 1 and 2 byte slots respectively. However, statistics for the corresponding INT_8 and INT_16 Parquet types are encoded using 4 bytes. When reading back we were missing a conversion to the smaller types, thus overwriting the memory behind them. This was caught by the address sanitizer. The fix is to perform the necessary conversion. Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 --- M be/src/exec/parquet-column-stats.cc M be/src/runtime/coordinator.cc 2 files changed, 26 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/6226/3 -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix parameter order in test mt dop.py
Matthew Jacobs has posted comments on this change. Change subject: Fix parameter order in test_mt_dop.py .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6228/1/tests/query_test/test_mt_dop.py File tests/query_test/test_mt_dop.py: Line 46: def test_compute_stats(self, vector, unique_database): how do we know what is the 'right' order? what happens when it's not in the right order? -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Lars Volker has posted comments on this change. Change subject: IMPALA-2328: Address additional comments .. Patch Set 4: (3 comments) Thanks for the review, please see PS5. http://gerrit.cloudera.org:8080/#/c/6147/4/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 494: // Resolve the parquet column using schema_resolver_. This will perform either name > a short single line will suffice here (no need to say 'using schema_resolve Done Line 505: // during the scan, so any predicate would evaluate to false. Return early. > ' is null' would evaluate to true, did you cover that? Yes, comparisons with null literals are filtered in the Frontend. min/max are only set when there are non-null values, so we don't consider statistics for "is null". Line 512: // In case there is a bug, we return an error, which will abort the query. > sounds like this should also be a dcheck, no? We abort in L515. I implemented it like this to re-use the error string for the status. Is there an easier way? -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6147 to look at the new patch set (#5). Change subject: IMPALA-2328: Address additional comments .. IMPALA-2328: Address additional comments - test_parquet_stats.py was missing and the tests weren't run during GVO. - The tests in parquet_stats.test assume that the queries were executed in a single fragment, so they now run with 'num_nodes = 1'. - Parquet columns are now resolved correctly. - Parquet files with missing columns are now handled correctly. - Predicates with implicit casts can now be evaluated against parquet::Statistics. - This change also cleans up some old friend declarations I came across. Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exprs/case-expr.h M be/src/exprs/expr.h M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test A tests/query_test/test_parquet_stats.py 6 files changed, 129 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/6147/5 -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2328: Address additional comments
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2328: Address additional comments .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/6147/4/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 494: // Resolve the parquet column using schema_resolver_. This will perform either name a short single line will suffice here (no need to say 'using schema_resolver_', i can see that; also no need to comment on how schema_resolver works; if there is additional need to document that behavior, it should be done in the respective header file). something like "resolve column path to determine col idx" Line 505: // during the scan, so any predicate would evaluate to false. Return early. ' is null' would evaluate to true, did you cover that? Line 512: // In case there is a bug, we return an error, which will abort the query. sounds like this should also be a dcheck, no? -- To view, visit http://gerrit.cloudera.org:8080/6147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I54c205fad7afc4a0b0a7d0f654859de76db29a02 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4675: Case-insensitive matching of Parquet fields.
Lars Volker has posted comments on this change. Change subject: IMPALA-4675: Case-insensitive matching of Parquet fields. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5891/4/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: Line 510: def test_resolution_by_name(self, vector, unique_database): > They are py.test fixtures, and the ordering should not matter. Ah, right. They're using named parameters in the calls. That escaped me. I pushed a change to fix the line in test_mt_dop.py. -- To view, visit http://gerrit.cloudera.org:8080/5891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Nathan SalmonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Nathan Salmon Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix parameter order in test mt dop.py
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/6228 Change subject: Fix parameter order in test_mt_dop.py .. Fix parameter order in test_mt_dop.py Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c --- M tests/query_test/test_mt_dop.py 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/6228/1 -- To view, visit http://gerrit.cloudera.org:8080/6228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I925711eb4d4334c179d336f6ebcc91d26ce8879c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-3402: [DOCS] Remove/reword all CDH 4 references
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-3402: [DOCS] Remove/reword all CDH 4 references .. IMPALA-3402: [DOCS] Remove/reword all CDH 4 references Along the way, remove nearby references to CDH 5 and especially the CDH version numbers in the 'Fixed Issues' titles. The fixed issues titles are not amenable to the keydef/keyref substitution technique because they are all 3-part numbers like Impala 2.2.1 that don't have substitution variables defined for them. Change-Id: I9ffb400a244930134705b0f0039087506f5e70d3 Reviewed-on: http://gerrit.cloudera.org:8080/6146 Reviewed-by: John RussellTested-by: Impala Public Jenkins --- M docs/topics/impala_avro.xml M docs/topics/impala_char.xml M docs/topics/impala_decimal.xml M docs/topics/impala_fixed_issues.xml M docs/topics/impala_incompatible_changes.xml M docs/topics/impala_install.xml M docs/topics/impala_new_features.xml M docs/topics/impala_noncm_installation.xml M docs/topics/impala_parquet.xml M docs/topics/impala_perf_resources.xml M docs/topics/impala_scalability.xml M docs/topics/impala_subqueries.xml M docs/topics/impala_txtfile.xml M docs/topics/impala_udf.xml 14 files changed, 68 insertions(+), 111 deletions(-) Approvals: Impala Public Jenkins: Verified John Russell: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/6146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9ffb400a244930134705b0f0039087506f5e70d3 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell
[Impala-ASF-CR] IMPALA-2020: Inline big number strings
Zach Amsden has posted comments on this change. Change subject: IMPALA-2020: Inline big number strings .. Patch Set 5: (8 comments) http://gerrit.cloudera.org:8080/#/c/5902/5//COMMIT_MSG Commit Message: PS5, Line 27: Query: select sum(cast(cast(l_extendedprice as double) as : decimal(14,2))) from tpch10_parquet.lineitem : Query submitted at: 2017-02-18 01:23:54 (Coordinator: : http://impala-dev:25000) : Query progress can be monitored at: : http://impala-dev:25000/query_plan?query_id=f0410a35e981a86b:e6d92070 > elide Done PS5, Line 38: 0.83s > error bars? what do you mean exactly? http://gerrit.cloudera.org:8080/#/c/5902/5/be/src/exprs/expr-value.h File be/src/exprs/expr-value.h: Line 177: decimal4_val = +DecimalUtil::MAX_UNSCALED_DECIMAL4; > Why do you want to convert it to an Rvalue? Just to avoid having to define I just defined it in the .cc - not doing so (and not converting to an rvalue) breaks the debug build, but not the release build, as this does actually get inlined if defined in the header. http://gerrit.cloudera.org:8080/#/c/5902/5/be/src/util/decimal-util.h File be/src/util/decimal-util.h: Line 57: template > Describe what it does. Done PS5, Line 58: T > Always is_integral? Not necessarily, this would work for float / double as well. PS5, Line 59: boost:: > static_assert is now part of the language, but I'm not sure if it works on I'll try it Line 60: return scale == 0 ? 1 : 10 * GetConstScaleMultiplier(scale-1); > Check for no overflow, if it can be done simply. I think it can be done, not sure how simply. Since this is constexpr, we are pretty constrained in what we can use but I think another static assert could do it. Line 136: if (__builtin_const_p(scale) && p < 10) return GetConstScaleMultiplier(scale); > Does this by itself speed anything up? If scale is constant at compile time Let's remove it and find out. -- To view, visit http://gerrit.cloudera.org:8080/5902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5095a366d914cebb0b64bd434a08dbb55c90ed30 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5008: Fix reading stats for TINYINT and SMALLINT
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5008: Fix reading stats for TINYINT and SMALLINT .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/6226/2/be/src/exec/parquet-column-stats.cc File be/src/exec/parquet-column-stats.cc: Line 34: if (col_stats < std::numeric_limits::min() || col_stats isn't valid if ret is false, right? Line 39: DCHECK(false); I don't think we should DCHECK here - users do sometimes run DEBUG builds on their clusters (often their dev environment). Line 50: if (col_stats < std::numeric_limits::min() || See above comments. -- To view, visit http://gerrit.cloudera.org:8080/6226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9b10508db53747e7b08c8bd9a69c763b82135a78 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3402: [DOCS] Remove/reword all CDH 4 references
John Russell has posted comments on this change. Change subject: IMPALA-3402: [DOCS] Remove/reword all CDH 4 references .. Patch Set 6: Code-Review+2 Rebased against asf-gerrit -- To view, visit http://gerrit.cloudera.org:8080/6146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9ffb400a244930134705b0f0039087506f5e70d3 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3402: [DOCS] Remove/reword all CDH 4 references
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3402: [DOCS] Remove/reword all CDH 4 references .. Patch Set 6: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/57/ -- To view, visit http://gerrit.cloudera.org:8080/6146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9ffb400a244930134705b0f0039087506f5e70d3 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3402: [DOCS] Remove/reword all CDH 4 references
Hello Impala Public Jenkins, Ambreen Kazi, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6146 to look at the new patch set (#6). Change subject: IMPALA-3402: [DOCS] Remove/reword all CDH 4 references .. IMPALA-3402: [DOCS] Remove/reword all CDH 4 references Along the way, remove nearby references to CDH 5 and especially the CDH version numbers in the 'Fixed Issues' titles. The fixed issues titles are not amenable to the keydef/keyref substitution technique because they are all 3-part numbers like Impala 2.2.1 that don't have substitution variables defined for them. Change-Id: I9ffb400a244930134705b0f0039087506f5e70d3 --- M docs/topics/impala_avro.xml M docs/topics/impala_char.xml M docs/topics/impala_decimal.xml M docs/topics/impala_fixed_issues.xml M docs/topics/impala_incompatible_changes.xml M docs/topics/impala_install.xml M docs/topics/impala_new_features.xml M docs/topics/impala_noncm_installation.xml M docs/topics/impala_parquet.xml M docs/topics/impala_perf_resources.xml M docs/topics/impala_scalability.xml M docs/topics/impala_subqueries.xml M docs/topics/impala_txtfile.xml M docs/topics/impala_udf.xml 14 files changed, 68 insertions(+), 111 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/6146/6 -- To view, visit http://gerrit.cloudera.org:8080/6146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ffb400a244930134705b0f0039087506f5e70d3 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: Ambreen Kazi Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell
[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014 .. Patch Set 8: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/327/ -- To view, visit http://gerrit.cloudera.org:8080/5969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014 .. Patch Set 8: Code-Review+2 Forwarding the +2. Somehow the DHECK_LT fix was not included in the last patch. -- To view, visit http://gerrit.cloudera.org:8080/5969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014
Taras Bobrovytsky has uploaded a new patch set (#8). Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014 .. IMPALA-4546: Fix Moscow timezone conversion after 2014 In 2014 Moscow timezone rules changed from UTC+3 with no DST to UTC+4 with no DST. A special case has been added to timestamp functions to handle this. Testing: Added BE Expr tests. Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 --- M be/src/exprs/expr-test.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timezone_db.cc M be/src/exprs/timezone_db.h 4 files changed, 115 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/5969/8 -- To view, visit http://gerrit.cloudera.org:8080/5969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4546: Fix Moscow timezone conversion after 2014
Hello Jim Apple, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5969 to look at the new patch set (#8). Change subject: IMPALA-4546: Fix Moscow timezone conversion after 2014 .. IMPALA-4546: Fix Moscow timezone conversion after 2014 In 2014 Moscow timezone rules changed from UTC+3 with no DST to UTC+4 with no DST. A special case has been added to timestamp functions to handle this. Testing: Added BE Expr tests. Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 --- M be/src/exprs/expr-test.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timezone_db.cc M be/src/exprs/timezone_db.h 4 files changed, 115 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/5969/8 -- To view, visit http://gerrit.cloudera.org:8080/5969 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id6e3f2c9f6ba29749a26bc1087e664637bc02528 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky