[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8372 ) Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE .. Patch Set 5: (2 comments) Do you have 3 runs showing that there is no new flaky test cases? http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q77a.test File testdata/workloads/tpcds/queries/tpcds-q77a.test: http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q77a.test@4 PS5, Line 4: -- FIXED. USE ACTUAL RESULT AS EXPECTED RESULT Please add explanation for what was different. http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q86a.test File testdata/workloads/tpcds/queries/tpcds-q86a.test: http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q86a.test@4 PS5, Line 4: -- FIXED. USE ACTUAL RESULT AS EXPECTED RESULT What was the difference? -- To view, visit http://gerrit.cloudera.org:8080/8372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a Gerrit-Change-Number: 8372 Gerrit-PatchSet: 5 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Tue, 14 Nov 2017 00:43:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8372 ) Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q26.test File testdata/workloads/tpcds/queries/tpcds-q26.test: http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q26.test@3 PS4, Line 3: -- RESULT MISMATCH FROM ORIGINAL Please use a more descriptive comment. What kind of mismatch? Which column? Description of mismatch. http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q26.test@4 PS4, Line 4: -- TAKE ACTUAL RESULT AS EXPECTED What does this mean? http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q47.test File testdata/workloads/tpcds/queries/tpcds-q47.test: http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q47.test@3 PS4, Line 3: -- RESULT MISMATCH FROM ORIGINAL in LSD of DECIMAL values Would be good to use an example to better describe the mismatch http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q59.test File testdata/workloads/tpcds/queries/tpcds-q59.test: http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q59.test@3 PS4, Line 3: Remove line http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q63.test File testdata/workloads/tpcds/queries/tpcds-q63.test: http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q63.test@3 PS4, Line 3: Remove line -- To view, visit http://gerrit.cloudera.org:8080/8372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a Gerrit-Change-Number: 8372 Gerrit-PatchSet: 4 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Fri, 03 Nov 2017 22:35:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8372 ) Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG@14 PS3, Line 14: Rounds of testing : for this commit show that the earlier anomalies no longer occur I don't of any code change that got in lately that would result in change in behavior. How many iterations did you try for those tests? -- To view, visit http://gerrit.cloudera.org:8080/8372 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a Gerrit-Change-Number: 8372 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Tue, 31 Oct 2017 00:09:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5541: Reject BATCH SIZE greater than 65536
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8419 ) Change subject: IMPALA-5541: Reject BATCH_SIZE greater than 65536 .. Patch Set 1: 65K should be fine for experimentation. -- To view, visit http://gerrit.cloudera.org:8080/8419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idd5a2490a73b6915224160d7604b4badc72c1d97 Gerrit-Change-Number: 8419 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 30 Oct 2017 23:51:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 7: TPCDS is a better workload for min/max filtering than TPCH, because TPCDS relies on dynamic partition pruning to reduce scan size from the probe side of the joins. On the other hand min/max filtering won't help TPCH much because each partition from l_shipdate and o_orderdate will have the min and max values for l_orderkey and o_orderkey. -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 7 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 25 Oct 2017 02:50:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 7: I noticed that - BloomFilterBytes is always 0 in the query profiles. Can you please veirfy? -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 7 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 23 Oct 2017 02:37:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 18: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 18 Gerrit-Owner: Tim WoodGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Fri, 20 Oct 2017 23:57:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 11: (5 comments) http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q28.test File testdata/workloads/tpcds/queries/tpcds-q28.test: http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q28.test@6 PS17, Line 6: from (select truncate(avg(ss_list_price), 2) B1_LP, ss_list_price is decimal(7,2) why is the truncate required here? http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q31.test File testdata/workloads/tpcds/queries/tpcds-q31.test: http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q31.test@4 PS12, Line 4: -- ADDED TRUNCATE(2)s AROUND LAST 4 COLUMNS. Why did this query get taken out? http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q31.test@20 PS12, Line 20:,truncate(ws2.web_sales/ws1.web_sales, 2) web_q1_q2_increase Also why is truncate here needed? ws_ext_sales_price is a decimal(7,2) http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q5a.test File testdata/workloads/tpcds/queries/tpcds-q5a.test: http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q5a.test@3 PS12, Line 3: with ssr as Why is TPCDS-5A removed? http://gerrit.cloudera.org:8080/#/c/8102/11/testdata/workloads/tpcds/queries/tpcds-q85.test File testdata/workloads/tpcds/queries/tpcds-q85.test: http://gerrit.cloudera.org:8080/#/c/8102/11/testdata/workloads/tpcds/queries/tpcds-q85.test@5 PS11, Line 5: select substr(r_reason_desc,1,20) Is there a separate work item added to add reason? -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 11 Gerrit-Owner: Tim WoodGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Fri, 13 Oct 2017 04:44:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8235 ) Change subject: IMPALA-5429: Multi threaded block metadata loading .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@42 PS5, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 10, I would be more aggressive with this parameter and put it at 20. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@787 PS5, Line 787: int threadPoolSize = FileSystemUtil.supportsStorageIds(tableFs) ? What is the expected behavior for tables with mixed FSs? As a mix of S3 and HDFS partitions. http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801 PS5, Line 801: getLoadingThreadPoolSize > can different partitions have different number of files? if so, work across Parallelization of metadata loading is done on per partition granularity. -- To view, visit http://gerrit.cloudera.org:8080/8235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481 Gerrit-Change-Number: 8235 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 12 Oct 2017 04:51:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test File testdata/workloads/tpcds/queries/tpcds-q26.test: http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6 PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1, What is the reason behind changing the query text along with the results? -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 17 Gerrit-Owner: Tim WoodGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 11 Oct 2017 19:32:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@644 PS3, Line 644: for (int i = 0; i < channels_.size(); ++i) { The RowBatch is serialized once per channel which is very wasteful. IMPALA-6041. Compare to https://github.com/michaelhkw/incubator-impala/blob/krpc-testing-hung/be/src/runtime/data-stream-sender.cc#L429 -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 11 Oct 2017 19:14:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 14: @Tim, Can you try rerunning Q48 with DECIMAL_V2=1; -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 14 Gerrit-Owner: Tim WoodGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Mon, 09 Oct 2017 18:34:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5448: fix invalid number of splits reported in Parquet scan node
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8147 ) Change subject: IMPALA-5448: fix invalid number of splits reported in Parquet scan node .. Patch Set 4: Did you the test the case if one partition is snappy and another gzip both scanned by the same HDFS SCAN Node? Can you please paste what the output in the query profile looks like? -- To view, visit http://gerrit.cloudera.org:8080/8147 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaacc2d775032f5707061e704f12e0a63cde695d1 Gerrit-Change-Number: 8147 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong HuangGerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 05 Oct 2017 02:51:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8102 ) Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 12: Tim, Please remove all the unsupported queries from this code review as they are breaking the stress tests. Consider adding a rewrite for the unsupported queries in a followup code review. -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-Change-Number: 8102 Gerrit-PatchSet: 12 Gerrit-Owner: Tim WoodGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 04 Oct 2017 20:57:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8165 ) Change subject: IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8165/2/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: http://gerrit.cloudera.org:8080/#/c/8165/2/docs/topics/impala_known_issues.xml@320 PS2, Line 320: The slowdown only occurs when processing Parquet files that were generated by Hive, and Lockup happens when reading a timestamp column, not any column within the file. -- To view, visit http://gerrit.cloudera.org:8080/8165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9933ced07e339d589f7f74173cfebe938084e65c Gerrit-Change-Number: 8165 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Tue, 03 Oct 2017 17:11:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8100 ) Change subject: IMPALA-5940: Avoid log spew by using Status::Expected. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8100/2/be/src/runtime/client-cache.h File be/src/runtime/client-cache.h: http://gerrit.cloudera.org:8080/#/c/8100/2/be/src/runtime/client-cache.h@243 PS2, Line 243: return Status::NoTrace(TErrorCode::RPC_RECV_TIMEOUT, strings::Substitute( Is it necessary to have Status write the message to the log? is it not sufficient to write an info message via LOG.info. -- To view, visit http://gerrit.cloudera.org:8080/8100 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I38088482377a1c3e794a9c8178ef83f29957a330 Gerrit-Change-Number: 8100 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Tue, 26 Sep 2017 23:08:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for Impala. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q10.test File testdata/workloads/tpcds/queries/tpcds-q10.test: Line 3: select Remove trailing spaces from all new files. -- To view, visit http://gerrit.cloudera.org:8080/8102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim WoodGerrit-Reviewer: Alex Behm Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5926 : Avoid printing expensive stack when closing a session
Mostafa Mokhtar has abandoned this change. Change subject: IMPALA-5926 : Avoid printing expensive stack when closing a session .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/8060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I9d0ca485dd17bca758d916040745288c1a20c69f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Mostafa MokhtarGerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-5926 : Avoid printing expensive stack when closing a session
Mostafa Mokhtar has uploaded a new change for review. http://gerrit.cloudera.org:8080/8060 Change subject: IMPALA-5926 : Avoid printing expensive stack when closing a session .. IMPALA-5926 : Avoid printing expensive stack when closing a session When conducting high concurrency tests for short running queries noticed that queries are spending lots of time in Unregister query. Investigation showed that CloseSessionInternal calls status("Session closed") which unnecessarily prints the stack to the log which is expensive and not required, refer to IMPALA-5275. The fix uses Expected(const std::string& error_msg) which doesn't print the stack. Table below summarizes speedup for highly selective scan query. +---+--+-+-+ | Num users | Baseline Queries/minutes | Fix Queries/minutes | Speedup | +---+--+-+-+ | 1 | 19 | 24 | 1.23x | | 2 | 41 | 48 | 1.17x | | 4 | 71 | 91 | 1.28x | | 8 | 96 | 161 | 1.67x | | 16| 117 | 226 | 1.92x | | 32| 140 | 266 | 1.90x | | 64| 174 | 269 | 1.54x | | 128 | 202 | 265 | 1.31x | +---+--+-+-+ Change-Id: I9d0ca485dd17bca758d916040745288c1a20c69f --- M be/src/service/impala-server.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/8060/1 -- To view, visit http://gerrit.cloudera.org:8080/8060 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I9d0ca485dd17bca758d916040745288c1a20c69f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5275: Avoid printing status stack trace on hot paths
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5275: Avoid printing status stack trace on hot paths .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/7449/3//COMMIT_MSG Commit Message: Line 22: 5. In Status::MemLimitExceeded(), create Status object without What about the cancellation code path? -- To view, visit http://gerrit.cloudera.org:8080/7449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ief083f558fba587381aa7fe8f99da279da02f1f2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5498: Support for partial sorts .. Patch Set 4: Can you extend the Sort metrics in the query profile to include Avg, Min and Max run size? This will be needed to reflect on the quality of the sorted data? -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5498: Support for partial sorts .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7267/4/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 527: if (!insertStmt.hasNoClusteredHint() && !ctx_.isSingleNodeExec()) { Why is this limited to Kudu? Inserting into Partitioned HDFS tables is a very common use case. -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
Mostafa Mokhtar has posted comments on this change. Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource limits .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource limits .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/7348/1/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: Line 571: Impala could encounter a serious error due to resource usage from multithreading. This might mean that Impala supports multi threads. I would use "under very high concurrency" -- To view, visit http://gerrit.cloudera.org:8080/7348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/7300/1/docs/topics/impala_default_join_distribution_mode.xml File docs/topics/impala_default_join_distribution_mode.xml: Line 48: Impala uses the broadcast technique that transmits the entire contents > What is the answer if both tables are missing stats? Does Impala make a ded If both tables are missing stats the table listed first in the query will be the probe side while the second table will be broadcasted. Line 61: from each table to each executor node. > I'd prefer to prepare and fine-tune a brief explanation so I could reuse th This is the description for the SHUFFLE join, we should use similar wording [SHUFFLE] - Makes that join operation use the "partitioned" technique, which divides up corresponding rows from both tables using a hashing algorithm, sending subsets of the rows to other nodes for processing. (The keyword SHUFFLE is used to indicate a "partitioned join", because that type of join is not related to "partitioned tables".) Since the alternative "broadcast" join mechanism is the default when table and index statistics are unavailable, you might use this hint for queries where broadcast joins are unsuitable; typically, partitioned joins are more efficient for joins between large tables of similar size. http://gerrit.cloudera.org:8080/#/c/7300/2/docs/topics/impala_default_join_distribution_mode.xml File docs/topics/impala_default_join_distribution_mode.xml: Line 40: This option determines the join strategy that Impala uses when any of the tables Alex's comment around not using "Join strategy" hasn't been addressed. Can you please use "join distribution" instead? -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] Loads all TPC-DS tables
Mostafa Mokhtar has posted comments on this change. Change subject: Loads all TPC-DS tables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/6877/3/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test File testdata/workloads/functional-query/queries/QueryTest/seq-writer.test: Line 101: where ss_sold_date_sk between 2451170 and 2451200; > To keep the test time under a reasonable amount. This change loaded way mor Can you modify the predicate to where (ss_sold_date_sk between 2451170 and 2451200) or ss_sold_date_sk is null Good to have some rows in the default partition. -- To view, visit http://gerrit.cloudera.org:8080/6877 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Wood Gerrit-HasComments: Yes
[Impala-ASF-CR] Loads all TPC-DS tables
Mostafa Mokhtar has posted comments on this change. Change subject: Loads all TPC-DS tables .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/6877 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] Loads all TPC-DS tables
Mostafa Mokhtar has posted comments on this change. Change subject: Loads all TPC-DS tables .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6877/2/testdata/datasets/tpcds/tpcds_schema_template.sql File testdata/datasets/tpcds/tpcds_schema_template.sql: Line 659: WHERE ss_sold_date_sk IS NULL; Replace the multiple insert statements with insert overwrite table store_sales partition(ss_sold_date_sk) /*+ clustered,shuffle */ select -- To view, visit http://gerrit.cloudera.org:8080/6877 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4166: Add SORT BY sql clause .. Patch Set 29: Code-Review+1 (1 comment) Thanks for adding the test cases. http://gerrit.cloudera.org:8080/#/c/6495/29/testdata/workloads/functional-query/queries/QueryTest/alter-table.test File testdata/workloads/functional-query/queries/QueryTest/alter-table.test: Line 1130: alter table insert_sorted replace columns (i int, e double, f boolean); For more coverage how about. (int bigint, e decimal(12,2), f boolean) -- To view, visit http://gerrit.cloudera.org:8080/6495 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75 Gerrit-PatchSet: 29 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4166: Add SORT BY sql clause .. Patch Set 27: (1 comment) http://gerrit.cloudera.org:8080/#/c/6495/27/testdata/workloads/functional-query/queries/QueryTest/alter-table.test File testdata/workloads/functional-query/queries/QueryTest/alter-table.test: Line 1058: QUERY Can you please add tests cases that cover inserts and queries after alter table? And if possibly verify the Order by operator has the correct columns after the table is modified. -- To view, visit http://gerrit.cloudera.org:8080/6495 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75 Gerrit-PatchSet: 27 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/6812/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 455: DCHECK_LE(row_group_rows_read_, file_metadata_.num_rows); What if file_metadata_.num_rows or file_metadata_.row_groups[row_group_idx_].num_rows have negative values? We have seen cases where a single file had too many rows which causes an overflow and stats had a negative value. Line 1455: // Column readers are not needed because we are not reading from any columns if this > DCHECK that there is exactly one materialized slot Can we then optimize something like select count(l_comment) from lineitem to select count(*) from lineitem The later is 7x faster. http://gerrit.cloudera.org:8080/#/c/6812/1/testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test File testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test: Line 34: | output: sum_zero_if_empty(functional_parquet.alltypes.parquet-stats: num_rows) > i don't know what this means. Why do we need to print this information in the plan? Won't this be enabled for all Parquet files moving forward? -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4624: Implement Parquet dictionary filtering
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4624: Implement Parquet dictionary filtering .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/5904/8/testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test File testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test: Line 1: Please add coverage for nested data filtering as well. -- To view, visit http://gerrit.cloudera.org:8080/5904 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3a7cc3bd0523fbf3c79bd924219e909ef671cfd7 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2328: Read support for min/max Parquet statistics
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-2328: Read support for min/max Parquet statistics .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/6032/4/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test: Line 2: QUERY Can you add coverage for filtering on the root level for nested data? -- To view, visit http://gerrit.cloudera.org:8080/6032 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I39b836165756fcf929c801048d91c50c8fdcdae4 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4624: Implement Parquet dictionary filtering
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4624: Implement Parquet dictionary filtering .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/5904/8/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 515: RETURN_IF_ERROR(InitColumns(row_group_idx_, column_readers_)); InitColumns calls AddScanRanges, so does this mean that RowGroups are read regardless if the dictionary predicate qualifies or not? https://github.com/apache/incubator-impala/blob/master/be/src/exec/hdfs-parquet-scanner.cc#L1311 -- To view, visit http://gerrit.cloudera.org:8080/5904 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3a7cc3bd0523fbf3c79bd924219e909ef671cfd7 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4624: Implement Parquet dictionary filtering
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4624: Implement Parquet dictionary filtering .. Patch Set 8: Please add a query option to enabled/disable filtering. -- To view, visit http://gerrit.cloudera.org:8080/5904 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3a7cc3bd0523fbf3c79bd924219e909ef671cfd7 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6025/2/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: PS2, Line 1024: new_size > 2x works for me but: With 10x adding a couple of more rows to the table the memory requirements for a query can go from 3GB to 30GB, which might not be acceptable. Hash table used for joins behave similarly, they start small and double in size as need be. If the goal is to improve performance of APPEX_MEDIAN then we should be looking at more than just the memory curve. -- To view, visit http://gerrit.cloudera.org:8080/6025 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3748: add query-wide resource acquisition step
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-3748: add query-wide resource acquisition step .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5739/6/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: Line 83: prepare_status_ = process_mem_tracker->MemLimitExceeded(NULL, msg, 0); > What happens when mem limit is exceeded on one Impalad? And how is this integrating with admission control? Ideally if the query is admitted it shouldn't fail on acquiring min reservation. -- To view, visit http://gerrit.cloudera.org:8080/5739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia21a3c0f0b0a7175116883ef9871b93c8ce8bb81 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3748: add query-wide resource acquisition step
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-3748: add query-wide resource acquisition step .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5739/6/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: Line 83: prepare_status_ = process_mem_tracker->MemLimitExceeded(NULL, msg, 0); What happens when mem limit is exceeded on one Impalad? Will all fragments fail similar to hitting mem_limit while running? -- To view, visit http://gerrit.cloudera.org:8080/5739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia21a3c0f0b0a7175116883ef9871b93c8ce8bb81 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Add doc for MT DOP query option.
Mostafa Mokhtar has posted comments on this change. Change subject: Add doc for MT_DOP query option. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5652/1/docs/topics/impala_mt_dop.xml File docs/topics/impala_mt_dop.xml: Line 39: Sets the degree of parallelism used for certain operations that Should we mention the operations where mt_dop applies? Compute stats and queries that have scan and aggregate only operators? Line 42: and increased memory and CPU usage during statement processing. I would reword to "ideal balance between response time, memory and CPU". As some operations like "compute stats" consume less overall memory due to running with less scanner threads. -- To view, visit http://gerrit.cloudera.org:8080/5652 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ife2786532b425af6d230074f1c0b5c7dcb2b8a92 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John RussellGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4609: prefix thread counters in fragment profile
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4609: prefix thread counters in fragment profile .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icb8cfbddc758d06b25a14343310bfd9a932ad1f0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4609: prefix thread counters in fragment profile
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4609: prefix thread counters in fragment profile .. Patch Set 1: How about? - TotalThreadsInvoluntaryContextSwitches: 53 (53) - TotalThreadsVoluntaryContextSwitches: 100 (100) - TotalThreadsWallClockTime: 6s668ms - TotalCpuUserTime: 3s030ms - TotalCpuSysTime: 3s030ms - TotalNetworkReceiveTime: 0.000ns - TotalNetworkSendTime: 151.000ns - TotalStorageWaitTime: 266.419ms -- To view, visit http://gerrit.cloudera.org:8080/5392 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icb8cfbddc758d06b25a14343310bfd9a932ad1f0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5148/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: Line 347: LOG.debug("Loading block md for " + name_ + " directory " + dirPath.toString()); Please add If(LOG.isDebugEnabled()) to avoid calling the dirPath.toString() as this causes lots of unnecessary allocations which causes GC pauses . -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 4: Just tried out the latest patch and metadata loading is 5.4x faster. With the patch metadata loading for 80 partitions with 250K files finished in 27 seconds compared to 146 seconds without. Most of the CPU time is spent in the RemoteIterator, to further speedup metadata loading I recommend using a thread pool. Stack Trace Sample CountPercentage(%) org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table) 509 74.307 org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table, boolean, boolean, Set)509 74.307 org.apache.impala.catalog.HdfsTable.loadAllPartitions(List, Table) 507 74.015 org.apache.impala.catalog.HdfsTable.loadMetadataAndDiskIds(FileSystem, List, HashMap) 497 72.555 org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, Path, HashMap, Map) 472 68.905 org.apache.hadoop.fs.FileSystem$5.hasNext() 365 53.285 org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext() 339 49.489 org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter() 258 37.664 org.apache.hadoop.hdfs.DFSClient.listPaths(String, byte[], boolean) 258 37.664 com.sun.proxy.$Proxy21.getListing(String, byte[], boolean) 258 37.664 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Object, Method, Object[])258 37.664 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Method, Object[]) 258 37.664 java.lang.reflect.Method.invoke(Object, Object[]) 258 37.664 org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus.makeQualifiedLocated(URI, Path) 81 11.825 -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: Line 368: if (!pathMd.filteredParitionPaths.contains(fileStatus.getPath())) continue; > Looks like this contains() and the indexOf() below can become very expensiv Tried out the patch and the ArrayList is very inefficient and causing a huge regression. Stack Trace Sample CountPercentage(%) org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, Path, HdfsTable$FilteredPartitionPathsMd, Map)19,323 99.974 java.util.ArrayList.contains(Object) 9,831 50.864 java.util.ArrayList.indexOf(Object) 9,458 48.934 -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 377: bool is_mt_exec = request.query_ctx.request.query_options.mt_dop != 1; Possibly make is_mt_exec a member of Coordinator as query_options.mt_dop is checked 5 times. http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS3, Line 467: Mt Is the expectation to remove the "Mt" prefix once all operators support multi threading? http://gerrit.cloudera.org:8080/#/c/4054/6/fe/src/main/java/com/cloudera/impala/planner/Planner.java File fe/src/main/java/com/cloudera/impala/planner/Planner.java: Line 218: "statistics. Drop and re-compute statistics to resolve this problem.\n" + I believe this is printing too much information by default. -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes