[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE

2017-11-13 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8372 )

Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use 
of TRUNCATE
..


Patch Set 5:

(2 comments)

Do you have 3 runs showing that there is no new flaky test cases?

http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q77a.test
File testdata/workloads/tpcds/queries/tpcds-q77a.test:

http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q77a.test@4
PS5, Line 4: -- FIXED. USE ACTUAL RESULT AS EXPECTED RESULT
Please add explanation for what was different.


http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q86a.test
File testdata/workloads/tpcds/queries/tpcds-q86a.test:

http://gerrit.cloudera.org:8080/#/c/8372/5/testdata/workloads/tpcds/queries/tpcds-q86a.test@4
PS5, Line 4: -- FIXED. USE ACTUAL RESULT AS EXPECTED RESULT
What was the difference?



--
To view, visit http://gerrit.cloudera.org:8080/8372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a
Gerrit-Change-Number: 8372
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 14 Nov 2017 00:43:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE

2017-11-03 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8372 )

Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use 
of TRUNCATE
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q26.test
File testdata/workloads/tpcds/queries/tpcds-q26.test:

http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q26.test@3
PS4, Line 3: -- RESULT MISMATCH FROM ORIGINAL
Please use a more descriptive comment.
What kind of mismatch?
Which column?
Description of mismatch.


http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q26.test@4
PS4, Line 4: -- TAKE ACTUAL RESULT AS EXPECTED
What does this mean?


http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q47.test
File testdata/workloads/tpcds/queries/tpcds-q47.test:

http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q47.test@3
PS4, Line 3: -- RESULT MISMATCH FROM ORIGINAL in LSD of DECIMAL values
Would be good to use an example to better describe the mismatch


http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q59.test
File testdata/workloads/tpcds/queries/tpcds-q59.test:

http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q59.test@3
PS4, Line 3:
Remove line


http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q63.test
File testdata/workloads/tpcds/queries/tpcds-q63.test:

http://gerrit.cloudera.org:8080/#/c/8372/4/testdata/workloads/tpcds/queries/tpcds-q63.test@3
PS4, Line 3:
Remove line



--
To view, visit http://gerrit.cloudera.org:8080/8372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a
Gerrit-Change-Number: 8372
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Fri, 03 Nov 2017 22:35:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6087: Revisit tests withheld from TPC-DS suite for use of TRUNCATE

2017-10-30 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8372 )

Change subject: IMPALA-6087: Revisit tests withheld from TPC-DS suite for use 
of TRUNCATE
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8372/3//COMMIT_MSG@14
PS3, Line 14: Rounds of testing
: for this commit show that the earlier anomalies no longer occur
I don't of any code change that got in lately that would result in change in 
behavior.
How many iterations did you try for those tests?



-- 
To view, visit http://gerrit.cloudera.org:8080/8372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I79d2e34621639c8f8c4c4eb0b0944eaefca13a7a
Gerrit-Change-Number: 8372
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 31 Oct 2017 00:09:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5541: Reject BATCH SIZE greater than 65536

2017-10-30 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8419 )

Change subject: IMPALA-5541: Reject BATCH_SIZE greater than 65536
..


Patch Set 1:

65K should be fine for experimentation.


--
To view, visit http://gerrit.cloudera.org:8080/8419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd5a2490a73b6915224160d7604b4badc72c1d97
Gerrit-Change-Number: 8419
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 30 Oct 2017 23:51:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-24 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 7:

TPCDS is a better workload for min/max filtering than TPCH, because TPCDS 
relies on dynamic partition pruning to reduce scan size from the probe side of 
the joins.

On the other hand min/max filtering won't help TPCH much because each partition 
from l_shipdate and o_orderdate will have the min and max values for l_orderkey 
and o_orderkey.


--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 25 Oct 2017 02:50:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-10-22 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 7:

I noticed that - BloomFilterBytes is always 0 in the query profiles.
Can you please veirfy?


--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Oct 2017 02:37:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-20 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Fri, 20 Oct 2017 23:57:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-12 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q28.test
File testdata/workloads/tpcds/queries/tpcds-q28.test:

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q28.test@6
PS17, Line 6: from (select truncate(avg(ss_list_price), 2) B1_LP,
ss_list_price is decimal(7,2) why is the truncate required here?


http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q31.test
File testdata/workloads/tpcds/queries/tpcds-q31.test:

http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q31.test@4
PS12, Line 4: -- ADDED TRUNCATE(2)s AROUND LAST 4 COLUMNS.
Why did this query get taken out?


http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q31.test@20
PS12, Line 20:,truncate(ws2.web_sales/ws1.web_sales, 2) 
web_q1_q2_increase
Also why is truncate here needed?
ws_ext_sales_price is a decimal(7,2)


http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q5a.test
File testdata/workloads/tpcds/queries/tpcds-q5a.test:

http://gerrit.cloudera.org:8080/#/c/8102/12/testdata/workloads/tpcds/queries/tpcds-q5a.test@3
PS12, Line 3: with ssr as
Why is TPCDS-5A removed?


http://gerrit.cloudera.org:8080/#/c/8102/11/testdata/workloads/tpcds/queries/tpcds-q85.test
File testdata/workloads/tpcds/queries/tpcds-q85.test:

http://gerrit.cloudera.org:8080/#/c/8102/11/testdata/workloads/tpcds/queries/tpcds-q85.test@5
PS11, Line 5: select  substr(r_reason_desc,1,20)
Is there a separate work item added to add reason?



--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Fri, 13 Oct 2017 04:44:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5429: Multi threaded block metadata loading

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8235 )

Change subject: IMPALA-5429: Multi threaded block metadata loading
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/8235/5/be/src/catalog/catalog.cc@42
PS5, Line 42: DEFINE_int32(max_s3_parts_parallel_load, 10,
I would be more aggressive with this parameter and put it at 20.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@787
PS5, Line 787: int threadPoolSize = 
FileSystemUtil.supportsStorageIds(tableFs) ?
What is the expected behavior for tables with mixed FSs?
As a mix of S3 and HDFS partitions.


http://gerrit.cloudera.org:8080/#/c/8235/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@801
PS5, Line 801: getLoadingThreadPoolSize
> can different partitions have different number of files? if so, work across
Parallelization of metadata loading is done on per partition granularity.



--
To view, visit http://gerrit.cloudera.org:8080/8235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I07eaa7151dfc4d56da8db8c2654bd65d8f808481
Gerrit-Change-Number: 8235
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 12 Oct 2017 04:51:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test
File testdata/workloads/tpcds/queries/tpcds-q26.test:

http://gerrit.cloudera.org:8080/#/c/8102/17/testdata/workloads/tpcds/queries/tpcds-q26.test@6
PS17, Line 6: truncate(avg(cs_quantity) * 10.0) / 10 agg1,
What is the reason behind changing the query text along with the results?



--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:32:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-10-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8023/3/be/src/runtime/krpc-data-stream-sender.cc@644
PS3, Line 644: for (int i = 0; i < channels_.size(); ++i) {
The RowBatch is serialized once per channel which is very wasteful.
IMPALA-6041.
Compare to 
https://github.com/michaelhkw/incubator-impala/blob/krpc-testing-hung/be/src/runtime/data-stream-sender.cc#L429



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 11 Oct 2017 19:14:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-09 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 14:

@Tim,

Can you try rerunning Q48 with DECIMAL_V2=1;


--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Mon, 09 Oct 2017 18:34:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5448: fix invalid number of splits reported in Parquet scan node

2017-10-04 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8147 )

Change subject: IMPALA-5448: fix invalid number of splits reported in Parquet 
scan node
..


Patch Set 4:

Did you the test the case if one partition is snappy and another gzip both 
scanned by the same HDFS SCAN Node?
Can you please paste what the output in the query profile looks like?


--
To view, visit http://gerrit.cloudera.org:8080/8147
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaacc2d775032f5707061e704f12e0a63cde695d1
Gerrit-Change-Number: 8147
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 05 Oct 2017 02:51:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-10-04 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8102 )

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 12:

Tim,
Please remove all the unsupported queries from this code review as they are 
breaking the stress tests.
Consider adding a rewrite for the unsupported queries in a followup code review.


--
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-Change-Number: 8102
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Wed, 04 Oct 2017 20:57:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown

2017-10-03 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8165 )

Change subject: IMPALA-3316: [DOCS] Add known issue for timezone conversion 
slowdown
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8165/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8165/2/docs/topics/impala_known_issues.xml@320
PS2, Line 320: The slowdown only occurs when processing Parquet 
files that were generated by Hive, and
Lockup happens when reading a timestamp column, not any column within the file.



--
To view, visit http://gerrit.cloudera.org:8080/8165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9933ced07e339d589f7f74173cfebe938084e65c
Gerrit-Change-Number: 8165
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Comment-Date: Tue, 03 Oct 2017 17:11:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5940: Avoid log spew by using Status::Expected.

2017-09-26 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8100 )

Change subject: IMPALA-5940: Avoid log spew by using Status::Expected.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8100/2/be/src/runtime/client-cache.h
File be/src/runtime/client-cache.h:

http://gerrit.cloudera.org:8080/#/c/8100/2/be/src/runtime/client-cache.h@243
PS2, Line 243: return Status::NoTrace(TErrorCode::RPC_RECV_TIMEOUT, 
strings::Substitute(
Is it necessary to have Status write the message to the log?  is it not 
sufficient to write an info message via LOG.info.



--
To view, visit http://gerrit.cloudera.org:8080/8100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I38088482377a1c3e794a9c8178ef83f29957a330
Gerrit-Change-Number: 8100
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Tue, 26 Sep 2017 23:08:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5376: Implement all TPCDS test cases or alternates for Impala.

2017-09-19 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5376: Implement all TPCDS test cases or alternates for 
Impala.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8102/1/testdata/workloads/tpcds/queries/tpcds-q10.test
File testdata/workloads/tpcds/queries/tpcds-q10.test:

Line 3: select  
Remove trailing spaces from all new files.


-- 
To view, visit http://gerrit.cloudera.org:8080/8102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e284888600a7a69d1f23fcb7dac21cbb13b7d66
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5926 : Avoid printing expensive stack when closing a session

2017-09-13 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has abandoned this change.

Change subject: IMPALA-5926 : Avoid printing expensive stack when closing a 
session
..


Abandoned

-- 
To view, visit http://gerrit.cloudera.org:8080/8060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I9d0ca485dd17bca758d916040745288c1a20c69f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-5926 : Avoid printing expensive stack when closing a session

2017-09-13 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8060

Change subject: IMPALA-5926 : Avoid printing expensive stack when closing a 
session
..

IMPALA-5926 : Avoid printing expensive stack when closing a session

When conducting high concurrency tests for short running queries
 noticed that queries are spending lots of time in Unregister query.
Investigation showed that CloseSessionInternal calls
status("Session closed") which unnecessarily prints the stack
to the log which is expensive and not required, refer to IMPALA-5275.

The fix uses Expected(const std::string& error_msg) which doesn't
print the stack.

Table below summarizes speedup for highly selective scan query.

+---+--+-+-+
| Num users | Baseline Queries/minutes | Fix Queries/minutes | Speedup |
+---+--+-+-+
| 1 | 19   | 24  | 1.23x   |
| 2 | 41   | 48  | 1.17x   |
| 4 | 71   | 91  | 1.28x   |
| 8 | 96   | 161 | 1.67x   |
| 16| 117  | 226 | 1.92x   |
| 32| 140  | 266 | 1.90x   |
| 64| 174  | 269 | 1.54x   |
| 128   | 202  | 265 | 1.31x   |
+---+--+-+-+

Change-Id: I9d0ca485dd17bca758d916040745288c1a20c69f
---
M be/src/service/impala-server.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/8060/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d0ca485dd17bca758d916040745288c1a20c69f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-5275: Avoid printing status stack trace on hot paths

2017-07-20 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5275: Avoid printing status stack trace on hot paths
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7449/3//COMMIT_MSG
Commit Message:

Line 22: 5. In Status::MemLimitExceeded(), create Status object without
What about the cancellation code path?


-- 
To view, visit http://gerrit.cloudera.org:8080/7449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ief083f558fba587381aa7fe8f99da279da02f1f2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-14 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
..


Patch Set 4:

Can you extend the Sort metrics in the query profile to include Avg, Min and 
Max run size?
This will be needed to reflect on the quality of the sorted data?

-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5498: Support for partial sorts

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5498: Support for partial sorts
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7267/4/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 527:   if (!insertStmt.hasNoClusteredHint() && 
!ctx_.isSingleNodeExec()) {
Why is this limited to Kudu?

Inserting into Partitioned HDFS tables is a very common use case.


-- 
To view, visit http://gerrit.cloudera.org:8080/7267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits

2017-07-10 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource 
limits
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7348/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

Line 571:   Impala could encounter a serious error due to resource 
usage from multithreading.
This might mean that Impala supports multi threads. 
I would use "under very high concurrency"


-- 
To view, visit http://gerrit.cloudera.org:8080/7348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option

2017-07-06 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode 
query option
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7300/1/docs/topics/impala_default_join_distribution_mode.xml
File docs/topics/impala_default_join_distribution_mode.xml:

Line 48:   Impala uses the broadcast technique that transmits the 
entire contents
> What is the answer if both tables are missing stats? Does Impala make a ded
If both tables are missing stats the table listed first in the query will be 
the probe side while the second table will be broadcasted.


Line 61:   from each table to each executor node.
> I'd prefer to prepare and fine-tune a brief explanation so I could reuse th
This is the description for the SHUFFLE join, we should use similar wording

[SHUFFLE] - Makes that join operation use the "partitioned" technique, which 
divides up corresponding rows from both tables using a hashing algorithm, 
sending subsets of the rows to other nodes for processing. (The keyword SHUFFLE 
is used to indicate a "partitioned join", because that type of join is not 
related to "partitioned tables".) Since the alternative "broadcast" join 
mechanism is the default when table and index statistics are unavailable, you 
might use this hint for queries where broadcast joins are unsuitable; 
typically, partitioned joins are more efficient for joins between large tables 
of similar size.


http://gerrit.cloudera.org:8080/#/c/7300/2/docs/topics/impala_default_join_distribution_mode.xml
File docs/topics/impala_default_join_distribution_mode.xml:

Line 40:   This option determines the join strategy that Impala uses when 
any of the tables
Alex's comment around not using "Join strategy" hasn't been addressed. 

Can you please use "join distribution" instead?


-- 
To view, visit http://gerrit.cloudera.org:8080/7300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Loads all TPC-DS tables

2017-05-23 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: Loads all TPC-DS tables
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6877/3/testdata/workloads/functional-query/queries/QueryTest/seq-writer.test
File testdata/workloads/functional-query/queries/QueryTest/seq-writer.test:

Line 101: where ss_sold_date_sk between 2451170 and 2451200;
> To keep the test time under a reasonable amount. This change loaded way mor
Can you modify the predicate to 
where (ss_sold_date_sk between 2451170 and 2451200) or ss_sold_date_sk is 
null

Good to have some rows in the default partition.


-- 
To view, visit http://gerrit.cloudera.org:8080/6877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Loads all TPC-DS tables

2017-05-22 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: Loads all TPC-DS tables
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] Loads all TPC-DS tables

2017-05-16 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: Loads all TPC-DS tables
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6877/2/testdata/datasets/tpcds/tpcds_schema_template.sql
File testdata/datasets/tpcds/tpcds_schema_template.sql:

Line 659: WHERE ss_sold_date_sk IS NULL;
Replace the multiple insert statements with 
insert overwrite table   store_sales partition(ss_sold_date_sk)  /*+ 
clustered,shuffle */ select 


-- 
To view, visit http://gerrit.cloudera.org:8080/6877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5277245fd20827c9c09ce5c1a7a37266ca476b9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-05-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4166: Add SORT BY sql clause
..


Patch Set 29: Code-Review+1

(1 comment)

Thanks for adding the test cases.

http://gerrit.cloudera.org:8080/#/c/6495/29/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 1130: alter table insert_sorted replace columns (i int, e double, f 
boolean);
For more coverage how about. 
(int bigint, e decimal(12,2), f boolean)


-- 
To view, visit http://gerrit.cloudera.org:8080/6495
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
Gerrit-PatchSet: 29
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4166: Add SORT BY sql clause

2017-05-11 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4166: Add SORT BY sql clause
..


Patch Set 27:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6495/27/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:

Line 1058:  QUERY
Can you please add tests cases that cover inserts and queries after alter table?
And if possibly verify the Order by operator has the correct columns after the 
table is modified.


-- 
To view, visit http://gerrit.cloudera.org:8080/6495
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I08834f38a941786ab45a4381c2732d929a934f75
Gerrit-PatchSet: 27
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-05-08 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6812/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 455: DCHECK_LE(row_group_rows_read_, file_metadata_.num_rows);
What if file_metadata_.num_rows or 
file_metadata_.row_groups[row_group_idx_].num_rows have negative values?

We have seen cases where a single file had too many rows which causes an 
overflow and stats had a negative value.


Line 1455: // Column readers are not needed because we are not reading from 
any columns if this
> DCHECK that there is exactly one materialized slot
Can we then optimize something like 
select count(l_comment) from lineitem to select count(*) from lineitem

The later is 7x faster.


http://gerrit.cloudera.org:8080/#/c/6812/1/testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test:

Line 34: |  output: 
sum_zero_if_empty(functional_parquet.alltypes.parquet-stats: num_rows)
> i don't know what this means.
Why do we need to print this information in the plan?
Won't this be enabled for all Parquet files moving forward?


-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4624: Implement Parquet dictionary filtering

2017-02-21 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4624: Implement Parquet dictionary filtering
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5904/8/testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test:

Line 1: 
Please add coverage for nested data filtering as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/5904
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a7cc3bd0523fbf3c79bd924219e909ef671cfd7
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2328: Read support for min/max Parquet statistics

2017-02-21 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-2328: Read support for min/max Parquet statistics
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6032/4/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 2:  QUERY
Can you add coverage for filtering on the root level for nested data?


-- 
To view, visit http://gerrit.cloudera.org:8080/6032
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I39b836165756fcf929c801048d91c50c8fdcdae4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4624: Implement Parquet dictionary filtering

2017-02-19 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4624: Implement Parquet dictionary filtering
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5904/8/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 515: RETURN_IF_ERROR(InitColumns(row_group_idx_, column_readers_));
InitColumns calls AddScanRanges, so does this mean that RowGroups are read 
regardless if the dictionary predicate qualifies or not?
https://github.com/apache/incubator-impala/blob/master/be/src/exec/hdfs-parquet-scanner.cc#L1311


-- 
To view, visit http://gerrit.cloudera.org:8080/5904
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a7cc3bd0523fbf3c79bd924219e909ef671cfd7
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4624: Implement Parquet dictionary filtering

2017-02-19 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4624: Implement Parquet dictionary filtering
..


Patch Set 8:

Please add a query option to enabled/disable filtering.

-- 
To view, visit http://gerrit.cloudera.org:8080/5904
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a7cc3bd0523fbf3c79bd924219e909ef671cfd7
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4787: Optimize APPX MEDIAN() memory usage

2017-02-18 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6025/2/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS2, Line 1024: new_size 
> 2x works for me but:
With 10x adding a couple of more rows to the table the memory requirements for 
a query can go from 3GB to 30GB, which might not be acceptable. 
Hash table used for joins behave similarly, they start small and double in size 
as need be. 
If the goal is to improve performance of APPEX_MEDIAN then we should be looking 
at more than just the memory curve.


-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3748: add query-wide resource acquisition step

2017-01-23 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-3748: add query-wide resource acquisition step
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5739/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 83: prepare_status_ = process_mem_tracker->MemLimitExceeded(NULL, msg, 
0);
> What happens when mem limit is exceeded on one Impalad?
And how is this integrating with admission control? Ideally if the query is 
admitted it shouldn't fail on acquiring min reservation.


-- 
To view, visit http://gerrit.cloudera.org:8080/5739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia21a3c0f0b0a7175116883ef9871b93c8ce8bb81
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3748: add query-wide resource acquisition step

2017-01-23 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-3748: add query-wide resource acquisition step
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5739/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 83: prepare_status_ = process_mem_tracker->MemLimitExceeded(NULL, msg, 
0);
What happens when mem limit is exceeded on one Impalad?
Will all fragments fail similar to hitting mem_limit while running?


-- 
To view, visit http://gerrit.cloudera.org:8080/5739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia21a3c0f0b0a7175116883ef9871b93c8ce8bb81
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add doc for MT DOP query option.

2017-01-17 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: Add doc for MT_DOP query option.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5652/1/docs/topics/impala_mt_dop.xml
File docs/topics/impala_mt_dop.xml:

Line 39:   Sets the degree of parallelism used for certain operations that
Should we mention the operations where mt_dop applies?
Compute stats and queries that have scan and aggregate only operators?


Line 42:   and increased memory and CPU usage during statement processing.
I would reword to "ideal balance between response time, memory and CPU". 
As some operations like "compute stats" consume less overall memory due to 
running with less scanner threads.


-- 
To view, visit http://gerrit.cloudera.org:8080/5652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2786532b425af6d230074f1c0b5c7dcb2b8a92
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4609: prefix thread counters in fragment profile

2016-12-08 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4609: prefix thread counters in fragment profile
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icb8cfbddc758d06b25a14343310bfd9a932ad1f0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4609: prefix thread counters in fragment profile

2016-12-07 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4609: prefix thread counters in fragment profile
..


Patch Set 1:

How about?

   - TotalThreadsInvoluntaryContextSwitches: 53 (53)
   - TotalThreadsVoluntaryContextSwitches: 100 (100)
   - TotalThreadsWallClockTime: 6s668ms
   - TotalCpuUserTime: 3s030ms
   - TotalCpuSysTime: 3s030ms
   - TotalNetworkReceiveTime: 0.000ns
   - TotalNetworkSendTime: 151.000ns
   - TotalStorageWaitTime: 266.419ms

-- 
To view, visit http://gerrit.cloudera.org:8080/5392
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icb8cfbddc758d06b25a14343310bfd9a932ad1f0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-28 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5148/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 347:   LOG.debug("Loading block md for " + name_ + " directory " + 
dirPath.toString());
Please add If(LOG.isDebugEnabled()) to avoid calling the dirPath.toString()  as 
this causes lots of unnecessary allocations which causes GC pauses .


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-26 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 4:

Just tried out the latest patch and metadata loading is 5.4x faster. 

With the patch metadata loading for 80 partitions with 250K files finished in 
27 seconds compared to 146 seconds without. 

Most of the CPU time is spent in the RemoteIterator, to further speedup 
metadata loading I recommend using a thread pool. 

Stack Trace Sample CountPercentage(%)
org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table)  
509 74.307
   org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table, 
boolean, boolean, Set)509 74.307
  org.apache.impala.catalog.HdfsTable.loadAllPartitions(List, Table)
507 74.015
 org.apache.impala.catalog.HdfsTable.loadMetadataAndDiskIds(FileSystem, 
List, HashMap)  497 72.555
org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, 
Path, HashMap, Map)   472 68.905
   org.apache.hadoop.fs.FileSystem$5.hasNext()  365 53.285
  
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext() 
339 49.489
 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter()
  258 37.664
org.apache.hadoop.hdfs.DFSClient.listPaths(String, 
byte[], boolean) 258 37.664

com.sun.proxy.$Proxy21.getListing(String, byte[], boolean)  258 37.664
  
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Object, Method, 
Object[])258 37.664
 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Method, 
Object[])   258 37.664
 
java.lang.reflect.Method.invoke(Object, Object[])  258 37.664
 
org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus.makeQualifiedLocated(URI, 
Path)  81  11.825

-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-22 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 368: if 
(!pathMd.filteredParitionPaths.contains(fileStatus.getPath())) continue;
> Looks like this contains() and the indexOf() below can become very expensiv
Tried out the patch and the ArrayList is very inefficient and causing a huge 
regression. 

Stack Trace Sample CountPercentage(%)
org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, Path, 
HdfsTable$FilteredPartitionPathsMd, Map)19,323  99.974
  java.util.ArrayList.contains(Object)  9,831   50.864
  java.util.ArrayList.indexOf(Object)   9,458   48.934


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend

2016-09-18 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-3902: Scheduler improvements for running multiple 
fragment instances on a single backend
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 377:   bool is_mt_exec = request.query_ctx.request.query_options.mt_dop != 
1;
Possibly make is_mt_exec a member of Coordinator as query_options.mt_dop is 
checked 5 times.


http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS3, Line 467: Mt
Is the expectation to remove the "Mt" prefix once all operators support multi 
threading?


http://gerrit.cloudera.org:8080/#/c/4054/6/fe/src/main/java/com/cloudera/impala/planner/Planner.java
File fe/src/main/java/com/cloudera/impala/planner/Planner.java:

Line 218:   "statistics. Drop and re-compute statistics to resolve this 
problem.\n" +
I believe this is printing too much information by default.


-- 
To view, visit http://gerrit.cloudera.org:8080/4054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes