[Impala-ASF-CR] Fix for startup crash in scheduler-benchmark.

2020-11-23 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16778

to look at the new patch set (#2).

Change subject: Fix for startup crash in scheduler-benchmark.
..

Fix for startup crash in scheduler-benchmark.

Updated scheduler-benchmark's main() to use the newer
impala::InitCommonRuntime() initialization methods.

Before this change the benchmark would compile properly but crash
when run.

Testing:
 * built and ran in debug and release mode.

Change-Id: Ib9fba3b97f102e41f2024a2bfaacbf0568bd4c68
---
M be/src/benchmarks/scheduler-benchmark.cc
1 file changed, 6 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/16778/2
--
To view, visit http://gerrit.cloudera.org:8080/16778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9fba3b97f102e41f2024a2bfaacbf0568bd4c68
Gerrit-Change-Number: 16778
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Fix for run-time crash in scheduler-benchmark

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16778 )

Change subject: Fix for run-time crash in scheduler-benchmark
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7720/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16778

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9fba3b97f102e41f2024a2bfaacbf0568bd4c68
Gerrit-Change-Number: 16778
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 24 Nov 2020 03:33:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix for run-time crash in scheduler-benchmark

2020-11-23 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16778 )

Change subject: Fix for run-time crash in scheduler-benchmark
..


Patch Set 1:

Small fix to get the scheduler benchmark running again.


--
To view, visit http://gerrit.cloudera.org:8080/16778

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9fba3b97f102e41f2024a2bfaacbf0568bd4c68
Gerrit-Change-Number: 16778
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 24 Nov 2020 03:14:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix for run-time crash in scheduler-benchmark

2020-11-23 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16778 )


Change subject: Fix for run-time crash in scheduler-benchmark
..

Fix for run-time crash in scheduler-benchmark

Updated scheduler-benchmark's main() to use the newer impala
initialization methods.

Before this change the benchmark would compile properly but would crash
at run-time.

Testing:
 * built and ran in debug and release mode.

Change-Id: Ib9fba3b97f102e41f2024a2bfaacbf0568bd4c68
---
M be/src/benchmarks/scheduler-benchmark.cc
1 file changed, 6 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/16778/1
--
To view, visit http://gerrit.cloudera.org:8080/16778

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib9fba3b97f102e41f2024a2bfaacbf0568bd4c68
Gerrit-Change-Number: 16778
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits

2020-11-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16723 )

Change subject: IMPALA-10314: Optimize planning time for simple limits
..


Patch Set 11:

> Patch Set 11:
>
> (4 comments)
>
> > Patch Set 11:
> >
> > (4 comments)
> >
> > Just a couple corner cases I have run into; given this is an opt-in 
> > optimization now it might not be incorrect to ignore these.
> >
> > I think it's good to think about the case where this optimization helps and 
> > not risk an incorrect limit in other cases. Where this helps most.
> > a. lots of files
> > b. small limits
> >
> > a) the scan range and scheduling overhead is only slow when there are many 
> > hosts + files.
> >
> > b) for large limits maybe the bulk of query run time goes to fetching 
> > results and not the planning, but that said it may not hurt too much in 
> > this case.
>
> Thanks for the comments. I will create a follow-up JIRA to address couple of 
> these comments considering that this CR was +2 ed.
> Note that the more generalized issue of optimizing for limits is something 
> that Tim and I had some offline discussion about and he created 
> 'IMPALA-10347: Explore approaches to optimizing queries that will likely be 
> short-circuited by limits'

Created https://issues.apache.org/jira/browse/IMPALA-10353


--
To view, visit http://gerrit.cloudera.org:8080/16723

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
Gerrit-Change-Number: 16723
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 24 Nov 2020 02:37:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits

2020-11-23 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16723 )

Change subject: IMPALA-10314: Optimize planning time for simple limits
..


Patch Set 11:

(4 comments)

> Patch Set 11:
>
> (4 comments)
>
> Just a couple corner cases I have run into; given this is an opt-in 
> optimization now it might not be incorrect to ignore these.
>
> I think it's good to think about the case where this optimization helps and 
> not risk an incorrect limit in other cases. Where this helps most.
> a. lots of files
> b. small limits
>
> a) the scan range and scheduling overhead is only slow when there are many 
> hosts + files.
>
> b) for large limits maybe the bulk of query run time goes to fetching results 
> and not the planning, but that said it may not hurt too much in this case.

Thanks for the comments. I will create a follow-up JIRA to address a couple of 
these comments, considering that this CR was already +2'ed.
Note that the more general issue of optimizing for limits is something that 
Tim and I discussed offline, and he created 'IMPALA-10347: 
Explore approaches to optimizing queries that will likely be short-circuited by 
limits'.

http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@854
PS11, Line 854:   FileSystem partitionFs;
> I'd consider only doing this optimization for "record oriented" or "splitta
It would simplify the logic, and at least our internal testing, if we could 
apply it across the board. Perhaps for text format we could allow 10% more 
files to be considered, to accommodate 'invalid' files. Would that be acceptable?


http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@867
PS11, Line 867:   for (FileDescriptor fd : partition.getFileDescriptors()) {
> Also maybe a threshold on the number of scan ranges where we wouldn't bothe
This is related to your other comment below about bailing out under certain 
conditions. I'll try to run the benchmark; that's a good idea.


http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@873
PS11, Line 873: simpleLimitNumRows++;  // conservatively estimate 1 row per file
> The flip side of this might be for simple limits that are relatively large
One complication is that the total number of files is not known up front 
(unless we aggregate it up front). We are pruning at 2 levels: once in the 
HdfsPartitionPruner where we limit the number of partitions and then here where 
we limit the number of files per partition.  In both places, as each partition 
is processed, we look at the # files but don't know the total.  We could decide 
to do the aggregation of the num files by making a separate pass over all 
partitions during HdfsPartitionPruner but that will add some overhead.
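
A minimal Python sketch of the per-partition file pruning described above. It 
assumes that each selected file adds exactly one row to the estimate (as the 
simpleLimitNumRows++ line suggests) and that selection stops once the estimate 
reaches the limit. The function name, the dict-of-lists input and the stopping 
rule are illustrative assumptions, not the actual HdfsScanNode code.

```python
from typing import Dict, List


def select_files_for_simple_limit(
    files_by_partition: Dict[str, List[str]], limit: int
) -> List[str]:
    """Pick files partition by partition until the row estimate reaches `limit`.

    Hypothetical model: every file is assumed to hold at least one row, so one
    file adds one row to the estimate. The total file count is never needed.
    """
    selected: List[str] = []
    estimated_rows = 0
    for files in files_by_partition.values():
        for path in files:
            if estimated_rows >= limit:
                return selected
            selected.append(path)
            estimated_rows += 1  # conservative: 1 row per file
    return selected


# Hypothetical partitions and file names, for illustration only.
partitions = {"p=1": ["a", "b"], "p=2": ["c", "d", "e"]}
picked = select_files_for_simple_limit(partitions, limit=3)
```

Because the loop only looks at the files seen so far, it never needs the total 
file count, which is the constraint mentioned above.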


http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1121
PS11, Line 1121:
> For Hive ACID and deleted records would this logic still work? Might be a h
For ACID tables with deleted rows, the planner will internally create a Hash 
Anti Join to handle the not-in, so yeah the limit should not be applied in such 
cases because it is no longer a simple scan.  I will create a separate JIRA to 
handle that case since additional testing and code changes would be needed.
Thanks for raising this.



--
To view, visit http://gerrit.cloudera.org:8080/16723

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
Gerrit-Change-Number: 16723
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 24 Nov 2020 02:06:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 24 Nov 2020 01:57:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..

IMPALA-8202: Extend query timeout for test_mem_limit

With the current timeout set to 1 sec, we saw flaky failures where
the query timed out due to client inactivity. This can happen if
the thread keeping it alive fails to execute a fetch within the
query timeout period. This patch attempts to fix this flakiness
by increasing the timeout period.

Testing:
Looped the test locally.

Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Reviewed-on: http://gerrit.cloudera.org:8080/16774
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/custom_cluster/test_admission_controller.py
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-23 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
> Is there any relation to this and the max_row_Size?
My understanding is that DEFAULT_SPILLABLE_BUFFER_SIZE controls the minimal unit 
of page size when increasing the memory reservation. It is intended to tune 
the performance of spill operations (larger buffer sizes result in Impala 
issuing larger I/O requests to storage devices, which might result in higher 
throughput) rather than in-memory allocation.
It contributes to the calculation of maxRowBufferSize, but not to 
maxMemReservationBytes.

Let's say the user only sets MAX_ROW_SIZE to 256MB and leaves 
DEFAULT_SPILLABLE_BUFFER_SIZE and MAX_RESULT_SPOOLING_MEM at their defaults 
(2MB and 100MB respectively). Without the patch, the result is the following:
bufferSize = 2MB
maxRowBufferSize = 256MB (rounded up to nearest power of two)
minMemReservationBytes = 4MB
maxMemReservationBytes = 100MB

maxMemReservationBytes should be at least 2 * MAX_ROW_SIZE for reservation of 
read+write page to be satisfied, which is not true in this case.


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
> is this having any effect on the query? the max_result_spooling_mem is alre
It is intended to keep the test the same, keeping maxMemReservationBytes at 
8KB.

The default MAX_ROW_SIZE is 512KB. With the proposed changes in 
PlanRootSink.java and without explicitly lowering MAX_ROW_SIZE to 4KB, 
MAX_RESULT_SPOOLING_MEM will be ignored and maxMemReservationBytes will be set 
to 1MB instead of the intended 8KB.
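
The arithmetic in the two replies above can be checked with a short Python 
sketch. It is a simplified model, not the PlanRootSink.java code: it ignores 
DEFAULT_SPILLABLE_BUFFER_SIZE, and the max(...) rule in 
adjusted_max_reservation is an assumption chosen only because it reproduces 
the figures quoted in this thread. All function names are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def round_up_to_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def max_reservation_satisfied(max_row_size: int, max_result_spooling_mem: int) -> bool:
    """Check the stated constraint: max reservation >= 2 * row buffer size."""
    return max_result_spooling_mem >= 2 * round_up_to_power_of_two(max_row_size)


def adjusted_max_reservation(max_row_size: int, max_result_spooling_mem: int) -> int:
    """Assumed rule: raise the max reservation to fit a read page and a write page."""
    return max(max_result_spooling_mem, 2 * round_up_to_power_of_two(max_row_size))


# Scenario from the first reply: MAX_ROW_SIZE=256MB, spooling memory left at 100MB.
unpatched_ok = max_reservation_satisfied(256 * MB, 100 * MB)

# Scenarios from the second reply: spooling memory of 8KB in the test.
with_default_row_size = adjusted_max_reservation(512 * KB, 8 * KB)
with_lowered_row_size = adjusted_max_reservation(4 * KB, 8 * KB)
```

Under this model the 256MB case violates the constraint, the default 512KB row 
size yields 1MB, and lowering MAX_ROW_SIZE to 4KB keeps the value at 8KB.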



--
To view, visit http://gerrit.cloudera.org:8080/16765

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 24 Nov 2020 01:28:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10312: bump timeout in test ddl queries are closed

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16762 )

Change subject: IMPALA-10312: bump timeout in test_ddl_queries_are_closed
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16762

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5885df6494122dffe2bbc6877cec3b90a9eb4ec6
Gerrit-Change-Number: 16762
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 24 Nov 2020 00:55:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10312: bump timeout in test ddl queries are closed

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16762 )

Change subject: IMPALA-10312: bump timeout in test_ddl_queries_are_closed
..

IMPALA-10312: bump timeout in test_ddl_queries_are_closed

This increases the timeout from 10s to 30s for waiting for the
queries to be closed under the theory that the test failure is
caused by random slowness.

Change-Id: I5885df6494122dffe2bbc6877cec3b90a9eb4ec6
Reviewed-on: http://gerrit.cloudera.org:8080/16762
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/shell/test_shell_interactive.py
1 file changed, 5 insertions(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16762

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5885df6494122dffe2bbc6877cec3b90a9eb4ec6
Gerrit-Change-Number: 16762
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10337: Consider MAX ROW SIZE when computing max reservation

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16765 )

Change subject: IMPALA-10337: Consider MAX_ROW_SIZE when computing max 
reservation
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/16765/1/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@87
PS1, Line 87: getDefault_spillable_buffer_size
Is there any relation between this and max_row_size?
Should it be able to hold the full row size as well?


http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16765/1/tests/custom_cluster/test_query_retries.py@589
PS1, Line 589: 'max_row_size': 4 * 1024,
Does this have any effect on the query? max_result_spooling_mem is already 
set to 8 and this is set to 4, which will result in maxMemReservationBytes 
being set to 8 either way.
The same applies to the other test.



--
To view, visit http://gerrit.cloudera.org:8080/16765

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Gerrit-Change-Number: 16765
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 24 Nov 2020 00:15:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4238: make TestClientSsl more robust

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16760 )

Change subject: IMPALA-4238: make TestClientSsl more robust
..

IMPALA-4238: make TestClientSsl more robust

This changes the test to wait until it is executing in the backend
before trying to cancel it. This should remove planning time as
a variable that might cause the test to be flaky (e.g. if planning
is slow on S3 because of the time taken to list files).

Also dump the /queries debug page when the assertion is hit to
aid debugging.
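
The "wait until a condition holds, then act" pattern described above can be 
sketched in Python as a small polling helper. This is a generic illustration 
under stated assumptions: wait_until and its parameters are hypothetical and 
are not part of the Impala test framework, and the predicate stands in for a 
check such as "the query is executing in the backend".

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool], timeout_s: float, interval_s: float = 0.1
) -> bool:
    """Poll `predicate` until it returns True or `timeout_s` seconds elapse.

    Returns True if the condition was met, False on timeout, so the caller can
    dump extra diagnostics before failing an assertion.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for a query that reaches the desired state on the third check.
states = iter([False, False, True])
reached = wait_until(lambda: next(states), timeout_s=2.0, interval_s=0.01)
```

Returning a boolean rather than raising lets the test collect debugging 
output, such as a debug page, before the assertion fails.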

Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Reviewed-on: http://gerrit.cloudera.org:8080/16760
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/custom_cluster/test_client_ssl.py
1 file changed, 8 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16760

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Gerrit-Change-Number: 16760
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-4238: make TestClientSsl more robust

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16760 )

Change subject: IMPALA-4238: make TestClientSsl more robust
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16760

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Gerrit-Change-Number: 16760
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 24 Nov 2020 00:11:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9050: fix TestScanRangeLengths params

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16761 )

Change subject: IMPALA-9050: fix TestScanRangeLengths params
..

IMPALA-9050: fix TestScanRangeLengths params

This test is only relevant for HDFS-based table formats. The option
under test does not affect behaviour for Kudu or HBase.

Change-Id: I9b8591335dcdc85ce27674b35661444a46d30d5a
Reviewed-on: http://gerrit.cloudera.org:8080/16761
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/query_test/test_scanners.py
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16761

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9b8591335dcdc85ce27674b35661444a46d30d5a
Gerrit-Change-Number: 16761
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9050: fix TestScanRangeLengths params

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16761 )

Change subject: IMPALA-9050: fix TestScanRangeLengths params
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16761

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b8591335dcdc85ce27674b35661444a46d30d5a
Gerrit-Change-Number: 16761
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 24 Nov 2020 00:11:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16749 )

Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6702/


--
To view, visit http://gerrit.cloudera.org:8080/16749

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e
Gerrit-Change-Number: 16749
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 22:48:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7719/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 21:51:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..

IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

The DESCRIBE HISTORY statement works for Iceberg tables and displays
the snapshot history of the table.

An example output:
DESCRIBE HISTORY iceberg_multi_snapshots;
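+----------------------------+---------------------+---------------------+---------------------+
| creation_time              | snapshot_id         | parent_id           | is_current_ancestor |
+----------------------------+---------------------+---------------------+---------------------+
| 2020-10-13 14:01:07.234000 | 4400379706200951771 | NULL                | TRUE                |
| 2020-10-13 14:01:19.307000 | 4221472712544505868 | 4400379706200951771 | TRUE                |
+----------------------------+---------------------+---------------------+---------------------+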

The goal was for this new feature to produce output similar to
what SparkSql returns for "SELECT * from tablename.history".
See "History" section of
https://iceberg.apache.org/spark/#inspecting-tables

Testing:
  - iceberg-negative.test was extended to check that DESCRIBE HISTORY
    is not applicable for non-Iceberg tables.
  - iceberg-table-history.test: Covers basic usage of DESCRIBE
    HISTORY. Tests on tables created with Impala and also with Spark.

Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Reviewed-on: http://gerrit.cloudera.org:8080/16599
Reviewed-by: Zoltan Borok-Nagy 
Reviewed-by: wangsheng 
Tested-by: Impala Public Jenkins 
---
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M common/thrift/Frontend.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
A fe/src/main/java/org/apache/impala/analysis/DescribeHistoryStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/data/README
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-table-history.test
M tests/query_test/test_iceberg.py
14 files changed, 283 insertions(+), 15 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  wangsheng: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10314: Optimize planning time for simple limits

2020-11-23 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16723 )

Change subject: IMPALA-10314: Optimize planning time for simple limits
..


Patch Set 11:

(4 comments)

Just a couple of corner cases I have run into; given that this is now an opt-in 
optimization, it might not be incorrect to ignore these.

I think it's good to focus on the cases where this optimization helps and not 
risk an incorrect limit in other cases. It helps most with:
a. lots of files
b. small limits

a) the scan range and scheduling overhead is only slow when there are many 
hosts + files.

b) for large limits maybe the bulk of query run time goes to fetching results 
and not the planning, but that said it may not hurt too much in this case.

http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@854
PS11, Line 854:   FileSystem partitionFs;
I'd consider only doing this optimization for "record oriented" or "splittable" 
file formats, like parquet/avro. For text tables it's not uncommon for parsing 
or record boundary issues to cause an entire file to be invalid.


http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@867
PS11, Line 867:   for (FileDescriptor fd : partition.getFileDescriptors()) {
Also, maybe add a threshold on the number of scan ranges below which we 
wouldn't bother with the optimization. be/src/benchmarks/scheduler-benchmark is 
a good test to run; if I recall correctly, tens of nodes and hundreds of files 
isn't too slow. My thinking is that the cases where this optimization's 
assumption is potentially problematic are those with a few malformed files, so 
it's better to err on the side of caution there.


http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@873
PS11, Line 873: simpleLimitNumRows++;  // conservatively estimate 1 row per file
The flip side of this: for simple limits that are relatively large, or larger 
than the number of files, a maximum bail-out threshold might make sense. 
I.e., if the limit is 10,000, is there any sense in doing this optimization 
unless the number of files is much greater than 10,000?
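
One way to picture such a bail-out threshold is the Python sketch below. It 
assumes the total file count is available when the decision is made, and the 
factor of 10 standing in for "much greater" is an arbitrary illustrative 
value. The function and parameter names are hypothetical and do not come from 
the patch.

```python
def worth_pruning_for_limit(
    limit: int, total_files: int, min_files_per_row: int = 10
) -> bool:
    """Return True only when the file count dwarfs the requested limit.

    Hypothetical heuristic: with one estimated row per file, pruning keeps
    roughly `limit` files, so it only pays off when far more files exist.
    """
    return total_files >= limit * min_files_per_row


small_limit = worth_pruning_for_limit(limit=10, total_files=5000)
large_limit = worth_pruning_for_limit(limit=10_000, total_files=10_000)
```

With these inputs a limit of 10 over 5,000 files qualifies, while a limit of 
10,000 over 10,000 files bails out.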


http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/16723/11/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1121
PS11, Line 1121:
For Hive ACID and deleted records would this logic still work? Might be a 
helpful test.



--
To view, visit http://gerrit.cloudera.org:8080/16723

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
Gerrit-Change-Number: 16723
Gerrit-PatchSet: 11
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 21:32:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6698/


--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 21:23:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10334: test stats extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 23 Nov 2020 20:36:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10334: test stats extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..

IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding 
build

This patch skips test_stats_extrapolation for erasure coding builds. The
reason is that an extra erasure coding information line can be included
in the scan explain section when an HDFS table is erasure coded. This
makes the explain output different between a normal build and an
erasure coding build. A new reason 'contain_full_explain' is added to
SkipIfEC to facilitate this.

Testing:
  Ran erasure coding version of the EE and CLUSTER tests.
  Ran core tests

Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Reviewed-on: http://gerrit.cloudera.org:8080/16756
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/common/skip.py
M tests/metadata/test_stats_extrapolation.py
2 files changed, 4 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7718/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:59:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10312: bump timeout in test ddl queries are closed

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16762 )

Change subject: IMPALA-10312: bump timeout in test_ddl_queries_are_closed
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16762

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5885df6494122dffe2bbc6877cec3b90a9eb4ec6
Gerrit-Change-Number: 16762
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:47:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6703/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:51:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:51:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 2: Code-Review+2

Carrying over +2


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:51:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16774/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/16774/1/tests/custom_cluster/test_admission_controller.py@1623
PS1, Line 1623:
> flake8: E226 missing whitespace around arithmetic operator
Done



--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:51:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16774

to look at the new patch set (#2).

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..

IMPALA-8202: Extend query timeout for test_mem_limit

With the current timeout set to 1 sec, we saw flaky failures where
the query timed out due to client inactivity. This can happen if
the thread keeping it alive fails to execute a fetch within the
query timeout period. This patch attempts to fix this flakiness
by increasing the timeout period.

Testing:
Looped the test locally.

Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
---
M tests/custom_cluster/test_admission_controller.py
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16774/2
--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:48:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16749 )

Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16749

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e
Gerrit-Change-Number: 16749
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:48:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8990: Fix flakiness in test set request pool

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16749 )

Change subject: IMPALA-8990: Fix flakiness in test_set_request_pool
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6702/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16749

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife06509e936443579ca60780013ce01352c8932e
Gerrit-Change-Number: 16749
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:48:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10312: bump timeout in test ddl queries are closed

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16762 )

Change subject: IMPALA-10312: bump timeout in test_ddl_queries_are_closed
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6701/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16762

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5885df6494122dffe2bbc6877cec3b90a9eb4ec6
Gerrit-Change-Number: 16762
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:48:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10312: bump timeout in test ddl queries are closed

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16762 )

Change subject: IMPALA-10312: bump timeout in test_ddl_queries_are_closed
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16762

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5885df6494122dffe2bbc6877cec3b90a9eb4ec6
Gerrit-Change-Number: 16762
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:48:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4238: make TestClientSsl more robust

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16760 )

Change subject: IMPALA-4238: make TestClientSsl more robust
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6700/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16760

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Gerrit-Change-Number: 16760
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:46:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4238: make TestClientSsl more robust

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16760 )

Change subject: IMPALA-4238: make TestClientSsl more robust
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16760

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Gerrit-Change-Number: 16760
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:46:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9050: fix TestScanRangeLengths params

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16761 )

Change subject: IMPALA-9050: fix TestScanRangeLengths params
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6699/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16761

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b8591335dcdc85ce27674b35661444a46d30d5a
Gerrit-Change-Number: 16761
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:46:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9050: fix TestScanRangeLengths params

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16761 )

Change subject: IMPALA-9050: fix TestScanRangeLengths params
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16761

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b8591335dcdc85ce27674b35661444a46d30d5a
Gerrit-Change-Number: 16761
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:46:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9050: fix TestScanRangeLengths params

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16761 )

Change subject: IMPALA-9050: fix TestScanRangeLengths params
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16761

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b8591335dcdc85ce27674b35661444a46d30d5a
Gerrit-Change-Number: 16761
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:46:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4238: make TestClientSsl more robust

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16760 )

Change subject: IMPALA-4238: make TestClientSsl more robust
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16760

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Gerrit-Change-Number: 16760
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:45:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16774 )

Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16774/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/16774/1/tests/custom_cluster/test_admission_controller.py@1623
PS1, Line 1623: /
flake8: E226 missing whitespace around arithmetic operator



--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:38:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8202: Extend query timeout for test mem limit

2020-11-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16774 )


Change subject: IMPALA-8202: Extend query timeout for test_mem_limit
..

IMPALA-8202: Extend query timeout for test_mem_limit

With the current timeout set to 1 sec, we saw flaky failures where
the query timed out due to client inactivity. This can happen if
the thread keeping it alive fails to execute a fetch within the
query timeout period. This patch attempts to fix this flakiness
by increasing the timeout period.

Testing:
Looped the test locally.

Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
---
M tests/custom_cluster/test_admission_controller.py
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16774/1
--
To view, visit http://gerrit.cloudera.org:8080/16774

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic02f73bea528af12053043e0a57b4158532833b4
Gerrit-Change-Number: 16774
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 18:18:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6698/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:57:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:57:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate

2020-11-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: IMPALA-10325 Parquet scan should use min/max statistics to skip 
pages based on equi-join predicate
..


Patch Set 12:

(9 comments)

Did a first round, found a couple of nits

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG@20
PS12, Line 20: togther
nit: together


http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG@21
PS12, Line 21: evaualted
nit: evaluated


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/hdfs-scan-node-base.cc@231
PS12, Line 231:   for (const auto& entry : tnode.hdfs_scan_node.overlap_predicate_filter_ids) {
              :     overlap_predicate_filter_ids_.push_back(entry);
              :   }
If 'overlap_predicate_filter_ids' is also a vector, then we should probably
just use:

  overlap_predicate_filter_ids_ = tnode.hdfs_scan_node.overlap_predicate_filter_ids;
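
To illustrate the suggestion, here is a minimal, self-contained sketch. It
assumes both sides are std::vector<int>; the real element type is not shown in
this thread, and FakeHdfsScanNode is a hypothetical stand-in for the Thrift
struct. The two forms are equivalent only when the destination starts out
empty, since the loop appends while assignment replaces.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the Thrift-generated node. Only the field name
// comes from the review comment; the element type is an assumption.
struct FakeHdfsScanNode {
  std::vector<int> overlap_predicate_filter_ids;
};

// Element-by-element copy, mirroring the loop under review.
std::vector<int> CopyWithLoop(const FakeHdfsScanNode& node) {
  std::vector<int> ids;
  for (const auto& entry : node.overlap_predicate_filter_ids) {
    ids.push_back(entry);
  }
  return ids;
}

// Copy assignment, as suggested: same resulting contents in one statement.
std::vector<int> CopyWithAssignment(const FakeHdfsScanNode& node) {
  std::vector<int> ids;
  ids = node.overlap_predicate_filter_ids;
  return ids;
}
```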


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@553
PS12, Line 553:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@595
PS12, Line 595: 4
nit: magic constant


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@811
PS12, Line 811: return pnode.overlap_predicate_start_index_;
nit: seems like it's also a member of the scan node.


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@817
PS12, Line 817:   return pnode.overlap_predicate_filter_ids_;
nit: seems like it's also a member of the scan node.


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@950
PS12, Line 950: 2
nit: magic constant


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@976
PS12, Line 976: 4
nit: magic constant



--
To view, visit http://gerrit.cloudera.org:8080/16720

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 12
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:54:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:49:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on equi-join predicate

2020-11-23 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: IMPALA-10325 Parquet scan should use min/max statistics to skip 
pages based on equi-join predicate
..


Patch Set 12:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG@9
PS12, Line 9: This patch adds the logic to utilize min/max stats
> Does this patch also leads to utilizing min/max filters per-row, similarly
That is an interesting thought. I would think we will get some ideas from
performance testing and from collecting overlap information.

Per-row min/max evaluation may be advantageous for string data, as it may not
need to examine every character in the string before finding an inequality.
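
A small sketch of why this can be cheap for strings, assuming a plain
byte-wise lexicographic comparison. InMinMaxRange is a hypothetical helper, not
Impala's filter code: a lexicographic compare can be decided at the first
differing byte, so a value far outside the range is often rejected after
looking at only a prefix.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true if `value` lies in the closed range
// [min, max] under lexicographic order. Each compare() result is determined
// by the first position where the two strings differ.
bool InMinMaxRange(const std::string& value, const std::string& min,
                   const std::string& max) {
  return value.compare(min) >= 0 && value.compare(max) <= 0;
}
```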


http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG@9
PS12, Line 9: This patch adds the logic to utilize min/max stats
> I think this would be a good thing to do (I think the patch does this autom
Yes, the order in which to apply them is an interesting question. If we apply
it to strings, evaluating min/max first probably makes sense.


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@549
PS12, Line 549: if ( eval_min_max ) {
> I am wondering if it is possible to handle min/max runtime filters more sim
That seems like a good idea: the new logic here can be moved over to the
min/max filter itself (e.g. to a new method EvalOverLap()) so that other types
of HDFS scanners (e.g., ORC) can benefit. It can probably also simplify things
a little bit here.

Let me look into it.
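
As a rough sketch of what such a method could look like, assuming an
integer-typed filter: the struct layout and signature below are hypothetical,
and only the idea of an overlap test on the filter comes from this thread. The
check asks whether the filter's range intersects the min/max statistics of a
page or row group; if it does not, no row in that unit can pass the filter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical min/max filter over int64 values.
struct MinMaxFilter {
  int64_t min;
  int64_t max;

  // Returns true if the closed range [col_min, col_max] taken from column
  // statistics intersects the filter's [min, max]. A false result means the
  // whole page or row group can be skipped.
  bool EvalOverlap(int64_t col_min, int64_t col_max) const {
    return col_min <= max && col_max >= min;
  }
};
```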


http://gerrit.cloudera.org:8080/#/c/16720/12/be/src/exec/parquet/hdfs-parquet-scanner.cc@862
PS12, Line 862: TYPE_DATETIME
> You meant TYPE_TIMESTAMP, right? DATETIME is completely unsupported in Impa
Done



--
To view, visit http://gerrit.cloudera.org:8080/16720

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 12
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:28:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Zoltan Borok-Nagy (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16771

to look at the new patch set (#2).

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..

IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

During Parquet file writing, a DCHECK checks if row group stats have
copied the min/max string values into their internal buffers. This check
is at the finalization of each page. The copying of the string values
happened at the end of each row batch.

Thus, if a row batch spans multiple pages, the min/max string values
have not been copied yet when a page ends. Since the memory is attached
to the row batch, this isn't really an error.

As a workaround this commit also copies the min/max string values
at the end of the page if they haven't been copied yet.
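
The timing issue described above can be sketched as a small Python model. This is not the writer's actual code: `StringMinMaxStats`, `owns_values`, and `write_batch` are hypothetical names, and the `owns_values` flag stands in for "the min/max bytes have been copied into the stats object's own buffer".

```python
class StringMinMaxStats:
    def __init__(self):
        self.min = None
        self.max = None
        self.owns_values = True  # nothing borrowed from a row batch yet

    def update(self, value):
        # A new min or max points at memory owned by the row batch.
        if self.min is None or value < self.min:
            self.min = value
            self.owns_values = False
        if self.max is None or value > self.max:
            self.max = value
            self.owns_values = False

    def materialize(self):
        # A real writer would copy the bytes into its own buffer here.
        self.owns_values = True


def write_batch(stats, batch, page_size):
    """Feed one row batch; return the ownership flag seen at each page end."""
    seen_at_page_end = []
    for i, value in enumerate(batch, start=1):
        stats.update(value)
        if i % page_size == 0:
            stats.materialize()  # the workaround: also copy at page end
            seen_at_page_end.append(stats.owns_values)
    stats.materialize()  # pre-existing behaviour: copy at end of row batch
    return seen_at_page_end
```

Without the `materialize()` call at the page boundary, the flag would still be `False` when a page ends in the middle of a batch, which corresponds to the condition the DCHECK complains about.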

Testing
 * Added e2e test

Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M testdata/workloads/functional-query/queries/QueryTest/parquet-page-index.test
M tests/query_test/test_parquet_stats.py
3 files changed, 18 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/16771/2
--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16771/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16771/1//COMMIT_MSG@9
PS1, Line 9: The DCHECK checks if row group stats have copied the min/max string
> Can you add somewhere that this can occur during insert? People (or at leas
Done



--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:17:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:04:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6697/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 4
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:04:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:04:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16771/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16771/1//COMMIT_MSG@9
PS1, Line 9: The DCHECK checks if row group stats have copied the min/max string
Can you add somewhere that this can occur during insert? People (or at least 
me) think about SELECT by default.



--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 15:04:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6695/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 14:03:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9: Code-Review+2

Also LGTM


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 14:02:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16771 )

Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7717/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 23 Nov 2020 13:54:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7716/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Comment-Date: Mon, 23 Nov 2020 13:52:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9: Code-Review+2

LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 13:35:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

2020-11-23 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16771 )


Change subject: IMPALA-10345: Impala hits DCHECK in 
parquet-column-stats.inline.h
..

IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

The DCHECK checks if row group stats have copied the min/max string
values into their internal buffers. This check is at the end of each
page. The copying of the string values happened at the end of each
row batch.

Thus, if a row batch spans multiple pages, the min/max string values
have not been copied yet when a page ends. Since the memory is attached
to the row batch, this isn't really an error.

As a workaround this commit also copies the min/max string values
at the end of the page if they haven't been copied yet.

Testing
 * Added e2e test

Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M testdata/workloads/functional-query/queries/QueryTest/parquet-page-index.test
M tests/query_test/test_parquet_stats.py
3 files changed, 18 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/16771/1
--
To view, visit http://gerrit.cloudera.org:8080/16771

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding build

2020-11-23 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/16756 )

Change subject: IMPALA-10334: test_stats_extrapolation output doesn't match on 
erasure coding build
..

IMPALA-10334: test_stats_extrapolation output doesn't match on erasure coding 
build

This patch skips test_stats_extrapolation for erasure coding builds.
When an HDFS table is erasure coded, the scan section of the explain
output can include an extra line of erasure coding information, so the
explain output differs between a normal build and an erasure coding
build. A new reason 'contain_full_explain' is added to SkipIfEC to
facilitate this.
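
The skip-marker pattern can be sketched as follows. This is only an illustration using the standard library's `unittest.skipIf` as a stand-in; the actual change lives in tests/common/skip.py, and the `IS_EC` detection and the environment variable name below are assumptions made for the example.

```python
import os
import unittest

# Assumption: a hypothetical environment variable marks an erasure coding run.
IS_EC = os.environ.get("HYPOTHETICAL_ERASURE_CODING", "false") == "true"


class SkipIfEC:
    # Named skip reasons are collected in one place so tests can reference them.
    contain_full_explain = unittest.skipIf(
        IS_EC, "Explain output contains an extra erasure coding line")


class TestStatsExtrapolation(unittest.TestCase):
    @SkipIfEC.contain_full_explain
    def test_stats_extrapolation(self):
        # Placeholder body; the real test compares full explain output.
        self.assertTrue(True)
```

Keeping the reason on a shared class means the condition is evaluated once and every affected test documents why it is skipped.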

Testing:
  Ran erasure coding version of the EE and CLUSTER tests.
  Ran core tests

Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
---
M tests/common/skip.py
M tests/metadata/test_stats_extrapolation.py
2 files changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/16756/3
--
To view, visit http://gerrit.cloudera.org:8080/16756

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16c11aa0a1ec2d4569c272d2454915041039f950
Gerrit-Change-Number: 16756
Gerrit-PatchSet: 3
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7715/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 13:09:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6694/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 12:48:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 9:

(1 comment)

PS9 is a rebase onto master to resolve a merge conflict.

http://gerrit.cloudera.org:8080/#/c/16599/7/testdata/data/iceberg_test/iceberg_multi_snapshots/metadata/v3.metadata.json
File testdata/data/iceberg_test/iceberg_multi_snapshots/metadata/v3.metadata.json:

http://gerrit.cloudera.org:8080/#/c/16599/7/testdata/data/iceberg_test/iceberg_multi_snapshots/metadata/v3.metadata.json@4
PS7, Line 4:
> The dockerised tests don't like the namenode hardcoded. We can easily remo
Instead of adding this to the data load, I created the table in the test and 
ran 2 inserts to have multiple snapshots. I found this easier than 
re-importing the files for the table.



--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 12:46:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Gabor Kaszab (Code Review)
Hello Qifan Chen, Zoltan Borok-Nagy, wangsheng, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16599

to look at the new patch set (#9).

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..

IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

The DESCRIBE HISTORY statement works for Iceberg tables and displays
the snapshot history of the table.

An example output:
DESCRIBE HISTORY iceberg_multi_snapshots;
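+----------------------------+---------------------+---------------------+---------------------+
| creation_time              | snapshot_id         | parent_id           | is_current_ancestor |
+----------------------------+---------------------+---------------------+---------------------+
| 2020-10-13 14:01:07.234000 | 4400379706200951771 | NULL                | TRUE                |
| 2020-10-13 14:01:19.307000 | 4221472712544505868 | 4400379706200951771 | TRUE                |
+----------------------------+---------------------+---------------------+---------------------+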

The goal is for this new feature to produce output similar to what
Spark SQL returns for "SELECT * from tablename.history".
See "History" section of
https://iceberg.apache.org/spark/#inspecting-tables
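
As a rough illustration of how the is_current_ancestor column relates to the other columns, here is a minimal sketch in Python. It assumes the flag means "reachable from the current snapshot by following parent_id links", which the commit message does not spell out; `history_rows` is a hypothetical helper, not the code added in this change.

```python
def history_rows(snapshots, current_id):
    """Build history rows from (creation_time, snapshot_id, parent_id) tuples.

    Returns (creation_time, snapshot_id, parent_id, is_current_ancestor) rows.
    """
    parent_of = {sid: parent for _, sid, parent in snapshots}
    ancestors = set()
    cursor = current_id
    # Walk the parent chain from the current snapshot back to the root.
    while cursor is not None and cursor not in ancestors:
        ancestors.add(cursor)
        cursor = parent_of.get(cursor)
    return [(t, sid, parent, sid in ancestors) for t, sid, parent in snapshots]
```

With the two snapshots from the example output and the second one as the current snapshot, both rows are ancestors of the current state, matching the TRUE values shown.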

Testing:
  - iceberg-negative.test was extended to check that DESCRIBE HISTORY
    is not applicable for non-Iceberg tables.
  - iceberg-table-history.test: Covers basic usage of DESCRIBE
    HISTORY. Tests on tables created with Impala and also with Spark.

Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
---
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M common/thrift/Frontend.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
A fe/src/main/java/org/apache/impala/analysis/DescribeHistoryStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/data/README
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-table-history.test
M tests/query_test/test_iceberg.py
14 files changed, 283 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16599/9
--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7714/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 11:35:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10333: Fix utf-8 test failures when impala-shell using older thrift versions

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16767 )

Change subject: IMPALA-10333: Fix utf-8 test failures when impala-shell using 
older thrift versions
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16767

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb0baa9b3a1480673af77f7cc35c05eacf4b449f
Gerrit-Change-Number: 16767
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Nov 2020 11:30:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10333: Fix utf-8 test failures when impala-shell using older thrift versions

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16767 )

Change subject: IMPALA-10333: Fix utf-8 test failures when impala-shell using 
older thrift versions
..

IMPALA-10333: Fix utf-8 test failures when impala-shell using older thrift 
versions

In some branches where impala-shell still uses an older version of
thrift, e.g. thrift-0.9.3-p8, test_utf8_decoding_error_handling fails
because the internal string representation in thrift versions lower than
0.10.0 is still bytes. Strings are not decoded to unicode, so there are
no decoding errors. The test expects bytes that can't be decoded
correctly to be replaced with U+FFFD, so it fails.

This patch improves the test by also accepting the results produced by
older thrift versions, so it can be cherry-picked to older branches.
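
The difference between the two behaviours can be sketched in plain Python. This is an illustration only, not the test's actual code; `render` and the sample bytes are hypothetical.

```python
RAW = b"caf\xff"  # 0xff is never valid in UTF-8


def render(raw, decodes_strings):
    """Model the two client behaviours described above."""
    if decodes_strings:
        # Newer behaviour: decode, replacing undecodable bytes with U+FFFD.
        return raw.decode("utf-8", errors="replace")
    # Older behaviour: strings stay as raw bytes, so nothing is replaced.
    return raw


# A test that should pass in both setups accepts either result.
ACCEPTABLE = {render(RAW, True), render(RAW, False)}
```

Accepting either form is what lets the same test run on master and on branches that still ship the older client library.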

Tests:
 - Verified the test on the master branch and on a downstream branch that
   still uses thrift-0.9.3-p8 in impala-shell.

Change-Id: Ieb0baa9b3a1480673af77f7cc35c05eacf4b449f
Reviewed-on: http://gerrit.cloudera.org:8080/16767
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M tests/shell/test_shell_commandline.py
1 file changed, 12 insertions(+), 2 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16767

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ieb0baa9b3a1480673af77f7cc35c05eacf4b449f
Gerrit-Change-Number: 16767
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6693/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 23 Nov 2020 11:17:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

2020-11-23 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/16599 )

Change subject: IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables
..

IMPALA-10288: Implement DESCRIBE HISTORY for Iceberg tables

The DESCRIBE HISTORY statement works for Iceberg tables and displays
the snapshot history of the table.

An example output:
DESCRIBE HISTORY iceberg_multi_snapshots;
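+----------------------------+---------------------+---------------------+---------------------+
| creation_time              | snapshot_id         | parent_id           | is_current_ancestor |
+----------------------------+---------------------+---------------------+---------------------+
| 2020-10-13 14:01:07.234000 | 4400379706200951771 | NULL                | TRUE                |
| 2020-10-13 14:01:19.307000 | 4221472712544505868 | 4400379706200951771 | TRUE                |
+----------------------------+---------------------+---------------------+---------------------+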

The goal is for this new feature to produce output similar to what
Spark SQL returns for "SELECT * from tablename.history".
See "History" section of
https://iceberg.apache.org/spark/#inspecting-tables

Testing:
  - iceberg-negative.test was extended to check that DESCRIBE HISTORY
    is not applicable for non-Iceberg tables.
  - iceberg-table-history.test: Covers basic usage of DESCRIBE
    HISTORY. Tests on tables created with Impala and also with Spark.

Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
---
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M common/thrift/Frontend.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
A fe/src/main/java/org/apache/impala/analysis/DescribeHistoryStmt.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/data/README
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-table-history.test
M tests/query_test/test_iceberg.py
14 files changed, 283 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16599/8
--
To view, visit http://gerrit.cloudera.org:8080/16599

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56a4b92c27e8e4a79109696cbae62735a00750e5
Gerrit-Change-Number: 16599
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng