[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 23 Jun 2020 02:42:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..

IMPALA-9866: Query plan debug page stuck in fetch and render loop.

Once a query is completed we stop fetching and rendering the plan. This
speeds interaction with large query plans in the web browser as well as
reduces some load on the query coordinator.

Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Reviewed-on: http://gerrit.cloudera.org:8080/16092
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/service/impala-http-handler.cc
M www/query_plan.tmpl
2 files changed, 8 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

2020-06-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16099 )

Change subject: IMPALA-9878: Fix use-after-free in TmpFileMgrTest's 
TestAllocation
..

IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

ASAN found a use-after-free for the in this code:
  file_group.Close(); <--- free underlying storage for 'file'
  EXPECT_FALSE(boost::filesystem::exists(file->path())); <-- use 'file'
This switches it to a copy of file->path().

Testing:
 - Ran tmp-file-mgr-test under ASAN

Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Reviewed-on: http://gerrit.cloudera.org:8080/16099
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/tmp-file-mgr-test.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Gerrit-Change-Number: 16099
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16099 )

Change subject: IMPALA-9878: Fix use-after-free in TmpFileMgrTest's 
TestAllocation
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Gerrit-Change-Number: 16099
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 22:06:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 21:46:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 21:46:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6035/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 21:46:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-452: Add support for string concatenation operator using ||

2020-06-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15877 )

Change subject: IMPALA-452: Add support for string concatenation operator using 
||
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3f990d56ecb1e18d1b2737e8c5eab0d524edfaf
Gerrit-Change-Number: 15877
Gerrit-PatchSet: 12
Gerrit-Owner: Martin Zink 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Martin Zink 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 21:36:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-452: Add support for string concatenation operator using ||

2020-06-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15877 )

Change subject: IMPALA-452: Add support for string concatenation operator using 
||
..

IMPALA-452: Add support for string concatenation operator using ||

Separated "||" and "OR" into different tokens.
OR (KW_OR) remains the same. (it creates CompoundPredicate
and expects two BOOLEAN operands)
|| (KW_LOGICAL_OR) creates CompoundVerticalBarExpr
which expects two BOOLEAN operands or two STRING operands

CompoundVerticalBarExpr creates either a CompoundPredicate
or a FunctionCallExpr member variable based on the type
of the left operand during analyze.

Similarly to BetweenPredicate it cannot be executed directly
so its needs to be replaced by its member variable
by ExtractCompoundVerticalBarExprRule.

Change-Id: Ie3f990d56ecb1e18d1b2737e8c5eab0d524edfaf
Reviewed-on: http://gerrit.cloudera.org:8080/15877
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 
---
M be/src/exprs/expr-test.cc
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java
A 
fe/src/main/java/org/apache/impala/rewrite/ExtractCompoundVerticalBarExprRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
12 files changed, 416 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/15877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie3f990d56ecb1e18d1b2737e8c5eab0d524edfaf
Gerrit-Change-Number: 15877
Gerrit-PatchSet: 13
Gerrit-Owner: Martin Zink 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Martin Zink 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-06-22 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 6:

(5 comments)

adding Tim to the review as well

http://gerrit.cloudera.org:8080/#/c/16098/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16098/6/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1218
PS6, Line 1218: if (numRows < -1 || (numRows == 0 && 
tbl_.getTotalHdfsBytes() > 0)) {
  :   hasCorruptTableStats_ = true;
  : }
  : return numRows;
what about for unpartitioned tables? it looks like if the table stats are 
corrupted, and numRows == 0, we still return numRows = 0 from this method.
would be good to add a test case for unpartitioned tables as well.


http://gerrit.cloudera.org:8080/#/c/16098/6/testdata/data/alltypes_parquet_year2009_month01.parquet
File testdata/data/alltypes_parquet_year2009_month01.parquet:

http://gerrit.cloudera.org:8080/#/c/16098/6/testdata/data/alltypes_parquet_year2009_month01.parquet@1
PS6, Line 1: PAR1°ºLìØ  
ô×abcdefghijklmnopqrsNOPQRSTUVWXYZ[\]^_`ˆ‰Š‹ŒŽ‘’“”•–—˜™štuvwxyz{|}~€‚ƒ„
…†‡ 
!"#$%& 
do you need to create a new file for this test? shouldn't one of the existing 
files work?


http://gerrit.cloudera.org:8080/#/c/16098/6/tests/metadata/test_compute_stats.py
File tests/metadata/test_compute_stats.py:

http://gerrit.cloudera.org:8080/#/c/16098/6/tests/metadata/test_compute_stats.py@197
PS6, Line 197:   DROP TABLE {0}.{1} PURGE;
you shouldn't need this since the table is created in a unique database


http://gerrit.cloudera.org:8080/#/c/16098/6/tests/metadata/test_compute_stats.py@198
PS6, Line 198:   set hive.stats.autogather=true;
do you need this?


http://gerrit.cloudera.org:8080/#/c/16098/6/tests/metadata/test_compute_stats.py@213
PS6, Line 213: TBLPROPERTIES ("transactional"="true", 
"transactional_properties"="insert_only");
do you need this?



--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 22 Jun 2020 21:26:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2658: Extend the NDV function to accept a precision

2020-06-22 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15997 )

Change subject: IMPALA-2658: Extend the NDV function to accept a precision
..


Patch Set 36:

(9 comments)

mostly nits

http://gerrit.cloudera.org:8080/#/c/15997/34//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15997/34//COMMIT_MSG@16
PS34, Line 16:
> nit: extra white space
ping


http://gerrit.cloudera.org:8080/#/c/15997/34//COMMIT_MSG@35
PS34, Line 35: The enhancement involves both the front and the back end.
 :
 : In the frond end, a 2nd parameter in NDV() is allowed and 
verified.
 : In addition, the data type of the intermediate result in the
 : plan records the correct amount of memory needed. This is 
assisted
 : by the inclusion of additional template aggregate function 
objects
 : in the built-in database.
 :
 : In the back end, the current hardcoded precision of 10 is 
removed. The
 : HLL algorithm now works with the default, or any valid precision 
values.
 : The precision value is computed from the corresponding scale 
value
 : stored in the query plan.
> I think u can remove this whole section.
ping - you can add this as a comment in the JIRA instead. generally commit 
messages should focus on what and why vs. how - 
https://chris.beams.io/posts/git-commit/#why-not-how


http://gerrit.cloudera.org:8080/#/c/15997/34//COMMIT_MSG@56
PS34, Line 56: Performance
> would be good to do some quick sanity checks to make sure ndv performance d
ping


http://gerrit.cloudera.org:8080/#/c/15997/34//COMMIT_MSG@83
PS34, Line 83:
 : Change-Id: I48a4517bd0959f7021143073d37505a46c551a58
 :
 :
 :
 :
> Remove this, its too verbose. You paste a link to the google doc containing
ping


http://gerrit.cloudera.org:8080/#/c/15997/36//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15997/36//COMMIT_MSG@51
PS36, Line 51: 2. Added and ran a new regression test (test_ndv)) in 
TestAggregationQueries
this applies to the rest of the commit message as well, but if you look at the 
commit message in gerrit 
(https://gerrit.cloudera.org/#/c/15997/36//COMMIT_MSG), notice a vertical 
dashed red line on the right. that line is basically the cutoff length for each 
line of the commit message.

ideally each line in the commit message is only 72 characters or less. if you 
use vim to write commit messages, there are standard tools that will do this 
for you automatically (e.g. https://github.com/tpope/vim-fugitive)


http://gerrit.cloudera.org:8080/#/c/15997/36/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15997/36/be/src/exprs/aggregate-functions-ir.cc@1473
PS36, Line 1473:
nit: extra white space


http://gerrit.cloudera.org:8080/#/c/15997/36/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

http://gerrit.cloudera.org:8080/#/c/15997/36/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@889
PS36, Line 889: help
is this suppose to be "helper"?


http://gerrit.cloudera.org:8080/#/c/15997/36/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

http://gerrit.cloudera.org:8080/#/c/15997/36/tests/query_test/test_aggregation.py@281
PS36, Line 281:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/15997/36/tests/query_test/test_aggregation.py@281
PS36, Line 281: Test two argument version of NDV()
same comment as before, referring to the "two argument version of NDV()" is 
vague. it would better to refer to it as "the version of NDV() that accepts a 
scale value".



--
To view, visit http://gerrit.cloudera.org:8080/15997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48a4517bd0959f7021143073d37505a46c551a58
Gerrit-Change-Number: 15997
Gerrit-PatchSet: 36
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 22 Jun 2020 20:53:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6394/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 22 Jun 2020 20:49:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6393/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 22 Jun 2020 20:49:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-452: Add support for string concatenation operator using ||

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15877 )

Change subject: IMPALA-452: Add support for string concatenation operator using 
||
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6392/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3f990d56ecb1e18d1b2737e8c5eab0d524edfaf
Gerrit-Change-Number: 15877
Gerrit-PatchSet: 12
Gerrit-Owner: Martin Zink 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Martin Zink 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 20:38:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-06-22 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..

IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

This work addresses the current limitation in computing the total row
count for a Hive table. The row count can be incorrectly computed as 0,
even though there exists data in partitions of the Hive table.

In the fix, as long as no partition in a Hive table exhibits any stats
corruptions including the kind described above, the total row count for
the table is computed from the row counts in all partitions. Otherwise,
Impala estimates the total row count from the total size of the partitions
and the row width, if feasible.

Testing:
1. Ran unit tests with queries documented in the case against Hive tables
   with the following configrations:
   a. No stats corruption in any partitions
   b. Stats corruption in some partitions
   c. Stats corruption in all partitions
2. Added a new test in test_compute_stats.py:
 test_corrupted_stats_impala_9744
3. Ran "core" test

Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A testdata/data/alltypes_parquet_year2009_month01.parquet
M tests/metadata/test_compute_stats.py
3 files changed, 72 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/6
--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 6
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16098/5/tests/metadata/test_compute_stats.py
File tests/metadata/test_compute_stats.py:

http://gerrit.cloudera.org:8080/#/c/16098/5/tests/metadata/test_compute_stats.py@232
PS5, Line 232: +
flake8: E226 missing whitespace around arithmetic operator



--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 22 Jun 2020 20:22:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-06-22 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..

IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

This work addresses the current limitation in computing the total row
count for a Hive table. The row count can be incorrectly computed as 0,
even though there exists data in partitions of the Hive table.

In the fix, as long as no partition in a Hive table exhibits any stats
corruptions including the kind described above, the total row count for
the table is computed from the row counts in all partitions. Otherwise,
Impala estimates the total row count from the total size of the partitions
and the row width, if feasible.

Testing:
1. Ran unit tests with queries documented in the case against Hive tables
   with the following configrations:
   a. No stats corruption in any partitions
   b. Stats corruption in some partitions
   c. Stats corruption in all partitions
2. Added a new test in test_compute_stats.py:
 test_corrupted_stats_impala_9744
3. Ran "core" test

Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A testdata/data/alltypes_parquet_year2009_month01.parquet
M tests/metadata/test_compute_stats.py
3 files changed, 72 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/5
--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 5
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-452: Add support for string concatenation operator using ||

2020-06-22 Thread Martin Zink (Code Review)
Martin Zink has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15877 )

Change subject: IMPALA-452: Add support for string concatenation operator using 
||
..


Patch Set 12:

(11 comments)

Hell... I thought when I publish the next patchset gerrit will finalize the 
draft commands. Sorry about that.

http://gerrit.cloudera.org:8080/#/c/15877/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15877/9//COMMIT_MSG@7
PS9, Line 7: IMPALA-452: Add support for string concatenation operator using ||
> commit message is still weirdly formatted. https://cwiki.apache.org/conflue
Done


http://gerrit.cloudera.org:8080/#/c/15877/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15877/10//COMMIT_MSG@22
PS10, Line 22:
> I compared the behaviour of this operator to postgres and hive, and it look
Yeah that is a good idea, I will also check out the different behaviors of 
different SQL parsers regarding the implicit casts.
Follow-up Jira seems a good way, I would love to work on that.


http://gerrit.cloudera.org:8080/#/c/15877/10//COMMIT_MSG@22
PS10, Line 22:
> We should also add some more tests to AnalyzeExprsTest.java that test both
I've added a bunch of positive and negative test cases to AnalyzeExprsTest


http://gerrit.cloudera.org:8080/#/c/15877/9/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15877/9/be/src/exprs/expr-test.cc@3892
PS9, Line 3892: (
> nit: we don't usually have a space before parentheses here and below. It's
Yeah old habits die hard, didn't know about the clang tidy this should not be 
an issue anymore thanks :)


http://gerrit.cloudera.org:8080/#/c/15877/10/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15877/10/be/src/exprs/expr-test.cc@3893
PS10, Line 3893:   TestStringValue("cast('foo  ' AS CHAR(5)) || '3'", "foo  3");
> Can you add some tests for string concatenation with NULL STRING arguments?
Originally OR accepted NULL so I've kept that argument type solely for the 
logical OR version. (changing the least amount compared to previous behaviour). 
I've added some test cases here and to AnalyzeStmtsTest.java as well to reflect 
this behavior.


http://gerrit.cloudera.org:8080/#/c/15877/11/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15877/11/be/src/exprs/expr-test.cc@3893
PS11, Line 3893:   TestStringValue("cast('foo  ' AS CHAR(5)) || '3'", "foo  3");
> I think we also want a positive test case, e.g.
Done


http://gerrit.cloudera.org:8080/#/c/15877/11/be/src/exprs/expr-test.cc@3896
PS11, Line 3896:   TestIsNull("cast(NULL AS STRING) || 'abc'", TYPE_STRING);
> Also the reverse, i.e. 'abc' || NULL. Since the condition in the planner ch
Done


http://gerrit.cloudera.org:8080/#/c/15877/9/fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java
File fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java:

http://gerrit.cloudera.org:8080/#/c/15877/9/fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java@33
PS9, Line 33:
> Can you comment that this is initialized during analysis.
Done


http://gerrit.cloudera.org:8080/#/c/15877/10/fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java
File fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java:

http://gerrit.cloudera.org:8080/#/c/15877/10/fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java@80
PS10, Line 80:   encapsulatedExpr_.analyze(analyzer);
> It's weird that we get different errors for, say string || int, vs int || s
Agreed, I also reworked the error message to reflect that both CHAR, and 
VARCHAR can be parsed with STRINGs (reflecting how isStringType() works)


http://gerrit.cloudera.org:8080/#/c/15877/10/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/15877/10/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@849
PS10, Line 849: RewritesOk("FALSE || id = 0", 
extractCompoundVerticalBarExprRule, "FALSE OR id = 0");
> Can we add a few more test cases for mixed types. E.g. a couple of cases of
Sure, I've added those test cases to AnalyzeStmtsTest.java (they would not fit 
here, because they fail during analysis not rewrite phase)


http://gerrit.cloudera.org:8080/#/c/15877/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/15877/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3099
PS9, Line 3099: select distinct bool_col || int_col < smallint_col,
> Can you reformat this so that it's more readable (i.e. not all one one line
Done



--
To view, visit http://gerrit.cloudera.org

[Impala-ASF-CR] IMPALA-452: Add support for string concatenation operator using ||

2020-06-22 Thread Martin Zink (Code Review)
Martin Zink has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/15877 )

Change subject: IMPALA-452: Add support for string concatenation operator using 
||
..

IMPALA-452: Add support for string concatenation operator using ||

Separated "||" and "OR" into different tokens.
OR (KW_OR) remains the same. (it creates CompoundPredicate
and expects two BOOLEAN operands)
|| (KW_LOGICAL_OR) creates CompoundVerticalBarExpr
which expects two BOOLEAN operands or two STRING operands

CompoundVerticalBarExpr creates either a CompoundPredicate
or a FunctionCallExpr member variable based on the type
of the left operand during analyze.

Similarly to BetweenPredicate it cannot be executed directly
so its needs to be replaced by its member variable
by ExtractCompoundVerticalBarExprRule.

Change-Id: Ie3f990d56ecb1e18d1b2737e8c5eab0d524edfaf
---
M be/src/exprs/expr-test.cc
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/CompoundVerticalBarExpr.java
A 
fe/src/main/java/org/apache/impala/rewrite/ExtractCompoundVerticalBarExprRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
12 files changed, 416 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/15877/12
--
To view, visit http://gerrit.cloudera.org:8080/15877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie3f990d56ecb1e18d1b2737e8c5eab0d524edfaf
Gerrit-Change-Number: 15877
Gerrit-PatchSet: 12
Gerrit-Owner: Martin Zink 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Martin Zink 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 7: Code-Review+1

Couple of minor comments. Overall, this LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Jun 2020 17:55:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16082/2/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test:

http://gerrit.cloudera.org:8080/#/c/16082/2/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@58
PS2, Line 58: |  row-size=100B cardinality=143
> Yeah I don't think we should pass the conjuncts to the delete deltas. Also
>>  It's because COMPUTE STATS is query-based, so it runs count(*) and 
>> count(distinct) queries on a table and stores the results of it.
Makes sense..so Compute Stats is also doing the left anti join.  Do you know if 
the sampling clause of Compute Stats will be supported for the ACID table ? i.e 
a 10% sample on the delta table would need to be anti-joined with the 
delete_delta.  I suppose that should work. In any case, that's a separate 
investigation.

Regarding the naming, yes, 'EVENTS'  is more accurate than 'TABLE' since it is 
not really a table.


http://gerrit.cloudera.org:8080/#/c/16082/4/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test:

http://gerrit.cloudera.org:8080/#/c/16082/4/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@286
PS4, Line 286: |  |  hash predicates: 
functional_orc_def.alltypes_deleted_rows.month = 
functional_orc_def.alltypes_deleted_rows-delete-delta.month, 
functional_orc_def.alltypes_deleted_rows.row__id.bucket = 
functional_orc_def.alltypes_deleted_rows-delete-delta.row__id.bucket, 
functional_orc_def.alltypes_deleted_rows.row__id.originaltransaction = 
functional_orc_def.alltypes_deleted_rows-delete-delta.row__id.originaltransaction,
 functional_orc_def.alltypes_deleted_rows.row__id.rowid = 
functional_orc_def.alltypes_deleted_rows-delete-delta.row__id.rowid, 
functional_orc_def.alltypes_deleted_rows.year = 
functional_orc_def.alltypes_deleted_rows-delete-delta.year
> It only adds the hidden ACID columns + the partitioning columns (year, mont
I see..I missed that the other columns were partitioning cols. I would think 
that not all the columns need to be in the 'hash predicate' .. as long as the 
ones with high NDV are considered for hashing, the rest could be part of the 
'other join predicates' to avoid hashing on too many columns.   But this could 
be a future enhancement.



--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Jun 2020 17:54:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16099 )

Change subject: IMPALA-9878: Fix use-after-free in TmpFileMgrTest's 
TestAllocation
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6034/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Gerrit-Change-Number: 16099
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 17:05:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16099 )

Change subject: IMPALA-9878: Fix use-after-free in TmpFileMgrTest's 
TestAllocation
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6391/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Gerrit-Change-Number: 16099
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 17:04:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

2020-06-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16099 )

Change subject: IMPALA-9878: Fix use-after-free in TmpFileMgrTest's 
TestAllocation
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Gerrit-Change-Number: 16099
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 16:58:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6390/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Jun 2020 16:48:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

2020-06-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16099


Change subject: IMPALA-9878: Fix use-after-free in TmpFileMgrTest's 
TestAllocation
..

IMPALA-9878: Fix use-after-free in TmpFileMgrTest's TestAllocation

ASAN found a use-after-free for the in this code:
  file_group.Close(); <--- free underlying storage for 'file'
  EXPECT_FALSE(boost::filesystem::exists(file->path())); <-- use 'file'
This switches it to a copy of file->path().

Testing:
 - Ran tmp-file-mgr-test under ASAN

Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
---
M be/src/runtime/tmp-file-mgr-test.cc
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16099/1
--
To view, visit http://gerrit.cloudera.org:8080/16099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idd5cbae70c287c78db8d1c560d8c777d6bed5b56
Gerrit-Change-Number: 16099
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 6:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/6389/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Jun 2020 16:41:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 7:

PS 6&7 are only rebases.


--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Jun 2020 16:21:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16082

to look at the new patch set (#7).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..

IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive 
types)

Hive ACID supports row-level DELETE and UPDATE operations on a table.
It achieves it via assigning a unique row-id for each row, and
maintaining two sets of files in a table. The first set is in the
base/delta directories, they contain the INSERTed rows. The second set
of files are in the delete-delta directories, they contain the DELETEd
rows.

(UPDATE operations are implemented via DELETE+INSERT.)

In the filesystem it looks like e.g.:
 * full_acid/delta_001_001_/_0
 * full_acid/delta_002_002_/_0
 * full_acid/delete_delta_003_003_/_0

During scanning we need to return INSERTed rows minus DELETEd rows.
This patch implements it by creating an ANTI JOIN between the INSERT and
DELETE sets. It is a planner-only modification. Every HDFS SCAN
that scans full ACID tables (that also have deleted rows) are converted
to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE
deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is
created above them.

Later we can add support for other distribution modes if the performance
requires it. E.g. if we have too many deleted rows then probably we are
better off with PARTITIONED distribution mode. We could estimate the
number of deleted rows by sampling the delete delta files.

The current patch only works for primitive types. I.e. we cannot select
nested data if the table has deleted rows.

Testing:
 * added planner test
 * added e2e tests

Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
---
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test
M tests/query_test/test_acid.py
15 files changed, 1,143 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/7
--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16082

to look at the new patch set (#6).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..

IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive 
types)

Hive ACID supports row-level DELETE and UPDATE operations on a table.
It achieves it via assigning a unique row-id for each row, and
maintaining two sets of files in a table. The first set is in the
base/delta directories, they contain the INSERTed rows. The second set
of files are in the delete-delta directories, they contain the DELETEd
rows.

(UPDATE operations are implemented via DELETE+INSERT.)

In the filesystem it looks like e.g.:
 * full_acid/delta_001_001_/_0
 * full_acid/delta_002_002_/_0
 * full_acid/delete_delta_003_003_/_0

During scanning we need to return INSERTed rows minus DELETEd rows.
This patch implements it by creating an ANTI JOIN between the INSERT and
DELETE sets. It is a planner-only modification. Every HDFS SCAN
that scans full ACID tables (that also have deleted rows) are converted
to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE
deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is
created above them.

Later we can add support for other distribution modes if the performance
requires it. E.g. if we have too many deleted rows then probably we are
better off with PARTITIONED distribution mode. We could estimate the
number of deleted rows by sampling the delete delta files.

The current patch only works for primitive types. I.e. we cannot select
nested data if the table has deleted rows.

Testing:
 * added planner test
 * added e2e tests

Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
---
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test
M tests/query_test/test_acid.py
15 files changed, 1,142 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/6
--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16082/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16082/4//COMMIT_MSG@9
PS4, Line 9: Hive ACID supports row-level DELETE and UPDATE operations on a 
table.
> nit: spelling 'operatations'
Done


http://gerrit.cloudera.org:8080/#/c/16082/4//COMMIT_MSG@11
PS4, Line 11: maintaining two sets of files in a table. The first set is in the
> nit: spelling 'maintinaining'
Done


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1454
PS2, Line 1454: if (fd.getRelativePath().startsWith("delete_delta_")) {
> I think for this patch it is fine to keep it this way.  I saw in the doc th
Yeah I agree. Will do that during perf analysis.


http://gerrit.cloudera.org:8080/#/c/16082/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1551
PS2, Line 1551: HdfsScanNode deltaScanNode = new 
HdfsScanNode(ctx_.getNextNodeId(),
> Actually, the planner does this optimization as part of SanNode.applyCountS
You are right, I was thinking about a different optimization in the ORC scanner 
which is also for count(*):

https://github.com/apache/impala/blob/17fd15c6e4981499932c02d541c76757a5fdf87d/be/src/exec/hdfs-orc-scanner.cc#L531-L548

Anyway, now we have tests for count(*) + ACID so if someone implements the 
Parquet-kind count(*) optimization for ORC in the future, they'll know if they 
break stg.


http://gerrit.cloudera.org:8080/#/c/16082/2/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test:

http://gerrit.cloudera.org:8080/#/c/16082/2/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@58
PS2, Line 58: |  row-size=100B cardinality=143
> Regarding the conjuncts, wouldn't it be incorrect to  even pass the conjunc
Yeah I don't think we should pass the conjuncts to the delete deltas. Also it 
wouldn't have any effect since the delete delta files contain no real data, 
only the ACID fields (originaltransaction, rowid, etc.) which serve as a 
tombstone.

To answer the question about cardinality, yes, COMPUTE STATS would show 90. 
It's because COMPUTE STATS is query-based, so it runs count(*) and 
count(distinct) queries on a table and stores the results of it. And the 
"DELETE TABLE" is not a separate table. The delete files are stored under the 
table's directory in so called "delete delta" directories next to the "delta" 
directories, e.g.:

  
/test-warehouse/managed/full_acid_part/p=1/delete_delta_003_003_/bucket_0
  
/test-warehouse/managed/full_acid_part/p=1/delta_001_001_/bucket_0_0

Given that do you think we should rename "DELETE TABLE HASH JOIN" to "DELETE 
EVENTS HASH JOIN"? (The Hive docs usually refer to the rows in delete delta 
files as "delete events")


http://gerrit.cloudera.org:8080/#/c/16082/4/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test:

http://gerrit.cloudera.org:8080/#/c/16082/4/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@286
PS4, Line 286: |  |  hash predicates: 
functional_orc_def.alltypes_deleted_rows.month = 
functional_orc_def.alltypes_deleted_rows-delete-delta.month, 
functional_orc_def.alltypes_deleted_rows.row__id.bucket = 
functional_orc_def.alltypes_deleted_rows-delete-delta.row__id.bucket, 
functional_orc_def.alltypes_deleted_rows.row__id.originaltransaction = 
functional_orc_def.alltypes_deleted_rows-delete-delta.row__id.originaltransaction,
 functional_orc_def.alltypes_deleted_rows.row__id.rowid = 
functional_orc_def.alltypes_deleted_rows-delete-delta.row__id.rowid, 
functional_orc_def.alltypes_deleted_rows.year = 
functional_orc_def.alltypes_deleted_rows-delete-delta.year
> This looks like it is adding all the other Slots to hash predicates, apart
It only adds the hidden ACID columns + the partitioning columns (year, month).



--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Jun 

[Impala-ASF-CR] IMPALA-9697: Support priority based scratch directory selection

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16091 )

Change subject: IMPALA-9697: Support priority based scratch directory selection
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6388/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I381c3a358e1382e6696325fec74667f1fa18dd17
Gerrit-Change-Number: 16091
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 16:13:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-06-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16082

to look at the new patch set (#5).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..

IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive 
types)

Hive ACID supports row-level DELETE and UPDATE operations on a table.
It achieves it via assigning a unique row-id for each row, and
maintaining two sets of files in a table. The first set is in the
base/delta directories, they contain the INSERTed rows. The second set
of files are in the delete-delta directories, they contain the DELETEd
rows.

(UPDATE operations are implemented via DELETE+INSERT.)

In the filesystem it looks like e.g.:
 * full_acid/delta_001_001_/_0
 * full_acid/delta_002_002_/_0
 * full_acid/delete_delta_003_003_/_0

During scanning we need to return INSERTed rows minus DELETEd rows.
This patch implements it by creating an ANTI JOIN between the INSERT and
DELETE sets. It is a planner-only modification. Every HDFS SCAN
that scans full ACID tables (that also have deleted rows) are converted
to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE
deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is
created above them.

Later we can add support for other distribution modes if the performance
requires it. E.g. if we have too many deleted rows then probably we are
better off with PARTITIONED distribution mode. We could estimate the
number of deleted rows by sampling the delete delta files.

The current patch only works for primitive types. I.e. we cannot select
nested data if the table has deleted rows.

Testing:
 * added planner test
 * added e2e tests

Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
---
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test
M tests/query_test/test_acid.py
15 files changed, 1,147 insertions(+), 89 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/5
--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9697: Support priority based scratch directory selection

2020-06-22 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16091 )

Change subject: IMPALA-9697: Support priority based scratch directory selection
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.h@103
PS1, Line 103: TmpDir(const std::string& path, int64_t bytes_limit, int 
priority,
> line too long (99 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.h@104
PS1, Line 104:   IntGauge* bytes_used_metric)
> line too long (105 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.cc@194
PS1, Line 194: LOG(ERROR) << "Malformed scratch directory capacity 
configuration '"
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.cc@207
PS1, Line 207: LOG(ERROR) << "Malformed scratch directory priority 
configuration '"
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/be/src/runtime/tmp-file-mgr.cc@259
PS1, Line 259: tmp_dirs_.emplace_back(scratch_subdir_path.string(), 
tmp_dirs[i]->bytes_limit,
> tab used for whitespace
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py@20
PS1, Line 20: import os
> flake8: F401 'copy' imported but unused
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py@230
PS1, Line 230: ,
> flake8: E231 missing whitespace after ','
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py@231
PS1, Line 231: ,
> flake8: E231 missing whitespace after ','
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py@232
PS1, Line 232: ,
> flake8: E231 missing whitespace after ','
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py@233
PS1, Line 233: ,
> flake8: E231 missing whitespace after ','
Done


http://gerrit.cloudera.org:8080/#/c/16091/1/tests/custom_cluster/test_scratch_disk.py@234
PS1, Line 234:
> flake8: E231 missing whitespace after ','
Done



--
To view, visit http://gerrit.cloudera.org:8080/16091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I381c3a358e1382e6696325fec74667f1fa18dd17
Gerrit-Change-Number: 16091
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 15:50:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9697: Support priority based scratch directory selection

2020-06-22 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/16091 )

Change subject: IMPALA-9697: Support priority based scratch directory selection
..

IMPALA-9697: Support priority based scratch directory selection

The '--scratch_dirs' configuration option now supports specifying the
priority of the scratch direcotry. The lower the numeric value, the
higher is the priority. If priority is not specified then default
priority with value numeric_limits::max() is used.

Valid formats for specifying the priority are:
- ::
- ::
Following formats use default priority:
- 
- :
- ::

The new logic in TmpFileGroup::AllocateSpace() tries to find a target
file using a prioritized round-robin scheme. Files are ordered in
decreasing order of their priority. The priority of a file is same as
the priority of the related directory. A target file is selected by
always searching in the ordered list starting from the file with highest
priority. If multiple files have same priority, then the target file is
selected in a round robin manner.

Testing:
- Added unit and e2e tests for priority based spilling logic.

Change-Id: I381c3a358e1382e6696325fec74667f1fa18dd17
---
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M tests/custom_cluster/test_scratch_disk.py
4 files changed, 368 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/16091/2
--
To view, visit http://gerrit.cloudera.org:8080/16091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I381c3a358e1382e6696325fec74667f1fa18dd17
Gerrit-Change-Number: 16091
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9747: More fine-grained codegen for text file scanners

2020-06-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16059 )

Change subject: IMPALA-9747: More fine-grained codegen for text file scanners
..


Patch Set 9: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/16059
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id370193af578ecf23ed3c6bfcc65fec448156fa3
Gerrit-Change-Number: 16059
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 12:56:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] POC: Remote codegen (Milestone 1)

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14735 )

Change subject: POC: Remote codegen (Milestone 1)
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6387/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14735
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
Gerrit-Change-Number: 14735
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 11:24:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] POC: Remote codegen (Milestone 1)

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14735 )

Change subject: POC: Remote codegen (Milestone 1)
..


Patch Set 4:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py
File testdata/bin/server.py:

http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@12
PS4, Line 12: def save_log_file(contents):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@17
PS4, Line 17:
flake8: E261 at least two spaces before inline comment


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@23
PS4, Line 23: def bc_to_ll(bc):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@41
PS4, Line 41: def has_non64bit_reloc(obj_file):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@67
PS4, Line 67: def compile_program(data):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@79
PS4, Line 79: [
flake8: E126 continuation line over-indented for hanging indent


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@106
PS4, Line 106: def to_message(data):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@111
PS4, Line 111: class MyTCPHandler(SocketServer.BaseRequestHandler):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@157
PS4, Line 157: class Server(SocketServer.ForkingMixIn, SocketServer.TCPServer):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@160
PS4, Line 160: def server_main():
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@171
PS4, Line 171: def client_main():
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14735/4/testdata/bin/server.py@181
PS4, Line 181: if __name__ == "__main__":
flake8: E305 expected 2 blank lines after class or function definition, found 1



--
To view, visit http://gerrit.cloudera.org:8080/14735
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
Gerrit-Change-Number: 14735
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Jun 2020 10:54:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] POC: Remote codegen (Milestone 1)

2020-06-22 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14735


Change subject: POC: Remote codegen (Milestone 1)
..

POC: Remote codegen (Milestone 1)

The Impala executor does everything as usual until the point when it
would optimize and compile the LLVM module. Instead, it sends the LLVM
bitcode to the compile server which optimizes it and sends back an
object file (handled in memory, not written to disk). The executor
blocks until it receives the compiled object file.

Note that the customization of the LLVM module (replacing function
calls) is still done by the executor.

This stage should not be used in production as it actually slows down
the code generation because it is done remotely but synchronously.

The server is a python script that calls llc to compile the code.
It will later be replaced probably by a special impalad that uses the
LLVM API to compile the code.

Communication is not secure.

Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
---
A be/src/codegen/LLVMSmallVectorMemoryBuffer.h
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
A be/src/codegen/simple_message.h
M testdata/bin/kill-all.sh
M testdata/bin/run-all.sh
A testdata/bin/server.py
7 files changed, 499 insertions(+), 90 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14735/4
--
To view, visit http://gerrit.cloudera.org:8080/14735
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
Gerrit-Change-Number: 14735
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6386/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 09:02:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-22 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 6:

There was a rebase between patch set 5 and 6. The only thing that is changed is 
bin/rat_exclude_files.txt.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 08:35:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-22 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 275 insertions(+), 230 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/6
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins