[Impala-ASF-CR] IMPALA-8428: Add support for caching file handles on s3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13221 )

Change subject: IMPALA-8428: Add support for caching file handles on s3
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4161/


--
To view, visit http://gerrit.cloudera.org:8080/13221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19
Gerrit-Change-Number: 13221
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 May 2019 02:42:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 : Fix for tests failing with incompatible column changes

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13254 )

Change subject: IMPALA-8369 : Fix for tests failing with incompatible column 
changes
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3095/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13254

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6
Gerrit-Change-Number: 13254
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 May 2019 01:37:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 : Fix for tests failing with incompatible column changes

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13254 )

Change subject: IMPALA-8369 : Fix for tests failing with incompatible column 
changes
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13254/2/fe/src/test/resources/hive-site.xml.py
File fe/src/test/resources/hive-site.xml.py:

http://gerrit.cloudera.org:8080/#/c/13254/2/fe/src/test/resources/hive-site.xml.py@85
PS2, Line 85: p
flake8: E501 line too long (92 > 90 characters)
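
The 90-character limit reported above can be checked locally before uploading a patch set. The following is only a sketch: `find_long_lines` and `MAX_LINE_LENGTH` are hypothetical names, not part of flake8 or the Impala tooling, and the limit of 90 is taken from the message above.

```python
# Assumed limit, taken from the "92 > 90 characters" message above.
MAX_LINE_LENGTH = 90


def find_long_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

Calling `find_long_lines(open(path).read())` on a file lists the lines that would trigger an E501-style warning under this assumed limit.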



--
To view, visit http://gerrit.cloudera.org:8080/13254

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6
Gerrit-Change-Number: 13254
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 May 2019 00:52:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8369 : Fix for tests failing with incompatible column changes

2019-05-06 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13254 )


Change subject: IMPALA-8369 : Fix for tests failing with incompatible column 
changes
..

IMPALA-8369 : Fix for tests failing with incompatible column changes

In Hive-3 the configuration for allowing users to make incompatible
column type changes was disabled by default. In Hive-2 this was allowed.
Some of the tests like data_errors/test_data_errors.py and
metadata/test_compute_stats.py make changes to column types which are
disallowed by HMS-3 by default. This change adds a configuration option
in hive-site.xml to allow making incompatible changes to column types so
that we can run the existing tests with HMS-3.
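
As a rough illustration of the kind of setting described above, the sketch below renders a settings dictionary as hive-site.xml text. The property name `hive.metastore.disallow.incompatible.col.type.changes` is an assumption based on Hive's metastore configuration and is not quoted from this patch; `CONFIG` and `render_hive_site` are hypothetical names and do not reproduce the actual contents of hive-site.xml.py.

```python
from xml.sax.saxutils import escape

# Hypothetical settings dict. The property name is an assumption, not taken
# from the patch; 'false' would let HMS accept incompatible column type changes.
CONFIG = {
    "hive.metastore.disallow.incompatible.col.type.changes": "false",
}


def render_hive_site(config):
    """Render a dict of settings as hive-site.xml text."""
    lines = ["<configuration>"]
    for name, value in sorted(config.items()):
        lines.append("  <property>")
        lines.append("    <name>%s</name>" % escape(name))
        lines.append("    <value>%s</value>" % escape(value))
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)
```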

Also, in HMS-3 there are certain new event types (OPEN_TXN, COMMIT_TXN,
etc.) which may not have dbname set. This breaks the assumption in the
code in EventProcessor which expects dbName_ to be not null at all
times. This patch also makes changes in the EventProcessor so that such
ignored events do not fail precondition checks during event processing.

Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/test/resources/hive-site.xml.py
2 files changed, 8 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/13254/2
--
To view, visit http://gerrit.cloudera.org:8080/13254

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6
Gerrit-Change-Number: 13254
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..

IMPALA-7370: DATE: Read/Write to parquet.

This change is a follow-up to IMPALA-7368 and adds support for DATE
type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET
statements associated with data files that contain dates are also
supported.

Parquet uses DATE logical type for dates. DATE logical type annotates
an INT32 that stores the number of days from the Unix epoch, 1 January
1970.

This representation introduces a parquet interoperability issue
between Impala and older versions of Hive:
- Before version 3.1, Hive used Julian calendar to represent dates
  up to 1582-10-05 and Gregorian calendar for dates starting with
  1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost.
- Impala uses proleptic Gregorian calendar, extending the Gregorian
  calendar backward to dates preceding its official introduction in
  1582-10-15.
This means that pre-1582-10-15 dates written to a parquet table by
Hive will be read back incorrectly by Impala and vice versa.

Note that Hive 3.1 switched to proleptic Gregorian calendar too, so
for Hive 3.1+ this is no longer an issue.
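
The mismatch can be made concrete with a small sketch. Python's `datetime.date` uses the proleptic Gregorian calendar, so it can stand in for a reader that behaves like Impala here. The Julian-calendar side uses the standard Julian Day Number formula and is an assumption about how a hybrid-calendar writer would count days, not Hive's actual code. All function names are hypothetical.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)
EPOCH_JDN = 2440588  # Julian Day Number of 1970-01-01 (Gregorian)


def proleptic_days(year, month, day):
    """Days since the Unix epoch in the proleptic Gregorian calendar."""
    return (date(year, month, day) - EPOCH).days


def julian_calendar_days(year, month, day):
    """Days since the Unix epoch for a date expressed in the Julian calendar.

    Uses the standard Julian Day Number formula for Julian-calendar dates.
    """
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    return jdn - EPOCH_JDN


def read_as_proleptic(days):
    """Interpret a stored day count the way a proleptic Gregorian reader would."""
    return EPOCH + timedelta(days=days)
```

Under these assumptions, a writer that stores the Julian date 1582-10-04 emits the day count just before that of 1582-10-15, and `read_as_proleptic` turns that count into 1582-10-14, ten days away from the date that was written.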

Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Reviewed-on: http://gerrit.cloudera.org:8080/13189
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/util/bit-packing.cc
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/data/README
A testdata/data/hive2_pre_gregorian.parquet
A testdata/data/out_of_range_date.parquet
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-query/queries/QueryTest/date-fileformat-support.test
D testdata/workloads/functional-query/queries/QueryTest/date-text-only-support.test
A testdata/workloads/functional-query/queries/QueryTest/hive2-pre-gregorian-date.test
A testdata/workloads/functional-query/queries/QueryTest/out-of-range-date.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/common/impala_connection.py
M tests/custom_cluster/test_parquet_page_index.py
M tests/query_test/test_date_queries.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_scanners.py
29 files changed, 465 insertions(+), 148 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 8
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 May 2019 00:36:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading

2019-05-06 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13251 )

Change subject: IMPALA-8369 (part 4): Hive 3: fixes for functional dataset 
loading
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13251/1/testdata/bin/load-dependent-tables.sql
File testdata/bin/load-dependent-tables.sql:

http://gerrit.cloudera.org:8080/#/c/13251/1/testdata/bin/load-dependent-tables.sql@a115
PS1, Line 115:
Some of the tests rely on the fact that this table exists. Perhaps we should 
also ignore/modify such tests if we are running against hive-3.

Running git grep "hive_index_tbl" shows that this is used in
CatalogObjectToFromThriftTest, CatalogTest and FrontendTest.



--
To view, visit http://gerrit.cloudera.org:8080/13251

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd
Gerrit-Change-Number: 13251
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 07 May 2019 00:34:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-966: Type errors are attributed to wrong expression with insert

2019-05-06 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13050 )

Change subject: IMPALA-966: Type errors are attributed to wrong expression with 
insert
..


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/13050/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13050/5//COMMIT_MSG@7
PS5, Line 7: IMPALA-966: Type errors are attributed to wrong expression with insert
   :
   : When insert multiple incompatible type values into a table,
   : error message should blame on the correct expression. If there
   : are multiple incompatible type values for a single target
   : column, error should blame on the first widest incompatible type
   : expression.
How about:
IMPALA-966: Attribute type errors to the right expression in an insert
statement

Currently if an insert statement contains multiple expressions that are 
incompatible with the column type, the error message returned attributes the 
error to the wrong expression. This patch makes sure the right expression is 
blamed. If there are multiple incompatible type values for the target column, 
then the error is attributed to the first widest (highest precision) 
incompatible type expression.


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@692
PS5, Line 692:   // If the queryStmt_ is a unionStmt, it will return a WidestExprs list
 :   // when do castToUnionCompatibleTypes().
 :   // widestTypeExpr will be null if the queryStmt_ is a SelectStmt
nit: superfluous comment


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@695
PS5, Line 695:   UnionStmt unionStmt =
 :   (queryStmt_ instanceof UnionStmt) ? (UnionStmt) queryStmt_ : null;
 :   if (unionStmt != null && unionStmt.getWidestExprs() != null
 :   && unionStmt.getWidestExprs().size() > 0) {
 : widestTypeExpr = unionStmt.getWidestExprs().get(i);
 :   }
nit: instead of doing this on every loop iteration, maybe just get the 
widestExprList before the loop and use it if it is not null


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
File fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java:

http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java@292
PS5, Line 292: null
nit: remove the comment above and add an inline comment here like
.., analyzer.isDecimalV2(), null /*widestTypeSrcExpr*/);


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java
File fe/src/main/java/org/apache/impala/analysis/StatementBase.java:

http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@196
PS5, Line 196: widestTypeSrcExpr
nit: add quotes since this refers to an input param


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@196
PS5, Line 196: for
nit: among


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@197
PS5, Line 197: Error message should blame on the widestTypeSrcExpr instead of the first
 :* compatible source expression.
nit: is only used when constructing an AnalysisException message to make sure 
the right expression is blamed in the error message


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java:

http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java@56
PS5, Line 56:   // widestExprs_ is a list of the first widest compatible expression for each column
nit: you can remove the first line and write "widest (highest precision)" here.
Also, can you mention what order they are stored in and add a full stop at the 
end?

http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3404
PS5, Line 3404: on
nit: the


http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3404
PS5, Line 3404:  // Error should blame on correct expression.
  : // The widest (highest precision) expression and type should appear in error.
nit: these two are a bit 

[Impala-ASF-CR] [IMPALA-8435] Prohibit operations on full transactional table.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13253 )

Change subject: [IMPALA-8435] Prohibit operations on full transactional table.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3094/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13253

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab
Gerrit-Change-Number: 13253
Gerrit-PatchSet: 2
Gerrit-Owner: Sudhanshu Arora 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 23:57:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13236 )

Change subject: IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for 
Hive 3 support
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG@16
PS2, Line 16:  This function also
:   exists in Hive 2, so while it isn't necessary, I didn't bother to make
:   it conditional on version
> Done
The comment should also be updated.



--
To view, visit http://gerrit.cloudera.org:8080/13236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 07 May 2019 00:01:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13236 )

Change subject: IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for 
Hive 3 support
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3091/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 23:52:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

2019-05-06 Thread Sudhanshu Arora (Code Review)
Sudhanshu Arora has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and 
partition
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/13215/6/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/13215/6/common/thrift/CatalogObjects.thrift@482
PS6, Line 482:   // 

Are we guaranteed that Hive team will keep this string backward compatible?


http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@231
PS6, Line 231: return null;
throw UnsupportedOperationException.


http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@374
PS6, Line 374:   * @return the list of valid write IDs for the table in a string
Nit: or null if there are no validWriteIds


http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@236
PS6, Line 236: StringBuilder validIdsBuf = new StringBuilder("Loaded ValidWriteIdLists: ");
For my understanding, how do we use the timeline?


http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@649
PS6, Line 649: writeId_ = msPartition != null ?
Nit: Handle null case in shim so that every call does not have to handle it.


http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1026
PS6, Line 1026: }
Nit: Use ternary or put else in the above line



--
To view, visit http://gerrit.cloudera.org:8080/13215

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 23:48:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13224 )

Change subject: IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3090/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13224

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d
Gerrit-Change-Number: 13224
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 23:36:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13251 )

Change subject: IMPALA-8369 (part 4): Hive 3: fixes for functional dataset 
loading
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3092/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13251

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd
Gerrit-Change-Number: 13251
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 06 May 2019 23:32:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8509. Lazily evaluate LOAD sections during data load

2019-05-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13252 )

Change subject: IMPALA-8509. Lazily evaluate LOAD sections during data load
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13252

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99
Gerrit-Change-Number: 13252
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 06 May 2019 23:35:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8509. Lazily evaluate LOAD sections during data load

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13252 )

Change subject: IMPALA-8509. Lazily evaluate LOAD sections during data load
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3093/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13252

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99
Gerrit-Change-Number: 13252
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 06 May 2019 23:36:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..

IMPALA-8364: [DOCS] Remove refereces to authz policy files

Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Reviewed-on: http://gerrit.cloudera.org:8080/13235
Tested-by: Impala Public Jenkins 
Reviewed-by: Fredy Wijaya 
---
M docs/topics/impala_authorization.xml
M docs/topics/impala_grant.xml
M docs/topics/impala_revoke.xml
M docs/topics/impala_show.xml
4 files changed, 54 insertions(+), 311 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Fredy Wijaya: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13248 )

Change subject: IMPALA-8503: add option to start Kudu cluster with HMS 
integration
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3089/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13248

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2
Gerrit-Change-Number: 13248
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 06 May 2019 23:01:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] [IMPALA-8435] Prohibit operations on full transactional table.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13253 )

Change subject: [IMPALA-8435] Prohibit operations on full transactional table.
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java:

http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@535
PS2, Line 535: AnalysisError("create table test as select * from functional.full_transactional_table",
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@537
PS2, Line 537: AnalyzesOk("create table test as select * from functional.insert_only_transactional_table");
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@557
PS2, Line 557: AnalysisError("alter table functional.full_transactional_table add columns (col2 string)",
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@559
PS2, Line 559: AnalyzesOk("alter table functional.insert_only_transactional_table add columns (col2 string)");
line too long (99 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13253

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab
Gerrit-Change-Number: 13253
Gerrit-PatchSet: 2
Gerrit-Owner: Sudhanshu Arora 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 22:57:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [IMPALA-8435] Prohibit operations on full transactional table.

2019-05-06 Thread Sudhanshu Arora (Code Review)
Sudhanshu Arora has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13253 )


Change subject: [IMPALA-8435] Prohibit operations on full transactional table.
..

[IMPALA-8435] Prohibit operations on full transactional table.

Copied some code from Hive to identify whether a table is transactional
and whether it is insert-only.

Testing Done:
- Added a new unit test in Analyzer.

Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M testdata/datasets/functional/functional_schema_template.sql
10 files changed, 117 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/13253/2
--
To view, visit http://gerrit.cloudera.org:8080/13253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab
Gerrit-Change-Number: 13253
Gerrit-PatchSet: 2
Gerrit-Owner: Sudhanshu Arora 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 4: Code-Review+2

(2 comments)

I found a few typos in the authorization doc. But let's not mix that in this CR. 
We can have a typo fix in a different CR.

http://gerrit.cloudera.org:8080/#/c/13235/4/docs/topics/impala_show.xml
File docs/topics/impala_show.xml:

http://gerrit.cloudera.org:8080/#/c/13235/4/docs/topics/impala_show.xml@388
PS4, Line 388: ROLE
not related to this CR, but this is a typo: it should be SHOW CURRENT ROLES


http://gerrit.cloudera.org:8080/#/c/13235/4/docs/topics/impala_show.xml@440
PS4, Line 440: SHOW ROLE GRANT
this should be SHOW GRANT ROLE.



--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 22:50:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13224 )

Change subject: IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13224/5/bin/jenkins/critique-gerrit-review.py
File bin/jenkins/critique-gerrit-review.py:

http://gerrit.cloudera.org:8080/#/c/13224/5/bin/jenkins/critique-gerrit-review.py@72
PS5, Line 72:
flake8: E261 at least two spaces before inline comment


http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py
File testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py:

http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py@33
PS5, Line 33: O
flake8: E501 line too long (96 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py@37
PS5, Line 37: i
flake8: E501 line too long (94 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py@42
PS5, Line 42: l
flake8: E501 line too long (101 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/13224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d
Gerrit-Change-Number: 13224
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 22:39:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/317/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 22:32:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading

2019-05-06 Thread Todd Lipcon (Code Review)
Hello Vihang Karajgaonkar,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13251

to review the following change.


Change subject: IMPALA-8369 (part 4): Hive 3: fixes for functional dataset 
loading
..

IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading

This fixes three issues for functional dataset loading:

- works around HIVE-21675, a bug in which 'CREATE VIEW IF NOT EXISTS'
  does not function correctly in our current Hive build. This has been
  fixed already, but the workaround is pretty simple, and actually the
  'drop and recreate' pattern is used more widely for data-loading than
  the 'create if not exists' one.

- adds the ability to specify version restrictions for tables to load.
  The restrictions use the Python "requirements.txt" syntax. This new
  functionality is used to skip creating a Hive "INDEX" table on Hive 3,
  where this functionality has been removed.

- Moving from MR to Tez execution changed the behavior of data loading by
  disabling the auto-merging of small files. With Hive-on-MR, this
  behavior defaulted to true, but with Hive-on-Tez it defaults to false. The
  change is likely motivated by the fact that Tez automatically groups
  small splits on the _input_ side and thus is less likely to produce lots
  of small files. However, that grouping functionality doesn't work
  properly in localhost clusters (TEZ-3310) so we aren't seeing the
  benefit. So, this patch enables the post-process merging of small
  files.

  Prior to this change, the 'alltypesaggmultifilesnopart' test table was
  getting 40+ files inside it, which broke various planner tests. With the
  change, it gets the expected 4 files.
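
The version-restriction idea in the second item above can be sketched as
follows. This is only an illustration under stated assumptions, not the code
from the patch: the names parse_version and version_matches are hypothetical,
only plain dotted numeric versions are handled, and only the comparison
operators of the "requirements.txt" syntax are supported.

```python
import operator

# Subset of the requirements.txt comparison operators (assumption: the real
# patch may support more). Two-character operators come first so that ">="
# is not mistaken for ">".
_OPS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    ("!=", operator.ne),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn '3.1.0' into (3, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(left, right):
    """Pad the shorter tuple with zeros so that 3 and 3.0 compare equal."""
    width = max(len(left), len(right))
    return (left + (0,) * (width - len(left)),
            right + (0,) * (width - len(right)))


def version_matches(version, spec):
    """Return True if version satisfies every comma-separated clause."""
    actual = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in _OPS:
            if clause.startswith(symbol):
                wanted = parse_version(clause[len(symbol):])
                left, right = _pad(actual, wanted)
                if not compare(left, right):
                    return False
                break
        else:
            raise ValueError("unsupported specifier: %r" % clause)
    return True
```

With a helper like this, a table restricted to "<3.0" would be skipped when
the detected Hive version is 3.1.0 and loaded when it is 2.1.1.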

Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd
---
M fe/src/test/resources/hive-site.xml.py
M testdata/bin/generate-schema-statements.py
M testdata/bin/load-dependent-tables.sql
M testdata/datasets/README
M testdata/datasets/functional/functional_schema_template.sql
5 files changed, 119 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/13251/1
--
To view, visit http://gerrit.cloudera.org:8080/13251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd
Gerrit-Change-Number: 13251
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution

2019-05-06 Thread Todd Lipcon (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Sudhanshu Arora, Joe McDonnell, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13224

to look at the new patch set (#5).

Change subject: IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution
..

IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution

This switches away from Tez local mode to Tez-on-YARN. After spending a
couple of days trying to debug issues with Tez local mode, it seemed
like it was just going to be too much of a lift.

This patch switches on the starting of a Yarn RM and NM when
USE_CDP_HIVE is enabled. It also switches to a new yarn-site.xml with a
minimized set of configurations, generated by the new python templating.

In order for everything to work properly I also had to update the Hadoop
dependency to come from CDP instead of CDH when using CDP Hive.
Otherwise, the classpath of the launched Tez containers had conflicting
versions of various Hadoop classes which caused tasks to fail.

I verified that this fixes concurrent query execution by running queries
in parallel in two beeline sessions. With local mode, these queries
would periodically fail due to various races (HIVE-21682). I'm also able
to get farther along in data loading.

Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d
---
M bin/bootstrap_toolchain.py
M bin/create-test-configuration.sh
M bin/generate_xml_config.py
M bin/impala-config.sh
M bin/jenkins/critique-gerrit-review.py
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java
M fe/src/test/resources/hive-site.xml.py
M shaded-deps/pom.xml
M testdata/cluster/admin
A testdata/cluster/node_templates/common/etc/hadoop/conf/capacity-scheduler.xml
A testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py
D testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.tmpl
13 files changed, 365 insertions(+), 173 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/13224/5
--
To view, visit http://gerrit.cloudera.org:8080/13224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d
Gerrit-Change-Number: 13224
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Todd Lipcon (Code Review)
Hello Yongzhi Chen, Vihang Karajgaonkar, Sudhanshu Arora, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13236

to look at the new patch set (#3).

Change subject: IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for 
Hive 3 support
..

IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support

This fixes two issues in test_permanent_udfs.py:

- two of Hive's built-ins were ported to the new GenericUDF interface
  which Impala can't execute. These UDFs are now excluded from the test
  when running with Hive 3.

- Hive 3 now caches UDFs more aggressively, so we have to run 'RELOAD
  FUNCTION' in Hive after changing UDFs in Impala. This function also
  exists in Hive 2, so while it isn't necessary, I didn't bother to make
  it conditional on version.

Change-Id: I7f50845c7d4769d8843cad87988498e165902169
---
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_permanent_udfs.py
2 files changed, 30 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/13236/3
--
To view, visit http://gerrit.cloudera.org:8080/13236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8509. Lazily evaluate LOAD sections during data load

2019-05-06 Thread Todd Lipcon (Code Review)
Hello Vihang Karajgaonkar,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/13252

to review the following change.


Change subject: IMPALA-8509. Lazily evaluate LOAD sections during data load
..

IMPALA-8509. Lazily evaluate LOAD sections during data load

The LOAD sections for the 'testescape' tables were evaluated too
eagerly, before determining whether these tables should be skipped.
Moving to lazy evaluation makes incremental load-data.py calls take
about 30 seconds instead of several minutes.
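
The difference between the eager and lazy approaches can be sketched like
this. It is a simplified illustration, not the code in
generate-schema-statements.py: the function names and the should_skip and
render_load callbacks are hypothetical stand-ins for whatever the script
actually does.

```python
def eager_statements(tables, should_skip, render_load):
    """Render every LOAD section first, then discard the skipped ones."""
    rendered = [(table, render_load(table)) for table in tables]
    return [stmt for table, stmt in rendered if not should_skip(table)]


def lazy_statements(tables, should_skip, render_load):
    """Check the skip condition first and render only what will be used."""
    return [render_load(table) for table in tables if not should_skip(table)]
```

Both functions return the same statements. The lazy version simply never
calls render_load for a table that is going to be skipped, which is where the
time saving would come from if rendering is expensive.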

Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99
---
M testdata/bin/generate-schema-statements.py
1 file changed, 12 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13252/1
--
To view, visit http://gerrit.cloudera.org:8080/13252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99
Gerrit-Change-Number: 13252
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/316/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 22:31:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration

2019-05-06 Thread Thomas Marshall (Code Review)
Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13248 )

Change subject: IMPALA-8503: add option to start Kudu cluster with HMS 
integration
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/node_templates/common/etc/init.d/kudu-master
File testdata/cluster/node_templates/common/etc/init.d/kudu-master:

http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/node_templates/common/etc/init.d/kudu-master@31
PS1, Line 31: 
KUDU_COMMON_ARGS+=("-hive_metastore_uris=thrift://${INTERNAL_LISTEN_HOST}:9083")
Instead of doing all of the work of piping an argument all the way through 
testdata/cluster/admin and having this logic here, I wonder if it would be 
easier just to add an env variable like EXTRA_KUDU_STARTUP_ARGS or whatever 
that, if it's set, we always just append to KUDU_COMMON_ARGS here.

It also doesn't look like this patch actually uses the functionality that 
you've added here, unless I'm missing something. It might be easier for 
reviewers to understand how you intend for this to work if you include it in a 
patch along with a test that actually exercises this functionality.
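
A rough sketch of the environment-variable idea, written in Python for
illustration even though the init script itself is bash. The function name
kudu_common_args is hypothetical, and EXTRA_KUDU_STARTUP_ARGS is only the
placeholder name suggested above, not an existing setting.

```python
import shlex


def kudu_common_args(base_args, environ):
    """Append whatever EXTRA_KUDU_STARTUP_ARGS holds to the common args.

    environ is any mapping, for example os.environ. An unset or empty
    variable leaves the base arguments unchanged.
    """
    extra = environ.get("EXTRA_KUDU_STARTUP_ARGS", "")
    # shlex.split honours shell-style quoting inside the variable.
    return list(base_args) + shlex.split(extra)
```

The caller would then export the variable before starting the cluster,
instead of piping a new option through testdata/cluster/admin.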



--
To view, visit http://gerrit.cloudera.org:8080/13248
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2
Gerrit-Change-Number: 13248
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 06 May 2019 22:20:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Hello Austin Nobis, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13235

to look at the new patch set (#4).

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..

IMPALA-8364: [DOCS] Remove refereces to authz policy files

Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
---
M docs/topics/impala_authorization.xml
M docs/topics/impala_grant.xml
M docs/topics/impala_revoke.xml
M docs/topics/impala_show.xml
4 files changed, 54 insertions(+), 311 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/13235/4
--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/317/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 22:20:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7665: Fix unwarranted query cancellation on statestore restart

2019-05-06 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13061 )

Change subject: IMPALA-7665: Fix unwarranted query cancellation on statestore 
restart
..


Patch Set 4:

(3 comments)

Thanks for fixing it. Glad that this is also pretty straightforward.

http://gerrit.cloudera.org:8080/#/c/13061/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13061/4//COMMIT_MSG@23
PS4, Line 23: fo
typo


http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.h
File be/src/statestore/statestore-subscriber.h:

http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.h@214
PS4, Line 214: //
nit: ///


http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.cc@196
PS4, Line 196: last_registration_ms_.Store(MonotonicMillis());
Should this be set iff status.ok() ?



--
To view, visit http://gerrit.cloudera.org:8080/13061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30b68bd8bde4bf589d58d42d6f683afb166de959
Gerrit-Change-Number: 13061
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 May 2019 22:13:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/316/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 22:14:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 3
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 06 May 2019 22:03:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Hello Austin Nobis, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13235

to look at the new patch set (#3).

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..

IMPALA-8364: [DOCS] Remove refereces to authz policy files

Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
---
M docs/topics/impala_authorization.xml
M docs/topics/impala_revoke.xml
M docs/topics/impala_show.xml
3 files changed, 51 insertions(+), 305 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/13235/3
--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..

IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

A hardcoded path in test_ranger.py for URI testing was updated to
support S3, local, and HDFS as opposed to just HDFS.

Testing:
- Ran authorization E2E tests
- Ran all FE tests
- Ran test_ranger.py with S3

Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Reviewed-on: http://gerrit.cloudera.org:8080/13234
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/authorization/test_ranger.py
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 4
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13248 )

Change subject: IMPALA-8503: add option to start Kudu cluster with HMS 
integration
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3087/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13248
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2
Gerrit-Change-Number: 13248
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 06 May 2019 21:59:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8428: Add support for caching file handles on s3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13221 )

Change subject: IMPALA-8428: Add support for caching file handles on s3
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19
Gerrit-Change-Number: 13221
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 May 2019 21:42:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13245 )

Change subject: IMPALA-8499: avoid datetime.total_seconds() in 
test_insert_events
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Gerrit-Change-Number: 13245
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 May 2019 21:46:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8428: Add support for caching file handles on s3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13221 )

Change subject: IMPALA-8428: Add support for caching file handles on s3
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4161/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19
Gerrit-Change-Number: 13221
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 May 2019 21:42:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13245 )

Change subject: IMPALA-8499: avoid datetime.total_seconds() in 
test_insert_events
..

IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

This function was only added in Python 2.7.
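
For reference, total_seconds() is a method of timedelta objects, and the
Python documentation gives an arithmetic equivalent that also works on older
interpreters. The sketch below shows that equivalent; it is not necessarily
what the patch does, and the helper name total_seconds is chosen here only
for illustration.

```python
from datetime import timedelta


def total_seconds(delta):
    """Equivalent of timedelta.total_seconds() without calling the method."""
    # Dividing by a float keeps this a true division on Python 2 as well.
    return (delta.microseconds
            + (delta.seconds + delta.days * 24 * 3600) * 10**6) / 10.0**6
```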

Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Reviewed-on: http://gerrit.cloudera.org:8080/13245
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 
---
M tests/custom_cluster/test_event_processing.py
1 file changed, 2 insertions(+), 3 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/13245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Gerrit-Change-Number: 13245
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3088/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 May 2019 21:41:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13236 )

Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG@16
PS2, Line 16:  This function also
:   exists in Hive 2, so while it isn't necessary, I didn't bother to make
:   it conditional on version
> Maybe create a function like describe_fn__in_hive(self, db, fn)? It could a
Done


http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py
File tests/custom_cluster/test_permanent_udfs.py:

http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507
PS2, Line 507: implemened
> typo: implemented
Done


http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507
PS2, Line 507: now
> not ?
'now' is correct -- they used to be 'UDF' but now they are 'GenericUDF'



--
To view, visit http://gerrit.cloudera.org:8080/13236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 21:25:26 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Fix issues with toolchain Python and bump version

2019-05-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13249 )


Change subject: Fix issues with toolchain Python and bump version
..

Fix issues with toolchain Python and bump version

* Bumps the version to 2.7.16, the latest Python 2 release.
* Fixes issues where paths like
  /tmp/tmp.mEyNqPNTxH-impala-toolchain/gcc got baked into the
  Python metadata, which caused problems when later compiling
  C/C++ extensions from source.
* Removes the hardcoded version in the Kudu build scripts.

This is motivated by IMPALA-8508, where we want to consume the toolchain
Python outside of the toolchain.

Change-Id: I7e6c9c4371d3d6c1193c2cc02d45c22b04137672
---
M buildall.sh
M source/kudu/build.sh
M source/python/build.sh
3 files changed, 18 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/49/13249/1
--
To view, visit http://gerrit.cloudera.org:8080/13249

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e6c9c4371d3d6c1193c2cc02d45c22b04137672
Gerrit-Change-Number: 13249
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-06 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
..


Patch Set 5:

(20 comments)

Thanks for the review. Please see my inline comments and PS6.

http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/runtime/exec-env.cc@454
PS5, Line 454:   // Register the ImpalaServer with the cluster membership manager
             :   cluster_membership_mgr_->SetLocalBeDescFn([server]() {
             :     return server->GetLocalBackendDescriptor();
             :   });
             :   cluster_membership_mgr_->SetUpdateLocalServerFn(
             :       [server](const ClusterMembershipMgr::BackendAddressSet& current_backends) {
             :         server->CancelQueriesOnFailedBackends(current_backends);
             :       });
> just thinking out aloud: should we reset the callback functions during tear
These rely only on the server still being alive, not the ExecEnv, so the right 
thing to do would be to delete them when we unregister the ImpalaServer from 
the ExecEnv (which we currently don't). I added a DCHECK to make sure that we 
don't use this method to reset the ImpalaServer in the future. Since the 
ExecEnv d'tor will destroy the ClusterMembershipManager I think we should be 
good without resetting its state explicitly.


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@54
PS5, Line 54: /// Clients can also register callbacks to receive notifications of changes to the cluster
            : /// membership.
> this makes it sound like there is a generic way of registering callbacks li
Done


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@68
PS5, Line 68: pool
> nit: executor pool. Is there a description of "executor pools" anywhere?
Replaced it with executor groups and added a brief description which we can 
expand in the future.


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@102
PS5, Line 102: then
> nit: when
Done


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@112
PS5, Line 112: subscription
> nit: subscription.
Done


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@117
PS5, Line 117:   /// Registers a callback to provide the local backend descriptor.
             :   void SetLocalBeDescFn(BackendDescriptorPtrFn fn);
             :
             :   /// Registers a callback to notify the local ImpalaServer of changes in the cluster
             :   /// membership. This callback will only be called when backends are deleted from the
             :   /// membership.
             :   void SetUpdateLocalServerFn(UpdateLocalServerFn fn);
             :
             :   /// Registers a callback to notify the local Frontend of changes in the cluster
             :   /// membership.
             :   void SetUpdateFrontendFn(UpdateFrontendFn fn);
> I know we discussed this offline, but it might be worth documenting in this
Both clients have different needs: The ImpalaServer only needs to learn about 
deleted backends, whereas the Frontend needs the full list. In the future, the 
Frontend will also need the executor group sizes (but not the memberships 
because groups will likely only be useful for remote read scenarios). 
Additionally, the callbacks have different signatures. Having a generic 
callback interface would require some sort of filtering of events (added, 
updated, deleted) and more complexity on the client side without a clear 
benefit. If the list of client classes expands we can revisit this decision.
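
For readers following the thread, the trade-off in the reply above can be
illustrated with a small Python sketch. This is not Impala's C++ code: the
class and method names (MembershipMgr, set_update_local_server_fn,
set_update_frontend_fn, apply_update) are hypothetical, and the sketch assumes
only what the comment states, namely that one client is notified only when
backends are deleted while the other receives the full list on every change.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional

# Hypothetical callback types; they differ on purpose, as in the discussion.
UpdateLocalServerFn = Callable[[FrozenSet[str]], None]  # current backend set
UpdateFrontendFn = Callable[[List[str]], None]          # full, sorted list


class MembershipMgr:
    """Illustrative stand-in for a membership tracker with two typed callbacks."""

    def __init__(self) -> None:
        self._backends: FrozenSet[str] = frozenset()
        self._update_local_server_fn: Optional[UpdateLocalServerFn] = None
        self._update_frontend_fn: Optional[UpdateFrontendFn] = None

    def set_update_local_server_fn(self, fn: UpdateLocalServerFn) -> None:
        self._update_local_server_fn = fn

    def set_update_frontend_fn(self, fn: UpdateFrontendFn) -> None:
        self._update_frontend_fn = fn

    def apply_update(self, new_backends: Iterable[str]) -> None:
        new_set = frozenset(new_backends)
        if new_set == self._backends:
            return  # nothing changed, nobody is notified
        removed = self._backends - new_set
        self._backends = new_set
        # The server-side client only hears about updates that removed backends.
        if removed and self._update_local_server_fn is not None:
            self._update_local_server_fn(new_set)
        # The frontend-side client gets the full list on every change.
        if self._update_frontend_fn is not None:
            self._update_frontend_fn(sorted(new_set))
```

With two dedicated setters, neither client has to filter added, updated, or
deleted events itself, which is the complexity the reply argues against.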


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@165
PS5, Line 165: in
> nit: extra word
Done


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@167
PS5, Line 167: / occur in 'current_backends'
> nit: outdated comment?
Done


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@184
PS5, Line 184: May be NULL if the set of
 :   /// backends is fixed.
> maybe mention that this is only true in tests
Done


http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.cc@88
PS5, Line 88: update.is_delta && update.topic_entries.empty()
> just curious, when can we receive an empty delta
It's the way that the statestore pulls for updates from the clients, e.g. every 
time interval it will send an empty delta and the 

[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management

2019-05-06 Thread Lars Volker (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13207

to look at the new patch set (#6).

Change subject: IMPALA-8460: Simplify cluster membership management
..

IMPALA-8460: Simplify cluster membership management

This change adds a class to track cluster membership called
ClusterMembershipMgr. It replaces the logic that was partially
duplicated between the ImpalaServer and the Coordinator and makes sure
that the local backend descriptor is consistent (IMPALA-8469).

The ClusterMembershipMgr maintains a view of the cluster membership and
incorporates incoming updates from the statestore. It also registers the
local backend with the statestore after startup. Clients can obtain a
consistent, immutable snapshot of the current cluster membership from
the ClusterMembershipMgr. Additionally, callbacks can be registered to
receive notifications of cluster membership changes. The ImpalaServer
and Frontend use this mechanism.
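
The "consistent, immutable snapshot" idea in the paragraph above can be
sketched in Python as follows. This is an illustration under stated
assumptions, not the ClusterMembershipMgr implementation: Snapshot,
SnapshotHolder, get_snapshot and apply_update are hypothetical names, and
backends are modelled as a simple id-to-address mapping.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, Iterable, Mapping


@dataclass(frozen=True)
class Snapshot:
    """Read-only view of the membership at one point in time (hypothetical)."""
    version: int
    backends: Mapping[str, str]  # backend id -> address


class SnapshotHolder:
    """Builds a fresh snapshot per update; readers keep whatever they fetched."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._current = Snapshot(0, MappingProxyType({}))

    def get_snapshot(self) -> Snapshot:
        with self._lock:
            return self._current

    def apply_update(self, added: Dict[str, str], removed: Iterable[str]) -> None:
        with self._lock:
            # Copy, modify the copy, then swap the reference. Snapshots that
            # were handed out earlier are never mutated.
            backends = dict(self._current.backends)
            backends.update(added)
            for backend_id in removed:
                backends.pop(backend_id, None)
            self._current = Snapshot(
                self._current.version + 1, MappingProxyType(backends))
```

A reader that calls get_snapshot() once and works from the returned object sees
a single, unchanging membership view even if updates arrive in the meantime.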

This change also unifies the naming of executor-related classes, in
particular it renames "BackendConfig" to "ExecutorGroup". In
anticipation of a subsequent change, it adds maps to store multiple
executor groups.

Testing: This change does not introduce new functionality and the new
class is covered by the existing scheduler unit test and the end-to-end
tests.

Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
---
M be/src/benchmarks/scheduler-benchmark.cc
M be/src/common/logging.h
M be/src/gutil/strings/split.cc
M be/src/gutil/strings/split.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/scheduling/CMakeLists.txt
D be/src/scheduling/backend-config-test.cc
D be/src/scheduling/backend-config.cc
D be/src/scheduling/backend-config.h
A be/src/scheduling/cluster-membership-mgr.cc
A be/src/scheduling/cluster-membership-mgr.h
A be/src/scheduling/executor-group-test.cc
A be/src/scheduling/executor-group.cc
A be/src/scheduling/executor-group.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/in-process-servers.cc
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_coordinators.py
26 files changed, 1,162 insertions(+), 923 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/13207/6
--
To view, visit http://gerrit.cloudera.org:8080/13207

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110
PS2, Line 110: this statement
> It looks like we already have that information in L114. I think we can safe
Done



--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 20:34:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110
PS2, Line 110: this statement
> This statement refers to REVOKE.
It looks like we already have that information in L114. I think we can safely 
remove L108-L111. Maybe we can expand what Sentry administrative users mean in 
L114 --> "Users that belong to the groups defined in 
"sentry.service.admin.group" of the Sentry configuration".



--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 20:25:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110
PS2, Line 110: this statement
> Does "this statement" refer to GRANT/REVOKE statement? If yes, this stateme
This statement refers to REVOKE.
How should I correct the incorrect statement?



--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 20:14:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110
PS2, Line 110: this statement
Does "this statement" refer to the GRANT/REVOKE statement? If yes, this statement:

"Only administrative users (those with ALL privileges on the server)"

is incorrect. Sentry administrative users are those users that belong to the 
groups defined in "sentry.service.admin.group" of the Sentry configuration.



--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 20:05:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13236 )

Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/13236

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 19:55:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3086/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 20:02:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration

2019-05-06 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13248 )


Change subject: IMPALA-8503: add option to start Kudu cluster with HMS 
integration
..

IMPALA-8503: add option to start Kudu cluster with HMS integration

Currently, a static template configuration under testdata/cluster/ is
used to control Kudu gflags when starting a Kudu cluster. An option for
custom configuration, such as enabling HMS integration, is needed so
that tests can run against Kudu clusters with different sets of
configurations.

This commit updates the 'cluster/admin' script so that a cluster can be
started with an argument, and adds an option to the 'kudu-master' script
that starts the Kudu master with HMS integration via the command
`admin start_with_arg kudu hms`.

Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2
---
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/init.d/kudu-master
2 files changed, 33 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13248/1
--
To view, visit http://gerrit.cloudera.org:8080/13248

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2
Gerrit-Change-Number: 13248
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 6: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 19:17:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13245 )

Change subject: IMPALA-8499: avoid datetime.total_seconds() in 
test_insert_events
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/3084/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13245

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Gerrit-Change-Number: 13245
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 May 2019 19:19:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4157/


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 19:12:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 19:17:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4160/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 19:17:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 6:

Added the missing hive2-pre-gregorian-date.test file.


--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 19:16:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..

IMPALA-7370: DATE: Read/Write to parquet.

This change is a follow-up to IMPALA-7368 and adds support for DATE
type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET
statements associated with data files that contain dates are also
supported.

Parquet uses the DATE logical type for dates. The DATE logical type
annotates an INT32 that stores the number of days since the Unix epoch,
1 January 1970.

This representation introduces a parquet interoperability issue
between Impala and older versions of Hive:
- Before version 3.1, Hive used the Julian calendar to represent dates
  up to 1582-10-05 and the Gregorian calendar for dates starting with
  1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost.
- Impala uses the proleptic Gregorian calendar, extending the Gregorian
  calendar backward to dates preceding its official introduction on
  1582-10-15.
This means that pre-1582-10-15 dates written to a parquet table by
Hive will be read back incorrectly by Impala and vice versa.

Note that Hive 3.1 switched to proleptic Gregorian calendar too, so
for Hive 3.1+ this is no longer an issue.
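
The calendar mismatch described above can be reproduced with a few lines of
Python. This is only a sketch of the arithmetic, not Hive's or Impala's
conversion code: the function names are hypothetical, Python's datetime.date
stands in for a proleptic Gregorian reader, and the Julian-calendar helper
uses the standard Julian Day Number formula.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)
JDN_OF_UNIX_EPOCH = 2440588  # Julian Day Number of 1970-01-01


def date_to_days(d: date) -> int:
    """Days since the Unix epoch in the proleptic Gregorian calendar."""
    return (d - UNIX_EPOCH).days


def days_to_date(days: int) -> date:
    """Inverse of date_to_days; datetime.date is proleptic Gregorian."""
    return UNIX_EPOCH + timedelta(days=days)


def julian_calendar_to_days(year: int, month: int, day: int) -> int:
    """Days since the Unix epoch for a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    return jdn - JDN_OF_UNIX_EPOCH


# A writer that interprets 1582-10-04 in the Julian calendar stores this value:
stored = julian_calendar_to_days(1582, 10, 4)
# A proleptic Gregorian reader decodes the same INT32 as a different date:
print(stored, days_to_date(stored))
```

Under these assumptions the stored value is -141428, which a proleptic
Gregorian reader decodes as 1582-10-14 instead of 1582-10-04. Dates on or
after 1582-10-15 are unaffected because both calendars agree from there on.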

Change-Id: I67da03754531660bc8de3b6935580d46deae1814
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/util/bit-packing.cc
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/data/README
A testdata/data/hive2_pre_gregorian.parquet
A testdata/data/out_of_range_date.parquet
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-query/queries/QueryTest/date-fileformat-support.test
D 
testdata/workloads/functional-query/queries/QueryTest/date-text-only-support.test
A 
testdata/workloads/functional-query/queries/QueryTest/hive2-pre-gregorian-date.test
A testdata/workloads/functional-query/queries/QueryTest/out-of-range-date.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/common/impala_connection.py
M tests/custom_cluster/test_parquet_page_index.py
M tests/query_test/test_date_queries.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_scanners.py
29 files changed, 465 insertions(+), 148 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/13189/6
--
To view, visit http://gerrit.cloudera.org:8080/13189

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/315/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 18:41:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Hello Austin Nobis, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13235

to look at the new patch set (#2).

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..

IMPALA-8364: [DOCS] Remove refereces to authz policy files

Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
---
M docs/topics/impala_authorization.xml
M docs/topics/impala_revoke.xml
M docs/topics/impala_show.xml
3 files changed, 50 insertions(+), 301 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/13235/2
--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/315/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 18:29:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 1:

> (4 comments)
 >
 > I think it's better if we don't mix the Ranger doc in this CR.

I removed the references to Ranger in this patch.


--
To view, visit http://gerrit.cloudera.org:8080/13235

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 18:27:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml
File docs/topics/impala_authorization.xml:

http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@106
PS1, Line 106: metastore database
> This is incorrect. Replace with "Stored inside the Sentry/Ranger database"
Done


http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@110
PS1, Line 110: If you change privileges in Sentry or Ranger, e.g. adding a 
user, removing a user,
 : modifying privileges, you must clear the Impala Catalog 
server cache by running the
> This is a bit confusing. Maybe reword to something like if we you change pr
Done


http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@112
PS1, Line 112: NVALIDATE METADATA statement. INVALIDATE 
METADATA is
 : not required if you make the changes to privileges 
within Impala.
> Replace INVALIDATE METADATA with REFRESH AUTHORIZATION instead since it's a
Done


http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml@114
PS1, Line 114: Ranger
> Ranger doesn't support roles.
Removed Ranger



--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 18:19:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3085/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 2
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 06 May 2019 17:32:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files

2019-05-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13235 )

Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files
..


Patch Set 1:

(4 comments)

I think it's better if we don't mix the Ranger doc in this CR.

http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml
File docs/topics/impala_authorization.xml:

http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@106
PS1, Line 106: metastore database
This is incorrect. Replace with "Stored inside the Sentry/Ranger database"


http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@110
PS1, Line 110: If you change privileges in Sentry or Ranger, e.g. adding a 
user, removing a user,
 : modifying privileges, you must clear the Impala Catalog 
server cache by running the
This is a bit confusing. Maybe reword to something like "if you change 
privileges outside Impala, ..."


http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@112
PS1, Line 112: NVALIDATE METADATA statement. INVALIDATE 
METADATA is
 : not required if you make the changes to privileges 
within Impala.
Replace INVALIDATE METADATA with REFRESH AUTHORIZATION instead since it's a 
more lightweight operation.


http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml
File docs/topics/impala_revoke.xml:

http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml@114
PS1, Line 114: Ranger
Ranger doesn't support roles.



--
To view, visit http://gerrit.cloudera.org:8080/13235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756
Gerrit-Change-Number: 13235
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 17:14:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4159/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 3
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 06 May 2019 16:56:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 2
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 06 May 2019 16:56:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 3
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 06 May 2019 16:56:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Austin Nobis (Code Review)
Hello Laszlo Gaal, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13234

to look at the new patch set (#2).

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..

IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

A hardcoded path in test_ranger.py for URI testing was updated to
support S3, local, and HDFS as opposed to just HDFS.
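
As an illustration of the change described above, here is a minimal sketch of
building a test URI from a filesystem prefix instead of a hardcoded HDFS path.
The helper name and the prefix values are hypothetical and are not taken from
test_ranger.py; only the "{0}{1}".format(...) pattern appears in the review.

```python
def build_test_uri(filesystem_prefix, path):
    """Join a filesystem prefix and an absolute path into a single URI.

    Both arguments are illustrative; the real test is assumed to take the
    prefix from its environment rather than from a literal.
    """
    return "{0}{1}".format(filesystem_prefix, path)


if __name__ == "__main__":
    # Hypothetical prefixes, one per filesystem named in the commit message.
    for prefix in ("hdfs://namenode.example:8020", "s3a://example-bucket", "file:"):
        print(build_test_uri(prefix, "/tmp/example_uri"))
```

Keeping the prefix as a parameter means the same test body can run unchanged
against HDFS, S3, or the local filesystem.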

Testing:
- Ran authorization E2E tests
- Ran all FE tests
- Ran test_ranger.py with S3

Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
---
M tests/authorization/test_ranger.py
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/13234/2
--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 2
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3

2019-05-06 Thread Austin Nobis (Code Review)
Austin Nobis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13234 )

Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13234/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13234/1//COMMIT_MSG@7
PS1, Line 7: Fix hardcoded path in Ranger E2E test on S3
> nit: usually we try to say something like "Fix hardcoded path in Ranger E2E
Done


http://gerrit.cloudera.org:8080/#/c/13234/1/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/13234/1/tests/authorization/test_ranger.py@262
PS1, Line 262: "{0}{1}".forma
> nit: "{0}{1}".format(NAMENODE, uri)
Done



--
To view, visit http://gerrit.cloudera.org:8080/13234
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14
Gerrit-Change-Number: 13234
Gerrit-PatchSet: 2
Gerrit-Owner: Austin Nobis 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 06 May 2019 16:33:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages

2019-05-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
..


Patch Set 19: Code-Review+1

(1 comment)

I am ok with giving +2, but I'll give Lars a chance to look at the 
modifications he asked for.

http://gerrit.cloudera.org:8080/#/c/12065/19/be/src/exec/parquet/parquet-column-stats.h
File be/src/exec/parquet/parquet-column-stats.h:

http://gerrit.cloudera.org:8080/#/c/12065/19/be/src/exec/parquet/parquet-column-stats.h@278
PS19, Line 278:   /// Returns the required stats field for the given function. 
'fn_name' can be 'le',
  :   /// 'lt', 'ge', and 'gt' (i.e. binary operators <=, <, >=, 
>). If we want to check that
  :   /// whether a column contains a value less than a constant, 
we need the minimum value of
  :   /// the column to answer that question. And, to answer the 
opposite question we need the
  :   /// maximum value. The required stats field (min/max) will be 
stored in 'stats_field'.
  :   /// The function returns true on success, false otherwise.
  :   static bool GetRequiredStatsField(const std::string& fn_name, 
StatsField* stats_field);
optional: I think that this long comment makes this simple function look 
complex - I would prefer to move the implementation to the header and add only 
a minimal comment.
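
The mapping that the quoted comment describes can be sketched in a few lines.
This is a Python illustration of the logic only, not the C++ code under
review; the names below mirror the comment, and the can_skip helper is a
hypothetical addition showing how the chosen statistic might be used.

```python
from enum import Enum


class StatsField(Enum):
    MIN = "min"
    MAX = "max"


def get_required_stats_field(fn_name):
    """Return the stats field needed for the operator, or None if unsupported."""
    if fn_name in ("lt", "le"):
        # "Does the column contain a value less than c?" needs the minimum.
        return StatsField.MIN
    if fn_name in ("gt", "ge"):
        # The opposite question needs the maximum.
        return StatsField.MAX
    return None


def can_skip(fn_name, stat_value, constant):
    """Hypothetical helper: True if no row can satisfy `col <op> constant`.

    `stat_value` is the value of the field chosen by get_required_stats_field.
    """
    if fn_name == "lt":
        return stat_value >= constant
    if fn_name == "le":
        return stat_value > constant
    if fn_name == "gt":
        return stat_value <= constant
    if fn_name == "ge":
        return stat_value < constant
    return False
```

For example, with a column minimum of 10, the predicate `col < 10` can never
match, so can_skip("lt", 10, 10) is true, while `col <= 10` may still match.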



--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 19
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 May 2019 16:28:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

2019-05-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13245 )

Change subject: IMPALA-8499: avoid datetime.total_seconds() in 
test_insert_events
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Gerrit-Change-Number: 13245
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 May 2019 16:13:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

2019-05-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13245 )


Change subject: IMPALA-8499: avoid datetime.total_seconds() in 
test_insert_events
..

IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

This function was only added in Python 2.7.
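
The patch itself is not shown in this message, so the following is only a
sketch of one common way to get the same value without calling
timedelta.total_seconds(), using attributes that timedelta has always had.
The function name is chosen for illustration.

```python
from datetime import timedelta


def total_seconds(delta):
    """Return the duration of `delta` in seconds as a float.

    Uses only the days, seconds, and microseconds attributes, so it does not
    depend on the timedelta.total_seconds() method.
    """
    return delta.days * 86400 + delta.seconds + delta.microseconds / 1e6


if __name__ == "__main__":
    print(total_seconds(timedelta(days=1, seconds=2, microseconds=500000)))
```

Dividing by the float literal 1e6 keeps the result a float on both Python 2
and Python 3.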

Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
---
M tests/custom_cluster/test_event_processing.py
1 file changed, 2 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13245/1
--
To view, visit http://gerrit.cloudera.org:8080/13245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Gerrit-Change-Number: 13245
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13245 )

Change subject: IMPALA-8499: avoid datetime.total_seconds() in 
test_insert_events
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4158/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13245
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52
Gerrit-Change-Number: 13245
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 16:04:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13236 )

Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py
File tests/custom_cluster/test_permanent_udfs.py:

http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507
PS2, Line 507: now
not ?



--
To view, visit http://gerrit.cloudera.org:8080/13236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 15:20:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4157/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 14:31:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 14:31:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support

2019-05-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13236 )

Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG@16
PS2, Line 16:  This function also
:   exists in Hive 2, so while it isn't necessary, I didn't bother 
to make
:   it conditional on version
Maybe create a function like describe_fn__in_hive(self, db, fn)? It could also 
branch on IMPALA_HIVE_MAJOR_VERSION and add "RELOAD FUNCTION" only if needed.
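
A rough sketch of the suggested helper might look like the following. It is
not code from the patch: the function name, the version threshold of 3, and
the exact statement strings are assumptions made for illustration, and the
real helper would run the statements through the test's Hive client rather
than return them.

```python
def describe_fn_in_hive_stmts(db, fn, hive_major_version):
    """Return the Hive statements to run before describing a function.

    Assumption: the reload is only needed on Hive 3 and later, which is how
    the review comment reads; the patch may draw the line differently.
    """
    stmts = []
    if hive_major_version >= 3:
        stmts.append("RELOAD FUNCTION")
    stmts.append("DESCRIBE FUNCTION {0}.{1}".format(db, fn))
    return stmts


if __name__ == "__main__":
    print(describe_fn_in_hive_stmts("udf_db", "my_udf", 2))
    print(describe_fn_in_hive_stmts("udf_db", "my_udf", 3))
```

In the test suite the version would presumably come from
IMPALA_HIVE_MAJOR_VERSION instead of being passed as an argument.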


http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py
File tests/custom_cluster/test_permanent_udfs.py:

http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507
PS2, Line 507: implemened
typo: implemented



--
To view, visit http://gerrit.cloudera.org:8080/13236
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169
Gerrit-Change-Number: 13236
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 06 May 2019 14:11:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages

2019-05-06 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
..


Patch Set 17:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@464
PS17, Line 464: bool GetRequiredStatsField(const string& fn_name,
> I think this could go into parquet-column-stats.{h,cc}.
Done


http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@546
PS17, Line 546: ColumnStatsReader::StatsField stats_field = 
ColumnStatsReader::StatsField::MIN;
> I think we usually don't initialize output parameters to make it clear that
Hmm, I thought clang-tidy didn't like that but it must have been something else 
because it doesn't complain now.


http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@643
PS17, Line 643:   // We don't need the raw page index buffers anymore.
  :   page_index_.Release();
> Can this go to ProcessPageIndex? It already touches a bunch of other state.
ProcessPageIndex() has some RETURN_IF_ERROR macros and I wanted to be sure 
about calling Release().

However, I realised that I can just use a scope exit trigger in 
ProcessPageIndex().


http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-column-readers.h
File be/src/exec/parquet/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-column-readers.h@400
PS17, Line 400:   /// True, if we are using NextLevels() to readahead the next 
def and rep levels. In this
> I feel that this field needs more explanation. From just looking at the com
Elaborated the comment.


http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-column-readers.h@403
PS17, Line 403:   bool levels_readahead_ = false;
> Would it simplify the code to make this levels_read_ahead_offset_ (being -1
Since currently there are only two possibilities (-1 or 0), having a simple 
flag is more exact I think.

Also, we'd still need an if stmt when we have processed all the rows, because 
in this case we don't need to adjust the value of 'current_row_'.


http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index-test.cc
File be/src/exec/parquet/parquet-page-index-test.cc:

http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index-test.cc@58
PS17, Line 58: void ValidatePageIndexRange(const RowGroupRanges& 
row_group_ranges,
> Add a comment for this one, too?
Done


http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index.h
File be/src/exec/parquet/parquet-page-index.h:

http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index.h@56
PS17, Line 56: if there's at least parts of the page index are present
> nit: grammar
Rephrased the whole comment.


http://gerrit.cloudera.org:8080/#/c/12065/17/tests/query_test/test_parquet_stats.py
File tests/query_test/test_parquet_stats.py:

http://gerrit.cloudera.org:8080/#/c/12065/17/tests/query_test/test_parquet_stats.py@87
PS17, Line 87: for batch_size in [0, 1]:
> Should we use a proper test dimension for the batch size, e.g. like in test
Yeah, I looked at that earlier, but I didn't want to run the other tests with 
those batch sizes, nor did I want to add if statements for each.

I also use different batch sizes here and at L97, which would be a bit more 
complicated with the other approach.

This approach also has the advantage of loading the data only once.

On the other hand, I agree that this is not the cleanest solution, so I can 
change it if you feel strongly about it.



--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 17
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 May 2019 14:05:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3083/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 19
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 May 2019 14:01:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4658: Potential race if compiler reorders ReachedLimit() usage.

2019-05-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13178 )

Change subject: IMPALA-4658: Potential race if compiler reorders ReachedLimit() 
usage.
..


Patch Set 6: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.h@209
PS6, Line 209:   virtual bool LimitCheckedFromMultipleThreads() const { return 
false; }
 :   virtual bool IsTaskBasedMultiThreadingSupport() const { return 
false; }
optional: maybe creating an enum like ThreadingModel would be better to express 
this? e.g. SINGLE_THREADED, NON_TASK_BASED_SCANNER, TASK_BASED_SCANNER.


http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.h@277
PS6, Line 277:   /// Caps the input row batch to ensure that the limit is not 
exceeded.
 :   /// Sets the eos and returns true, if the limit is reached.
 :   bool CheckLimitAndTruncateRowBatchIfNeeded(RowBatch* 
row_batch, bool* eos);
 :
 :   /// Caps the input row batch to ensure that the limit is not 
exceeded.
 :   /// Sets the eos and returns true, if the limit is reached.
 :   /// Uses thread safe functions.
 :   bool CheckLimitAndTruncateRowBatchIfNeededShared(RowBatch* 
row_batch, bool* eos);
These could be moved to "protected". Can you check other functions too and make 
them protected, unless other classes use them?


http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.cc@418
PS6, Line 418: (limit_ == -1 || (rows_returned() + row_batch_size) < limit_)
Same as line 436.


http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.cc@436
PS6, Line 436: (limit_ == -1 || (rows_returned_shared() + row_batch_size) < 
limit_)
'reached_limit' could be set to this value from the start. It could be also 
reused instead of calling ReachedLimitShared() - we already assume that no 
other thread changes num_rows_returned_.



--
To view, visit http://gerrit.cloudera.org:8080/13178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cbbfad80f7ab87dd6f192a24e2c68f7c66b047e
Gerrit-Change-Number: 13178
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 06 May 2019 13:51:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages

2019-05-06 Thread Zoltan Borok-Nagy (Code Review)
Hello Michael Ho, Lars Volker, Pooja Nilangekar, Tim Armstrong, Csaba 
Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12065

to look at the new patch set (#19).

Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
..

IMPALA-5843: Use page index in Parquet files to skip pages

This commit implements page filtering based on the Parquet page index.

The read and evaluation of the page index is done by the
HdfsParquetScanner. At first, we determine the row ranges we are
interested in, and based on the row ranges we determine the candidate
pages for each column that we are reading.

We still issue one ScanRange per column chunk, but we specify
sub-ranges that store the candidate pages, i.e. we don't read
the whole column chunk, but only fractions of it.

Pages are not aligned across column chunks, i.e. page #2 of column A
might store completely different rows than page #2 of column B.
It means we need to implement some kind of row-skipping logic
when we read the data pages. This logic is implemented in
BaseScalarColumnReader and ScalarColumnReader. Collection column
readers know nothing about page filtering.
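
To make the page selection step concrete, here is a small Python sketch of
picking candidate pages for one column from a set of row ranges. It is an
illustration under simplifying assumptions (inclusive row ranges, and each
page described only by the index of its first row), not the C++
implementation in this change.

```python
def candidate_pages(first_row_indexes, num_rows, row_ranges):
    """Return indexes of pages whose rows overlap any requested row range.

    first_row_indexes: first row of each page in the column chunk, ascending.
    num_rows: total number of rows in the row group.
    row_ranges: list of inclusive (first_row, last_row) pairs.
    """
    pages = []
    for i, first in enumerate(first_row_indexes):
        # A page ends right before the next page starts, or at the last row.
        if i + 1 < len(first_row_indexes):
            last = first_row_indexes[i + 1] - 1
        else:
            last = num_rows - 1
        if any(first <= r_last and r_first <= last for r_first, r_last in row_ranges):
            pages.append(i)
    return pages


if __name__ == "__main__":
    ranges = [(60, 90)]
    # Pages are not aligned across columns: the same rows live in page 0 of
    # column A but in page 1 of column B.
    print(candidate_pages([0, 100, 200], 300, ranges))  # column A
    print(candidate_pages([0, 50, 250], 300, ranges))   # column B
```

In this example, column B's candidate page starts at row 50 while the range
starts at row 60, so the reader still has to skip the first ten rows of that
page. That is the row-skipping logic the paragraph above refers to.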

Page filtering can be turned off by setting the query option
'read_parquet_page_index' to false.

Testing:
 * added some unit tests for the row range and
   page selection logic
 * generated various Parquet files with Parquet-MR
 * enabled Page index writing and wrote selective queries against
   tables written by Impala. Current tests are likely to use page
   filtering transparently.

Performance:
 * Measured locally, observed 3x to 20x speedup for selective queries.
   The speedup was proportional to the IO operations that need to be done.

 * The TPCH benchmark didn't show a significant performance change. This
   is not a surprise, since the data is not sorted in any useful
   way. So the main goal was to not introduce a perf regression.

TODO:
   * measure performance for remote reads

Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
---
M be/src/common/global-flags.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/parquet/CMakeLists.txt
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-readers.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
A be/src/exec/parquet/parquet-common-test.cc
M be/src/exec/parquet/parquet-common.cc
M be/src/exec/parquet/parquet-common.h
M be/src/exec/parquet/parquet-level-decoder.h
A be/src/exec/parquet/parquet-page-index-test.cc
A be/src/exec/parquet/parquet-page-index.cc
A be/src/exec/parquet/parquet-page-index.h
M be/src/exprs/literal.cc
M be/src/runtime/scoped-buffer.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M testdata/data/README
A testdata/data/alltypes_tiny_pages.parquet
A testdata/data/alltypes_tiny_pages_plain.parquet
A testdata/data/decimals_1_10.parquet
A testdata/data/double_nested_decimals.parquet
A testdata/data/nested_decimals.parquet
A testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-page-index.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index-alltypes-tiny-pages-plain.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index-alltypes-tiny-pages.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index-large.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/query_test/test_parquet_stats.py
36 files changed, 3,396 insertions(+), 95 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/12065/19
--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 19
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py@359
PS3, Line 359: """
> nit: close quote at the end of the previous line.
'test_timestamp_out_of_range' in L330-333 also uses this style for multiline 
comments.



--
To view, visit http://gerrit.cloudera.org:8080/13189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 12:44:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 4: Code-Review+2

(1 comment)

I'm fine with the changes. Normally I would give only a +1 on this as I don't 
have deep knowledge around this code, but since Csaba already gave a +1 I think 
this is free to go.

http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py@359
PS3, Line 359: """
nit: close quote at the end of the previous line.



--
To view, visit http://gerrit.cloudera.org:8080/13189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 12:33:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3082/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 May 2019 10:16:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.

2019-05-06 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/13189 )

Change subject: IMPALA-7370: DATE: Read/Write to parquet.
..

IMPALA-7370: DATE: Read/Write to parquet.

This change is a follow-up to IMPALA-7368 and adds support for DATE
type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET
statements associated with data files that contain dates are also
supported.

Parquet uses DATE logical type for dates. DATE logical type annotates
an INT32 that stores the number of days from the Unix epoch, 1 January
1970.
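
As an illustration of the encoding described above, here is a minimal
sketch in Python. It is not Impala's implementation; the function names
are hypothetical. It relies only on the fact that Python's datetime.date
uses the proleptic Gregorian calendar.

```python
from datetime import date, timedelta

# The DATE logical type counts days from the Unix epoch.
UNIX_EPOCH = date(1970, 1, 1)

def date_to_days(d: date) -> int:
    """Encode a date as the INT32 day count stored in the file (hypothetical helper)."""
    return (d - UNIX_EPOCH).days

def days_to_date(days: int) -> date:
    """Decode a stored day count back into a proleptic Gregorian date (hypothetical helper)."""
    return UNIX_EPOCH + timedelta(days=days)
```

Negative values represent dates before 1970-01-01, so the same INT32
covers historical dates as well.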

This representation introduces a parquet interoperability issue
between Impala and older versions of Hive:
- Before version 3.1, Hive used the Julian calendar to represent dates
  up to 1582-10-05 and the Gregorian calendar for dates starting with
  1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost.
- Impala uses the proleptic Gregorian calendar, extending the Gregorian
  calendar backward to dates preceding its official introduction on
  1582-10-15.
This means that pre-1582-10-15 dates written to a parquet table by
Hive will be read back incorrectly by Impala and vice versa.

Note that Hive 3.1 switched to proleptic Gregorian calendar too, so
for Hive 3.1+ this is no longer an issue.
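
The mismatch can be sketched as follows. This is an illustration under
stated assumptions, not code from Hive or Impala: the writer is modelled
as a hybrid calendar that switches from Julian to Gregorian at
1582-10-15, the reader as proleptic Gregorian, and all function names
are hypothetical. Day counts are derived from standard Julian Day
Number formulas.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)
UNIX_EPOCH_JDN = 2440588            # Julian Day Number of 1970-01-01 (Gregorian)
GREGORIAN_CUTOVER = (1582, 10, 15)  # assumed switch point of the hybrid writer

def _jdn(year, month, day, julian):
    """Julian Day Number of a calendar date in the Julian or Gregorian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    base = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if julian:
        return base - 32083
    return base - y // 100 + y // 400 - 32045

def hybrid_days_since_epoch(year, month, day):
    """Day count a hybrid Julian/Gregorian writer would store (assumed model)."""
    julian = (year, month, day) < GREGORIAN_CUTOVER
    return _jdn(year, month, day, julian) - UNIX_EPOCH_JDN

def proleptic_date_from_days(days):
    """Date a proleptic Gregorian reader would decode from a stored day count."""
    return UNIX_EPOCH + timedelta(days=days)
```

Under these assumptions, a value written for 1582-10-04 by the hybrid
writer is decoded by the proleptic reader as 1582-10-14, while dates on
or after 1582-10-15 round-trip unchanged.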

Change-Id: I67da03754531660bc8de3b6935580d46deae1814
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/util/bit-packing.cc
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/data/README
A testdata/data/hive2_pre_gregorian.parquet
A testdata/data/out_of_range_date.parquet
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-query/queries/QueryTest/date-fileformat-support.test
D testdata/workloads/functional-query/queries/QueryTest/date-text-only-support.test
A testdata/workloads/functional-query/queries/QueryTest/out-of-range-date.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test
M tests/common/impala_connection.py
M tests/custom_cluster/test_parquet_page_index.py
M tests/query_test/test_date_queries.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_scanners.py
28 files changed, 435 insertions(+), 148 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/13189/4
--
To view, visit http://gerrit.cloudera.org:8080/13189
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814
Gerrit-Change-Number: 13189
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins