[Impala-ASF-CR] IMPALA-6643: Add REFRESH fine-grained privilege

2018-03-12 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/9589 )

Change subject: IMPALA-6643: Add REFRESH fine-grained privilege
..

IMPALA-6643: Add REFRESH fine-grained privilege

With this patch, REFRESH privilege is now required to execute INVALIDATE
METADATA or REFRESH statement.

These are the new GRANT/REVOKE statements introduced.

GRANT REFRESH on DATABASE db TO ROLE testrole;
GRANT REFRESH on TABLE db.tbl TO ROLE testrole;

REVOKE REFRESH on DATABASE db TO ROLE testrole;
REVOKE REFRESH on TABLE db.tbl TO ROLE testrole;

Testing:
- Ran end-to-end tests

Change-Id: I4c3c5a51fe493d39fd719c7a388d4d5760049ce4
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/Privilege.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/resources/authz-policy.ini.template
8 files changed, 136 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/9589/4
--
To view, visit http://gerrit.cloudera.org:8080/9589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c3c5a51fe493d39fd719c7a388d4d5760049ce4
Gerrit-Change-Number: 9589
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] Removing (broken) retries from split-hbase.sh.

2018-03-12 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9588 )

Change subject: Removing (broken) retries from split-hbase.sh.
..


Patch Set 1:

https://jenkins.impala.io/job/gerrit-verify-dryrun-external/92/ passed with 
this change.


--
To view, visit http://gerrit.cloudera.org:8080/9588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I715891c9e744f21002330c3ae3ebc14095d94ffd
Gerrit-Change-Number: 9588
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Tue, 13 Mar 2018 04:57:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] Removing (broken) retries from split-hbase.sh.

2018-03-12 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9588


Change subject: Removing (broken) retries from split-hbase.sh.
..

Removing (broken) retries from split-hbase.sh.

The retries in split-hbase.sh don't work in the common case,
because $MINIKDC_PRINC_HIVE is not set in non-kerberized (common)
environments. The regular data load scripts (create-load-data.sh)
have code to manage that, but split-hbase.sh blindly forges ahead,
leading to errors like:

  /home/impdev/Impala/testdata/bin/split-hbase.sh: line 49: MINIKDC_PRINC_HIVE: 
unbound variable
  Error in /home/impdev/Impala/testdata/bin/create-load-data.sh at line 48: 
LOAD_DATA_ARGS=""

Since this hasn't been working, I opted to remove it entirely, as a failure on
the line where HBase splitting actually failed would be significantly more
useful than the error here. A search of mailing lists suggested that I was at
least the second person to have run into this. (In my case, I did break HBase
splitting, but it took me a second to identify the error, since the log was
spammed with unrelated information relating to the cluster restart.)

Testing: core tests.

Change-Id: I715891c9e744f21002330c3ae3ebc14095d94ffd
---
M testdata/bin/split-hbase.sh
1 file changed, 3 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/9588/1
--
To view, visit http://gerrit.cloudera.org:8080/9588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I715891c9e744f21002330c3ae3ebc14095d94ffd
Gerrit-Change-Number: 9588
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6643: Add REFRESH fine-grained privilege

2018-03-12 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9589 )

Change subject: IMPALA-6643: Add REFRESH fine-grained privilege
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
File fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java:

http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java@116
PS2, Line 116:  analyzer.registerPrivReq(new PrivilegeRequest(Privilege.ALL));
Shouldn't the refresh privilege be applied to this case as well?  i.e. 
onServer()
Might require an e2e test instead of AuthorizationTest.


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/main/java/org/apache/impala/authorization/Privilege.java
File fe/src/main/java/org/apache/impala/authorization/Privilege.java:

http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/main/java/org/apache/impala/authorization/Privilege.java@50
PS2, Line 50: SentryAction
Is this the best name?  Since Sentry does not care what the string is, and it's 
more specific to SQL.  Refresh allows the invalidate metadata action.  This 
list is really privileges.  Maybe SQLPrivilege?

Also, should this be it's own file?  I know there's cases for either way in the 
code.


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java:

http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@236
PS2, Line 236:
Update comment at top of file to reflect new setup.


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@249
PS2, Line 249: roleName = "refresh_functional_text_lzo";
 : sentryService.createRole(USER, roleName, true);
 : sentryService.grantRoleToGroup(USER, roleName, 
USER.getName());
Does this chunk need to be duplicated?


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@261
PS2, Line 261: roleName = "refresh_functional_text_lzo";
 : sentryService.createRole(USER, roleName, true);
 : sentryService.grantRoleToGroup(USER, roleName, 
USER.getName());
duplicate?


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@273
PS2, Line 273: roleName = "refresh_functional_text_lzo";
 : sentryService.createRole(USER, roleName, true);
 : sentryService.grantRoleToGroup(USER, roleName, 
USER.getName());
duplicate?


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3579
PS2, Line 3579: // REFRESH privilege object.
I think we need refresh on SERVER also.


http://gerrit.cloudera.org:8080/#/c/9589/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3580
PS2, Line 3580: ParsesOk(String.format("%s REFRESH ON DATABASE foo %s myRole", 
formatStr));
There's no current method to "REFRESH" or "INVALIDATE METADATA" at the database 
level, only server and table.



--
To view, visit http://gerrit.cloudera.org:8080/9589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c3c5a51fe493d39fd719c7a388d4d5760049ce4
Gerrit-Change-Number: 9589
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Adam Holley 
Gerrit-Comment-Date: Tue, 13 Mar 2018 04:31:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] :IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0

2018-03-12 Thread Alex Rodoni (Code Review)
Hello Greg Rahn, Tianyi Wang, John Russell, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9540

to look at the new patch set (#3).

Change subject: :IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0
..

:IMPALA-3916: [DOCS] Reserved keywords updated for Impala 3.0

Change-Id: I09dbaf5e2d22f07a8bca8a601c04e94f3a4b36a0
Cherry-picks: not for 2.x
---
M docs/topics/impala_reserved_words.xml
1 file changed, 3,431 insertions(+), 235 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/9540/3
--
To view, visit http://gerrit.cloudera.org:8080/9540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I09dbaf5e2d22f07a8bca8a601c04e94f3a4b36a0
Gerrit-Change-Number: 9540
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 2:

This failure was a Kudu side issue tracked and fixed in KUDU-2338. When the 
next Kudu 1.7.0-cdh5.15.0-SNAPSHOT jar is pulled in it should be resolved. I 
will kick off a test tomorrow.


--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 13 Mar 2018 03:48:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4168: [DOCS] Adds Oracle-style hint placement for INSERT/UPSERT

2018-03-12 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9030 )

Change subject: IMPALA-4168: [DOCS] Adds Oracle-style hint placement for 
INSERT/UPSERT
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I43e0a782087c2e67f2e012424fb9261be445efc9
Gerrit-Change-Number: 9030
Gerrit-PatchSet: 2
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 13 Mar 2018 03:48:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of alias and ordinal substitution

2018-03-12 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9211 )

Change subject: IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of 
alias and ordinal substitution
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I558230d07212da62d2cd12e07a52ceba03e980a8
Gerrit-Change-Number: 9211
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Mar 2018 03:43:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:40:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..

IMPALA-6638: Reduce file handle cache lock contention

FileHandleCache::OpenFileHandle() currently holds the
lock while opening a file handle. This lengthens the
duration holding the lock considerably, causing
contention when a lot of file handles are being
opened (i.e. when the cache is cold).

This changes FileHandleCache::OpenFileHandle() drops
the lock while opening the file handle, then
reacquires it to add the file handle to the cache.

When running a simple select on a table with 46801
Parquet files, this fixes a small performance overhead
for the cold cache case compared to having the
cache off. All results are averaged over 5 runs:

File handle cache off: 7.19s
File handle cache cold (no patch): 7.92s
File handle cache cold (with patch): 7.20s

When running a select that accesses and aggregates
4 columns from the same Parquet table (thus requiring
more scan ranges for the same file), the fix reduces
contention, particularly for the case with 8 IO threads
per disk:

1 IO thread per disk:
Without patch: 9.59s
With patch: 8.11s

8 IO threads per disk:
Without patch: 14.2s
With patch: 8.17s

The patch has no impact on the performance when the
cache is hot.

Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Reviewed-on: http://gerrit.cloudera.org:8080/9576
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/io/handle-cache.inline.h
1 file changed, 37 insertions(+), 24 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2088/


--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:27:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6610: Add inspection and warning for LDAP password in Impala shell when authentication fails

2018-03-12 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9506 )

Change subject: IMPALA-6610: Add inspection and warning for LDAP password in 
Impala shell when authentication fails
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9506/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9506/8//COMMIT_MSG@7
PS8, Line 7: Add inspection and warning for LDAP password in Impala shell wh
> Update
Done


http://gerrit.cloudera.org:8080/#/c/9506/8//COMMIT_MSG@9
PS8, Line 9: The value of LDAP password in Impala shell contains extra line 
break causes
> nit: Remove trailing spaces and wrap (here and below)
Done


http://gerrit.cloudera.org:8080/#/c/9506/8/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/9506/8/shell/impala_shell.py@821
PS8, Line 821:   if options.ldap_password_cmd is not None and \
> Add checks for ldap_password_cmd & ldap_password not being None. Else shell
Done



--
To view, visit http://gerrit.cloudera.org:8080/9506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7
Gerrit-Change-Number: 9506
Gerrit-PatchSet: 9
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:08:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6610: Add inspection and warning for LDAP password in Impala shell when authentication fails

2018-03-12 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9506 )

Change subject: IMPALA-6610: Add inspection and warning for LDAP password in 
Impala shell when authentication fails
..


Patch Set 9:

Thank you very much.
It seems that there is a rule that requires no more than 80 characters per 
line, so that a sentence with a length exceeding the limit will have newline.


--
To view, visit http://gerrit.cloudera.org:8080/9506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7
Gerrit-Change-Number: 9506
Gerrit-PatchSet: 9
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:07:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6610: Add inspection and warning for LDAP password in Impala shell when authentication fails

2018-03-12 Thread Donghui Xu (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9506

to look at the new patch set (#9).

Change subject: IMPALA-6610: Add inspection and warning for LDAP password in 
Impala shell when authentication fails
..

IMPALA-6610: Add inspection and warning for LDAP password in Impala shell when 
authentication fails

The value of LDAP password in Impala shell contains extra line break causes
authentication failure, but the user can't detect the cause of the failure.
I fixed the issue by adding inspection to the password for common pitfalls and
issuing a warning in the shell when authentication fails.

Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7
---
M shell/impala_shell.py
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9506/9
--
To view, visit http://gerrit.cloudera.org:8080/9506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7
Gerrit-Change-Number: 9506
Gerrit-PatchSet: 9
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:01:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..

IMPALA-6576: Add metrics for data stream service memory usage

This change adds metrics for the data stream service memory usage. Both
current and peak usage are exposed.

It adds a test to test_krpc_metrics.py to make sure that the expected
metrics are present and that the peak usage shows a non-zero value after
running a query.

Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Reviewed-on: http://gerrit.cloudera.org:8080/9562
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/exec-env.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/util/memory-metrics.cc
M be/src/util/memory-metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_krpc_metrics.py
7 files changed, 104 insertions(+), 17 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5980: Upgrade to LLVM 5.0.1

2018-03-12 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9584 )

Change subject: IMPALA-5980: Upgrade to LLVM 5.0.1
..


Patch Set 1:

> Do any of those benchmarks verify compile time performance?

The query time in the attached benchmarks include the codegen compile time as 
well.

I also tired pulling out the codegen compile time individually from the query 
profiles and doing a similar analysis. The delta in those varies alot along 
with st.dev being high too. Since the scale of those timings is in milliseconds 
and codegen time reported is the wall clock time, it might be difficult to 
seperate the noise.
The variation in codegen time would be compensated by the variation in query 
execution time. I believe that just looking at their cumulative effect should 
be good enough for now.

I also did some initial testing to see if codegen compile time improves for one 
of the queries (large num of OR-predicates) that has been a problem in the 
past. That time improved from ~20sec to ~14sec.


--
To view, visit http://gerrit.cloudera.org:8080/9584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib0a15cb53feab89e7b35a56b67b3b30eb3e62c6b
Gerrit-Change-Number: 9584
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Tue, 13 Mar 2018 01:30:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 10:

I need to take another pass over this just to convince myself I didn't miss any 
bugs first time over.


--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 10
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 13 Mar 2018 01:02:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8936/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8936/10//COMMIT_MSG@9
PS10, Line 9: Enabled the fuzz test for Sequence and RCFiles and added new 
checks to return
Commit message is stale.



--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 10
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 13 Mar 2018 01:02:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..


Patch Set 17: Code-Review+1

Did anyone else want to take a look? I'm happy with the code and I think the 
test coverage is good.


--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 17
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 
Gerrit-Comment-Date: Tue, 13 Mar 2018 00:48:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9395 )

Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for 
start-impala-cluster.
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7
Gerrit-Change-Number: 9395
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Mar 2018 00:39:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9395 )

Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for 
start-impala-cluster.
..

IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.

We've observed empirically that giving Impala 80% of system memory
doesn't leave enough room for the minicluster and ASAN overhead, leading
to the OOM killer striking during test runs (sometimes). This commit
reduces the threshold to 70%.

This commit also reduces the memory usage of semi-joins-exhaustive.test
by roughly halving the number of records it deals with. This was
necessary for tests to pass on a machine with 32GB of RAM.

Testing: I've run the ASAN build (more) happily with this change.
I've run exhaustive tests on a 32GB machine.

Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7
Reviewed-on: http://gerrit.cloudera.org:8080/9395
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M bin/start-impala-cluster.py
M 
testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test
2 files changed, 13 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7
Gerrit-Change-Number: 9395
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2018-03-12 Thread Vincent Tran (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7009

to look at the new patch set (#17).

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..

IMPALA-5315: Cast to timestamp fails for -M-D format

This change allows casting of a string in 'lazy' date/time
format to timestamp. The supported lazy date formats are:
  -[M]M-[d]d
  -[M]M-[d]d [H]H:[m]m:[s]s[.S]
  [H]H:[m]m:[s]s[.S]

We will incur a SCAN performance penalty (approximately 1/2
TotalReadThroughput) when the string is in one of these
lazy date/time format.

Testing:
Benchmarked the performance consequence by executing this SQL on
a private build over 3.8 billion rows:
select min(cast (time_string as timestamp)) from private.impala_5315

Added tests for valid and invalid date/time format strings
in expr-test.cc to be inline with existing tests for CAST() function.

Added end-to-end tests into exprs.test and
select-lazy-timestamp.test to exercise the new function within
the context of a query.

Added tests to exercise the leading and trailing white space trimming
behaviour in default and lazy date/time string format (IMPALA-6630).

Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
A testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
A 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/query_test/test_scanners.py
7 files changed, 398 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7009/17
--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 17
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 


[Impala-ASF-CR] IMPALA-5717: Support for reading ORC data files

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9134 )

Change subject: IMPALA-5717: Support for reading ORC data files
..


Patch Set 5:

(3 comments)

Thanks for the new patchset. I need to do a deep dive into the new changes but 
will respond to your comments first.

http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/9134/3/be/src/exec/hdfs-orc-scanner.cc@517
PS3, Line 517:
> Though ORC-262 has no progress, I think we can still prefech data and let t
Thanks for filing it. I did spent a little time reading the ORC code and it 
does seem like we could achieve this with some modifications to the ORC library 
- they have two layers of InputStream abstraction, the top-level which does the 
decompression and a lower level that does I/O.)


http://gerrit.cloudera.org:8080/#/c/9134/5/testdata/bin/run-hive-server.sh
File testdata/bin/run-hive-server.sh:

http://gerrit.cloudera.org:8080/#/c/9134/5/testdata/bin/run-hive-server.sh@75
PS5, Line 75: tabls
typo: tables


http://gerrit.cloudera.org:8080/#/c/9134/3/testdata/workloads/functional-query/functional-query_exhaustive.csv
File testdata/workloads/functional-query/functional-query_exhaustive.csv:

http://gerrit.cloudera.org:8080/#/c/9134/3/testdata/workloads/functional-query/functional-query_exhaustive.csv@25
PS3, Line 25: file_format: orc, dataset: functional, compression_codec: none, 
compression_type: none
> Yeah, the default ORC codec is zlib (deflate in Impala).
We should the name of compression_codec to be deflate for our ORC tables so 
that it's accurate (we did the wrong thing with Parquet here).



--
To view, visit http://gerrit.cloudera.org:8080/9134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Gerrit-Change-Number: 9134
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Mar 2018 00:28:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2018-03-12 Thread Vincent Tran (Code Review)
Vincent Tran has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..


Patch Set 16:

(4 comments)

Changed the name of the e2e test to be more appropriate for its purpose.

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/runtime/timestamp-parse-util.cc@28
PS16, Line 28: #include 
> Standard C++ headers go before the 3rd-party library headers. I.e. line 19
Done


http://gerrit.cloudera.org:8080/#/c/7009/16/testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test
File 
testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test:

http://gerrit.cloudera.org:8080/#/c/7009/16/testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test@3
PS16, Line 3: select cast(dt_str as timestamp) from lazy_dt_str
> should not need a cast here, we want to test the text scanner parsing the d
Done


http://gerrit.cloudera.org:8080/#/c/7009/16/testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test@4
PS16, Line 4:  RESULTS
> Let's make this RESULTS: VERIFY_IS_EQUAL_SORTED
Done


http://gerrit.cloudera.org:8080/#/c/7009/16/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/7009/16/tests/query_test/test_scanners.py@874
PS16, Line 874: STRING
> This should be a timestamp column, so we can make sure that the text scanne
Done



--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 16
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 
Gerrit-Comment-Date: Tue, 13 Mar 2018 00:23:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6589: remove invalid DCHECK in parquet reader

2018-03-12 Thread Tim Armstrong (Code Review)
Hello Pranay Singh, Lars Volker, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9556

to look at the new patch set (#3).

Change subject: IMPALA-6589: remove invalid DCHECK in parquet reader
..

IMPALA-6589: remove invalid DCHECK in parquet reader

The DCHECK was only valid if the Parquet file metadata is internally
consistent, with the number of values reported by the metadata
matching the number of encoded levels.

The DCHECK was intended to directly detect misuse of the RleBatchDecoder
interface, which would lead to incorrect results. However, our other
test coverage for reading Parquet files is sufficient to test the
correctness of level decoding.

Testing:
Added a minimal corrupt test file that reproduces the issue.

Change-Id: Idd6e09f8c8cca8991be5b5b379f6420adaa97daa
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M testdata/data/README
A testdata/data/num_values_def_levels_mismatch.parquet
A 
testdata/workloads/functional-query/queries/QueryTest/parquet-num-values-def-levels-mismatch.test
M tests/query_test/test_scanners.py
6 files changed, 50 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/9556/3
--
To view, visit http://gerrit.cloudera.org:8080/9556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idd6e09f8c8cca8991be5b5b379f6420adaa97daa
Gerrit-Change-Number: 9556
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6589: remove invalid DCHECK in parquet reader

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9556 )

Change subject: IMPALA-6589: remove invalid DCHECK in parquet reader
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9556/2/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/9556/2/be/src/exec/parquet-column-readers.cc@1207
PS2, Line 1207: rep_level_ = rep_levels_.ReadLevel();
> Can you also add a similar check for the repetition level?
Done


http://gerrit.cloudera.org:8080/#/c/9556/2/testdata/workloads/functional-query/queries/QueryTest/parquet-num-values-def-levels-mismatch.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-num-values-def-levels-mismatch.test:

http://gerrit.cloudera.org:8080/#/c/9556/2/testdata/workloads/functional-query/queries/QueryTest/parquet-num-values-def-levels-mismatch.test@7
PS2, Line 7:
> nit: unnecessary line
Done


http://gerrit.cloudera.org:8080/#/c/9556/2/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/9556/2/tests/query_test/test_scanners.py@433
PS2, Line 433:
> nit: unnecessary line
Done



--
To view, visit http://gerrit.cloudera.org:8080/9556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd6e09f8c8cca8991be5b5b379f6420adaa97daa
Gerrit-Change-Number: 9556
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Mar 2018 00:00:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/exprs/expr-test.cc@6176
PS16, Line 6176:   TestStringValue("monthname(cast('2011-02-18 09:10:11.00' 
as timestamp))", "February");
> nit: long lines. It might be easier to run the patch through clang-format t
NM, these were pulled in with a rebase



--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 16
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 
Gerrit-Comment-Date: Mon, 12 Mar 2018 23:57:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-6405: Error when string to decimal cast overflows

2018-03-12 Thread Lars Volker (Code Review)
Lars Volker has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9530 )

Change subject: IMPALA-6405: Error when string to decimal cast overflows
..

IMPALA-6405: Error when string to decimal cast overflows

Before this patch, when there was an error when converting a string to
a decimal, a NULL was returned. In this patch, we change this behavior
so that an error is returned if decimal_v2 is enabled.

The reasoning is that we want stricter behavior in decimal_v2.

Testing:
- Added some EE tests.
- Ran an exhaustive build, which passed.

Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db
Reviewed-on: http://gerrit.cloudera.org:8080/9339
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/9530
Reviewed-by: Taras Bobrovytsky 
---
M be/src/exprs/decimal-operators-ir.cc
M fe/src/main/java/org/apache/impala/rewrite/RemoveRedundantStringCast.java
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
3 files changed, 55 insertions(+), 7 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db
Gerrit-Change-Number: 9530
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR](2.x) IMPALA-6405: Error when string to decimal cast overflows

2018-03-12 Thread Lars Volker (Code Review)
Hello Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9530

to look at the new patch set (#2).

Change subject: IMPALA-6405: Error when string to decimal cast overflows
..

IMPALA-6405: Error when string to decimal cast overflows

Before this patch, when there was an error when converting a string to
a decimal, a NULL was returned. In this patch, we change this behavior
so that an error is returned if decimal_v2 is enabled.

The reasoning is that we want stricter behavior in decimal_v2.

Testing:
- Added some EE tests.
- Ran an exhaustive build, which passed.

Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db
Reviewed-on: http://gerrit.cloudera.org:8080/9339
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/decimal-operators-ir.cc
M fe/src/main/java/org/apache/impala/rewrite/RemoveRedundantStringCast.java
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
3 files changed, 55 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/9530/2
--
To view, visit http://gerrit.cloudera.org:8080/9530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icffccac1c1c2361447ae4b0de9b6c2ec7de071db
Gerrit-Change-Number: 9530
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2089/


--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 23:02:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5717: Support for reading ORC data files

2018-03-12 Thread Quanlong Huang (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9134

to look at the new patch set (#5).

Change subject: IMPALA-5717: Support for reading ORC data files
..

IMPALA-5717: Support for reading ORC data files

This patch integrates the orc library into Impala and implements
HdfsOrcScanner as a middle layer between them. The HdfsOrcScanner
supplies input needed from the orc-reader, tracks memory consumption of
the reader and transfers the reader's output (orc::ColumnVectorBatch)
into impala::RowBatch. The ORC version we used is release-1.4.3.

Currently, we only support reading premitive types. Writing into ORC
table has not been supported neither.

Tests
 - Most of the end-to-end tests can run on ORC format.
 - Add tpcds, tpch tests for ORC.
 - Add some ORC specific tests.
 - Have passed all the tests.

Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-orc-scanner.cc
A be/src/exec/hdfs-orc-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
A cmake_modules/FindOrc.cmake
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/jflex/sql-scanner.flex
M testdata/LineItemMultiBlock/README.dox
A testdata/LineItemMultiBlock/lineitem_orc_multiblock_one_stripe.orc
A testdata/LineItemMultiBlock/lineitem_sixblocks.orc
A testdata/LineItemMultiBlock/lineitem_threeblocks.orc
M testdata/bin/generate-schema-statements.py
M testdata/bin/load_nested.py
M testdata/bin/run-hive-server.sh
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_core.csv
M testdata/workloads/functional-query/functional-query_dimensions.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M testdata/workloads/functional-query/functional-query_pairwise.csv
A 
testdata/workloads/functional-query/queries/QueryTest/nested-types-orc-basic.test
M testdata/workloads/tpcds/tpcds_core.csv
M testdata/workloads/tpcds/tpcds_dimensions.csv
M testdata/workloads/tpcds/tpcds_exhaustive.csv
M testdata/workloads/tpcds/tpcds_pairwise.csv
M testdata/workloads/tpch/tpch_core.csv
M testdata/workloads/tpch/tpch_dimensions.csv
M testdata/workloads/tpch/tpch_exhaustive.csv
M testdata/workloads/tpch/tpch_pairwise.csv
M tests/common/test_dimensions.py
M tests/comparison/cli_options.py
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_scanners.py
M tests/query_test/test_tpch_queries.py
49 files changed, 1,425 insertions(+), 156 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/9134/5
--
To view, visit http://gerrit.cloudera.org:8080/9134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Gerrit-Change-Number: 9134
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 4: Code-Review+2

Rebase, carry +2


--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 23:02:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6389: Make '\0' delimited text files work

2018-03-12 Thread Zach Amsden (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9525

to look at the new patch set (#2).

Change subject: IMPALA-6389: Make '\0' delimited text files work
..

IMPALA-6389: Make '\0' delimited text files work

Initially I didn't want to fully implement this, as the metadata
for these tables can't even be fully stored in Postgres; however
after digging into some older documentation, it appears that the
ASCII NUL character actually has been used as a field separator
in various vendors CSV implementation.

Therefore, this patch attempts to make things as non-broken as
possible and allows \0 as a field or tuple delimiter.  Collection
column delimiters are not allowed to be \0, as they genuinly may
not exist and we don't want to force special escaping on an
arbitrary character.  Note that the field delimiter must be distinct
from the tuple delimiter when they both exist; if it is not, the
effect will be that there is no field delimiter (this is actually
possible with single column tables).

Testing: Created a zero delimited table as described in the JIRA,
using MySQL backed Hive metastore; ran select * from tab_separated
on the table, updated the unit test.

Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8
---
M be/src/exec/delimited-text-parser-test.cc
M be/src/exec/delimited-text-parser.cc
M be/src/exec/delimited-text-parser.h
M be/src/exec/delimited-text-parser.inline.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
8 files changed, 147 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9525/2
--
To view, visit http://gerrit.cloudera.org:8080/9525
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4b6f38cbe3f1036f60efd31a31d82d0cd8f3d2a8
Gerrit-Change-Number: 9525
Gerrit-PatchSet: 2
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2088/


--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:40:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:39:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5717: Support for reading ORC data files

2018-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9134 )

Change subject: IMPALA-5717: Support for reading ORC data files
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9134/4/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/9134/4/be/src/exec/hdfs-scanner.h@379
PS4, Line 379: virtual
> why make this virtual? This subroutine interface is pretty ill-defined, so
Oh, I forgot this! At first, I found that the signature of 
HdfsParquetScanner::CommitRows differ from this (just has reverse argument 
list). I think it may be a mistake so I was going to fix that. But finally 
abandon it to keep this patch smaller.
I'll remove the HdfsOrcScanner::CommitRows as well :)



--
To view, visit http://gerrit.cloudera.org:8080/9134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Gerrit-Change-Number: 9134
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:38:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9527 )

Change subject: IMPALA-6609: Fix ownership of class members in 
KrpcDataStreamRecvr
..

IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr

A KrpcDataStreamRecvr is co-owned by the singleton KrpcDataStreamMgr
and an exchange node. It's possible that some threads (e.g. RPC service
threads) may still retain reference to the KrpcDataStreamRecvr after its
owning exchange node has been closed. This is problematic as some members
in the receiver (e.g. sender/receiver profiles) are actually owned by the
exchange node so accessing them after the exchange node is closed and
possibly deleted may lead to user-after-free.

This patch changes the ownership of some members in KrpcDataStreamRecvr
to the receiver. In particular, the profiles are now owned by the receiver
and various stat counters and timers are in turn owned by these profiles.
This prevents the use-after-free problem described above. This patch also
moves the access to deferred_rpc_tracker_ in TakeOverEarlySenders() to be
under the sender queue's 'lock_' to prevent another instance of IMPALA-6554.

Testing done: core debug build.

Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Reviewed-on: http://gerrit.cloudera.org:8080/9527
Reviewed-by: Dan Hecht 
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/query-state.cc
5 files changed, 104 insertions(+), 30 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Sailesh Mukil: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Gerrit-Change-Number: 9527
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9527 )

Change subject: IMPALA-6609: Fix ownership of class members in 
KrpcDataStreamRecvr
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Gerrit-Change-Number: 9527
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:31:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2087/


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:27:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:24:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 5:

I wasn't happy with the metric naming, I think "current" captures the idea 
better than "total".


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:23:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9590 )

Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9590/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/9590/3/tests/query_test/test_observability.py@198
PS3, Line 198: Moving this test to its own suite (IMPALA-6498). This test case 
forces Unregistration
 :   # of the query, so that we force computation of query end 
time, which shows up as
 :   # a non-empty 'End Time' in the profile. We use 
self.client.close() since the
 :   # Beeswax client does not have a cencellation interface. 
self.client cannot be used
 :   # after close(), so if new test cases are added to this suite, 
the test cases must be
 :   # executed before this one.
Always better to keep comments brief, if possible. I think all of this can be 
simplified to something like:

"This test needs to call self.client.close() to force computation of query end 
time, so it has to be in its own suite (IMPALA-6498)."



--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:22:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9562

to look at the new patch set (#5).

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..

IMPALA-6576: Add metrics for data stream service memory usage

This change adds metrics for the data stream service memory usage. Both
current and peak usage are exposed.

It adds a test to test_krpc_metrics.py to make sure that the expected
metrics are present and that the peak usage shows a non-zero value after
running a query.

Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
---
M be/src/runtime/exec-env.cc
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M be/src/util/memory-metrics.cc
M be/src/util/memory-metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_krpc_metrics.py
7 files changed, 104 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/9562/5
--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Zoram Thanga (Code Review)
Zoram Thanga has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9590 )

Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..


Patch Set 3:

(1 comment)

Please have a look at ps #3

http://gerrit.cloudera.org:8080/#/c/9590/2/tests/query_test/test_thrift_profile.py
File tests/query_test/test_thrift_profile.py:

http://gerrit.cloudera.org:8080/#/c/9590/2/tests/query_test/test_thrift_profile.py@24
PS2, Line 24:
> It should be sufficient just to put it into its own class still within the
Done



--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:17:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Zoram Thanga (Code Review)
Zoram Thanga has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9590 )

Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..


Patch Set 3:

> (1 comment)

Thanks for the suggestion. I've moved it back to the old file, but a separate 
class.


--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:17:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Zoram Thanga (Code Review)
Hello Michael Ho, Thomas Tauber-Marshall,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9590

to look at the new patch set (#3).

Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..

IMPALA-6498: test_query_profile_thrift_timestamps causes following
tests to fail.

test_query_profile_thrift_timestamps uses ImapaTestSuite.client.close()
to force cancellation/unregistration of the query, so that 'EndTime'
of the query shows up in the profile. Since other test cases also need
a valid ImpalaTestSuite.client, we move the test case in question to
its own test suite.

Have also reduced the query to 'select sleep(5)', as the earlier
'select sleep(1)' is just really excessively long.

Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
---
M tests/query_test/test_observability.py
1 file changed, 60 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/9590/3
--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 3: Code-Review+2

Thanks, the new comments make it clearer why we're doing what we're doing.


--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:10:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9590 )

Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9590/2/tests/query_test/test_thrift_profile.py
File tests/query_test/test_thrift_profile.py:

http://gerrit.cloudera.org:8080/#/c/9590/2/tests/query_test/test_thrift_profile.py@24
PS2, Line 24: class TestThriftProfile(ImpalaTestSuite):
It should be sufficient just to put it into its own class still within the 
original file.



--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 22:00:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Zoram Thanga (Code Review)
Hello Michael Ho, Thomas Tauber-Marshall,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9590

to look at the new patch set (#2).

Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..

IMPALA-6498: test_query_profile_thrift_timestamps causes following
tests to fail.

test_query_profile_thrift_timestamps uses ImapaTestSuite.client.close()
to force cancellation/unregistration of the query, so that 'EndTime'
of the query shows up in the profile. Since other test cases also need
a valid ImpalaTestSuite.client, we move the test case in question to
its own test suite.

Have also reduced the query to 'select sleep(5)', as the earlier
'select sleep(1)' is just really excessively long.

Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
---
M tests/query_test/test_observability.py
A tests/query_test/test_thrift_profile.py
2 files changed, 82 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/9590/2
--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6498: test query profile thrift timestamps causes following tests to fail.

2018-03-12 Thread Zoram Thanga (Code Review)
Zoram Thanga has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9590


Change subject: IMPALA-6498: test_query_profile_thrift_timestamps causes 
following tests to fail.
..

IMPALA-6498: test_query_profile_thrift_timestamps causes following
tests to fail.

test_query_profile_thrift_timestamps uses ImapaTestSuite.client.close()
to force cancellation/unregistration of the query, so that 'EndTime'
of the query shows up in the profile. Since other test cases also need
a valid ImpalaTestSuite.client, we move the test case in question to
its own test suite.

Have also reduced the query to 'select sleep(5)', as the earlier
'select sleep(1)' is just really excessively long.

Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
---
A tests/query_test/test_thrift_profile.py
1 file changed, 82 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/9590/1
--
To view, visit http://gerrit.cloudera.org:8080/9590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I287a05f3c90b1a71a5b7ee0f5c06a8840ebac4c9
Gerrit-Change-Number: 9590
Gerrit-PatchSet: 1
Gerrit-Owner: Zoram Thanga 


[Impala-ASF-CR] IMPALA-6643: Add REFRESH fine-grained privilege

2018-03-12 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9589


Change subject: IMPALA-6643: Add REFRESH fine-grained privilege
..

IMPALA-6643: Add REFRESH fine-grained privilege

With this patch, REFRESH privilege is now required to execute INVALIDATE
METADATA or REFRESH statement.

Testing:
- Ran end-to-end tests

Change-Id: I4c3c5a51fe493d39fd719c7a388d4d5760049ce4
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/Privilege.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/resources/authz-policy.ini.template
8 files changed, 137 insertions(+), 36 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/9589/2
--
To view, visit http://gerrit.cloudera.org:8080/9589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4c3c5a51fe493d39fd719c7a388d4d5760049ce4
Gerrit-Change-Number: 9589
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-3040: add logging to test caching ddl

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9579 )

Change subject: IMPALA-3040: add logging to test_caching_ddl
..

IMPALA-3040: add logging to test_caching_ddl

We don't currently have enough information to reconstruct why the test
failed, so lets add logging with timestamps to understand which timeout
we're actually hitting.

Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
Reviewed-on: http://gerrit.cloudera.org:8080/9579
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins
---
M tests/query_test/test_hdfs_caching.py
1 file changed, 5 insertions(+), 1 deletion(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
Gerrit-Change-Number: 9579
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3040: add logging to test caching ddl

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9579 )

Change subject: IMPALA-3040: add logging to test_caching_ddl
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
Gerrit-Change-Number: 9579
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 21:53:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@128
PS2, Line 128: This only happens if there is no entry for this file or all the
 :   // existing file handles are in use.
> That was true while holding the lock above but is no longer necessarily tru
Changed this to insert in the front and updated this comment about why. Also 
added a comment when iterating above to explain why iterating in order is 
important.


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@130
PS2, Line 130:  matter whether we insert this entry before or after or in 
between the
 :   // existing entries for this file
> why is the previous sentence needed in order for this to be okay?
This was poorly worded. What I meant is that not much time has passed since we 
saw that there wasn't any entry available. Any ordering decision is mostly 
unimportant under that circumstance.

I thought about it some more, and we are very slightly better off explicitly 
specifying the ordering. It is mostly an edge case, but inserting at the front 
solves it.


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@134
PS2, Line 134:   FileHandleEntry entry(new_fh, p.lru_list);
 :   typename MapType::iterator new_it = p.cache.emplace(*fname, 
std::move(entry));
> I would remove the constructor and move the arguments to emplace: emplace(*
I appreciate the comment, but I try to avoid style changes in adjacent code.


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@137
PS2, Line 137:   ++p.size;
 :   if (p.size > p.capacity) EvictHandles(p);
> I would prefer to move this before the creation of the new entry, to make i
I appreciate the comment, but I try to avoid style changes in adjacent code.


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@139
PS2, Line 139:   DCHECK(!new_elem->in_use);
> This DCHECK could be probably removed.
This could go either way. We are relying on the constructor setting in_use to 
false. I figure a DCHECK doesn't hurt.



--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 21:51:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Joe McDonnell (Code Review)
Hello Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9576

to look at the new patch set (#3).

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..

IMPALA-6638: Reduce file handle cache lock contention

FileHandleCache::OpenFileHandle() currently holds the
lock while opening a file handle. This lengthens the
duration holding the lock considerably, causing
contention when a lot of file handles are being
opened (i.e. when the cache is cold).

This changes FileHandleCache::OpenFileHandle() drops
the lock while opening the file handle, then
reacquires it to add the file handle to the cache.

When running a simple select on a table with 46801
Parquet files, this fixes a small performance overhead
for the cold cache case compared to having the
cache off. All results are averaged over 5 runs:

File handle cache off: 7.19s
File handle cache cold (no patch): 7.92s
File handle cache cold (with patch): 7.20s

When running a select that accesses and aggregates
4 columns from the same Parquet table (thus requiring
more scan ranges for the same file), the fix reduces
contention, particularly for the case with 8 IO threads
per disk:

1 IO thread per disk:
Without patch: 9.59s
With patch: 8.11s

8 IO threads per disk:
Without patch: 14.2s
With patch: 8.17s

The patch has no impact on the performance when the
cache is hot.

Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
---
M be/src/runtime/io/handle-cache.inline.h
1 file changed, 37 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9576/3
--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-3866: Improve error reporting for scratch write errors

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9420 )

Change subject: IMPALA-3866: Improve error reporting for scratch write errors
..


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/disk-io-mgr.h@373
PS8, Line 373:   /// queued buffers (plus the buffer that is returned to the 
client) gives good
Let's make it clear that it's a test-only function.


http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/disk-io-mgr.h@425
PS8, Line 425:   /// provide a MemTracker.
We try to avoid shared_ptr as much as possible because shared ownership is 
generally harder to reason about. I think in this case unique_ptr is sufficient 
if you move() it into set_local_file_system


http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/disk-io-mgr.cc@1101
PS8, Line 1101:   DCHECK(write_range != nullptr);
I missed on my first pass that we aren't fclose()ing the file when we hit an 
error now. It would be ok to add it before each call to HandleWriteFinished(), 
but I would probably use the "goto end" pattern since we need to execute the 
exact same two lines of code on all exit paths, i.e.:



  ret_status = ...;
  if (!ret_status.ok()) goto end:
  ...

  end:
ret_status = local_file_system_->Fclose(file_handle, write_range);
HandleWriteFinished(writer_context, write_range, ret_status);


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc
File be/src/runtime/io/errno-to-error-status-converter.cc:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc@23
PS7, Line 23: using std::unordered_map;
> I wasn't aware of this common/names.h, thx!
Yeah that's a bit of an unfortunate case. We started off with 
boost::unordered_map. In most cases where a std:: equivalent appeared we've 
automatically switched, but there was some concern here that the std:: versions 
of the maps would use somewhat more memory and be slower because the standard 
requires using doubly-linked lists for buckets: 
http://bannalia.blogspot.co.uk/2013/10/implementation-of-c-unordered.html

We could probably benchmark std::unordered_map and switch over if it looked 
acceptable but we haven't put the time in yet.

I'd suggest using std::unordered_map here since the map is small and its not 
perf-critical.


http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/local-file-system-with-fault-injection.cc
File be/src/runtime/io/local-file-system-with-fault-injection.cc:

http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/local-file-system-with-fault-injection.cc@31
PS8, Line 31:
Long lines. It might be useful to run clang-format on the patches to detect 
some of these minor issues:

https://wiki.cloudera.com/pages/viewpage.action?pageId=24646528

The downside is that clang-format is sometimes overly aggressive at 
reformatting surrounding lines, which adds unnecessary code churn.


http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/local-file-system.cc
File be/src/runtime/io/local-file-system.cc:

http://gerrit.cloudera.org:8080/#/c/9420/8/be/src/runtime/io/local-file-system.cc@57
PS8, Line 57:
I'm not sure what this TODO means



--
To view, visit http://gerrit.cloudera.org:8080/9420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5aa7b424209b1a5ef8dc7d04c5ba58788e91aad7
Gerrit-Change-Number: 9420
Gerrit-PatchSet: 7
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 21:45:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..


Patch Set 16:

I started a dry run of the merge just in case there were any test failures or 
clang-tidy warnings: https://jenkins.impala.io/job/gerrit-verify-dryrun/2086/


--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 16
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 
Gerrit-Comment-Date: Mon, 12 Mar 2018 21:16:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..


Patch Set 16:

(5 comments)

I think this should be the last round of comments.

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/exprs/expr-test.cc@6176
PS16, Line 6176:   TestStringValue("monthname(cast('2011-02-18 09:10:11.00' 
as timestamp))", "February");
nit: long lines. It might be easier to run the patch through clang-format to 
fix some of these whitespace and include ordering issues:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536


http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/7009/16/be/src/runtime/timestamp-parse-util.cc@28
PS16, Line 28: #include 
Standard C++ headers go before the 3rd-party library headers. I.e. line 19 
above.


http://gerrit.cloudera.org:8080/#/c/7009/16/testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test
File 
testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test:

http://gerrit.cloudera.org:8080/#/c/7009/16/testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test@3
PS16, Line 3: select cast(dt_str as timestamp) from lazy_dt_str
should not need a cast here, we want to test the text scanner parsing the 
datetime column directly


http://gerrit.cloudera.org:8080/#/c/7009/16/testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test@4
PS16, Line 4:  RESULTS
Let's make this RESULTS: VERIFY_IS_EQUAL_SORTED

We don't guarantee the order of returned rows without an order by clause 
(although in practice it will be in the file order for this query plan).


http://gerrit.cloudera.org:8080/#/c/7009/16/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/7009/16/tests/query_test/test_scanners.py@874
PS16, Line 874: STRING
This should be a timestamp column, so we can make sure that the text scanner 
does the conversion correctly.



--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 16
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 
Gerrit-Comment-Date: Mon, 12 Mar 2018 21:15:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9395 )

Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for 
start-impala-cluster.
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2084/


--
To view, visit http://gerrit.cloudera.org:8080/9395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7
Gerrit-Change-Number: 9395
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:55:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6341, IMPALA-5917: Reduce mem-limit for start-impala-cluster.

2018-03-12 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9395 )

Change subject: IMPALA-6341, IMPALA-5917: Reduce mem-limit for 
start-impala-cluster.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9395/2/testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test
File 
testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test:

http://gerrit.cloudera.org:8080/#/c/9395/2/testdata/workloads/functional-query/queries/QueryTest/semi-joins-exhaustive.test@9
PS2, Line 9: # of the relevant fragment is 3.6GB when tested.)
> It might be worth reducing it further, since we could potentially still hav
I've not done this at the moment.



--
To view, visit http://gerrit.cloudera.org:8080/9395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabca7a95560bd27c2de2b0a147ee9a3c45199db7
Gerrit-Change-Number: 9395
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:53:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6621: Improve set lookup performance for in-predicate evaluation

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9570 )

Change subject: IMPALA-6621: Improve set lookup performance for in-predicate 
evaluation
..


Patch Set 1:

(1 comment)

Out of curiousity I tried out this alternative hash map: 
https://github.com/greg7mdp/sparsepp

It was actually slower for decimal (700ms vs 500ms). I also concluded that 
google's dense_hash_set was somewhat tricky to make work, since it requires 
providing a sentinel value to represent empty entries.

http://gerrit.cloudera.org:8080/#/c/9570/1/be/src/exprs/in-predicate.h
File be/src/exprs/in-predicate.h:

http://gerrit.cloudera.org:8080/#/c/9570/1/be/src/exprs/in-predicate.h@359
PS1, Line 359:   state->val_set.insert(GetVal(state->type, 
*arg));
We should change this function to use the bulk insert API to avoid N^2 
behaviour with flat_set: 
http://www.boost.org/doc/libs/1_56_0/doc/html/boost/container/flat_set.html#idp30015536-bb



--
To view, visit http://gerrit.cloudera.org:8080/9570
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd1627d779d10a16468cc3c2d0bc26a497e048df
Gerrit-Change-Number: 9570
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:40:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2083/


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:25:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:15:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 20:03:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5717: Support for reading ORC data files

2018-03-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9134 )

Change subject: IMPALA-5717: Support for reading ORC data files
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9134/4/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/9134/4/be/src/exec/hdfs-scanner.h@379
PS4, Line 379: virtual
why make this virtual? This subroutine interface is pretty ill-defined, so I 
think we should avoid making it overridable.

I don't think it's needed to be virtual anyway, since only the subclasses call 
it, so you could always just not use it from the orc scanner. But, it looks 
like the orc scanner version is the same implementation anyway...



--
To view, visit http://gerrit.cloudera.org:8080/9134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7b6ae4ce3b9ee8125b21993702faa87537790a4
Gerrit-Change-Number: 9134
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:58:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5980: Upgrade to LLVM 5.0.1

2018-03-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9584 )

Change subject: IMPALA-5980: Upgrade to LLVM 5.0.1
..


Patch Set 1:

Do any of those benchmarks verify compile time performance?


--
To view, visit http://gerrit.cloudera.org:8080/9584
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib0a15cb53feab89e7b35a56b67b3b30eb3e62c6b
Gerrit-Change-Number: 9584
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:45:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 4: Code-Review+1

Thanks for the reviews. PS4 addresses Tim's comment. Carrying +1 from Michael 
and Sailesh.


--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:39:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@128
PS2, Line 128: This only happens if there is no entry for this file or all the
 :   // existing file handles are in use.
That was true while holding the lock above but is no longer necessarily true 
since we no longer hold the lock -- there might now be an unused entry for this 
file.  So how can we rely on that?


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@130
PS2, Line 130:  matter whether we insert this entry before or after or in 
between the
 :   // existing entries for this file
why is the previous sentence needed in order for this to be okay?



--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:28:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6479: Update DESCRIBE to respect column privileges

2018-03-12 Thread Adam Holley (Code Review)
Adam Holley has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/9276 )

Change subject: IMPALA-6479: Update DESCRIBE to respect column privileges
..

IMPALA-6479: Update DESCRIBE to respect column privileges

Modified the Frontend to filter columns from the DESCRIBE
statement.  Additionally, if a user has select on at least
one column, they can run DESCRIBE and see most metadata.
If they do not have full table access, they will not see
location or view query metadata.

Testing:
Added tests to validate users that have one or more column
access can run describe and that the output is filtered
accordingly.

Change-Id: Ic96ae184fccdc88ba970b5adcd501da1966accb9
---
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
13 files changed, 333 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9276/2
--
To view, visit http://gerrit.cloudera.org:8080/9276
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic96ae184fccdc88ba970b5adcd501da1966accb9
Gerrit-Change-Number: 9276
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-6479: Update DESCRIBE to respect column privileges

2018-03-12 Thread Adam Holley (Code Review)
Adam Holley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9276 )

Change subject: IMPALA-6479: Update DESCRIBE to respect column privileges
..


Patch Set 1:

(22 comments)

Changes applied.

http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/client-request-state.cc@381
PS1, Line 381:   params->__isset.table_name ? &(params->table_name) : 
NULL;
> NULL -> nullptr
Removed lines from other refactoring.


http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/frontend.h@133
PS1, Line 133:   Status DescribeTable(const TDescribeOutputStyle::type 
output_style,
> Why do we need to change the signature so dramatically? Should it not be su
Refactored.


http://gerrit.cloudera.org:8080/#/c/9276/1/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/9276/1/common/thrift/Frontend.thrift@173
PS1, Line 173:   4: optional ImpalaInternalService.TSessionState session
> Yes, you are right. We need to keep this field.
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java@122
PS1, Line 122:   resultStruct_ = Path.getTypeAsStruct(path_.destType());
> Maybe I'm missing something, but it seems like the following scenario is no
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/catalog/Table.java@489
PS1, Line 489:   public StructType getHiveColumnsAsStruct(List columns) 
{
> Seems weird to have this as a member, since it's totally non-specific to th
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/catalog/Table.java@490
PS1, Line 490: ArrayList fields = 
Lists.newArrayListWithCapacity(colsByPos_.size());
> columns.size()
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
File fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java:

http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@199
PS1, Line 199: List nonClustered = new 
ArrayList(table.getNonClusteringColumns());
> Will this code work if 'nonClustered' or 'clustered' is empty?
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@200
PS1, Line 200: nonClustered.retainAll(filteredColumns);
> Concise but slow. I suggest this instead
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@259
PS1, Line 259:* Builds a TDescribeResult for a table.
> update comment
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@261
PS1, Line 261:   public static TDescribeResult 
buildDescribeMinimalResult(List columns) {
> buildKuduDescribeMinimalResult()
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/Frontend.java@796
PS1, Line 796:   for (Column col: table.getColumnsInHiveOrder()) {
> Can we do a table-level check first? Checking all columns all the time is p
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/Frontend.java@806
PS1, Line 806:   filteredColumns = table.getColumns();
> Shouldn't this be getColumnsInHiveOrder()?
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/JniFrontend.java@434
PS1, Line 434: // If the session was not set it indicates this is an 
internal Impala call.
> Where is this called internally? I only see this function being called Clie
Done


http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java:

http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@116
PS1, Line 116:   private static final List 

[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@553
PS1, Line 553: # Decimal InList predicate.
> Good point. You could still make it work by having that test join alltypes
Done



--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:19:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9578

to look at the new patch set (#2).

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..

IMPALA-6635: Add DECIMAL type to Kudu predicates

This patch enables pushing scan predicates on
DECIMAL columns down to Kudu.

Testing:
- Added Planner decimal predicate test to kudu.test
- Added Planner decimal in-list test to kudu-selectivity.test

Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
---
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
3 files changed, 35 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/9578/2
--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@553
PS1, Line 553: # Decimal InList predicate.
> I can't add the column there because the functional alltypes table doesn't
Good point. You could still make it work by having that test join alltypes and 
decimal_tbl, but that's messy.

If you just move this to be next to that test (and also add comments on both 
test cases to make it clear what's going on) that should be fine



--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:18:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9578/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9578/1//COMMIT_MSG@8
PS1, Line 8:
> Could you add a little more color here, eg. "This patch enables pushing sca
Done


http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@553
PS1, Line 553: # Decimal InList predicate.
> There's an existing test in 'PlannerTest/kudu-selectivity.test' (around lin
I can't add the column there because the functional alltypes table doesn't 
actually contain all types and adding the decimal type would be extremely 
invasive given how widely used that table is in tests.

I could move this test to that location if you prefer, but it would still be 
stand alone.



--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:13:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5980: Upgrade to LLVM 5.0.1

2018-03-12 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9584


Change subject: IMPALA-5980: Upgrade to LLVM 5.0.1
..

IMPALA-5980: Upgrade to LLVM 5.0.1

Highlighting a few changes in LLVM:
- Minor changes to some function signatures
- Minor changes to error handling
- Split Bitcode/ReaderWriter.h - https://reviews.llvm.org/D26502
- Introduced an optional new GVN optimization pass.

Needed to fix a few new clang-tidy warnings.

Testing:
Ran core and ASAN tests successfully.

Performance:
Ran single node TPC-H and targeted perf with scale factor 60. Both
improved on average. Identified regression in
"primitive_filter_in_predicate" which will be addressed by IMPALA-6621.

+---+---+-++++
| Workload  | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) 
| Delta(GeoMean) |
+---+---+-++++
| TARGETED-PERF(60) | parquet / none / none | 22.29   | -0.12% | 3.90   
| +3.16% |
| TPCH(60)  | parquet / none / none | 15.97   | -3.64% | 10.14  
| -4.92% |
+---+---+-++++

+---++---++-++++-+---+
| Workload  | Query  | 
File Format   | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base 
StdDev(%) | Num Clients | Iters |
+---++---++-++++-+---+
| TARGETED-PERF(60) | PERF_LIMIT-Q1  | 
parquet / none / none | 0.01   | 0.00| R +156.43% | * 25.80% * | * 
17.14% * | 1   | 5 |
| TARGETED-PERF(60) | primitive_filter_in_predicate  | 
parquet / none / none | 3.39   | 1.92| R +76.33%  |   3.23%|   
4.37%| 1   | 5 |
| TARGETED-PERF(60) | primitive_filter_string_non_selective  | 
parquet / none / none | 1.25   | 1.11|   +12.46%  |   3.41%|   
5.36%| 1   | 5 |
| TARGETED-PERF(60) | primitive_filter_decimal_selective | 
parquet / none / none | 1.40   | 1.25|   +12.25%  |   3.57%|   
3.44%| 1   | 5 |
| TARGETED-PERF(60) | primitive_filter_string_like   | 
parquet / none / none | 16.87  | 15.65   |   +7.78%   |   5.05%|   
0.37%| 1   | 5 |
| TARGETED-PERF(60) | primitive_min_max_runtime_filter   | 
parquet / none / none | 1.79   | 1.71|   +4.77%   |   0.71%|   
1.73%| 1   | 5 |
| TARGETED-PERF(60) | primitive_broadcast_join_2 | 
parquet / none / none | 0.60   | 0.58|   +3.64%   |   3.19%|   
3.81%| 1   | 5 |
| TARGETED-PERF(60) | primitive_filter_string_selective  | 
parquet / none / none | 0.95   | 0.93|   +2.91%   |   5.23%|   
5.85%| 1   | 5 |
| TARGETED-PERF(60) | primitive_broadcast_join_3 | 
parquet / none / none | 4.33   | 4.21|   +2.83%   |   5.46%|   
3.25%| 1   | 5 |
| TARGETED-PERF(60) | primitive_groupby_bigint_lowndv| 
parquet / none / none | 4.59   | 4.47|   +2.82%   |   3.73%|   
1.14%| 1   | 5 |
| TARGETED-PERF(60) | primitive_conjunct_ordering_3  | 
parquet / none / none | 0.20   | 0.19|   +2.65%   |   4.76%|   
2.24%| 1   | 5 |
| TARGETED-PERF(60) | PERF_AGG-Q1| 
parquet / none / none | 2.49   | 2.43|   +2.31%   |   1.06%|   
1.93%| 1   | 5 |
| TARGETED-PERF(60) | PERF_AGG-Q6| 
parquet / none / none | 2.04   | 2.00|   +2.09%   |   3.51%|   
2.80%| 1   | 5 |
| TPCH(60)  | TPCH-Q3| 
parquet / none / none | 12.37  | 12.17   |   +1.62%   |   0.80%|   
2.45%| 1   | 5 |
| TARGETED-PERF(60) | PERF_STRING-Q5 | 
parquet / none / none | 4.52   | 4.45|   +1.54%   |   1.23%|   
1.08%| 1   | 5 |
| TPCH(60)  | TPCH-Q6| 
parquet / none / none | 2.95   | 2.91|   +1.33%   |   1.92%| 

[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc@58
PS3, Line 58: InitMetrics
I think InitMetrics() is misleading - the name is too similar to other 
functions that do some kind of global initialization. Maybe something like 
MemTrackerMetric::CreateMetrics() so it sounds more like a constructor/factory 
method.



--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:07:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of alias and ordinal substitution

2018-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9211 )

Change subject: IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of 
alias and ordinal substitution
..


Patch Set 1:

(1 comment)

thanks!

http://gerrit.cloudera.org:8080/#/c/9211/1/docs/topics/impala_aliases.xml
File docs/topics/impala_aliases.xml:

http://gerrit.cloudera.org:8080/#/c/9211/1/docs/topics/impala_aliases.xml@76
PS1, Line 76: , you can read more about it at
> Replace with "." Conref will render the text here.
Done



--
To view, visit http://gerrit.cloudera.org:8080/9211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I558230d07212da62d2cd12e07a52ceba03e980a8
Gerrit-Change-Number: 9211
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:05:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of alias and ordinal substitution

2018-03-12 Thread Zoltan Borok-Nagy (Code Review)
Hello Alex Rodoni, John Russell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9211

to look at the new patch set (#2).

Change subject: IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of 
alias and ordinal substitution
..

IMPALA-5191, IMPALA-6415: [DOCS] Document breaking change of alias and ordinal 
substitution

The alias and ordinal substitution logic has been
changed by IMPALA-5191. This commit updates the
documentation regarding to the new behavior.

Change-Id: I558230d07212da62d2cd12e07a52ceba03e980a8
Cherry-picks: not for 2.x.
---
M docs/shared/impala_common.xml
M docs/topics/impala_aliases.xml
2 files changed, 77 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/9211/2
--
To view, visit http://gerrit.cloudera.org:8080/9211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I558230d07212da62d2cd12e07a52ceba03e980a8
Gerrit-Change-Number: 9211
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 1: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9578/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9578/1//COMMIT_MSG@8
PS1, Line 8:
Could you add a little more color here, eg. "This patch enables pushing scan 
predicates on DECIMAL columns down to Kudu. Testing: ..."


http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

http://gerrit.cloudera.org:8080/#/c/9578/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test@553
PS1, Line 553: # Decimal InList predicate.
There's an existing test in 'PlannerTest/kudu-selectivity.test' (around line 
124) that covers in-list predicates for all of the other types. Can you add 
this there?

(and could you also add a brief comment for that test case that notes that its 
testing push down for various types of in-list predicates?)



--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 19:00:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc@58
PS3, Line 58: DataStreamService
> If we're going to do a CM change anyway, then I suggest staying consistent 
Lars pointed out to me that non KRPC RPC metrics are also camel cased. So, 
since that's the case, lets leave this as it is. It's not important enough to 
have to change everything at this point.



--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:58:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9527 )

Change subject: IMPALA-6609: Fix ownership of class members in 
KrpcDataStreamRecvr
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2082/


--
To view, visit http://gerrit.cloudera.org:8080/9527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Gerrit-Change-Number: 9527
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:56:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 2: Code-Review+1

(3 comments)

Thanks for the explanation!

I have left some optional comments, but the code is ok for me as it is.

http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@134
PS2, Line 134:   FileHandleEntry entry(new_fh, p.lru_list);
 :   typename MapType::iterator new_it = p.cache.emplace(*fname, 
std::move(entry));
I would remove the constructor and move the arguments to emplace: 
emplace(*fname, new_fh, p.lru_list) or emplace(*fname, FileHandleEntry(new_fh, 
p.lru_list))


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@137
PS2, Line 137:   ++p.size;
 :   if (p.size > p.capacity) EvictHandles(p);
I would prefer to move this before the creation of the new entry, to make it 
clear that EvictHandles() can not have any effect on it. This would also make 
it a little little bit faster, because evicted elements would be removed from a 
smaller map.


http://gerrit.cloudera.org:8080/#/c/9576/2/be/src/runtime/io/handle-cache.inline.h@139
PS2, Line 139:   DCHECK(!new_elem->in_use);
This DCHECK could be probably removed.



--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:56:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr

2018-03-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9527 )

Change subject: IMPALA-6609: Fix ownership of class members in 
KrpcDataStreamRecvr
..


Patch Set 5: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Gerrit-Change-Number: 9527
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:52:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 3: Code-Review+1

(1 comment)

Feel free to upgrade to a +2 if no one else is reviewing it aside from Michael 
and I.

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc@58
PS3, Line 58: DataStreamService
> Yes, we seem to be a bit inconsistent in the naming convention for the RPC
If we're going to do a CM change anyway, then I suggest staying consistent with 
the other metrics and change both this and the one from IMPALA-6269.

If the CM change is to be deferred for later, then sticking with this is fine. 
But the preference is the former.



--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:48:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6610: Impala shell fetches the value of ldap password cmd incorrectly

2018-03-12 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9506 )

Change subject: IMPALA-6610: Impala shell fetches the value of 
ldap_password_cmd incorrectly
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9506/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9506/8//COMMIT_MSG@7
PS8, Line 7: Impala shell fetches the value of ldap_password_cmd incorrectly
Update


http://gerrit.cloudera.org:8080/#/c/9506/8//COMMIT_MSG@9
PS8, Line 9: The value of LDAP password in Impala shell contains extra line 
break causes
nit: Remove trailing spaces and wrap (here and below)


http://gerrit.cloudera.org:8080/#/c/9506/8/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/9506/8/shell/impala_shell.py@821
PS8, Line 821:   if options.ldap_password_cmd.strip().startswith('echo') 
and \
Add checks for ldap_password_cmd & ldap_password not being None. Else shell can 
crash with an error if it hits this Exception and --ldap_password_cmd is not 
used.



--
To view, visit http://gerrit.cloudera.org:8080/9506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7
Gerrit-Change-Number: 9506
Gerrit-PatchSet: 8
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:34:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3040: add logging to test caching ddl

2018-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9579 )

Change subject: IMPALA-3040: add logging to test_caching_ddl
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2081/


--
To view, visit http://gerrit.cloudera.org:8080/9579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
Gerrit-Change-Number: 9579
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:15:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3040: add logging to test caching ddl

2018-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9579 )

Change subject: IMPALA-3040: add logging to test_caching_ddl
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
Gerrit-Change-Number: 9579
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:13:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6576: Add metrics for data stream service memory usage

2018-03-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9562 )

Change subject: IMPALA-6576: Add metrics for data stream service memory usage
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/9562/3/be/src/service/data-stream-service.cc@58
PS3, Line 58: DataStreamService
> I noticed that, too, but the one we added in IMPALA-6269 is camel case. htt
Yes, we seem to be a bit inconsistent in the naming convention for the RPC 
related metrics and other metrics. My suggestion would be to stay consistent 
with other RPC related metrics for now.



--
To view, visit http://gerrit.cloudera.org:8080/9562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5033b8dfda0b23d4230535ba13c3e050a35d01a3
Gerrit-Change-Number: 9562
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 18:09:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3040: add logging to test caching ddl

2018-03-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9579


Change subject: IMPALA-3040: add logging to test_caching_ddl
..

IMPALA-3040: add logging to test_caching_ddl

We don't currently have enough information to reconstruct why the test
failed, so lets add logging with timestamps to understand which timeout
we're actually hitting.

Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
---
M tests/query_test/test_hdfs_caching.py
1 file changed, 5 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/9579/1
--
To view, visit http://gerrit.cloudera.org:8080/9579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iabc30445440e0fb358856da407d833f5ae975213
Gerrit-Change-Number: 9579
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 1:

(1 comment)

> (1 comment)
 >
 > This change can increase the number of parallel calls to the
 > namenode by FileHandleCache, which was limited by the number of
 > partitions before. I do not know if this can cause problems or not,
 > but I would prefer to mention it as a possible side effect, or
 > limit the number of parallel hdfsOpenFile calls somehow.

Good point. This code executes in the disk IO threads, and there are a limited 
number of disk IO threads (1/spinning disk, 8/SSD, etc). This can be larger 
than the number of partitions for the cache, but it is limited. The reason that 
I consider it ok to increase the parallelism is that we already use this level 
of parallelism when the file handle cache is off. When the cache is off, we use 
DiskIoMgr::GetExclusiveHdfsFileHandle(), which doesn't need to get any lock. 
This is fixing up the file handle cache so that it can be as fast as when the 
cache is off.

http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h@129
PS1, Line 129:   pair 
range =
 : p.cache.equal_range(*fname);
 :   FileHandleEntry entry(new_fh, p.lru_list);
 :   typename MapType::iterator new_it = 
p.cache.emplace_hint(range.second,
 :   *fname, std::move(entry));
> Does the hint logic make a difference anymore? I think that it would be bet
Good point, that is simpler/better. Done.



--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 12 Mar 2018 17:24:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Joe McDonnell (Code Review)
Hello Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9576

to look at the new patch set (#2).

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..

IMPALA-6638: Reduce file handle cache lock contention

FileHandleCache::OpenFileHandle() currently holds the
lock while opening a file handle. This lengthens the
duration holding the lock considerably, causing
contention when a lot of file handles are being
opened (i.e. when the cache is cold).

This changes FileHandleCache::OpenFileHandle() drops
the lock while opening the file handle, then
reacquires it to add the file handle to the cache.

When running a simple select on a table with 46801
Parquet files, this fixes a small performance overhead
for the cold cache case compared to having the
cache off. All results are averaged over 5 runs:

File handle cache off: 7.19s
File handle cache cold (no patch): 7.92s
File handle cache cold (with patch): 7.20s

When running a select that accesses and aggregates
4 columns from the same Parquet table (thus requiring
more scan ranges for the same file), the fix reduces
contention, particularly for the case with 8 IO threads
per disk:

1 IO thread per disk:
Without patch: 9.59s
With patch: 8.11s

8 IO threads per disk:
Without patch: 14.2s
With patch: 8.17s

The patch has no impact on the performance when the
cache is hot.

Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
---
M be/src/runtime/io/handle-cache.inline.h
1 file changed, 30 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/9576/2
--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-12 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9484 )

Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
..


Patch Set 4:

Added a dependency on a patch for IMPALA-6635 which contains all functional 
changes so this patch is still test/tpc changes only.


--
To view, visit http://gerrit.cloudera.org:8080/9484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f
Gerrit-Change-Number: 9484
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 17:00:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6609: Fix ownership of class members in KrpcDataStreamRecvr

2018-03-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9527 )

Change subject: IMPALA-6609: Fix ownership of class members in 
KrpcDataStreamRecvr
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/9527/4/be/src/runtime/krpc-data-stream-recvr.h@73
PS4, Line 73: /// the recvr instance from the tracking structure of its 
KrpcDataStreamMgr in all cases.
> Done
The new comment doesn't really explicitly talk about the synchronization rules, 
but I guess they can be inferred.



--
To view, visit http://gerrit.cloudera.org:8080/9527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3378496e2201b16c662b9daa634333480705c61a
Gerrit-Change-Number: 9527
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Mar 2018 16:59:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

2018-03-12 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall, Dimitris Tsirogiannis, Tim Armstrong, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9484

to look at the new patch set (#4).

Change subject: IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL
..

IMPALA-6551: Change Kudu TPCDS and TPCH columns to DECIMAL

Before Kudu supported DECIMAL columns the TPCDS and TPCH
columns were djusted to use DOUBLE in place of DECIMAL. This
patch undoes that change now that Kudu supports DECIMAL.

Testing:
 - Updated concurrent_select.py
 - Updated test_tpch_queries.py
 - Excersized by the Kudu planner tests

Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f
---
M testdata/datasets/tpcds/tpcds_kudu_template.sql
M testdata/datasets/tpch/tpch_kudu_template.sql
M testdata/datasets/tpch/tpch_schema_template.sql
D testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q27.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q3.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q34.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q42.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q43.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q46.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q52.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q53.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q55.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q59.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q6.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q61.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q63.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q68.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q7.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q73.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q79.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q8.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q88.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q89.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q96.test
D testdata/workloads/tpcds/queries/tpcds-kudu-q98.test
D testdata/workloads/tpch/queries/tpch-kudu-q1.test
D testdata/workloads/tpch/queries/tpch-kudu-q10.test
D testdata/workloads/tpch/queries/tpch-kudu-q11.test
D testdata/workloads/tpch/queries/tpch-kudu-q12.test
D testdata/workloads/tpch/queries/tpch-kudu-q13.test
D testdata/workloads/tpch/queries/tpch-kudu-q14.test
D testdata/workloads/tpch/queries/tpch-kudu-q15.test
D testdata/workloads/tpch/queries/tpch-kudu-q16.test
D testdata/workloads/tpch/queries/tpch-kudu-q17.test
D testdata/workloads/tpch/queries/tpch-kudu-q18.test
D testdata/workloads/tpch/queries/tpch-kudu-q19.test
D testdata/workloads/tpch/queries/tpch-kudu-q2.test
D testdata/workloads/tpch/queries/tpch-kudu-q20.test
D testdata/workloads/tpch/queries/tpch-kudu-q21.test
D testdata/workloads/tpch/queries/tpch-kudu-q22.test
D testdata/workloads/tpch/queries/tpch-kudu-q3.test
D testdata/workloads/tpch/queries/tpch-kudu-q4.test
D testdata/workloads/tpch/queries/tpch-kudu-q5.test
D testdata/workloads/tpch/queries/tpch-kudu-q6.test
D testdata/workloads/tpch/queries/tpch-kudu-q7.test
D testdata/workloads/tpch/queries/tpch-kudu-q8.test
D testdata/workloads/tpch/queries/tpch-kudu-q9.test
M tests/query_test/test_tpch_queries.py
M tests/stress/concurrent_select.py
52 files changed, 108 insertions(+), 22,157 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/9484/4
--
To view, visit http://gerrit.cloudera.org:8080/9484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f7e4464dc6705cadd610a82c459390a9c0dfe4f
Gerrit-Change-Number: 9484
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9578 )

Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..


Patch Set 1:

Started https://jenkins.impala.io/job/pre-review-test/151/


--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 12 Mar 2018 16:51:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6635: Add DECIMAL type to Kudu predicates

2018-03-12 Thread Grant Henke (Code Review)
Grant Henke has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9578


Change subject: IMPALA-6635: Add DECIMAL type to Kudu predicates
..

IMPALA-6635: Add DECIMAL type to Kudu predicates

Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
---
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
2 files changed, 36 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/9578/1
--
To view, visit http://gerrit.cloudera.org:8080/9578
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2569a9e1d58f1c58884d58633d46348364888ed7
Gerrit-Change-Number: 9578
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 


[Impala-ASF-CR] IMPALA-6638: Reduce file handle cache lock contention

2018-03-12 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9576 )

Change subject: IMPALA-6638: Reduce file handle cache lock contention
..


Patch Set 1:

(1 comment)

This change can increase the number of parallel calls to the namenode by 
FileHandleCache, which was limited by the number of partitions before. I do not 
know if this can cause problems or not, but I would prefer to mention it as a 
possible side effect, or limit the number of parallel hdfsOpenFile calls 
somehow.

http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/9576/1/be/src/runtime/io/handle-cache.inline.h@129
PS1, Line 129:   pair 
range =
 : p.cache.equal_range(*fname);
 :   FileHandleEntry entry(new_fh, p.lru_list);
 :   typename MapType::iterator new_it = 
p.cache.emplace_hint(range.second,
 :   *fname, std::move(entry));
Does the hint logic make a difference anymore? I think that it would be better 
to replace emplace_hint() with emplace() and remove equal_range().



--
To view, visit http://gerrit.cloudera.org:8080/9576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c695b21ca556e9c73c703c0c891e64939271c8d
Gerrit-Change-Number: 9576
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Comment-Date: Mon, 12 Mar 2018 16:11:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3866: Improve error reporting for scratch write errors

2018-03-12 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9420 )

Change subject: IMPALA-3866: Improve error reporting for scratch write errors
..


Patch Set 8:

(16 comments)

Thanks for taking a look, Tim!

http://gerrit.cloudera.org:8080/#/c/9420/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9420/7//COMMIT_MSG@7
PS7, Line 7: IMPALA-3866: Improve error reporting for scratch write errors
> nit: colon after JIRA number
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc
File be/src/runtime/io/disk-io-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@347
PS7, Line 347: void DiskIoMgrTest::ExecuteWriteFailureTest(DiskIoMgr* io_mgr, 
const string& file_name,
> nit: extra blank line.
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@355
PS7, Line 355: LOG(ERROR) << "Error creating temp file " << file_name << " 
of size 100";
> Can't we stack allocate this? The memory shouldn't be referenced once write
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@362
PS7, Line 362:   io_mgr->SetLocalFileSystem(fs);
> Same with new_range - can't we stack allocate it?
Removed this.


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@380
PS7, Line 380:   nullptr, nullptr, nullptr, data,
> I don't think allocated the pointer on the heap is needed, is it?
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-test.cc@385
PS7, Line 385: }
> We already overwrote the value of 'write_range' on l380, so this doesn't ac
That's right, removed the write_range param. Also I noticed that it's not 
necessary in this case to pass a write_range to WriteValidateCallback() as it 
only uses it if the write succeeds.


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h
File be/src/runtime/io/disk-io-mgr-with-fault-injection.h:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h@46
PS7, Line 46:
> Let's document explicitly which functions they're wrapping.
I docemented the name of the wrapped functions in disk-io-mgr.h. These are 
overriding the wrapper function from that class.


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h@48
PS7, Line 48:
> Let's add an "override" specifier to the end of overriding functions.
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr-with-fault-injection.h@58
PS7, Line 58:
> Our convention is to not use underscores on struct members, since they're m
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.h@401
PS7, Line 401: REMOTE_S3_DISK_OFFSET,
> I feel like maybe there's already too much functionality in the DiskIoMgr c
Good idea, thx!
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.h@404
PS7, Line 404:   };
> I also think if we're going to pull these out into functions, we should jus
Makes sense!
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/disk-io-mgr.cc@1120
PS7, Line 1120:
> We can unnest the body of this "else" now that we're doing an early return
Thx!
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.h
File be/src/runtime/io/errno-to-error-status-converter.h:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.h@34
PS7, Line 34:
> Maybe we can just name this ErrorConverter to be more concise?
Sure, Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.h@43
PS7, Line 43:
> Can we document what 'params' does?
Done


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc
File be/src/runtime/io/errno-to-error-status-converter.cc:

http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc@23
PS7, Line 23:
> We can just do #include "common/names.h" under the other includes and remov
I wasn't aware of this common/names.h, thx!
Done

(Apparently this doesn't work for std::unordered_map, however would work for 
boost::unordered_map. Do you know if there is a preference between them?)


http://gerrit.cloudera.org:8080/#/c/9420/7/be/src/runtime/io/errno-to-error-status-converter.cc@68
PS7, Line 68:
> nit: needs space
Done



--
To view, visit http://gerrit.cloudera.org:8080/9420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master

[Impala-ASF-CR] IMPALA-3866: Improve error reporting for scratch write errors

2018-03-12 Thread Gabor Kaszab (Code Review)
Hello Attila Jeges, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9420

to look at the new patch set (#8).

Change subject: IMPALA-3866: Improve error reporting for scratch write errors
..

IMPALA-3866: Improve error reporting for scratch write errors

The error messages coming from DiskIoMgr::Write() are enhanced by this
change. A mapping is introduced between the errno set by open(),
fdopen(), fseek(), fwrite() or fclose() low level functions and an
error message for displaying purposes. If any of these functions
fail then the returned error message is taken from this mapping.

In addition there were two functions, NewFile() and
FileAllocateSpace() that always returned Status::OK(). I made them
void and removed the status checks from the call sites.

For testing purposes a fault injection mechanism is introduced to
simulate the cases when the above mentioned functions fail.

Change-Id: I5aa7b424209b1a5ef8dc7d04c5ba58788e91aad7
---
M be/src/runtime/io/CMakeLists.txt
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
A be/src/runtime/io/error-converter.cc
A be/src/runtime/io/error-converter.h
A be/src/runtime/io/local-file-system-with-fault-injection.cc
A be/src/runtime/io/local-file-system-with-fault-injection.h
A be/src/runtime/io/local-file-system.cc
A be/src/runtime/io/local-file-system.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
15 files changed, 613 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/9420/8
--
To view, visit http://gerrit.cloudera.org:8080/9420
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5aa7b424209b1a5ef8dc7d04c5ba58788e91aad7
Gerrit-Change-Number: 9420
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 


  1   2   >