[Impala-ASF-CR] IMPALA-5506: Add stdin description to help information of query file option

2017-06-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5506: Add stdin description to help information of 
query_file option
..


Patch Set 1:

(1 comment)

Thanks for doing this!

http://gerrit.cloudera.org:8080/#/c/7178/1/shell/option_parser.py
File shell/option_parser.py:

Line 87: help="Execute the queries in the query file or 
from STDIN indicated by - and end input with ctrl+d, delimited by ;")
Please wrap to 90 chars.

The message is good - but may I suggest a minor reword? How about:

"Execute the queries in the query file, delimited by ;. Queries
  may be read from stdin if the argument to -f is -."


-- 
To view, visit http://gerrit.cloudera.org:8080/7178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0264174fd062497c72891b31cf9ac1ba6c00f716
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP message
..


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/729/

-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5506: Add stdin description to help information of query file option

2017-06-13 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7178

Change subject: IMPALA-5506: Add stdin description to help information of 
query_file option
..

IMPALA-5506: Add stdin description to help information of query_file option

Help information of query_file option in impala-shell misses stdin
description.

I fix this issue by adding stdin description to help information of
query_file option.

Change-Id: I0264174fd062497c72891b31cf9ac1ba6c00f716
---
M shell/option_parser.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/7178/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0264174fd062497c72891b31cf9ac1ba6c00f716
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP message
..


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/728/

-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message

2017-06-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP message
..


Patch Set 6: Code-Review+2

Thanks for fixing the nits. I'll run the pre commit job

-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..


Patch Set 6:

(6 comments)

Thanks for adding the new test!

http://gerrit.cloudera.org:8080/#/c/7168/6/fe/src/test/java/org/apache/impala/analysis/ExprSelectivityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprSelectivityTest.java:

Line 31: public void verifySelectivityStmt(String stmtStr, double 
expectedSel)
we indent 2


Line 33:   SelectStmt stmt = (SelectStmt) ParsesOk(stmtStr);
simplify with AnalyzesOk()


Line 38:   assertEquals(calculatedSel,expectedSel);
space after comma


Line 43:   String stmtStr = "select * from functional.alltypes where id " +
simplify this to "select  from functional.alltypes"


Line 52: private double getPredSel(int numPredChild) {
Very specific to IN predicate. How about we hardcode the expected values for 
now? It's not clear to me that it makes sense to re-implement the same logic 
again for testing purposes.


Line 69:   verifySel("in (1,3,5,7)", getPredSel(4));
Seems cleaner to provide the full expr as input and not assume "id". That way 
this framework should work for arbitrary exprs.


-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has uploaded a new patch set (#6).

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..

IMPALA-5494: Fixes the selectivity of NOT IN predicates

This change modifies the logic of NOT IN predicate so that
the planner can calculate the correct node cardinality. Prior
to this change, both IN and NOT IN predicates shared the
same selectivity, which resulted in the same cardinality
during planning.

The selectivity is calculated by the following heuristic:

selectivity = 1 - (num of predicate children /
num of distinct values)

Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
---
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
A fe/src/test/java/org/apache/impala/analysis/ExprSelectivityTest.java
2 files changed, 79 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/7168/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has uploaded a new patch set (#5).

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..

IMPALA-5494: Fixes the selectivity of NOT IN predicates

This change modifies the logic of NOT IN predicate so that
the planner can calculate the correct node cardinality. Prior
to this change, both IN and NOT IN predicates shared the
same selectivity, which resulted in the same cardinality
during planning.

The selectivity is calculated by the following heuristic:

selectivity = 1 - (num of predicate children /
num of distinct values)

Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
---
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
A fe/src/test/java/org/apache/impala/analysis/ExprSelectivityTest.java
2 files changed, 79 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/7168/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has uploaded a new patch set (#4).

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..

IMPALA-5494: Fixes the selectivity of NOT IN predicates

This change modifies the logic of NOT IN predicate so that
the planner can calculate the correct node cardinality. Prior
to this change, both IN and NOT IN predicates shared the
same selectivity, which resulted in the same cardinality
during planning.

The selectivity is calculated by the following heuristic:

selectivity = 1 - (num of predicate children /
num of distinct values)

Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
---
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
1 file changed, 7 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/7168/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has posted comments on this change.

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7168/3/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

Line 105: result = self.execute_query("explain select * from %s.%s where id 
"
> Instead of this end-to-end test, please add a new FE unit test ExprSelectiv
That makes sense. Done.


Line 105: result = self.execute_query("explain select * from %s.%s where id 
"
> Agree, e-e test seems like an overkill.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role is configured

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5495: Improve error message if no impalad role is 
configured
..


IMPALA-5495: Improve error message if no impalad role is configured

"Impala server needs to have a role (EXECUTOR, COORDINATOR)"

replaced with:

"Impala does not have a valid role configured. Either --is_coordinator or
--is_executor must be set to true."

Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Reviewed-on: http://gerrit.cloudera.org:8080/7167
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
---
M be/src/service/impala-server.cc
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Dimitris Tsirogiannis: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role is configured

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5495: Improve error message if no impalad role is 
configured
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message

2017-06-13 Thread Donghui Xu (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7166

to look at the new patch set (#6).

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP message
..

IMPALA-5492: Fix incorrect newline character in the LDAP message

The introduction has redundant '\' in impala-shell when using LDAP.

I fix this issue by removing extraneous '\' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
---
M shell/impala_shell.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/7166/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message within impala-shell introduction

2017-06-13 Thread Donghui Xu (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7166

to look at the new patch set (#5).

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP 
message within  impala-shell introduction
..

IMPALA-5492: Fix incorrect newline character in the LDAP message within 
impala-shell introduction

The introduction has redundant '\' in impala-shell when using LDAP.

I fix this issue by removing extraneous '\' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
---
M shell/impala_shell.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/7166/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message within impala-shell introduction

2017-06-13 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP 
message within impala-shell introduction
..


Patch Set 4:

(1 comment)

I have modified code according to your opinion and test it.

http://gerrit.cloudera.org:8080/#/c/7166/3/shell/impala_shell.py
File shell/impala_shell.py:

PS3, Line 1391: 
> nit: can you remove this while you're here as well?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message within impala-shell introduction

2017-06-13 Thread Donghui Xu (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7166

to look at the new patch set (#4).

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP 
message within impala-shell introduction
..

IMPALA-5492: Fix incorrect newline character in the LDAP message within 
impala-shell introduction

The introduction has redundant '\' in impala-shell when using LDAP.

I fix this issue by removing extraneous '\' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
---
M shell/impala_shell.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/7166/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message within impala-shell introduction

2017-06-13 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP 
message within impala-shell introduction
..


Patch Set 3:

(2 comments)

I have resolved the problems you mentioned. Please review again. Thanks.

http://gerrit.cloudera.org:8080/#/c/7166/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5492: Fix incorrect newline character in the LDAP message within 
impala-shell introduction
> nit: Space after the jira key, something like "IMPALA-5492: ". (Th
Done


PS2, Line 9: The introduction has redundant '\' in impala-shell when using LDAP.
   : 
   : I fix this issue by removing extraneous '\' in introduction when
   : impala-shell using LDAP.
> May be reword to "Remove extraneous '\' .."?  Looks like you are not removi
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message within impala-shell introduction

2017-06-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP 
message within impala-shell introduction
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7166/3/shell/impala_shell.py
File shell/impala_shell.py:

PS3, Line 1391: +
nit: can you remove this while you're here as well?


-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5492: Fix incorrect newline character in the LDAP message within impala-shell introduction

2017-06-13 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#3).

Change subject: IMPALA-5492: Fix incorrect newline character in the LDAP 
message within impala-shell introduction
..

IMPALA-5492: Fix incorrect newline character in the LDAP message within 
impala-shell introduction

The introduction has redundant '\' in impala-shell when using LDAP.

I fix this issue by removing extraneous '\' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
---
M shell/impala_shell.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/7166/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-5483: Automatically disable codegen for small queries

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5483: Automatically disable codegen for small queries
..


Patch Set 3:

1. Pretty big. The single node optimisation defaults to 100 rows total, this is 
set to 50,000 rows per node (e.g. 500,000 on a 10 node system).

2. Agreed. This checks the scan node estimates so is always disabled if there 
are a lot of rows to be scanned (or stats are unavailable). The main case where 
this could cause a major regression is an exploding many-to-many join, e.g. 
when two smallish tables are scanned but it blows up.

3. Yeah I think the model breaks down when the cost of codegen is non-linear 
with the size of the input exprs, but I think then we just need to fix the 
codegen. The inlining changes help but we still have cases like IMPALA-5296.

-- 
To view, visit http://gerrit.cloudera.org:8080/7153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I273bcee58641f5b97de52c0b2caab043c914b32e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5085: large rows in BufferedTupleStreamV2

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5085: large rows in BufferedTupleStreamV2
..


Patch Set 10:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/6638/10/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS10, Line 120: Exceptions are the page we're currently writing
> why is that still an exception now that we have the saved reservation?
You're right. For some reason I thought I fixed this code but I obviously 
didn't.


PS10, Line 271: ExpectedPinCount
> Now that the complex logic shifts to reservations, should this become "bool
This seemed to work out a bit cleaner for the validations (i.e. we can just 
assert the pin count equals this, rather than asserting is_pinned == NeedPin() 
&& pin_count == 1.


PS10, Line 340: has_write_iterator
> why is that? should comment say: Need reservation if there are no pages cur
Done


Line 409:   - write_page_reservation_to_reclaim)) {
> this can only fail for large pages, right? if so, how about clarifying with
Or for pinned streams. I added the DCHECK.


Line 424:   // free up reservation for the next write page, so do it first.
> then why does this not have to happen before the IncreaseReservationToFit()
It's not necessary now that we factor that into the IncreaseReservationToFit() 
calculation. Removed the comment.


PS10, Line 505: deleting or unpinning the previous page
> this seems like it needs updating.
Done


PS10, Line 516: (!pinned_ || pages_.size() == 1) && has_read_write_page()
> hmm, is this not expressible by NeedWriteReservation()?
Done. It's a little clunky but it is probably clearer.


PS10, Line 548: InvalidateReadIterator
> why do that? don't we still need the saved read reservation?
In case there was already a read iterator active. This just resets things to a 
clear state then sets up the new iterator from scratch.


Line 566: read_end_ptr_ = nullptr;
> why doesn't this case need to save reservation?
It's handled in PrepareForReadWrite(). It's unclear whether it's cleaner to 
handle this here or in the caller. The current approach is slightly less code.


PS10, Line 713: read_page_ == pages_.end()
> why do we do NextReadPage() if we're already at the end?
pages_.end() is just the general-purpose invalid iterator value. The added case 
handles moving to the first page of a read/write stream.


PS10, Line 866: so
  :   // we can just use AllocateRowSlow() to do the work of 
advancing the page.
> i don't understand why this follows from the first part of the sentence.
I don't think it was very helpful. Shortened the comment.


PS10, Line 893: pinned_
> shouldn't that be expressed in terms of NeedWriteReservation()?
Done


http://gerrit.cloudera.org:8080/#/c/6638/10/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS10, Line 77: In the unpinned mode, only
 : /// pages currently being read and written are pinned and other 
pages are unpinned and
 : /// therefore do not use the client's reservation and can be 
spilled to disk.
 : ///
> it seems like this sentence should be combined with the next paragraph, sin
Merged the paragraphs.


PS10, Line 89: caller
> client?
Done


Line 278:   /// but std::function always makes a heap allocation.
> explain what this callback should do.
Done


PS10, Line 281: 'fixed_size' and variable
  :   /// length data 'varlen_size'
> update
Done


Line 283:   /// start of the allocated space and returns true. Otherwise 
returns false.
> does this have the same three outcomes as AddRow?
Done


PS10, Line 284: AllocateRow
> At this point the "Add" and "Allocate" row terminology seems historical. Le
AddRow() still seems reasonable. AddRowCustom(), AddRowUnsafe()?


Line 484:   /// * there is only one pinned page in the stream and it is already 
pinned for reading.
> this comment is useful, but it'd also be helpful to explain what these thre
Done


PS10, Line 535: and at the byte before 'data_end'.
> what does that mean?
was missing "ending"


PS10, Line 536: updates *data_end
> data_end doesn't have enough indirection. is that suppose to say '*data'?
Done


Line 582:   void InvalidateWriteIterator();
> it's a bit hard to understand how these "invalidate" methods are different 
Folded ResetReadPage() into InvalidateReadIterator(). Updated the 
ResetWritePage() comment to reflect the logical position of the iterator after 
it returns.


http://gerrit.cloudera.org:8080/#/c/6638/10/be/src/runtime/buffered-tuple-stream-v2.inline.h
File be/src/runtime/buffered-tuple-stream-v2.inline.h:

Line 34: inline bool BufferedTupleStreamV2::AllocateRow(
> why is it that AllocateRow() fast path is inlined but AddRow() is not?
AddRow() calls into the interpreted DeepCopy() anyway on the fast path so 
there's not really a benefit, whereas AllocateRow() doesn't do as much work and 
ben

[Impala-ASF-CR] IMPALA-5483: Automatically disable codegen for small queries

2017-06-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-5483: Automatically disable codegen for small queries
..


Patch Set 3:

This is a reasonable extension of the single node optimization. I haven't gone 
through the change in details. A couple of questions:

1. how big is the window (in terms of number of rows) in which a query won't 
trigger single-node execution but still falls under this new hint ?

2. the caveat (as in single node optimization) is how accurate the estimation 
is.  Can you think of some pathological case in which we will regress hugely as 
a result of this new hint ? I can imagine stale hint being a reason.

3. will the ratio of Ec / Ei be drastically different for small vs large 
expressions or it doesn't matter as much after recent inlining change ?

-- 
To view, visit http://gerrit.cloudera.org:8080/7153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I273bcee58641f5b97de52c0b2caab043c914b32e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7168/3/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

Line 105: result = self.execute_query("explain select * from %s.%s where id 
"
> Instead of this end-to-end test, please add a new FE unit test ExprSelectiv
Agree, e-e test seems like an overkill.


-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5483: Automatically disable codegen for small queries

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5483: Automatically disable codegen for small queries
..


Patch Set 3:

Interested what you think about this approach - it seems like a big win for 
some workloads that we don't handle well now.

-- 
To view, visit http://gerrit.cloudera.org:8080/7153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I273bcee58641f5b97de52c0b2caab043c914b32e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5286/IMPALA-5283: Kudu column name case cleanup

2017-06-13 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#3).

Change subject: IMPALA-5286/IMPALA-5283: Kudu column name case cleanup
..

IMPALA-5286/IMPALA-5283: Kudu column name case cleanup

Impala is case insensitive for column names and generally deals
with them in all lower case. Kudu is case sensitive. This can
lead to a problems when a table is created externally in Kudu
with a column name with upper case letters.

This patch solves the problem by having KuduColumn always store
its name in lower case, so that general Impala code that has been
written expecting lower cased column names can use Column.getName
safely.

It also adds the method KuduTable.getKuduColName, which takes a
column name in any case and returns it in the case that it appears
in Kudu. Any code that passes column names into the Kudu API must
call this method first to get the correct column name.

There are four specific situations fixed by this patch:
- When ordering on a Kudu column, the Analyzer would create
  two SlotDescriptors that point to the same column because
  registerSlotRef() was being called with inconsistent casing.
  It is now always called with the lower cased names.
- 'ADD RANGE PARTITION' would fail to find the range partition
  column if it isn't all lower case in Kudu.
- 'ALTER TABLE DROP COLUMN' and 'ALTER TABLE CHANGE' only worked
  if the column name was specified in Kudu case.
- 'CREATE EXTERNAL TABLE' called on a Kudu table with column names
  that differ only in case now returns an error, since Impala has
  no way of handling this situation.

Testing:
- Added e2e tests in test_kudu.py.
- Manually edited functional_kudu to change column names to have
  mixed casing and ran the kudu tests.

Change-Id: I14aba88510012174716691b9946e1c7d54d01b44
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M tests/query_test/test_kudu.py
8 files changed, 130 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/6902/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6902
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14aba88510012174716691b9946e1c7d54d01b44
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5483: Automatically disable codegen for small queries

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-5483: Automatically disable codegen for small queries
..

IMPALA-5483: Automatically disable codegen for small queries

This is similar to the single-node execution optimisation, but applies
to slightly larger queries that should run in a distributed manner but
won't benefit from codegen.

This adds a new query option disable_codegen_rows_threshold that
defaults to 50,000. If fewer than this number of rows are processed
by a plan node per impalad, the cost of codegen almost certainly
outweighs the benefit.

Using rows processed as a threshold is justified by a simple
model that assumes the cost of codegen and execution per row for
the same operation are proportional. E.g. if x is the complexity
of the operation, n is the number of rows processed, C is a
constant factor giving the cost of codegen and Ec/Ei are constant
factor giving the cost of codegen'd and interpreted execution and
d, then the cost of the codegen'd operator is C * x + Ec * x * n
and the cost of the interpreted operator is Ei * x * n. Rearranging
means that interpretation is cheaper if n < C / (Ei - Ec), i.e. that
(at least with the simplified model) it makes sense to choose
interpretation or codegen based on a constant threshold. The
model also implies that it is somewhat safer to choose codegen
because the additional cost of codegen is O(1) but the additional
cost of interpretation is O(n).

I ran some experiments with TPC-H Q1, varying the input table size, to
determine what the cut-over point where codegen was beneficial was.
The cutover was around 150k rows per node for both text and parquet.
At 50k rows per node disabling codegen was very beneficial - around
0.12s versus 0.24s.  To be somewhat conservative I set the default
threshold to 50k rows. On more complex queries, e.g. TPC-H Q10, the
cutover tends to be higher because there are plan nodes that process
many fewer than the max rows.

Fix a couple of minor issues in the frontend - the numNodes_
calculation could return 0 for Kudu, and the single node optimization
didn't handle the case where for a scan node with conjuncts, a limit
and missing stats correctly (it considered the estimate still valid.)

Testing:
Updated e2e tests that set disable_codegen to set
disable_codegen_rows_threshold to 0, so that those tests run both
with and without codegen still.

Added an e2e test to make sure that the optimisation is applied in
the backend.

Perf:
Added a targeted perf test for a join+agg over a small input, which
benefits from this change.

Change-Id: I273bcee58641f5b97de52c0b2caab043c914b32e
---
M be/src/exprs/expr-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
A testdata/workloads/functional-query/queries/QueryTest/codegen.test
A testdata/workloads/targeted-perf/queries/primitive_small_join_1.test
M tests/common/impala_test_suite.py
M tests/common/test_dimensions.py
A tests/query_test/test_codegen.py
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_udfs.py
17 files changed, 198 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/7153/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I273bcee58641f5b97de52c0b2caab043c914b32e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..


Patch Set 3:

(1 comment)

Thanks for finding and fixing this.

http://gerrit.cloudera.org:8080/#/c/7168/3/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

Line 105: result = self.execute_query("explain select * from %s.%s where id 
"
Instead of this end-to-end test, please add a new FE unit test 
ExprSelectivityTest similar to ExprNdvTest. Relying on a full explain seems 
like overkill, and checking the scan cardinality is really an indirect test.


-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has posted comments on this change.

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7168/2//COMMIT_MSG
Commit Message:

PS2, Line 7: NOT IN predicate shares the same selectivity as
   : IN predicate.
> Change to what this commit fixes may be? Something like Fix the selectivity
Done


PS2, Line 18: predicae
> typo
Done


http://gerrit.cloudera.org:8080/#/c/7168/2/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

Line 104: # IN predicate, cardinality should be 7300*(num of children/num 
of distinct values)
> Not totally sure if there is a better place to add these tests, Alex any id
Maybe I should just specify the expected value as a constant?


Line 109: # NOT IN predicate, cardinality should be 7300*(1-(num of 
children/num of distinct values))
> nit: long line.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5494: Fixes the selectivity of NOT IN predicates

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has uploaded a new patch set (#3).

Change subject: IMPALA-5494: Fixes the selectivity of NOT IN predicates
..

IMPALA-5494: Fixes the selectivity of NOT IN predicates

This change modifies the logic of NOT IN predicate so that
the planner can calculate the correct node cardinality. Prior
to this change, both IN and NOT IN predicates shared the
same selectivity, which resulted in the same cardinality
during planning.

The selectivity is calculated by the following heuristic:

selectivity = 1 - (num of predicate children /
num of distinct values)

Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
---
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M tests/metadata/test_explain.py
2 files changed, 18 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/7168/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..


Patch Set 4:

(4 comments)

We're almost done here, nice work!

http://gerrit.cloudera.org:8080/#/c/7110/4/fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
File fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java:

Line 30:  * Coalesce disjunctive equality predicates to an IN predicate,
Coalesces ... IN predicate, and ...


Line 57:* The transformation is applied, if one of the children is an IN 
predicate and the other child
remove first comma


Line 80: } else
else if


http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
File 
fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java:

Line 43: if (!expr.getChild(0).isConstant() || 
expr.getChild(1).isConstant()) {
> Personally, I prefer a flow which is easy to understand, even at the expens
Fair point about the logic being more comprehensible. To make us both happy how 
about you add helper functions for the cases, e.g.:

boolean isExprOpConstant(Expr e) {
  return !e.getChild(0).isConstant() && e.getChild(1).isConstant();
}

boolean isExprOpSlotRef() {
  return expr.getChild(0).unwrapSlotRef(false) == null && 
expr.getChild(1).unwrapSlotRef(false) != null;
}

if (isExprOpSlotRef(expr) || isExprOpConstant()) {
  // Do the trnasofrmation
}


-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sandeep akinapelli 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sandeep akinapelli 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics

2017-06-13 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change.

Change subject: IMPALA-5061: Populate null_count in parquet::statistics
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7058/6/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 553:   // We need to get min stats.
> Where did "to get" go?
Done


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

PS5, Line 197: CK(page_stats_ba
> I think changing this to a DCHECK is preferable, since it always has to be 
Done


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

Line 162:   void Update(const T& min_value, const T& max_value);
> My idea was to keep this line, but change it to
Done


http://gerrit.cloudera.org:8080/#/c/7058/6/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

PS6, Line 60:  w
> nit: double space
Done


PS6, Line 99: appended 
> nit: typo
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics

2017-06-13 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#7).

Change subject: IMPALA-5061: Populate null_count in parquet::statistics
..

IMPALA-5061: Populate null_count in parquet::statistics

The null_count in the statistics field is updated each time a null
value is encountered by parquet table writer. The value is written
to the parquet header if it has one or more null values in the
row_group.

Testing: Modified the existing end-to-end test in the
test_insert_parquet.py file to make sure each parquet header has
the appropriate null_count. Verified the correctness of the nulltable
test and added an additional test which populates a parquet file with
the functional_parquet.zipcode_incomes table and ensures that the
expected null_count is populated.

Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M tests/query_test/test_insert_parquet.py
4 files changed, 134 insertions(+), 119 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/7058/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-5333: [DOCS] Document Impala ADLS support

2017-06-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5333: [DOCS] Document Impala ADLS support
..


Patch Set 1:

(3 comments)

I've read through roughly half of impala_adls.xml and noticed a few things.

http://gerrit.cloudera.org:8080/#/c/7175/1/docs/topics/impala_adls.xml
File docs/topics/impala_adls.xml:

Line 154:   
L154 has trailing white space. It's best to remove trailing white space from 
text that is under revision control, whether it be prose, code, or markup.


PS1, Line 156: As an alternative, specify the credentials in 
environment variables before starting the impalad
 : daemon.
Is it possible to list a mapping between core-site.xml settings above and 
environment variable names?


Line 219:   
This is still empty. :)


-- 
To view, visit http://gerrit.cloudera.org:8080/7175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role is configured

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5495: Improve error message if no impalad role is 
configured
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/727/

-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5499: avoid ephemeral port conflicts

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5499: avoid ephemeral port conflicts
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/statestore/statestore-test.cc
File be/src/statestore/statestore-test.cc:

Line 76:   int subscriber_port = FindUnusedEphemeralPort(&used_ports);
> you could assert that used_ports.back() == subscriber_port here and elsewhe
Is this the right place for it? Seems more like a unit test for 
FindUnusedEphemeralPort() than an invariant of this function. I can add a unit 
test for the function but unsure if it's worth it.


http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/util/network-util.cc
File be/src/util/network-util.cc:

Line 197: }
> Should'nt we increment tries here?
Changed back to a for() loop, which achieves the same thing. I was thinking 
that it was a bit weird to fail randomly because we picked numbers that were 
already used, but I guess it's no different from failing because the port was 
in use for other reasons.


Line 206:   close(sockfd);
> Shouldn't this be above the continue, possibly at the start of the loop? Ot
Done


http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/util/network-util.h
File be/src/util/network-util.h:

Line 70: int FindUnusedEphemeralPort(std::vector* used_ports);
> You could make this default to nullptr.
I hit a compile error that I couldn't get around. Unsure if it's a compiler bug 
or something else but my inclination is to not spend time digging into it.

   error: default argument given for parameter 1 of ‘int 
impala::FindUnusedEphemeralPort(std::vector*)’ [-fpermissive]
 int FindUnusedEphemeralPort(std::vector* used_ports = nullptr);


-- 
To view, visit http://gerrit.cloudera.org:8080/7171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role is configured

2017-06-13 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5495: Improve error message if no impalad role is 
configured
..


Patch Set 2: Code-Review+2

Thanks for fixing this :)

-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5499: avoid ephemeral port conflicts

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-5499: avoid ephemeral port conflicts
..

IMPALA-5499: avoid ephemeral port conflicts

We should not select the same port twice, which could happen because
multiple ports were selected consecutively without actually binding to
any of the ports.

Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
---
M be/src/rpc/thrift-server-test.cc
M be/src/statestore/statestore-test.cc
M be/src/testutil/in-process-servers.cc
M be/src/util/network-util.cc
M be/src/util/network-util.h
5 files changed, 27 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/7171/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role is configured

2017-06-13 Thread Henry Robinson (Code Review)
Hello anujphadke,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7167

to look at the new patch set (#2).

Change subject: IMPALA-5495: Improve error message if no impalad role is 
configured
..

IMPALA-5495: Improve error message if no impalad role is configured

"Impala server needs to have a role (EXECUTOR, COORDINATOR)"

replaced with:

"Impala does not have a valid role configured. Either --is_coordinator or
--is_executor must be set to true."

Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
---
M be/src/service/impala-server.cc
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7167/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role is configured

2017-06-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5495: Improve error message if no impalad role is 
configured
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7167/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5495: Improve error message if no impalad role configured
> role is configured.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5061: Populate null_count in parquet::statistics
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 961:   Status err(Substitute("Corrupt Parquet file '$0': column 
'$1' "
> Sorry about that! I won't do it henceforth.
No worries :)


http://gerrit.cloudera.org:8080/#/c/7058/6/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 553:   // We need min stats.
Where did "to get" go?


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

PS5, Line 197: page_stats_base_
> I am not sure, I noticed Line #688 does a DCHECK. Should I change it to a D
I think changing this to a DCHECK is preferable, since it always has to be set.


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

Line 162:   void Update(const T& min_value, const T& max_value);
> Done
My idea was to keep this line, but change it to

void Update(const T& v) { Update(v, v); }

and remove the implementation from the .cc file.


http://gerrit.cloudera.org:8080/#/c/7058/6/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

PS6, Line 60:   
nit: double space


PS6, Line 99: appeneded
nit: typo


-- 
To view, visit http://gerrit.cloudera.org:8080/7058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5333: [DOCS] Document Impala ADLS support

2017-06-13 Thread John Russell (Code Review)
John Russell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7175

Change subject: IMPALA-5333: [DOCS] Document Impala ADLS support
..

IMPALA-5333: [DOCS] Document Impala ADLS support

Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14
---
M docs/impala.ditamap
M docs/shared/impala_common.xml
A docs/topics/impala_adls.xml
M docs/topics/impala_parquet_file_size.xml
4 files changed, 721 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/7175/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7175
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id5a98217741e5d540d9874e9b30e36f01644ef14
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-13 Thread sandeep akinapelli (Code Review)
sandeep akinapelli has posted comments on this change.

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..


Patch Set 4:

(17 comments)

Thanks for the detailed review. addressed review comments.

http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
File fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java:

Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> Apache license header
Done


Line 15: // specific language governing permissions and limitations
> grammar, I suggest this correction:
Done


Line 29: /**
> inline these calls for brevity
Done


Line 42:   @Override
> Still not very accurate. Sorry for riding on this point, but precise docume
good suggestion. Didnt have to change a lot. thanks


Line 58:* is a compatible IN predicate or equality predicate. Returns the 
transformed expr or null
> combine into L 57
combined the earlier notIn() into single if. Logically the next step is to 
compare the LHS expressions, hence kept it separate.


Line 62: InPredicate inPred = null;
> children -> newInList?
Done


Line 64: if (child0 instanceof InPredicate) {
> style: we always use {} for if/else-if, except if the whole if+then fits in
added explicit {} for if-else


Line 75: // other predicate can be OR predicate or IN predicate
> Please follow suggestions above on making the comment more accurate, in par
Done


Line 79:   newInList.add(otherPred.getChild(1));
> single line
Done


Line 84:   } else {
> Assignment+cast not needed
agreed. Done


http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
File 
fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java:

Line 29:  * are also normalized so that  is always on the right hand 
side.
> Adjust comment to reflect new behavior, also fix indentation while you are 
Done


Line 34:  * 5 = id + 2 -> id + 2 = 5
> Add an additional example where the rhs is not a simple slotref
Done


Line 42: if (!(expr instanceof BinaryPredicate)) return expr;
> No need for this comment, the class comment and examples should cover this
removed.


Line 43: if (!expr.getChild(0).isConstant() || 
expr.getChild(1).isConstant()) {
> space after "if", rework logic to not have an empty then block
Personally, I prefer a flow which is easy to understand, even at the expense of 
having an empty then block. I changed the logic to a non empty if block. let me 
know, and I can keep the version that is preferred.


http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 360: RewritesOk("5 + 3 = id", rule, "id = 5 + 3");
> long line
Done


Line 415: RewritesOk("int_col = 1 or int_col in (2, 3)", edToInrule,
> add negative test cases:
Done


Line 422: RewritesOk("int_col = 1 or int_col = int_col ", edToInrule, null);
> Please be consistent with regards to capitalization of keywords in your tes
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sandeep akinapelli 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sandeep akinapelli 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-13 Thread sandeep akinapelli (Code Review)
sandeep akinapelli has uploaded a new patch set (#4).

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..

IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

This change introduces a new rule to merge disjunct equality
predicates into an IN predicate. As with every rule being applied
bottom up, the rule merges the leaf OR predicates into an in predicate
and subsequently merges the OR predicate to the existing IN predicate
Patch also include changes in response to alex's review comments to
normalize the binary predicates and testcases for the same.

Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
A fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
5 files changed, 195 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/7110/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sandeep akinapelli 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sandeep akinapelli 


[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics

2017-06-13 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change.

Change subject: IMPALA-5061: Populate null_count in parquet::statistics
..


Patch Set 6:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 961:   Status err(Substitute("Corrupt Parquet file '$0': column 
'$1' "
> If possible, please don't rebase changes while they are under review. Other
Sorry about that! I won't do it henceforth.


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

PS5, Line 197: page_stats_base_
> can this ever be nullptr?
I am not sure, I noticed Line #688 does a DCHECK. Should I change it to a 
DCHECK or entirely remove this line?


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

Line 58:   DCHECK(false) << "Unsupported statistics field requested";
> undo
Done


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

Line 60:   /// Enum to select  whether to read minimum or maximum statistics. 
Values do not
> nit: Please undo this change, too.
Done


Line 72:   static bool ReadFromThrift(const parquet::ColumnChunk& col_chunk,
> Undo
Done


Line 100:   void IncrementNullCount(int64_t count) { null_count_ += count; }
> Update sounds like it will set it to a new value. Maybe "AddToNullCount" or
Done. Changed it to IncrementNullCount(int64_t count)


Line 162:   void Update(const T& min_value, const T& max_value);
> You can remove the implementation of Update(const T& v) and instead put a s
Done


Line 163: 
> Please add a comment to this method, too.
Done


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.inline.h
File be/src/exec/parquet-column-stats.inline.h:

Line 61: std::string min_str;
> nit: This could go on a single line now
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics

2017-06-13 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#6).

Change subject: IMPALA-5061: Populate null_count in parquet::statistics
..

IMPALA-5061: Populate null_count in parquet::statistics

The null_count in the statistics field is updated each time a null
value is encountered by parquet table writer. The value is written
to the parquet header if it has one or more null values in the
row_group.

Testing: Modified the existing end-to-end test in the
test_insert_parquet.py file to make sure each parquet header has
the appropriate null_count. Verified the correctness of the nulltable
test and added an additional test which populates a parquet file with
the functional_parquet.zipcode_incomes table and ensures that the
expected null_count is populated.

Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M tests/query_test/test_insert_parquet.py
5 files changed, 135 insertions(+), 124 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/7058/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


IMPALA-5263: test infra: support CA bundles with secure clusters

This patch adds the command line option --ca_cert to the common test
infra CLI options for use alongside --use-ssl. This is useful when
testing against a secured Impala cluster in which the SSL certs are
self-signed. This will allow the SSL request to be validated. Using this
option will also suppress noisy console warnings like:

  InsecureRequestWarning: Unverified HTTPS request is being made. Adding
  certificate verification is strongly advised. See:
  https://urllib3.readthedocs.org/en/latest/security.html

We also go further in this patch and use the warnings module to print
these SSL-related warnings once and only once, instead of all over the
place. In the case of the stress test, this greatly reduces the noise in
the console log.

Testing:
- quick concurrent_select.py calls with and without --ca_cert to observe
  that connections still get made and the test runs smoothly. Some of
  this testing occurred without warning suppression, so that I could be
  sure the InsecureRequestWarnings were not occurring when using
  --ca_cert anymore.
- ensured warnings are printed once, not multiple times

Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Reviewed-on: http://gerrit.cloudera.org:8080/7152
Reviewed-by: David Knupp 
Tested-by: Impala Public Jenkins
---
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/db_connection.py
3 files changed, 56 insertions(+), 11 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  David Knupp: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5499: avoid ephemeral port conflicts

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5499: avoid ephemeral port conflicts
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/statestore/statestore-test.cc
File be/src/statestore/statestore-test.cc:

Line 76:   int subscriber_port = FindUnusedEphemeralPort(&used_ports);
you could assert that used_ports.back() == subscriber_port here and elsewhere. 
Or more generally that it is contained in the list if you don't want to assume 
a particular order.


http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/util/network-util.cc
File be/src/util/network-util.cc:

Line 206: ++tries;
Shouldn't this be above the continue, possibly at the start of the loop? 
Otherwise tries that end up picking a port in used_ports will not be counted 
correctly.


http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/util/network-util.h
File be/src/util/network-util.h:

Line 70: int FindUnusedEphemeralPort(std::vector* used_ports);
You could make this default to nullptr.


-- 
To view, visit http://gerrit.cloudera.org:8080/7171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5499: avoid ephemeral port conflicts

2017-06-13 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-5499: avoid ephemeral port conflicts
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7171/1/be/src/util/network-util.cc
File be/src/util/network-util.cc:

Line 197:   continue;
Should'nt we increment tries here?


-- 
To view, visit http://gerrit.cloudera.org:8080/7171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#6).

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..

IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

There was a race between ClientRequestState::UpdateQueryStatus() and the
beeswax get_state()/get_log() RPCs leading to the rare situation that a
query would abort with an error, but the error message would be empty.

The fix is to take the ClientRequestState lock in the beeswax RPCs
before obtaining the status.

To test this I ran test_corrupt_files in a loop for a day. Without this
fix, it would usually fail within a few hours.

I changed the test to allow running it in parallel like so:

@pytest.mark.parametrize('multiplier', xrange(32))
def test_corrupt_files(self, vector, multiplier):

Then I ran it in a loop like so:

i=0; while [ $? -eq 0 ]; do ((++i)); echo "Run: $i"; impala-py.test \
tests/query_test/test_scanners.py::TestParquet::test_corrupt_files \
--exploration_strategy=exhaustive -n8; done

Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
---
M be/src/service/impala-beeswax-server.cc
1 file changed, 13 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7155/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] WIP Test Modification

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7173

Change subject: WIP Test Modification
..

WIP Test Modification

Change-Id: I6b1bd6bb7febbaa8a76e34581ad4b6c849a96620
---
M be/src/service/impala-beeswax-server.cc
M tests/query_test/test_scanners.py
2 files changed, 4 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/7173/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b1bd6bb7febbaa8a76e34581ad4b6c849a96620
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 5: -Verified

(2 comments)

It looks like I ran the tests against the wrong binary. On my dev machine I 
maintain different worktrees and switching between them did not update the 
binary path. Added the DCHECK you suggested nonetheless, but I my failures were 
false alarms. I'm looping the test with correct binaries now.

http://gerrit.cloudera.org:8080/#/c/7155/5/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

Line 253:   lock_guard l(*request_state->lock());
> you could add a DCHECK() here that if query_state == EXCEPTION, then !query
Done


Line 299: lock_guard l(*request_state->lock());
> rather than retaking the lock here, you should remember the value of ok() f
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP Test Modification

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has abandoned this change.

Change subject: WIP Test Modification
..


Abandoned

Pushed by mistake

-- 
To view, visit http://gerrit.cloudera.org:8080/7173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I6b1bd6bb7febbaa8a76e34581ad4b6c849a96620
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7155/5/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

Line 253:   lock_guard l(*request_state->lock());
you could add a DCHECK() here that if query_state == EXCEPTION, then 
!query_status.ok(). That should help track down whether the bug is something 
different than originally thought.


Line 299: lock_guard l(*request_state->lock());
rather than retaking the lock here, you should remember the value of ok() from 
above.  actually, though, since we print GetAnalysisWarnings() in between, I 
don't see why this line break logic is correct.


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5499: avoid ephemeral port conflicts

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7171

Change subject: IMPALA-5499: avoid ephemeral port conflicts
..

IMPALA-5499: avoid ephemeral port conflicts

We should not select the same port twice, which could happen because
multiple ports were selected consecutively without actually binding to
any of the ports.

Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
---
M be/src/rpc/thrift-server-test.cc
M be/src/statestore/statestore-test.cc
M be/src/testutil/in-process-servers.cc
M be/src/util/network-util.cc
M be/src/util/network-util.h
5 files changed, 29 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/7171/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id02c83e6f946a14b83f5e6561957d7ad81442835
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..


Patch Set 3:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
File fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java:

Line 1: package org.apache.impala.rewrite;
Apache license header


Line 15:  * merges the compatible equality or IN predicate into existing IN 
predicate.
grammar, I suggest this correction:

merges compatible equality or IN predicates into an existing IN predicate


Line 29: Expr child0 = expr.getChild(0);
inline these calls for brevity


Line 42:* Takes in two predicates, one is of type Expr IN (1, 2) and the 
other is of
Still not very accurate. Sorry for riding on this point, but precise 
documentation is important. Here are a few points that could be improved:
* The inputs can be any boolean expression, but the comment makes it sound like 
the caller must ensure that one of them is an IN predicate and the other is an 
IN predicate or equality predicate.
* Do not use examples to describe the function semantics. It's fine to use 
examples to support the description of this function, but the description 
itself should not be based on examples.


Consider this alternative formulation:

Takes the children of an OR predicate and attempts to combine them into a 
single IN predicate. The transformation is applied if one of the children is an 
IN predicate and the other child is a compatible IN predicate or equality 
predicate. Returns the transformed expr or null if no transformation was 
possible.


Line 58: if (inPred.isNotIn()) return null;
combine into L 57


Line 62: List children = Lists.newArrayList(
children -> newInList?


Line 64: if (Expr.IS_EXPR_EQ_LITERAL_PREDICATE.apply(otherPred))
style: we always use {} for if/else-if, except if the whole if+then fits into a 
single line


Line 75:* Takes in two predicates of type 'Expr = literal' and returns the 
rewritten IN predicate.
Please follow suggestions above on making the comment more accurate, in 
particular, this function does not require child0/1 to be of the form  = 



Line 79: if (!Expr.IS_EXPR_EQ_LITERAL_PREDICATE.apply(child0))
single line


Line 84: BinaryPredicate bp0 = (BinaryPredicate) child0;
Assignment+cast not needed


http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
File 
fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java:

Line 29:* Normalizes binary predicates of the formso 
that the slot is
Adjust comment to reflect new behavior, also fix indentation while you are here.


Line 34:  * 5 > id -> id < 5
Add an additional example where the rhs is not a simple slotref


Line 42: // always rewrite if LHS is literal and RHS is not a literal
No need for this comment, the class comment and examples should cover this


Line 43: if(expr.getChild(0).isLiteral() && !expr.getChild(1).isLiteral()) 
{}
space after "if", rework logic to not have an empty then block

for this rule we should check isConstant() instead of isLiteral() because this 
rule is required even with enable_expr_rewrites=false (which disables constant 
folding)


http://gerrit.cloudera.org:8080/#/c/7110/3/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 360: RewritesOk("tinyint_col + smallint_col = int_col", rule, "int_col 
= tinyint_col + smallint_col");
long line


Line 415: RewritesOk("int_col IN (1, 2) or int_col IN (3, 4)", edToInrule,
add negative test cases:
* lhs is NOT IN
* rhs is NOT IN
* lhs and rhs are NOT IN


Line 422: RewritesOk("int_col IN (1, 2) or int_col = int_col + 3 ", 
edToInrule, null);
Please be consistent with regards to capitalization of keywords in your test 
cases. Personally, I prefer to use lower caps in test cases so they can be 
quickly distinguished from the expected result (which is in caps).


-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sandeep akinapelli 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sandeep akinapelli 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC

2017-06-13 Thread Henry Robinson (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5720

to look at the new patch set (#13).

Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore 
services to KRPC
..

IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC

This patch adds the core abstractions necessary for using Kudu RPC for
Impala's backend services. The StatestoreService and
StatestoreSubscriberService are both ported to KRPC to demonstrate how
the new RPC framework operates.

The main new class is RpcMgr, which is responsible for both services run
by a daemon and for obtaining clients to remote services. The new Rpc
class helps clients build rpc invocation objects, and use them to call
remote methods.

Also done:
 * Utility code to convert from a Thrift structure to a Protobuf
   wrapper. Thrift RPCs do not need to be ported to Protobuf to work.
 * Added more build support for Protobuf generation. All library targets
   now depend on the Protobuf generation step, just as they do Thrift.
 * thrift-server-test is rewritten to use Beeswax as a test service,
   since the old StatestoreService has been removed.
 * Remove InProcessStatestore that was only used for statestore-test and
   mini-impala-cluster. As far as we know, mini-impala-cluster is unused
   by anyone. (IMPALA-4784)

TESTING:
 * Impala's test suite passes.
 * statestore-test has been rewritten to use new statestore code. It
   includes all test cases from the now-removed Python test_statestore.py
 * rpc-mgr-test unit-tests the RpcMgr and Rpc classes.

TODO:
 * Enable SSL and Kerberos support (already in Kudu's imported
   libraries)

Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766
---
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/catalog/CMakeLists.txt
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/codegen/CMakeLists.txt
M be/src/common/CMakeLists.txt
M be/src/exec/CMakeLists.txt
M be/src/exec/catalog-op-executor.cc
M be/src/exec/kudu-util.h
M be/src/experiments/CMakeLists.txt
M be/src/exprs/CMakeLists.txt
M be/src/rpc/CMakeLists.txt
A be/src/rpc/common.proto
A be/src/rpc/rpc-mgr-test.cc
A be/src/rpc/rpc-mgr.cc
A be/src/rpc/rpc-mgr.h
A be/src/rpc/rpc-mgr.inline.h
A be/src/rpc/rpc.h
A be/src/rpc/rpc_test.proto
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-util.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/bufferpool/CMakeLists.txt
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/scheduling/CMakeLists.txt
M be/src/service/CMakeLists.txt
M be/src/service/client-request-state.cc
M be/src/statestore/CMakeLists.txt
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore-test.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
A be/src/statestore/statestore.proto
M be/src/statestore/statestored-main.cc
M be/src/testutil/CMakeLists.txt
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
D be/src/testutil/mini-impala-cluster.cc
M be/src/transport/CMakeLists.txt
M be/src/udf/CMakeLists.txt
M be/src/udf_samples/CMakeLists.txt
M be/src/util/CMakeLists.txt
M be/src/util/collection-metrics.h
M be/src/util/counting-barrier.h
M bin/start-impala-cluster.py
M bin/start-impalad.sh
M common/thrift/CMakeLists.txt
M common/thrift/StatestoreService.thrift
D tests/statestore/test_statestore.py
51 files changed, 1,744 insertions(+), 1,091 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/5720/13
-- 
To view, visit http://gerrit.cloudera.org:8080/5720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC

2017-06-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore 
services to KRPC
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5720/11/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

Line 112:   /// 'service_ptr' contains an interface implementation that will 
handle RPCs.
> May help to comment that 'service_ptr' releases reference to the underlying
I commented that RegisterService() assumes ownership of the underlying object. 
Let me know if that's not quite what you meant.


-- 
To view, visit http://gerrit.cloudera.org:8080/5720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-13 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-4856: Port data stream service to KRPC
..

IMPALA-4856: Port data stream service to KRPC

This patch ports the data-flow parts of ImpalaInternalService to KRPC.

* ImpalaInternalService is split into two services. The first,
  ImpalaInternalService, deals with control messages for plan fragment
  instance execution, cancellation and reporting, and remains
  implemented in Thrift for now. The second, DataStreamService, handles
  large-payload RPCs for transmitting runtime filters and row batches
  between hosts.

* In the DataStreamService, all RPCs use 'native' protobuf. The
  DataStreamService starts on the port previously reserved for the
  StatestoreSubscriberService (which is also a KRPC service), to avoid
  having to configure another port when starting Impala. When the
  ImpalaInternalService is ported to KRPC, all services will run on one
  port.

* To support needing to address two different backend services, a data
  service port has been added to TBackendDescriptor.

* This patch adds support for asynchronous RPCs to the RpcMgr and Rpc
  classes. Previously, Impala used fixed size thread pools + synchronous
  RPCs to achieve some parallelism for 'broadcast' RPCs like filter
  propagation, or a dedicated per-sender+receiver pair thread on the
  sender side in the DataStreamSender case. In this patch, the
  PublishFilter() and TransmitData() RPCs are sent asynchronously using
  KRPC's thread pools.

* The TransmitData() protocol has changed to adapt to asynchronous
  RPCs. The full details are in data-stream-mgr.h.

* As a result, DataStreamSender no longer creates a
  thread-per-connection on the sender side.

* Both tuple transmission and runtime filter publication use sidecars to
  minimise the number of copies and serialization steps required.

* Also include a fix for KUDU-2011 that properly allows sidecars to be
  shared between KRPC and the RPC caller (fixing IMPALA-5093, a
  corruption bug).

* A large portion of this patch is the replacement of TRowBatch with its
  Protobuf equivalent, RowBatchPb. The replacement is a literal port of
  the data structure, and row-batch-test, row-batch-list-test and
  row-batch-serialize-benchmark continue to execute without major logic
  changes.

* Simplify FindRecvr() logic in DataStreamManager. No-longer need to
  handle blocking sender-side, so no need for complex promise-based
  machinery. Instead, all senders with no receiver are added to a
  per-receiver list, which is processed when the receiver arrives. If it
  does not arrive promptly, the DataStreamManager cleans them up after
  FLAGS_datastream_sender_timeout_ms.

* This patch also begins a clean-up of how ImpalaServer instances are
  created (by removing CreateImpalaServer), and clarifying the
  relationship between ExecEnv and ImpalaServer. ImpalaServer now
  follows the standard construct->Init()->Start()->Join() lifecycle that
  we use for other services.

* Ensure that all addresses used for KRPCs are fully resolved, avoiding
  the need to resolve them for each RPC.

TESTING
---

* New tests added to rpc-mgr-test.

TO DO
-

* Re-enable throughput and latency measurements per data-stream sender
  when that information is exposed from KRPC (KUDU-1738).

* TLS and Kerberos are still not supported by KRPC in this patch.

Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b
---
M .clang-format
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/common/init.cc
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exprs/expr-test.cc
M be/src/kudu/rpc/rpc_sidecar.cc
M be/src/kudu/rpc/rpc_sidecar.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/common.proto
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/rpc/rpc.h
D be/src/runtime/backend-client.h
M be/src/runtime/client-cache-types.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h

[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7103/2/be/src/rpc/rpc.h
File be/src/rpc/rpc.h:

PS2, Line 126: 'cb'
> Can you please state the context in which 'cb' is called from (e.g. reactor
Done


PS2, Line 133: aattempted
> typo
Done


PS2, Line 230: //
> ///
Done


PS2, Line 319: Retries
> Is there any way to write a be-test to exercise the retry path ?
Yep - see rpc-mgr-test.cc, RetryAsyncTest. That injects ERROR_SERVER_TOO_BUSY 
into an RPC response which triggers the retry logic.


PS2, Line 322: auto cb_wrapper = [params = std::move(params), mgr, func, 
req, resp,
 : cb = std::move(cb), controller_ptr = 
controller.release(), num_attempts]()
 : mutable {
> An alternative to this lambda implementation would be to define a separate 
It would, yeah. My preference is for using a lambda here partly because it's 
very clear about how the arguments are copied, and how ownership is managed (I 
find the copying behaviour of bind() a bit inscrutable).


PS2, Line 332: cb(Status::OK(), req, resp, controller_ptr);
> Re-reading the comments above, status seems to indicate whether status was 
Right - I was in two minds about using different statuses, or merging together 
the one from the RpcController and the one that we must provide somehow. I 
think this is the simplest way to pass both statuses, but let me know if you 
have an idea! I added a comment for now.


Line 337:   kudu::MonoDelta retry_interval = 
kudu::MonoDelta::FromMilliseconds(params->retry_interval_ms);
> long line
Done


http://gerrit.cloudera.org:8080/#/c/7103/2/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

PS2, Line 203:   auto cb = [self_ptr = 
weak_ptr(self_),
 :   instance_id = fragment_instance_id_, proto_batch = batch]
 :   (const Status& status, TransmitDataRequestPb* request,
 :   TransmitDataResponsePb* response, RpcController* 
controller) {
 : 
 : // Ensure that request and response get deleted when this 
callback returns.
 : auto request_container = 
unique_ptr(request);
 : auto response_container = 
unique_ptr(response);
 : 
 : // Check if this channel still exists.
 : auto channel = self_ptr.lock();
 : if (!channel) return;
 : {
 :   lock_guard l(channel->lock_);
 :   Status rpc_status = status.ok() ? 
FromKuduStatus(controller->status()) : status;
 : 
 :   int32_t status_code = response->status().status_code();
 :   channel->recvr_gone_ = status_code == 
TErrorCode::DATASTREAM_RECVR_ALREADY_GONE;
 : 
 :   if (!rpc_status.ok()) {
 : channel->last_rpc_status_ = rpc_status;
 :   } else if (!channel->recvr_gone_) {
 : if (status_code != TErrorCode::OK) {
 :   // Don't bubble up the 'receiver gone' status, because 
it's not an error.
 :   channel->last_rpc_status_ = Status(response->status());
 : } else {
 :   int size = proto_batch->GetSize();
 :   channel->num_data_bytes_sent_.Add(size);
 :   VLOG_ROW << "incremented #data_bytes_sent="
 :<< channel->num_data_bytes_sent_.Load();
 : }
 :   }
 :   channel->rpc_in_flight_ = false;
 : }
 : channel->rpc_done_cv_.notify_one();
 :   };
> I am no C++ expert so this question may be stupid: can we not write this as
We could write this as a method, and use bind(), or we could create a struct 
with one method (this one) that captures the context upon construction. The 
latter is what a lambda compiles down to, and I prefer the syntax sugar a 
lambda gives you. The former uses bind(), which I am not a great fan of. 

In my experience, gdb handles this just fine, and the stack is IMHO cleaner 
than using bind() (I've broken in this method lots of times!).


-- 
To view, visit http://gerrit.cloudera.org:8080/7103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-06-13 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6812/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 445:   *dst_slot = file_metadata_.row_groups[row_group_idx_].num_rows;
Bounds check against file_metadata_.num_rows (i.e. keep a running counter as 
below in row_group_rows_read_ and DCHECK_LE(row_group_rows_read_, 
file_metatdata_.num_rows)?


Line 454:   if (scan_node_->IsZeroSlotTableScan()) {
Why is this optimization not redundant now?  Maybe update the comment to 
indicate the type of query this now will operate on:

  e.g., select 1 from table


http://gerrit.cloudera.org:8080/#/c/6812/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

Line 226:   11: optional i64 parquet_count_star_slot_offset
> i32 right?
Would it be simpler to have this be one parameter and indicate truth by passing 
a value >= 0?


-- 
To view, visit http://gerrit.cloudera.org:8080/6812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5061: Populate null count in parquet::statistics

2017-06-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5061: Populate null_count in parquet::statistics
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 961:   Status err(Substitute("Corrupt Parquet file '$0': column 
'$1' "
If possible, please don't rebase changes while they are under review. Otherwise 
the rebase can clutter the view when comparing recent change sets.


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

PS5, Line 197: page_stats_base_
can this ever be nullptr?


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.cc
File be/src/exec/parquet-column-stats.cc:

Line 58:   DCHECK(false) << "Unsupported statistics value field requested";
undo


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

Line 60:   /// Enum to select the statistics field to be read. Values do not
nit: Please undo this change, too.


Line 72:   static bool ReadValueFromThrift(const parquet::ColumnChunk& 
col_chunk,
Undo


Line 100:   void UpdateNullCount(int64_t count = 1) { null_count_ += count; }
Update sounds like it will set it to a new value. Maybe "AddToNullCount" or 
"IncrementNullCount" would be better names. When calling this, 
IncrementNullCount(1) may be more explicit and thus more clear. Then you could 
drop the default value.


Line 162:   void Update(const T& v);
You can remove the implementation of Update(const T& v) and instead put a 
single line "Update(v, v);" here.


Line 163:   void Update(const T& min_value, const T& max_value);
Please add a comment to this method, too.


http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.inline.h
File be/src/exec/parquet-column-stats.inline.h:

Line 61: Update(cs->min_value_, cs->max_value_);
nit: This could go on a single line now


-- 
To view, visit http://gerrit.cloudera.org:8080/7058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5494: NOT IN predicate shares the same selectivity as IN predicate.

2017-06-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5494: NOT IN predicate shares the same selectivity as IN 
predicate.
..


Patch Set 2:

(4 comments)

Patch looks good to me overall, I can +1 it once you fix the nits.

http://gerrit.cloudera.org:8080/#/c/7168/2//COMMIT_MSG
Commit Message:

PS2, Line 7: NOT IN predicate shares the same selectivity as
   : IN predicate.
Change to what this commit fixes may be? Something like Fix the selectivity of 
NOT IN  Easier to read.


PS2, Line 18: predicae
typo


http://gerrit.cloudera.org:8080/#/c/7168/2/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

Line 104: # IN predicate, cardinality should be 7300*(num of children/num 
of distinct values)
Not totally sure if there is a better place to add these tests, Alex any idea?


Line 109: # NOT IN predicate, cardinality should be 7300*(1-(num of 
children/num of distinct values))
nit: long line.


-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7152/2/tests/comparison/cli_options.py
File tests/comparison/cli_options.py:

Line 205: 'once',
> "once"
David contacted me offline to say this was just a comment, nothing to fix.


-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/726/

-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[native-toolchain-CR] Support gcc 6.3.0 and 7.1.0

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged.

Change subject: Support gcc 6.3.0 and 7.1.0
..


Support gcc 6.3.0 and 7.1.0

The default still remains gcc 4.9.2 but this lets us play around
with different GCC versions. The toolchain builds successfully
with either version, e.g.:

  GCC_VERSION=7.1.0 ./buildall.sh

Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
---
M buildall.sh
M source/gcc/build.sh
A 
source/thrift/thrift-0.9.0-patches/0009-THRIFT-2201-Ternary-operator-returns-different-types.patch
3 files changed, 69 insertions(+), 8 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Tim Armstrong: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7156
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Support gcc 6.3.0 and 7.1.0

2017-06-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Support gcc 6.3.0 and 7.1.0
..


Patch Set 2: Verified+1

Ran a successful test build

-- 
To view, visit http://gerrit.cloudera.org:8080/7156
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5494: NOT IN predicate shares the same selectivity as IN predicate.

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has posted comments on this change.

Change subject: IMPALA-5494: NOT IN predicate shares the same selectivity as IN 
predicate.
..


Patch Set 2:

removed extraneous space characters.

-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Vincent Tran 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5494: NOT IN predicate shares the same selectivity as IN predicate.

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has uploaded a new patch set (#2).

Change subject: IMPALA-5494: NOT IN predicate shares the same selectivity as IN 
predicate.
..

IMPALA-5494: NOT IN predicate shares the same selectivity as
IN predicate.

This change modifies the logic of NOT IN predicate so that
the planner can calculate the correct node cardinality. Prior
to this change, both IN and NOT IN predicates shared the
same selectivity, which resulted in the same cardinality
during planning.

The selectivity is calculated by the following heuristic:

selectivity = 1 - (num of predicae children /
num of distinct values)

Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
---
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M tests/metadata/test_explain.py
2 files changed, 17 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/7168/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 


[Impala-ASF-CR] IMPALA-5494: NOT IN predicate shares the same selectivity as IN predicate.

2017-06-13 Thread Vincent Tran (Code Review)
Vincent Tran has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7168

Change subject: IMPALA-5494: NOT IN predicate shares the same selectivity as IN 
predicate.
..

IMPALA-5494: NOT IN predicate shares the same selectivity as
IN predicate.

This change modifies the logic of NOT IN predicate so that
the planner can calculate the correct node cardinality. Prior
to this change, both IN and NOT IN predicates shared the
same selectivity, which resulted in the same cardinality
during planning.

The selectivity is calculated by the following heuristic:

selectivity = 1 - (num of predicae children /
num of distinct values)

Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
---
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M tests/metadata/test_explain.py
2 files changed, 17 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/7168/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I69e6217257b5618cb63e13b32ba3347fa0483b63
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Vincent Tran 


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7103/2/be/src/rpc/rpc.h
File be/src/rpc/rpc.h:

PS2, Line 332: cb(Status::OK(), req, resp, controller_ptr);
> Why do we pass Status::OK() for non-retryable error or after exceeding the 
Re-reading the comments above, status seems to indicate whether status was 
successfully attempted so it may be okay. The assumption is that cb will check 
for remote error from controller_ptr.


-- 
To view, visit http://gerrit.cloudera.org:8080/7103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7103/2/be/src/rpc/rpc.h
File be/src/rpc/rpc.h:

PS2, Line 126: 'cb'
Can you please state the context in which 'cb' is called from (e.g. reactor 
thread) and also add the precaution caller should take when implementing 'cb' 
(e.g. no blocking etc) ?


PS2, Line 133: aattempted
typo


PS2, Line 230: //
///

Same below.


PS2, Line 319: Retries
Is there any way to write a be-test to exercise the retry path ?


PS2, Line 322: auto cb_wrapper = [params = std::move(params), mgr, func, 
req, resp,
 : cb = std::move(cb), controller_ptr = 
controller.release(), num_attempts]()
 : mutable {
An alternative to this lambda implementation would be to define a separate 
function and uses boost::bind() to stash the arguments, right ?


PS2, Line 332: cb(Status::OK(), req, resp, controller_ptr);
Why do we pass Status::OK() for non-retryable error or after exceeding the 
maximum number of retries ?


Line 337:   kudu::MonoDelta retry_interval = 
kudu::MonoDelta::FromMilliseconds(params->retry_interval_ms);
long line


http://gerrit.cloudera.org:8080/#/c/7103/1/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

PS1, Line 265: 10
Mind commenting above what 10 stands for ?


PS1, Line 266: numeric_limits::max()
Why is this not FLAGS_datastream_sender_timeout_ms ?


http://gerrit.cloudera.org:8080/#/c/7103/2/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

PS2, Line 203:   auto cb = [self_ptr = 
weak_ptr(self_),
 :   instance_id = fragment_instance_id_, proto_batch = batch]
 :   (const Status& status, TransmitDataRequestPb* request,
 :   TransmitDataResponsePb* response, RpcController* 
controller) {
 : 
 : // Ensure that request and response get deleted when this 
callback returns.
 : auto request_container = 
unique_ptr(request);
 : auto response_container = 
unique_ptr(response);
 : 
 : // Check if this channel still exists.
 : auto channel = self_ptr.lock();
 : if (!channel) return;
 : {
 :   lock_guard l(channel->lock_);
 :   Status rpc_status = status.ok() ? 
FromKuduStatus(controller->status()) : status;
 : 
 :   int32_t status_code = response->status().status_code();
 :   channel->recvr_gone_ = status_code == 
TErrorCode::DATASTREAM_RECVR_ALREADY_GONE;
 : 
 :   if (!rpc_status.ok()) {
 : channel->last_rpc_status_ = rpc_status;
 :   } else if (!channel->recvr_gone_) {
 : if (status_code != TErrorCode::OK) {
 :   // Don't bubble up the 'receiver gone' status, because 
it's not an error.
 :   channel->last_rpc_status_ = Status(response->status());
 : } else {
 :   int size = proto_batch->GetSize();
 :   channel->num_data_bytes_sent_.Add(size);
 :   VLOG_ROW << "incremented #data_bytes_sent="
 :<< channel->num_data_bytes_sent_.Load();
 : }
 :   }
 :   channel->rpc_in_flight_ = false;
 : }
 : channel->rpc_done_cv_.notify_one();
 :   };
I am no C++ expert so this question may be stupid: can we not write this as a 
lambda function ? I am not sure how well gdb can handle lambda functions when 
compiled with optimization and this callback seems important enough that one 
may want to inspect its states in a core dump if necessary.


http://gerrit.cloudera.org:8080/#/c/7103/1/be/src/service/impala-internal-service.cc
File be/src/service/impala-internal-service.cc:

PS1, Line 63: DataStreamService
nit: Just wondering why we didn't put DataStreamService in 
data-stream-service.cc ?


-- 
To view, visit http://gerrit.cloudera.org:8080/7103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role configured

2017-06-13 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-5495: Improve error message if no impalad role configured
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7167/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5495: Improve error message if no impalad role configured
role is configured.


-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes