[Impala-ASF-CR] IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has submitted this change and it was merged.

Change subject: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
..


IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

Occasionally we'd see HBase fail to startup properly on CentOS 7
clusters. The symptom was that HBase would not open the required nodes
in zookeeper, signaling its readiness.

As a workaround, this change includes waiting for the Zookeeper nodes
into the retry logic.

Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Reviewed-on: http://gerrit.cloudera.org:8080/7159
Reviewed-by: Michael Brown 
Reviewed-by: anujphadke 
Reviewed-by: David Knupp 
Tested-by: Lars Volker 
---
M testdata/bin/run-hbase.sh
1 file changed, 11 insertions(+), 4 deletions(-)

Approvals:
  David Knupp: Looks good to me, approved
  Michael Brown: Looks good to me, approved
  Lars Volker: Verified
  anujphadke: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/7159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5495: Improve error message if no impalad role configured

2017-06-12 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7167

Change subject: IMPALA-5495: Improve error message if no impalad role configured
..

IMPALA-5495: Improve error message if no impalad role configured

"Impala server needs to have a role (EXECUTOR, COORDINATOR)"

replaced with:

"Impala does not have a valid role configured. Either --is_coordinator or
--is_executor must be set to true."

Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
---
M be/src/service/impala-server.cc
1 file changed, 2 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7167/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib8ae09eeaefeda3fab62e66dbcb4f9134082c039
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-5446: dropped Sorter::Reset() status

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5446: dropped Sorter::Reset() status
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5446: dropped Sorter::Reset() status

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5446: dropped Sorter::Reset() status
..


IMPALA-5446: dropped Sorter::Reset() status

This patch aligns the sorter's methods closer with the ExecNode methods
and moves the possibly-failing parts of Reset() into Open().

Testing:
Added WARN_UNUSED_RESULT to all the sorter methods that return Status to
prevent similar issues in future.

Add a test that sometimes goes down this code path. It was able to cause
a crash at least once every 5 executions.

Ran an exhaustive build to make sure there were no other regressions.

Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Reviewed-on: http://gerrit.cloudera.org:8080/7134
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/sort-node.cc
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
4 files changed, 43 insertions(+), 21 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 5: Verified-1

Patchset 5 takes about an hour or so to repro the error - it does not fix the 
issue. I added more lock_guards, but that did not help either.

-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[native-toolchain-CR] Support gcc 6.3.0 and 7.1.0

2017-06-12 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Support gcc 6.3.0 and 7.1.0
..


Patch Set 2: Code-Review+2

That Thrift bug was some ugly code :/

-- 
To view, visit http://gerrit.cloudera.org:8080/7156
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-12 Thread Anonymous Coward (Code Review)
sakinape...@cloudera.com has posted comments on this change.

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..


Patch Set 3:

(17 comments)

additional tests and changes to normalize binary predicate rule

http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 180:   public final static com.google.common.base.Predicate 
IS_EXPR_EQ_LITERAL_PREDICATE =
> long line
Done


http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java
File 
fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java:

Line 69
> I think this particular case is easy enough to fix in this JIRA. It's a sim
Adding code in the rule class.


http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
File fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java:

Line 12: 
> Suggest modifications for brevity and grammar:
Done


Line 16:  * Examples:
> add an example where an equality predicate is merged into an IN predicate
Done


Line 32: Expr inAndOtherExpr = rewriteInAndOtherExpr(child0, child1);
> What about IN OR IN? That can be produced by (a = 1 or a = 2) or (a = 3 or 
Done. Added code to merge two IN predicates.


Line 38: return expr;
> This comment should describe the inputs and outputs more clearly, in partic
Done


Line 52: }
> extra space before "return"
Done


Line 54:   inPred = (InPredicate) child1;
> can get rid of these empty lines imo
Done


Line 55:   otherPred = child0;
> Our rewriting framework requires that we return a new InPredicate here. See
Added code to instantiate new class


Line 61: // other predicate can be OR predicate or IN predicate
> Same here. Comment needs polish to describe inputs and outputs.
Done


Line 64: if (Expr.IS_EXPR_EQ_LITERAL_PREDICATE.apply(otherPred))
> single lines?
Done


Line 72:   }
> Seems more concise to inline these function calls, and get rid of the extra
Done


Line 78:   private Expr rewriteEqEqPredicate(Expr child0, Expr child1) {
> To keep things simple, I think we should also require that rightChild1 is a
removed this as it is covered by the IS_EXPR_EQ_LITERAL_PREDICATE function


http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 392: RewritesOk("int_col * 3 = 6 or int_col * 3 = 9 or int_col * 3 = 
12",
> add parenthesis to make the AND/OR precedence explicit
Done


Line 394: 
> remove blank lines, these 3 test cases belong together logically, so it's g
Done


Line 403: edToInrule, "int_col * 3 IN (6, 9) OR int_col * 3 <= 12");
> add negative tests to show the non-obvious cases in which the rewrite is no
Done


Line 411: RewritesOk("int_col in (1,2) or int_col = 3", edToInrule,
> int_col in (1,2) or int_col in (3,4) should also work
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sakinape...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sakinape...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-12 Thread Anonymous Coward (Code Review)
sakinape...@cloudera.com has uploaded a new patch set (#3).

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..

IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

This change introduces a new rule to merge disjunct equality
predicates into an IN predicate. As with every rule being applied
bottom up, the rule merges the leaf OR predicates into an in predicate
and subsequently merges the OR predicate to the existing IN predicate
Patch also include changes in response to alex's review comments to
normalize the binary predicates.

Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
A fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
5 files changed, 172 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/7110/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sakinape...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sakinape...@cloudera.com


[Impala-ASF-CR] IMPALA-5492:There is an error in impala-shell introduction when using LDAP

2017-06-12 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5492:There is an error in impala-shell introduction when 
using LDAP
..


Patch Set 2:

(3 comments)

Thanks for submitting the patch. Mind fix a couple of nits?

http://gerrit.cloudera.org:8080/#/c/7166/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5492:There is an error in impala-shell introduction when using 
LDAP
nit: Space after the jira key, something like "IMPALA-5492: ". (This 
helps gerrit interpret the jira as an URL)


PS2, Line 7: There is an error in impala-shell introduction when using LDAP
May be edit this to convey what the commit is actually fixing, something like, 
"Fix incorrect newline character in the LDAP message"


PS2, Line 9: The introduction has redundant '\n' in impala-shell when using 
LDAP.
   : 
   : I fix this issue by deleting the redundant '\n' in introduction 
when
   : impala-shell using LDAP.
May be reword to "Remove extraneous '\' .."?  Looks like you are not removing 
the the \n, just the extra '\' in the beginning.


-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5492:There is an error in impala-shell introduction when using LDAP

2017-06-12 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#2).

Change subject: IMPALA-5492:There is an error in impala-shell introduction when 
using LDAP
..

IMPALA-5492:There is an error in impala-shell introduction when using LDAP

The introduction has redundant '\n' in impala-shell when using LDAP.

I fix this issue by deleting the redundant '\n' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
---
M shell/impala_shell.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/7166/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 


[Impala-ASF-CR] IMPALA-5492:There is an error in impala-shell introduction when using LDAP

2017-06-12 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7166

Change subject: IMPALA-5492:There is an error in impala-shell introduction when 
using LDAP
..

IMPALA-5492:There is an error in impala-shell introduction
when using LDAP

The introduction has redundant '\n' in impala-shell when using LDAP.

I fix this issue by deleting the redundant '\n' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
---
M shell/impala_shell.py
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/7166/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 


[Impala-ASF-CR] IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

2017-06-12 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 4:

(2 comments)

Thanks for the comments. Please see PS5.

http://gerrit.cloudera.org:8080/#/c/7155/3/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS3, Line 291: }
 :   }
> the write of this doesn't look like it's done under the lock, so no point o
Done


PS3, Line 294:   // Add warnings from analysis
 :   error_log_ss << join(request_state->GetAnalysisWarnings(), 
"\n");
 : 
 :   // Add warnings from execution
 :   if (request_state->coord() != nullptr) {
 : if (!request_state->que
> I think we shouldn't hold the lock across this since it should be synchroni
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#5).

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..

IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

There was a race between ClientRequestState::UpdateQueryStatus() and the
beeswax get_state()/get_log() RPCs leading to the rare situation that a
query would abort with an error, but the error message would be empty.

The fix is to take the ClientRequestState lock in the beeswax RPCs
before obtaining the status.

To test this I ran test_corrupt_files in a loop for a day. Without this
fix, it would usually fail within a few hours.

I changed the test to allow running it in parallel like so:

@pytest.mark.parametrize('multiplier', xrange(32))
def test_corrupt_files(self, vector, multiplier):

Then I ran it in a loop like so:

i=0; while [ $? -eq 0 ]; do ((++i)); echo "Run: $i"; impala-py.test \
tests/query_test/test_scanners.py::TestParquet::test_corrupt_files \
--exploration_strategy=exhaustive -n8; done

Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
---
M be/src/service/impala-beeswax-server.cc
1 file changed, 12 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7155/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#4).

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..

IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

There was a race between ClientRequestState::UpdateQueryStatus() and the
beeswax get_state()/get_log() RPCs leading to the rare situation that a
query would abort with an error, but the error message would be empty.

The fix is to take the ClientRequestState lock in the beeswax RPCs
before obtaining the status.

To test this I ran test_corrupt_files in a loop for a day. Without this
fix, it would usually fail within a few hours.

I changed the test to allow running it in parallel like so:

@pytest.mark.parametrize('multiplier', xrange(32))
def test_corrupt_files(self, vector, multiplier):

Then I ran it in a loop like so:

i=0; while [ $? -eq 0 ]; do ((++i)); echo "Run: $i"; impala-py.test \
tests/query_test/test_scanners.py::TestParquet::test_corrupt_files \
--exploration_strategy=exhaustive -n8; done

Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
---
M be/src/service/impala-beeswax-server.cc
1 file changed, 11 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7155/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5085: large rows in BufferedTupleStreamV2

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5085: large rows in BufferedTupleStreamV2
..


Patch Set 10:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/6638/10/be/src/runtime/buffered-tuple-stream-v2.cc
File be/src/runtime/buffered-tuple-stream-v2.cc:

PS10, Line 120: Exceptions are the page we're currently writing
why is that still an exception now that we have the saved reservation?


PS10, Line 271: ExpectedPinCount
Now that the complex logic shifts to reservations, should this become "bool 
NeedPin()"? I'm fine with leaving it alone though.


PS10, Line 340: has_write_iterator
why is that? should comment say: Need reservation if there are no pages current 
pinned for reading but we may add a page, or similar?


Line 409:   - write_page_reservation_to_reclaim)) {
this can only fail for large pages, right? if so, how about clarifying with a 
dcheck?


Line 424:   // free up reservation for the next write page, so do it first.
then why does this not have to happen before the IncreaseReservationToFit() 
call above?


PS10, Line 505: deleting or unpinning the previous page
this seems like it needs updating.


PS10, Line 516: (!pinned_ || pages_.size() == 1) && has_read_write_page()
hmm, is this not expressible by NeedWriteReservation()?


PS10, Line 548: InvalidateReadIterator
why do that? don't we still need the saved read reservation?


Line 566: read_end_ptr_ = nullptr;
why doesn't this case need to save reservation?


PS10, Line 713: read_page_ == pages_.end()
why do we do NextReadPage() if we're already at the end?


PS10, Line 866: so
  :   // we can just use AllocateRowSlow() to do the work of 
advancing the page.
i don't understand why this follows from the first part of the sentence.


PS10, Line 893: pinned_
shouldn't that be expressed in terms of NeedWriteReservation()?


http://gerrit.cloudera.org:8080/#/c/6638/10/be/src/runtime/buffered-tuple-stream-v2.h
File be/src/runtime/buffered-tuple-stream-v2.h:

PS10, Line 77: In the unpinned mode, only
 : /// pages currently being read and written are pinned and other 
pages are unpinned and
 : /// therefore do not use the client's reservation and can be 
spilled to disk.
 : ///
it seems like this sentence should be combined with the next paragraph, since 
it's no longer complete.


PS10, Line 89: caller
client?


Line 278:   /// but std::function always makes a heap allocation.
explain what this callback should do.


PS10, Line 281: 'fixed_size' and variable
  :   /// length data 'varlen_size'
update


Line 283:   /// start of the allocated space and returns true. Otherwise 
returns false.
does this have the same three outcomes as AddRow?


PS10, Line 284: AllocateRow
At this point the "Add" and "Allocate" row terminology seems historical. Let's 
come up with better names for these. They both allocate/add a row, the 
difference is just whether the tuple row is passed and copied, or whether it's 
constructed on the fly, right?


Line 484:   /// * there is only one pinned page in the stream and it is already 
pinned for reading.
this comment is useful, but it'd also be helpful to explain what these three 
conditions mean in english like read_page_reservation_ comment does. i.e. "the 
possibility that a pin count will be needed for the write iterator" or 
something.


PS10, Line 535: and at the byte before 'data_end'.
what does that mean?


PS10, Line 536: updates *data_end
data_end doesn't have enough indirection. is that suppose to say '*data'?


Line 582:   void InvalidateWriteIterator();
it's a bit hard to understand how these "invalidate" methods are different from 
the "Reset*Page" methods from the description. in the case of Reset*Page(), 
what happens to the iterator? is it not invalidated?


http://gerrit.cloudera.org:8080/#/c/6638/10/be/src/runtime/buffered-tuple-stream-v2.inline.h
File be/src/runtime/buffered-tuple-stream-v2.inline.h:

Line 34: inline bool BufferedTupleStreamV2::AllocateRow(
why is it that AllocateRow() fast path is inlined but AddRow() is not?


-- 
To view, visit http://gerrit.cloudera.org:8080/6638
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2861c58efa7bc1aeaa5b7e2f043c97cb3985c8f5
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4669: [SECURITY] Add security library to build

2017-06-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4669: [SECURITY] Add security library to build
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5717/7/CMakeLists.txt
File CMakeLists.txt:

PS7, Line 348: find_package(Kerberos REQUIRED)
Does it become a requirement to have krb5 development library and headers 
installed to build Impala after this change ?


-- 
To view, visit http://gerrit.cloudera.org:8080/5717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5446: dropped Sorter::Reset() status

2017-06-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5446: dropped Sorter::Reset() status
..


Patch Set 2: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5446: dropped Sorter::Reset() status

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5446: dropped Sorter::Reset() status
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/724/

-- 
To view, visit http://gerrit.cloudera.org:8080/7134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


IMPALA-5487: Fix race in RuntimeProfile::toThrift()

node.num_children got initialized before children_ was locked. This
could lead to node.num_children < children_.size(), which made the node
tree in the resulting thrift profiles not deserialize properly.

To fix this, node.num_children needs to be initialized after children_
has been locked.

I tested this by running queries on a private cluster for a while,
making sure that the thrift profiles of running queries could be parsed
correctly.

Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Reviewed-on: http://gerrit.cloudera.org:8080/7154
Reviewed-by: Dan Hecht 
Reviewed-by: Tim Armstrong 
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M be/src/util/runtime-profile.cc
1 file changed, 1 insertion(+), 3 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sailesh Mukil: Looks good to me, but someone else must approve
  Tim Armstrong: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5446: dropped Sorter::Reset() status

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5446: dropped Sorter::Reset() status
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d4f9e93a44531901e663b3f1e18edc514363f74
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7155/3/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS3, Line 291:   // Add warnings from analysis
 :   error_log_ss << join(request_state->GetAnalysisWarnings(), 
"\n");
the write of this doesn't look like it's done under the lock, so no point of 
reading it under the lock. maybe that should be fixed, but not necessary for 
this change.


PS3, Line 294:   // Add warnings from execution
 :   if (request_state->coord() != nullptr) {
 : if (!request_state->query_status().ok()) error_log_ss << 
"\n\n";
 : error_log_ss << request_state->coord()->GetErrorLog();
 :   }
 :   log = error_log_ss.str();
I think we shouldn't hold the lock across this since it should be synchronized 
by the coordinator itself.  in other words, until the locking is sorted out 
better, it's probably only worth holding the lock across reads of 
request_state->query_status(). that also makes it consistent with HS2 where we 
don't hold the lock in ImpalaServer::GetLog().


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


IMPALA-5031: Don't dereference uninitialized memory as a bool.

Every bool takes up a byte of memory, but not every byte of memory is
a valid bool. Dereferencing that memory, even if only to write to, it
is undefined behavior.

I tested exhaustive UBSAN and DEBUG. In UBSAN, the error is no longer
present; in both UBSAN and DEBUG, all tests pass

Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Reviewed-on: http://gerrit.cloudera.org:8080/7148
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/raw-value.cc
1 file changed, 5 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-12 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7103/1/be/src/runtime/data-stream-mgr.cc
File be/src/runtime/data-stream-mgr.cc:

Line 61:   bool unused = false;
> Doesn't the contract for FindRecvr() state that we need to hold 'lock_' bef
Done


Line 105:   EarlySendersList waiters;
> Add brief comment:
Done


PS1, Line 123: for (int32_t sender_id: waiters.closing_senders) 
recvr->RemoveSender(sender_id);
> According to the header comment in data-stream-mgr.h, a sender shouldn't be
Done


PS1, Line 300: early_senders_
> Assume the following case:
The sender fragment instance would fail, and then the coordinator should cancel 
the receiver. 

I believe there's an outstanding issue where, if the coordinator fails to 
cancel a fragment instance, the fragment instance will not fail itself. I'm 
going to file a JIRA for that, but it's unrelated to KRPC.


Line 321: // Wait for 10s
> Add a brief comment stating that this is to check if the DataStreamMgr is b
Done


http://gerrit.cloudera.org:8080/#/c/7103/1/be/src/runtime/data-stream-mgr.h
File be/src/runtime/data-stream-mgr.h:

PS1, Line 81: will do one of three things
> nit: would be nice to format them as bullet points.
Done


PS1, Line 83: if the buffer is full
> "if the batch queues are full"?
Done


PS1, Line 87: the sender
> "the sender along with its payload" ?
Done


Line 224:   /// has not yet prepared 'payload' is queued until it arrives, or 
is timed out. If the
> nit: been prepared,
Done


http://gerrit.cloudera.org:8080/#/c/7103/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

PS1, Line 255: void UpdateFilter
> Leave a TODO stating that this should move to query-state.h/cc after IMPALA
I'm going to leave that for now, since I don't want to make design decisions 
for IMPALA-3825 in this patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/7103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-12 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-4856: Port data stream service to KRPC
..

IMPALA-4856: Port data stream service to KRPC

This patch ports the data-flow parts of ImpalaInternalService to KRPC.

* ImpalaInternalService is split into two services. The first,
  ImpalaInternalService, deals with control messages for plan fragment
  instance execution, cancellation and reporting, and remains
  implemented in Thrift for now. The second, DataStreamService, handles
  large-payload RPCs for transmitting runtime filters and row batches
  between hosts.

* In the DataStreamService, all RPCs use 'native' protobuf. The
  DataStreamService starts on the port previously reserved for the
  StatestoreSubscriberService (which is also a KRPC service), to avoid
  having to configure another port when starting Impala. When the
  ImpalaInternalService is ported to KRPC, all services will run on one
  port.

* To support needing to address two different backend services, a data
  service port has been added to TBackendDescriptor.

* This patch adds support for asynchronous RPCs to the RpcMgr and Rpc
  classes. Previously, Impala used fixed size thread pools + synchronous
  RPCs to achieve some parallelism for 'broadcast' RPCs like filter
  propagation, or a dedicated per-sender+receiver pair thread on the
  sender side in the DataStreamSender case. In this patch, the
  PublishFilter() and TransmitData() RPCs are sent asynchronously using
  KRPC's thread pools.

* The TransmitData() protocol has changed to adapt to asynchronous
  RPCs. The full details are in data-stream-mgr.h.

* As a result, DataStreamSender no longer creates a
  thread-per-connection on the sender side.

* Both tuple transmission and runtime filter publication use sidecars to
  minimise the number of copies and serialization steps required.

* Also include a fix for KUDU-2011 that properly allows sidecars to be
  shared between KRPC and the RPC caller (fixing IMPALA-5093, a
  corruption bug).

* A large portion of this patch is the replacement of TRowBatch with its
  Protobuf equivalent, RowBatchPb. The replacement is a literal port of
  the data structure, and row-batch-test, row-batch-list-test and
  row-batch-serialize-benchmark continue to execute without major logic
  changes.

* Simplify FindRecvr() logic in DataStreamManager. No-longer need to
  handle blocking sender-side, so no need for complex promise-based
  machinery. Instead, all senders with no receiver are added to a
  per-receiver list, which is processed when the receiver arrives. If it
  does not arrive promptly, the DataStreamManager cleans them up after
  FLAGS_datastream_sender_timeout_ms.

* This patch also begins a clean-up of how ImpalaServer instances are
  created (by removing CreateImpalaServer), and clarifying the
  relationship between ExecEnv and ImpalaServer. ImpalaServer now
  follows the standard construct->Init()->Start()->Join() lifecycle that
  we use for other services.

* Ensure that all addresses used for KRPCs are fully resolved, avoiding
  the need to resolve them for each RPC.

TESTING
---

* New tests added to rpc-mgr-test.

TO DO
-

* Re-enable throughput and latency measurements per data-stream sender
  when that information is exposed from KRPC (KUDU-1738).

* TLS and Kerberos are still not supported by KRPC in this patch.PART1 DSS

Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b
---
M .clang-format
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/common/init.cc
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exprs/expr-test.cc
M be/src/kudu/rpc/rpc_sidecar.cc
M be/src/kudu/rpc/rpc_sidecar.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/common.proto
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/rpc/rpc.h
D be/src/runtime/backend-client.h
M be/src/runtime/client-cache-types.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-state.cc
M 

[Impala-ASF-CR] IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.

2017-06-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.
..


Patch Set 4:

(6 comments)

Did the tests pass?

I did another pass over the code, just had minor comments. Thanks for making 
those changes.

http://gerrit.cloudera.org:8080/#/c/6527/4/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

PS4, Line 89: static_cast(
Is this cast needed? It looks like this is already the type of scan_node_.


http://gerrit.cloudera.org:8080/#/c/6527/4/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 64: using std::unique_ptr;
Aren't these in common/names.h?


http://gerrit.cloudera.org:8080/#/c/6527/4/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 52: using std::move;
I think common/names.h has this


http://gerrit.cloudera.org:8080/#/c/6527/4/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

PS4, Line 112: unique_ptr(
 : new RowBatch
make_unique( ?


http://gerrit.cloudera.org:8080/#/c/6527/4/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS4, Line 136: tow
row


http://gerrit.cloudera.org:8080/#/c/6527/4/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

Line 174: unique_ptr row_batch(new RowBatch(
make_unique?


-- 
To view, visit http://gerrit.cloudera.org:8080/6527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

2017-06-12 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7159/1/testdata/bin/run-hbase.sh
File testdata/bin/run-hbase.sh:

PS1, Line 127: brea
break


-- 
To view, visit http://gerrit.cloudera.org:8080/7159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop
..

IMPALA-5223: Add waiting for HBase Zookeeper nodes to retry loop

Occasionally we'd see HBase fail to startup properly on CentOS 7
clusters. The symptom was that HBase would not open the required nodes
in zookeeper, signaling its readiness.

As a workaround, this change includes waiting for the Zookeeper nodes
into the retry logic.

Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
---
M testdata/bin/run-hbase.sh
1 file changed, 11 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/7159/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id8dbdff4ad02cac1322e7d580e0a6971daf6ea28
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 3:

(2 comments)

Apologies Sailesh for the messed up PS #2. In general I try to leave an 
additional comment when I think a PS is worth another look, but I should have 
marked it as bad.

http://gerrit.cloudera.org:8080/#/c/7155/2/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

Line 253:   lock_guard l(*request_state->lock());
> You need to take request_state->lock() before the return here.
Done


Line 283:   }
> Same here. You need to take request_state->lock() before retrieving the que
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..


Patch Set 3:

Thank you for your comments. I changed the code and the commit message 
accordingly.

-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and 
beeswax RPCs
..

IMPALA-5427: Fix race between CRS::UpdateQueryStatus() and beeswax RPCs

There was a race between ClientRequestState::UpdateQueryStatus() and the
beeswax get_state()/get_log() RPCs leading to the rare situation that a
query would abort with an error, but the error message would be empty.

The fix is to take the ClientRequestState lock in the beeswax RPCs
before obtaining the status.

To test this I ran test_corrupt_files in a loop for a day. Without this
fix, it would usually fail within a few hours.

I changed the test to allow running it in parallel like so:

@pytest.mark.parametrize('multiplier', xrange(32))
def test_corrupt_files(self, vector, multiplier):

Then I ran it in a loop like so:

i=0; while [ $? -eq 0 ]; do ((++i)); echo "Run: $i"; impala-py.test \
tests/query_test/test_scanners.py::TestParquet::test_corrupt_files \
--exploration_strategy=exhaustive -n8; done

Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
---
M be/src/service/impala-beeswax-server.cc
1 file changed, 5 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7155/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7155/2/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

Line 253:   return request_state->query_state();
You need to take request_state->lock() before the return here.


Line 283:   stringstream error_log_ss;
Same here. You need to take request_state->lock() before retrieving the query 
status and details.


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..

IMPALA-5427: Fix order of status handling in ClientRequestState

There was a race in ClientRequestState::UpdateQueryStatus() leading to
the rare situation that a query would abort with an error, but the error
message would be empty.

The fix is to update the query_status_ before setting it to preserve the
error message.

To test this I ran test_corrupt_files in a loop for 2 days. Without this
fix, it would usually fail within a few hours.

I changed the test to allow running it in parallel like so:

@pytest.mark.parametrize('multiplier', xrange(32))
def test_corrupt_files(self, vector, multiplier):

Then I ran it in a loop like so:

i=0; while [ $? -eq 0 ]; do ((++i)); echo "Run: $i"; impala-py.test
tests/query_test/test_scanners.py::TestParquet::test_corrupt_files
--exploration_strategy=exhaustive -n8; done

Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
---
M be/src/service/impala-beeswax-server.cc
1 file changed, 3 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7155/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 


[native-toolchain-CR] Support gcc 6.3.0 and 7.1.0

2017-06-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: Support gcc 6.3.0 and 7.1.0
..

Support gcc 6.3.0 and 7.1.0

The default still remains gcc 4.9.2 but this lets us play around
with different GCC versions. The toolchain builds successfully
with either version, e.g.:

  GCC_VERSION=7.1.0 ./buildall.sh

Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
---
M buildall.sh
M source/gcc/build.sh
A 
source/thrift/thrift-0.9.0-patches/0009-THRIFT-2201-Ternary-operator-returns-different-types.patch
3 files changed, 69 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/56/7156/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7156
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[native-toolchain-CR] Support gcc 6.3.0 and 7.1.0

2017-06-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7156

Change subject: Support gcc 6.3.0 and 7.1.0
..

Support gcc 6.3.0 and 7.1.0

The default still remains gcc 4.9.2 but this lets us play around
with different GCC versions. The toolchain builds successfully
with either version, e.g.:

  GCC_VERSION=7.1.0 ./buildall.sh

Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
---
M buildall.sh
M source/gcc/build.sh
A 
source/thrift/thrift-0.9.0-patches/0009-THRIFT-2201-Ternary-operator-returns-different-types.patch
3 files changed, 71 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/56/7156/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7156
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id36448f493b0cc07dc36f9eb7bd5cc30c54af3d7
Gerrit-PatchSet: 1
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4856: Port ImpalaInternalService to KRPC

2017-06-12 Thread Henry Robinson (Code Review)
Henry Robinson has abandoned this change.

Change subject: IMPALA-4856: Port ImpalaInternalService to KRPC
..


Abandoned

Superseded by https://gerrit.cloudera.org/#/c/7103/

-- 
To view, visit http://gerrit.cloudera.org:8080/5888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Anonymous Coward #168
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..


Patch Set 1:

> > (1 comment)
 > >
 > > This is really subtle. In order to be correct, you'd need at
 > least
 > > a compiler barrier to ensure the writes stay in the intended
 > order.
 > >
 > > However, the HS2 version of the RPCs is already taking the
 > request
 > > state lock: GetOperationStatus(). I think we should just do the
 > > same inside of get_state() and get_log().  Then we can assert,
 > like
 > > the HS2 code, that an error state implies a !ok() status.  Is
 > there
 > > a reason we don't want to take the lock?
 > 
 > You're right, if the compiler reorders the instructions, this won't
 > be helpful. My prior reasoning was that get_log() and get_state()
 > could be called multiple times potentially slowing down the rest of
 > the operations happening under that query if we hold the
 > CRS::lock_. But seeing as how HS2 takes it, it shouldn't be an
 > issue.
 > 
 > Also, I didn't notice this earlier, but we take a global lock
 > inside get_log() and get_state() anyway (client_request_state_map_lock_),
 > so taking a query wide lock shouldn't make these functions all that
 > much slower.

Just be careful to not take the CRS::lock_ while still holding the map lock. 
Follow the same pattern elsewhere by using GetClientRequestState() and then 
take the lock_ after owning a shared ptr to the CLR itself.

-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..


Patch Set 1: -Code-Review

> (1 comment)
 > 
 > This is really subtle. In order to be correct, you'd need at least
 > a compiler barrier to ensure the writes stay in the intended order.
 > 
 > However, the HS2 version of the RPCs is already taking the request
 > state lock: GetOperationStatus(). I think we should just do the
 > same inside of get_state() and get_log().  Then we can assert, like
 > the HS2 code, that an error state implies a !ok() status.  Is there
 > a reason we don't want to take the lock?

You're right, if the compiler reorders the instructions, this won't be helpful. 
My prior reasoning was that get_log() and get_state() could be called multiple 
times potentially slowing down the rest of the operations happening under that 
query if we hold the CRS::lock_. But seeing as how HS2 takes it, it shouldn't 
be an issue. 

Also, I didn't notice this earlier, but we take a global lock inside get_log() 
and get_state() anyway (client_request_state_map_lock_), so taking a query wide 
lock shouldn't make these functions all that much slower.

-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport
..


Patch Set 2: -Code-Review

> Is the the -kerb option deprecated in the buildall.sh script? If
 > not is there a guide on how to use it? I played around with it
 > trying to get it to work since it would save me a bit of time in
 > testing (being able to test the kerberos stuff in my local dev
 > environment vs deploying to a full kerberos enabled cluster), I ran
 > into a variety of issues that lead me to think that it isn't
 > actively maintained/used.

Unfortunately, we don't yet support kerberos for our minicluster. That script 
is deprecated. It will be coming up at some point in the near future, but for 
now, if you want to test with kerberos, the easiest way would be to deploy to a 
cluster configured with kerberos.

-- 
To view, visit http://gerrit.cloudera.org:8080/7061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7152/2/tests/comparison/cli_options.py
File tests/comparison/cli_options.py:

Line 205: 'once',
> "once"
David, I can't figure out what you're asking me to do.

1. The word "once" is spelled correctly.
2. This file tends to use single quotes.


-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport

2017-06-12 Thread John Sherman (Code Review)
John Sherman has posted comments on this change.

Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport
..


Patch Set 2:

Is the the -kerb option deprecated in the buildall.sh script? If not is there a 
guide on how to use it? I played around with it trying to get it to work since 
it would save me a bit of time in testing (being able to test the kerberos 
stuff in my local dev environment vs deploying to a full kerberos enabled 
cluster), I ran into a variety of issues that lead me to think that it isn't 
actively maintained/used.

-- 
To view, visit http://gerrit.cloudera.org:8080/7061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 1:

(12 comments)

> This patch passes core, exhaustive and ASAN tests. It can execute
 > 32 concurrent streams of TPCDS-Q17 @ scale factor 3 on a
 > 138-node cluster with Kerberos enabled. (I don't believe the
 > previous implementation could do this effectively because of the
 > number of Thrift connections required).
 > 
 > Some perf results from a 20-node cluster:
 > 
 > ++--+---++-++---++-+---+
 > | Workload   | Query| File Format   | Avg(s) | Base
 > Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients |
 > Iters |
 > ++--+---++-++---++-+---+
 > | TPCH(_300) | TPCH-Q3  | parquet / none / none | 32.55  | 28.18   
 >|   +15.51%  |   4.71%   |   1.17%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q13 | parquet / none / none | 24.43  | 22.21   
 >|   +9.99%   |   0.61%   |   0.70%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q8  | parquet / none / none | 7.53   | 7.05
 >|   +6.69%   |   1.70%   |   2.09%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q22 | parquet / none / none | 6.35   | 6.04
 >|   +5.19%   |   0.37%   |   0.76%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q14 | parquet / none / none | 4.28   | 4.10
 >|   +4.36%   |   0.03%   |   0.73%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q15 | parquet / none / none | 3.53   | 3.41
 >|   +3.69%   |   0.61%   |   1.42%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q16 | parquet / none / none | 6.09   | 5.87
 >|   +3.63%   |   0.15%   |   1.78%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q11 | parquet / none / none | 1.73   | 1.70
 >|   +2.22%   |   0.10%   |   0.95%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q21 | parquet / none / none | 105.84 | 103.71  
 >|   +2.06%   |   0.57%   |   0.44%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q9  | parquet / none / none | 30.76  | 30.46   
 >|   +1.00%   |   2.57%   |   1.22%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q1  | parquet / none / none | 22.14  | 21.94   
 >|   +0.91%   |   0.81%   |   0.86%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q4  | parquet / none / none | 5.09   | 5.05
 >|   +0.79%   |   0.48%   |   2.54%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q18 | parquet / none / none | 31.76  | 32.54   
 >|   -2.39%   |   0.44%   |   0.03%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q2  | parquet / none / none | 1.98   | 2.04
 >|   -2.74%   |   7.17%   |   7.41%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q5  | parquet / none / none | 47.62  | 48.98   
 >|   -2.79%   |   0.51%   |   0.16%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q20 | parquet / none / none | 3.18   | 3.27
 >|   -2.89%   |   1.34%   |   1.98%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q6  | parquet / none / none | 1.32   | 1.37
 >|   -3.72%   |   0.03%   |   4.00%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q10 | parquet / none / none | 9.00   | 9.48
 >|   -5.06%   |   0.16%   |   0.69%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q17 | parquet / none / none | 5.16   | 5.75
 >|   -10.18%  |   6.44%   |   2.63%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q12 | parquet / none / none | 3.01   | 3.39
 >|   -11.38%  |   2.43%   |   0.06%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q19 | parquet / none / none | 25.20  | 28.82   
 >| I -12.57%  |   0.01%   |   0.75%| 1   | 3
 > |
 > | TPCH(_300) | TPCH-Q7  | parquet / none / none | 45.32  | 61.16   
 >| I -25.91%  |   0.55%   |   2.22%| 1   | 3
 > |
 > ++--+---++-++---++-+---+
 > 
 > Primitives (note the significant regression in many_independent_fragments,
 > that needs further attention)
 > 
 > +-++---++-++---++-+---+
 > | Workload| Query  
 >| File Format   | Avg(s) | Base Avg(s) |
 > Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
 > 

[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7152/2/tests/comparison/cli_options.py
File tests/comparison/cli_options.py:

Line 205: 'once',
"once"

Nice. I've never even heard of this library before.


-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2:

> Do you have a Jenkins run that uses this code?

There is currently nothing upstream that exercises this either before or after 
this patch.

-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..


Patch Set 1:

(1 comment)

This is really subtle. In order to be correct, you'd need at least a compiler 
barrier to ensure the writes stay in the intended order.

However, the HS2 version of the RPCs is already taking the request state lock: 
GetOperationStatus(). I think we should just do the same inside of get_state() 
and get_log().  Then we can assert, like the HS2 code, that an error state 
implies a !ok() status.  Is there a reason we don't want to take the lock?

http://gerrit.cloudera.org:8080/#/c/7155/1//COMMIT_MSG
Commit Message:

PS1, Line 9: ClientRequestState::UpdateQueryStatus()
the race isn't really in that function. It's between that function and the pair 
of beeswax get_state()/get_log() RPCs.  Please clarify.


-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread Matthew Mulder (Code Review)
Matthew Mulder has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2:

The code looks good to me. Do you have a Jenkins run that uses this code?

-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: No


Re: [Impala-ASF-CR] IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

2017-06-12 Thread Alexander Behm
Congratulations on your first commit, Yu! Thanks for your contribution.
Great work on getting this through design and code review!

On Mon, Jun 12, 2017 at 1:31 PM, Impala Public Jenkins (Code Review) <
ger...@cloudera.org> wrote:

> Impala Public Jenkins has submitted this change and it was merged.
>
> Change subject: IMPALA-5016: Simplify COALESCE() in
> SimplifyConditionalsRule.
> ..
>
>
> IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.
>
> Simplify COALESCE by skipping leading nulls and applying the following
> transformations:
> COALESCE(null, a, b) -> COALESCE(a, b);
> COALESCE(, a, b) -> , when literal is not NullLiteral;
> COALESCE(, a, b) -> ,
> when the partition column does not contain NULL.
>
> Testing:
> added unit tests in ExprRewriteRulesTest
>
> Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
> Reviewed-on: http://gerrit.cloudera.org:8080/7015
> Reviewed-by: Alex Behm 
> Tested-by: Impala Public Jenkins
> ---
> M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
> M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
> 2 files changed, 126 insertions(+), 17 deletions(-)
>
> Approvals:
>   Impala Public Jenkins: Verified
>   Alex Behm: Looks good to me, approved
>
>
>
> --
> To view, visit http://gerrit.cloudera.org:8080/7015
> To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
>
> Gerrit-MessageType: merged
> Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
> Gerrit-PatchSet: 7
> Gerrit-Project: Impala-ASF
> Gerrit-Branch: master
> Gerrit-Owner: yu feng 
> Gerrit-Reviewer: Alex Behm 
> Gerrit-Reviewer: Impala Public Jenkins
> Gerrit-Reviewer: yu feng 
>
> --
> You received this message because you are subscribed to the Google Groups
> "impala-cr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to impala-cr+unsubscr...@cloudera.com.
> For more options, visit https://groups.google.com/a/cloudera.com/d/optout.
>


[Impala-ASF-CR] IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.
..


IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

Simplify COALESCE by skipping leading nulls and applying the following
transformations:
COALESCE(null, a, b) -> COALESCE(a, b);
COALESCE(, a, b) -> , when literal is not NullLiteral;
COALESCE(, a, b) -> ,
when the partition column does not contain NULL.

Testing:
added unit tests in ExprRewriteRulesTest

Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Reviewed-on: http://gerrit.cloudera.org:8080/7015
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
2 files changed, 126 insertions(+), 17 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: yu feng 


[Impala-ASF-CR] IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: yu feng 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5036: Parquet count star optimization

2017-06-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5036: Parquet count star optimization
..


Patch Set 3:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/6812/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 452:   }
Why not else if as in the previous patch set? Else-if seems more accurate.


http://gerrit.cloudera.org:8080/#/c/6812/3/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

Line 226:   11: optional i64 parquet_count_star_slot_offset
i32 right?


http://gerrit.cloudera.org:8080/#/c/6812/2/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

Line 571:   public FunctionCallExpr getMergeAggInputFn() { return 
mergeAggInputFn_; }
> Yes, they are used on line 617, for example. I don't think it would be a go
I think you can access members directly in L617


http://gerrit.cloudera.org:8080/#/c/6812/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

Line 248:* Adds a new slot descriptor to the tuple descriptor of this scan. 
Also adds an entry
* explain what is going to be stored in this new slot descriptor
* explain what is returned


Line 249:* to 'optimizedAggSmap_' that replaces a count() with a special 
sum() function that
that substitutes count(*) with sum_init_zero()


Line 915: 
msg.hdfs_scan_node.setOptimize_parquet_count_star(optimizedAggSmap_ != null);
Do we need to pass this to the BE? The presence/absence of the 
parquet_count_star_slot_offset seems enough to decide.


http://gerrit.cloudera.org:8080/#/c/6812/3/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 1213:* table scans.
instead of scanning the table (fix other places below also)


http://gerrit.cloudera.org:8080/#/c/6812/3/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

Line 290:   public void testParquetStats() { 
runPlannerTestFile("parquet-stats-agg"); }
testParquetStatsAgg()


http://gerrit.cloudera.org:8080/#/c/6812/3/testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test:

Line 1: # Verify that that the parquet count(*) optimization is applied in all 
the cases.
spell out "in all the cases" a little more and also mention that in one case we 
expect the optimization to not be applied


Line 22: |  |  output: sum_init_zero(functional_parquet.alltypes.parquet-stats: 
num_rows)
Can we reduce this to just parquet-stats.num_rows? How do we create such a long 
label?


Line 99:  DISTRIBUTEDPLAN
Remove here and all tests below. I think showing the distributed plan for the 
first test case is enough.


Line 114: select month, count(*) from functional_parquet.alltypes group by 
month, year
Add a negative test for this one:

select count(*), count(year), count(month),


Line 172: select max(year), count(*) from functional_parquet.alltypes
use avg() instead of max() because max() is going to be optimized in the same 
way in the future (but avg() definitely not)


Line 195: # IMPALA-5036
JIRA number is not very descriptive. Describe what this test case is checking.


Line 278: # The count(*) optimization is applied to the inline view even if 
there is a join.
Add a negative test case that shows the query block must have one table ref, 
e.g.

select count(*) from functional_parquet.alltypes a, functional_parquet.alltypes 
b


Line 352: # tinyint_col is not partitioned so the optimization is disabled.
tinyint_col is not a partition column


Line 402: # Optimization is not applied in the case of count(null).
is not applied to count(null)


Line 451: # Optimization is not applied because the count(*) is not applied 
directly to the
Optimization is not applied across query blocks, even though it would be 
correct here.


Line 453: select count(*) from ( select int_col from 
functional_parquet.alltypes) t
Add a new test that shows we only consider materialized agg exprs, something 
like:

select year, cnt from (
  select year, count(bigint_col), count(*) cnt, avg(int_col)
  from functional_parquet.alltypes
  where month=1
  group by year
)


Line 476: # Optimization is not applied because we are not scanning a Parquet 
table.
Remove. This case is already covered above.


http://gerrit.cloudera.org:8080/#/c/6812/3/testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test:

Line 324: WARNING: The following tables are missing relevant table and/or 
column statistics.
Something wrong with your setup? This table should have 

[Impala-ASF-CR] IMPALA-5427: Fix order of status handling in ClientRequestState

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7155

Change subject: IMPALA-5427: Fix order of status handling in ClientRequestState
..

IMPALA-5427: Fix order of status handling in ClientRequestState

There was a race in ClientRequestState::UpdateQueryStatus() leading to
the rare situation that a query would abort with an error, but the error
message would be empty.

The fix is to update the query_status_ before setting it to preserve the
error message.

To test this I ran test_corrupt_files in a loop for 2 days. Without this
fix, it would usually fail within a few hours.

I changed the test to allow running it in parallel like so:

@pytest.mark.parametrize('multiplier', xrange(32))
def test_corrupt_files(self, vector, multiplier):

Then I ran it in a loop like so:

i=0; while [ $? -eq 0 ]; do ((++i)); echo "Run: $i"; impala-py.test
tests/query_test/test_scanners.py::TestParquet::test_corrupt_files
--exploration_strategy=exhaustive -n8; done

Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
---
M be/src/service/client-request-state.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/7155/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7155
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib4494fe3f933cc23841db0e7da407eec5650f2b5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4622: Add ALTER COLUMN statement.

2017-06-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#4).

Change subject: IMPALA-4622: Add ALTER COLUMN statement.
..

IMPALA-4622: Add ALTER COLUMN statement.

Kudu recently added the ability to alter a column's default value
and storage attributes (KUDU-861). This patch adds the ability to
modify these from Impala using ALTER.

It also supports altering a column's comment for non-Kudu tables.

It does not support setting a column to be a primary key or
chaning a column's nullability, because those are not supported on
the Kudu side yet.

Syntax:
ALTER TABLE  ALTER [COLUMN] 
  SET   [  [ ...]]
where  is one of:
  - DEFAULT, BLOCK_SIZE, ENCODING, COMPRESSION (Kudu tables)
  - COMMENT (non-Kudu tables)
ALTER TABLE  ALTER [COLUMN]  DROP DEFAULT

This is similar to the existing CHANGE statement:
ALTER TABLE  CHANGE   
  [COMMENT ]
but the new syntax is more natural for setting column properties
when the column name and type are not being changed. ALTER COLUMN
operations use AlterTableChangeColStmt and are sent to the catalog
as CHANGE operations where the type and column name are unchanged.

Testing:
- Added FE tests to ParserTest and AnalyzeDDLTest
- Added EE tests to kudu_alter.test and kudu_describe.test

Change-Id: Id2e8bd65342b79644a0fdcd925e6f17797e89ad6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
10 files changed, 225 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/6955/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id2e8bd65342b79644a0fdcd925e6f17797e89ad6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4622: Add ALTER COLUMN statement.

2017-06-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4622: Add ALTER COLUMN statement.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6955/3/fe/src/main/java/org/apache/impala/analysis/AlterColumnStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterColumnStmt.java:

PS3, Line 43: 
> comment; maybe createDropDefaultStmt to be a bit more clear
Done


PS3, Line 75: 
> not necessary
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id2e8bd65342b79644a0fdcd925e6f17797e89ad6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/723/

-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 2: Code-Review+1

LGTM, thanks.

-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 1:

(1 comment)

Thank you for the comments. I uploaded PS2, where I removed the .reserve().

http://gerrit.cloudera.org:8080/#/c/7154/1/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

PS1, Line 764: children_.size()
> Yes, I left it there since it doesn't change correctness and I assumed in t
I removed.


-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 2: Code-Review+2

Please have Tim and Sailesh also finish their reviews.

-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Remove undefined behavior "reference binding to null"

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5031: Remove undefined behavior "reference binding to 
null"
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7008/6/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

Line 983:   if (haystack == NULL) return NULL;
> This overly-restrictive DCHECK was exposed by the other parts of the change
haystack can be NULL only if 'hay_len' == 0, in which case, the loop below 
won't be entered. So, rather than add this branch on each iteration of 
SplitPart(), it should just be: DCHECK(hay_len == 0 || haystack != NULL);


-- 
To view, visit http://gerrit.cloudera.org:8080/7008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifef5c3de686318bff4eac7dfad6e1639ddadb8f2
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/722/

-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


Patch Set 3:

> Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/720/

https://issues.apache.org/jira/browse/IMPALA-3040, trying again.

-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


Patch Set 3: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/720/

-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 1:

> (1 comment)

Let's remove it

-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7154/1/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

PS1, Line 764: children_.size()
> The reserve() shouldn't matter for correctness, but this is wonky.
Yes, I left it there since it doesn't change correctness and I assumed in the 
average case it will still be usefull. I agree that it is confusing. I could 
either

- Remove it
- Add a comment and explain why it is here
- Move it below where we add the children and where it may matter more

I'm in favor of removing it, the number of children seems to be small in 
general, but if we're concerned about performance we may want to move it down 
where the children are copied.


-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3040 addendum: use specific build type timeout for slow builds

2017-06-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3040 addendum: use specific_build_type_timeout for slow 
builds
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I80f1c8a0e634a3726c53ef7297c5b162dd57a3a2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7154/1/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

PS1, Line 764: children_.size()
> What about this?
The reserve() shouldn't matter for correctness, but this is wonky.

Maybe we should just get rid of this line - I doubt avoiding an allocation or 
two on this code makes any measurable difference.


-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7154/1/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

PS1, Line 764: children_.size()
What about this?


-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5487: Fix race in RuntimeProfile::toThrift()

2017-06-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7154

Change subject: IMPALA-5487: Fix race in RuntimeProfile::toThrift()
..

IMPALA-5487: Fix race in RuntimeProfile::toThrift()

node.num_children got initialized before children_ was locked. This
could lead to node.num_children < children_.size(), which made the node
tree in the resulting thrift profiles not deserialize properly.

To fix this, node.num_children needs to be initialized after children_
has been locked.

I tested this by running queries on a private cluster for a while,
making sure that the thrift profiles of running queries could be parsed
correctly.

Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
---
M be/src/util/runtime-profile.cc
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7154/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7fad4ee2eee1f73e387c1e90a3db265b19b3a0c6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7152/2/tests/comparison/cli_options.py
File tests/comparison/cli_options.py:

Line 204: filterwarnings(
> What's the Python standard for writing multi-word names?  I thought it was 
This is an import of a standard library module, so I don't have a choice.

https://docs.python.org/2/library/warnings.html


-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Wood 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5280: Coalesce chains of OR conditions to an IN predicate

2017-06-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5280: Coalesce chains of OR conditions to an IN predicate
..


Patch Set 2:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 180:   public final static com.google.common.base.Predicate 
IS_EXPR_EQ_LITERAL_PREDICATE = new com.google.common.base.Predicate() {
long line


http://gerrit.cloudera.org:8080/#/c/7110/1/fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java
File 
fe/src/main/java/org/apache/impala/rewrite/CoalesceEqualityDisjunctsToInRule.java:

Line 69
> Agree. do we capture this effort in a separate jira?
I think this particular case is easy enough to fix in this JIRA. It's a simple 
generalization of the existing NormalizeBinaryPredicatesRule (always move 
constant exprs to the right)


http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
File fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java:

Line 12:  * This rule will coalesce multiple equality predicates to an IN 
predicate. It
Suggest modifications for brevity and grammar:

Coalesces disjunctive equality predicates to an IN predicate, and merges 
compatible equality predicates into existing IN predicates.


Line 16:  * (X+Y = 5) OR (X+Y = 6) -> X+Y IN (5, 6)
add an example where an equality predicate is merged into an IN predicate


Line 32: if (orChildExpr != null) return orChildExpr;
What about IN OR IN? That can be produced by (a = 1 or a = 2) or (a = 3 or a = 
4) and should result in a IN (1,2,3,4)


Line 38:* returns a rewritten expression for expression of type A= 1 OR A 
IN (2, 3)
This comment should describe the inputs and outputs more clearly, in 
particular, what happens when no rewrite is performed.


Line 52: if (!Expr.IS_EXPR_EQ_LITERAL_PREDICATE.apply(otherPred))  return 
null;
extra space before "return"


Line 54: 
can get rid of these empty lines imo


Line 55: inPred.addChild(otherPred.getChild(1));
Our rewriting framework requires that we return a new InPredicate here. See the 
comment on ExprRewriteRule.apply().


Line 61:* returns a rewritten expression for expression of type A=1 OR A=2
Same here. Comment needs polish to describe inputs and outputs.


Line 64: if (!Expr.IS_EXPR_EQ_LITERAL_PREDICATE.apply(child0))
single lines?


Line 72: Expr leftChild0 = bp0.getChild(0);
Seems more concise to inline these function calls, and get rid of the extra 
assignments.


Line 78: if (!leftChild1.isLiteral()) return null;
To keep things simple, I think we should also require that rightChild1 is a 
literal. Otherwise, this rule is somewhat asymmetric.


http://gerrit.cloudera.org:8080/#/c/7110/2/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 392: RewritesOk("int_col = 1 or int_col = 2 or int_col = 3 and int_col 
= 4",
add parenthesis to make the AND/OR precedence explicit


Line 394: 
remove blank lines, these 3 test cases belong together logically, so it's good 
to cluster them visually


Line 403: // combo rules
add negative tests to show the non-obvious cases in which the rewrite is not 
performed, for example:

int_col = smallint_col or int_col = bigint_col


Line 411: RewritesOk("int_col = 1 or int_col in (2, 3)", edToInrule,
int_col in (1,2) or int_col in (3,4) should also work


-- 
To view, visit http://gerrit.cloudera.org:8080/7110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If02396b752c5497de9a92828c24c8062027dc2e2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: sakinape...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: sakinape...@cloudera.com
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#2).

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..

IMPALA-5263: test infra: support CA bundles with secure clusters

This patch adds the command line option --ca_cert to the common test
infra CLI options for use alongside --use-ssl. This is useful when
testing against a secured Impala cluster in which the SSL certs are
self-signed. This will allow the SSL request to be validated. Using this
option will also suppress noisy console warnings like:

  InsecureRequestWarning: Unverified HTTPS request is being made. Adding
  certificate verification is strongly advised. See:
  https://urllib3.readthedocs.org/en/latest/security.html

We also go further in this patch and use the warnings module to print
these SSL-related warnings once and only once, instead of all over the
place. In the case of the stress test, this greatly reduces the noise in
the console log.

Testing:
- quick concurrent_select.py calls with and without --ca_cert to observe
  that connections still get made and the test runs smoothly. Some of
  this testing occurred without warning suppression, so that I could be
  sure the InsecureRequestWarnings were not occurring when using
  --ca_cert anymore.
- ensured warnings are printed once, not multiple times

Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
---
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/db_connection.py
3 files changed, 56 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/7152/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Mulder 
Gerrit-Reviewer: Tim Wood 


[Impala-ASF-CR] IMPALA-5263: test infra: support CA bundles with secure clusters

2017-06-12 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7152

Change subject: IMPALA-5263: test infra: support CA bundles with secure clusters
..

IMPALA-5263: test infra: support CA bundles with secure clusters

This patch adds commandline option, --ca_cert, to the common test infra
CLI options for use alongside --use-ssl. This is useful when testing
against a secured Impala cluster in which the SSL certs are self-signed.
This will allow the SSL request to be validated. Using this option will
also suppress noisy console warnings like:

  InsecureRequestWarning: Unverified HTTPS request is being made. Adding
  certificate verification is strongly advised. See:
  https://urllib3.readthedocs.org/en/latest/security.html

We also go further in this patch and use the warnings module to print
these SSL-related warnings once and only once, instead of all over the
place. In the case of the stress test, this greatly reduces the noise in
the console log.

Testing:
- quick concurrent_select.py calls with and without --ca_cert to observe
  that connections still get made and the test runs smoothly. Some of
  this testing occurred without warning suppression, so that I could be
  sure the InsecureRequestWarnings were not occurring when using
  --ca_cert anymore.
- ensured warnings are printed once, not multiple times

Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
---
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/db_connection.py
3 files changed, 55 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/7152/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifb9e466e4b7cde704cdc4cf98159c068c0a400a9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 


[Impala-ASF-CR] IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.
..


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/721/

-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: yu feng 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

2017-06-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: yu feng 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

2017-06-12 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#6).

Change subject: IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.
..

IMPALA-5016: Simplify COALESCE() in SimplifyConditionalsRule.

Simplify COALESCE by skipping leading nulls and applying the following
transformations:
COALESCE(null, a, b) -> COALESCE(a, b);
COALESCE(, a, b) -> , when literal is not NullLiteral;
COALESCE(, a, b) -> ,
when the partition column does not contain NULL.

Testing:
added unit tests in ExprRewriteRulesTest

Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
---
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
2 files changed, 126 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/7015/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7015
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0325a9e437261b15313bbdf2f22d83376a84b575
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: yu feng 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: yu feng 


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/720/

-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: Don't dereference uninitialized memory as a bool.

2017-06-12 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5031: Don't dereference uninitialized memory as a bool.
..


Patch Set 3: Code-Review+2

rebase carry

-- 
To view, visit http://gerrit.cloudera.org:8080/7148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc03401a9aeb966f33298268e53ab1040d007224
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No