[Impala-ASF-CR] IMPALA-4163: Add sortby() query hint

2016-12-14 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4163: Add sortby() query hint
..


Patch Set 3:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

Line 370: nonterminal List opt_plan_hints, plan_hint_list;
do you really need both? why not have a non-existing hint = empty hint list?


Line 2397:   | /* empty */
what's the point of this?


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 127:   // Contains the result exprs corresponding to the columns of the 
target table, that are
what do you mean by result exprs? are these basetbl exprs?


Line 774:   private void analyzeSortByColumns(List columnNames) throws 
AnalysisException {
use line breaks where it makes sense.

also, describe the rules for sort-by columns somewhere.

analyzeSortByHint is more general (and correct)


Line 787:   } else {
you'd also take this branch for hbase


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/analysis/PlanHint.java
File fe/src/main/java/org/apache/impala/analysis/PlanHint.java:

Line 34:   private final String hintName_;
remove 'hint' from members, this class already contains 'hint' in its name.


Line 62:   public String toString() {
toSql test missing


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

Line 87:   // ArrayList initialization each.
yes, please do.


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

PS3, Line 511: clustered/noclustered plan
 :* hint and the sortby()
"on the clustered/noclustered/sortby plan hint" is shorter


Line 515:* will also sort by all columns specified in the sortby() hint.
rewrite comment, instead of just tacking on


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/main/jflex/sql-scanner.flex
File fe/src/main/jflex/sql-scanner.flex:

Line 331: HintContent = [ ]* "+" [^\r\n*]*
isn't [ ]* the same as " "*?


Line 336: ContainsCommentEnd = [^]* "*/" [^]*
what is [^]*?

shouldn't 'anything' be simply ".*"?


Line 342: TraditionalComment = "/*" !({HintContent}|{ContainsCommentEnd}) "*/"
why do you need the ContainsCommentEnd here? wouldn't /* */ */ simply be a 
parse error?


Line 434:   yybegin(HINT);
shouldn't this only apply to end-of-line hints?


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

Line 1776:   "partition (year, month) 
%sshuffle,clustered,sortby(int_col,bool_col)%s " +
mix up the order a bit


http://gerrit.cloudera.org:8080/#/c/5051/3/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

Line 216: ParserError("select 1 /*+ sortby(() */");
also check that this is illegal:

/* +sortby()\n

and 

-- +sortby() */ \n


Line 477:   "insert into t %sSoRtBy(  a  ,  , ,,, b  )%s select * from 
t", prefix, suffix));
is the capitalization necessary?


http://gerrit.cloudera.org:8080/#/c/5051/3/testdata/workloads/functional-planner/queries/PlannerTest/insert.test
File testdata/workloads/functional-planner/queries/PlannerTest/insert.test:

Line 758:/*+ clustered,sortby(int_col, bool_col) */
move clustered to end


-- 
To view, visit http://gerrit.cloudera.org:8080/5051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37a3ffab99aaa5d5a4fd1ac674b3e8b394a3c4c0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that 
only run locally.
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem
..

IMPALA-4633: Change broken gflag default for Kudu client mem

We discovered that the current Kudu client defaults in the
KuduTableSink are causing a large number of queries to
timeout, failing the query. The current default value of the
'mutation buffer size' is 100MB which results in higher
write throughput than Kudu can currently handle on large
clusters.  By decreasing the value of this flag, more RPCs
will be sent for the same amount of data, i.e. throttling
the load on Kudu. We found tests to be more successful on
200 nodes with a 10MB buffer size than the previous 100MB
value where most queries couldn't complete due to timeouts.
These queries were not timing out with the 10MB value. This
appears to work well on 9 node stress tests as well.

Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
2 files changed, 5 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/5503/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that 
only run locally.
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4715

to look at the new patch set (#14).

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..

IMPALA-3200: Implement suballocator for splitting buffers

This is useful for situations like hash tables, where we want to
make multiple non-spillable allocations of variable size from buffer
pool memory and not incur the overhead of interacting with the global
buffer pool. The allocator subdivides buffers to service allocations
and uses a buddy allocation algorithm to merge freed allocations into
larger chunks. This helps avoid external fragmentation and is quite
effective at reusing memory given the typical doubling allocation
patterns of hash tables in partitioned aggs and joins.

Testing:
The allocator has fairly robust internal consistency checks via
assertions and unique_ptrs. Includes a unit test that exercises
various allocation patterns.

I tested porting hash tables over to allocate memory using an earlier
version of the suballocator, which worked well (was able to run a
wide range of queries successfully with good performance).

Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
---
M be/src/common/names.h
M be/src/exec/hash-table.h
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/suballocator-test.cc
A be/src/runtime/bufferpool/suballocator.cc
A be/src/runtime/bufferpool/suballocator.h
6 files changed, 938 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/14
-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 14: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.

2016-12-14 Thread David Knupp (Code Review)
Hello Michael Brown, Lars Volker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5446

to look at the new patch set (#4).

Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that 
only run locally.
..

IMPALA-4639: Add pytest option and xfail markers for tests that only run 
locally.

As we're beginning to run Impala end-to-end tests on remote clusters, we're
finding some tests that do not pass for infrastructure-related reasons (as
opposed to product issues.) It would be useful to be able to xfail any tests
that we know to be problematic within a given module, yet still run the
others. This way, we can get passing test runs as we're ironing out those
infrastructure issues.

Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
---
M tests/conftest.py
M tests/metadata/test_compute_stats.py
M tests/query_test/test_mt_dop.py
3 files changed, 24 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/5446/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block

2016-12-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3977: TransmitData() should not block
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

PS1, Line 226: FLAGS_datastream_sender_timeout_ms
> rather than have the sleep period be a function of the total timeout, it ma
Does it make sense to add some exponential back-off here, let's say up to a 
maximum value? That way, if the sender out-races the receiver we will try again 
sooner and bound delays.


http://gerrit.cloudera.org:8080/#/c/5491/1/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

Line 233:   ("DATASTREAM_SENDER_TIMEOUT", 72, "Sender timed out waiting for 
receiver fragment "
Should we mark this as deprecated here?


PS1, Line 314: DATASTREAM_RECEIVER_EAGAIN
Would it make sense to name this more general, e.g. "RPC_TRY_AGAIN" so we can 
re-use it in similar situations?


Line 315:"Datastream receiver is not yet ready"),
nit: single line?


-- 
To view, visit http://gerrit.cloudera.org:8080/5491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5503/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4630: Change broken gflag default for Kudu client mem
> Wrong JIRA?
Done


Line 11: fail. The current mutation buffer size is 100MB, which is
> can you qualify "fail"? did they time out?
Done


Line 15: successful on 200 node tests than the previous 100MB value.
> Can you briefly describe the effect that this decrease has? For the same am
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4569: fuzz test fixes
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5502/2//COMMIT_MSG
Commit Message:

PS2, Line 7: IMPALA-4569
wrong jira?


-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4569: fuzz test fixes
..

IMPALA-4569: fuzz test fixes

* Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory
  consumption to ~5GB per daemon(assuming 10 concurrent tests).
* Refactor the exec option handling to use the exec_option dimension.
  This avoids executing the test multiple times redundantly
* Remove the xfails to reduce noise, since there is no immediate plan to
  fix the product bugs. Instead just pass the tests.

Testing:
Ran in a loop for ~1h to flush out flakiness.

Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 41 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5502/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/121/

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4569: fuzz test fixes
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5502/1/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 59:   'disable_codegen' : cls.DISABLE_CODEGEN_VALUES, 'mem_limit' 
: cls.MEM_LIMITS}))
> Should we add BATCH_SIZES here as well? I can see how you'd want to run sev
That's a good point. Changed it to do that.


-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4569: fuzz test fixes
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4569: fuzz test fixes
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/122/

-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem

2016-12-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5503/2//COMMIT_MSG
Commit Message:

PS2, Line 11: The current default value of the
: 'mutation buffer size' is 100MB which results in higher
: write throughput than Kudu can currently handle on large
: clusters. 
is there a kudu jira for making kudu tolerant of writers that are "too fast"? 
if so, can you mention it (or link the impala jira to it)?


-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4715/13/be/src/runtime/bufferpool/suballocator.h
File be/src/runtime/bufferpool/suballocator.h:

PS13, Line 173: Suballocation objects are owned by
  : /// either client code (if in use), the free list (if free) or 
its left child (if split).
> I think this could use calling out even more explicitly - the client has to
Tried making it more explicit.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block

2016-12-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3977: TransmitData() should not block
..


Patch Set 1:

I'll do the first round of reviews.

-- 
To view, visit http://gerrit.cloudera.org:8080/5491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 7: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 7:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/118/

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4569: fuzz test fixes
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5502/1/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 59:   'disable_codegen' : cls.DISABLE_CODEGEN_VALUES, 'mem_limit' 
: cls.MEM_LIMITS}))
Should we add BATCH_SIZES here as well? I can see how you'd want to run several 
batch sizes on the same file, but the same would also apply to 
DISABLE_CODEGEN_VALUES. Not asking you to change anything, just curious what 
you think.


-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5503/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4630: Change broken gflag default for Kudu client mem
Wrong JIRA?


Line 11: fail. The current mutation buffer size is 100MB, which is
can you qualify "fail"? did they time out?


Line 15: successful on 200 node tests than the previous 100MB value.
Can you briefly describe the effect that this decrease has? For the same amount 
of data will we do more RPCs?


-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4640: Fix number of rows displayed by parquet-reader tool

2016-12-14 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-4640: Fix number of rows displayed by parquet-reader tool
..

IMPALA-4640: Fix number of rows displayed by parquet-reader tool

The variable just never got updated in the code. This change also adds
verification that all columns contain the same number of rows.

Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a
---
M be/src/util/parquet-reader.cc
1 file changed, 19 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/5453/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5453
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests

2016-12-14 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-2605: Omit the sort and mini stress tests
..


Patch Set 2:

> Maybe we should dump out the value of some of these metrics before
 > running the tests in tests/run-tests.py. That would help diagnose
 > in future.

So you're suggesting we add logging for, say, 
impala.thrift-server.backend.connections-in-use, 
impala.thrift-server.beeswax-frontend.connections-in-use, and 
impala.thrift-server.hiveserver2-frontend.connections-in-use, abandon this 
patch, and see what we can diagnose the next time this happens?

-- 
To view, visit http://gerrit.cloudera.org:8080/5401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2605: Omit the sort and mini stress tests
..


Patch Set 2:

I guess I'm uneasy that we don't fully understand the cause of the problem. It 
seems like it's probably a test bug and I don't think we have a path to solving 
that. I looked again at the backtraces in the ticket and just see a bunch of 
threads waiting to read from the client connection.

My other thought is that, even if we disable these tests, will the same problem 
just happen with the other enabled stress tests.

Maybe we should dump out the value of some of these metrics before running the 
tests in tests/run-tests.py. That would help diagnose in future.

-- 
To view, visit http://gerrit.cloudera.org:8080/5401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 6: Code-Review+2

Rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3977: TransmitData() should not block

2016-12-14 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3977: TransmitData() should not block
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-mgr.cc
File be/src/runtime/data-stream-mgr.cc:

PS1, Line 53: STREAM_EXPIRATION_TIME_MS
Do you think we still need the stream cache? In the case where a receiver has 
completed before it receives an initial RPC, we should be able to rely on 
cancellation being detected on the sender side. The cost is that we might retry 
that initial RPC a couple of times before we receive cancellation from the 
coordinator.

Can you think this through and work out whether there's a state where the 
stream cache is really worth the extra complexity?


PS1, Line 99: if (acquire_lock) lock_.lock();
Can we just require now that the caller takes the lock?


http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

PS1, Line 209: TTransmitDataResult res;
Let's move this inside the loop so it's freshly initialized on each iteration. 
That avoids any bugs where one RPC returns EAGAIN and sets a field in the 
result that - for some reason - a subsequent successful RPC does not overwrite.


PS1, Line 217: DoTransmitDataRpc
Since a row batch can potentially be many megabytes large, I wonder if it makes 
sense to have an initial 'probe' RPC that has no payload, but just confirms 
that the receiver is ready. Then we can limit the retry logic to that probe 
only, and if a subsequent RPC returns EAGAIN, we know that the receiver is 
closed and don't need to retry.


Line 224: if (res.status.status_code != 
TErrorCode::DATASTREAM_RECEIVER_EAGAIN) break;
Is cancellation detected during this loop? That's part of the point of moving 
control back to the sender on EAGAIN, so that cancellation can be detected. 
Otherwise this blocks from the sender's perspective in the same way that it did 
before.


PS1, Line 225: VLOG_RPC << "Datastream receiver not yet ready. Retrying";
Not very informative without the fragment ID / destination address etc.


PS1, Line 226: FLAGS_datastream_sender_timeout_ms
rather than have the sleep period be a function of the total timeout, it makes 
more sense to have a fixed retry interval and exit this loop if the timeout has 
been exceeded.

For example, if someone sets the timeout to ten minutes (because they're being 
ultra conservative on a heavy cluster), this will retry every two minutes.


Line 228: 
I think it's a bit confusing to have this method return EAGAIN upon failure. 
Can you replace the status with timeout if it times-out of the loop?


-- 
To view, visit http://gerrit.cloudera.org:8080/5491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4648: remove build_thirdparty.sh
..


Patch Set 2:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/116/

-- 
To view, visit http://gerrit.cloudera.org:8080/5477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4648: remove build_thirdparty.sh
..


Patch Set 2: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/5477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 7:

> Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/120/

Sorry, that was me :-(

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430: enable codegen for native UDAs
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS6, Line 542: 
Expr::InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(intermediate_type()),
 : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, 
*uda_fn);
> I am not sure I understand the intention of this.
This is preserving the pre-existing behaviour of the UDA interface, which 
treats the last (in-out) argument as the return type in the function context. 
Changing how the types are exposed could break UDAs that were written against 
the old interface. I updated the comment to hopefully make this clearer.

I don't feel strongly about whether this is the right API (on one hand, it 
doesn't match the C++ function signature, on the other the intermediate type is 
logically the return type) but I don't think it's worth breaking compatibility 
for.

The constants aren't inlined into the input expressions, only to the UDA 
function itself, so the second scenario isn't possible.


http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS6, Line 327: !udf->isDeclaration()
> May help to add a comment on why this can be a declaration only (i.e. no fu
Done


PS6, Line 328: InlineConstants(AnyValUtil::ColumnTypeToTypeDesc(type_),
 : AnyValUtil::ColumnTypesToTypeDescs(arg_types), codegen, 
udf);
> Please see comments in AggFnEvaluator. It seems that we should be able to k
That was the initial path I took but it's not possible because UDAs treat the 
final in/out argument as the return type. I could pass in an additional 
argument to determine which mode it should use if you prefer.


PS6, Line 449:   DCHECK(has_varargs || arg_types.size() == num_fixed_args);
 :   DCHECK(!has_varargs || arg_types.size() > num_fixed_args);
> DCHECK(arg_types.size() == num_fixed_args ||  (has_var_args && arg_types.si
We actually want to reject the case of passing 0 varargs into a varargs 
function. The frontend currently rejects those cases and I added a comment to 
this function that specifically mentions this case.

It would be reasonable to pass 0 args into varargs functions, but a bunch of 
code here doesn't handle that correctly - it assumes that having 0 varargs 
means that the function signature doesn't have a varargs argument.


PS6, Line 478: codegen->void_type() :
 : CodegenAnyVal::GetLoweredType(codegen, *return_type);
> nit: one line ?
clang-format seems to prefer it this way. I don't feel strongly


http://gerrit.cloudera.org:8080/#/c/5161/6/be/src/exprs/scalar-fn-call.h
File be/src/exprs/scalar-fn-call.h:

PS6, Line 59: 'cache_entry'
> May help to have a comment about what 'cache_entry' is. Something like the 
Done


Line 60:   /// updated to point to the library (or its use count is incremented 
if already loaded).
> Please add a comment that the caller is expected to call FinalizeFunction()
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5161
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 7:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/120/

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7).

Change subject: IMPALA-1430: enable codegen for native UDAs
..

IMPALA-1430: enable codegen for native UDAs

This uses the existing infrastructure for codegening builtin UDAs and
for codegening calls to UDFs. GetUdf() is refactored to support both
UDFs and UDAs.

IR UDAs are still not allowed by the frontend. It's unclear if we want
to enable them going forward because of the difficulties in testing and
supporting IR UDFs/UDAs.

Testing:
test_udfs.py tests UDAs with codegen enabled and disabled

Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/agg-fn-evaluator.h
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
11 files changed, 163 insertions(+), 124 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5161
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.

2016-12-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that 
only run locally.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5446/3/tests/conftest.py
File tests/conftest.py:

Line 483: if "localhost" in pytest.config.option.impalad:
> or "127.0.0.1" ?
Any address starting with 127.x.x.x is the local machine I think.


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/117/

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4569: fuzz test fixes

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5502

Change subject: IMPALA-4569: fuzz test fixes
..

IMPALA-4569: fuzz test fixes

* Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory
  consumption to ~5GB per daemon(assuming 10 concurrent tests).
* Refactor the exec option handling to use the exec_option dimension.
  This avoids executing the test multiple times redundantly
* Remove the xfails to reduce noise, since there is no immediate plan to
  fix the product bugs. Instead just pass the tests.

Testing:
Ran in a loop for ~1h to flush out flakiness.

Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 23 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5502/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5503

Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem
..

IMPALA-4630: Change broken gflag default for Kudu client mem

We discovered that the current kudu client defaults in the
kudu table sink are causing a large number of queries to
fail. The current mutation buffer size is 100MB, which is
results in the higher write throughput than Kudu can
currently handle on large clusters. By decreasing the
default value of this flag, we found tests to be more
successful on 200 node tests than the previous 100MB value.
This value appeared to work well on 9 node stress tests as
well.

Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
2 files changed, 4 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/5503/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4630: Change broken gflag default for Kudu client mem

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4630: Change broken gflag default for Kudu client mem
..


Patch Set 1:

In addition to stress testing and the 200 node tests from Mostafa, this also 
passed a private test job.

-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4659: fuzz test fixes

2016-12-14 Thread Tim Armstrong (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5502

to look at the new patch set (#3).

Change subject: IMPALA-4659: fuzz test fixes
..

IMPALA-4659: fuzz test fixes

* Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory
  consumption to ~5GB per daemon(assuming 10 concurrent tests).
* Refactor the exec option handling to use the exec_option dimension.
  This avoids executing the test multiple times redundantly
* Remove the xfails to reduce noise, since there is no immediate plan to
  fix the product bugs. Instead just pass the tests.

Testing:
Ran in a loop for ~1h to flush out flakiness.

Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 41 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5502/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

PS1, Line 388: Preconditions.checkNotNull(value == null);
Shouldn't this be checkNotNull(value)?


-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Matthew Jacobs (Code Review)
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5505

to look at the new patch set (#2).

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..

IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

The KuduScanNode attempts to push IN list predicates to the
Kudu scan, but NULL literals cannot be pushed. The code in
KuduScanNode needed to check if the Literals in the
InPredicate is a NullLiteral, in which case the entire IN
list should not be pushed to Kudu.

The same handling is already in place for binary predicate
pushdown.

Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
---
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
2 files changed, 20 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5505/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] CDH-48291: Fix flaky test TestRequestPoolService

2016-12-14 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5507

Change subject: CDH-48291: Fix flaky test TestRequestPoolService
..

CDH-48291: Fix flaky test TestRequestPoolService

This commit fixes a flaky test caused by tight timeouts. The fix is to
bump up the timeouts in this test.

Change-Id: I71a036d5b1227508e2175808d7e92837ae247d99
---
M fe/src/test/java/org/apache/impala/util/TestRequestPoolService.java
1 file changed, 2 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/5507/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5507
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I71a036d5b1227508e2175808d7e92837ae247d99
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: Add all build targets to CMake and speed up builds
..


Add all build targets to CMake and speed up builds

Use CMake's dependency resolution always instead of serial execution of
targets via shell scripts.  This improves parallelism by building fe,
be, and other targets at the same time and avoid some overhead from
invoking "make" multiple times. This reduces the time taken for
an incremental compilation of fe and be from 56s to 24s with this
command:

  ./buildall.sh -debug -noclean -notests -skiptests -ninja

Also use Impala-lzo's build script. This depends on the IMPALA-4277
fixes to the Impala-lzo build script.

Log directory creation is also moved from impala-config.sh to
buildall.sh. This means that impala-config.sh has no side-effects and
can be run concurrently with no issues.

Also make sure that "make" builds all the same artifacts as buildall.sh
when run with no args.

Testing:
Ran a jenkins core job, also experimented locally. Ran a jenkins core
job with distcc disabled - this exposed some concurrency bugs where
impala-config.sh fails if run concurrently.

Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Reviewed-on: http://gerrit.cloudera.org:8080/4790
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M CMakeLists.txt
M be/src/benchmarks/CMakeLists.txt
M be/src/service/CMakeLists.txt
M bin/impala-config.sh
M bin/make_impala.sh
M buildall.sh
M ext-data-source/CMakeLists.txt
M fe/CMakeLists.txt
8 files changed, 85 insertions(+), 66 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 7: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5505

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..

IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

The KuduScanNode attempts to push IN list predicates to the
Kudu scan, but NULL literals cannot be pushed. The code in
KuduScanNode needed to check if the Literals in the
InPredicate is a NullLiteral, in which case the entire IN
list should not be pushed to Kudu.

The same handling is already in place for binary predicate
pushdown.

Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
---
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
2 files changed, 15 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5505/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5093/4//COMMIT_MSG
Commit Message:

Line 24: following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
> I agree you can achieve the same thing with mod values in principle. Let me
In order to calculate %rows, we don't actually need to know how many rows are 
in the table. %rows = 100 / mod_value. This is why I think it still makes sense 
to keep the mod_values. Another way to think of mod_value is if mod_value = N, 
then the DML statement should affect every Nth row.


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#7).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 689 insertions(+), 269 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4659: fuzz test fixes

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4659: fuzz test fixes
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5502/2//COMMIT_MSG
Commit Message:

PS2, Line 7: IMPALA-4569
> wrong jira?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4654: KuduScanner must return when ReachedLimit()

2016-12-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4654: KuduScanner must return when ReachedLimit()
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-14 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 6: Verified+1

http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/121/

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

PS1, Line 384: prediates
> Typo: predicates
Done


http://gerrit.cloudera.org:8080/#/c/5505/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

PS1, Line 319: select id from functional_kudu.alltypestiny where id in (1, 
null);
> Is it necessary to add additional tests for other types? (string, bool, etc
Sure


-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#6).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 689 insertions(+), 269 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5093

to look at the new patch set (#6).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 689 insertions(+), 269 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5093/4/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

Line 1481:   "UPDATE a SET {update_list} FROM {table_name} a JOIN 
{table_name}_original b "
> To me it's not really about validating the results, but more about predicta
Done. Skipping creation of DML queries for tables with more than 1 primary key 
column.


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5093/6/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

Line 1471:   # TODO: IMPALA-: Add support for tables with multiple 
primary keys.
> fill in real JIRA or just leave the TODO
Yes, I'll fill in the Jira. Michael wants every TODO to have a Jira attached.


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4649: add a mechanism to pass flags into make

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4649: add a mechanism to pass flags into make
..


Patch Set 2: Code-Review+1

Rebased, carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/5480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I17b13cbaf395f962762d5cff3d650ffb077934a4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4648: remove build_thirdparty.sh
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5093/4//COMMIT_MSG
Commit Message:

Line 24: following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
> In order to calculate %rows, we don't actually need to know how many rows a
Ahh yes, you are right. Thanks for clarifying. Not sure what I was thinking. I 
was clearly wrong.


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..


Patch Set 2: Code-Review+1

Thanks for adding the additional test cases. My +1 means I'm satisfied that you 
addressed my comments, but obviously someone should take a look again at your 
FE changes.

-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4659: fuzz test fixes

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4659: fuzz test fixes
..


Patch Set 3: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4659: fuzz test fixes

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4659: fuzz test fixes
..


IMPALA-4659: fuzz test fixes

* Apply a 512m mem_limit to all fuzz tests. This limits aggregate memory
  consumption to ~5GB per daemon(assuming 10 concurrent tests).
* Refactor the exec option handling to use the exec_option dimension.
  This avoids executing the test multiple times redundantly
* Remove the xfails to reduce noise, since there is no immediate plan to
  fix the product bugs. Instead just pass the tests.

Testing:
Ran in a loop for ~1h to flush out flakiness.

Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Reviewed-on: http://gerrit.cloudera.org:8080/5502
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M tests/query_test/test_scanners_fuzz.py
1 file changed, 41 insertions(+), 25 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4649: add a mechanism to pass flags into make

2016-12-14 Thread Tim Armstrong (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5480

to look at the new patch set (#2).

Change subject: IMPALA-4649: add a mechanism to pass flags into make
..

IMPALA-4649: add a mechanism to pass flags into make

Testing:
Tested that buildall.sh works as expected. Built locally with
IMPALA_MAKE_FLAGS unset to confirm I didn't break anything.

Built locally with IMPALA_MAKE_FLAGS=--load-average=$IMPALA_BUILD_THREADS
and looked at "ps auxf" output to confirm it's passed through.

Change-Id: I17b13cbaf395f962762d5cff3d650ffb077934a4
---
M bin/impala-config.sh
M bin/make_impala.sh
M testdata/bin/copy-udfs-udas.sh
3 files changed, 8 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/5480/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I17b13cbaf395f962762d5cff3d650ffb077934a4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4659: fuzz test fixes

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4659: fuzz test fixes
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5502
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1942ceef252ec3e6171a0a54722b66a7d9abbd7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4654: KuduScanner must return when ReachedLimit()

2016-12-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4654: KuduScanner must return when ReachedLimit()
..


IMPALA-4654: KuduScanner must return when ReachedLimit()

Fixes a bug in the KuduScanner where the scan node's limit
was not respected and thus the scanner thread would
continue executing until the scan range was fully consumed.
This could result in completed queries leaving fragments
running and those threads could be using significant CPU and
memory.

For example, the query 'select * from tpch_kudu.lineitem
limit 90' when running in the minicluster and lineitem is
partitioned into 3 hash partitions would end up leaving a
scanner thread running for ~60 seconds. In real world
scenarios this can cause unexpected resource consumption.
This could build up over time leading to query failures if
these queries are submitted frequently.

The fix is to ensure KuduScanner::GetNext() returns with
eos=true when it finds ReachedLimit=true. An unnecessary and
somewhat confusing flag 'batch_done' was being returned by a
helper function DecodeRowsIntoRowBatch, which isn't
necessary and was removed in order to make it more clear how
the code in GetNext() should behave.

Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de
Reviewed-on: http://gerrit.cloudera.org:8080/5493
Reviewed-by: Alex Behm 
Reviewed-by: Tim Armstrong 
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M tests/query_test/test_kudu.py
3 files changed, 19 insertions(+), 28 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, but someone else must approve
  Dan Hecht: Looks good to me, approved
  Tim Armstrong: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/5493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-4648: remove build thirdparty.sh

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4648: remove build_thirdparty.sh
..


IMPALA-4648: remove build_thirdparty.sh

It is not needed by any build processes.

Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3
Reviewed-on: http://gerrit.cloudera.org:8080/5477
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
D bin/build_thirdparty.sh
1 file changed, 0 insertions(+), 244 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I36e11384877e5115430baf0170b31a508cb01ba3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

PS1, Line 388: Preconditions.checkNotNull(value == null);
> Shouldn't this be checkNotNull(value)?
It depends, if you want this Precondition to do anything then yes :)
Good catch. Fixing this too.


-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4662: Fix NULL literal handling in Kudu IN list predicates

2016-12-14 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4662: Fix NULL literal handling in Kudu IN list 
predicates
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5505/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

PS1, Line 384: prediates
Typo: predicates


http://gerrit.cloudera.org:8080/#/c/5505/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

PS1, Line 319: select id from functional_kudu.alltypestiny where id in (1, 
null);
Is it necessary to add additional tests for other types? (string, bool, etc.)


-- 
To view, visit http://gerrit.cloudera.org:8080/5505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaf2c10a326373ad80aef51a85cec64071daefa7b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5093/6/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

Line 1471:   # TODO: IMPALA-: Add support for tables with multiple 
primary keys.
fill in real JIRA or just leave the TODO


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.h
File be/src/runtime/bufferpool/suballocator.h:

Line 104:   static constexpr int64_t MAX_ALLOCATION_BYTES = 1L << 
LOG_MAX_ALLOCATION_BYTES;
> this is really large (entire virtual address space on current x64 implement
after reading the rest of the code I see it's used so to bound the depth of the 
tree and number of free lists. Maybe comment thats what it's for.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.cc
File be/src/runtime/bufferpool/suballocator.cc:

Line 137:   return Status::OK();
Now that I read one of Dan's comments, I wonder if this should be an error 
returned here. Otherwise, it looks like this method never returns an error and 
could just have a void return type.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.

2016-12-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that 
only run locally.
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4639: Add pytest option and xfail markers for tests that only run locally.

2016-12-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4639: Add pytest option and xfail markers for tests that 
only run locally.
..


IMPALA-4639: Add pytest option and xfail markers for tests that only run 
locally.

As we're beginning to run Impala end-to-end tests on remote clusters, we're
finding some tests that do not pass for infrastructure-related reasons (as
opposed to product issues.) It would be useful to be able to xfail any tests
that we know to be problematic within a given module, yet still run the
others. This way, we can get passing test runs as we're ironing out those
infrastructure issues.

Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Reviewed-on: http://gerrit.cloudera.org:8080/5446
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M tests/conftest.py
M tests/metadata/test_compute_stats.py
M tests/query_test/test_mt_dop.py
3 files changed, 24 insertions(+), 0 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem

2016-12-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem
..


IMPALA-4633: Change broken gflag default for Kudu client mem

We discovered that the current Kudu client defaults in the
KuduTableSink are causing a large number of queries to
timeout, failing the query. The current default value of the
'mutation buffer size' is 100MB which results in higher
write throughput than Kudu can currently handle on large
clusters.  By decreasing the value of this flag, more RPCs
will be sent for the same amount of data, i.e. throttling
the load on Kudu. We found tests to be more successful on
200 nodes with a 10MB buffer size than the previous 100MB
value where most queries couldn't complete due to timeouts.
These queries were not timing out with the 10MB value. This
appears to work well on 9 node stress tests as well.

Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Reviewed-on: http://gerrit.cloudera.org:8080/5503
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
2 files changed, 5 insertions(+), 4 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 19:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/123/

-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 19: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Hello Jim Apple, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4715

to look at the new patch set (#18).

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..

IMPALA-3200: Implement suballocator for splitting buffers

This is useful for situations like hash tables, where we want to
make multiple non-spillable allocations of variable size from buffer
pool memory and not incur the overhead of interacting with the global
buffer pool. The allocator subdivides buffers to service allocations
and uses a buddy allocation algorithm to merge freed allocations into
larger chunks. This helps avoid external fragmentation and is quite
effective at reusing memory given the typical doubling allocation
patterns of hash tables in partitioned aggs and joins.

Testing:
The allocator has fairly robust internal consistency checks via
assertions and unique_ptrs. Includes a unit test that exercises
various allocation patterns.

I tested porting hash tables over to allocate memory using an earlier
version of the suballocator, which worked well (was able to run a
wide range of queries successfully with good performance).

Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
---
M be/src/common/names.h
M be/src/exec/hash-table.h
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/suballocator-test.cc
A be/src/runtime/bufferpool/suballocator.cc
A be/src/runtime/bufferpool/suballocator.h
6 files changed, 857 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/18
-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 15:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.cc
File be/src/runtime/bufferpool/suballocator.cc:

Line 107: 
> But once we switch to using NEVER, will we still need WHEN_NEEDED?
I pulled it out of this patch. I'll keep the code around, would be easy to add 
back in in future.


http://gerrit.cloudera.org:8080/#/c/4715/15/be/src/runtime/bufferpool/suballocator.cc
File be/src/runtime/bufferpool/suballocator.cc:

Line 50: }
> missing space
Done


http://gerrit.cloudera.org:8080/#/c/4715/15/be/src/runtime/bufferpool/suballocator.h
File be/src/runtime/bufferpool/suballocator.h:

PS15, Line 128: bein
> missing space
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4715/15/be/src/runtime/bufferpool/suballocator.cc
File be/src/runtime/bufferpool/suballocator.cc:

Line 50: }
missing space


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4633: Change broken gflag default for Kudu client mem

2016-12-14 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4633: Change broken gflag default for Kudu client mem
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b3544f9a93c82e347f6e97540d6b561c30d09fd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3202,IMPALA-2298: rework scratch file I/O

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#15).

Change subject: IMPALA-3202,IMPALA-2298: rework scratch file I/O
..

IMPALA-3202,IMPALA-2298: rework scratch file I/O

Refactor BufferedBlockMgr/TmpFileMgr to push more I/O logic into
TmpFileMgr, in anticipation of it being shared with BufferPool.
TmpFileMgr now handles:
* Scratch space allocation and recycling
* Read and write I/O

The interface is also greatly changed so that it is built around Write()
and Read() calls, abstracting away the details of temporary file
allocation from clients. This means the TmpFileMgr::File class can
be hidden from clients.

Write error recovery:
Also implement write error recovery in TmpFileMgr.

If an error occurs while writing to scratch and we have multiple
scratch directories, we will try one of the other directories
before cancelling the query. File-level blacklisting is used to
prevent excessive repeated attempts to resize a scratch file during
a single query. Device-level blacklisting is still disabled because
it is problematic to permanently take a scratch directory out of use.

To reduce the number of error paths, all I/O errors are now handled
asynchronously. Previously errors creating or extending the file were
returned synchronously from WriteUnpinnedBlock(). This required
modifying DiskIoMgr to create the file if not present when opened.

Future Work:
* Support for recycling variable-length scratch file ranges. I omitted
  this to avoid making the patch even large.

Testing:
Updated BufferedBlockMgr unit test to reflect changes in behaviour:
* Scratch space is no longer permanently associated with a block, and
  is remapped every time a new block is written to disk .
* Files are now blacklisted - updated existing tests and enable the
  disable blacklisting test.

Added some basic testing of recycling of scratch file ranges in
the TmpFileMgr unit test.

I also manually tested the code in two ways. First by removing permissions
for /tmp/impala-scratch and ensuring that a spilling query fails cleanly.
Second, by creating a tiny ramdisk (16M) and running with two scratch
directories: one on /tmp and one on the tiny ramdisk. When spilling, an
out of space error is encountered for the tiny ramdisk and impala spills
the remaining data (72M) to /tmp.

Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17
---
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/query-state.cc
A be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
A be/src/util/buffer-ref.h
M be/src/util/filesystem-util.cc
M be/src/util/filesystem-util.h
M common/thrift/ImpalaInternalService.thrift
15 files changed, 1,151 insertions(+), 506 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/5141/15
-- 
To view, visit http://gerrit.cloudera.org:8080/5141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Hello Michael Brown, Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5093

to look at the new patch set (#8).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 691 insertions(+), 269 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/8
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Hello Jim Apple, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4715

to look at the new patch set (#17).

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..

IMPALA-3200: Implement suballocator for splitting buffers

This is useful for situations like hash tables, where we want to
make multiple non-spillable allocations of variable size from buffer
pool memory and not incur the overhead of interacting with the global
buffer pool. The allocator subdivides buffers to service allocations
and uses a buddy allocation algorithm to merge freed allocations into
larger chunks. This helps avoid external fragmentation and is quite
effective at reusing memory given the typical doubling allocation
patterns of hash tables in partitioned aggs and joins.

Testing:
The allocator has fairly robust internal consistency checks via
assertions and unique_ptrs. Includes a unit test that exercises
various allocation patterns.

I tested porting hash tables over to allocate memory using an earlier
version of the suballocator, which worked well (was able to run a
wide range of queries successfully with good performance).

Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
---
M be/src/common/names.h
M be/src/exec/hash-table.h
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/suballocator-test.cc
A be/src/runtime/bufferpool/suballocator.cc
A be/src/runtime/bufferpool/suballocator.h
6 files changed, 856 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/17
-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 17
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 14:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.cc
File be/src/runtime/bufferpool/suballocator.cc:

Line 68: 
> extraneous blank
Done


Line 80: DCHECK(free_node != nullptr);
> it looks like this dcheck can fire if SplitToSize() fails to allocate a nod
Good point - SplitToSize() should return an error in that case. Fixed.


Line 107:   }
> why do we need both policies?
WHEN_NEEDED will be convenient for porting the existing exec nodes, and I think 
we'll need NEVER in future when exec nodes want to have better control over 
their reservations.

If we want to remove NEVER for the time being that also works.


Line 137:   return Status::OK();
> Now that I read one of Dan's comments, I wonder if this should be an error 
Yep this was a mistake.


Line 216:   *ptr_from_prev = move(node->next_free_);
> should this reset result->prev_free_ so we don't have a dangling reference,
Good point - Done


http://gerrit.cloudera.org:8080/#/c/4715/14/be/src/runtime/bufferpool/suballocator.h
File be/src/runtime/bufferpool/suballocator.h:

Line 31: /// allocation algorithm optimised for power-of-two allocations. Uses 
a binary buddy
> does it require that the buffer pool buffers themselves are power of two? w
Expanded to be clearer about how min_buffer_len plays into this and what is a 
power of two. 

We require that we can allocate power-of-two buffers from the buffer pool.


Line 104:   static constexpr int64_t MAX_ALLOCATION_BYTES = 1L << 
LOG_MAX_ALLOCATION_BYTES;
> after reading the rest of the code I see it's used so to bound the depth of
Done


PS14, Line 125: 'node' should already be free and not in
  :   /// any free list
> what does this mean given that free nodes are in the free list? do you mean
It's really referring to node->in_use so updated to reflect.


PS14, Line 159: support
> supported?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4715

to look at the new patch set (#15).

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..

IMPALA-3200: Implement suballocator for splitting buffers

This is useful for situations like hash tables, where we want to
make multiple non-spillable allocations of variable size from buffer
pool memory and not incur the overhead of interacting with the global
buffer pool. The allocator subdivides buffers to service allocations
and uses a buddy allocation algorithm to merge freed allocations into
larger chunks. This helps avoid external fragmentation and is quite
effective at reusing memory given the typical doubling allocation
patterns of hash tables in partitioned aggs and joins.

Testing:
The allocator has fairly robust internal consistency checks via
assertions and unique_ptrs. Includes a unit test that exercises
various allocation patterns.

I tested porting hash tables over to allocate memory using an earlier
version of the suballocator, which worked well (was able to run a
wide range of queries successfully with good performance).

Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
---
M be/src/common/names.h
M be/src/exec/hash-table.h
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/suballocator-test.cc
A be/src/runtime/bufferpool/suballocator.cc
A be/src/runtime/bufferpool/suballocator.h
6 files changed, 941 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/15
-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 15: Code-Review+1

carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-14 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#8).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 691 insertions(+), 269 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/8
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4654: KuduScanner must return when ReachedLimit()

2016-12-14 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4654: KuduScanner must return when ReachedLimit()
..


Patch Set 2:

> Change looks good to me.
 > 
 > As discussed, I think there is still one remaining bug with plans
 > like scan+topn where the coordinator may not necessarily cancel
 > fragments containing scans. In that case Close() is called on the
 > plan tree during tear down, but unlike other scan nodes the Kudu
 > scan node does not use done_ to terminate the scanner threads. For
 > non-Kudu scan nodes setting done_ triggers teardown of all scanner
 > threads. The scanner threads detect this indirectly via
 > ScannerContext::cancelled() which is checked on a per-row-batch
 > basis.

I can't repro this yet (topn doesn't do it) so I'm hesitant to make the change 
in this CR, but I agree we should do it separately. It's worth digging into the 
topn case and seeing why it doesn't repro, but I can't spend too much time on 
that right now. I'll make a separate change to check done_ and that way we 
don't have to take it into a release branch until it has more bake time / time 
to find a repro. This does deserve more attention but given I can't find an 
obvious bug I can't work on it right now.

 > 
 > Todd has a good point. Our limit check is broken. Agree to deal
 > with that everywhere in s separate change.

Filed https://issues.cloudera.org/browse/IMPALA-4658

-- 
To view, visit http://gerrit.cloudera.org:8080/5493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaddd5a1b2647995d68e6d37d0500b3a322de
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No