[Impala-ASF-CR] IMPALA-3436: Return a decimal when rounding a double

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8398 )

Change subject: IMPALA-3436: Return a decimal when rounding a double
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py@285
PS3, Line 285:   [['round_v1','dround_v1'],
> Users typically aren't happy with workarounds that involve changing their q
There are plenty of edge cases though where users that relied on decimal_v1 
behaviour would have to modify their queries to use decimal_v2 semantics, so 
that's unavoidable when we remove decimal_v1.

My concern was that in this case, even if they can modify their queries, then 
there is no rounding function that can take a variable precision.


http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py@285
PS3, Line 285:   [['round_v1','dround_v1'],
> If users rely on that behavior they can use DECIMAL_V1.
We're not keeping decimal_v1 around indefinitely though are we?

Seems more of a pain to file a JIRA and go through code review again than to 
just add an alias to the existing floating-point rounding function.



--
To view, visit http://gerrit.cloudera.org:8080/8398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I240ec70da7996bbfefcf6438eba4dff8d33d7bd6
Gerrit-Change-Number: 8398
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 23:03:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3436: Return a decimal when rounding a double

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8398 )

Change subject: IMPALA-3436: Return a decimal when rounding a double
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8398/3/be/src/exprs/decimal-functions-ir.cc
File be/src/exprs/decimal-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8398/3/be/src/exprs/decimal-functions-ir.cc@119
PS3, Line 119: Decimal16Value
> Yes, the result is a decimal with precision 38, so it always requires 16 by
Got it


http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py@285
PS3, Line 285:   #   Althernatively, consider renaming "round_v1" to 
"fround".
> Ok, are you suggesting to leave the code as is right now and add "fround" w
Might as well add it now if we're going to add it.



--
To view, visit http://gerrit.cloudera.org:8080/8398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I240ec70da7996bbfefcf6438eba4dff8d33d7bd6
Gerrit-Change-Number: 8398
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 22:00:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 7:

Thanks for the contribution!


--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 7
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 21:57:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2235: Fix current db when shell auto-reconnects

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8368 )

Change subject: IMPALA-2235: Fix current db when shell auto-reconnects
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40dfa00ba0314d356fe8617446f516505c925e5e
Gerrit-Change-Number: 8368
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:24:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2235: Fix current db when shell auto-reconnects

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8368 )

Change subject: IMPALA-2235: Fix current db when shell auto-reconnects
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40dfa00ba0314d356fe8617446f516505c925e5e
Gerrit-Change-Number: 8368
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:24:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8510 )

Change subject: IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8510/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8510/2//COMMIT_MSG@17
PS2, Line 17: run runtime tmp-file-mgr-test
Can you also run openssl-util-test, buffer-pool-test and 
buffered-tuple-stream-test?



--
To view, visit http://gerrit.cloudera.org:8080/8510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib97939f2334838263364b53ef3413871638bf53e
Gerrit-Change-Number: 8510
Gerrit-PatchSet: 2
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mike Yoder 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:15:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8510 )

Change subject: IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8510/2/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/8510/2/be/src/util/openssl-util.cc@102
PS2, Line 102:   int success = 0;
We can just combine line 102 and 108 to avoid this unnecessary initialisation.



--
To view, visit http://gerrit.cloudera.org:8080/8510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib97939f2334838263364b53ef3413871638bf53e
Gerrit-Change-Number: 8510
Gerrit-PatchSet: 2
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mike Yoder 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:09:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8510 )

Change subject: IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow
..


Patch Set 2:

(1 comment)

Added Mike Yoder since he originally wrote this code and has some expertise in 
the matter. At a high level, does switching from CFB to CTR in Impala's 
spill-to-disk make sense?

http://gerrit.cloudera.org:8080/#/c/8510/2/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/8510/2/be/src/util/openssl-util.cc@107
PS2, Line 107:   //   (2). well-optimized(instruction level parallelism) with 
hardware acceleration on x86 or PowerPC
long line



--
To view, visit http://gerrit.cloudera.org:8080/8510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib97939f2334838263364b53ef3413871638bf53e
Gerrit-Change-Number: 8510
Gerrit-PatchSet: 2
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mike Yoder 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:08:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5052: Read and write signed integer logical types in Parquet

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8548 )

Change subject: IMPALA-5052: Read and write signed integer logical types in 
Parquet
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8548/1/testdata/data/signed_integer_logical_types.parquet
File testdata/data/signed_integer_logical_types.parquet:

PS1:
Can you add a description of this file to the readme - i.e. what it has in it 
and how it was generated? I see that the commit message has some info but i 
should be in the readme too.


http://gerrit.cloudera.org:8080/#/c/8548/1/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/8548/1/tests/query_test/test_insert_parquet.py@295
PS1, Line 295: colummn
column


http://gerrit.cloudera.org:8080/#/c/8548/1/tests/query_test/test_insert_parquet.py@325
PS1, Line 325: execute_query
should these (above and below as well) be execute_query_expect_success?


http://gerrit.cloudera.org:8080/#/c/8548/1/tests/query_test/test_insert_parquet.py@331
PS1, Line 331: result_dst = self.execute_query("describe %s" % dst_tbl)
> If we got both FE and BE wrong (mapping smallint to int32 for example) I th
+1. I think the test would be a little easier to understand too if we asserted 
the column types directly after creating each table.



--
To view, visit http://gerrit.cloudera.org:8080/8548
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47a8371858c9597c6a440808cf6f933532468927
Gerrit-Change-Number: 8548
Gerrit-PatchSet: 1
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:06:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 10: Code-Review+2

Rebased onto the ConditionVariable change


--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 18:47:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-15 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8408

to look at the new patch set (#10).

Change subject: IMPALA-6121: remove I/O mgr request context cache
..

IMPALA-6121: remove I/O mgr request context cache

This simplifies the lifecycle of the request contexts and eliminates
some code. The comments claim that request context cache improves
performance when allocating smallish the objects. But allocating
from TCMalloc's thread caches should scale much better than a
global object pool protected by a lock.

I needed to move the definition to a non-internal header file so that it
was visible to clients that manage it by unique_ptr.

We also do not need to transfer the request contexts to the RuntimeState
since I/O buffers do not leave scanners now.

Testing:
Ran exhaustive tests.

Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
A be/src/runtime/disk-io-mgr-reader-context.h
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
16 files changed, 652 insertions(+), 811 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8408/10
--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-15 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8408

to look at the new patch set (#9).

Change subject: IMPALA-6121: remove I/O mgr request context cache
..

IMPALA-6121: remove I/O mgr request context cache

This simplifies the lifecycle of the request contexts and eliminates
some code. The comments claim that request context cache improves
performance when allocating smallish the objects. But allocating
from TCMalloc's thread caches should scale much better than a
global object pool protected by a lock.

I needed to move the definition to a non-internal header file so that it
was visible to clients that manage it by unique_ptr.

We also do not need to transfer the request contexts to the RuntimeState
since I/O buffers do not leave scanners now.

Testing:
Ran exhaustive tests.

Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
A be/src/runtime/disk-io-mgr-reader-context.h
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
16 files changed, 651 insertions(+), 811 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8408/9
--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/disk-io-mgr-reader-context.h@27
PS8, Line 27:  but some clients may need to include this header, e.g. to make 
the
: /// unique_ptr destructor work correctly.
: ///
> that's really unfortunate, but the change is a net win.
I think exposing it will make more sense after a follow-on patch. There are a 
number of methods in DiskIoMgr that are just indirections to RequestContext 
methods and I think just letting clients call them directly makes more sense.


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@654
PS7, Line 654:   std::unique_ptr 
RegisterContext(MemTracker* reader_mem_tracker);
> I wasn't thinking in terms of implementation. I was thinking in terms of th
Ah ok I understand now


http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

http://gerrit.cloudera.org:8080/#/c/8408/8/be/src/runtime/runtime-state.h@39
PS8, Line 39: class DiskIoRequestContext;
> delete
Done



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 18:27:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 6
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 18:23:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 5:

(2 comments)

I had one nit but just fixed it myself since I don't think it's worth doing 
another iteration to fix.

http://gerrit.cloudera.org:8080/#/c/8489/4/be/src/codegen/codegen-callgraph.cc
File be/src/codegen/codegen-callgraph.cc:

http://gerrit.cloudera.org:8080/#/c/8489/4/be/src/codegen/codegen-callgraph.cc@41
PS4, Line 41: for (llvm::Use& u: val->uses())
> I think we usually stick to using no space between ":" like
IN the backend code, clang-format likes to put spaces on both sides, so that's 
what we've standardised on. The frontend code only has a trailing space. It 
doesn't really matter much though.


http://gerrit.cloudera.org:8080/#/c/8489/5/be/src/exprs/scalar-expr.cc
File be/src/exprs/scalar-expr.cc:

http://gerrit.cloudera.org:8080/#/c/8489/5/be/src/exprs/scalar-expr.cc@260
PS5, Line 260: // Compute the alignment of this value. Values should be 
self-aligned for
This formatting is weird - the old formatting was better.



--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 5
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 15 Nov 2017 18:23:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#6) to the change originally 
created by Kim Jin Chul. ( http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..

IMPALA-6084: Avoid using of global namespace for llvm

There are a large number of places in the backend where
we import everything from the llvm namespace into the
global namespace. (e.g. using namespace llvm;)

Here are the reasons why we don't prefer this:
* It could make symbol conflicts if a newly added code
has same symbole name.
* It makes code readability uncomfortable. We may not
recognize a symbol came from.

We adopt a sequence of namespace specifiers in each use case.

Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-callgraph.cc
M be/src/codegen/codegen-symbol-emitter.cc
M be/src/codegen/codegen-util.cc
M be/src/codegen/instruction-counter-test.cc
M be/src/codegen/instruction-counter.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/filter-context.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/select-node.cc
M be/src/exec/text-converter.cc
M be/src/exec/topn-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/agg-fn.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/types.cc
M be/src/util/tuple-row-compare.cc
42 files changed, 1,417 insertions(+), 1,384 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/8489/6
--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 6
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8549 )

Change subject: IMPALA-1144: Fix exception when cancelling query in 
Impala-shell with CTRL-C
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8549/3/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8549/3/shell/impala_client.py@319
PS3, Line 319:   elif query_state == self.query_state["EXCEPTION"]:
Do we also need to check is_cancelled here? In the case of a long-running DML 
query like an insert, not sure if an error can bubble up here.


http://gerrit.cloudera.org:8080/#/c/8549/3/shell/impala_client.py@334
PS3, Line 334:   if self.is_query_cancelled:
How do we avoid the two threads racing?

E.g.
* Fetch thread checks is_query_cancelled, sees False, then starts fetch RPC
* Cancellation thread sets is_query_cancelled to True and sends Cancelled RPC 
around the same time
* Fetch fails, either with CANCELLED or invalid query handle
* The cancelled or invalid query handle bubbles up to the client.

Assignment of variables is atomic in Python, so I think it's sufficient to 
re-check is_query_cancelled after the fetch RPC returns, but we should add some 
concise comments to explain how concurrent fetch and cancel work.


http://gerrit.cloudera.org:8080/#/c/8549/3/shell/impala_client.py@335
PS3, Line 335: self.is_query_cancelled = False
Why does this need to be set back to False? Seems confusing since the query is 
still cancelled and we haven't started a new one at this point. Isn't setting 
it in execute_query() sufficient.


http://gerrit.cloudera.org:8080/#/c/8549/3/shell/impala_client.py@355
PS3, Line 355:   def close_dml(self, last_query_handle):
I think some of these other methods might also be subject to the same bug - it 
seems like these RPCs could race with cancellation too.



--
To view, visit http://gerrit.cloudera.org:8080/8549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Gerrit-Change-Number: 8549
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 15 Nov 2017 17:45:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3436: Return a decimal when rounding a double

2017-11-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8398 )

Change subject: IMPALA-3436: Return a decimal when rounding a double
..


Patch Set 3:

(3 comments)

Didn't look at the frontend part.

http://gerrit.cloudera.org:8080/#/c/8398/3/be/src/exprs/decimal-functions-ir.cc
File be/src/exprs/decimal-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8398/3/be/src/exprs/decimal-functions-ir.cc@119
PS3, Line 119: Decimal16Value
Do we always need Decimal16? Or should we switch on ByteSize() like the other 
functions?


http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/8398/3/common/function-registry/impala_functions.py@285
PS3, Line 285:   [['round_v1','dround_v1'],
I wonder if we should keep the functionality around under another alias in case 
someone is actually using it with a variable precision. It's an edge case but I 
can imagine someone having a dimension table that specifies formatting, or 
having some clever logic to round things to different levels of precision. It 
would be nice to give them a way out if decimal_v2 breaks them.

We use the "f" prefix in some cases like this, e.g. fmod.


http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/8398/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@407
PS1, Line 407: Type == nu
situation



--
To view, visit http://gerrit.cloudera.org:8080/8398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I240ec70da7996bbfefcf6438eba4dff8d33d7bd6
Gerrit-Change-Number: 8398
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 10 Nov 2017 01:29:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2235: Fix current db when shell auto-reconnects

2017-11-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8368 )

Change subject: IMPALA-2235: Fix current db when shell auto-reconnects
..


Patch Set 3:

Phil, did you want to take another look at this?


--
To view, visit http://gerrit.cloudera.org:8080/8368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40dfa00ba0314d356fe8617446f516505c925e5e
Gerrit-Change-Number: 8368
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 10 Nov 2017 01:08:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 4:

I'm ok with using llvm:: in those files, but was going to let Bikram also weigh 
in.


--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 10 Nov 2017 00:41:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6054: Parquet dictionary pages should be freed on dictionary construction

2017-11-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8436 )

Change subject: IMPALA-6054: Parquet dictionary pages should be freed on 
dictionary construction
..


Patch Set 4:

(4 comments)

Looks close.

http://gerrit.cloudera.org:8080/#/c/8436/4/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8436/4/be/src/exec/parquet-column-readers.cc@934
PS4, Line 934: NULL
nullptr


http://gerrit.cloudera.org:8080/#/c/8436/4/be/src/exec/parquet-column-readers.cc@942
PS4, Line 942:   // 3. If the column type is not string, and the dictionary 
page is not compressed,
I'm not 100% confident that we have test coverage for this case - we have 
somewhat limited testing of uncompressed parquet (the parquet/none tests are 
actually snappy-compressed since that's the default).

I think the main coverage we have is tests that insert into parquet and read it 
back. I took a look and it seems like tests/query_test/test_insert_parquet.py 
should exercise this code path, but could you confirm? E.g. add a log message 
on each branch and make sure that they appear in the logs.


http://gerrit.cloudera.org:8080/#/c/8436/4/be/src/exec/parquet-column-readers.cc@944
PS4, Line 944:
nit: unnecessary blank line


http://gerrit.cloudera.org:8080/#/c/8436/4/be/src/exec/parquet-column-readers.cc@964
PS4, Line 964: )
nit: != nullptr (we prefer explicit checks against null to make it visually 
obvious that it's a null check).



--
To view, visit http://gerrit.cloudera.org:8080/8436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d9d5f4da1028d961155dafdac0028a1c3641004
Gerrit-Change-Number: 8436
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 10 Nov 2017 00:39:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-09 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8408

to look at the new patch set (#8).

Change subject: IMPALA-6121: remove I/O mgr request context cache
..

IMPALA-6121: remove I/O mgr request context cache

This simplifies the lifecycle of the request contexts and eliminates
some code. The comments claim that request context cache improves
performance when allocating smallish the objects. But allocating
from TCMalloc's thread caches should scale much better than a
global object pool protected by a lock.

I needed to move the definition to a non-internal header file so that it
was visible to clients that manage it by unique_ptr.

We also do not need to transfer the request contexts to the RuntimeState
since I/O buffers do not leave scanners now.

Testing:
Ran exhaustive tests.

Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
A be/src/runtime/disk-io-mgr-reader-context.h
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
16 files changed, 651 insertions(+), 810 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/8408/8
--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h@25
PS7, Line 25: /// Internal per request-context state. This object maintains a 
lot of state that is
> so no one should include this file except for disk io mgr, is that right? i
Added a comment explaining that. Some clients need to include it to make the 
unique_ptr destructor work (that destructor has a static assert that 
sizeof(DiskIoRequestContext) > 0).


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.cc
File be/src/runtime/disk-io-mgr-reader-context.cc:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.cc@108
PS7, Line 108:   state_ = Inactive;
> why is it that Cancel() has to do so much more work than this, when this su
This function calls Cancel() on line 95.


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@654
PS7, Line 654:   std::unique_ptr 
RegisterContext(MemTracker* reader_mem_tracker);
> I'm not really sure what it means to "register" a context. Should this just
I think it's an implementation detail the RegisterContext() doesn't register it 
in any data structures. UnregisterContext() *does* have to track down and 
remove references from I/O manager internal data structures.


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662
PS7, Line 662:   void UnregisterContext(DiskIoRequestContext* context);
> Yeah, except that the statement "context cannot be used after this call" is
"unregister" to me means removing all references to it from the I/O manager. I 
think the name conveys that the I/O manager may hold references to it. Reworked 
the comment a bit.

I can imagine adding operations to DiskIoRequestContext that would be 
reasonable to call after UnregisterContext(), e.g. exposing the current state 
of the context.

I'm fine with switching to passing in the unique_ptr, can do that in the next 
PS if it seems appropriate.



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 23:53:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..


Patch Set 7:

(3 comments)

I kept on pulling at the error propagation string and I've realised that we 
need a few more changes to get this into a good state where it's more clearly 
correct. I'm going to continue working on that and post a patchset once I'm 
done.

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc@797
PS4, Line 797:
> So is the fix here still correct? Here only a range is cancelled and the cl
I don't think it's correct. I also realised that the error propagation is weird 
and could be simplified, so I'm going to simplify both things in a new patchset.


http://gerrit.cloudera.org:8080/#/c/8414/7/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/8414/7/be/src/runtime/exec-env.cc@38
PS7, Line 38: #include "runtime/io/disk-io-mgr.h"
> include order
Done


http://gerrit.cloudera.org:8080/#/c/8414/7/be/src/runtime/io/disk-io-mgr-stress.cc
File be/src/runtime/io/disk-io-mgr-stress.cc:

http://gerrit.cloudera.org:8080/#/c/8414/7/be/src/runtime/io/disk-io-mgr-stress.cc@204
PS7, Line 204: if (clients_[i].reader != nullptr) {
> Is it possible? ClientThread set reader to NULL in L165.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 19:28:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

I also don't want to block progress on making this good improvement to our S3 
development, but my fear is that this script will get more and more out of 
control if we don't draw a line somewhere about what belongs in it.


--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 02:22:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z 
${AWS_SECRET_ACCESS_KEY-} ]]); then
> I'm not sure that putting the checks into a separate script would change an
Lines 252-264 seem ok - I don't have a problem with setting environment 
variables in this script.


http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z 
${AWS_SECRET_ACCESS_KEY-} ]]); then
> In most of the use cases the network access is avoided, it happens only if
Yeah, I'd question whether "aws ls" belongs in this script either.

The performance is one thing but I just generally think we should restrict this 
script to doing the minimum possible to set up environment variables. I don't 
see why it should be extended to do heavyweight things to validate the 
configuration - we don't do anything to validate the vast majority of other 
variables that are set in this script.

It does appear that people have added validations in an ad-hoc way before so if 
a majority of people think that that's a good idea, I can yield to that.

We should also keep in mind that this script is a crappy programming 
environment, because it's running in the context of the user's shell and we 
can't use things like "set -x" and have to be careful setting variables that we 
don't intend to leak into the user's shell.

So I think regardless this logic would be more maintainable in a separate 
script to validate the AWS config. My preference is that we also run that 
script from elsewhere to keep impala-config.sh lightweight but if other people 
feel strongly that impala-config.sh should be doing more validation of configs, 
etc then that's not the worst thing in the world.


http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@303
PS2, Line 303: CURL_ARGS
This variable will leak out into the user's shell.



--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 02:21:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5237: support custom string in date/time format

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8508 )

Change subject: IMPALA-5237: support custom string in date/time format
..


Patch Set 1:

(9 comments)

This is cool!

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc@5724
PS1, Line 5724:   TestValue("unix_timestamp('1970|01|01 00|00|00', '|MM|dd 
HH|mm|ss')", TYPE_BIGINT,
I think we also need to add tests for parsing with format strings including 
quoted text.


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/exprs/expr-test.cc@5746
PS1, Line 5746:   TestError("from_unixtime(0, 'HHH\\'')");
Can you add a test for an unmatched '\. Or if one of the above tests covers 
that case, can you add a comment?


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.h@77
PS1, Line 77: /// The following features are not supported:
Can you update this comment to include an example of the feature you added?


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@160
PS1, Line 160: if ('\'' == *str) {
nit: *str == '\'' to match the convention below


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@161
PS1, Line 161: comment_begin
maybe 'string_literal' instead of 'comment'?


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@162
PS1, Line 162:   do {
Can you add a comment here explaining that we're matching a string literal, and 
maybe give an example of the input that it's matching.


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@164
PS1, Line 164: // Meet end of string
This comment isn't to informative. Maybe:

  // Hit end of string before finding a matching quote.


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@165
PS1, Line 165: if (str == str_end) {
Nit: 1 line:

  if (str == str_end) return false;


http://gerrit.cloudera.org:8080/#/c/8508/1/be/src/runtime/timestamp-parse-util.cc@168
PS1, Line 168:   } while ('\'' != *str);
nit: *str != '\'' to match the convention below



--
To view, visit http://gerrit.cloudera.org:8080/8508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178
Gerrit-Change-Number: 8508
Gerrit-PatchSet: 1
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 01:40:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

PS7:
> I think you can use "git mv" to produce better diff in code review.
Unfortunately git mv doesn't add any metadata - the rename detection is still 
based on file similarity heuristics (not sure why that didn't work in this 
case).



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 01:25:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 3:

@Bikram I think that would be convenient, except LLVM has a lot of stuff in its 
namespace and it seems to change from release to release. E.g. one thing that 
motivated this was that LLVM defines a make_unique() function in its namespace, 
which conflicts with the std:: one.


--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 09 Nov 2017 01:15:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] Expose $IMPALA MAVEN OPTIONS for configuring Maven.

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8496 )

Change subject: Expose $IMPALA_MAVEN_OPTIONS for configuring Maven.
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8496
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c62185476fd2388c7cda8884276b79a77370127
Gerrit-Change-Number: 8496
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 23:17:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc@797
PS4, Line 797:   
reader->disk_states_[disk_queue->disk_id].DecrementRequestThread();
> I spent some more time looking at this and I'm increasingly thinking that t
So I understand the rule now. Once the DiskIoRequestContext is cancelled, the 
last thread to leave the building must turn off the lights. If it's cancelled 
you need to call DecrementRequestThreadAndCheckDone(). If it's not cancelled,  
you need to call DecrementRequestThread(). Both functions assume that the 
caller holds lock_, so there's no reason we couldn't just combine the functions 
and check the status.

The reason that HandleReadFinished() works is that it checks for the request 
context being cancelled earlier in the critical section and exits early.



--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 23:06:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6170: Remove broken backend test from llvm-codegen-test

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8505 )

Change subject: IMPALA-6170: Remove broken backend test from llvm-codegen-test
..


Patch Set 1:

We should definitely merge this to unblock the broken tests though


--
To view, visit http://gerrit.cloudera.org:8080/8505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaed0109f5163343427015d571d6d24233b9d3fdc
Gerrit-Change-Number: 8505
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 22:35:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6170: Remove broken backend test from llvm-codegen-test

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8505 )

Change subject: IMPALA-6170: Remove broken backend test from llvm-codegen-test
..


Patch Set 1: Code-Review+2

It's unfortunately that we don't have any test coverage for the original bug. 
It would be good to put a little more effort to see if we can reproduce it 
without depending on HDFS. E.g. can we use a file:// URL to point to a module 
on the local filesystem and trigger the bug by calling LibCache::DropCache() 
between linking attempts. I'm not sure but it seems worth trying.


--
To view, visit http://gerrit.cloudera.org:8080/8505
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaed0109f5163343427015d571d6d24233b9d3fdc
Gerrit-Change-Number: 8505
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 22:35:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc@797
PS4, Line 797:   
reader->disk_states_[disk_queue->disk_id].DecrementRequestThread();
> That's a good question. I don't understand whether it's a bug or not. It's
I spent some more time looking at this and I'm increasingly thinking that this 
isn't the right way to propagate the error. I've been looking at how other 
errors are propagated to clients and it's deeply weird - I think it somehow 
depends on the cancelled scan range being repeatedly scheduled until the 
request context is cancelled.



--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 22:23:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc@797
PS4, Line 797:   
reader->disk_states_[disk_queue->disk_id].DecrementRequestThread();
> The scan range might be cancelled in DiskIoMgr::HandleReadFinished as well.
That's a good question. I don't understand whether it's a bug or not. It's 
definitely suspicious. I think that in some case re-scheduling the scan range 
is enough to get the cleanup to happen next time the scan range is picked up, 
but I'm not sure. I'm looking at simplifying error propagation here (amongst 
other things) so this should change to something more obviously correct. 
Currently it's really confusing because the error is propagated in 
HandleReadFinished() by both ScanRange::Cancel() and enqueueing a 
BufferDescriptor with the error status.



--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 21:42:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 3:

(1 comment)

I started looking through the first few files and it looks good to me so far. 
Maybe Bikram can continue the review though when he has time.

http://gerrit.cloudera.org:8080/#/c/8489/3/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/8489/3/be/src/codegen/codegen-anyval.cc@479
PS3, Line 479: llvm::cast
search-and-replace gone wrong



-- 
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:52:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Expose $IMPALA MAVEN OPTIONS for configuring Maven.

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8496 )

Change subject: Expose $IMPALA_MAVEN_OPTIONS for configuring Maven.
..


Patch Set 1:

We call mvn in a few other places, e.g. testdata/bin/split-hbase.sh, 
testdata/bin/generate-load-nested.sh,  testdata/bin/generate-block-ids.sh. 
Should those also be updated?


--
To view, visit http://gerrit.cloudera.org:8080/8496
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c62185476fd2388c7cda8884276b79a77370127
Gerrit-Change-Number: 8496
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:49:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6171: Revert "IMPALA-1575: part 2: yield admission control resources"

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8499


Change subject: IMPALA-6171: Revert "IMPALA-1575: part 2: yield admission 
control resources"
..

IMPALA-6171: Revert "IMPALA-1575: part 2: yield admission control resources"

This reverts commit fe90867d890c71bfdcf8ff941f8ec51e36083f25.

Change-Id: I3eec4b5a6ff350933ffda0bb80949c5960ecdf25
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/query-state.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/client-request-state.cc
M tests/custom_cluster/test_admission_controller.py
9 files changed, 94 insertions(+), 192 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/8499/1
--
To view, visit http://gerrit.cloudera.org:8080/8499
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3eec4b5a6ff350933ffda0bb80949c5960ecdf25
Gerrit-Change-Number: 8499
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z 
${AWS_SECRET_ACCESS_KEY-} ]]); then
> Does impala-config.sh really have to talk to the internet? It just tends to
(I think it's fine if this is put in a separate script and called from 
buildall.sh or testdata/bin/run-all.sh or something like that)



--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:01:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-11-08 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 2: Code-Review-1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8294/2/bin/impala-config.sh@294
PS2, Line 294:   if (set +x; [[ -z ${AWS_ACCESS_KEY_ID-} && -z 
${AWS_SECRET_ACCESS_KEY-} ]]); then
Does impala-config.sh really have to talk to the internet? It just tends to 
cause problems if impala-config.sh does things aside from setting environment 
variables.



--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 2
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 18:01:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 16: Code-Review+2

Forgot that this needed a coordinated change with the Impala-lzo plugin. That 
has one callsite:

 context_->ReleaseCompletedResources(nullptr, false);

So I just added a wrapper function with the old API that we can remove once 
it's not needed.


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 06:45:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#16).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 144 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/16
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 15: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:06:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#14).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 137 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/14
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h@274
PS13, Line 274: //
> nit: ///
Done



--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:06:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4:

Looks like it hit IMPALA-6092 plus some network flakiness trying to send RPCs 
to itself. I don't see how they're likely to be related to the change.


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:50:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:21:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

PS7:
> I assume this was a wholesale move of the class without modifications, exce
There was some minor cleanup - initialising member variables inline, adding a 
DCHECK in the destructor, etc. I created a diff here to see how the class 
definition changed: 
https://gist.github.com/timarmstrong/f640dd200260556e1167d8ca69a67f51/revisions?diff=split


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662
PS7, Line 662:   void UnregisterContext(DiskIoRequestContext* context);
> Given that the context cannot be used after this call, should we move the u
I don't feel strongly. I think your argument has merit but destroying the data 
structure goes (somewhat) against the idea that we shouldn't tear down control 
structures until the very end of the query.



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:10:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..


Patch Set 3: Code-Review+2

Thanks Tim W.


--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 07 Nov 2017 20:54:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc
File be/src/exec/hdfs-scanner-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc@61
PS11, Line 61: return 0;
> Is there a reason why we don't treat this as parse error and returns -1 ?
No that's a good point. I changed it.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h@347
PS11, Line 347:   /// Returns -1 if parsing should be aborted due to parse 
errors.
> May help to also add "Returns 0 if copying strings into 'pool' failed'.
Clarified that it could be a memory allocation error. Per the other comment, it 
probably makes most sense to just return -1.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc
File be/src/exec/hdfs-sequence-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc@a282
PS11, Line 282:
> It may be trivial but does it make sense to add DCHECK(row_batch->num_tuple
The assumption of 1 tuple per row for scans is pretty deeply baked into the 
scanners and planner, so it seemed silly to have this speculative generality 
here. I added a DCHECK to HdfsScanner to document that the assumption is valid 
for all HDFS scanners.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@849
PS11, Line 849: tuples_returned
> There is a subtle behavior here if Tuple::CopyStrings() failed in WriteAlig
Yeah I think I just made a mistake returning 0. Returning -1 makes more sense. 
I manually tested with this query:
[localhost:21000] > set num_nodes=1; set disable_codegen=1; set 
mem_limit=5m; select * from widetable_1000_cols;
NUM_NODES set to 1
DISABLE_CODEGEN set to 1
MEM_LIMIT set to 5m
Query: select * from widetable_1000_cols
Query submitted at: 2017-11-07 11:52:01 (Coordinator: 
http://tarmstrong-box:25000)
Query progress can be monitored at: 
http://tarmstrong-box:25000/query_plan?query_id=9948285e1aa8e172:f73fe971
ERROR: Memory limit exceeded: HdfsScanner::WriteAlignedTuples() failed to 
allocate 125 bytes for strings.
HDFS_SCAN_NODE (id=0) could not allocate 125.00 B without exceeding limit.
Error occurred on backend tarmstrong-box:22000 by fragment 
9948285e1aa8e172:f73fe971
Memory left in process limit: 8.01 GB
Memory left in query limit: 947.59 KB
Query(9948285e1aa8e172:f73fe971): Limit=5.00 MB Reservation=0 
ReservationLimit=0 OtherMemory=4.07 MB Total=4.07 MB Peak=4.07 MB
  Fragment 9948285e1aa8e172:f73fe971: Reservation=0 
OtherMemory=4.07 MB Total=4.07 MB Peak=4.07 MB
HDFS_SCAN_NODE (id=0): Total=4.07 MB Peak=4.07 MB
PLAN_ROOT_SINK: Total=0 Peak=0

It would be nice to turn it into an end-to-end test but I cant see how to make 
it robustly fail in that place.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@857
PS11, Line 857: num_tuples
> Why is this not tuples_returned here ? Is there any guarantee max_added_tup
I think num_fields is tracking the number of fields consumed from the input.



--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:54:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#13).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 137 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/13
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h@184
PS6, Line 184: //
> Can remove this TODO
Done



--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:55:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8424

to look at the new patch set (#7).

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..

IMPALA-4835 (prep only): create io subfolder and namespace

Instead of using the DiskIoMgr class as a namespace, which prevents
forward-declaration of inner classes, create an impala::io namespace
and unnested the inner class.

This is done in anticipation of DiskIoMgr depending on BufferPool. This
helps avoid a circular dependency between DiskIoMgr, TmpFileMgr and
BufferPool headers that could not be broken with forward declarations.

Testing:
Ran core tests.

Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
---
M be/CMakeLists.txt
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/io/CMakeLists.txt
R be/src/runtime/io/disk-io-mgr-internal.h
R be/src/runtime/io/disk-io-mgr-stress-test.cc
R be/src/runtime/io/disk-io-mgr-stress.cc
R be/src/runtime/io/disk-io-mgr-stress.h
R be/src/runtime/io/disk-io-mgr-test.cc
R be/src/runtime/io/disk-io-mgr.cc
A be/src/runtime/io/disk-io-mgr.h
R be/src/runtime/io/handle-cache.h
R be/src/runtime/io/handle-cache.inline.h
R be/src/runtime/io/request-context.cc
R be/src/runtime/io/request-context.h
A be/src/runtime/io/request-ranges.h
R be/src/runtime/io/scan-range.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
38 files changed, 1,416 insertions(+), 1,311 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/8424/7
-- 
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 14: Code-Review+1

Rebased, carry Lars' +1. Lars are you comfortable with bringing this to a +2?


--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:12:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8267

to look at the new patch set (#14).

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..

IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

Switch the decoders to using more batch-oriented interfaces. As an
intermediate step this doesn't make the interfaces of LevelDecoder
or DictDecoder batch-oriented, only the lower-level utility classes.

The next step would be to change those interfaces to be batch-oriented
and make according optimisations in parquet. This could deliver much
larger perf improvements than the current patch.

The high-level changes are.
* BitReader -> BatchedBitReader, which is built to unpack runs of 32
  bit-packed values efficiently.
* RleDecoder -> RleBatchDecoder, which exposes the repeated and literal
  runs to the caller and uses BatchedBitReader to unpack literal runs
  efficiently.
* Dict decoding uses RleBatchDecoder to decode repeated runs efficiently
  and uses the BitPacking utilities to unpack and encode in a single
  step.

Also removes an older benchmark that isn't too interesting (since
the batch-oriented approach to encoding and decoding is so much
faster than the value-by-value approach).

Testing:
* Ran core tests.
* Updated unit tests to exercise new code.
* Added test coverage for the deprecated bit-packed level encoding to
  that it still works (there was no coverage previously).

Perf:
Single-node benchmarks showed a few % performance gain. 16 node cluster
benchmarks only showed a gain for TPC-H nested.

Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
---
M be/src/benchmarks/CMakeLists.txt
M be/src/benchmarks/bit-packing-benchmark.cc
D be/src/benchmarks/rle-benchmark.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
D be/src/experiments/bit-stream-utils.8byte.h
D be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/util/bit-packing.h
M be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
M be/src/util/parquet-reader.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
M testdata/data/README
A testdata/data/alltypes_agg_bitpacked_def_levels.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
20 files changed, 1,251 insertions(+), 966 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/14
--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h@96
PS12, Line 96: for different encodings.
> what do we mean by that? Isn't this just for RLE?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:57:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8267

to look at the new patch set (#13).

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..

IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

Switch the decoders to using more batch-oriented interfaces. As an
intermediate step this doesn't make the interfaces of LevelDecoder
or DictDecoder batch-oriented, only the lower-level utility classes.

The next step would be to change those interfaces to be batch-oriented
and make according optimisations in parquet. This could deliver much
larger perf improvements than the current patch.

The high-level changes are.
* BitReader -> BatchedBitReader, which is built to unpack runs of 32
  bit-packed values efficiently.
* RleDecoder -> RleBatchDecoder, which exposes the repeated and literal
  runs to the caller and uses BatchedBitReader to unpack literal runs
  efficiently.
* Dict decoding uses RleBatchDecoder to decode repeated runs efficiently
  and uses the BitPacking utilities to unpack and encode in a single
  step.

Also removes an older benchmark that isn't too interesting (since
the batch-oriented approach to encoding and decoding is so much
faster than the value-by-value approach).

Testing:
* Ran core tests.
* Updated unit tests to exercise new code.
* Added test coverage for the deprecated bit-packed level encoding to
  that it still works (there was no coverage previously).

Perf:
Single-node benchmarks showed a few % performance gain. 16 node cluster
benchmarks only showed a gain for TPC-H nested.

Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
---
M be/src/benchmarks/CMakeLists.txt
M be/src/benchmarks/bit-packing-benchmark.cc
D be/src/benchmarks/rle-benchmark.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
D be/src/experiments/bit-stream-utils.8byte.h
D be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/util/bit-packing.h
M be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
M be/src/util/parquet-reader.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
M testdata/data/README
A testdata/data/alltypes_agg_bitpacked_def_levels.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
20 files changed, 1,252 insertions(+), 966 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/13
--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4: Code-Review+2

Thanks Zoltan - I really appreciate the cleanup.


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:49:26 +
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8491 )

Change subject: Bump Kudu version to 1520b39
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
Gerrit-Change-Number: 8491
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:33:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 8:

It's still technically accurate in that the queries consume some amount of 
memory, especially if the result cache is used. Could we tweak the wording 
somewhat, e.g. "To free any remaining resources they are using..."


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:32:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 6: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953
PS5, Line 953:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L);
> On the /queries webpage, I think "num_in_flight_queries" is this same numbe
Oh, I see, I didn't look at the queries json. The queries page seems internally 
inconsistent - num_in_flight_queries includes "queries in flight" and "queries 
waiting to be closed". But yeah, I deliberately avoided using the term "in 
flight" because it's unclear whether it should include queries that are waiting 
to be closed.

I don't like inventing a new term but this does seem less ambiguous going 
forward.

It would be nice to re-evaluate all the names but probably best to do that all 
at once instead of piecemeal, since it will require considering impact on other 
tools, like you mentioned.



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 4:

(1 comment)

Also added it to test_metrics_are_zero

http://gerrit.cloudera.org:8080/#/c/8461/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/4/be/src/service/impala-server.cc@871
PS4, Line 871: // Metric is decremented in UnregisterQuery().
> but UnregisterQuery() is called for more paths then just this one. So why i
Done



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 06:56:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Tim Armstrong (Code Review)
Hello Thomas Tauber-Marshall, Mostafa Mokhtar, Alex Behm, Dan Hecht, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8461

to look at the new patch set (#5).

Change subject: IMPALA-6151: add query-level fragment/backend counters
..

IMPALA-6151: add query-level fragment/backend counters

This adds NumBackends, NumFragments and NumFragmentInstances
counters to the query profile to make it easier to manually or
programmatically analyse the query.

Also add a num-queries-registered metric to track the number
of queries that have been executed but are not yet unregistered.

Testing:
Ran "select count(*) from alltypessmall" and checked profile:

Backend startup latencies: Count: 3, min / max: 1ms / 1ms, 25th %-ile: 1ms, 
50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th 
%-ile: 1ms
Per Node Peak Memory Usage: tarmstrong-box:22000(1.10 MB) 
tarmstrong-box:22001(1.02 MB) tarmstrong-box:22002(1.02 MB)
 - FiltersReceived: 0 (0)
 - FinalizationTimer: 0.000ns
 - NumBackends: 3 (3)
 - NumFragmentInstances: 4 (4)
 - NumFragments: 2 (2)

Ran some query tests (both beeswax and HS2) and manually checked the
num-queries-registered metric on the /metrics page when the queries
were running and after they finished. Added the metric to
test_metrics_are_zero() to make sure that there are no accounting
errors.

Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
---
M be/src/runtime/coordinator.cc
M be/src/service/impala-server.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
M tests/verifiers/metric_verifier.py
6 files changed, 37 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/8461/5
--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 7:

I took a brief look at how much work it would be. Seems doable but there's just 
a lot of code in this test and I don't realistically have time to spend on it 
right now.


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:52:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 7: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:51:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6155: Ensure TPC-DS result orders can't vary.

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Ensure TPC-DS result orders can't vary.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8484/2/testdata/workloads/tpcds/queries/tpcds-q77a.test
File testdata/workloads/tpcds/queries/tpcds-q77a.test:

http://gerrit.cloudera.org:8080/#/c/8484/2/testdata/workloads/tpcds/queries/tpcds-q77a.test@120
PS2, Line 120:  RESULTS
Another option is to use the VERIFY_IS_EQUAL_SORTED verifier instead of 
modifying the query. That's not very strict but might be better than diverging 
the queries further from the official TPC-DS versions. I haven't been following 
that closely so not sure what the general approach has been with modifications 
vs changing the verification logic.



--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:20:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen-test.cc
File be/src/codegen/llvm-codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen-test.cc@485
PS1, Line 485: NULL
nullptr


http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen.h@584
PS1, Line 584: LinkModule
Maybe LinkModuleFromLocalFs() or something like that to disambiguate it the one 
below.



--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 00:39:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 4:

That's a good point, will make that change. Cancelled merge in the meantime.


--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 00:10:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 6: Code-Review+1

carry +1


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:26:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.h@454
PS5, Line 454: probably released
> probably have been released
Done


http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8323/5/be/src/runtime/coordinator.cc@831
PS5, Line 831:
> nit: here and elsewhere, we seem to have gratuitous blank lines which makes
Done


http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8323/5/tests/custom_cluster/test_admission_controller.py@91
PS5, Line 91: ]
> there is also session close and session timeout. but those should both resu
Yeah I was checking the coverage bottom-up to make sure that all of the code 
paths in ClientRequestState/Coordinator were covered. Those two additional code 
paths go through UnregisterQuery(), same as the Close() path (and a bunch of 
other things).

It's not a bad idea to cover the other cases, but I think that would require 
switching the test to HS2, since beeswax doesn't have the same idea of sessions.

I guess we could drop the beeswax connection and check the state of the query 
via the debug server or similar, but then it would probably just make more 
sense to switch to HS2. I can have a crack at that but no idea how easy or hard 
that would be.



--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:26:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-06 Thread Tim Armstrong (Code Review)
Hello anujphadke, Joe McDonnell, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8323

to look at the new patch set (#6).

Change subject: IMPALA-1575: part 2: yield admission control resources
..

IMPALA-1575: part 2: yield admission control resources

This change releases admission control resources more eagerly,
once the query has finished actively executing. Some resources
(tracked and untracked) are still consumed by the client request
as long as it remains open, e.g. memory for control structures
and the result cache. However, these resources are relatively
small and should not block admission of new queries.

The same as in part 1, query execution is considered to be finished
under any of the following conditions:
1. The query encounters an error and fails
2. The query is cancelled due to the idle query timeout
3. The query reaches eos (or the DML completes)
4. The client cancels the query without closing the query

Admission control resources are released in two ways:
1. by calling AdmissionController::ReleaseQuery() on the coordinator
   promptly after query execution finishes, instead of waiting for
   UnregisterQuery(). This means that the query and its memory is
   no longer considered "admitted".
2. by changing the behaviour of MemTracker::GetPoolMemReserved() so
   that it is aware of when a query has finished executing and does not
   consider its entire memory limit to be "reserved".

The preconditions for releasing an admitted query are subtle because the
queries are being admitted to a distributed system, not just the
coordinator.  The comment for ReleaseAdmissionControlResources()
documents the preconditions and rationale. Note that the preconditions
are not weaker than the preconditions of calling UnregisterQuery()
before this patch.

Testing:
TestAdmissionController is extended to end queries in four ways:
cancellation by client, idle timeout, the last row being fetched,
and the client closing the query. The test uses a mix of all four.
After the query ends, all clients wait for the test to complete
before closing the query or closing the connection. This ensures
that the admission control decisions are based entirely on the
query end behavior. This test works for both query admission control
and mem_limit admission control and can detect both kinds of admission
control resources ("admitted" and "reserved") not being released
promptly.

This is based on an earlier patch by Joe McDonnell.

Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
---
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/query-state.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/client-request-state.cc
M tests/custom_cluster/test_admission_controller.py
9 files changed, 189 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8323/6
--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/3/be/src/service/impala-server.cc@1052
PS3, Line 1052:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(-1L);
> My understanding is that archiving can take a long time (e.g. huge profile)
I made the change in PS4



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:11:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-06 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#12).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 133 insertions(+), 211 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/12
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6105: Clarify argument order in single node perf run

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8470 )

Change subject: IMPALA-6105: Clarify argument order in single_node_perf_run
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639
Gerrit-Change-Number: 8470
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 21:25:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> I agree it's a little strange. But Cloudera issued a technical service bull
I think TSBs probably use a different scale though. Most of the medium severity 
issues here wouldn't warrant a TSB. This bug is definitely worse than the ABS 
bug below.



--
To view, visit http://gerrit.cloudera.org:8080/8418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 21:18:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Tim Armstrong (Code Review)
Hello Thomas Tauber-Marshall, Mostafa Mokhtar, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8461

to look at the new patch set (#4).

Change subject: IMPALA-6151: add query-level fragment/backend counters
..

IMPALA-6151: add query-level fragment/backend counters

This adds NumBackends, NumFragments and NumFragmentInstances
counters to the query profile to make it easier to manually or
programmatically analyse the query.

Also add a num-queries-registered metric to track the number
of queries that have been executed but are not yet unregistered.

Testing:
Ran "select count(*) from alltypessmall" and checked profile:

Backend startup latencies: Count: 3, min / max: 1ms / 1ms, 25th %-ile: 1ms, 
50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th 
%-ile: 1ms
Per Node Peak Memory Usage: tarmstrong-box:22000(1.10 MB) 
tarmstrong-box:22001(1.02 MB) tarmstrong-box:22002(1.02 MB)
 - FiltersReceived: 0 (0)
 - FinalizationTimer: 0.000ns
 - NumBackends: 3 (3)
 - NumFragmentInstances: 4 (4)
 - NumFragments: 2 (2)

Ran some queries and manually checked the num-queries-registered metric
on the /metrics page when the queries were running and after they
finished.

Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
---
M be/src/runtime/coordinator.cc
M be/src/service/impala-server.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
5 files changed, 36 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/8461/4
--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/3/be/src/service/impala-server.cc@1052
PS3, Line 1052:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(-1L);
> Why keep it above ArchiveQuery()?
At this point the query has been removed from all tracking structures so I 
figured that means that it's unregistered.

I don't really think it matters that much - is there a question that someone 
would use the metric to answer that would be affected by whether the query is 
archived or not?



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 17:48:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..


Patch Set 4:

(3 comments)

Testing found one subtle bug.

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr-reader-context.cc
File be/src/runtime/disk-io-mgr-reader-context.cc:

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr-reader-context.cc@50
PS4, Line 50:   DCHECK(buffer->scan_range()->external_buffer_tag_ == 
ScanRange::ExternalBufferTag::NO_BUFFER)
> long line
Done


http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc@462
PS4, Line 462:   // Buffers the were not allocated by DiskIoMgr don't need 
to be freed.
> existing typo: that
Done


http://gerrit.cloudera.org:8080/#/c/8414/4/be/src/runtime/disk-io-mgr.cc@797
PS4, Line 797:   
reader->disk_states_[disk_queue->disk_id].DecrementRequestThread();
There were a subtle bug here that could cause readers to get wedged. The 
problem is that we should call DecrementRequestThreadAndCheckDone() when the 
range is cancelled, since it's possible that the RequestContext was cancelled 
in the meantime and this is the last thread that needs to do the cleanup.

Ideally we would rethink the names of some of these functions and clarify the 
protocol but I don't want to get too distracted from the main point of this 
work.



--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 17:37:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-06 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#7).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Misc other refactoring and simplification.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
16 files changed, 252 insertions(+), 600 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/7
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-06 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#6).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Misc other refactoring and simplification.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
16 files changed, 250 insertions(+), 599 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/6
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 11: Code-Review+1

rebased to pick up IMPALA-6137 fix


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 17:31:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 6:

rebased


--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 17:32:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7: Code-Review+1

rebased


--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 17:32:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

2017-11-06 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#5).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Misc other refactoring and simplification.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
16 files changed, 252 insertions(+), 599 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/5
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 2: Code-Review+1

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/runtime/data-stream-recvr.cc
File be/src/runtime/data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/runtime/data-stream-recvr.cc@220
PS2, Line 220: unique_lock
This change is harmless but doesn't seem necessary.


http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/runtime/fragment-instance-state.cc@332
PS2, Line 332:   auto wait_duration = 
boost::posix_time::seconds(report_fragment_offset);
The use of "auto" here is probably marginal as far as our coding idioms go. I 
think this case is fine, but we're generally biased towards spelling out the 
type when it is reasonably concise and meaningful. I.e. we'd use auto to save 
typing and reading something like "vector::iterator", but usually avoid it 
when it's just a plain class name.


http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/runtime/fragment-instance-state.cc@337
PS2, Line 337: auto wait_duration = 
boost::posix_time::seconds(FLAGS_status_report_interval);
This is pre-existing but we should avoid shadowing variables.


http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/service/impala-server.cc@1800
PS2, Line 1800: session_timeout_cv_.WaitFor(timeout_lock, 
wait_duration);
nit: could just be a one liner.


http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/util/condition-variable.h
File be/src/util/condition-variable.h:

http://gerrit.cloudera.org:8080/#/c/8428/2/be/src/util/condition-variable.h@67
PS2, Line 67:   bool WaitFor(boost::unique_lock& lock,
Thanks, I like these new names.



--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 Nov 2017 17:26:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8461/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/2/be/src/service/impala-server.cc@993
PS2, Line 993:   RETURN_IF_ERROR(CancelInternal(query_id, check_inflight, 
cause));
> What if we hit an error here? Need to still maintain the metric.
The only reason CancelInternal() can fail is if the query isn't registered - 
otherwise this would be really broken.

By decrementing it below we're keeping the count in sync with the entries in 
client_request_state_map_, which seems better.


http://gerrit.cloudera.org:8080/#/c/8461/2/be/src/service/impala-server.cc@1008
PS2, Line 1008:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(-1L);
> To be most accurate, shouldn't we move this to the end of this function? Th
Done


http://gerrit.cloudera.org:8080/#/c/8461/2/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/8461/2/common/thrift/metrics.json@477
PS2, Line 477: "label": "Queries Registered",
> Imo we need to be careful with terminology. Our query states are already ha
I added to the description. I agree that we should be careful here - I picked 
this definition of queries since it's the most expansive and doesn't depend on 
the different query states directly.



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 04 Nov 2017 00:39:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-03 Thread Tim Armstrong (Code Review)
Hello Thomas Tauber-Marshall, Mostafa Mokhtar, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8461

to look at the new patch set (#3).

Change subject: IMPALA-6151: add query-level fragment/backend counters
..

IMPALA-6151: add query-level fragment/backend counters

This adds NumBackends, NumFragments and NumFragmentInstances
counters to the query profile to make it easier to manually or
programmatically analyse the query.

Also add a num-queries-registered metric to track the number
of queries that have been executed but are not yet unregistered.

Testing:
Ran "select count(*) from alltypessmall" and checked profile:

Backend startup latencies: Count: 3, min / max: 1ms / 1ms, 25th %-ile: 1ms, 
50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th 
%-ile: 1ms
Per Node Peak Memory Usage: tarmstrong-box:22000(1.10 MB) 
tarmstrong-box:22001(1.02 MB) tarmstrong-box:22002(1.02 MB)
 - FiltersReceived: 0 (0)
 - FinalizationTimer: 0.000ns
 - NumBackends: 3 (3)
 - NumFragmentInstances: 4 (4)
 - NumFragments: 2 (2)

Ran some queries and manually checked the num-queries-registered metric
on the /metrics page when the queries were running and after they
finished.

Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
---
M be/src/runtime/coordinator.cc
M be/src/service/impala-server.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
5 files changed, 36 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/8461/3
--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2494: Support for byte array encoded decimals in Parquet scanner

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7822 )

Change subject: IMPALA-2494: Support for byte array encoded decimals in Parquet 
scanner
..


Patch Set 9: Code-Review+2

LGTM. Probably best to wait until after the weekend to merge it though :).


--
To view, visit http://gerrit.cloudera.org:8080/7822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c0e881045109f337fecba53fec21f9cfb9e619e
Gerrit-Change-Number: 7822
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 04 Nov 2017 00:29:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..


Patch Set 2:

(12 comments)

Did a first pass over it.

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py
File be/src/codegen/gen_ir_descriptions.py:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py@99
PS2, Line 99:   ["HASH_FAST", "IrFastHash"],
Do we use this? I don't see any places where it's used currently.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@a468
PS2, Line 468:
Lol, weird.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@474
PS2, Line 474: partition_expr_evals_[j
Use 'eval' directly.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc@220
PS2, Line 220:   return HashUtil::FastHash64(v, 
static_cast(type.GetByteSize()), seed);
I think this might be slightly slower - with the previous approach I think we 
get a specialised version of the hash function for that input length for each 
data type, whereas here we have to go through another switch in GetByteSize().


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h
File be/src/util/hash-util.h:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@235
PS2, Line 235:   /* The MIT License
I think this should go at the top of the file beneath the apache license. Then 
we can just say that the FastHash64 implementation came with that license. E.g.

/*
  FastHash64 implementation derived from MIT-licensed code written by Zilong Tan

  The MIT License
  ...
*/


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@263
PS2, Line 263: size_t
use int64_t - we generally use signed integers for lengths.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@266
PS2, Line 266: (const uint64_t *)
We use c-style casts - i.e. reinterpret_cast<>


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@267
PS2, Line 267: pos
I believe the C++ standard doesn't allow pointer arithmetic on void - we should 
convert to uint8_t for doing that.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@278
PS2, Line 278:  (const unsigned char*
See above comments.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@282
PS2, Line 282: (uint64_t)
should use static_cast()


http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
File 
testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test:

http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test@145
PS2, Line 145: 8,'k1',-1,'k1',1
Why does this make a difference when we have VERIFY_IS_EQUAL_SORTED above?


http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test
File testdata/workloads/tpcds/queries/tpcds-q77a.test:

http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test@125
PS2, Line 125: 'catalog channel',NULL,538912.55,2050279.74,-1383554.73
I'll file a JIRA to make this test deterministically pass: IMPALA-6155. In the 
meantime, can you change this to

    RESULTS: VERIFY_IS_EQUAL_SORTED

That will make it ignore the order of results.



--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 04 Nov 2017 00:28:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4591: Bound Kudu client error mem usage

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8464 )

Change subject: IMPALA-4591: Bound Kudu client error mem usage
..


Patch Set 1:

(8 comments)

This is a nice improvement. Had some questions/ideas but as-is this seems 
strictly better than before.

http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.h
File be/src/exec/kudu-table-sink.h:

http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.h@105
PS1, Line 105: allocated
nit: "consumed" to be consistent with the memtracker terminology.


http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.h@106
PS1, Line 106: allocated_mem_
Maybe client_tracked_bytes_ to make it clearer that the unit is bytes and it's 
tracked for accounting purposes.


http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.cc@a52
PS1, Line 52:
Was this flag documented? Just wondering if we should consider what happens if 
someone set this manually.


http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.cc@39
PS1, Line 39: DEFINE_int32(kudu_mutation_buffer_size, 
DEFAULT_KUDU_MUTATION_BUFFER_SIZE,
Just throwing out ideas here, but did we think about pros/cons of making these 
query options? I guess mostly these defaults should be fine but sometimes it 
turns out to be inconvenient to only have a global setting (and to have to 
restart Impala for it to take effect.)


http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.cc@124
PS1, Line 124:   FLAGS_kudu_error_buffer_size > 1024 ? 
FLAGS_kudu_error_buffer_size : 1024;
Is this equivalent to the following?

  max(1024, FLAGS_kudu_error_buffer_size)


http://gerrit.cloudera.org:8080/#/c/8464/1/be/src/exec/kudu-table-sink.cc@132
PS1, Line 132:   
RETURN_IF_ERROR(state->exec_env()->GetKuduClient(table_desc_->kudu_master_addresses(),
 _));
nit: long line.


http://gerrit.cloudera.org:8080/#/c/8464/1/tests/custom_cluster/test_kudu.py
File tests/custom_cluster/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/8464/1/tests/custom_cluster/test_kudu.py@66
PS1, Line 66:   
@CustomClusterTestSuite.with_args(impalad_args="-kudu_error_buffer_size=1024")
It might be faster to make this a regular query test but insert more data so 
that it exceeds the default error buffer limit. Starting up a new cluster takes 
a long time (to be honest, this is fine though).


http://gerrit.cloudera.org:8080/#/c/8464/1/tests/custom_cluster/test_kudu.py@74
PS1, Line 74: prsent
present



--
To view, visit http://gerrit.cloudera.org:8080/8464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I186ddb3f3b5865e08f17dba57cf6640591d06b14
Gerrit-Change-Number: 8464
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:41:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 5:

Rebased since some things changed underneath this patch.


--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:24:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-03 Thread Tim Armstrong (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8424

to look at the new patch set (#5).

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..

IMPALA-4835 (prep only): create io subfolder and namespace

Instead of using the DiskIoMgr class as a namespace, which prevents
forward-declaration of inner classes, create an impala::io namespace
and unnested the inner class.

This is done in anticipation of DiskIoMgr depending on BufferPool. This
helps avoid a circular dependency between DiskIoMgr, TmpFileMgr and
BufferPool headers that could not be broken with forward declarations.

Testing:
Ran core tests.

Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
---
M be/CMakeLists.txt
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/io/CMakeLists.txt
R be/src/runtime/io/disk-io-mgr-internal.h
R be/src/runtime/io/disk-io-mgr-stress-test.cc
R be/src/runtime/io/disk-io-mgr-stress.cc
R be/src/runtime/io/disk-io-mgr-stress.h
R be/src/runtime/io/disk-io-mgr-test.cc
R be/src/runtime/io/disk-io-mgr.cc
A be/src/runtime/io/disk-io-mgr.h
R be/src/runtime/io/handle-cache.h
R be/src/runtime/io/handle-cache.inline.h
R be/src/runtime/io/request-context.cc
R be/src/runtime/io/request-context.h
A be/src/runtime/io/request-ranges.h
R be/src/runtime/io/scan-range.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
38 files changed, 1,417 insertions(+), 1,311 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/8424/5
-- 
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12
PS1, Line 12: In this patch, the beharior is changed so that an error is 
produced in
> typo: beharior should be behavior
I'll also accept behaviour ;)


http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27
PS1, Line 27:   select avg(dec_38_19) from decimal_tbl
I guess these regressions occur regardless of the width of the input decimal, 
rigth?


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413
PS1, Line 413:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) {
This isn't correct if sum_val16 and src.val* have opposite sign, is it?


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420
PS1, Line 420:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) {
What do you think about only checking for overflow of the integer type at this 
point and checking for whether it fits in the decimal type during finalisation? 
Might be cheaper.

We could also probably do a cheaper initial check of sum_val16 versus as 
constant in the 4 and 8 byte cases since they can't possible overflow unless 
the sum is already quite large.



--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:21:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 15:

(8 comments)

Pretty close, have some formatting nits (nothing horrible, just trying to keep 
the codebase consistent) and one perf concern.

http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@54
PS15, Line 54: : dict_encoded_size_(0), pool_(NULL), dict_bytes_cnt_(0), 
dict_mem_tracker_(NULL) {}
Ok to leave for now but we've generally been moving towards using the recent 
C++ extension that allows initialising member variables to constants at their 
declartion.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@59
PS15, Line 59: DCHECK
DCHECK_EQ(dict_bytes_cnt_, 0);


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@77
PS15, Line 77:   void ClearIndices() {
We generally use the more concise one-line formatting for very short 
single-statement functions like this one.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@123
PS15, Line 123:
formatting nit: * goes on the left with the type name.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@168
PS15, Line 168: inline
nit: inline isn't necessary. It isn't harmful but can be confusing to have 
unnecessary modifiers.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@269
PS15, Line 269:  *
nit: * should be on left.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@330
PS15, Line 330: inline
nit: unnecessary inline


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@374
PS15, Line 374:   ConsumeBytes(sizeof(node));
I'm concerned that calling MemTracker::Consume for every node added to the 
table could hurt performance in some workloads since it will end up 
incrementing the memory consumption counter in the root process-wide MemTracker 
up to 40k times per dictionary.

(In contrast, MemPool::Allocate() is generally fine since it allocates memory 
in chunks).

How about we track the memory for some number of nodes at a time. E.g.

  const int NODE_MEM_TRACKING_GRANULARITY = 4096;
  ...
  if (nodes % NODE_MEM_TRACKING_GRANULARITY == 0) {
ConsumeBytes(sizeof(node) * NODE_MEM_TRACKING_GRANULARITY);
  }



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 15
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:07:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
Missed this on the first pass - shouldn't this be high?



--
To view, visit http://gerrit.cloudera.org:8080/8418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 22:48:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6144: PublishFilter() continues to run after query failure/cancellation

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8455 )

Change subject: IMPALA-6144: PublishFilter() continues to run after query 
failure/cancellation
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I400456ad85adb9c23d2d432d772311fa4dcff2ed
Gerrit-Change-Number: 8455
Gerrit-PatchSet: 2
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 22:45:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 6: Code-Review+1

Rebased since some of the patches under it changed.


--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 22:30:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8428/1/be/src/util/condition-variable.h
File be/src/util/condition-variable.h:

http://gerrit.cloudera.org:8080/#/c/8428/1/be/src/util/condition-variable.h@65
PS1, Line 65:   bool TimedWait(boost::unique_lock& lock,
> Now that I looked into the callsites more carefully, I realized that there
I like those names - good to distinguish between the two cases. Seems 
reasonable, especially considering there's precedent in the standard library.



--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 03 Nov 2017 22:22:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix errant, newline-including log directory.

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has removed a vote on this change.

Change subject: Fix errant, newline-including log directory.
..


Removed Verified-1 by Impala Public Jenkins (255)
--
To view, visit http://gerrit.cloudera.org:8080/8459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Id18404c3f3fc0008f67b01afbac9d908532fc317
Gerrit-Change-Number: 8459
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


  1   2   3   4   5   6   7   8   9   10   >