[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9274 )

Change subject: PREVIEW: build ORC C++ lib in toolchain
..


Patch Set 3: Code-Review+1

LGTM. I can build the orc patch based on this. Thank you for solving these!


--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 07:47:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9331 )

Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1941/


--
To view, visit http://gerrit.cloudera.org:8080/9331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9
Gerrit-Change-Number: 9331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 07:14:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9331 )

Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9
Gerrit-Change-Number: 9331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 06:49:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 25:

(33 comments)

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc@1445
PS25, Line 1445: ;
<< "Already provided a buffer" was nice


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@84
PS25, Line 84: GetNextRange: returns to the caller the next scan range it 
should process.
 : /// This is based on disk load. This also begins reading the 
data in this scan
 : /// range. This is blocking.
that could probably use updating? or maybe rephrase to talk about states or 
something rather than API calls.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@109
PS25, Line 109: Memory usage in the IoMgr comes from queued read buffers.  If 
we queue the minimum
  : /// (i.e. 1), then the disks are idle while we are processing 
the buffer. If we don't
  : /// limit the queue, then it possible we end up queueing the 
entire data set (i.e. CPU
  : /// is slower than disks) and run out of memory.
this seems a bit outdated


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@116
PS25, Line 116: In that case each query will need fewer queued buffers and
  : /// therefore have less memory usage.
  : //
and that


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@136
PS25, Line 136: a client buffer provided by the caller
  : /// when constructing the scan range.
in that case, is it required that the single buffer can fit the entire range?


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@147
PS25, Line 147: buffers are owned
how does a buffer become owned by the caller? I guess that means when it's 
returned by GetNext() but not yet ReturnBuffer()?  How does the caller know how 
many buffers there are in play in order to satisfy this requirement?  or does 
it have to just conservatively assume that it should only "own" one buffer at a 
time?

So it seems we effectively still have a client cooperation requirement in the 
new code (which I think is okay). It's just that instead of leading to a 
reservation violation, it would lead to a resource deadlock. I do agree the 
waiting on buffers rather than on queue space is an improvement though.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@149
PS25, Line 149: it can allocate at least three
  : /// max-sized buffers per scan range
the API below (AllocateBuffersForRange()) doesn't seem to give the caller 
(explicit) control over how many buffers are allocated. I guess it's implicit 
because of the maximum IO buffer size?


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@269
PS25, Line 269: ,
"on return"
or "If 'needs_buffers' is returned as true"
to make it clearer this is an out-only param


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@282
PS25, Line 282: range is
  :   /// available
what does it mean for a range to be available (from an API perspective)?


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@288
PS25, Line 288: 'needs_buffers' is set to true
this returns true in '*needs_buffers'


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@300
PS25, Line 300: allocate
allocated


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@474
PS25, Line 474:   /// Reads the specified scan range and calls 
HandleReadFinished() when done.
this should probably mention how it might not do anything if there's no buffer 
available


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc@375
PS25, Line 375:   // Add the range to the queue of the disk the range is on
that doesn't seem to match the code block


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc@387
PS25, Line 387:   // Can't schedule until the caller allocates some buffers for 
the range.
that comment seems to talk about only one case handled here


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc@391
PS25, Line 391: *needs_buffers ? ScheduleMode::BY_CALLER
why is this case different than when AddScanRanges() has NO_BUFFER (which does 
UPON_GETNEXT)?  I guess in this case we don't want it to go through the 

[Impala-ASF-CR] IMPALA-6008: Creating a UDF from a shared library with a .ll extenion crashes impala

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9154 )

Change subject: IMPALA-6008: Creating a UDF from a shared library with a .ll 
extenion crashes impala
..


Patch Set 3:

(4 comments)

Thanks for fixing this, it would be good to get this merged. The test coverage 
looks good and the fix is valid, I just noticed the test may be flaky and the 
structure of the error handling in the code could be improved to reduce the 
chance of similar mistakes in future.

http://gerrit.cloudera.org:8080/#/c/9154/3/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/9154/3/be/src/codegen/llvm-codegen.cc@223
PS3, Line 223: CreateFromMemory
This function also has the same bad pattern of not closing 'codegen' on errors 
- we should probably fix that too while we're here.


http://gerrit.cloudera.org:8080/#/c/9154/3/be/src/codegen/llvm-codegen.cc@1315
PS3, Line 1315:   Status status = CreateFromFile(nullptr, , nullptr, file, 
module_id, );
CreateFromFile() should really close the LlvmCodegen object on failure - it's 
awkward requiring the caller to do it. Might be easier to use the "goto error" 
pattern with a cleanup block at the end of the CreateFromFile() function.


http://gerrit.cloudera.org:8080/#/c/9154/3/tests/query_test/test_udfs.py
File tests/query_test/test_udfs.py:

http://gerrit.cloudera.org:8080/#/c/9154/3/tests/query_test/test_udfs.py@332
PS3, Line 332:   with open("bad_udf.ll", "w") as f:
The local and HDFS file creation will race if test_udf_errors() runs with 
itself concurrently. Would be best to use tempfile.mkstemp() for the local file 
and include the unique_database string in the path of the HDFS file.


http://gerrit.cloudera.org:8080/#/c/9154/3/tests/query_test/test_udfs.py@338
PS3, Line 338: check_call
Use call() instead of check_call(), since we want to execute the next cleanup 
step even on errors.



--
To view, visit http://gerrit.cloudera.org:8080/9154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id060668802ca9c80367cdc0e8a823b968d549bbb
Gerrit-Change-Number: 9154
Gerrit-PatchSet: 3
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 15 Feb 2018 06:29:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8363 )

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 15 Feb 2018 05:04:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5139: Update mvn-quiet.sh to print execution content to log file

2018-02-14 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9273 )

Change subject: IMPALA-5139: Update mvn-quiet.sh to print execution content to 
log file
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9273/5/bin/mvn-quiet.sh
File bin/mvn-quiet.sh:

http://gerrit.cloudera.org:8080/#/c/9273/5/bin/mvn-quiet.sh@35
PS5, Line 35: mvn "$@" | tee -a $LOG_FILE | grep -E -e WARNING -e ERROR -e 
SUCCESS -e FAILURE -e Test; then
I'll repeat my caveat about not being any sort of bash expert. I recommend 
running my comments past someone else -- e.g., Phil Z. or Michael.

I feel like there is something that is still confusing about this code, and 
some of it has to do with this patch, but a lot of it preceded your patch as 
well.

* I still think mvn being piped to mvn is an odd construction, and probably 
something to be avoided

* I think that the combined interaction between "set -eo pipefail" and the 
if-block and the piping is a little hard to reason about

* I strongly suspect that inside the if-block, as it stands, "echo [...] exited 
with code $?" will *always* show an exit code of 0, regardless of whether mvn 
failed or not

* That said, I do think that "exit 1" will execute if mvn failed, so at least 
the script will fail as expected


Stepping back for a minute to consider what we want this script to do, I think 
it's something like this:

* we don't want to see the verbose output of mvn, but we want to capture it in 
a log file

* however, if there are lines ERRORS or WARNINGS or SUCCESS or FAILURE, we want 
to send those lines to the console

* in the chain of mvn | tee | grep, we want to exit with a non-zero status if 
mvn fails, but we don't esepcially care about the exit status of tee and 
certainly not grep


I think that all of those things can be accomplished with something like this.

  set -uo pipefail  # i.e., get rid of set -e

  [...]

  mvn $IMPALA_MAVEN_OPTIONS "$@" | tee -a $LOG_FILE | \
  grep -E -e WARNING -e ERROR -e SUCCESS -e FAILURE -e Test

  MVN_EXIT_STATUS=${PIPESTATUS[0]}
  echo "mvn exited with ${MVN_EXIT_STATUS}"
  exit ${MVN_EXIT_STATUS}


Getting rid of "set -e" means that we can continue to execute in this script 
even if mvn fails, but without having to resort to "if mvn [...]; then...".

Everything gets logged, but only lines matching our grep pattern go to console.

mvn exit status is captured in ${PIPESTATUS[0]}, but tee and grep status are 
ignored.

The script exits with the same status as mvn.

This manages to avoid piping mvn to mvn.

FYI, this is the page I read about to clarify my understanding of pipefail and 
PIPESTATUS:
https://unix.stackexchange.com/questions/14270/get-exit-status-of-process-thats-piped-to-another



--
To view, visit http://gerrit.cloudera.org:8080/9273
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I475b17a4dccfa624dda61402491b461c53473f8b
Gerrit-Change-Number: 9273
Gerrit-PatchSet: 5
Gerrit-Owner: Nithya Janarthanan 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 15 Feb 2018 04:59:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 3:

This should have read:
> However, the dependencies are often *not* just between
 > the existence of the object but between the state of the bojects
 > (e.g. whether Init() was called yet), so it doesn't always help.


--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 03:44:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 3:

> (2 comments)
 >
 > > Patch Set 1:
 > >
 > > (6 comments)
 > >
 > > I went through the change and found that I would prefer to not
 > access dependencies through the ExecEnv singleton, but instead
 > inject them through the ctors and Init(). That way it feels easier
 > to me to follow the code. I don't feel strongly about it though.
 >
 > I'm curious how do others feel about this?
 >

My feeling is that if something is a singleton, then it's usually easier for me 
to follow the code if it's accessed directly rather than adding indirection by 
passing around pointers to it (which ultimately I'll often need to trace back 
to figure out where the object came from anyway).  Additionally, at least in 
some places, the architecture dictates some of these daemon services are 
singletons, so I don't see the advantage in pretending that they parametrizable.

On the other hand, passing by argument can help show dependencies between 
objects. However, the dependencies are often just between the existence of the 
object but between the state of the bojects (e.g. whether Init() was called 
yet), so it doesn't always help.


--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 03:43:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6508: add KRPC test flag

2018-02-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9291 )

Change subject: IMPALA-6508: add KRPC test flag
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py@264
PS7, Line 264: "-use_krpc=tru
> seems more consistent to do -use_krpc=true here as other options above seem
Done


http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py
File tests/common/custom_cluster_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py@138
PS7, Line 138: --use_krpc")
> why not just --use_krpc here ?
Good point, done.



--
To view, visit http://gerrit.cloudera.org:8080/9291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Gerrit-Change-Number: 9291
Gerrit-PatchSet: 8
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 15 Feb 2018 03:27:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6508: add KRPC test flag

2018-02-14 Thread Lars Volker (Code Review)
Hello Michael Ho, Michael Brown, Sailesh Mukil, David Knupp,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9291

to look at the new patch set (#8).

Change subject: IMPALA-6508: add KRPC test flag
..

IMPALA-6508: add KRPC test flag

This change adds a flag "--use_krpc" to start-impala-cluster.py. The
flag is currently passed as an argument to the impalad daemon. In the
future it will also enable KRPC for the catalogd and statestored
daemons.

This change also adds a flag "--test_krpc" to pytest. When running tests
using "impala-py.test --test_krpc", the test cluster will be started
by passing "--use_krpc" to start-impala-cluster.py (see above).

This change also adds a SkipIf to skip tests based on whether the
cluster was started with KRPC support or not.

- SkipIf.not_krpc can be used to mark a test that depends on KRPC.
- SkipIf.not_thrift can be used to mark a test that depends on Thrift
  RPC.

This change adds a meta test to make sure that the new SkipIf decorators
work correctly. The test should be removed as soon as real tests have
been added with the new decorators.

Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
---
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
A tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_exchange_delays.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_rpc_exception.py
9 files changed, 85 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9291/8
--
To view, visit http://gerrit.cloudera.org:8080/9291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Gerrit-Change-Number: 9291
Gerrit-PatchSet: 8
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 3:

(2 comments)

> Patch Set 1:
>
> (6 comments)
>
> I went through the change and found that I would prefer to not access 
> dependencies through the ExecEnv singleton, but instead inject them through 
> the ctors and Init(). That way it feels easier to me to follow the code. I 
> don't feel strongly about it though.

I'm curious how do others feel about this?

>
> However, if we want to go with ExecEnv, we should probably remove some of the 
> injected dependencies as well to keep it consistent.

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h
File be/src/rpc/rpc-mgr-test-base.h:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h@133
PS1, Line 133:   // Takes over ownership of the newly created 'service' which 
needs to have a lifetime
> The RpcMgr no longer owns the rpc service. RpcMgr::Shutdown() will call rpc
Shouldn't the RpcMgr own the services then?


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc@83
PS1, Line 83:   mem_tracker_.reset(new MemTracker(-1, "Data Stream Manager 
Deferred RPCs",
> It seems the common pattern is to have the MemTracker inside the owning obj
That makes sense to me, thanks for the explanation.



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 03:23:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 01:35:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..

IMPALA-6519: API to allocate unreserved buffer

The motivation is to allow allocation of buffers without reservation in
ExchangeNode. Currently this it not possible because
IncreaseReservationToFit() followed by AllocateBuffer() is non-atomic.
We need to handle concurrent allocations in ExchangeNode because there
may be multiple batches being received at a given time.

This is a temporary solution until we can implement proper reservations
in ExchangeNode (IMPALA-6524).

Testing:
Added basic unit test.

Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Reviewed-on: http://gerrit.cloudera.org:8080/9250
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M be/src/runtime/bufferpool/reservation-tracker.h
6 files changed, 131 insertions(+), 33 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8363 )

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1940/


--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 15 Feb 2018 01:26:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-14 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8363 )

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..


Patch Set 10: Code-Review+2

Fixed clang-tidy issues.

Rebase, carry +2.


--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 15 Feb 2018 01:26:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-14 Thread Sailesh Mukil (Code Review)
Hello Philip Zeyliger, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8363

to look at the new patch set (#10).

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..

IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_

The following 2 locks have shown to be frequent points of contention
on recent perf runs:

- qs_map_lock_
- client_request_state_map_lock_

Since these are process wide locks, any threads waiting on these locks
potentially slow down the runtime of a query.

I tried to address this previously by converting the 
client_request_state_map_lock_
to a reader-writer lock. This showed great perf improvements in the general
case, however, there were edge cases with big regressions as well.
In the general case, strict readers of the map got through so quickly
that we were able to see a reduction in the number of client connections
created, since this lock was contended for in the context of Thrift threads too.
The bad case is when writers were starved trying to register a new query
since there were so many readers. Changing the starve option resulted in
worse read performance all round.

Another approach which is relatively simpler is to shard the locks, which
proves to be very effective with no regressions. The maps and locks are
sharded to a default of 4 buckets initally.

Query IDs are created by using boost::uuids::random_generator. We use the
high bits of a query ID to assign queries to buckets. I verified that the
distribution of the high bits of a query ID are even across buckets on
my local machine:

For 10,000 queries sharded across 4 buckets, the distribution was:
bucket[0]: 2500
bucket[1]: 2489
bucket[2]: 2566
bucket[3]: 2445

A micro-benchmark is added to measure the improvement in performance. This
benchmark creates multiple threads each of which creates a QueryState and
accesses it multiple times. We can see improvements in the range 2x - 3.5x.

BEFORE:
--Benchmark 1: Create and access Query States.
Total Time (#Queries: 5 #Accesses: 100) : 1ms
Total Time (#Queries: 50 #Accesses: 100) : 8ms
Total Time (#Queries: 50 #Accesses: 1000) : 54ms
Total Time (#Queries: 500 #Accesses: 100) : 76ms
Total Time (#Queries: 500 #Accesses: 1000) : 543ms

AFTER:
--Benchmark 1: Create and access Query States.
Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles
Total Time (#Queries: 50 #Accesses: 100) : 4ms
Total Time (#Queries: 50 #Accesses: 1000) : 15ms
Total Time (#Queries: 500 #Accesses: 100) : 46ms
Total Time (#Queries: 500 #Accesses: 1000) : 151ms

This change introduces a ShardedQueryMap, which is used to replace
the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_,
and their corresponding locks, thereby abstracting away the access to the
maps locks.

For operations that need to happen on every entry in the ShardedQueryMap
maps, a new function ShardedQueryMap::DoFuncForAllEntries() is
introduced which takes a user supplied lambda and passes it every individual
map entry and executes it.

NOTE: This microbenchmark has shown that SpinLock has better performance
than boost::mutex for the qs_map_lock_'s, so that change has been made
too.

TODO: Add benchmark for client_request_state_map_lock_ too. The APIs
around that are more complicated, so this patch only includes
the benchmarking of qs_map_lock_.

TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap.

Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/process-wide-locks-benchmark.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
A be/src/util/sharded-query-map-util.h
8 files changed, 379 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8363/10
--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC

2018-02-14 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9331 )

Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/9331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9
Gerrit-Change-Number: 9331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 15 Feb 2018 01:23:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9331 )

Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC
..


Patch Set 1:

FWIW, this is the error message in question: 
https://github.com/apache/impala/blob/master/common/thrift/generate_error_codes.py#L227-L228


--
To view, visit http://gerrit.cloudera.org:8080/9331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9
Gerrit-Change-Number: 9331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 15 Feb 2018 01:04:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6512: Fix test exchange delays for KRPC

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9331


Change subject: IMPALA-6512: Fix test_exchange_delays for KRPC
..

IMPALA-6512: Fix test_exchange_delays for KRPC

The sender timed out error message diverges between Thrift and KRPC
slightly due to the source address being not readily available with
Thrift RPC implementation. This leads to failure in test_exchange_delays
when KRPC is enabled.

This change fixes the problem by shortening the error message string
to match against.

Testing done: Tested with KRPC enabled in the code and verified the tests 
passed.

Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9
---
M 
testdata/workloads/functional-query/queries/QueryTest/exchange-delays-zero-rows.test
M testdata/workloads/functional-query/queries/QueryTest/exchange-delays.test
2 files changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/9331/1
--
To view, visit http://gerrit.cloudera.org:8080/9331
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idd9410381dbb931231c92f084917265e5067b4c9
Gerrit-Change-Number: 9331
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1938/


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 15 Feb 2018 00:56:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176
PS2, Line 176:   service_mem_tracker_->Release(transfer_size);
> The process limit is checked by looking at TC-malloc and the buffer pool, r
The parent trackers are both process memory tracker.



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 00:36:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6508: add KRPC test flag

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9291 )

Change subject: IMPALA-6508: add KRPC test flag
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/9291/7/bin/start-impala-cluster.py@264
PS7, Line 264: "-use_krpc %s"
seems more consistent to do -use_krpc=true here as other options above seem to 
explicitly set the boolean flag.


http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py
File tests/common/custom_cluster_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/9291/7/tests/common/custom_cluster_test_suite.py@138
PS7, Line 138: --impalad_args=-use_krpc
why not just --use_krpc here ?



--
To view, visit http://gerrit.cloudera.org:8080/9291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Gerrit-Change-Number: 9291
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 15 Feb 2018 00:22:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176
PS2, Line 176:   service_mem_tracker_->Release(transfer_size);
> There may be a temporary window in which the process mem limit is exceeded,
The process limit is checked by looking at TC-malloc and the buffer pool, 
rather than adding up the children consumption.  So, I don't think this impacts 
the process limit.

Who are the parents of these two trackers?

Tim, what's your take on the order that the Consume() and Release() should 
occur in?



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 15 Feb 2018 00:06:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc@304
PS2, Line 304: RETURN_IF_ERROR(data_svc_->Init());
> now that we don't pass parameters, if there are still dependencies in the o
Done


http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176
PS2, Line 176:   service_mem_tracker_->Release(transfer_size);
> what is the impact of the transient double counting now?
There may be a temporary window in which the process mem limit is exceeded, 
leading to other callers of MemTracker::TryConsume() to fail. Worst case, we 
may be double counting by # service threads * avg row batch size.

On the other hand, leaving the window open may lead to a higher chance of going 
over the process mem limit.


http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h
File be/src/service/data-stream-service.h:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h@45
PS2, Line 45:   /// Initializes the service by registering it with the 
singleton RPC manager.
> note here that the RpcMgr::Init() must have already been called (or whateve
Done



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 23:57:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Michael Ho (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9282

to look at the new patch set (#3).

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..

IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

The fix for IMPALA-6193 added a memory tracker for the memory consumed
by the payloads in the service queue of DataStreamService. This change
extends it by introducing a bound on the memory usage for that service
queue. In addition, it deprecates FLAGS_datastream_service_queue_depth
and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags
only take effect when KRPC is in use and KRPC was never enabled in any
previous releases so it seems safe to do this flag replacement. The new
flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount
of memory which can be consumed by the service queue of DataStreamService.
This allows a more direct control over the memory usage of the queue instead
of inferring via the number of entries in the queue. The default value of
this flag is left at 0, in which case it will be set to 20% of process
memory limit.

Testing done: exhaustive debug builds. Updated data-stream-test to
exercise the case in which the payload is larger than the limit.

Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test-base.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/mem-tracker.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M tests/custom_cluster/test_krpc_mem_usage.py
15 files changed, 283 insertions(+), 164 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/3
--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs

2018-02-14 Thread Lars Volker (Code Review)
Lars Volker has abandoned this change. ( http://gerrit.cloudera.org:8080/9320 )

Change subject: DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of 
rejected RPCs
..


Abandoned

Squashed into IMPALA-6269
--
To view, visit http://gerrit.cloudera.org:8080/9320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc
Gerrit-Change-Number: 9320
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-14 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#5).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page.

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change is based on a change by Sailesh Mukil.

TODO: testing, once IMPALA-6508 has been submitted.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
15 files changed, 424 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/5
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 5:

PS5 exposes metrics for rejected RPCs on /metrics.


--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 14 Feb 2018 23:37:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog 
topic updates
..

IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

This adds a set of "prioritized" statestore topics that are small but
are important to deliver in a timely manner. These are delivered more
frequently by a separate thread pool to reduce the window for stale
admission control and scheduling information.

The contract between statestore and subscriber is changed so that the
statestore can send concurrent Update() RPCs for disjoint sets of
topics. This required changes to the subscriber implementation, which
assumed that only one Update RPC would arrive at a time.

It also changes the locking in the statestore so that the prioritized
update threads don't get stuck behind the catalog threads holding
'topic_lock_'. Specifically, it uses a reader-writer lock to protect
modification of the set of topics and a reader-writer lock per topic to
allow the topic data to be read by multiple threads concurrently.

Added metrics to monitor the per-topic update interval.

Testing:
Ran core tests.

Inspected metrics on Impala daemons, saw that membership and request
queue processing times had more samples recorded than the catalog
topic, reflecting the increased frequency.

Ran under thread sanitizer, made sure no data races were reported in
Statestore or StatestoreSubscriber.

Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Reviewed-on: http://gerrit.cloudera.org:8080/9123
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M common/thrift/metrics.json
M tests/custom_cluster/test_admission_controller.py
M tests/statestore/test_statestore.py
M www/statestore_subscribers.tmpl
M www/statestore_topics.tmpl
15 files changed, 845 insertions(+), 485 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog 
topic updates
..


Patch Set 16: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 22:44:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1939/


--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:56:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@569
PS4, Line 569:   }
> That bug probably wouldn't have happened if the code was just put into the
Yes I undermined myself a bit there.


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570
PS4, Line 570:   if (success != nullptr) *success = false;
> I was suggesting just moving the logic directly into BufferPool::AllocateUn
I'll stick with this for now.



--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:56:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:56:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG@14
PS4, Line 14:
> let's note that this is a temporary solution so that we can introduce real
Done


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h
File be/src/runtime/bufferpool/buffer-pool.h:

http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@246
PS4, Line 246:   /// operations for 'client', except for operations on the same 
'handle'.
> maybe note that if there is an unexpected runtime failure during allocation
Done


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@250
PS4, Line 250:   /// out of that instead of relying on the "best effort" 
interface.
> how about explicitly saying that the interface is provided so to help trans
Done


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@562
PS4, Line 562: Ensure we have the reservation required first.
> that's not really the case when 'reserved' is true -- the client needs to e
Restructured the comments to separate the two cases.


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@569
PS4, Line 569:   }
There was a bug here because we didn't test the failure case. Extended the test 
to exercise it.


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570
PS4, Line 570:   if (success != nullptr) *success = false;
> given that there's only one callsite with reserved==true and one with reser
I had trouble judging whether it was better to have one function with more 
complex control flow vs two functions with  simpler control flow. Getting the 
accounting right was a little subtle so I thought this way we'd get better test 
coverage, but I can see the argument for the other approach if you think it is 
superior.



--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:29:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9250

to look at the new patch set (#5).

Change subject: IMPALA-6519: API to allocate unreserved buffer
..

IMPALA-6519: API to allocate unreserved buffer

The motivation is to allow allocation of buffers without reservation in
ExchangeNode. Currently this it not possible because
IncreaseReservationToFit() followed by AllocateBuffer() is non-atomic.
We need to handle concurrent allocations in ExchangeNode because there
may be multiple batches being received at a given time.

This is a temporary solution until we can implement proper reservations
in ExchangeNode (IMPALA-6524).

Testing:
Added basic unit test.

Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
---
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M be/src/runtime/bufferpool/reservation-tracker.h
6 files changed, 131 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/9250/5
--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1938/


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:11:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:11:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 5: Code-Review+2

Thanks, I think this makes sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:10:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs

2018-02-14 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/9320 )

Change subject: DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of 
rejected RPCs
..

DRAFT MISSING COMMENTS IMPALA-6520: Add metrics for number of rejected RPCs

Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/exec-env.cc
M common/thrift/metrics.json
6 files changed, 51 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/9320/2
--
To view, visit http://gerrit.cloudera.org:8080/9320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc
Gerrit-Change-Number: 9320
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-6520: Add metrics for number of rejected RPCs

2018-02-14 Thread Lars Volker (Code Review)
Lars Volker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9320


Change subject: IMPALA-6520: Add metrics for number of rejected RPCs
..

IMPALA-6520: Add metrics for number of rejected RPCs

Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/exec-env.cc
M common/thrift/metrics.json
6 files changed, 51 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/9320/1
--
To view, visit http://gerrit.cloudera.org:8080/9320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I636478808d9a1c8625c6636109fb0ec5c662f2bc
Gerrit-Change-Number: 9320
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog 
topic updates
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1937/


--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 19:15:37 +
Gerrit-HasComments: No


[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9274 )

Change subject: PREVIEW: build ORC C++ lib in toolchain
..


Patch Set 3:

It looks like my first issue may be fixed by ORC-266 
https://issues.apache.org/jira/browse/ORC-266 . When they release ORC 1.5 we 
might be able to remove that custom patch.


--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 19:13:51 +
Gerrit-HasComments: No


[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9274 )

Change subject: PREVIEW: build ORC C++ lib in toolchain
..


Patch Set 2:

Also bumped the version, thanks for letting me know!


--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 19:07:14 +
Gerrit-HasComments: No


[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Tim Armstrong (Code Review)
Hello Quanlong Huang,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9274

to look at the new patch set (#3).

Change subject: PREVIEW: build ORC C++ lib in toolchain
..

PREVIEW: build ORC C++ lib in toolchain

Includes a patch to allow pointing ORC at our versions of the
dependencies instead of building them itself.

Testing:
Confirmed that it built locally, confirmed that the built utilities were
runnable after sourcing impala-config.sh.

TODO: build on more Linux distros, check that the built libraries are
ususable.

Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
---
M buildall.sh
A source/orc/build.sh
A 
source/orc/orc-1.4.3-patches/0001-Allow-building-against-external-versions-of-dependen.patch
A 
source/orc/orc-1.4.3-patches/0002-ORC-239.-Install-missing-Statistics.hh-header-file.patch
4 files changed, 214 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/74/9274/3
--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9274 )

Change subject: PREVIEW: build ORC C++ lib in toolchain
..


Patch Set 2:

It looks like they fixed the issue on master with this commit, but I added a 
simple patch to the 1.4.2 branch.

commit 06a013cabecea8b9c560abd69d4c2d03509867a5
Author: Jim Crist 
Date:   Wed Nov 1 17:20:00 2017 -0500

Fix installation to include all header files

- Fixes installation to include all necessary header files in
  `include/orc`
- Removes a few unnecessary includes

Fixes #185

Signed-off-by: Owen O'Malley 


--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 19:02:30 +
Gerrit-HasComments: No


[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Tim Armstrong (Code Review)
Hello Quanlong Huang,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9274

to look at the new patch set (#2).

Change subject: PREVIEW: build ORC C++ lib in toolchain
..

PREVIEW: build ORC C++ lib in toolchain

Includes a patch to allow pointing ORC at our versions of the
dependencies instead of building them itself.

Testing:
Confirmed that it built locally, confirmed that the built utilities were
runnable after sourcing impala-config.sh.

TODO: build on more Linux distros, check that the built libraries are
ususable.

Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
---
M buildall.sh
A source/orc/build.sh
A 
source/orc/orc-1.4.2-patches/0001-Allow-building-against-external-versions-of-dependen.patch
A source/orc/orc-1.4.2-patches/0002-Install-Statistics.hh.patch
4 files changed, 212 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/74/9274/2
--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/exec-env.cc@304
PS2, Line 304: RETURN_IF_ERROR(data_svc_->Init());
now that we don't pass parameters, if there are still dependencies in the 
ordering of these methods, how about noting that?


http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/runtime/krpc-data-stream-mgr.cc@176
PS2, Line 176:   service_mem_tracker_->Release(transfer_size);
what is the impact of the transient double counting now?


http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h
File be/src/service/data-stream-service.h:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/service/data-stream-service.h@45
PS2, Line 45:   /// Initializes the service by registering it with the 
singleton RPC manager.
note here that the RpcMgr::Init() must have already been called (or whatever 
the ordering dependency is).



-- 
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 18:27:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 7:

I have uploaded a patch with the template approach. (please ignore patch set 6, 
which contains some file by accident)

The patch also contains some simplification around the changed call sites (like 
replacing GetTypes(...) with convenience functions), so it is not "rename only".

I have also noticed several places where codegen is used in over complicated 
ways, for example 
"llvm::ConstantInt::get(llvm::Type::getInt1Ty(codegen->context()), false)" 
could be simply "codegen->false_value()". I can do some GetConst cleanup here, 
or I can create a ticket for it.


--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 18:22:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 4:

@tarmstrong. oops. sorry, must have picked the wrong change-id when squashing 
multiple commits


--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 18:19:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog 
topic updates
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1936/


--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:57:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog 
topic updates
..


Patch Set 16: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:57:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6482: add QUERY TIME LIMIT S option

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9227 )

Change subject: IMPALA-6482: add QUERY_TIME_LIMIT_S option
..


Patch Set 5:

Is there already a time limit for time spent in the AC queue? If not, maybe 
we'll need that as well since this clock starts after that?


--
To view, visit http://gerrit.cloudera.org:8080/9227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id81772ee223ffb64746e241027a5a734a811e1b8
Gerrit-Change-Number: 9227
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:37:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-4953,IMPALA-6437: separate AC/scheduler from catalog 
topic updates
..


Patch Set 15: Code-Review+2

Thanks for the larger scale testing.


--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:36:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9053

to look at the new patch set (#5).

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id, and also stores the
name and system thread id of its parent.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record the name (and a system thread id) of their parent,
it might be also possible to reconstruct the thread creation
graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 74 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/5
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115
PS4, Line 115: system_thread_id_
> Are you saying that thread_debug_info_ pointer may point to stray
 > memory?

Yes, I think that can potentially happen.

 > If that's the case, maybe we shouldn't keep that pointer
 > but instead have a way to traverse TDIs and find it if still
 > exists?

We can use the system thread id to find the parent thread and then its TDI if 
it still exists.

I also added a copy of the parent thread name. Even if the parent thread exits, 
at least we will now what was its name.
We can also check if parent thread name equals to the TDI's thread name that we 
found via the system thread id. If they're not the same, we'll know that the 
parent thread was re-cycled in some way.


http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119
PS4, Line 119:   boost::thread::id boost_thread_id_;
 :   int64_t system_thread_id_ = 0;
> I'm not sure what you mean by "switch to the parent thread
 > quickly".

I was just thinking of a GDB session. GDB prints out the system thread id for 
each thread, so we can traverse the threads quickly if we know the ID. It can 
be also useful if we write a script for that and we don't want to iterate 
through all the threads.

 > I don't think I fully understand the tradeoffs.  Just seems
 > confusing to have multiple IDs and I don't understand how each of
 > these fields helps in debugging.  Perhaps some comments attached to
 > them explaining how one can use them for debugging would help?

I removed boost::thread::id, and added the parent's thread name. I think it is 
more clear, and knowing the parent thread name can be useful on its own. And we 
can also use it to verify the parent thread.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:32:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9250/4//COMMIT_MSG@14
PS4, Line 14:
let's note that this is a temporary solution so that we can introduce real 
reservation accounting in the exchange node as a second step.


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h
File be/src/runtime/bufferpool/buffer-pool.h:

http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@246
PS4, Line 246:   /// operations for 'client', except for operations on the same 
'handle'.
maybe note that if there is an unexpected runtime failure during allocation, 
the reservation may still have been increased?


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.h@250
PS4, Line 250:   /// out of that instead of relying on the "best effort" 
interface.
how about explicitly saying that the interface is provided so to help 
transition components to the buffer pool so that they can implement reservation 
accounting as a second step during that transition?


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@562
PS4, Line 562: Ensure we have the reservation required first.
that's not really the case when 'reserved' is true -- the client needs to 
ensure that.  Maybe move that first sentence into the else and int he first 
case say: the client manages the reservation and must ensure it is available, 
or something?


http://gerrit.cloudera.org:8080/#/c/9250/4/be/src/runtime/bufferpool/buffer-pool.cc@570
PS4, Line 570:   if (success != nullptr) *success = false;
given that there's only one callsite with reserved==true and one with 
reserved==false, maybe refactor this to move the reservation accounting into 
the callers. Actually, maybe the whole thing should just go into the callers 
since there's just a couple of lines of common code here.



--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:31:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-14 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9063

to look at the new patch set (#7).

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..

IMPALA-5801: Clean up codegen GetType() interface

Several functions that return llvm::(Pointer)Type were renamed
to make them shorter or indicate their roles more clearly. Some
additional convenience functions were created to make some common
codegen tasks simpler.

The renamed functions can be found in llvm-codegen.h/cc. Changes
in other files are mainly renamed calls with the same functionality.

Testing:
No new tests are necessary, as no functionality was changed.

Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/exec-node.cc
M be/src/exec/filter-context.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/tuple.cc
M be/src/runtime/types.cc
M be/src/util/tuple-row-compare.cc
26 files changed, 339 insertions(+), 361 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9063/7
--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6519: API to allocate unreserved buffer

2018-02-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9250 )

Change subject: IMPALA-6519: API to allocate unreserved buffer
..


Patch Set 4:

@kwho I think you took over my Change-Id for your draft patch. I deleted the 
draft for now.


--
To view, visit http://gerrit.cloudera.org:8080/9250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4d17b3db25491f796484de22405fbdee7a0f983
Gerrit-Change-Number: 9250
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 16:53:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries

2018-02-14 Thread Zoltan Borok-Nagy (Code Review)
Hello Attila Jeges, Dimitris Tsirogiannis, Tim Armstrong, Csaba Ringhofer, Alex 
Behm, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9005

to look at the new patch set (#12).

Change subject: IMPALA-6314: Add run time scalar subquery check for 
uncorrelated subqueries
..

IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries

If a scalar subquery is used with a binary predicate,
or, used in an arithmetic expression, it must return
only one row/column to be valid. If this cannot be
guaranteed at parse time through a single row aggregate
or limit clause, Impala fails the query like such.

E.g., currently the following query is not allowed:
SELECT bigint_col
FROM alltypesagg
WHERE id = (SELECT id FROM alltypesagg WHERE id = 1)

However, it would be allowed if the query contained
a LIMIT 1 clause, or instead of id it was max(id).

This commit makes the example valid by introducing a
runtime check to test if the subquery returns a single
row. If the subquery returns more than one row, it
aborts the query with an error.

I added a new node type, called CardinalityCheckNode. It
is created during planning on top of the subquery when
needed, then during execution it checks if its child
only returns a single row.

I extended the frontend tests and e2e tests as well.

Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06
---
M be/src/exec/CMakeLists.txt
A be/src/exec/cardinality-check-node.cc
A be/src/exec/cardinality-check-node.h
M be/src/exec/exec-node.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
A fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-subplan.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
28 files changed, 958 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/9005/12
--
To view, visit http://gerrit.cloudera.org:8080/9005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0f52b93a60eeacedd242a2f17fa6b99c4fc38e06
Gerrit-Change-Number: 9005
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6314: Add run time scalar subquery check for uncorrelated subqueries

2018-02-14 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9005 )

Change subject: IMPALA-6314: Add run time scalar subquery check for 
uncorrelated subqueries
..


Patch Set 11:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.h
File be/src/exec/cardinality-check-node.h:

http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.h@27
PS11, Line 27: /// Node that passes along the row pulled from its child 
unchanged.
> First sentence should crisply state purpose of this node, e.g.:
Thanks, I am using your sentences :)


http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.h@29
PS11, Line 29: /// Note that this node must be a blocking node!
> Please give an explanation why, e.g.:
Thanks again, I don't think I could formulate it so nicely! :)


http://gerrit.cloudera.org:8080/#/c/9005/9/be/src/exec/cardinality-check-node.h
File be/src/exec/cardinality-check-node.h:

http://gerrit.cloudera.org:8080/#/c/9005/9/be/src/exec/cardinality-check-node.h@29
PS9, Line 29: /// Note that this node must be a blocking node!
> Thanks.
To my defense in PS1 it was called ScalarCheckNode and it only allowed one row 
:)
Later when I renamed it I felt that some extra flexibility suits the new name.
But I see that it is unlikely that we'll ever need this flexibility.


http://gerrit.cloudera.org:8080/#/c/9005/9/be/src/exec/cardinality-check-node.h@38
PS9, Line 38:   virtual Status Reset(RuntimeState* state) override;
> Nice experiments and tests! Seems really hard to hit this case with a real
Thanks! Done.


http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.cc
File be/src/exec/cardinality-check-node.cc:

http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.cc@68
PS11, Line 68:   DCHECK_LE(rows_collected, 1);
> DCHECK that rows_collected is either 0 or 1. The current check accepts nega
Done


http://gerrit.cloudera.org:8080/#/c/9005/11/be/src/exec/cardinality-check-node.cc@85
PS11, Line 85: row_batch_->DeepCopyTo(output_row_batch);
> Why do we deep copy the row twice? Once in Open() should be sufficient.
Done


http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
File fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java:

http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java@175
PS11, Line 175:   StmtRewriter.rewriteNonScalarSubqueries(operand, 
analyzer);
> We should still keep the analysis and rewrite phases distinctly separate as
I'm a bit confused about how to organise this.
In AnalysisContext.analyze() I see the following phases:
* analyze
* rewrite exprs
* rewrite stmts
* re-analyze if needed

If I don't do anything here, this code will throw an exception that needs to be 
handled somewhere, but the real problem is that the first analyze phase will 
remain unfinished. Then, we would start applying rewrites on the partially 
analyzed tree.

So, some modifications are needed to analyzeImpl() I think. I tried to simply 
return from it when we encounter a non-scalar subquery. Then, do the rewrite in 
StmtRewriter, but analyze() is also called during statement rewriting. This is 
problematic because the AnalysisExceptions are wrapped in an 
IllegalStateException and that causes some tests to fail. This happens when the 
subquery is still invalid after making it scalar, e.g.:

 select id
 from functional.alltypestiny
 where int_col = (select max(timestamp_col) from
  functional.alltypessmall)

 ERROR: AnalysisException: null
 CAUSED BY: IllegalStateException: Failed analysis after expr substitution.
 CAUSED BY: AnalysisException: operands of type INT and TIMESTAMP are not 
comparable: int_col = (SELECT max(timestamp_col) FROM functional.alltypessmall)


My subquery "rewrite" only changes the type of the subquery. Aren't these kind 
of type deductions belong to the analysis phase? Maybe I should move 
rewriteNonScalarSubqueries() to Expr and rename it to something like 
makeContainedSubquerySalar() ?

Am I missing something? Do you have something in mind how should I deal with it?


http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@177
PS11, Line 177:   for (Expr expr: children_) {
> Move to the rewrite phase.
see comment in ArithmeticExpr.java


http://gerrit.cloudera.org:8080/#/c/9005/11/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:


[native-toolchain-CR] PREVIEW: build ORC C++ lib in toolchain

2018-02-14 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9274 )

Change subject: PREVIEW: build ORC C++ lib in toolchain
..


Patch Set 1:

Hi Tim, one file (include/orc/Statistics.hh) is not added to 
build/orc-1.4.2-p1/include/orc. After manually copying it into the dir, I was 
able to build my orc-patch.

Upgrading from orc-1.2.3 to orc-1.4.2 bring so many benefits. I was able to 
simplify my patch by using the ORC lib to parse the orc file tail, instead of 
writing codes to do this in the initial patch.

BTW, recently orc-1.4.3 was released. We can use the latest version.


--
To view, visit http://gerrit.cloudera.org:8080/9274
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6abd86a5a683f19aa44b47629edf283b938b7e7e
Gerrit-Change-Number: 9274
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Feb 2018 10:15:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG@20
PS1, Line 20: this flag is left at 0, in which case it will be set to 20% of 
process
> It seems too aggressive for small-to-medium deployments for sure :). It mig
Pushed it down to 5% for now. Until IMPALA-5859 is fixed, RPC retries may end 
up consuming quite a lot more network bandwidth.


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.h
File be/src/rpc/impala-service-pool.h:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.h@43
PS1, Line 43: MemTracker* mem_tracker
> it would be good to document these parameters. Is this the mem_tracker that
Done


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.cc@150
PS1, Line 150: or
> , otherwise
Done


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/impala-service-pool.cc@152
PS1, Line 152: std::
> nit: names.h should take care of that
Done


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h
File be/src/rpc/rpc-mgr-test-base.h:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr-test-base.h@133
PS1, Line 133:   // Takes over ownership of the newly created 'service'.
> Can you explain why this method is needed? Could you just pass in a pointer
The RpcMgr no longer owns the rpc service. RpcMgr::Shutdown() will call rpc 
service's Shutdown() function so the lifetime of the service should be as long 
as the rpc_mgr_. That's why the ownership is transferred to RpcMgrTestBase 
class instead. Comments added.


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/rpc/rpc-mgr.h@127
PS1, Line 127: mem_tracker
> document. and this is the service's mem tracker right? if so, maybe service
Done


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/exec-env.cc@304
PS1, Line 304: RETURN_IF_ERROR(data_svc_->Init(rpc_mgr_.get()));
> The RpcMgr has a getter in the ExecEnv, too, so you could remove it from th
Done


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/exec-env.cc@305
PS1, Line 305: data_svc_
> this one, too
This is a bit tricky as this is needed in data-stream-test.cc.


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.h@234
PS1, Line 234: handed over to DataStreamService
> is that correct? Isn't incoming_request_tracker the DataStreamService's mem
DataStreamMgr / Exchange node to be more precise.


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc@83
PS1, Line 83:   mem_tracker_.reset(new MemTracker(-1, "Data Stream Manager 
Deferred RPCs",
It seems the common pattern is to have the MemTracker inside the owning object 
(e.g. exec node, DataStreamRecvr etc) so having it externally in ExecEnv seems 
to be an uncommon pattern. The lifetime of the MemTracker of DataStreamMgr 
shouldn't be longer than that of the DataStreamMgr.


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/runtime/krpc-data-stream-mgr.cc@175
PS1, Line 175: Release
> I guess we were using the "Local" versions before since we were just transf
The "local" variant doesn't allow the tracker to have memory limit. There is a 
window but then both of these trackers have process_mem_tracker as their 
parents. I can change the order of Consume() and Release() to avoid 
under-counting if that's a big concern at all.


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.h
File be/src/service/data-stream-service.h:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.h@58
PS1, Line 58:   MemTracker* mem_tracker() { return mem_tracker_.get(); }
> does that need to be public?
It's used in ExecEnv::Init().


http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.cc@37
PS1, Line 37: DEFINE_int64(datastream_service_queue_mem_limit, 0,
> It would be good to use ParseUtil::ParseMemSpec() for this for consistency
Good idea. Done.