[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1950/


--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 07:19:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 6:

Yes, we will do appropriate tuning for the queue memory limit to get reasonable 
performance before turning KRPC on by default.


--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 07:19:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 6: Code-Review+2

Carry Tim's +2


--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 07:18:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/5/be/src/rpc/impala-service-pool.h
File be/src/rpc/impala-service-pool.h:

http://gerrit.cloudera.org:8080/#/c/9282/5/be/src/rpc/impala-service-pool.h@41
PS5, Line 41: service_queue_depth
> service_queue_length?
Done



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 07:17:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9282

to look at the new patch set (#6).

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..

IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

The fix for IMPALA-6193 added a memory tracker for the memory consumed
by the payloads in the service queue of DataStreamService. This change
extends it by introducing a bound on the memory usage for that service
queue. In addition, it deprecates FLAGS_datastream_service_queue_depth
and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags
only take effect when KRPC is in use and KRPC was never enabled in any
previous releases so it seems safe to do this flag replacement. The new
flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount
of memory which can be consumed by the service queue of DataStreamService.
This allows a more direct control over the memory usage of the queue instead
of inferring via the number of entries in the queue. The default value of
this flag is left at 0, in which case it will be set to 5% of process
memory limit.

Testing done: exhaustive debug builds. Updated data-stream-test to
exercise the case in which the payload is larger than the limit.

Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test-base.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/mem-tracker.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M tests/custom_cluster/test_krpc_mem_usage.py
15 files changed, 282 insertions(+), 162 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/6
-- 
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-16 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#28).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

There is also some miscellaneous cleanup included - e.g. reducing the
amount of code devoted to maintaining counters and metrics.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
19 files changed, 956 insertions(+), 546 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/28
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 28
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9354 )

Change subject: IMPALA-6526: Fix spilling test for running on local FS
..

IMPALA-6526: Fix spilling test for running on local FS

One of the spilling test was failing because its minimum bufferpool
mem requirement was more when ran on local FS as compared to when
it is run on HDFS.
The fix is to increase the bufferpool limit to a value just above
the min limit so that it still forces spill to disk on both filesystems.

Testing:
Ran core tests with local FS as target file system. Made sure the
failing test passed.

Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
Reviewed-on: http://gerrit.cloudera.org:8080/9354
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
Gerrit-Change-Number: 9354
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9354 )

Change subject: IMPALA-6526: Fix spilling test for running on local FS
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
Gerrit-Change-Number: 9354
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 01:40:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 5:

(I'm assuming that we'll validate that the mem limit works ok by doing 
additional concurrency testing before turning on KRPC by default).


--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 01:38:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 27:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/scan-range.cc@95
PS25, Line 95:   reader_->num_used_buffers_.Add(-1);
> It looks like it's only used for debugging (validation + debug strings). I
Either now or later is okay with me, if you're confident they don't help (I 
agree they don't seem valuable). I think they are a fairly big distraction in 
already complicated code.



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 27
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 01:22:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.h@411
PS6, Line 411:   /// Populate a row batch from the serialized row batch header.
that comment doesn't look correct. This constructor just constructs an empty 
row batch, right?
Actually, wouldn't it be better to not even pass in header here, but instead 
just "resize" the rowbatch in FromThrift when allocating the buffers? i.e make 
this a constructor that just creates a completely empty row batch.



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 01:11:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 6:

(4 comments)

The overall scope of the change looks good. Comments are just about the error 
paths and testing.

http://gerrit.cloudera.org:8080/#/c/9344/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9344/6//COMMIT_MSG@39
PS6, Line 39: Testing done: Debug core build.
How will we be able to test the error paths? Are we relying on the stress tests?


http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/krpc-data-stream-recvr.cc@304
PS6, Line 304:   recvr_->num_buffered_bytes_.Add(batch_size);
Does this need to be decremented on failure? Or is it somehow ok because the 
query is going to fail anyway? Would be good to either fix or add a comment to 
explain why it's intentional.


http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/krpc-data-stream-recvr.cc@306
PS6, Line 306:   ++num_pending_enqueue_;
Same here.


http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.cc@193
PS6, Line 193:   row_batch->tuple_ptrs_info_->client = client;
tuple_ptrs_info_ gets left in an intermediate state if the allocation fails. 
It's probably benign but it's hard to reason about. Would we be better off 
setting tuple_ptrs_info_ after the AllocateBuffer() call?



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 00:59:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters

2018-02-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9286 )

Change subject: IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG@7
PS5, Line 7: docs
> Uppercase "DOCS"
Done


http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml
File docs/topics/impala_proxy.xml:

http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@244
PS5, Line 244: k
> Uppercase "Kerberized" is the convention we use elsewhere. You'll only see
Done


http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@247
PS5, Line 247: 
 :   
> Please put the note outside the paragraph, i.e.
Done



--
To view, visit http://gerrit.cloudera.org:8080/9286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c
Gerrit-Change-Number: 9286
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Sat, 17 Feb 2018 00:49:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters

2018-02-16 Thread Alex Rodoni (Code Review)
Hello John Russell, Philip Zeyliger, Sailesh Mukil, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9286

to look at the new patch set (#6).

Change subject: IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters
..

IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters

Noted that after enabling ha-proxy in a kerberized cluster,
users can no-longer connect to individual impala daemons directly
from impala shell.

Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c
---
M docs/topics/impala_proxy.xml
1 file changed, 12 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9286/6
--
To view, visit http://gerrit.cloudera.org:8080/9286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c
Gerrit-Change-Number: 9286
Gerrit-PatchSet: 6
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-16 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#27).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
19 files changed, 964 insertions(+), 505 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/27
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 27
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 26:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@113
PS26, Line 113: /// concurrently consumes and processes the data. For example, 
the logic in a scanner thread
Long line


http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@140
PS26, Line 140: /// when the scan range has no buffers to read data into causes 
a resource deadlock.
I should add a note to explain that it's actually ok to hold onto multiple 
buffers.


http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@281
PS26, Line 281: encounted
typo


http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@290
PS26, Line 290: GetNextRange
I feel like this name is pretty vague - I wonder if we should call it 
GetNextUnstartedRange() to disambiguate?



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 26
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 00:28:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 25:

(35 comments)

Thanks for the comments. Addressed most of your points but had a few minor 
follow-on questions. Still need to do a second pass over my changes for typos, 
but I thought it was worth pushing out now (it's also easier to see typos for 
me in gerrit :))

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc@1445
PS25, Line 1445: ;
> << "Already provided a buffer" was nice
Done


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@71
PS25, Line 71: /// TODO: We should map readers to queries. A reader is the unit 
of scheduling and queries
Moved this to live with the other TODOs at the bottom.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@84
PS25, Line 84: GetNextRange: returns to the caller the next scan range it 
should process.
 : /// This is based on disk load. This also begins reading the 
data in this scan
 : /// range. This is blocking.
> that could probably use updating? or maybe rephrase to talk about states or
Removed some of the detail here that fits better in the method comments.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@109
PS25, Line 109: Memory usage in the IoMgr comes from queued read buffers.  If 
we queue the minimum
  : /// (i.e. 1), then the disks are idle while we are processing 
the buffer. If we don't
  : /// limit the queue, then it possible we end up queueing the 
entire data set (i.e. CPU
  : /// is slower than disks) and run out of memory.
> this seems a bit outdated
Removed this part, I think it was irrelevant in light of later sections.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@116
PS25, Line 116: In that case each query will need fewer queued buffers and
  : /// therefore have less memory usage.
  : //
> and that
Reworked this and the below paragraph. It's all a bit outdated.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@136
PS25, Line 136: a client buffer provided by the caller
  : /// when constructing the scan range.
> in that case, is it required that the single buffer can fit the entire rang
Clarified.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@147
PS25, Line 147: buffers are owned
> how does a buffer become owned by the caller? I guess that means when it's
I guess we don't actually provide a way for clients to determine how many 
buffers were allocated. It's easy to expose but we don't have a use case right 
now. I reworded this comment in terms of GetNext()/ReturnBuffer() to suggest 
the conservative behaviour for now.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@149
PS25, Line 149: it can allocate at least three
  : /// max-sized buffers per scan range
> the API below (AllocateBuffersForRange()) doesn't seem to give the caller (
Reworded slightly - agree it implied that. The AllocateBuffersForRange() 
comment does mention 3 * max_buffer_size() so that makes it a little explicit 
at least.


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@269
PS25, Line 269: ,
> "on return"
Done


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@282
PS25, Line 282: range is
  :   /// available
> what does it mean for a range to be available (from an API perspective)?
I think the bit about blocking was actually wrong (or at least extremely 
misleading) - it doesn't block indefinitely but rather gives up ASAP if there 
are no unstarted ranges.

There's a .wait() call on a condition variable, but I think that only exists to 
deal with races with ranges being pulled of queues (I'm sure there's a 
potential simplification there).


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@288
PS25, Line 288: 'needs_buffers' is set to true
> this returns true in '*needs_buffers'
Done


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@300
PS25, Line 300: allocate
> allocated
Done


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@474
PS25, Line 474:   /// Reads the specified scan range and calls 
HandleReadFinished() when done.
> this should probably mention how it might not do anything if there's no buf
Done


http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc
File 

[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-16 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8707

to look at the new patch set (#26).

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..

IMPALA-4835: Part 2: Allocate scan range buffers upfront

This change is a step towards reserving memory for buffers from the
buffer pool and constraining per-scanner memory requirements. This
change restructures the DiskIoMgr code so that each ScanRange operates
with a fixed set of buffers that are allocated upfront and recycled as
the I/O mgr works through the ScanRange.

One major change is that ScanRanges get blocked when a buffer is not
available and get unblocked when a client returns a buffer via
ReturnBuffer(). I was able to remove the logic to maintain the
blocked_ranges_ list by instead adding a separate set with all ranges
that are active.

I plan to merge this along with the actual buffer pool switch, but
separated it out to allow review of the DiskIoMgr changes separate from
other aspects of the buffer pool switchover.

Testing:
* Ran core and exhaustive tests.

Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/io/disk-io-mgr-internal.h
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.h
19 files changed, 961 insertions(+), 505 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/26
--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 26
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [DOCS] Removed the obsolete Llama options files

2018-02-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9219 )

Change subject: [DOCS] Removed the obsolete Llama options files
..


Patch Set 4:

Cherry-picks added


--
To view, visit http://gerrit.cloudera.org:8080/9219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
Gerrit-Change-Number: 9219
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 00:05:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG@29
PS3, Line 29: Note that the proper reservation mechanism of the exchange node
> Good point. As a comparison, suppose we allocate the memory from non-buffer
Bumped the buffer pool limit to 85% by default in the latest PS.



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 17 Feb 2018 00:04:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Removed the obsolete Llama options files

2018-02-16 Thread Alex Rodoni (Code Review)
Hello Michael Brown, John Russell, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9219

to look at the new patch set (#4).

Change subject: [DOCS] Removed the obsolete Llama options files
..

[DOCS] Removed the obsolete Llama options files

Removed the Llama options file.
Removed impala_sqlref.ditamap that is not used.
Removed the reference to impala_sqlref.ditamap in README.md

Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
Cherry-picks: not for 2.x
---
M docs/README.md
M docs/impala.ditamap
D docs/impala_sqlref.ditamap
M docs/shared/impala_common.xml
D docs/topics/impala_reservation_request_timeout.xml
D docs/topics/impala_v_cpu_cores.xml
6 files changed, 0 insertions(+), 261 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/9219/4
--
To view, visit http://gerrit.cloudera.org:8080/9219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
Gerrit-Change-Number: 9219
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6499: [docs] Fixed formatting errors in split part function

2018-02-16 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9275 )

Change subject: IMPALA-6499: [docs] Fixed formatting errors in split_part 
function
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9275/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9275/3//COMMIT_MSG@7
PS3, Line 7: docs
Uppercase "DOCS"


http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml
File docs/topics/impala_string_functions.xml:

http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml@1067
PS3, Line 1067: 
  : Return type:
This looks like the whole paragraph was reformatted by accident. The  line 
shouldn't be wrapped at the end of another line containing text. Was there any 
change to the actual text in this paragraph?


http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml@1075
PS3, Line 1075:
Remove trailing space.


http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml@1129
PS3, Line 1129: 

[Impala-ASF-CR] [DOCS] Removed the obsolete Llama options files

2018-02-16 Thread Alex Rodoni (Code Review)
Hello Michael Brown, John Russell, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9219

to look at the new patch set (#3).

Change subject: [DOCS] Removed the obsolete Llama options files
..

[DOCS] Removed the obsolete Llama options files

Removed the Llama options file.
Removed impala_sqlref.ditamap that is not used.
Removed the reference to impala_sqlref.ditamap in README.md

Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
---
M docs/README.md
M docs/impala.ditamap
D docs/impala_sqlref.ditamap
M docs/shared/impala_common.xml
D docs/topics/impala_reservation_request_timeout.xml
D docs/topics/impala_v_cpu_cores.xml
6 files changed, 0 insertions(+), 261 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/9219/3
--
To view, visit http://gerrit.cloudera.org:8080/9219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
Gerrit-Change-Number: 9219
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9341 )

Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b
Gerrit-Change-Number: 9341
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:56:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9341 )

Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options
..

[DOCS] Fix in REPLICA_PREFERENCE numeric options

Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b
Reviewed-on: http://gerrit.cloudera.org:8080/9341
Reviewed-by: John Russell 
Tested-by: Impala Public Jenkins
---
M docs/topics/impala_replica_preference.xml
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  John Russell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b
Gerrit-Change-Number: 9341
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6509: [docs] Note for haproxy for Kerberized clusters

2018-02-16 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9286 )

Change subject: IMPALA-6509: [docs] Note for haproxy for Kerberized clusters
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG@7
PS5, Line 7: docs
Uppercase "DOCS"


http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml
File docs/topics/impala_proxy.xml:

http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@244
PS5, Line 244: k
Uppercase "Kerberized" is the convention we use elsewhere. You'll only see it 
lowercase in user-entered text like some JIRA titles.


http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@247
PS5, Line 247: 
 :   
Please put the note outside the paragraph, i.e.


  ...



...


It makes the output more predictable in terms of indentation and spacing, and 
is also friendlier from a maintenance perspective
because there's less indentation within the actual source file
(i.e. leaves more space on each line for actual text, leading to
less line wrapping).



--
To view, visit http://gerrit.cloudera.org:8080/9286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c
Gerrit-Change-Number: 9286
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:56:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9347 )

Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Gerrit-Change-Number: 9347
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:55:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9347 )

Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc
..

[DOCS] Typos fixed in Impala Analytic Functions doc

Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Reviewed-on: http://gerrit.cloudera.org:8080/9347
Reviewed-by: Tim Armstrong 
Reviewed-by: John Russell 
Tested-by: Impala Public Jenkins
---
M docs/shared/impala_common.xml
M docs/topics/impala_analytic_functions.xml
2 files changed, 3 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  John Russell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Gerrit-Change-Number: 9347
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Michael Ho (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9344

to look at the new patch set (#6).

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..

IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

Previously, tuple pointers of a row batch are allocated from
the heap via malloc() and tuple data is allocated from the
MemPool associated with the RowBatch. This change converts
the allocations of tuple pointers and tuple data to using
BufferPool for row batches allocated from KrpcDataStreamRecvr.
The primary motivation for this change is to take advantage of
the fact that buffers allocated from BufferPool always go back
to the per-core arena they came from when they are freed. This
alleviates the TCMalloc imbalance between the RPC service threads
and the fragment execution threads. As described in IMPALA-5518,
row batches are always allocated from the service threads' TCMalloc
cache and placed into the fragment execution threads' TCMalloc cache
when they're freed. This leads to underflow and overflow in those
threads' caches and high contention for the spinlock of the central
free list. With BufferPool, the memory always went back to its
originating arena so this kind of imbalance is less likely to occur.
This also dovetails with the long term plan to put most allocations
under BufferPool and have each operators in the plan reserved
appropriate amount of memory before execution.

Note that the proper reservation mechanism of the exchange node
hasn't yet been implemented in this change so the buffer pool client
handle used for allocating buffers has an ad-hoc set-up of no reservation
limit and using root reservation tracker as parent. This needs to be
fixed as part of IMPALA-6524. The default buffer pool limit is also
bumped to 85% to account for the extra usage from the exchange nodes.
The minimum buffer size is also lowered to 8KB to reduce amount of memory
wastage as a row batch's tuple pointers / tuple data can sometimes be
much smaller than 64KB.

Testing done: Debug core build.

Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
---
M be/src/common/global-flags.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/query-options-test.cc
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
18 files changed, 217 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/6
--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6515: [docs] HAproxy with sticky session requires the check option

2018-02-16 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9293 )

Change subject: IMPALA-6515: [docs] HAproxy with sticky session requires the 
check option
..


Patch Set 3:

Different people from various field organizations have requested / suggested 
all sorts of combinations of HAProxy settings. There has never been a 
consistent set that worked the best for everyone. I suggest tagging in Alan 
Choi since he was the first one who verified the HAProxy instructions.


--
To view, visit http://gerrit.cloudera.org:8080/9293
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47165df6624f958c2e0542e2627d4f5377789ab8
Gerrit-Change-Number: 9293
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:51:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] [docs] Removed the obsolete Llama options files

2018-02-16 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9219 )

Change subject: [docs] Removed the obsolete Llama options files
..


Patch Set 2:

> (1 comment)
 >
 > Now the question is, how do we make sure on the doc side that this
 > change stays only in the 3.x (master?) branch and doesn't get
 > applied to Impala 2.x branches?

If you don't want a commit to end up in the 2.x branch, your commit message 
needs to have this line:

Cherry-picks: not for 2.x

This granularity is at the commit level: if this entire commit doesn't belong 
in 2.x, just git commit --amend the commit message and push to Gerrit.


--
To view, visit http://gerrit.cloudera.org:8080/9219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
Gerrit-Change-Number: 9219
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:50:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9341 )

Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/198/


--
To view, visit http://gerrit.cloudera.org:8080/9341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b
Gerrit-Change-Number: 9341
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:48:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options

2018-02-16 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9341 )

Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options
..


Patch Set 3:

I'm not familiar with whatever error condition is indicated in the console 
output from the verification job, i.e. 
https://jenkins.impala.io/job/gerrit-docs-submit/195/console

I'm going to try the verification job again with the hope that it's a transient 
error. If not, something might be going wrong with the commit flow (incorrect 
rebase, or review created from master instead of a private branch); if that's 
the case, the solution would be to abandon and re-do.


--
To view, visit http://gerrit.cloudera.org:8080/9341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b
Gerrit-Change-Number: 9341
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:48:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9347 )

Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/197/


--
To view, visit http://gerrit.cloudera.org:8080/9347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Gerrit-Change-Number: 9347
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:44:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc

2018-02-16 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9347 )

Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Gerrit-Change-Number: 9347
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:43:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] [docs] Removed the obsolete Llama options files

2018-02-16 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9219 )

Change subject: [docs] Removed the obsolete Llama options files
..


Patch Set 2:

(1 comment)

Now the question is, how do we make sure on the doc side that this change stays 
only in the 3.x (master?) branch and doesn't get applied to Impala 2.x 
branches? (I presume the stub files with their warnings should stay in place 
for any 2.12, 2.13, etc. releases that are done.

http://gerrit.cloudera.org:8080/#/c/9219/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9219/2//COMMIT_MSG@7
PS2, Line 7: docs
Uppercase "DOCS"



--
To view, visit http://gerrit.cloudera.org:8080/9219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d
Gerrit-Change-Number: 9219
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 23:42:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Michael Ho (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9344

to look at the new patch set (#5).

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..

IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

Previously, tuple pointers of a row batch are allocated from
the heap via malloc() and tuple data is allocated from the
MemPool associated with the RowBatch. This change converts
the allocations of tuple pointers and tuple data to using
BufferPool for row batches allocated from KrpcDataStreamRecvr.
The primary motivation for this change is to take advantage of
the fact that buffers allocated from BufferPool always go back
to the per-core arena they came from when they are freed. This
alleviates the TCMalloc imbalance between the RPC service threads
and the fragment execution threads. As described in IMPALA-5518,
row batches are always allocated from the service threads' TCMalloc
cache and placed into the fragment execution threads' TCMalloc cache
when they're freed. This leads to underflow and overflow in those
threads' caches and high contention for the spinlock of the central
free list. With BufferPool, the memory always went back to its
originating arena so this kind of imbalance is less likely to occur.
This also dovetails with the long term plan to put most allocations
under BufferPool and have each operators in the plan reserved
appropriate amount of memory before execution.

Note that the proper reservation mechanism of the exchange node
hasn't yet been implemented in this change so the buffer pool client
handle used for allocating buffers has an ad-hoc set-up of no reservation
limit and using root reservation tracker as parent. This needs to be
fixed as part of IMPALA-6524. Part of this change also lowers the
minimum buffer size to 32KB to reduce amount of memory wastage as
a row batch's tuple pointers / tuple data can sometimes be smaller
than 64KB.

Testing done: Debug core build.

Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
---
M be/src/common/global-flags.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/query-options-test.cc
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
18 files changed, 216 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/5
--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9282

to look at the new patch set (#5).

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..

IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

The fix for IMPALA-6193 added a memory tracker for the memory consumed
by the payloads in the service queue of DataStreamService. This change
extends it by introducing a bound on the memory usage for that service
queue. In addition, it deprecates FLAGS_datastream_service_queue_depth
and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags
only take effect when KRPC is in use and KRPC was never enabled in any
previous releases so it seems safe to do this flag replacement. The new
flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount
of memory which can be consumed by the service queue of DataStreamService.
This allows a more direct control over the memory usage of the queue instead
of inferring via the number of entries in the queue. The default value of
this flag is left at 0, in which case it will be set to 5% of process
memory limit.

Testing done: exhaustive debug builds. Updated data-stream-test to
exercise the case in which the payload is larger than the limit.

Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test-base.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/mem-tracker.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M tests/custom_cluster/test_krpc_mem_usage.py
15 files changed, 282 insertions(+), 162 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/5
-- 
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-16 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#11).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change contains tests to check that the metrics show up in /rpcz
and /metrics and that they update as expected when executing queries.

This change is based on a change by Sailesh Mukil.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 503 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/11
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-16 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 10:

Please see the tests in PS10.


--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 16 Feb 2018 22:05:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-16 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#10).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change is based on a change by Sailesh Mukil.

TODO: testing, once IMPALA-6508 has been submitted.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
A tests/custom_cluster/test_krpc_metrics.py
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
16 files changed, 503 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/10
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9354 )

Change subject: IMPALA-6526: Fix spilling test for running on local FS
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1949/


--
To view, visit http://gerrit.cloudera.org:8080/9354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
Gerrit-Change-Number: 9354
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 21:51:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252
PS4, Line 252: skip_gc = false
> Should we change this so that we only GC if the process limit was exceeded?
Actually, Tim pointed out that we only installed GC functions for process mem 
tracker so I was just misreading the code then. Sorry for the confusion. Will 
remove the flag.



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 21:50:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9354 )

Change subject: IMPALA-6526: Fix spilling test for running on local FS
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
Gerrit-Change-Number: 9354
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 21:47:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS

2018-02-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9354


Change subject: IMPALA-6526: Fix spilling test for running on local FS
..

IMPALA-6526: Fix spilling test for running on local FS

One of the spilling test was failing because its minimum bufferpool
mem requirement was more when ran on local FS as compared to when
it is run on HDFS.
The fix is to increase the bufferpool limit to a value just above
the min limit so that it still forces spill to disk on both filesystems.

Testing:
Ran core tests with local FS as target file system. Made sure the
failing test passed.

Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
---
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/9354/1
--
To view, visit http://gerrit.cloudera.org:8080/9354
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596
Gerrit-Change-Number: 9354
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@159
PS1, Line 159: Status ExchangeNode::GetNext(RuntimeState* state, RowBatch* 
output_batch, bool* eos) {
> I thought we discussed this in the past and the conclusion was that we can
Oops, you are right.


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123
PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = 
&(tuple_ptrs_info_->buffer);
> Mostly 8KB I suspect if there are 1024 rows and 1 tuple (e.g. from scan nod
Seems ok to me.



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 21:22:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-16 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall, Taras Bobrovytsky,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9306

to look at the new patch set (#4).

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
20 files changed, 716 insertions(+), 603 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/4
--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9239 )

Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for 
empty rows
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de
Gerrit-Change-Number: 9239
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Feb 2018 20:23:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9239 )

Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for 
empty rows
..

IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

Tuple pointers in the generated row batches may not be initialized
if a tuple has byte size 0. There are some codes which compare these
uninitialized pointers against nullptr so having them uninitialized
may return wrong (and non-deterministic) results, e.g.:
impala::TupleIsNullPredicate::GetBooleanVal()

The following query produces non-deterministic results currently:

SELECT count(v.x) FROM functional.alltypestiny t3 LEFT OUTER JOIN (
  SELECT true AS x FROM functional.alltypestiny t1 LEFT OUTER JOIN
  functional.alltypestiny t2 ON (true)) v ON (v.x = t3.bool_col)
WHERE t3.bool_col = true;

The alltypestiny table has 8 records, 4 records of them has the true
boolean value for bool_col. Therefore, the above query is a fancy
way of saying "8 * 8 * 4", i.e. it should return 256.

The solution is that scanners initialize tuple pointers to a non-null
value when there are no materialized slots. This non-null value is
provided by the new static member Tuple::POISON.

I extended QueryTest/scanners.test with the above query. This test
executes the query against all table formats.

This change has the biggest performance impact on count(*) queries on
large kudu tables. For my quick benchmark I copied tpch_kudu.lineitem
and doubled its data. The resulting table has 12,002,430 rows.

Without this patch 'select count(*) from biglineitem' runs for ~0.12s.
With the patch applied, the overhead is around a dozens of ms. I measured
the query on my desktop PC using a relase build of Impala. On debug builds,
the execution time of the patched version is around 160% of the original
version.

Without this patch:
+--++--+--+++---+---+-+
| Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem  
| Est. Peak Mem | Detail  |
+--++--+--+++---+---+-+
| 03:AGGREGATE | 1  | 127.50us | 127.50us | 1  | 1  | 28.00 KB  
| 10.00 MB  | FINALIZE|
| 02:EXCHANGE  | 1  | 22.32ms  | 22.32ms  | 3  | 1  | 0 B   
| 0 B   | UNPARTITIONED   |
| 01:AGGREGATE | 3  | 1.78ms   | 1.89ms   | 3  | 1  | 16.00 KB  
| 10.00 MB  | |
| 00:SCAN KUDU | 3  | 8.00ms   | 8.28ms   | 12.00M | -1 | 512.00 KB 
| 0 B   | default.biglineitem |
+--++--+--+++---+---+-+

With this patch:
+--++--+--+++---+---+-+
| Operator | #Hosts | Avg Time | Max Time | #Rows  | Est. #Rows | Peak Mem  
| Est. Peak Mem | Detail  |
+--++--+--+++---+---+-+
| 03:AGGREGATE | 1  | 129.01us | 129.01us | 1  | 1  | 28.00 KB  
| 10.00 MB  | FINALIZE|
| 02:EXCHANGE  | 1  | 33.00ms  | 33.00ms  | 3  | 1  | 0 B   
| 0 B   | UNPARTITIONED   |
| 01:AGGREGATE | 3  | 1.99ms   | 2.13ms   | 3  | 1  | 16.00 KB  
| 10.00 MB  | |
| 00:SCAN KUDU | 3  | 13.13ms  | 13.97ms  | 12.00M | -1 | 512.00 KB 
| 0 B   | default.biglineitem |
+--++--+--+++---+---+-+

Change-Id: I2981227e62eb5971508e0698e189519755de
Reviewed-on: http://gerrit.cloudera.org:8080/9239
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-scanner.cc
M be/src/exec/kudu-scanner.cc
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
5 files changed, 27 insertions(+), 2 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de
Gerrit-Change-Number: 9239
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 

[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@159
PS1, Line 159: Status ExchangeNode::GetNext(RuntimeState* state, RowBatch* 
output_batch, bool* eos) {
> Forgot to check this. I think we should be setting the flush_resources_ bit
I thought we discussed this in the past and the conclusion was that we can live 
with this wrong accounting for now if it means not breaking the ability to 
accumulate multiple row batches here in ExchangeNode::GetNext().


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123
PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = 
&(tuple_ptrs_info_->buffer);
> The common case is that the caller of ExchangeNode::GetNext() operates on a
Mostly 8KB I suspect if there are 1024 rows and 1 tuple (e.g. from scan node). 
From join, it may be 16KB with 2 tuples. How about we lower the min buffer size 
to 8KB. Will the overhead be unbearable then ?



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 20:16:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-02-16 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 9: Code-Review+1

lgtm.

would anyone with more experience with the shell/python care to have a look at 
this patch?


--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Feb 2018 20:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-02-16 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..

IMPALA-6337: Fix infinite loop in Impala shell

The patch uses a workaround for a bug in the sqlparse.split() function
by joining the statements that contain error tokens into a single
statement

Testing:
- Ran end-to-end shell tests

Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
---
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
2 files changed, 59 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/9195/9
--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell

2018-02-16 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9195 )

Change subject: IMPALA-6337: Fix infinite loop in Impala shell
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9195/8/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/9195/8/shell/impala_shell.py@580
PS8, Line 580: select
repeated 'select'?


http://gerrit.cloudera.org:8080/#/c/9195/8/tests/shell/test_shell_interactive.py
File tests/shell/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/9195/8/tests/shell/test_shell_interactive.py@482
PS8, Line 482: # IMPALA-6337: Fix infinite loop.
add the simpler example from the jira



--
To view, visit http://gerrit.cloudera.org:8080/9195
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Gerrit-Change-Number: 9195
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Feb 2018 19:26:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252
PS4, Line 252: skip_gc = false
> Yes, we can try to document it.
Should we change this so that we only GC if the process limit was exceeded? And 
then we don't need this flag, right?



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 19:18:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252
PS4, Line 252: skip_gc = false
> I think it's going to be kind of hard for readers of the code and new calls
Yes, we can try to document it.

I guess the problem to me is that this function is unconditionally GC'ing when 
any MemTracker's limit is exceeded. However, IIUC, GC'ing should help if you 
exceeded the process mem limit. So, it's unclear to me if it's okay to add a 
check here to see if the mem_tracker exceeding the limit is the 
process_mem_tracker_ and only call GcMemory in that case.



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 19:17:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5690: Part 1: Rename ostream operators for thrift types

2018-02-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9168 )

Change subject: IMPALA-5690: Part 1: Rename ostream operators for thrift types
..


Patch Set 9:

> Patch Set 9: Code-Review+1
>
> Fyi, its nice for reviewers if you don't submit changes at the same time as a 
> rebase - if you look at the diff between patch 8 and 9 here it contains a 
> bunch of unrelated stuff, which makes it hard to see what changes you 
> actually made.
>
> I usually submit the review with the changes I made and then immediately 
> rebase and submit again.

Thanks for the tip.


--
To view, visit http://gerrit.cloudera.org:8080/9168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c303997411237e988ef960157f781776f6fcb60
Gerrit-Change-Number: 9168
Gerrit-PatchSet: 9
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 16 Feb 2018 19:15:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc

2018-02-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9347 )

Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc
..


Patch Set 1:

Now it is public.


--
To view, visit http://gerrit.cloudera.org:8080/9347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Gerrit-Change-Number: 9347
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 19:01:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc

2018-02-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9347


Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc
..

[DOCS] Typos fixed in Impala Analytic Functions doc

Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
---
M docs/shared/impala_common.xml
M docs/topics/impala_analytic_functions.xml
2 files changed, 3 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/9347/1
--
To view, visit http://gerrit.cloudera.org:8080/9347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29
Gerrit-Change-Number: 9347
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@159
PS1, Line 159: Status ExchangeNode::GetNext(RuntimeState* state, RowBatch* 
output_batch, bool* eos) {
Forgot to check this. I think we should be setting the flush_resources_ bit on 
RowBatches that are backed by buffers so that consumers don't accumulate 
memory, which will be accounted against this ExchangeNode.


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123
PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = 
&(tuple_ptrs_info_->buffer);
> I'd prefer if we don't break the RowBatch abstraction further. i.e. if diff
The common case is that the caller of ExchangeNode::GetNext() operates on a 
single batch at a time, whereas we can have many buffered row batches here, so 
I think that is not important to optimise for.. My biggest concern with this 
patch is memory consumption of the queued batches.

It does depend on the numbers though - do you have a sense for what the typical 
size of tuple_ptrs_ is?



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:58:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-16 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9306 )

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..


Patch Set 3:

FYI. I am adding some primary key tests and working on running/validating the 
tests.


--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:55:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5690: Part 1: Rename ostream operators for thrift types

2018-02-16 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9168 )

Change subject: IMPALA-5690: Part 1: Rename ostream operators for thrift types
..


Patch Set 9: Code-Review+1

Fyi, its nice for reviewers if you don't submit changes at the same time as a 
rebase - if you look at the diff between patch 8 and 9 here it contains a bunch 
of unrelated stuff, which makes it hard to see what changes you actually made.

I usually submit the review with the changes I made and then immediately rebase 
and submit again.


--
To view, visit http://gerrit.cloudera.org:8080/9168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c303997411237e988ef960157f781776f6fcb60
Gerrit-Change-Number: 9168
Gerrit-PatchSet: 9
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:50:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123
PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = 
&(tuple_ptrs_info_->buffer);
> On the other hand, that also means holding onto the memory of tuple_ptrs_ l
I'd prefer if we don't break the RowBatch abstraction further. i.e. if 
different instances behave differently (w.r.t. Reset() for example), then the 
abstraction is probably not right.  We already  have this abstraction breakage 
for AcquireState().  Later, when figuring out the reservation for receiver/xchg 
node, I think we should revisit this. There are probably other improvements we 
can make and maybe RowBatch isn't the best way to do this buffering.



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:47:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252
PS4, Line 252: skip_gc = false
I think it's going to be kind of hard for readers of the code and new callsites 
to know why/when to set skip_gc=true.



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:42:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-16 Thread Lars Volker (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9292

to look at the new patch set (#8).

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..

IMPALA-6269: Expose KRPC metrics on debug webpage

This change exposes KRPC metrics on the /rpcz debug web page.

This change also exposes metrics for rejected RPCs on the /metrics debug
web page. See here for an example: https://git.io/vAczm

This change also fixes a bug in PrettyPrinter::GetByteUnit(), which
previously did not work for unsigned values due to an implicit cast.

This change is based on a change by Sailesh Mukil.

TODO: testing, once IMPALA-6508 has been submitted.

Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-trace.cc
M be/src/rpc/rpc-trace.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/impalad-main.cc
M be/src/util/histogram-metric.h
M be/src/util/pretty-printer.h
M common/thrift/Metrics.thrift
M common/thrift/metrics.json
M tests/webserver/test_web_pages.py
M www/rpcz.tmpl
15 files changed, 387 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/8
--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 8
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-16 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 7:

Thanks for the review, I address the comments in PS8. I will rebase the change 
next to pull in the --test_krpc flag.


--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:33:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase

2018-02-16 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 5: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:00:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase

2018-02-16 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 4: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:00:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase

2018-02-16 Thread Alex Behm (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Dimitris Tsirogiannis, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8958

to look at the new patch set (#5).

Change subject: IMPALA-5152: Introduce metadata loading phase
..

IMPALA-5152: Introduce metadata loading phase

Reworks the collection and loading of missing metadata
when compiling a statement. Introduces a new
metadata-loading phase between parsing and analysis.
Summary of the new compilation flow:
1. Parse statement.
2. Collect all table references from the parsed
   statement and generate a list of tables that need
   to be loaded for analysis to succeed.
3. Request missing metadata and wait for it to arrive.
   As views become loaded we expand the set of required
   tables based on the view definitions.
   This step populates a statement-local table cache
   that contains all loaded tables relevant to the
   statement.
4. Create a new Analyzer with the table cache and
   analyze the statement. During analysis only the
   table cache is consulted for table metadata, the
   ImpaladCatalog is not used for that purpose anymore.
5. Authorize the statement.
6. Plan generation as usual.

The intent of the existing code was to collect all tables
missing metadata during analysis, load the metadata, and then
re-analyze the statement (and repeat those steps until all
metadata is loaded).
Unfortunately, the relevant code was hard-to-follow, subtle
and not well tested, and therefore it was broken in several
ways over the course of time. For example, the introduction
of path analysis for nested types subtly broke the intended
behavior, and there are other similar examples.

The serial table loading observed in the JIRA was caused by the
following code in the resolution of table references:
for (all path interpretations) {
  try {
// Try to resolve the path; might call getTable() which
// throws for nonexistent tables.
  } catch (AnalysisException e) {
if (analyzer.hasMissingTbls()) throw e;
  }
}

The following example illustrates the problem:
SELECT * FROM a.b, x.y
When resolving the path "a.b" we consider that "a" could be a
database or a table. Similarly, "b" could be a table or a
nested collection.
If the path resolution for "a.b" adds a missing table entry,
then the path resolution for "x.y" could exit prematurely,
without trying the other path interpretations that would
lead to adding the expected missing table. So effectively,
the tables end up being loaded one-by-one.

Testing:
- A core/hdfs run succeeded
- No new tests were added because the existing functional tests
  provide good coverage of various metadata loading scenarios.
- The issue reported in IMPALA-5152 is basically impossible now.
  Adding FE unit tests for that bug specifically would require
  ugly changes to the new code to enable such testing.

Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/AuthorizationStmt.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/LimitElement.java
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SetStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateFunctionStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M 

[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase

2018-02-16 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@302
PS4, Line 302: Map of all query-relevant tables that is populated before 
analysis in
 : // Frontend#requestTableLoadAndWait().
 : // An entry in the map is guaranteed to point to a loaded 
table. This could mean
 : // the table was loaded successfully or a load was attempted 
but failed.
 : // The absence of an entry indicates that table was not in 
the Catalog at the time
 : // t
> nit: is most of the detail for this comment better stated in the StmtTableC
Condensed comment.


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@43
PS4, Line 43: v
> nit: sp
Done


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@46
PS4, Line 46: Frontend.class
> use this class.
Oops. Done.


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@62
PS4, Line 62: numCatalogUpdates_
> unclear from the name what numCatalogUpdates_ tracks. how about: numLoadsRe
In my mind these are two completely different things, so I'd prefer not to use 
the same "load" terminology for them.

I modified the var names to distinguish between sent/received and added 
comments.


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@115
PS4, Line 115:* Collects and loads all tables and views required to analyze 
the given statement
> nit: period
Done



--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Feb 2018 18:00:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase

2018-02-16 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 4: Code-Review+1

(5 comments)

thanks for the refactor and db load handling. mostly small comments from my end.

http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@302
PS4, Line 302: Map of all query-relevant tables that is populated before 
analysis in
 : // Frontend#requestTableLoadAndWait().
 : // An entry in the map is guaranteed to point to a loaded 
table. This could mean
 : // the table was loaded successfully or a load was attempted 
but failed.
 : // The absence of an entry indicates that table was not in 
the Catalog at the time
 : // t
nit: is most of the detail for this comment better stated in the StmtTableCache 
class?


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@43
PS4, Line 43: v
nit: sp


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@46
PS4, Line 46: Frontend.class
use this class.


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@62
PS4, Line 62: numCatalogUpdates_
unclear from the name what numCatalogUpdates_ tracks. how about: 
numLoadsReceived_ (and perhaps numLoadsRequested_)?


http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@115
PS4, Line 115:* Collects and loads all tables and views required to analyze 
the given statement
nit: period



--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Feb 2018 17:29:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9239 )

Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for 
empty rows
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1948/


--
To view, visit http://gerrit.cloudera.org:8080/9239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de
Gerrit-Change-Number: 9239
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Feb 2018 16:41:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

2018-02-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9239 )

Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for 
empty rows
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de
Gerrit-Change-Number: 9239
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Feb 2018 16:40:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

2018-02-16 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Gabor Kaszab, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8400

to look at the new patch set (#14).

Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS 
SELECT
..

IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT

This change adds support for "clustered", "noclustered", "shuffle" and
"noshuffle" hints in CTAS statement.

Example:
create /*+ clustered,noshuffle */ table t partitioned by (year, month)
as select * from functional.alltypes

The effect of these hints are the same as in insert statements:

clustered:
Sort locally by partition expression before insert to ensure that only
one partition is written at a time. The goal is to reduce the number of
files kept open / buffers kept in memory simultaneously.

noclustered:
Do not sort by primary key before insert to Kudu table. No effect on HDFS
tables currently, as this is their default behavior.

shuffle:
Forces the planner to add an exchange node that repartitions by the
partition expression of the output table. This means that a partition
will be written only by a single node, which minimizes the global
number of simultaneous writes.
If only one partition is written (because all partitioning columns
are constant or the target table is not partitioned), then the shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.

noshuffle:
Do not add exchange node before insert to partitioned tables.

The parser needed some modifications to be able to differentiate between
CREATE statements that allow hints (CTAS), and CREATE statements that
do not (every other type of CREATE statements). As a result, KW_CREATE
was moved from tbl_def_without_col_defs to statement rules.

The tests mainly mirror the analysis / planner / parser tests of INSERT.
Query tests are not created, as the hints have no effect on
the DDL part of CTAS, and the actual query ran is the same as in
the insert case.

Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
9 files changed, 333 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/14
--
To view, visit http://gerrit.cloudera.org:8080/8400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0
Gerrit-Change-Number: 8400
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-16 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9306 )

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc@214
PS2, Line 214:
> Yeah, I just took what the formatter suggested. Will update.
Done


http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@625
PS2, Line 625: for (TupleId tupleId: targetSlotsByTid.keySet()) {
> I filed IMPALA-6533. Can you reference that here?
Done



--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 16 Feb 2018 16:22:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-16 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9306 )

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..


Patch Set 3:

(4 comments)

> For example, you could add tests for insert/update/delete with decimal values 
> that are outside the range for the precision/scale of the column, creating a 
> table with a decimal primary key, etc.

Do these tests exist for the decimal type in general? For other formats like 
text, parquet, and Avro? I looked in the decimal.test and decimal_exprs.test 
workloads but didn't see anything. I suspect that sort of "validation" occurs 
before calling down to the specific format. If not, reusing the existing tests 
would be great.

http://gerrit.cloudera.org:8080/#/c/9306/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9306/2//COMMIT_MSG@10
PS2, Line 10:
> We generally include a "Testing" section here, eg. what tests were added, w
Done


http://gerrit.cloudera.org:8080/#/c/9306/1/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

http://gerrit.cloudera.org:8080/#/c/9306/1/be/src/exec/kudu-util.h@a69
PS1, Line 69:
> Well its useful for debugging
I filed KUDU-2303 to add a ToString method to KuduSchema itself.


http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc@214
PS2, Line 214:
> I imagine the formatter told you to do this. We don't follow it 100%, so yo
Yeah, I just took what the formatter suggested. Will update.


http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java
File fe/src/main/java/org/apache/impala/util/KuduUtil.java:

http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java@400
PS2, Line 400:  Fall through below */
> Another place you can ignore the formatter.
Done



--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 16 Feb 2018 16:21:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables

2018-02-16 Thread Grant Henke (Code Review)
Hello Thomas Tauber-Marshall,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9306

to look at the new patch set (#3).

Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables
..

IMPALA-5752: Add support for DECIMAL on Kudu tables

Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0.

Note: Adding support for Kudu decimal min/max filters is
tracked in IMPALA-6533.

Tests:
* Added Kudu create with decimal test to AnalyzeDDLTest.java
* Added Kudu table_format to test_decimal_queries.py
** Both decimal.test and decimal-exprs.test workloads
* Added decimal queries to the following Kudu workloads:
** kudu_create.test
** kudu_delete.test
** kudu_insert.test
** kudu_update.test
** kudu_upsert.test

Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_kudu.py
20 files changed, 691 insertions(+), 603 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/3
--
To view, visit http://gerrit.cloudera.org:8080/9306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6
Gerrit-Change-Number: 9306
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

2018-02-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9239 )

Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for 
empty rows
..


Patch Set 5:

Seems like it was hit by IMPALA-6532


--
To view, visit http://gerrit.cloudera.org:8080/9239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de
Gerrit-Change-Number: 9239
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Feb 2018 14:25:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9063 )

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..


Patch Set 8:

I took the liberty to make the GetIntConstant cleanup in patch set 8.


--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 14:20:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface

2018-02-16 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9063

to look at the new patch set (#8).

Change subject: IMPALA-5801: Clean up codegen GetType() interface
..

IMPALA-5801: Clean up codegen GetType() interface

Several functions that return llvm::(Pointer)Type were renamed
to make them shorter or indicate their roles more clearly. Some
additional convenience functions were created to make some common
codegen tasks simpler:

- Get(Ptr)Type functions with string parameter are renamed to
  GetNamed(Ptr)Type
- GetStruct(Ptr)Type template functions are created to make
  GetNamedType(MyStruct::LLVM_CLASS_NAME) calls simpler (some
  classes had LLVM_CLASS_NAME as private, these are changed to
  public)
- integer type convenience functions are renamed to indicate
  bit width instead of matching SQL type (e.g. int_type->i32_type)
- similar convenience functions were created for ptr to primitive
  types, int_ptr_type
- Get(Ptr)Type functions with ColumnType parameter are renamed
  to GetSlot(Ptr)Type
- GetIntConstant function is split to several functions depending
  on the type of the integer e.g. GetI32Constant

The renamed functions can be found in llvm-codegen.h/cc. Changes
in other files are mainly renamed calls with the same functionality.

Testing:
No new tests are necessary, as no functionality was changed.

Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/exec-node.cc
M be/src/exec/filter-context.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/tuple.cc
M be/src/runtime/types.cc
M be/src/util/tuple-row-compare.cc
26 files changed, 357 insertions(+), 392 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9063/8
--
To view, visit http://gerrit.cloudera.org:8080/9063
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77
Gerrit-Change-Number: 9063
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9239 )

Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for 
empty rows
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1947/


--
To view, visit http://gerrit.cloudera.org:8080/9239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de
Gerrit-Change-Number: 9239
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 16 Feb 2018 09:45:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6508: add KRPC test flag

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9291 )

Change subject: IMPALA-6508: add KRPC test flag
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Gerrit-Change-Number: 9291
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 16 Feb 2018 09:26:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6508: add KRPC test flag

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9291 )

Change subject: IMPALA-6508: add KRPC test flag
..

IMPALA-6508: add KRPC test flag

This change adds a flag "--use_krpc" to start-impala-cluster.py. The
flag is currently passed as an argument to the impalad daemon. In the
future it will also enable KRPC for the catalogd and statestored
daemons.

This change also adds a flag "--test_krpc" to pytest. When running tests
using "impala-py.test --test_krpc", the test cluster will be started
by passing "--use_krpc" to start-impala-cluster.py (see above).

This change also adds a SkipIf to skip tests based on whether the
cluster was started with KRPC support or not.

- SkipIf.not_krpc can be used to mark a test that depends on KRPC.
- SkipIf.not_thrift can be used to mark a test that depends on Thrift
  RPC.

This change adds a meta test to make sure that the new SkipIf decorators
work correctly. The test should be removed as soon as real tests have
been added with the new decorators.

Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Reviewed-on: http://gerrit.cloudera.org:8080/9291
Reviewed-by: David Knupp 
Tested-by: Impala Public Jenkins
---
M bin/run-all-tests.sh
M bin/start-impala-cluster.py
M tests/common/custom_cluster_test_suite.py
M tests/common/skip.py
A tests/common/test_skip.py
M tests/conftest.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/custom_cluster/test_rpc_exception.py
8 files changed, 86 insertions(+), 7 deletions(-)

Approvals:
  David Knupp: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Gerrit-Change-Number: 9291
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9292 )

Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@55
PS7, Line 55: RPC_QUEUE_OVERFLOW_METRIC_KEY = "rpc.$0.rpcs_queue_overflow";
Looks reasonable to me.


http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@72
PS7, Line 72: service_->methods_by_name();
nit: indent


http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@74
PS7, Line 74:
nit: space


http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@77
PS7, Line 77:
nit: indent



--
To view, visit http://gerrit.cloudera.org:8080/9292
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f
Gerrit-Change-Number: 9292
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 16 Feb 2018 09:17:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9344 )

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG@29
PS3, Line 29: Note that the proper reservation mechanism of the exchange node
> I was thinking through the implications of allocating this memory through t
Good point. As a comparison, suppose we allocate the memory from non-bufferpool 
sources, does it mean the chance of allocation failure will be lower when we 
are that close to capacity ? If so, will bumping it to 85% get us to parity ?


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/blocking-join-node.cc@163
PS1, Line 163: // Release resources in 'build_batch_' before closing the 
children as some of the
> I wonder if this was the root cause of IMPALA-5715.
Could be.


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@82
PS1, Line 82: ExecEnv::GetInsta
> Use ExecEnv::GetInstance() for consistency?
Done


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@138
PS1, Line 138:   
ExecEnv::GetInstance()->buffer_pool()->DeregisterClient(_buffer_pool_client_);
> This check isn't necessary, deregistering an unregistered client is a no-op
Done


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123
PS1, Line 123:
> Why can't we allocate one buffer large enough for the tuple pointers and th
On the other hand, that also means holding onto the memory of tuple_ptrs_ 
longer than necessary as row batches are accumulated by the caller of 
ExchangeNode::GetNext(). The hope is that this tuple_ptrs_ will mostly be 
satisfied with min_buffer_size so they will be recycled quickly across row 
batches.

While I agree that we don't necessarily need to preserve tuple_ptrs_ across 
Reset() for batch generated from KrpcDataStreamRecvr, it seems to complicate 
the RowBatch logic even further. In other words, one cannot easily tell whether 
tuple_ptrs_ will remain valid after Reset() if we follow this pattern, making 
the code slightly harder to reason about. So, in light of that, I try to keep 
the logic of Reset() similar across row batches created from the 3 different 
ctors.


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@199
PS1, Line 199:   const int64_t uncompressed_size = header.uncompressed_size();
> nit: don't need the std:: prefixes in .cc generally
Done


http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@484
PS1, Line 484:   // tuple_ptrs_ were allocated with malloc so can be swapped 
between batches.
> We don't need to support AcquireState() for these batches do we? I don't th
Good point. Replaced with a DCHECK.



--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 09:06:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

2018-02-16 Thread Michael Ho (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9344

to look at the new patch set (#4).

Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from 
BufferPool
..

IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool

Previously, tuple pointers of a row batch are allocated from
the heap via malloc() and tuple data is allocated from the
MemPool associated with the RowBatch. This change converts
the allocations of tuple pointers and tuple data to using
BufferPool for row batches allocated from KrpcDataStreamRecvr.
The primary motivation for this change is to take advantage of
the fact that buffers allocated from BufferPool always go back
to the per-core arena they came from when they are freed. This
alleviates the TCMalloc imbalance between the RPC service threads
and the fragment execution threads. As described in IMPALA-5518,
row batches are always allocated from the service threads' TCMalloc
cache and placed into the fragment execution threads' TCMalloc cache
when they're freed. This leads to underflow and overflow in those
threads' caches and high contention for the spinlock of the central
free list. With BufferPool, the memory always went back to its
originating arena so this kind of imbalance is less likely to occur.
This also dovetails with the long term plan to put most allocations
under BufferPool and have each operators in the plan reserved
appropriate amount of memory before execution.

Note that the proper reservation mechanism of the exchange node
hasn't yet been implemented in this change so the buffer pool client
handle used for allocating buffers has an ad-hoc set-up of no reservation
limit and using root reservation tracker as parent. This needs to be
fixed as part of IMPALA-6524. Part of this change also lowers the
minimum buffer size to 32KB to reduce amount of memory wastage as
a row batch's tuple pointers / tuple data can sometimes be smaller
than 64KB.

Testing done: Debug core build.

Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
---
M be/src/common/global-flags.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/query-options-test.cc
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
18 files changed, 216 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/4
--
To view, visit http://gerrit.cloudera.org:8080/9344
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2
Gerrit-Change-Number: 9344
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Bump Kudu Java version to 1.7.0

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9349 )

Change subject: Bump Kudu Java version to 1.7.0
..

Bump Kudu Java version to 1.7.0

Change-Id: I69a9bc9e36657a936591f723cf4110a1297973ab
Reviewed-on: http://gerrit.cloudera.org:8080/9349
Reviewed-by: Grant Henke 
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins
---
M bin/impala-config.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Grant Henke: Looks good to me, but someone else must approve
  Philip Zeyliger: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I69a9bc9e36657a936591f723cf4110a1297973ab
Gerrit-Change-Number: 9349
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9282 )

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9282/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9282/3//COMMIT_MSG@20
PS3, Line 20: 20
> 5%
Done


http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/rpc/impala-service-pool.cc@155
PS2, Line 155: CheckLimitExceeded
> Why don't we call AnyLimitExceeded()? If we're over the process limit it se
I thought about that option but I am concerned about calling Gc in the middle 
of the Rpc path. Updated the patch to make AnyLimitExceeded() to not GC if 
specified by caller.


http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/impala-service-pool.cc@159
PS3, Line 159:   c->DiscardTransfer();
> Maybe drop the lock at this point to avoid calling free() while holding the
Done


http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test-base.h
File be/src/rpc/rpc-mgr-test-base.h:

http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test-base.h@135
PS3, Line 135: TakeoverService
> nit: take over is two words, so TakeOverService
Done


http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test-base.h@234
PS3, Line 234:   dynamic_cast(ping_impl)->mem_tracker()));
> dynamic_cast<> here and below shouldn't be necessary - don't need to check
Done


http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test.cc@186
PS3, Line 186: dynamic_cast
> static_cast should also work here and below.
Done


http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/runtime/mem-tracker.h@281
PS3, Line 281:   bool CheckLimitExceeded() const { return limit_ >= 0 && limit_ 
< consumption(); }
> Why not use the existing public LimitExceeded() or AnyLimitExceeded() APIs.
I am a bit concerned by the fact that exceeding the memory limit may lead to 
GcMemory being called in the middle of some critical path. May be an 
alternative would be to make LimitExceeded() take an argument to skip Gc and 
simply return true.

Implemented this idea in the new patch.



--
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 16 Feb 2018 08:46:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Bump Kudu Java version to 1.7.0

2018-02-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9349 )

Change subject: Bump Kudu Java version to 1.7.0
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69a9bc9e36657a936591f723cf4110a1297973ab
Gerrit-Change-Number: 9349
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 16 Feb 2018 08:46:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

2018-02-16 Thread Michael Ho (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9282

to look at the new patch set (#4).

Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service 
queue
..

IMPALA-6116: Bound memory usage of DataStreamSevice's service queue

The fix for IMPALA-6193 added a memory tracker for the memory consumed
by the payloads in the service queue of DataStreamService. This change
extends it by introducing a bound on the memory usage for that service
queue. In addition, it deprecates FLAGS_datastream_service_queue_depth
and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags
only take effect when KRPC is in use and KRPC was never enabled in any
previous releases so it seems safe to do this flag replacement. The new
flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount
of memory which can be consumed by the service queue of DataStreamService.
This allows a more direct control over the memory usage of the queue instead
of inferring via the number of entries in the queue. The default value of
this flag is left at 0, in which case it will be set to 5% of process
memory limit.

Testing done: exhaustive debug builds. Updated data-stream-test to
exercise the case in which the payload is larger than the limit.

Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test-base.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/mem-tracker.h
M be/src/service/data-stream-service.cc
M be/src/service/data-stream-service.h
M tests/custom_cluster/test_krpc_mem_usage.py
15 files changed, 292 insertions(+), 170 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/4
-- 
To view, visit http://gerrit.cloudera.org:8080/9282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9
Gerrit-Change-Number: 9282
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong