[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..

IMPALA-6396: Exchange node's memory usage should include its receiver's

A DataStreamRecvr is co-owned by the DataStreamMgr and
an Exchange node. However, the life time of the memory
allocations (e.g. row batches) of a DataStreamRecvr never
exceeds that of its owning Exchange node. Previously, we
used the fragment instance's MemTracker as the parent of
the DataStreamRecvr's MemTracker. This change switches to
using the MemTracker of the owning Exchange node as the
parent tracker of the DataStreamRecvr. This makes it
easier to identify the peak memory usage of the receivers
of different exchange nodes in the runtime profile and
query summary. Most of the exchange node's memory usage
is from its receiver so we don't track the peak memory
usage of the receiver separately.

Sample output from TPCH-Q21:

EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 
18.35%)
   - ConvertRowBatchTime: 223.895ms
   - PeakMemoryUsage: 10.04 MB (10524943)
   - RowsReturned: 1.27M (1267464)
   - RowsReturnedRate: 875.19 K/sec
  RecvrSide:
BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 
10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB
 - FirstBatchArrivalWaitTime: 0.000ns
 - TotalBytesReceived: 93.07 MB (97594728)
 - TotalGetBatchTime: 1s194ms
   - DataArrivalTimer: 1s183ms
   SenderSide:
  - DeserializeRowBatchTime: 344.343ms
  - NumBatchesAccepted: 3.80K (3796)
  - NumBatchesDeferred: 5 (5)
  - NumEarlySenders: 0 (0)

Testing done: Updated test_observability.py to verify the
peak memory usage of exchange node is not 0.

Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Reviewed-on: http://gerrit.cloudera.org:8080/9202
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M tests/query_test/test_observability.py
8 files changed, 42 insertions(+), 40 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Sat, 10 Feb 2018 03:10:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1917/


--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 23:28:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 6: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 23:27:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9202/6/be/src/runtime/data-stream-mgr.h
File be/src/runtime/data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9202/6/be/src/runtime/data-stream-mgr.h@74
PS6, Line 74:  owns
nit: 'creates'.


http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h
File be/src/runtime/data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h@74
PS5, Line 74: MemTracker of the
> I was hoping for something more insightful.  Like can't we just say it shou
Good point. Done.



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 22:43:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Michael Ho (Code Review)
Hello Lars Volker, Sailesh Mukil, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9202

to look at the new patch set (#6).

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..

IMPALA-6396: Exchange node's memory usage should include its receiver's

A DataStreamRecvr is co-owned by the DataStreamMgr and
an Exchange node. However, the life time of the memory
allocations (e.g. row batches) of a DataStreamRecvr never
exceeds that of its owning Exchange node. Previously, we
used the fragment instance's MemTracker as the parent of
the DataStreamRecvr's MemTracker. This change switches to
using the MemTracker of the owning Exchange node as the
parent tracker of the DataStreamRecvr. This makes it
easier to identify the peak memory usage of the receivers
of different exchange nodes in the runtime profile and
query summary. Most of the exchange node's memory usage
is from its receiver so we don't track the peak memory
usage of the receiver separately.

Sample output from TPCH-Q21:

EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 
18.35%)
   - ConvertRowBatchTime: 223.895ms
   - PeakMemoryUsage: 10.04 MB (10524943)
   - RowsReturned: 1.27M (1267464)
   - RowsReturnedRate: 875.19 K/sec
  RecvrSide:
BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 
10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB
 - FirstBatchArrivalWaitTime: 0.000ns
 - TotalBytesReceived: 93.07 MB (97594728)
 - TotalGetBatchTime: 1s194ms
   - DataArrivalTimer: 1s183ms
   SenderSide:
  - DeserializeRowBatchTime: 344.343ms
  - NumBatchesAccepted: 3.80K (3796)
  - NumBatchesDeferred: 5 (5)
  - NumEarlySenders: 0 (0)

Testing done: Updated test_observability.py to verify the
peak memory usage of exchange node is not 0.

Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M tests/query_test/test_observability.py
8 files changed, 42 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/6
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1916/


--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 22:34:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h
File be/src/runtime/data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h@74
PS5, Line 74: parent MemTracker
I was hoping for something more insightful.  Like can't we just say it should 
be the MemTracker for the exchange node.



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 22:33:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc
File be/src/runtime/data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc@82
PS4, Line 82:   DCHECK(profile != nullptr);
> nit: I think we usually switch files from NULL to nullptr on the first occu
Yes, I tried to keep the scope of the change to be smaller. I converted this 
one to make it at least consistent within a function.


http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h@241
PS4, Line 241:   /// Ownership of the receiver is shared between this 
DataStream mgr instance and the
> could you document what parent_tracker should be?
Done



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 22:31:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Michael Ho (Code Review)
Hello Lars Volker, Sailesh Mukil, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9202

to look at the new patch set (#5).

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..

IMPALA-6396: Exchange node's memory usage should include its receiver's

A DataStreamRecvr is co-owned by the DataStreamMgr and
an Exchange node. However, the life time of the memory
allocations (e.g. row batches) of a DataStreamRecvr never
exceeds that of its owning Exchange node. Previously, we
used the fragment instance's MemTracker as the parent of
the DataStreamRecvr's MemTracker. This change switches to
using the MemTracker of the owning Exchange node as the
parent tracker of the DataStreamRecvr. This makes it
easier to identify the peak memory usage of the receivers
of different exchange nodes in the runtime profile and
query summary. Most of the exchange node's memory usage
is from its receiver so we don't track the peak memory
usage of the receiver separately.

Sample output from TPCH-Q21:

EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 
18.35%)
   - ConvertRowBatchTime: 223.895ms
   - PeakMemoryUsage: 10.04 MB (10524943)
   - RowsReturned: 1.27M (1267464)
   - RowsReturnedRate: 875.19 K/sec
  RecvrSide:
BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 
10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB
 - FirstBatchArrivalWaitTime: 0.000ns
 - TotalBytesReceived: 93.07 MB (97594728)
 - TotalGetBatchTime: 1s194ms
   - DataArrivalTimer: 1s183ms
   SenderSide:
  - DeserializeRowBatchTime: 344.343ms
  - NumBatchesAccepted: 3.80K (3796)
  - NumBatchesDeferred: 5 (5)
  - NumEarlySenders: 0 (0)

Testing done: Updated test_observability.py to verify the
peak memory usage of exchange node is not 0.

Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M tests/query_test/test_observability.py
8 files changed, 42 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/5
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h@241
PS4, Line 241:   /// caller.
could you document what parent_tracker should be?



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:49:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 4: Code-Review+1

(1 comment)

Left one nit but I'm good with submitting as is.

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc
File be/src/runtime/data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc@82
PS4, Line 82:   DCHECK(profile != nullptr);
nit: I think we usually switch files from NULL to nullptr on the first 
occurrence. I don't feel strongly about it though, since it make merge 
conflicts more likely.



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:39:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-09 Thread Michael Ho (Code Review)
Hello Lars Volker, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9202

to look at the new patch set (#4).

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..

IMPALA-6396: Exchange node's memory usage should include its receiver's

A DataStreamRecvr is co-owned by the DataStreamMgr and
an Exchange node. However, the life time of the memory
allocations (e.g. row batches) of a DataStreamRecvr never
exceeds that of its owning Exchange node. Previously, we
used the fragment instance's MemTracker as the parent of
the DataStreamRecvr's MemTracker. This change switches to
using the MemTracker of the owning Exchange node as the
parent tracker of the DataStreamRecvr. This makes it
easier to identify the peak memory usage of the receivers
of different exchange nodes in the runtime profile and
query summary. Most of the exchange node's memory usage
is from its receiver so we don't track the peak memory
usage of the receiver separately.

Sample output from TPCH-Q21:

EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 
18.35%)
   - ConvertRowBatchTime: 223.895ms
   - PeakMemoryUsage: 10.04 MB (10524943)
   - RowsReturned: 1.27M (1267464)
   - RowsReturnedRate: 875.19 K/sec
  RecvrSide:
BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 
10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB
 - FirstBatchArrivalWaitTime: 0.000ns
 - TotalBytesReceived: 93.07 MB (97594728)
 - TotalGetBatchTime: 1s194ms
   - DataArrivalTimer: 1s183ms
   SenderSide:
  - DeserializeRowBatchTime: 344.343ms
  - NumBatchesAccepted: 3.80K (3796)
  - NumBatchesDeferred: 5 (5)
  - NumEarlySenders: 0 (0)

Testing done: Updated test_observability.py to verify the
peak memory usage of exchange node is not 0.

Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M tests/query_test/test_observability.py
8 files changed, 38 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/4
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-08 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG@16
PS2, Line 16: This makes it
: easier to identify the peak memory usage of the receivers
: of different exchange nodes in the runtime profile and
: query summary.
> Yes, they use different variants of the MemTracker constructors. The exchan
Ok thanks for the explanation. Could you add that line to the commit message?

I was worried about the SortedRunMerger, but I see that's in the 
DataStreamRecvr. I previously erroneously thought it was part of the exchange 
node itself. Given that's the case, I don't think it's a critical necessity to 
track the peak mem usage of the recvr separately.



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 09 Feb 2018 06:50:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
> How does this approach compare with tracking and logging the receiver's ind
The receiver itself still has its own MemTracker after this change. So, it will 
still be printed out separately if one were to call MemTracker::LogUsage() 
(e.g. when memory limit is exceeded). It's true that one cannot tell the 
breakdown of the exchange node memory usage from the profile. That said, most 
of the memory usage from an exchange node is from its receiver anyway.

FWIW, one also cannot tell the peak memory usage of the receiver from the 
profile even before this change.


http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG@16
PS2, Line 16: This makes it
: easier to identify the peak memory usage of the receivers
: of different exchange nodes in the runtime profile and
: query summary.
> Could you just state why that is the case? I'm guessing it's because the ex
Yes, they use different variants of the MemTracker constructors. The exchange 
node's MemTracker will add a counter to the RuntimeProfile. Please let me know 
if you think we need to track the peak memory usage of the receiver separately 
(e.g. it can be added to the sender_side_profile_). As far as I can tell, most 
of the memory consumption of an exchange node should be from its receiver so I 
didn't bother to have two counters which show mostly the same numbers.


http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc@85
PS2, Line 85:   state->fragment_instance_id(), id_, num_senders_,
> nit: input_row_desc could fit on previous line.
Done


http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc@100
PS2, Line 100:   DCHECK(profile != nullptr);
> nit: DCHECK that parent_tracker != nullptr
Done



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 07 Feb 2018 23:38:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-07 Thread Michael Ho (Code Review)
Hello Lars Volker, Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9202

to look at the new patch set (#3).

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..

IMPALA-6396: Exchange node's memory usage should include its receiver's

A DataStreamRecvr is co-owned by the DataStreamMgr and
an Exchange node. However, the life time of the memory
allocations (e.g. row batches) of a DataStreamRecvr never
exceeds that of its owning Exchange node. Previously, we
used the fragment instance's MemTracker as the parent of
the DataStreamRecvr's MemTracker. This change switches to
using the MemTracker of the owning Exchange node as the
parent tracker of the DataStreamRecvr. This makes it
easier to identify the peak memory usage of the receivers
of different exchange nodes in the runtime profile and
query summary.

Sample output from TPCH-Q21:

EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 
18.35%)
   - ConvertRowBatchTime: 223.895ms
   - PeakMemoryUsage: 10.04 MB (10524943)
   - RowsReturned: 1.27M (1267464)
   - RowsReturnedRate: 875.19 K/sec
  RecvrSide:
BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 
10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB
 - FirstBatchArrivalWaitTime: 0.000ns
 - TotalBytesReceived: 93.07 MB (97594728)
 - TotalGetBatchTime: 1s194ms
   - DataArrivalTimer: 1s183ms
   SenderSide:
  - DeserializeRowBatchTime: 344.343ms
  - NumBatchesAccepted: 3.80K (3796)
  - NumBatchesDeferred: 5 (5)
  - NumEarlySenders: 0 (0)

Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M tests/query_test/test_observability.py
8 files changed, 38 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/3
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc@85
PS2, Line 85:   _row_desc_, state->fragment_instance_id(), id_, 
num_senders_,
nit: input_row_desc could fit on previous line.


http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc@100
PS2, Line 100:   DCHECK(profile != nullptr);
nit: DCHECK that parent_tracker != nullptr



--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 07 Feb 2018 18:09:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-06 Thread Michael Ho (Code Review)
Michael Ho has removed Lars Volker from this change.  ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Removed reviewer Lars Volker.
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-02 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..

IMPALA-6396: Exchange node's memory usage should include its receiver's

A DataStreamRecvr is co-owned by the DataStreamMgr and
an Exchange node. However, the life time of the memory
allocations (e.g. row batches) of a DataStreamRecvr never
exceeds that of its owning Exchange node. Previously, we
used the fragment instance's MemTracker as the parent of
the DataStreamRecvr's MemTracker. This change switches to
using the MemTracker of the owning Exchange node as the
parent tracker of the DataStreamRecvr. This makes it
easier to identify the peak memory usage of the receivers
of different exchange nodes in the runtime profile and
query summary.

Sample output from TPCH-Q21:

EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 
18.35%)
   - ConvertRowBatchTime: 223.895ms
   - PeakMemoryUsage: 10.04 MB (10524943)
   - RowsReturned: 1.27M (1267464)
   - RowsReturnedRate: 875.19 K/sec
  RecvrSide:
BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 
10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB
 - FirstBatchArrivalWaitTime: 0.000ns
 - TotalBytesReceived: 93.07 MB (97594728)
 - TotalGetBatchTime: 1s194ms
   - DataArrivalTimer: 1s183ms
   SenderSide:
  - DeserializeRowBatchTime: 344.343ms
  - NumBatchesAccepted: 3.80K (3796)
  - NumBatchesDeferred: 5 (5)
  - NumEarlySenders: 0 (0)

Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
---
M be/src/exec/exchange-node.cc
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M tests/query_test/test_observability.py
8 files changed, 35 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/2
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho