[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. IMPALA-6396: Exchange node's memory usage should include its receiver's A DataStreamRecvr is co-owned by the DataStreamMgr and an Exchange node. However, the life time of the memory allocations (e.g. row batches) of a DataStreamRecvr never exceeds that of its owning Exchange node. Previously, we used the fragment instance's MemTracker as the parent of the DataStreamRecvr's MemTracker. This change switches to using the MemTracker of the owning Exchange node as the parent tracker of the DataStreamRecvr. This makes it easier to identify the peak memory usage of the receivers of different exchange nodes in the runtime profile and query summary. Most of the exchange node's memory usage is from its receiver so we don't track the peak memory usage of the receiver separately. Sample output from TPCH-Q21: EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 18.35%) - ConvertRowBatchTime: 223.895ms - PeakMemoryUsage: 10.04 MB (10524943) - RowsReturned: 1.27M (1267464) - RowsReturnedRate: 875.19 K/sec RecvrSide: BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB - FirstBatchArrivalWaitTime: 0.000ns - TotalBytesReceived: 93.07 MB (97594728) - TotalGetBatchTime: 1s194ms - DataArrivalTimer: 1s183ms SenderSide: - DeserializeRowBatchTime: 344.343ms - NumBatchesAccepted: 3.80K (3796) - NumBatchesDeferred: 5 (5) - NumEarlySenders: 0 (0) Testing done: Updated test_observability.py to verify the peak memory usage of exchange node is not 0. Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Reviewed-on: http://gerrit.cloudera.org:8080/9202 Reviewed-by: Michael HoTested-by: Impala Public Jenkins --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M tests/query_test/test_observability.py 8 files changed, 42 insertions(+), 40 deletions(-) Approvals: Michael Ho: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 7 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Sat, 10 Feb 2018 03:10:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1917/ -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 23:28:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 6: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 23:27:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/9202/6/be/src/runtime/data-stream-mgr.h File be/src/runtime/data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/9202/6/be/src/runtime/data-stream-mgr.h@74 PS6, Line 74: owns nit: 'creates'. http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h File be/src/runtime/data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h@74 PS5, Line 74: MemTracker of the > I was hoping for something more insightful. Like can't we just say it shou Good point. Done. -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 22:43:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Hello Lars Volker, Sailesh Mukil, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9202 to look at the new patch set (#6). Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. IMPALA-6396: Exchange node's memory usage should include its receiver's A DataStreamRecvr is co-owned by the DataStreamMgr and an Exchange node. However, the life time of the memory allocations (e.g. row batches) of a DataStreamRecvr never exceeds that of its owning Exchange node. Previously, we used the fragment instance's MemTracker as the parent of the DataStreamRecvr's MemTracker. This change switches to using the MemTracker of the owning Exchange node as the parent tracker of the DataStreamRecvr. This makes it easier to identify the peak memory usage of the receivers of different exchange nodes in the runtime profile and query summary. Most of the exchange node's memory usage is from its receiver so we don't track the peak memory usage of the receiver separately. Sample output from TPCH-Q21: EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 18.35%) - ConvertRowBatchTime: 223.895ms - PeakMemoryUsage: 10.04 MB (10524943) - RowsReturned: 1.27M (1267464) - RowsReturnedRate: 875.19 K/sec RecvrSide: BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB - FirstBatchArrivalWaitTime: 0.000ns - TotalBytesReceived: 93.07 MB (97594728) - TotalGetBatchTime: 1s194ms - DataArrivalTimer: 1s183ms SenderSide: - DeserializeRowBatchTime: 344.343ms - NumBatchesAccepted: 3.80K (3796) - NumBatchesDeferred: 5 (5) - NumEarlySenders: 0 (0) Testing done: Updated test_observability.py to verify the peak memory usage of exchange node is not 0. Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M tests/query_test/test_observability.py 8 files changed, 42 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/6 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1916/ -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 22:34:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h File be/src/runtime/data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/9202/5/be/src/runtime/data-stream-mgr.h@74 PS5, Line 74: parent MemTracker I was hoping for something more insightful. Like can't we just say it should be the MemTracker for the exchange node. -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 22:33:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc File be/src/runtime/data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc@82 PS4, Line 82: DCHECK(profile != nullptr); > nit: I think we usually switch files from NULL to nullptr on the first occu Yes, I tried to keep the scope of the change to be smaller. I converted this one to make it at least consistent within a function. http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h@241 PS4, Line 241: /// Ownership of the receiver is shared between this DataStream mgr instance and the > could you document what parent_tracker should be? Done -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 22:31:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Hello Lars Volker, Sailesh Mukil, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9202 to look at the new patch set (#5). Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. IMPALA-6396: Exchange node's memory usage should include its receiver's A DataStreamRecvr is co-owned by the DataStreamMgr and an Exchange node. However, the life time of the memory allocations (e.g. row batches) of a DataStreamRecvr never exceeds that of its owning Exchange node. Previously, we used the fragment instance's MemTracker as the parent of the DataStreamRecvr's MemTracker. This change switches to using the MemTracker of the owning Exchange node as the parent tracker of the DataStreamRecvr. This makes it easier to identify the peak memory usage of the receivers of different exchange nodes in the runtime profile and query summary. Most of the exchange node's memory usage is from its receiver so we don't track the peak memory usage of the receiver separately. Sample output from TPCH-Q21: EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 18.35%) - ConvertRowBatchTime: 223.895ms - PeakMemoryUsage: 10.04 MB (10524943) - RowsReturned: 1.27M (1267464) - RowsReturnedRate: 875.19 K/sec RecvrSide: BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB - FirstBatchArrivalWaitTime: 0.000ns - TotalBytesReceived: 93.07 MB (97594728) - TotalGetBatchTime: 1s194ms - DataArrivalTimer: 1s183ms SenderSide: - DeserializeRowBatchTime: 344.343ms - NumBatchesAccepted: 3.80K (3796) - NumBatchesDeferred: 5 (5) - NumEarlySenders: 0 (0) Testing done: Updated test_observability.py to verify the peak memory usage of exchange node is not 0. Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M tests/query_test/test_observability.py 8 files changed, 42 insertions(+), 40 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/5 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/krpc-data-stream-mgr.h@241 PS4, Line 241: /// caller. could you document what parent_tracker should be? -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 21:49:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 4: Code-Review+1 (1 comment) Left one nit but I'm good with submitting as is. http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc File be/src/runtime/data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9202/4/be/src/runtime/data-stream-mgr.cc@82 PS4, Line 82: DCHECK(profile != nullptr); nit: I think we usually switch files from NULL to nullptr on the first occurrence. I don't feel strongly about it though, since it make merge conflicts more likely. -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 21:39:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Hello Lars Volker, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9202 to look at the new patch set (#4). Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. IMPALA-6396: Exchange node's memory usage should include its receiver's A DataStreamRecvr is co-owned by the DataStreamMgr and an Exchange node. However, the life time of the memory allocations (e.g. row batches) of a DataStreamRecvr never exceeds that of its owning Exchange node. Previously, we used the fragment instance's MemTracker as the parent of the DataStreamRecvr's MemTracker. This change switches to using the MemTracker of the owning Exchange node as the parent tracker of the DataStreamRecvr. This makes it easier to identify the peak memory usage of the receivers of different exchange nodes in the runtime profile and query summary. Most of the exchange node's memory usage is from its receiver so we don't track the peak memory usage of the receiver separately. Sample output from TPCH-Q21: EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 18.35%) - ConvertRowBatchTime: 223.895ms - PeakMemoryUsage: 10.04 MB (10524943) - RowsReturned: 1.27M (1267464) - RowsReturnedRate: 875.19 K/sec RecvrSide: BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB - FirstBatchArrivalWaitTime: 0.000ns - TotalBytesReceived: 93.07 MB (97594728) - TotalGetBatchTime: 1s194ms - DataArrivalTimer: 1s183ms SenderSide: - DeserializeRowBatchTime: 344.343ms - NumBatchesAccepted: 3.80K (3796) - NumBatchesDeferred: 5 (5) - NumEarlySenders: 0 (0) Testing done: Updated test_observability.py to verify the peak memory usage of exchange node is not 0. Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M tests/query_test/test_observability.py 8 files changed, 38 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/4 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG@16 PS2, Line 16: This makes it : easier to identify the peak memory usage of the receivers : of different exchange nodes in the runtime profile and : query summary. > Yes, they use different variants of the MemTracker constructors. The exchan Ok thanks for the explanation. Could you add that line to the commit message? I was worried about the SortedRunMerger, but I see that's in the DataStreamRecvr. I previously erroneously thought it was part of the exchange node itself. Given that's the case, I don't think it's a critical necessity to track the peak mem usage of the recvr separately. -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 09 Feb 2018 06:50:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG@7 PS2, Line 7: IMPALA-6396: Exchange node's memory usage should include its receiver's > How does this approach compare with tracking and logging the receiver's ind The receiver itself still has its own MemTracker after this change. So, it will still be printed out separately if one were to call MemTracker::LogUsage() (e.g. when memory limit is exceeded). It's true that one cannot tell the breakdown of the exchange node memory usage from the profile. That said, most of the memory usage from an exchange node is from its receiver anyway. FWIW, one also cannot tell the peak memory usage of the receiver from the profile even before this change. http://gerrit.cloudera.org:8080/#/c/9202/2//COMMIT_MSG@16 PS2, Line 16: This makes it : easier to identify the peak memory usage of the receivers : of different exchange nodes in the runtime profile and : query summary. > Could you just state why that is the case? I'm guessing it's because the ex Yes, they use different variants of the MemTracker constructors. The exchange node's MemTracker will add a counter to the RuntimeProfile. Please let me know if you think we need to track the peak memory usage of the receiver separately (e.g. it can be added to the sender_side_profile_). As far as I can tell, most of the memory consumption of an exchange node should be from its receiver so I didn't bother to have two counters which show mostly the same numbers. http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc@85 PS2, Line 85: state->fragment_instance_id(), id_, num_senders_, > nit: input_row_desc could fit on previous line. Done http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc@100 PS2, Line 100: DCHECK(profile != nullptr); > nit: DCHECK that parent_tracker != nullptr Done -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 07 Feb 2018 23:38:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Hello Lars Volker, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9202 to look at the new patch set (#3). Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. IMPALA-6396: Exchange node's memory usage should include its receiver's A DataStreamRecvr is co-owned by the DataStreamMgr and an Exchange node. However, the life time of the memory allocations (e.g. row batches) of a DataStreamRecvr never exceeds that of its owning Exchange node. Previously, we used the fragment instance's MemTracker as the parent of the DataStreamRecvr's MemTracker. This change switches to using the MemTracker of the owning Exchange node as the parent tracker of the DataStreamRecvr. This makes it easier to identify the peak memory usage of the receivers of different exchange nodes in the runtime profile and query summary. Sample output from TPCH-Q21: EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 18.35%) - ConvertRowBatchTime: 223.895ms - PeakMemoryUsage: 10.04 MB (10524943) - RowsReturned: 1.27M (1267464) - RowsReturnedRate: 875.19 K/sec RecvrSide: BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB - FirstBatchArrivalWaitTime: 0.000ns - TotalBytesReceived: 93.07 MB (97594728) - TotalGetBatchTime: 1s194ms - DataArrivalTimer: 1s183ms SenderSide: - DeserializeRowBatchTime: 344.343ms - NumBatchesAccepted: 3.80K (3796) - NumBatchesDeferred: 5 (5) - NumEarlySenders: 0 (0) Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M tests/query_test/test_observability.py 8 files changed, 38 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/3 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/exec/exchange-node.cc@85 PS2, Line 85: _row_desc_, state->fragment_instance_id(), id_, num_senders_, nit: input_row_desc could fit on previous line. http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9202/2/be/src/runtime/krpc-data-stream-mgr.cc@100 PS2, Line 100: DCHECK(profile != nullptr); nit: DCHECK that parent_tracker != nullptr -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 2 Gerrit-Owner: Michael HoGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 07 Feb 2018 18:09:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has removed Lars Volker from this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Removed reviewer Lars Volker. -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 2 Gerrit-Owner: Michael HoGerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. IMPALA-6396: Exchange node's memory usage should include its receiver's A DataStreamRecvr is co-owned by the DataStreamMgr and an Exchange node. However, the life time of the memory allocations (e.g. row batches) of a DataStreamRecvr never exceeds that of its owning Exchange node. Previously, we used the fragment instance's MemTracker as the parent of the DataStreamRecvr's MemTracker. This change switches to using the MemTracker of the owning Exchange node as the parent tracker of the DataStreamRecvr. This makes it easier to identify the peak memory usage of the receivers of different exchange nodes in the runtime profile and query summary. Sample output from TPCH-Q21: EXCHANGE_NODE (id=18):(Total: 1s448ms, non-child: 265.818ms, % non-child: 18.35%) - ConvertRowBatchTime: 223.895ms - PeakMemoryUsage: 10.04 MB (10524943) - RowsReturned: 1.27M (1267464) - RowsReturnedRate: 875.19 K/sec RecvrSide: BytesReceived(500.000ms): 0, 1.64 MB, 9.98 MB, 9.98 MB, 10.01 MB, 10.01 MB, 10.01 MB, 31.79 MB, 60.19 MB, 87.84 MB - FirstBatchArrivalWaitTime: 0.000ns - TotalBytesReceived: 93.07 MB (97594728) - TotalGetBatchTime: 1s194ms - DataArrivalTimer: 1s183ms SenderSide: - DeserializeRowBatchTime: 344.343ms - NumBatchesAccepted: 3.80K (3796) - NumBatchesDeferred: 5 (5) - NumEarlySenders: 0 (0) Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 --- M be/src/exec/exchange-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M tests/query_test/test_observability.py 8 files changed, 35 insertions(+), 36 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/9202/2 -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 2 Gerrit-Owner: Michael Ho