[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. IMPALA-7731: Add Read/Exchange counters to profile Selective scans (and by extension selective fragment instances) take higher performance hits when reading data remotely. They can be identified by a low ratio between data being transmitted vs data being read from HDFS. This change adds several counters to the profile to make it easier to identify queries based on their scan instance selectivity. * TotalBytesSent - The total number of bytes sent by a query in exchange nodes. Does not include remote reads, data written to disk, or data sent to the client. * TotalScanBytesSent - The total number of bytes sent by fragment instances that had a scan node in their plan. * TotalInnerBytesSent - The total number of bytes sent by fragment instances that did not have a scan node in their plan, i.e. that received their input data from other instances through exchange node. * ExchangeScanRatio - The ratio between TotalScanBytesSent and TotalBytesRead, i.e. the selectivity over all fragment instances that had a scan node in their plan. This counter is also added to each fragment instance. * InnerNodeSelectivityRatio - The ratio between bytes sent by instances with a scan node in their plan and instances without a scan node in their plan. This indicates how well the inner nodes of the execution plan reduced the data volume. Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Reviewed-on: http://gerrit.cloudera.org:8080/12229 Reviewed-by: Lars Volker Tested-by: Impala Public Jenkins --- M be/src/exec/scan-node.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/util/pretty-printer.h M tests/query_test/test_observability.py 12 files changed, 141 insertions(+), 3 deletions(-) Approvals: Lars Volker: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 6 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 5 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 25 Jan 2019 03:50:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3673/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 5 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 24 Jan 2019 23:24:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 5: Code-Review+2 Rebased, carrying Phil's +2 -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 5 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Thu, 24 Jan 2019 23:23:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1830/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 23:53:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 4: (10 comments) http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc@251 PS2, Line 251: } > Is this how we determine whether the fragment had a scan in it or not? Poss Done http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524 PS3, Line 524: RuntimeProfile::Counter* c = p->GetCounter(ScanNode::SCAN_RANGES_COMPLETE_COUNTER); > ?? Sry, done. http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531 PS3, Line 531: p->GetCounter(KrpcDataStreamSender::TOTAL_BYTES_SENT_COUNTER); > This this be a constant somewhere? Done http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@779 PS2, Line 779: total_utilization.bytes_read); > Should we add comments here about what these new counters mean? (Especially I added comments to all of them. Once IMPALA-7550 comes around we can mode them to the counter definition. (WIP here: https://gerrit.cloudera.org/#/c/12116/) http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@794 PS2, Line 794: double xchg_scan_ratio = 0; > nit: extra line Done http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790 PS3, Line 790: // node. > Is it the case that It follows directly from these three COUNTER_SET calls, the addition for TotalBytesSent happens in L783. I added a test to the python tests. http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: "TotalInnerBytesSent", "ExchangeScanRatio", > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: > This query takes 3.4 secs on my machine, which is far from extreme, but I w I picked a query that does a partitioned exchange without being able to push predicates down to the scan, so that the TxRatio would be large. This was the simplest one that I could come up with. It takes 2.4 s on my machine. Do you have an idea for a faster one. http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391 PS2, Line 391: > flake8: W291 trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390 PS3, Line 390: expected_counters = ["TotalBytesRead", "TotalBytesSent", "TotalScanBytesSent", > Should we add small tests for the other new metrics you have? I added a second test that checks all counters are present, and amended this one to check that the *BytesSent counters add up. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 23:20:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12229 to look at the new patch set (#4). Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. IMPALA-7731: Add Read/Exchange counters to profile Selective scans (and by extension selective fragment instances) take higher performance hits when reading data remotely. They can be identified by a low ratio between data being transmitted vs data being read from HDFS. This change adds several counters to the profile to make it easier to identify queries based on their scan instance selectivity. * TotalBytesSent - The total number of bytes sent by a query in exchange nodes. Does not include remote reads, data written to disk, or data sent to the client. * TotalScanBytesSent - The total number of bytes sent by fragment instances that had a scan node in their plan. * TotalInnerBytesSent - The total number of bytes sent by fragment instances that did not have a scan node in their plan, i.e. that received their input data from other instances through exchange node. * ExchangeScanRatio - The ratio between TotalScanBytesSent and TotalBytesRead, i.e. the selectivity over all fragment instances that had a scan node in their plan. This counter is also added to each fragment instance. * InnerNodeSelectivityRatio - The ratio between bytes sent by instances with a scan node in their plan and instances without a scan node in their plan. This indicates how well the inner nodes of the execution plan reduced the data volume. Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 --- M be/src/exec/scan-node.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/util/pretty-printer.h M tests/query_test/test_observability.py 12 files changed, 141 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/4 -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 3: Code-Review+2 (4 comments) Thanks! http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524 PS3, Line 524: //if (!p->metadata().__isset.plan_node_id) continue; ?? http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531 PS3, Line 531: RuntimeProfile::Counter* bytes_sent = p->GetCounter("TotalBytesSent"); This this be a constant somewhere? http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790 PS3, Line 790: // node. Is it the case that TotalBytesSent == TotalScanBytesSent + TotalInnerBytesSent ? Do we have a place to DCHECK assert that and/or test it? http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390 PS3, Line 390: def test_exchange_scan_ratio_non_zero(self): Should we add small tests for the other new metrics you have? -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 22:40:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12229 ) Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1828/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 18 Jan 2019 22:24:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12229 to look at the new patch set (#3). Change subject: IMPALA-7731: Add Read/Exchange counters to profile .. IMPALA-7731: Add Read/Exchange counters to profile Selective scans (and by extension selective fragment instances) take higher performance hits when reading data remotely. They can be identified by a low ratio between data being transmitted vs data being read from HDFS. This change adds several counters to the profile to make it easier to identify queries based on their scan instance selectivity. * TotalBytesRead - The total number of bytes read by a query. * TotalBytesSent - The total number of bytes sent by a query in exchange nodes. Does not include remote reads, data written to disk, or data sent to the client. * TotalScanBytesSent - The total number of bytes sent by fragment instances that had a scan node in their plan. * TotalInnerBytesSent - The total number of bytes sent by fragment instances that did not have a scan node in their plan, i.e. that received their input data from other instances through exchange node. * ExchangeScanRatio - The ratio between TotalScanBytesSent and TotalBytesRead, i.e. the selectivity over all fragment instances that had a scan node in their plan. This counter is also added to each fragment instance. * InnerNodeSelectivityRatio - The ratio between bytes sent by instances with a scan node in their plan and instances without a scan node in their plan. This indicates how well the inner nodes of the execution plan reduced the data volume. Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 --- M be/src/exec/scan-node.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/util/pretty-printer.h M tests/query_test/test_observability.py 11 files changed, 112 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/3 -- To view, visit http://gerrit.cloudera.org:8080/12229 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648 Gerrit-Change-Number: 12229 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Philip Zeyliger