[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..

IMPALA-7731: Add Read/Exchange counters to profile

Selective scans (and by extension selective fragment instances)
take higher performance hits when reading data remotely. They can
be identified by a low ratio between data being transmitted vs data
being read from HDFS.

This change adds several counters to the profile to make it easier to
identify queries based on their scan instance selectivity.

* TotalBytesSent - The total number of bytes sent by a query in
  exchange nodes. Does not include remote reads, data written to disk,
  or data sent to the client.

* TotalScanBytesSent - The total number of bytes sent by fragment
  instances that had a scan node in their plan.

* TotalInnerBytesSent - The total number of bytes sent by fragment
  instances that did not have a scan node in their plan, i.e. that
  received their input data from other instances through exchange node.

* ExchangeScanRatio - The ratio between TotalScanBytesSent and
  TotalBytesRead, i.e. the selectivity over all fragment instances that
  had a scan node in their plan. This counter is also added to each
  fragment instance.

* InnerNodeSelectivityRatio - The ratio between bytes sent by instances
  with a scan node in their plan and instances without a scan node in
  their plan. This indicates how well the inner nodes of the execution
  plan reduced the data volume.

Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Reviewed-on: http://gerrit.cloudera.org:8080/12229
Reviewed-by: Lars Volker 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/util/pretty-printer.h
M tests/query_test/test_observability.py
12 files changed, 141 insertions(+), 3 deletions(-)

Approvals:
  Lars Volker: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 6
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 25 Jan 2019 03:50:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3673/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 24 Jan 2019 23:24:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 5: Code-Review+2

Rebased, carrying Phil's +2


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 5
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 24 Jan 2019 23:23:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1830/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:53:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator-backend-state.cc@251
PS2, Line 251: }
> Is this how we determine whether the fragment had a scan in it or not? Poss
Done


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524
PS3, Line 524: RuntimeProfile::Counter* c = 
p->GetCounter(ScanNode::SCAN_RANGES_COMPLETE_COUNTER);
> ??
Sry, done.


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531
PS3, Line 531: 
p->GetCounter(KrpcDataStreamSender::TOTAL_BYTES_SENT_COUNTER);
> This this be a constant somewhere?
Done


http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@779
PS2, Line 779:   total_utilization.bytes_read);
> Should we add comments here about what these new counters mean? (Especially
I added comments to all of them. Once IMPALA-7550 comes around we can mode them 
to the counter definition. (WIP here: https://gerrit.cloudera.org/#/c/12116/)


http://gerrit.cloudera.org:8080/#/c/12229/2/be/src/runtime/coordinator.cc@794
PS2, Line 794:   double xchg_scan_ratio = 0;
> nit: extra line
Done


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790
PS3, Line 790:   // node.
> Is it the case that
It follows directly from these three COUNTER_SET calls, the addition for 
TotalBytesSent happens in L783. I added a test to the python tests.


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391:  "TotalInnerBytesSent", 
"ExchangeScanRatio",
> line has trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391:
> This query takes 3.4 secs on my machine, which is far from extreme, but I w
I picked a query that does a partitioned exchange without being able to push 
predicates down to the scan, so that the TxRatio would be large. This was the 
simplest one that I could come up with. It takes 2.4 s on my machine. Do you 
have an idea for a faster one.


http://gerrit.cloudera.org:8080/#/c/12229/2/tests/query_test/test_observability.py@391
PS2, Line 391:
> flake8: W291 trailing whitespace
Done


http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390
PS3, Line 390: expected_counters = ["TotalBytesRead", "TotalBytesSent", 
"TotalScanBytesSent",
> Should we add small tests for the other new metrics you have?
I added a second test that checks all counters are present, and amended this 
one to check that the *BytesSent counters add up.



--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 23:20:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Lars Volker (Code Review)
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12229

to look at the new patch set (#4).

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..

IMPALA-7731: Add Read/Exchange counters to profile

Selective scans (and by extension selective fragment instances)
take higher performance hits when reading data remotely. They can
be identified by a low ratio between data being transmitted vs data
being read from HDFS.

This change adds several counters to the profile to make it easier to
identify queries based on their scan instance selectivity.

* TotalBytesSent - The total number of bytes sent by a query in
  exchange nodes. Does not include remote reads, data written to disk,
  or data sent to the client.

* TotalScanBytesSent - The total number of bytes sent by fragment
  instances that had a scan node in their plan.

* TotalInnerBytesSent - The total number of bytes sent by fragment
  instances that did not have a scan node in their plan, i.e. that
  received their input data from other instances through exchange node.

* ExchangeScanRatio - The ratio between TotalScanBytesSent and
  TotalBytesRead, i.e. the selectivity over all fragment instances that
  had a scan node in their plan. This counter is also added to each
  fragment instance.

* InnerNodeSelectivityRatio - The ratio between bytes sent by instances
  with a scan node in their plan and instances without a scan node in
  their plan. This indicates how well the inner nodes of the execution
  plan reduced the data volume.

Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
---
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/util/pretty-printer.h
M tests/query_test/test_observability.py
12 files changed, 141 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/4
--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 3: Code-Review+2

(4 comments)

Thanks!

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@524
PS3, Line 524: //if (!p->metadata().__isset.plan_node_id) continue;
??


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator-backend-state.cc@531
PS3, Line 531: RuntimeProfile::Counter* bytes_sent = 
p->GetCounter("TotalBytesSent");
This this be a constant somewhere?


http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12229/3/be/src/runtime/coordinator.cc@790
PS3, Line 790:   // node.
Is it the case that

TotalBytesSent == TotalScanBytesSent + TotalInnerBytesSent

?

Do we have a place to DCHECK assert that and/or test it?


http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12229/3/tests/query_test/test_observability.py@390
PS3, Line 390:   def test_exchange_scan_ratio_non_zero(self):
Should we add small tests for the other new metrics you have?



--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:40:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12229 )

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1828/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 Jan 2019 22:24:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7731: Add Read/Exchange counters to profile

2019-01-18 Thread Lars Volker (Code Review)
Hello Philip Zeyliger, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12229

to look at the new patch set (#3).

Change subject: IMPALA-7731: Add Read/Exchange counters to profile
..

IMPALA-7731: Add Read/Exchange counters to profile

Selective scans (and by extension selective fragment instances)
take higher performance hits when reading data remotely. They can
be identified by a low ratio between data being transmitted vs data
being read from HDFS.

This change adds several counters to the profile to make it easier to
identify queries based on their scan instance selectivity.

* TotalBytesRead - The total number of bytes read by a query.

* TotalBytesSent - The total number of bytes sent by a query in
  exchange nodes. Does not include remote reads, data written to disk,
  or data sent to the client.

* TotalScanBytesSent - The total number of bytes sent by fragment
  instances that had a scan node in their plan.

* TotalInnerBytesSent - The total number of bytes sent by fragment
  instances that did not have a scan node in their plan, i.e. that
  received their input data from other instances through exchange node.

* ExchangeScanRatio - The ratio between TotalScanBytesSent and
  TotalBytesRead, i.e. the selectivity over all fragment instances that
  had a scan node in their plan. This counter is also added to each
  fragment instance.

* InnerNodeSelectivityRatio - The ratio between bytes sent by instances
  with a scan node in their plan and instances without a scan node in
  their plan. This indicates how well the inner nodes of the execution
  plan reduced the data volume.

Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
---
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/util/pretty-printer.h
M tests/query_test/test_observability.py
11 files changed, 112 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/12229/3
--
To view, visit http://gerrit.cloudera.org:8080/12229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife7ec78fe42558429c1cbe6e5eba79842bffd648
Gerrit-Change-Number: 12229
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger