Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12604 )

Change subject: Add timing information to /scans
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12604/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12604/1//COMMIT_MSG@7
PS1, Line 7: Add timing information to /scans
> Link to KUDU-2705.
:O I didn't know there was a JIRA for this. I thought someone had just 
mentioned it in Slack the other day.


http://gerrit.cloudera.org:8080/#/c/12604/1//COMMIT_MSG@14
PS1, Line 14: Screenshot: 
https://github.com/wdberkeley/kudu/blob/scanspage_screenshot/ss.png
> Curious that all of those scans didn't scan anything. Is that expected? Are
They are scans of loadgen tables with 10K's of rows only. I added a laboriously 
long tooltip to explain the difference between wall time and duration. If 
someone needs to know what user and system time are and how they differ from 
duration or wall (real) time, they can look that up.

The fact that wall > duration is weird. At first I thought this was because I 
was double-counting time by having two ScopedAddScannerTiming's in scope at 
once for a scanner, but that's not so. Then I thought it was because the calls 
to UpdateAccessTime were not covering the whole scope of the 
ScopedAddScannerTiming. This was true, and I adjusted things to make them match 
up more closely, and this should make duration >= wall. But then I tried to 
test it, and I see an example on my mac of a scan of < 2000 rows from 5 seconds 
ago that reports a duration of 855 microseconds, a wall time of 4.57s, and a 
system time of 1.15 minutes:

    real: 4.72 s user: 3.93 ms sys: 1.15 min

Also, I see a number of scans with almost exactly the same weird timing. So I 
think what we're seeing is timing being broken or at least not working like we 
expect on macOS. I'm researching it a bit further but I will also try to take 
this patch and test it on a Linux machine.


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/scanners.h
File src/kudu/tserver/scanners.h:

http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/scanners.h@46
PS1, Line 46: #include "kudu/util/stopwatch.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/scanners.h@434
PS1, Line 434:   // The cumulative amounts of wall, user cpu, and system cpu 
time spent on
             :   // this scanner, in seconds.
             :   double wall_secs;
             :   double user_secs;
             :   double sys_secs;
> curious if you considered just using CpuTimes from stopwatch.h for this and
Done


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/scanners.h@444
PS1, Line 444:   explicit ScopedAddScannerTiming(Scanner* scanner)
> Add a comment about the lifetime of the scanner vs. the scoped object.
Done


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/scanners.h@466
PS1, Line 466: };
> should probably have a DISALLOW_COPY_AND_ASSIGN so it isn't accidentally co
Done


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/tablet_service.cc@1990
PS1, Line 1990:   // TODO: add test cases for bad projections including 0 
columns
> warning: missing username/bug in TODO [google-readability-todo]
Done


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/tablet_service.cc@2182
PS1, Line 2182:     // TODO: Instead of copying the pb, instead split 
HandleContinueScanRequest
> warning: missing username/bug in TODO [google-readability-todo]
Done


http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/tserver_path_handlers.cc
File src/kudu/tserver/tserver_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/12604/1/src/kudu/tserver/tserver_path_handlers.cc@542
PS1, Line 542: duration
> can we somehow clarify in the UI the distinction between duration and wall_
Done. And see my very long response to Andrew's comment about the same issue 
(on the commit message).


http://gerrit.cloudera.org:8080/#/c/12604/1/www/scans.mustache
File www/scans.mustache:

http://gerrit.cloudera.org:8080/#/c/12604/1/www/scans.mustache@29
PS1, Line 29: running time of the scan"
> can we clarify how this differs from wall time?
Done



--
To view, visit http://gerrit.cloudera.org:8080/12604
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id30251557ae7f0d97e0d1328bd53511dae7a820b
Gerrit-Change-Number: 12604
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Wed, 27 Feb 2019 00:52:28 +0000
Gerrit-HasComments: Yes

Reply via email to