Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14776 )
Change subject: IMPALA-7550: Add documentation to profile counters ...................................................................... Patch Set 6: (27 comments) Hi Balazs, Thanks so much for help correct the profile counters descriptions! That's really helpful! Agree that we need to keep code out of stable counters descriptions. Unfortunately, I only have so little knowledge on the scan counters so that a lot of counters I am not sure how to fix the description. Ask @Tim Armstrong and @Lars to help take a look on that. Also, if you have strong options how to update the counters. Please feel free to let me know. Have already applied a few suggestions you made. Thanks Jiawei http://gerrit.cloudera.org:8080/#/c/14776/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14776/5//COMMIT_MSG@37 PS5, Line 37: 2. Profile counters are also annotated with their significance to users. : * STABLE_HIGH - High level and stable counters, always useful on measuring : query performance and status. Counters that everyone is interested. should : rarely change and if it does we will make some effort to notify users. : : * STABLE_LOW - Low level and stable counters. Interesting counters to monitor : and analyze by machine. It will probably be interesting under some : circumstance for users. Lots of developers are interested. : : * Unstable - Unstable but useful. Useful to understand query performance, : but subject to change, particularly if the implementation changes. : E.g. MaterializeTupleTimer : : * Debug - Debugging counters. Generally not useful to users of Impala, : the main use case is low-level debugging. Can be hidden to reduce noise : for most consumers of profiles. : : 3. We have around 250 counters. This commit did the replacement in : scan-node and hdfs-scan-node-base and coordinator. : > Please simplify this to what's in the code - I think the descriptions in Si Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@70 PS5, Line 70: across all Disk I/O threads in HDFS read operations."); : PROFILE_DEFINE_TIMER(TotalRawHdfsOpenFileTime, STABLE_LOW, "The tota > Instead of including an explanation in individual description (but not in e Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@79 PS5, Line 79: lectionItemsRead, STABLE_LOW, TUnit::UNIT, > Is this included in profiles where it's irrelevant? If no, remove this to a Not sure about it. @Lars and @Tarmstrong maybe can help to take a look? http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@85 PS5, Line 85: _DEFINE_SA > Maybe '...remote data source...'? Disk queue is not a term users will be fa @Lars and @Tarmstrong http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@88 PS5, Line 88: ty of the sys > Than what? - Is there a point of comparison we can provide? For example, 'v Not sure... if someone have better knowledge than this might want to fix it... http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@89 PS5, Line 89: the s > scan Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@90 PS5, Line 90: it::BYTES, > I don't think we should speculate on root causes - there can be many reason Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@121 PS5, Line 121: the scanner thread was ready to process " : "the data. High values show that scanner threa > Isn't this the same as 'HDFS scans'? IDK to be honest... @Tarmstrong and @Lars http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@125 PS5, Line 125: pressedBytesReadPerColumn, STABLE_LOW, : TUnit::BYTES, "Stats a > replace: Note that this includes the time when Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@130 PS5, Line 130: , "Stats about the number of compressed bytes read per column. " : "Each sample in the counter is the size of a single column that is scanned by the " : "scan node."); > remove? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/hdfs-scan-node-base.cc@136 PS5, Line 136: of data cache partially hit"); : PROFILE_DEFINE_COUNTER(DataCacheMissCount, STABLE_HIGH, TUnit::UNIT, : "Total count of data > remove? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc File be/src/exec/scan-node.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@57 PS5, Line 57: Only implemented for scan node subclasses that expose " : "the byte > I'd leave this to the BytesReadSeries description. Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@58 PS5, Line 58: read, e.g. HDFS and HBase."); : PROFILE_DEFINE_COUNTER(R > If this refers to the time series version, move there. If the time series i Not sure about this. @Lars or @Tarmstrong might have deeper understanding on this http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@63 PS5, Line 63: me that > STABLE_LOW? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@66 PS5, Line 66: e series of BytesRead that samples the BytesRead counter."); : PROFILE_DEFINE_TIMER(MaterializeTupleTime, UNSTABLE, "Wall > This also includes time spent in the scanners, ie. applying predicates, mak Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@74 PS5, Line 74: er of scanner threads can fluctuate during > More than one scanner thread can be spun up to process data from a single s Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@85 PS5, Line 85: , TUnit::BYTES, : "Peak memory con > Aggregate wall clock time across all scanner threads [...] Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/exec/scan-node.cc@94 PS5, Line 94: t string ScanNode::SCANNER_THREAD_COUNTERS_PREFIX = "ScannerThreads"; > remove? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@88 PS5, Line 88: ry in exchange no > I'd be hand wavy and add '(across the network)' to help interpret this with Done. Not sure if I understand this correctly. Please help to check this. http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@91 PS5, Line 91: of > clarify: sent 'across the network' (exchange), or just returned further up Done, as above http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@93 PS5, Line 93: R(To > clarify, as above Done, as above. http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@97 PS5, Line 97: OUBLE_VALUE, : "The ratio between TotalScanByteSent and TotalBytesRead, i.e > If both counters' descriptions are correct, I think this is not true. A sca I only have limited knowledge of this part... So I am not sure if there is a misunderstanding or a bug. Maybe @Lars or @Tarmstrong can help to take a look? http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@104 PS5, Line 104: mCompletedBackends, STABLE_HIGH, TUnit::UNIT,"The number of " : "completed backends. Only valid after > remove? Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/runtime/coordinator.cc@105 PS5, Line 105: ll backends have started executing. " : "Does n > Can this be rephrased in less specific terms, e.g. 'after all backends have Done http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: http://gerrit.cloudera.org:8080/#/c/14776/5/be/src/util/runtime-profile-counters.h@146 PS5, Line 146: stabl > nit: stable Done http://gerrit.cloudera.org:8080/#/c/14776/5/www/profile_docs.tmpl File www/profile_docs.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/5/www/profile_docs.tmpl@29 PS5, Line 29: <th>Name</th> : <th>Significance</th> : <th>Unit</th> : <th>Description</th> : </tr> : </thead> : <tbody> : {{#profile_docs}} : <tr> : <td> : {{name}} : </td> : <td> : {{significance}} : </td> : <td> : {{unit}} : </td> : <td> : {{description}} : </td> : </tr> > Looks like stability and significance needs to be merged in the header. Ple Thanks for catching that! I forget to delete in the header. Not sure what do you mean by adding the descriptions? The description of significance level? If so, where do we expose it in the web UI page. http://gerrit.cloudera.org:8080/#/c/14776/5/www/query_profile.tmpl File www/query_profile.tmpl: http://gerrit.cloudera.org:8080/#/c/14776/5/www/query_profile.tmpl@40 PS5, Line 40: iv> : : <pre>{{profile}}</pre> : : <script> > The profile counters should be referenced from outside the context of a sin Moving it under /queries page. https://drive.google.com/open?id=12VsFPdJbXNdkIpXgHB0ez85CPSYlGFif But I feel the description is specific for query profile so it should be located somewhere more related to that? Not sure if it is a good place... Keep it in /query_profile probably makes more sense? https://drive.google.com/open?id=1aAAu8abV4k2yu75qL9zit1u3Q0JJwsaZ Need options from others... -- To view, visit http://gerrit.cloudera.org:8080/14776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idc03faddb27754001290bb6d899840e2cbe7ccb7 Gerrit-Change-Number: 14776 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Balazs Jeszenszky <[email protected]> Gerrit-Reviewer: David Rorke <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jiawei Wang <[email protected]> Gerrit-Reviewer: Jiawei Wang <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Thu, 05 Dec 2019 20:28:30 +0000 Gerrit-HasComments: Yes
