Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3823: Add timer to measure Parquet footer reads ......................................................................
Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/4371/3/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: PS3, Line 430: avg_timer_ let's update the variable names too. e.g. something like process_footer_timer_stats_ or such http://gerrit.cloudera.org:8080/#/c/4371/2/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: PS2, Line 240: t TSummaryStatsCount > Yes that's right. The only reason I kept it as a Counter instead of a value Thanks I think it's more useful to remove the restriction. Even in hdfs-parquet-scanner where you're using this, there is a Counter constructed just for it's timer, it's not included in a profile. That code might as well have just used a timing function explicitly. (I'm not saying it's worth changing that code, just that there's no dependency on this taking Counters.) Also, I don't think we get much from checking the Counter's unit. We don't really set counter units carefully in many cases (e.g. we use UNIT all over the place instead of anything useful), and ultimately it's up to the user of this interface to make sure what they're providing here makes sense. http://gerrit.cloudera.org:8080/#/c/4371/3/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: PS3, Line 240: Overwrite "SetStats()" to be more consistent with some of the other fns, e.g. SetSamples(). If I saw Overwrite() I'd wonder why there wasn't just a copy constructor or something like that. http://gerrit.cloudera.org:8080/#/c/4371/2/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: Line 661: } > Done thanks! http://gerrit.cloudera.org:8080/#/c/4371/3/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: PS3, Line 668: - I don't see this dash in the example above? Is it intentional? I see the event markers under an event sequence make a list, though those are items of a specific timeline. I don't see why that'd be the case here. Also value() is an average, right? I think it'd be good to have that specified. It'd be nice to include the number of samples as well. Also, avg isn't really all that different from the other stats, so maybe just print them all the same way, e.g.: " <name>: avg: <avgval>, min: <minval>, max: <maxval>, num_samples: <numval>" -- To view, visit http://gerrit.cloudera.org:8080/4371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-HasComments: Yes
