Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3823: Add timer to measure Parquet footer reads ......................................................................
Patch Set 3: (18 comments) http://gerrit.cloudera.org:8080/#/c/4371/2//COMMIT_MSG Commit Message: PS2, Line 15: tr > remove a Done PS2, Line 18: > remove, this is implied if it is a runtime profile counter Done PS2, Line 21: > serialize ... to thrift Done http://gerrit.cloudera.org:8080/#/c/4371/2/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: PS2, Line 214: { > I'd be nice to have a simpler name rather than enumerate the stats this kee Done PS2, Line 223: current_sum_(0) { : } > space between these Done PS2, Line 226: 64_t min_value() const { return current_min_; } : int64_t max_value( > initialize min and max as well Done PS2, Line 230: /// seen so far. : void UpdateCounter(int64_t new_value); : > 1 line Done PS2, Line 234: /// Set() and Add() should not be used. : virtual void Set(double value) { DCHECK(false); } : v > 1 line Done PS2, Line 238: : /// This functio > this can be one sentence Done PS2, Line 240: t TSummaryStatsCount > This is a bit confusing since it starts to look like the AveragedCounter wh Yes that's right. The only reason I kept it as a Counter instead of a value was to compare if the units of both the values were equal. But if you foresee it being used even without a counter as a source, it makes sense to just take a value here. Changed it. PS2, Line 244: private: : /// This keeps track of the total number of values seen so far. : int32_t total_num_values_; : : /// Current sums of values seen so far and min/max values seen so far. : int64_t current_min_; : int64_t current_max_; : int64_t current_sum_; : : SpinLock counter_lock_; : }; > remove extra space, each fn can be 1 line. I know this is what happens abov Done http://gerrit.cloudera.org:8080/#/c/4371/2/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: PS2, Line 303: if (it == summary_stats_map_.end()) { > if this counter already existed, shouldn't the values be updated? Yes, I initially thought that counters will always have the same value when they arrive through reports, however, that needn't be true for long running queries. I've added a method Overwrite() to take care of this. Line 661: } > can you add a comment with how this looks printed out? Done PS2, Line 794: : { > vector can be constructed with a size parameter to do this in 1 step. Done Line 1008: ++total_num_values_; > why is this safe? couldn't current_sum_ become 0 at an arbtrary time (if a Done http://gerrit.cloudera.org:8080/#/c/4371/2/common/thrift/RuntimeProfile.thrift File common/thrift/RuntimeProfile.thrift: PS2, Line 56: TSummaryStatsCounter { > TSummaryStatsCounter Done PS2, Line 59: sum > sum? Done http://gerrit.cloudera.org:8080/#/c/4371/2/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: PS2, Line 343: # The chance that the times taken by different scan ranges to process the footer, : # are equal, is very unlikely. However, it could happen which could make the min and : # max time to process a footer equal. Assert that at a time, no more than 1 node has : # equal min and max time. : assert num_equal < 2 > this seems brittle... The reason I didn't do that is because the times from the runtime profile are listed as such: "330.23ms, 1s23ms, 22ns", etc. There's no good way to compare them without passing them through a helper function to standardize their values before comparison. However, I thought that adding that helper function might clutter this file. If you still think it's worth it, I can go ahead and do it. -- To view, visit http://gerrit.cloudera.org:8080/4371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-HasComments: Yes
