Thanks Caizhi for your comments. I'll think about introducing Histogram metric type and making the metrics description more clear.
Best, Guojun On Thu, May 25, 2023 at 7:13 PM Caizhi Weng <[email protected]> wrote: > I also forgot to mention that Flink metrics [1] is a good reference. It > clearly describes what Gauge, Counter and Histogram should do and has a > list of metrics with clear definition. For example they have a metric named > "lastCheckpointDuration" (which is a Gauge) and not just > "checkpointDuration". > > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/ > > Caizhi Weng <[email protected]> 于2023年5月25日周四 19:10写道: > > > Hi Guojun! > > > > Thanks for raising this discussion. After reading the document, I'd like > > to raise a few opinions. > > > > > > > -------------------------------------------------------------------------------- > > > > 1. About the overall types of metrics. > > > > You suggested two types of metrics, namely Gauge and Counter. A gauge > > indicates the value of a metric at that instance, for example "the > duration > > of the last compaction". A counter indicates an accumulated value of a > > metric overtime, for example "the total number of records written". > > > > However there are times when we want to study the history and see the > > trend. For example if we notice that the last compaction is slower than > > normal, we might want to know "the average duration of the last few > > compactions" to make sure that this slow down is just by chance or is a > > common case. Both gauge and counter cannot meet this need. What we need > is > > another type of metrics called Histogram. This metrics type should be > able > > to record the last few values of a metric and calculate their statistics > > such as average, min/max and more. > > > > > > > -------------------------------------------------------------------------------- > > > > 2. About the types of some specific metrics. > > > > It seems to me that in your proposal, you cannot clearly tell the > > difference between Gauges and Counters, and sometimes it is not clear > that > > what a specific metric means. Let me give you some example. > > > > Metric Name: commitDuration > > > > Description: Commit > > > > Type: Gauge > > > > Update at: Timer starts before commit starting, update commit duration > >> after commit finished > > > > > > At first glance, "Commit" is not a valid description. From the metric > name > > I guess you'd like to record the length of duration for committing > > snapshots. As its type is a Gauge, I suppose you want to record the > *last* > > commit. Please make things clear what exactly you want to record. And as > I > > said in the last section, it might be better to introduce a Histogram > type > > so that we can study the average duration of the last few commits. > > > > Metric Name: numTableFilesAdded > > > > Description: Number of added table files in this commit > > > > Type: Counter > > > > Update at: Collecting changes from committables > > > > > > By "this commit" I suppose you mean *last* commit. If you the type should > > be a Gauge, not a Counter. By using Counter you may want to record the > *total > > number* of files added during all commits. > > > > Metric Name: numFilesCompactedBefore > > > > Description: Number of deleted files in compaction > > > > Type: Counter > > > > Update at: Triggering compaction > > > > > > It is also unclear to me what you're going to record. Are you going the > > record the number of deleted files during the *last* compaction? Or you'd > > like to record the *total number* of deleted files during all compaction? > > > > I'm afraid most of your proposed metrics seem unclear to me. Please > > rewrite the proposal to make them clear. > > > > Guojun Li <[email protected]> 于2023年5月25日周四 16:59写道: > > > >> Hi, Paimon Devs, > >> I’d like to start a discussion about PIP-3[1]. > >> In this PIP, I'm talking about how to support metrics for paimon, > >> what > >> metrics paimon needs. Look forward to your question and suggestions. > >> > >> Best, > >> Guojun > >> > >> [1] > >> > >> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-3%3A+Introduce+metrics+for+Paimon > >> > > >
