onebox-li commented on PR #2078: URL: https://github.com/apache/incubator-celeborn/pull/2078#issuecomment-1801781716
> > > IMO the previous `gauge` type of `metrics_SlotsAllocated_Value` is more meaningful, since it shows the slots allocated in the last hour. Maybe we can directly use `WorkerInfo#usedSlots`.
> >
> > Thanks @waitinfuture. IMO using a counter here gives us the original data for SlotsAllocated. If statistics similar to the previous one-hour value are needed, we can use `increase(metrics_SlotsAllocated_Count[1h])` to get them. Compared with the previous approach, calculating with a Prometheus function gives more accurate results, and the time window size can be chosen according to user needs. Should the panel here be changed to follow the previous behavior? BTW I think `WorkerInfo#usedSlots` is a little different from SlotsAllocated; we may add a new panel for it if necessary.
>
> Thanks for the explanation, sounds good to me, going to merge to main(v0.4.0)/branch-0.3(v0.3.2). Curious how you will use the new metrics, `increase(metrics_SlotsAllocated_Count[1h])`? I'm OK with adding `WorkerInfo#usedSlots`, though I think we'd better use a name like `ActiveSlots`.

Pretty much that `increase` expression, though I would use a smaller window, such as 10m. Should I change the panel to a similar expression, like `increase(metrics_SlotsAllocated_Count[1h])`, which matches the previous behavior?
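
For illustration, here is a rough Python sketch of what an `increase(...[window])` style query derives from raw counter samples. It is a simplification under stated assumptions, not Prometheus's actual implementation: Prometheus also extrapolates to the window boundaries, which this sketch skips. The function name `simple_increase` and the sample data are hypothetical.

```python
from typing import List, Tuple

# (unix_seconds, counter_value); a hypothetical scrape series for a counter
# such as metrics_SlotsAllocated_Count.
Sample = Tuple[float, float]


def simple_increase(samples: List[Sample], now: float, window_s: float) -> float:
    """Sum the counter growth between consecutive samples inside the window.

    A drop in value is treated as a counter reset (for example a restart),
    so the post-reset value is counted as new growth. Unlike Prometheus,
    no extrapolation to the window edges is done.
    """
    in_window = [v for t, v in samples if now - window_s <= t <= now]
    total = 0.0
    for prev, cur in zip(in_window, in_window[1:]):
        total += (cur - prev) if cur >= prev else cur
    return total


# Hypothetical samples scraped every 10 minutes, with one reset at t=3000.
samples = [
    (0, 0), (600, 10), (1200, 25), (1800, 25),
    (2400, 40), (3000, 5), (3600, 20), (4200, 30),
]

one_hour = simple_increase(samples, now=4200, window_s=3600)
ten_minutes = simple_increase(samples, now=4200, window_s=600)
```

The same counter series serves both the one-hour and the ten-minute view, which is the flexibility argued for above: the window is picked at query time rather than fixed when the metric is recorded.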
