Hi Clayton,

Certainly some of the metrics should be preserved across reloads. For example, counters like *haproxy_server_http_responses_total* should be preserved across reloads (though to an extent Prometheus can handle counter resets correctly with its native support).
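To illustrate the point about resets, here is a minimal Python sketch of reset-aware counter arithmetic. It is a simplification, not Prometheus's actual implementation (the real `rate()` and `increase()` also extrapolate to the window boundaries); the function name and the sample values are hypothetical.

```python
from typing import Sequence


def increase_with_resets(samples: Sequence[float]) -> float:
    """Total increase of a counter across samples, tolerating resets.

    A sample lower than its predecessor is treated as a reset to zero,
    so the new value itself is counted as growth since the reset.
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur >= prev:
            total += cur - prev
        else:
            # Counter dropped: assume it restarted from 0 (e.g. after a reload).
            total += cur
    return total


# Hypothetical scrapes of a responses counter, with a reset after the third sample.
scrapes = [100, 150, 180, 20, 60]
print(increase_with_resets(scrapes))  # prints 140.0
```

This only makes sense for counters. A gauge-like value such as an average latency gets no such treatment and should simply move up and down.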
However, the metric *haproxy_server_http_average_response_latency_milliseconds* also appears to be accumulating when we wouldn't expect it to. (According to the HAProxy stats, I think that's a rolling average over the last 1024 calls -- so it goes up and down, or should.)

Thoughts?

Cheers,
Dani

On Thu, Aug 15, 2019 at 3:59 PM Clayton Coleman <[email protected]> wrote:

> Metrics memory use in the router should be proportional to number of
> services, endpoints, and routes. I doubt it's leaking there and if it were
> it'd be really slow since we don't restart the router monitor process
> ever. Stats should definitely be preserved across reloads, but will not be
> preserved across the pod being restarted.
>
> On Thu, Aug 15, 2019 at 10:30 AM Dan Mace <[email protected]> wrote:
>
>>
>>
>> On Thu, Aug 15, 2019 at 10:03 AM Daniel Comnea <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> Would appreciate if anyone can please confirm that my understanding is
>>> correct w.r.t. the way the router haproxy image [1] is built.
>>> Am I right to assume that the image [1] is built as it's seen without
>>> any other layer being added to include [2]?
>>> Also am I right to say the haproxy metrics [2] is part of the origin
>>> package?
>>>
>>>
>>> A bit of background/context:
>>>
>>> A while back on OKD 3.7 we had to swap the openshift 3.7.2 router image
>>> with 3.10 because we were seeing some problems with the reload and so we
>>> wanted to take the benefit of the native haproxy 1.8 reload feature to stop
>>> affecting the traffic.
>>>
>>> While everything was nice and working okay we've noticed recently that
>>> the haproxy stats do slowly increase and we do wonder if this is an
>>> accumulation or not caused (maybe?) by the reloads. Now I'm aware of a
>>> change made [3] however I suspect that is not part of the 3.10 image hence
>>> my question to double check if my understanding is wrong or not.
>>>
>>>
>>> Cheers,
>>> Dani
>>>
>>> [1] https://github.com/openshift/origin/tree/release-3.10/images/router/haproxy
>>> [2] https://github.com/openshift/origin/tree/release-3.10/pkg/router/metrics
>>> [3] https://github.com/openshift/origin/commit/8f0119bdd9c3b679cdfdf2962143435a95e08eae#diff-58216897083787e1c87c90955aabceff
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>>>
>>
>> I think Clayton (copied) has the history here, but the nature of the
>> metrics commit you referenced is that many of the exposed metrics points
>> are counters which were being reset across reloads. The patch was (I think)
>> to enable counter metrics to correctly accumulate across reloads.
>>
>> As to how the image itself is built, the pkg directly is part of the
>> router controller code included with the image. Not sure if that answers
>> your question.
>>
>> --
>>
>> Dan Mace
>>
>> Principal Software Engineer, OpenShift
>>
>> Red Hat
>>
>> [email protected]
