> On Wed, Sep 7, 2016 at 1:21 PM, Andy Grimm <[email protected]> wrote:
>
>> On Wed, Sep 7, 2016 at 11:22 PM, Diego Castro <[email protected]> wrote:
>>
>>> Hello, list.
>>> We have been running Origin since last November and I'd like to share
>>> some experiences, pains and thoughts.
>>>
>>> Our Origin cluster has about 25 servers including masters, nodes and
>>> routers. We have roughly 500 applications exposing services and a bunch
>>> of HPAs firing up containers all the time.
>>>
>>> 1) Resource consumption: during the day I noticed an increase in memory
>>> consumption due to multiple reloads; a lot of processes keep running
>>> until their connections finish or they are OOM killed. Another issue
>>> with restarts is that, due to the TCP SYN DROP in iptables, we are
>>> facing some high latencies. What can we do to reduce restart overhead?
>>>
>>
>> You seem to have several questions intertwined here, and I am by no means
>> an expert on this, but on the "lots of processes keep running" topic, you
>> may be hitting https://bugzilla.redhat.com/show_bug.cgi?id=1364870
>> (though this manifests as more of a CPU consumption issue than a memory
>> issue). In short, what we've seen is cases where haproxy connections are
>> "orphaned", so the old processes never exit -- they continuously think they
>> have one or two "jobs" left, but they never actually handle them. I think
>> this is fixed in the latest 1.5.x release of haproxy, but I have not had a
>> chance to test yet.
>>
>
> In 3.3 there are some more knobs you can set to limit the length of time
> that an haproxy will stay around after a restart; you may wish to try
> playing with that... but the underlying bug is still there in 3.3.
>
Understood, I'll give it a try.

>>> 2) Metrics: It would be nice to pull some metrics from the routers,
>>> something like general network I/O and per-endpoint traffic. I found a
>>> Prometheus exporter, but due to process restarts the endpoint states
>>> are cleared. HAProxy 1.6 has a fix for that
>>> (http://blog.haproxy.com/2015/10/14/whats-new-in-haproxy-1-6/). Do we
>>> have plans to upgrade to 1.6? What kind of metrics do we have available
>>> today?
>>>
>
> The lack of metrics is a problem, and there's no great answer to your
> question.
>
> There are no plans to go to 1.6 at the moment, but we do need to solve the
> stats problem, and we need to solve the reload problem, so we may end up
> moving. But we are investigating upstream ingress and trying to get
> support for that into OpenShift so we can migrate and deprecate the router.

Nice, I'd like to track this work; can you point me in the right direction?

> -ben
>
>>> ---
>>> Diego Castro / The CloudFather
>>> GetupCloud.com - Eliminamos a Gravidade
_______________________________________________
users mailing list
[email protected]
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
