Are you seeing high IOPS on the impacted nodes?

If so, it could be related to the following:
https://github.com/openshift/origin/pull/12822

In that case, you can try removing thin_ls from your host so that cAdvisor
does not use it to collect per-container devicemapper usage stats, which
has been shown to cause issues similar to this.
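
Before removing anything, it may help to confirm whether thin_ls is visible
on a given node at all. The following is only a minimal sketch: the helper
name find_thin_ls is made up for illustration, and it assumes the kubelet
resolves thin_ls through the same PATH as the shell you run the script from,
which may not hold on your hosts.

```python
import shutil
from typing import Optional


def find_thin_ls(path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the thin_ls binary, or None if it is absent.

    `path` is an optional search path string (same format as $PATH);
    when omitted, the current process's PATH is used.
    """
    return shutil.which("thin_ls", path=path)


if __name__ == "__main__":
    location = find_thin_ls()
    if location:
        # Presence alone does not prove cAdvisor is calling it.
        print(f"thin_ls found at {location}")
    else:
        print("thin_ls not found on PATH")
```

If it reports a path on the nodes with high IOPS and nothing on the healthy
ones, that would be consistent with the issue described in the pull request
above, though it is not proof on its own.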

Thanks,

On Wed, Mar 22, 2017 at 9:42 PM Mateus Caruccio <
[email protected]> wrote:

> At
> https://paste.fedoraproject.org/paste/FYFahXSMMQOVUWHkXcrer15M1UNdIGYhyRLivL9gydE=
> you can find a log grep from heapster with --sink=log set.
>
> Looking for pod "portal-107-rg2ia", one can see it is not being written to
> the sink every scraping period (only 3 of 9 during this snippet).
>
>
>
> --
> Mateus Caruccio / Master of Puppets
> GetupCloud.com
> We make the infrastructure invisible
>
> 2017-03-22 19:43 GMT-03:00 Derek Carr <[email protected]>:
>
> +Solly
>
> Anything you can assist with here?
>
> Thanks,
>
> On Wed, Mar 22, 2017 at 6:27 PM Mateus Caruccio <
> [email protected]> wrote:
>
> Hi.
>
> Heapster is experiencing failures for some pods of the cluster, which in
> turn causes HPA to malfunction.
>
> From project events I can see:
>
> 2017-03-22T22:13:29Z   2017-03-22T21:32:59Z   32        portal
>  HorizontalPodAutoscaler               Warning   FailedGetMetrics
> {horizontal-pod-autoscaler }   failed to get CPU consumption and request:
> metrics obtained for 2/4 of pods
> 2017-03-22T22:13:29Z   2017-03-22T21:32:59Z   32        portal
>  HorizontalPodAutoscaler             Warning   FailedComputeReplicas
> {horizontal-pod-autoscaler }   failed to get CPU utilization: failed to get
> CPU consumption and request: metrics obtained for 2/4 of pods
>
>
> Heapster logs say some pods have no metrics, while other pods from the
> same project do:
>
> I0322 22:10:29.104727       1 handlers.go:242] No metrics for container
> wordpress in pod kondzilla/portal-107-rg2ia
> I0322 22:10:29.104746       1 handlers.go:178] No metrics for pod
> kondzilla/portal-107-rg2ia
> ...
> I0322 22:12:21.780763       1 pod_based_enricher.go:141] Container
> namespace:kondzilla/pod:portal-107-rg2ia/container:wordpress not found,
> creating a stub
>
>
> Hitting the kubelet's /stats/container/ endpoint does return valid stats,
> as expected.
>
>
> I'm running:
>
> openshift v1.3.1
> kubernetes v1.3.0+52492b4
> etcd 2.3.0+git
>
> openshift/origin-metrics-cassandra:v1.3.1
> openshift/origin-metrics-hawkular-metrics:v1.3.1
> openshift/origin-metrics-heapster:v1.3.2 (v1.3.1 has the same effect)
>
>
> Thanks,
>
> --
> Mateus Caruccio / Master of Puppets
> GetupCloud.com
> We make the infrastructure invisible
> _______________________________________________
> dev mailing list
> [email protected]
> http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
>
>
>