Thanks! Hopefully we don't hit this too much until 1.5.0 is released.

On Fri, 21 Apr 2017 at 01:26 Patrick Tescher <patr...@outtherelabs.com> wrote:
> We upgraded to 1.5.0 and that error went away.
>
> --
> Patrick Tescher
>
> On Apr 19, 2017, at 10:59 PM, Andrew Lau <and...@andrewklau.com> wrote:
>
> thin_ls has been happening for quite some time:
> https://github.com/openshift/origin/issues/10940
>
> On Thu, 20 Apr 2017 at 15:55 Tero Ahonen <taho...@redhat.com> wrote:
>
>> It seems that error is related to docker storage on that vm
>>
>> .t
>>
>> Sent from my iPhone
>>
>> On 20 Apr 2017, at 8.53, Andrew Lau <and...@andrewklau.com> wrote:
>>
>> Unfortunately I did not. I dumped the logs and just removed the node in
>> order to quickly restore the current containers on another node.
>>
>> At the exact time it failed I saw a lot of the following:
>>
>> ===
>> thin_pool_watcher.go:72] encountered error refreshing thin pool watcher:
>> error performing thin_ls on metadata device
>> /dev/mapper/docker_vg-docker--pool_tmeta: Error running command `thin_ls
>> --no-headers -m -o DEV,EXCLUSIVE_BYTES
>> /dev/mapper/docker_vg-docker--pool_tmeta`: exit status 127
>>
>> failed (failure): rpc error: code = 2 desc = shim error: context deadline
>> exceeded#015
>>
>> Error running exec in container: rpc error: code = 2 desc = shim error:
>> context deadline exceeded
>> ===
>>
>> Seems to match https://bugzilla.redhat.com/show_bug.cgi?id=1427212
>>
>> On Thu, 20 Apr 2017 at 15:41 Tero Ahonen <taho...@redhat.com> wrote:
>>
>>> Hi
>>>
>>> Did you try to ssh to that node and execute sudo docker run for some
>>> container?
>>>
>>> .t
>>>
>>> Sent from my iPhone
>>>
>>> > On 20 Apr 2017, at 8.18, Andrew Lau <and...@andrewklau.com> wrote:
>>> >
>>> > I'm trying to debug a weird scenario where a node has had every pod
>>> > crash with the error:
>>> > "rpc error: code = 2 desc = shim error: context deadline exceeded"
>>> >
>>> > The pods stayed in the state Ready 0/1.
>>> > The docker daemon was responding, and the kubelet and all its services
>>> > were running. The node was reporting an OK status.
>>> >
>>> > No resource limits were hit, with CPU almost idle and memory at 25%
>>> > utilisation.
>>> >
>>> > _______________________________________________
>>> > users mailing list
>>> > us...@lists.openshift.redhat.com
>>> > http://lists.openshift.redhat.com/openshiftmm/listinfo/users
_______________________________________________
dev mailing list
dev@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/dev
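For anyone hitting the thin_ls error quoted above: `exit status 127` is the shell's "command not found" status, so the likely root cause is that the `thin_ls` binary is simply missing from the node. (On RHEL/CentOS I believe it ships in the device-mapper-persistent-data package, but that package name is my assumption, not something stated in this thread; check your distro.) A minimal sketch demonstrating the status code and a presence check:

```shell
# Exit status 127 means the shell could not find the command to run.
# Reproduce it with a deliberately nonexistent command name:
sh -c 'definitely_missing_command_xyz' 2>/dev/null
echo "exit status: $?"
# prints: exit status: 127

# To check whether thin_ls itself is available on a node
# (package name below is an assumption; verify for your distro):
command -v thin_ls >/dev/null 2>&1 && echo "thin_ls present" \
  || echo "thin_ls missing - try: yum install device-mapper-persistent-data"
```

If `thin_ls` is missing, cAdvisor's thin pool watcher will log the refresh error on every cycle, which matches the repeated log lines Andrew saw.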