We upgraded to 1.5.0 and that error went away. 
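
For anyone who wants to confirm the same root cause before upgrading: exit
status 127 usually just means the thin_ls binary that the thin pool watcher
tries to run isn't present on the node. A quick check (the package name is my
assumption, based on a RHEL/CentOS host) would be:

===
# is thin_ls available on the node at all?
which thin_ls || echo "thin_ls not found"

# on RHEL/CentOS it should come from a recent device-mapper-persistent-data
rpm -q device-mapper-persistent-data
===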

--
Patrick Tescher

> On Apr 19, 2017, at 10:59 PM, Andrew Lau <[email protected]> wrote:
> 
> The thin_ls error has been happening for quite some time: 
> https://github.com/openshift/origin/issues/10940
> 
>> On Thu, 20 Apr 2017 at 15:55 Tero Ahonen <[email protected]> wrote:
>> It seems that error is related to the Docker storage on that VM.
>> 
>> .t
>> 
>> Sent from my iPhone
>> 
>>> On 20 Apr 2017, at 8.53, Andrew Lau <[email protected]> wrote:
>>> 
>>> Unfortunately I did not. I dumped the logs and just removed the node in 
>>> order to quickly restore the current containers on another node.
>>> 
>>> At the exact time it failed I saw a lot of the following:
>>> 
>>> ===
>>> thin_pool_watcher.go:72] encountered error refreshing thin pool watcher:
>>> error performing thin_ls on metadata device
>>> /dev/mapper/docker_vg-docker--pool_tmeta: Error running command
>>> `thin_ls --no-headers -m -o DEV,EXCLUSIVE_BYTES
>>> /dev/mapper/docker_vg-docker--pool_tmeta`: exit status 127
>>> 
>>> failed (failure): rpc error: code = 2 desc = shim error: context deadline
>>> exceeded
>>> 
>>> Error running exec in container: rpc error: code = 2 desc = shim error: 
>>> context deadline exceeded
>>> ===
>>> 
>>> Seems to match https://bugzilla.redhat.com/show_bug.cgi?id=1427212
>>> 
>>> 
>>>> On Thu, 20 Apr 2017 at 15:41 Tero Ahonen <[email protected]> wrote:
>>>> Hi
>>>> 
>>>> Did you try to SSH to that node and run a container with sudo docker run?
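>>>> 
>>>> For example (busybox is just a placeholder image, anything already pulled
>>>> on that node would do):
>>>> 
>>>> sudo docker run --rm busybox echo ok
>>>> sudo docker exec <id-of-a-running-container> true
>>>> 
>>>> If those also fail with the context deadline / shim error, the problem is
>>>> in the docker daemon on that node rather than in the kubelet.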
>>>> 
>>>> .t
>>>> 
>>>> Sent from my iPhone
>>>> 
>>>> > On 20 Apr 2017, at 8.18, Andrew Lau <[email protected]> wrote:
>>>> >
>>>> > I'm trying to debug a weird scenario where a node has had every pod 
>>>> > crash with the error:
>>>> > "rpc error: code = 2 desc = shim error: context deadline exceeded"
>>>> >
>>>> > The pods stayed in the Ready 0/1 state.
>>>> > The docker daemon was responding, and the kubelet and all its services
>>>> > were running. The node was reporting an OK status.
>>>> >
>>>> > No resource limits were hit: CPU was almost idle and memory was at 25%
>>>> > utilisation.
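>>>> >
>>>> > (For reference, the Ready 0/1 state comes from oc get pods and the node
>>>> > status from oc get nodes; on one of the affected pods
>>>> >
>>>> > oc describe pod <pod-name>
>>>> >
>>>> > should show the failing exec/readiness checks as events.)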
>>>> >