Re: Build pod already exists

2018-03-02 Thread Lionel Orellana
For anyone who comes across this issue: it was likely related to ntp.
There is a Bugzilla for it:
https://bugzilla.redhat.com/show_bug.cgi?id=1547551
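
A quick sanity check (a sketch, assuming the hosts run chrony or ntpd as
on a stock RHEL 7 node) is to confirm each node's clock is actually
synchronised:

# chrony/systemd-based hosts ("NTP synchronized: yes" is what you want)
timedatectl status | grep -i synchronized
# ntpd-based hosts
ntpstat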

On 29 January 2018 at 06:12, Ben Parees  wrote:

>
>
> On Sun, Jan 28, 2018 at 6:05 AM, Lionel Orellana 
> wrote:
>
>> Thanks Ben.
>>
>> I can reproduce it with the nodejs and wildfly build configs directly off
>> the catalog.
>>
>
> And you haven't previously created and then deleted those buildconfigs
> within that project?  This is the first time you're creating the
> buildconfig in the project?
>
>
>>
>> My first thought was that the network issue was causing the build pod to
>> be terminated and a second one to be created before the first one died.
>> Is that a possibility?
>>
>
> I don't think so: once we create a pod for a given build we consider it
> done and shouldn't create another one, but the logs should shed some
> light on that.
>
>
>> I will get the logs tomorrow.
>>
>> On Sun, 28 Jan 2018 at 1:51 pm, Ben Parees  wrote:
>>
>>> On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana 
>>> wrote:
>>>
 Hi,

 I'm seeing a random error when running builds. Some builds fail very
 quickly with "build pod already exists". This is happening with a number
 of build configs and seems to occur when builds from different build
 configs are running at the same time.

>>>
>>> This can happen if you "reset" the build sequence number in your
>>> buildconfig (buildconfig.status.lastVersion).  Are you recreating
>>> buildconfigs that may have previously existed and run builds, or editing
>>> the buildconfig in a way that might be resetting the lastVersion field to
>>> an older value?
>>>
>>> Are you able to confirm whether or not a pod does exist for that build?
>>> The build pod name will be of the form "buildconfigname-buildnumber-build".
>>>
>>> If you're able to recreate this consistently (assuming you're sure it's
>>> not a case of having recreated an old buildconfig or reset the
>>> buildconfig lastVersion sequence value), enabling level 4 logging on your
>>> master, reproducing the problem, and then providing us with the logs
>>> would help trace what is happening.
>>>
>>>
>>>

 There is a possibly related error in one of the nodes:

 Jan 28 07:26:39  atomic-openshift-node[10121]: W0128 07:26:39.735522
 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
 hook for pod "nodejs-26-build_bimorl": Unexpected command output
 nsenter: cannot open : No such file or directory
 Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1

>>>
>>> This shouldn't be related to issues with pod creation (since this error
>>> wouldn't occur until after the pod is created and attempting to run on a
>>> node), but it's definitely something you'll want to sort out.  I've CCed
>>> our networking lead into this thread.
>>>
>>>
>>>

 -bash-4.2$ oc version
 oc v3.6.0+c4dd4cf
 kubernetes v1.6.1+5115d708d7

 Any ideas?

 Thanks


 Lionel.



>>>
>>>
>>> --
>>> Ben Parees | OpenShift
>>>
>>>
>
>
> --
> Ben Parees | OpenShift
>
>


Re: Build pod already exists

2018-01-28 Thread Ben Parees
On Sun, Jan 28, 2018 at 6:05 AM, Lionel Orellana  wrote:

> Thanks Ben.
>
> I can reproduce it with the nodejs and wildfly build configs directly off
> the catalog.
>

And you haven't previously created and then deleted those buildconfigs
within that project?  This is the first time you're creating the
buildconfig in the project?


>
> My first thought was that the network issue was causing the build pod to
> be terminated and a second one to be created before the first one died.
> Is that a possibility?
>

I don't think so: once we create a pod for a given build we consider it
done and shouldn't create another one, but the logs should shed some
light on that.


> I will get the logs tomorrow.
>
> On Sun, 28 Jan 2018 at 1:51 pm, Ben Parees  wrote:
>
>> On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana 
>> wrote:
>>
>>> Hi,
>>>
>>> I'm seeing a random error when running builds. Some builds fail very
>>> quickly with "build pod already exists". This is happening with a number
>>> of build configs and seems to occur when builds from different build
>>> configs are running at the same time.
>>>
>>
>> This can happen if you "reset" the build sequence number in your
>> buildconfig (buildconfig.status.lastVersion).  Are you recreating
>> buildconfigs that may have previously existed and run builds, or editing
>> the buildconfig in a way that might be resetting the lastVersion field to
>> an older value?
>>
>> Are you able to confirm whether or not a pod does exist for that build?
>> The build pod name will be of the form "buildconfigname-buildnumber-build".
>>
>> If you're able to recreate this consistently (assuming you're sure it's
>> not a case of having recreated an old buildconfig or reset the
>> buildconfig lastVersion sequence value), enabling level 4 logging on your
>> master, reproducing the problem, and then providing us with the logs
>> would help trace what is happening.
>>
>>
>>
>>>
>>> There is a possibly related error in one of the nodes:
>>>
>>> Jan 28 07:26:39  atomic-openshift-node[10121]: W0128 07:26:39.735522
>>> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
>>> hook for pod "nodejs-26-build_bimorl": Unexpected command output
>>> nsenter: cannot open : No such file or directory
>>> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1
>>>
>>
>> This shouldn't be related to issues with pod creation (since this error
>> wouldn't occur until after the pod is created and attempting to run on a
>> node), but it's definitely something you'll want to sort out.  I've CCed
>> our networking lead into this thread.
>>
>>
>>
>>>
>>> -bash-4.2$ oc version
>>> oc v3.6.0+c4dd4cf
>>> kubernetes v1.6.1+5115d708d7
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>>
>>> Lionel.
>>>
>>>
>>>
>>
>>
>> --
>> Ben Parees | OpenShift
>>
>>


-- 
Ben Parees | OpenShift


Re: Build pod already exists

2018-01-28 Thread Lionel Orellana
Thanks Ben.

I can reproduce it with the nodejs and wildfly build configs directly off
the catalog.

My first thought was that the network issue was causing the build pod to
be terminated and a second one to be created before the first one died.
Is that a possibility?

I will get the logs tomorrow.
On Sun, 28 Jan 2018 at 1:51 pm, Ben Parees  wrote:

> On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana 
> wrote:
>
>> Hi,
>>
>> I'm seeing a random error when running builds. Some builds fail very
>> quickly with "build pod already exists". This is happening with a number
>> of build configs and seems to occur when builds from different build
>> configs are running at the same time.
>>
>
> This can happen if you "reset" the build sequence number in your
> buildconfig (buildconfig.status.lastVersion).  Are you recreating
> buildconfigs that may have previously existed and run builds, or editing
> the buildconfig in a way that might be resetting the lastVersion field to
> an older value?
>
> Are you able to confirm whether or not a pod does exist for that build?
> The build pod name will be of the form "buildconfigname-buildnumber-build".
>
> If you're able to recreate this consistently (assuming you're sure it's
> not a case of having recreated an old buildconfig or reset the
> buildconfig lastVersion sequence value), enabling level 4 logging on your
> master, reproducing the problem, and then providing us with the logs
> would help trace what is happening.
>
>
>
>>
>> There is a possibly related error in one of the nodes:
>>
>> Jan 28 07:26:39  atomic-openshift-node[10121]: W0128 07:26:39.735522
>> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
>> hook for pod "nodejs-26-build_bimorl": Unexpected command output
>> nsenter: cannot open : No such file or directory
>> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1
>>
>
> This shouldn't be related to issues with pod creation (since this error
> wouldn't occur until after the pod is created and attempting to run on a
> node), but it's definitely something you'll want to sort out.  I've CCed
> our networking lead into this thread.
>
>
>
>>
>> -bash-4.2$ oc version
>> oc v3.6.0+c4dd4cf
>> kubernetes v1.6.1+5115d708d7
>>
>> Any ideas?
>>
>> Thanks
>>
>>
>> Lionel.
>>
>>
>>
>
>
> --
> Ben Parees | OpenShift
>
>


Re: Build pod already exists

2018-01-27 Thread Ben Parees
On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana  wrote:

> Hi,
>
> I'm seeing a random error when running builds. Some builds fail very
> quickly with "build pod already exists". This is happening with a number
> of build configs and seems to occur when builds from different build
> configs are running at the same time.
>

This can happen if you "reset" the build sequence number in your
buildconfig (buildconfig.status.lastVersion).  Are you recreating
buildconfigs that may have previously existed and run builds, or editing
the buildconfig in a way that might be resetting the lastVersion field to
an older value?
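
A sketch of how to inspect the current sequence value (using the nodejs
buildconfig from the log below as an example; the API field is
status.lastVersion):

# print the build sequence counter; the next build is numbered lastVersion+1
oc get bc nodejs -o jsonpath='{.status.lastVersion}{"\n"}'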

Are you able to confirm whether or not a pod does exist for that build?
The build pod name will be of the form "buildconfigname-buildnumber-build".

If you're able to recreate this consistently (assuming you're sure it's
not a case of having recreated an old buildconfig or reset the buildconfig
lastVersion sequence value), enabling level 4 logging on your master,
reproducing the problem, and then providing us with the logs would help
trace what is happening.
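
A sketch for an rpm-based 3.6 install (assuming a single combined master
service; HA masters split this across the -api and -controllers services):

# in /etc/sysconfig/atomic-openshift-master, raise the verbosity:
OPTIONS=--loglevel=4
# then restart the master to pick up the change:
systemctl restart atomic-openshift-master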



>
> There is a possibly related error in one of the nodes:
>
> Jan 28 07:26:39  atomic-openshift-node[10121]: W0128 07:26:39.735522
> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
> hook for pod "nodejs-26-build_bimorl": Unexpected command output
> nsenter: cannot open : No such file or directory
> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1
>

This shouldn't be related to issues with pod creation (since this error
wouldn't occur until after the pod is created and attempting to run on a
node), but it's definitely something you'll want to sort out.  I've CCed
our networking lead into this thread.



>
> -bash-4.2$ oc version
> oc v3.6.0+c4dd4cf
> kubernetes v1.6.1+5115d708d7
>
> Any ideas?
>
> Thanks
>
>
> Lionel.
>
>
>


-- 
Ben Parees | OpenShift
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users