Re: Build pod already exists
For anyone who comes across this issue here: it was likely related to ntp. There is a Bugzilla for it: https://bugzilla.redhat.com/show_bug.cgi?id=1547551

On 29 January 2018 at 06:12, Ben Parees wrote:
>
> On Sun, Jan 28, 2018 at 6:05 AM, Lionel Orellana wrote:
>
>> Thanks Ben.
>>
>> I can reproduce it with the nodejs and wildfly build configs directly off
>> the catalog.
>
> and you haven't created those buildconfigs previously and then deleted
> them, within that project? This is the first time you're creating the
> buildconfig in the project?
>
>> My first thought was that the network issue was causing the build pod to
>> be terminated and a second one being created before the first one dies.
>> Is that a possibility?
>
> I don't think so; once we create a pod for a given build we consider it
> done and shouldn't be creating another one, but the logs should shed some
> light on that.
>
>> I will get the logs tomorrow.
>>
>> On Sun, 28 Jan 2018 at 1:51 pm, Ben Parees wrote:
>>
>>> On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana wrote:
>>>
>>> Hi,
>>>
>>> I'm seeing a random error when running builds. Some builds fail very
>>> quickly with "build pod already exists". This is happening with a number
>>> of build configs and seems to occur when more than one build from
>>> different build configs is running at the same time.
>>>
>>> This can happen if you "reset" the build sequence number in your
>>> buildconfig (buildconfig.status.lastVersion). Are you recreating
>>> buildconfigs that may have previously existed and run builds, or editing
>>> the buildconfig in a way that might be resetting the lastVersion field to
>>> an older value?
>>>
>>> Are you able to confirm whether or not a pod does exist for that build?
>>> The build pod name will be of the form "buildconfigname-buildnumber-build".
>>>
>>> If you're able to recreate this consistently (assuming you're sure it's
>>> not a case of having recreated an old buildconfig or reset the
>>> buildconfig lastVersion sequence value), enabling level 4 logging in your
>>> master, reproducing it, and then providing us with the logs would be
>>> helpful to trace what is happening.
>>>
>>> There is a possibly related error on one of the nodes:
>>>
>>> Jan 28 07:26:39 atomic-openshift-node[10121]: W0128 07:26:39.735522
>>> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
>>> hook for pod "nodejs-26-build_bimorl": Unexpected command output
>>> nsenter: cannot open : No such file or directory
>>> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1
>>>
>>> This shouldn't be related to issues with pod creation (since this error
>>> wouldn't occur until after the pod is created and attempting to run on a
>>> node), but it's definitely something you'll want to sort out. I've CCed
>>> our networking lead into this thread.
>>>
>>> -bash-4.2$ oc version
>>> oc v3.6.0+c4dd4cf
>>> kubernetes v1.6.1+5115d708d7
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>> Lionel.
>>>
>>> ___
>>> users mailing list
>>> users@lists.openshift.redhat.com
>>> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>>>
>>> --
>>> Ben Parees | OpenShift
>
> --
> Ben Parees | OpenShift
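Since the root cause here turned out to be clock skew, a quick way to check a node's time sync is worth having on hand. The sketch below assumes chrony is the time daemon (the default on recent RHEL/Atomic hosts) and uses a hypothetical 0.5 s tolerance; with ntpd you would use `ntpstat` or `ntpq -p` instead.

```shell
#!/bin/sh
# Sketch: flag a node whose clock offset exceeds a tolerance (0.5s here,
# chosen arbitrarily). Parses the "Last offset" line of `chronyc tracking`.

offset_ok() {
    # $1 = offset in seconds (may be negative); succeed if |offset| <= 0.5
    awk -v o="$1" 'BEGIN { exit ((o < 0 ? -o : o) <= 0.5) ? 0 : 1 }'
}

check_node_clock() {
    offset=$(chronyc tracking | awk '/Last offset/ {print $4}')
    if offset_ok "$offset"; then
        echo "clock OK (offset ${offset}s)"
    else
        echo "clock skewed (offset ${offset}s) -- builds may misbehave" >&2
        return 1
    fi
}
```

Run `check_node_clock` on each node; a large or growing offset would point at the ntp issue from the Bugzilla above.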
Re: Build pod already exists
On Sun, Jan 28, 2018 at 6:05 AM, Lionel Orellana wrote:
> Thanks Ben.
>
> I can reproduce it with the nodejs and wildfly build configs directly off
> the catalog.

and you haven't created those buildconfigs previously and then deleted them, within that project? This is the first time you're creating the buildconfig in the project?

> My first thought was that the network issue was causing the build pod to
> be terminated and a second one being created before the first one dies.
> Is that a possibility?

I don't think so; once we create a pod for a given build we consider it done and shouldn't be creating another one, but the logs should shed some light on that.

> I will get the logs tomorrow.
>
> On Sun, 28 Jan 2018 at 1:51 pm, Ben Parees wrote:
>
>> On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana wrote:
>>
>>> Hi,
>>>
>>> I'm seeing a random error when running builds. Some builds fail very
>>> quickly with "build pod already exists". This is happening with a number
>>> of build configs and seems to occur when more than one build from
>>> different build configs is running at the same time.
>>
>> This can happen if you "reset" the build sequence number in your
>> buildconfig (buildconfig.status.lastVersion). Are you recreating
>> buildconfigs that may have previously existed and run builds, or editing
>> the buildconfig in a way that might be resetting the lastVersion field to
>> an older value?
>>
>> Are you able to confirm whether or not a pod does exist for that build?
>> The build pod name will be of the form "buildconfigname-buildnumber-build".
>>
>> If you're able to recreate this consistently (assuming you're sure it's
>> not a case of having recreated an old buildconfig or reset the buildconfig
>> lastVersion sequence value), enabling level 4 logging in your master,
>> reproducing it, and then providing us with the logs would be helpful to
>> trace what is happening.
>>
>>> There is a possibly related error on one of the nodes:
>>>
>>> Jan 28 07:26:39 atomic-openshift-node[10121]: W0128 07:26:39.735522
>>> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
>>> hook for pod "nodejs-26-build_bimorl": Unexpected command output
>>> nsenter: cannot open : No such file or directory
>>> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1
>>
>> This shouldn't be related to issues with pod creation (since this error
>> wouldn't occur until after the pod is created and attempting to run on a
>> node), but it's definitely something you'll want to sort out. I've CCed
>> our networking lead into this thread.
>>
>>> -bash-4.2$ oc version
>>> oc v3.6.0+c4dd4cf
>>> kubernetes v1.6.1+5115d708d7
>>>
>>> Any ideas?
>>>
>>> Thanks
>>>
>>> Lionel.
>>
>> --
>> Ben Parees | OpenShift

--
Ben Parees | OpenShift
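Ben's "level 4 logging" suggestion can be done roughly as follows. This is a sketch, not the definitive procedure: the sysconfig path and service name below are the usual ones for a single-master atomic-openshift 3.x install (HA masters split into separate `-api`/`-controllers` services), so adjust for your layout.

```shell
# Sketch: raise the master's log verbosity to 4 (OpenShift 3.x).
# SYSCONFIG defaults to the common atomic-openshift path; override as needed.
SYSCONFIG=${SYSCONFIG:-/etc/sysconfig/atomic-openshift-master}

raise_loglevel() {
    # rewrite the OPTIONS= line to request --loglevel=4
    sed -i 's/^OPTIONS=.*/OPTIONS=--loglevel=4/' "$SYSCONFIG"
}

# After raising the level, restart the master, reproduce the failing
# build, and collect the logs, e.g.:
#   systemctl restart atomic-openshift-master
#   journalctl -u atomic-openshift-master --since "10 minutes ago" > master.log
```

Remember to drop the level back down afterwards; level 4 is very chatty.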
Re: Build pod already exists
Thanks Ben.

I can reproduce it with the nodejs and wildfly build configs directly off the catalog.

My first thought was that the network issue was causing the build pod to be terminated and a second one being created before the first one dies. Is that a possibility?

I will get the logs tomorrow.

On Sun, 28 Jan 2018 at 1:51 pm, Ben Parees wrote:
> On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana wrote:
>
>> Hi,
>>
>> I'm seeing a random error when running builds. Some builds fail very
>> quickly with "build pod already exists". This is happening with a number
>> of build configs and seems to occur when more than one build from
>> different build configs is running at the same time.
>
> This can happen if you "reset" the build sequence number in your
> buildconfig (buildconfig.status.lastVersion). Are you recreating
> buildconfigs that may have previously existed and run builds, or editing
> the buildconfig in a way that might be resetting the lastVersion field to
> an older value?
>
> Are you able to confirm whether or not a pod does exist for that build?
> The build pod name will be of the form "buildconfigname-buildnumber-build".
>
> If you're able to recreate this consistently (assuming you're sure it's
> not a case of having recreated an old buildconfig or reset the buildconfig
> lastVersion sequence value), enabling level 4 logging in your master,
> reproducing it, and then providing us with the logs would be helpful to
> trace what is happening.
>
>> There is a possibly related error on one of the nodes:
>>
>> Jan 28 07:26:39 atomic-openshift-node[10121]: W0128 07:26:39.735522
>> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
>> hook for pod "nodejs-26-build_bimorl": Unexpected command output
>> nsenter: cannot open : No such file or directory
>> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1
>
> This shouldn't be related to issues with pod creation (since this error
> wouldn't occur until after the pod is created and attempting to run on a
> node), but it's definitely something you'll want to sort out. I've CCed
> our networking lead into this thread.
>
>> -bash-4.2$ oc version
>> oc v3.6.0+c4dd4cf
>> kubernetes v1.6.1+5115d708d7
>>
>> Any ideas?
>>
>> Thanks
>>
>> Lionel.
>
> --
> Ben Parees | OpenShift
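The pod-name convention Ben describes can be checked directly from the command line. A minimal sketch (`build_pod_name`/`check_build_pod` are illustrative helpers, and the `nodejs`/`26` values are just the ones from the node log in this thread):

```shell
# Sketch: given a buildconfig name and build number, look for the build pod.
# Pod naming follows the <buildconfig>-<number>-build convention above.

build_pod_name() {
    printf '%s-%s-build\n' "$1" "$2"
}

check_build_pod() {
    pod=$(build_pod_name "$1" "$2")
    if oc get pod "$pod" -o name >/dev/null 2>&1; then
        echo "pod $pod exists"
    else
        echo "no pod named $pod"
    fi
}

# e.g. for build 26 of the nodejs buildconfig seen in the log line:
#   check_build_pod nodejs 26
```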
Re: Build pod already exists
On Sat, Jan 27, 2018 at 4:06 PM, Lionel Orellana wrote:
> Hi,
>
> I'm seeing a random error when running builds. Some builds fail very
> quickly with "build pod already exists". This is happening with a number
> of build configs and seems to occur when more than one build from
> different build configs is running at the same time.

This can happen if you "reset" the build sequence number in your buildconfig (buildconfig.status.lastVersion). Are you recreating buildconfigs that may have previously existed and run builds, or editing the buildconfig in a way that might be resetting the lastVersion field to an older value?

Are you able to confirm whether or not a pod does exist for that build? The build pod name will be of the form "buildconfigname-buildnumber-build".

If you're able to recreate this consistently (assuming you're sure it's not a case of having recreated an old buildconfig or reset the buildconfig lastVersion sequence value), enabling level 4 logging in your master, reproducing it, and then providing us with the logs would be helpful to trace what is happening.

> There is a possibly related error on one of the nodes:
>
> Jan 28 07:26:39 atomic-openshift-node[10121]: W0128 07:26:39.735522
> 10848 docker_sandbox.go:266] NetworkPlugin cni failed on the status
> hook for pod "nodejs-26-build_bimorl": Unexpected command output
> nsenter: cannot open : No such file or directory
> Jan 28 07:26:39 atomic-openshift-node[10121]: with error: exit status 1

This shouldn't be related to issues with pod creation (since this error wouldn't occur until after the pod is created and attempting to run on a node), but it's definitely something you'll want to sort out. I've CCed our networking lead into this thread.

> -bash-4.2$ oc version
> oc v3.6.0+c4dd4cf
> kubernetes v1.6.1+5115d708d7
>
> Any ideas?
>
> Thanks
>
> Lionel.

--
Ben Parees | OpenShift
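To test the "reset sequence number" theory described above, you can compare the buildconfig's counter with the highest existing build number. A sketch, assuming the OpenShift 3.x field name `status.lastVersion`, the `buildconfig=<name>` label on builds, and `<name>-<number>` build naming; verify these against your cluster before relying on it:

```shell
# Sketch: detect a buildconfig whose sequence counter has fallen behind
# its newest build. If the counter is behind, the next build reuses an old
# number and its pod name collides: "build pod already exists".

behind() { [ "$1" -lt "$2" ]; }   # is the counter behind the newest build?

check_sequence() {
    bc=$1
    last_version=$(oc get bc "$bc" -o jsonpath='{.status.lastVersion}')
    # builds are assumed to be named <buildconfig>-<number>; take the max
    newest_build=$(oc get builds -l buildconfig="$bc" -o name \
        | sed 's/.*-\([0-9][0-9]*\)$/\1/' | sort -n | tail -1)
    if behind "${last_version:-0}" "${newest_build:-0}"; then
        echo "lastVersion ($last_version) is behind newest build ($newest_build): collisions likely"
    else
        echo "sequence looks consistent"
    fi
}

# e.g.:  check_sequence nodejs
```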