Re: [Gluster-devel] [Gluster-infra] 8/10 AWS jenkins builders disconnected

2019-03-06 Thread Michael Scherer
On Wednesday, 06 March 2019 at 17:53 +0530, Sankarshan Mukhopadhyay
wrote:
> On Wed, Mar 6, 2019 at 5:38 PM Deepshikha Khandelwal
>  wrote:
> > 
> > Hello,
> > 
> > Today while debugging the centos7-regression failed builds I saw
> > most of the builders did not pass the instance status check on AWS
> > and were unreachable.
> > 
> > Misc investigated this and found the patch[1], which seems to
> > break the builders one after another. They all ran the
> > regression test for this specific change before going offline.
> > We suspect that this change results in an infinite loop of
> > processes, as we did not see any trace of an error in the system logs.
> > 
> > We did reboot all those builders and they all seem to be running
> > fine now.
> > 
> 
> The question though is - what to do about the patch, if the patch
> itself is the root cause? Is this assigned to anyone to look into?

We also pondered whether we should protect the builders from that kind
of issue. But since:
- we are not sure that the hypothesis is right
- any protection based on limiting the number of processes would
sooner or later block legitimate tests, and would require adjustment
(and likely investigation)

we chose not to go down that road for now.
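
For illustration only, a protection of the "limit the number of processes" kind mentioned above could be sketched as below. This is not something deployed on the builders; it is a minimal sketch assuming a Linux host and Python 3.7+. The helper names `capped_nproc` and `run_with_process_cap`, the value 2048, and the test command are all hypothetical. It lowers `RLIMIT_NPROC` in the child process only, between fork and exec.

```python
import resource
import subprocess
import sys


def capped_nproc(max_procs):
    """Return a (soft, hard) RLIMIT_NPROC pair no higher than the current hard limit."""
    _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot raise its hard limit, so cap the request.
        max_procs = min(max_procs, hard)
    return (max_procs, max_procs)


def run_with_process_cap(cmd, max_procs):
    """Run cmd with RLIMIT_NPROC lowered in the child only (hypothetical helper)."""
    limits = capped_nproc(max_procs)

    def lower_limit():
        # Runs in the child between fork() and exec(); the parent keeps its limits.
        resource.setrlimit(resource.RLIMIT_NPROC, limits)

    return subprocess.run(cmd, preexec_fn=lower_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Hypothetical stand-in command; a real job would launch the regression script.
    result = run_with_process_cap([sys.executable, "-c", "print('test ran')"], 2048)
    print(result.stdout.strip(), result.returncode)
```

Two caveats match the concerns listed above: `RLIMIT_NPROC` counts every process owned by the same real user ID, not only the descendants of the test, and it is not enforced for privileged processes. A cap tight enough to stop a runaway loop could therefore also stop a legitimate test, and the value would need tuning.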

> > Please let us know if you see any such issues again.
> > 
> > [1] https://review.gluster.org/#/c/glusterfs/+/22290/
> 
> 
-- 
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS




___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] 8/10 AWS jenkins builders disconnected

2019-03-06 Thread Deepshikha Khandelwal
Yes, Mohit is looking into it. There's some issue in the patch itself.

I forgot to link the bug filed for this:
https://bugzilla.redhat.com/show_bug.cgi?id=1685813

On Wed, Mar 6, 2019 at 5:54 PM Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> On Wed, Mar 6, 2019 at 5:38 PM Deepshikha Khandelwal
>  wrote:
> >
> > Hello,
> >
> > Today while debugging the centos7-regression failed builds I saw most of
> > the builders did not pass the instance status check on AWS and were
> > unreachable.
> >
> > Misc investigated this and found the patch[1], which seems to break
> > the builders one after another. They all ran the regression test for
> > this specific change before going offline.
> > We suspect that this change results in an infinite loop of processes,
> > as we did not see any trace of an error in the system logs.
> >
> > We did reboot all those builders and they all seem to be running fine
> > now.
> >
>
> The question though is - what to do about the patch, if the patch
> itself is the root cause? Is this assigned to anyone to look into?
>
> > Please let us know if you see any such issues again.
> >
> > [1] https://review.gluster.org/#/c/glusterfs/+/22290/
>
>
> --
> sankarshan mukhopadhyay
> 
> ___
> Gluster-infra mailing list
> gluster-in...@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-infra
>

Re: [Gluster-devel] [Gluster-infra] 8/10 AWS jenkins builders disconnected

2019-03-06 Thread Sankarshan Mukhopadhyay
On Wed, Mar 6, 2019 at 5:38 PM Deepshikha Khandelwal
 wrote:
>
> Hello,
>
> Today while debugging the centos7-regression failed builds I saw most of the 
> builders did not pass the instance status check on AWS and were unreachable.
>
> Misc investigated this and found the patch[1], which seems to break the
> builders one after another. They all ran the regression test for this
> specific change before going offline.
> We suspect that this change results in an infinite loop of processes, as we
> did not see any trace of an error in the system logs.
>
> We did reboot all those builders and they all seem to be running fine now.
>

The question though is - what to do about the patch, if the patch
itself is the root cause? Is this assigned to anyone to look into?

> Please let us know if you see any such issues again.
>
> [1] https://review.gluster.org/#/c/glusterfs/+/22290/


-- 
sankarshan mukhopadhyay
