Great that you were able to figure out the root cause.

Cheers.

On Wed, Apr 13, 2022, 08:20 Sifu Tian <[email protected]> wrote:

> Hi Ash,
>
> I wanted to follow up and say that I found the root cause of the issue.
>
> What I discovered is that during each stage, depending on which scripts our
> code is running, memory and CPU usage will spike.  These spikes can last
> anywhere from a few seconds up to 30 seconds.
> Within our Kubernetes elastic agent profile YAML files, I had defined memory
> and CPU requests and limits.  The reason I did this was that, when running
> multiple pipelines, the pods would saturate only one node while the other
> nodes stayed idle.
> This turned out to be the cause of the pods either hanging or being killed
> by Kubernetes.  It was an oversight on my part, because I kept adjusting the
> CPU and memory allocation in the file with no improvement
> (e.g. memory 1Gi - 4Gi and CPU 1.0 - 4.0).
> Once I removed the specified memory and CPU from the YAML file and allowed
> Kubernetes to handle the distribution, none of the pods died.  I did notice
> one or two pipelines that handle our very CPU- and memory-heavy builds hang,
> but I can adjust the instances to accommodate the load on the cluster.
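>
> For anyone who runs into the same problem, the pod configuration in my
> elastic agent profile was shaped roughly like the sketch below (the pod
> name, container name, and image tag are placeholders, not our exact file):
>
>   apiVersion: v1
>   kind: Pod
>   metadata:
>     name: gocd-elastic-agent        # placeholder name
>   spec:
>     containers:
>       - name: gocd-agent
>         image: gocd/gocd-agent-alpine-3.15:v22.1.0   # placeholder image/tag
>         resources:
>           requests:
>             memory: "1Gi"    # added to try to get the scheduler to spread
>             cpu: "1.0"       # pods across nodes instead of packing one node
>           limits:
>             memory: "4Gi"    # a spike past the memory limit gets the pod OOM-killed
>             cpu: "4.0"       # a spike past the CPU limit throttles the container
>
> Deleting the resources block entirely is what stopped the pods from being
> killed mid-job.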
>
> [image: Screen Shot 2022-04-12 at 10.48.24 PM.png]
>
> On Tuesday, April 12, 2022 at 9:41:12 AM UTC-4 [email protected] wrote:
>
>> This behaviour of GoCD usually points to the agent process dying mid-way;
>> GoCD then automatically re-assigns the work to another agent, which starts
>> from scratch. Can you check the agent process logs for the earlier runs to
>> see if there are any exceptions that might have caused the GoCD server to
>> reassign the job to another agent?
>>
>> Sometimes it could be the pipeline itself that's killing the agent
>> process for a variety of reasons.
>>
>> On Tue, 12 Apr 2022 at 19:02, Sifu Tian <[email protected]> wrote:
>>
>>> [image: Screen Shot 2022-04-12 at 9.21.28 AM.png]
>>>
>>> Hi all,
>>>
>>> I am seeing some unusual behavior on random pipelines.
>>> When a pipeline runs, it starts out fine, but the job gets to a certain
>>> point and then starts all over again, pulling materials and running the
>>> same tasks.  The first job appears to hang or simply stops, and a new copy
>>> of the same job is run.  The pipeline never fails; it just keeps running
>>> and spawns the same job over and over.  The Kubernetes cluster status page
>>> shows only one pod, but the GoCD console shows that a new pod was issued.
>>>
>>> I am using the Kubernetes elastic agent plugin.
>>> The GoCD server and agents are at version 22.1.
>>>
>>> Any thoughts or help would be greatly appreciated.
>>>
>>> [image: Screen Shot 2022-04-12 at 9.23.20 AM.png]
>>>
>>
>>
>> --
>>
>> Ashwanth Kumar / ashwanthkumar.in
>>