Re: Update on DUnit test pipeline on Github actions

2023-05-04 Thread Sai Boorlagadda
Mark,

I haven't created a specific ticket for larger VMs. I heard there is
an initiative within the infra team to sponsor VMs bigger than the free
tier we use for GitHub Actions, so we have to wait until that initiative
picks up steam.

On Thu, 4 May 2023 at 13:58, Mark Bretl wrote:

> Sai,
>
> Do you have an INFRA Jira ticket?
>
> --Mark
>
> On Tue, May 2, 2023 at 9:53 AM Sai Boorlagadda wrote:
>
> > There is no ETA provided.
> >
> > Sai
> >
> > On Sat, 29 Apr 2023 at 14:47, Kirk Lund wrote:
> >
> > > Did INFRA give any hint as to when they might provide the bigger VMs?
> > >
> > > On Mon, Apr 17, 2023 at 8:24 PM Sai Boorlagadda <sai.boorlaga...@gmail.com> wrote:
> > >
> > > > I went ahead and merged the GitHub workflow jobs that run the
> > > > WAN, CQ, Assembly, and Management distributed tests.
> > > >
> > > > Free worker VMs have 2 cores, and tuning parameters
> > > > isn't speeding up the geode-core DUnits.
> > > >
> > > > Talking to the infra team, I found that infra is working on providing
> > > > self-hosted workers (sponsored by infra) that are much bigger VMs.
> > > >
> > > > So until such VMs are available, I am going to look for
> > > > alternative solutions.
> > > >
> > > > On Thu, 13 Apr 2023 at 15:08, Kirk Lund wrote:
> > > >
> > > > > I see that there is at least one person concerned about DUnit tests
> > > > > requiring longer timeouts. This is the current situation with an
> > > > > unknown number of the DUnit tests. One possibility is to move the
> > > > > worst offenders to a new src set within geode-core and then give that
> > > > > its own job with a larger timeout. The longer term solution is to fix
> > > > > or even rewrite some of those tests. Excluding them is also not a
> > > > > viable option as we risk losing important test coverage that way. I
> > > > > agree that some of these tests need a lot more help than tweaking
> > > > > overall job timeout values, but without a lot more time commitment
> > > > > from contributors that might not be an option for some time.
> > > > >
> > > > > -Kirk
> > > > >
> > > > > On Thu, Apr 13, 2023 at 3:02 PM Kirk Lund wrote:
> > > > >
> > > > > > Is the coreDistributedTests the only dunit job that currently takes
> > > > > > too long? If it is we may want to split that into more than one job.
> > > > > >
> > > > > > -Kirk
> > > > > >
> > > > > > On Wed, Apr 12, 2023 at 7:58 PM Sai Boorlagadda <sai.boorlaga...@gmail.com> wrote:
> > > > > >
> > > > > > >> All,
> > > > > > >>
> > > > > > >> There is an upper bound on job execution time on free workers
> > > > > > >> (6 hours max [1]), which can be configured beyond 6 hours with a
> > > > > > >> self-hosted worker. All of our pipeline jobs use `--max-workers`
> > > > > > >> to parallelize Gradle tasks, but `testMaxParallelForks` is left at
> > > > > > >> its default (1/4th of the available CPU cores). Primarily because
> > > > > > >> only a single test runs in each parallel fork, the geode-core
> > > > > > >> distributed tests take more than 6 hours. Apart from the core
> > > > > > >> distributed tests, which still need a solution, most DUnit tests
> > > > > > >> pass [2] when split into individual jobs (WAN, CQ, Lucene,
> > > > > > >> assembly, management).
> > > > > > >>
> > > > > > >> The options are to reach out to the infra team and to experiment
> > > > > > >> with `--max-workers` to parallelize more tests, rather than
> > > > > > >> running tests in parallel within a fork.
> > > > > > >>
> > > > > > >> I am going to wait a few days for answers from the infra team
> > > > > > >> before I create a PR to add at least the passing DUnits.
> > > > > >>
> > > > > >> [1]
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
> > > > > >> [2] https://github.com/apache/geode/actions/runs/4639012912
> > > > > >>
> > > > > >> Sai
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Update on DUnit test pipeline on Github actions

2023-05-04 Thread Mark Bretl
Sai,

Do you have an INFRA Jira ticket?

--Mark

On Tue, May 2, 2023 at 9:53 AM Sai Boorlagadda wrote:

> There is no ETA provided.
>
> Sai
>
> On Sat, 29 Apr 2023 at 14:47, Kirk Lund wrote:
>
> > Did INFRA give any hint as to when they might provide the bigger VMs?
> >
> > On Mon, Apr 17, 2023 at 8:24 PM Sai Boorlagadda <sai.boorlaga...@gmail.com> wrote:
> >
> > > I went ahead and merged the GitHub workflow jobs that run the
> > > WAN, CQ, Assembly, and Management distributed tests.
> > >
> > > Free worker VMs have 2 cores, and tuning parameters
> > > isn't speeding up the geode-core DUnits.
> > >
> > > Talking to the infra team, I found that infra is working on providing
> > > self-hosted workers (sponsored by infra) that are much bigger VMs.
> > >
> > > So until such VMs are available, I am going to look for
> > > alternative solutions.
> > >
> > > On Thu, 13 Apr 2023 at 15:08, Kirk Lund wrote:
> > >
> > > > I see that there is at least one person concerned about DUnit tests
> > > > requiring longer timeouts. This is the current situation with an
> > > > unknown number of the DUnit tests. One possibility is to move the
> > > > worst offenders to a new src set within geode-core and then give that
> > > > its own job with a larger timeout. The longer term solution is to fix
> > > > or even rewrite some of those tests. Excluding them is also not a
> > > > viable option as we risk losing important test coverage that way. I
> > > > agree that some of these tests need a lot more help than tweaking
> > > > overall job timeout values, but without a lot more time commitment
> > > > from contributors that might not be an option for some time.
> > > >
> > > > -Kirk
> > > >
> > > > On Thu, Apr 13, 2023 at 3:02 PM Kirk Lund wrote:
> > > >
> > > > > Is the coreDistributedTests the only dunit job that currently takes
> > > > > too long? If it is we may want to split that into more than one job.
> > > > >
> > > > > -Kirk
> > > > >
> > > > > On Wed, Apr 12, 2023 at 7:58 PM Sai Boorlagadda <sai.boorlaga...@gmail.com> wrote:
> > > > >
> > > > > >> All,
> > > > > >>
> > > > > >> There is an upper bound on job execution time on free workers
> > > > > >> (6 hours max [1]), which can be configured beyond 6 hours with a
> > > > > >> self-hosted worker. All of our pipeline jobs use `--max-workers`
> > > > > >> to parallelize Gradle tasks, but `testMaxParallelForks` is left at
> > > > > >> its default (1/4th of the available CPU cores). Primarily because
> > > > > >> only a single test runs in each parallel fork, the geode-core
> > > > > >> distributed tests take more than 6 hours. Apart from the core
> > > > > >> distributed tests, which still need a solution, most DUnit tests
> > > > > >> pass [2] when split into individual jobs (WAN, CQ, Lucene,
> > > > > >> assembly, management).
> > > > > >>
> > > > > >> The options are to reach out to the infra team and to experiment
> > > > > >> with `--max-workers` to parallelize more tests, rather than
> > > > > >> running tests in parallel within a fork.
> > > > > >>
> > > > > >> I am going to wait a few days for answers from the infra team
> > > > > >> before I create a PR to add at least the passing DUnits.
> > > > >>
> > > > >> [1]
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
> > > > >> [2] https://github.com/apache/geode/actions/runs/4639012912
> > > > >>
> > > > >> Sai
> > > > >>
> > > > >
> > > >
> > >
> >
>
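
The fork arithmetic from the original message in this thread
(`testMaxParallelForks` defaulting to 1/4th of the available CPU cores, on
free workers with 2 cores) can be sketched roughly as below. This is only an
illustration of the arithmetic: the function name `default_test_forks` and
the floor of one fork are assumptions, not taken from the Geode build
scripts.

```python
def default_test_forks(available_cores: int) -> int:
    """Hypothetical helper: a quarter of the cores, rounded down.

    Assumption: the result is floored at 1, since a test run needs
    at least one fork to make progress.
    """
    return max(1, available_cores // 4)


# Compare a 2-core free worker with some larger (hypothetical) VM sizes.
for cores in (2, 4, 8, 16):
    print(f"{cores} cores -> {default_test_forks(cores)} test fork(s)")
```

Under these assumptions a 2-core free worker ends up with a single test
fork, which fits the observation above that tuning parameters on those VMs
did not speed up the geode-core DUnits.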