Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-06 Thread Rob Clark
On Mon, Apr 6, 2020 at 8:43 AM Adam Jackson  wrote:
>
> On Sat, 2020-04-04 at 08:11 -0700, Rob Clark wrote:
> > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > strictly pre-merge CI.
> > >
> > > Thanks for the suggestion! I implemented something like this for Mesa:
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> >
> > I wouldn't mind manually triggering pipelines, but unless there is
> > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > click first the container jobs.. then wait.. then the build jobs..
> > then wait some more.. and then finally the actual runners.  That would
> > be a real step back in terms of usefulness of CI.. one might call it a
> > regression :-(
>
> I think that's mostly a complaint about the conditionals we've written
> so far, tbh. As I commented on the bug, when I clicked the container
> job (which the rules happen to have evaluated to being "manual"), every
> job (recursively) downstream of it got enqueued, which isn't what
> you're describing. So I think if you can describe the UX you'd like we
> can write rules to make that reality.

Ok, I was fearing that we'd have to manually trigger each stage of
dependencies in the pipeline, which wouldn't be so bad except that
gitlab makes you wait for the previous stage to complete before
triggering the next one.

The ideal thing would be to be able to click any job that we want to
run, say "arm64_a630_gles31", and for gitlab to realize that it needs
to automatically trigger that job's dependencies (meson-arm64, and
arm_build+arm_test).  But I'm not sure if that is a thing gitlab can do.

Triggering just the first container jobs and having everything from
there run automatically would be ok if the current rules to filter out
unneeded jobs still apply, ie. a panfrost change doesn't trigger
freedreno CI jobs and vice versa.  But tbh I don't understand enough
of what that MR is doing to tell whether that is what it does.  (It
was suggested on IRC that this is probably what it does.)
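
To illustrate, the kind of path-based filter I mean is gitlab's
rules:changes. A rough sketch, with the job name from above and made-up
paths:

  arm64_a630_gles31:
    stage: test
    rules:
      # run this freedreno hw job only when freedreno code changes
      - changes:
          - src/freedreno/**/*
          - src/gallium/drivers/freedreno/**/*
        when: on_success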

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-06 Thread Adam Jackson
On Sat, 2020-04-04 at 08:11 -0700, Rob Clark wrote:
> On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > > pre-merge CI.
> > 
> > Thanks for the suggestion! I implemented something like this for Mesa:
> > 
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> 
> I wouldn't mind manually triggering pipelines, but unless there is
> some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> click first the container jobs.. then wait.. then the build jobs..
> then wait some more.. and then finally the actual runners.  That would
> be a real step back in terms of usefulness of CI.. one might call it a
> regression :-(

I think that's mostly a complaint about the conditionals we've written
so far, tbh. As I commented on the bug, when I clicked the container
job (which the rules happen to have evaluated to being "manual"), every
job (recursively) downstream of it got enqueued, which isn't what
you're describing. So I think if you can describe the UX you'd like we
can write rules to make that reality.

But I don't really know which jobs are most expensive in terms of
bandwidth, or storage, or CPUs, and even if I knew those I don't know
how to map those to currency. So I'm not sure if the UI we'd like would
minimize the cost the way we'd like.

- ajax



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-05 Thread Nicolas Dufresne
On Saturday, 4 April 2020 at 15:55 +0200, Andreas Bergmeier wrote:
> The problem of data transfer costs is not new in Cloud environments. At
> work we usually just opt for paying for it since dev time is scarcer. For
> private projects though, I opt for aggressive (remote) caching.
> So you can set up a global cache in Google Cloud Storage and more local
> caches wherever your executors are (this reduces egress as much as
> possible). This setup works great with Bazel and Pants, among others.
> Note that these systems are pretty hermetic, in contrast to Meson.
> IIRC Eric by now works at Google. They internally use Blaze, which AFAIK
> does aggressive caching, too.
> So maybe using any of these systems would be a way of not having to
> sacrifice any of the current functionality.
> The downside is that you lose a bit of dev productivity since you cannot
> eyeball your build definitions anymore.
> 
> 
Did you mean Bazel [0]? I'm not sure I follow your reasoning; why is
Meson vs Bazel related to this issue?

Nicolas

[0] https://bazel.build/

> my 2c
> 
> 
> On Fri, 28 Feb 2020 at 20:34, Eric Anholt  wrote:
> > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
> > >
> > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > > > b) we probably need to take a large step back here.
> > > > >
> > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > sponsorship money that they are just giving straight to google to pay
> > > > > for hosting credits? Google are profiting in some minor way from these
> > > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > > sort of discounts here. Having google sponsor the credits costs google
> > > > > substantially less than having any other company give us money to do
> > > > > it.
> > > >
> > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > comparable in terms of what you get and what you pay for them.
> > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > services are cheaper, but then you need to find someone who is going
> > > > to properly administer the various machines, install decent
> > > > monitoring, make sure that more storage is provisioned when we need
> > > > more storage (which is basically all the time), make sure that the
> > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > machines has had a drive in imminent-failure state for the last few
> > > > months), etc.
> > > >
> > > > Given the size of our service, that's a much better plan (IMO) than
> > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > years. But as long as that's the resources we have, then we're paying
> > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > problems.
> > >
> > > Admin for gitlab and CI is a full-time role anyway. The system is
> > > definitely not self-sustaining without time being put in by you and
> > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > to just pay an admin to administer the real hw + gitlab/CI, would that
> > > not be a better use of the money? I didn't think we could afford $75k
> > > for an admin, but suddenly we can afford it for gitlab credits?
> > 
> > As I think about the time that I've spent at google in less than a
> > year on trying to keep the lights on for CI and optimize our
> > infrastructure in the current cloud environment, that's more than the
> > entire yearly budget you're talking about here.  Saying "let's just
> > pay for people to do more work instead of paying for full-service
> > cloud" is not a cost optimization.
> > 
> > 
> > > > Yes, we could federate everything back out so everyone runs their own
> > > > builds and executes those. Tinderbox did something really similar to
> > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > pre-merge testing, mind.
> > >
> > > Why? does gitlab not support the model? having builds done in parallel
> > > on runners closer to the test runners seems like it should be a thing.
> > > I guess artifact transfer would cost less then as a result.
> > 
> > Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > downloaded by 4 freedreno and 6ish lava jobs, about 100 pipelines/day),
> > which makes ~1.8TB/month ($180 or so).  We could build local storage next
> > to the lava dispatcher so that the artifacts didn't have to contain
> > the rootfs that came from the container (~2/3 of the insides of the
> > zip file), but that's another service to build and maintain.  Building
> > the drivers once locally and storing them would save downloading the
> > other ~1/3 of the inside of the zip file, but that requires a big
> > enough system to do builds in time.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-05 Thread Nicolas Dufresne
On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > > pre-merge CI.
> > 
> > Thanks for the suggestion! I implemented something like this for Mesa:
> > 
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > 
> 
> I wouldn't mind manually triggering pipelines, but unless there is
> some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> click first the container jobs.. then wait.. then the build jobs..
> then wait some more.. and then finally the actual runners.  That would
> be a real step back in terms of usefulness of CI.. one might call it a
> regression :-(

On the GStreamer side we have moved some existing pipelines to manual
mode. As we use needs: between jobs, we could simply set the first job
to manual (in our case it's a single job called manifest; in your case
it would be the N container jobs). This way you can have a manual
pipeline that is triggered in a single click (or a few). Here's an
example:

https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292

Those are our post-merge pipelines; we only trigger them if we suspect
a problem.
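
For illustration, the shape of that setup in .gitlab-ci.yml, with
made-up job names (the manual first job gates everything behind it via
needs:):

  stages:
    - manifest
    - build
    - test

  manifest:
    stage: manifest
    when: manual
    script:
      - echo "generate the manifest"

  build-gst:
    stage: build
    needs: ["manifest"]
    script:
      - echo "build"

  test-gst:
    stage: test
    needs: ["build-gst"]
    script:
      - echo "test"

Once manifest is triggered by hand, build-gst and test-gst follow
automatically.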

> 
> Is there a possible middle ground where pre-marge pipelines that touch
> a particular driver trigger that driver's CI jobs, but MRs that don't
> touch that driver but do touch shared code don't until triggered by
> marge?  Ie. if I have a MR that only touches nir, it's probably ok to
> not run freedreno jobs until marge triggers it.  But if I have a MR
> that is touching freedreno, I'd really rather not have to wait until
> marge triggers the freedreno CI jobs.
> 
> Btw, I was under the impression (from periodically skimming the logs
> in #freedesktop, so I could well be missing or misunderstanding
> something) that caching/etc had been improved and mesa's part of the
> egress wasn't the bigger issue at this point?
> 
> BR,
> -R



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-05 Thread Andreas Bergmeier
The problem of data transfer costs is not new in Cloud environments. At
work we usually just opt for paying for it since dev time is scarcer. For
private projects though, I opt for aggressive (remote) caching.
So you can set up a global cache in Google Cloud Storage and more local
caches wherever your executors are (this reduces egress as much as
possible). This setup works great with Bazel and Pants, among others.
Note that these systems are pretty hermetic, in contrast to Meson.
IIRC Eric by now works at Google. They internally use Blaze, which AFAIK
does aggressive caching, too.
So maybe using any of these systems would be a way of not having to
sacrifice any of the current functionality.
The downside is that you lose a bit of dev productivity since you cannot
eyeball your build definitions anymore.

my 2c


On Fri, 28 Feb 2020 at 20:34, Eric Anholt  wrote:

> On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
> >
> > On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > >
> > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > > b) we probably need to take a large step back here.
> > > >
> > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > sponsorship money that they are just giving straight to google to pay
> > > > for hosting credits? Google are profiting in some minor way from these
> > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > sort of discounts here. Having google sponsor the credits costs google
> > > > substantially less than having any other company give us money to do
> > > > it.
> > >
> > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > comparable in terms of what you get and what you pay for them.
> > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > services are cheaper, but then you need to find someone who is going
> > > to properly administer the various machines, install decent
> > > monitoring, make sure that more storage is provisioned when we need
> > > more storage (which is basically all the time), make sure that the
> > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > machines has had a drive in imminent-failure state for the last few
> > > months), etc.
> > >
> > > Given the size of our service, that's a much better plan (IMO) than
> > > relying on someone who a) isn't an admin by trade, b) has a million
> > > other things to do, and c) hasn't wanted to do it for the past several
> > > years. But as long as that's the resources we have, then we're paying
> > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > problems.
> >
> > Admin for gitlab and CI is a full-time role anyway. The system is
> > definitely not self-sustaining without time being put in by you and
> > anholt still. If we have $75k to burn on credits, and it was diverted
> > to just pay an admin to administer the real hw + gitlab/CI, would that
> > not be a better use of the money? I didn't think we could afford $75k
> > for an admin, but suddenly we can afford it for gitlab credits?
>
> As I think about the time that I've spent at google in less than a
> year on trying to keep the lights on for CI and optimize our
> infrastructure in the current cloud environment, that's more than the
> entire yearly budget you're talking about here.  Saying "let's just
> pay for people to do more work instead of paying for full-service
> cloud" is not a cost optimization.
>
>
> > > Yes, we could federate everything back out so everyone runs their own
> > > builds and executes those. Tinderbox did something really similar to
> > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > pre-merge testing, mind.
> >
> > Why? does gitlab not support the model? having builds done in parallel
> > on runners closer to the test runners seems like it should be a thing.
> > I guess artifact transfer would cost less then as a result.
>
> Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> downloaded by 4 freedreno and 6ish lava jobs, about 100 pipelines/day),
> which makes ~1.8TB/month ($180 or so).  We could build local storage next
> to the lava dispatcher so that the artifacts didn't have to contain
> the rootfs that came from the container (~2/3 of the insides of the
> zip file), but that's another service to build and maintain.  Building
> the drivers once locally and storing them would save downloading the
> other ~1/3 of the inside of the zip file, but that requires a big
> enough system to do builds in time.
>
> I'm planning on doing a local filestore for google's lava lab, since I
> need to be able to move our xml files off of the lava DUTs to get the
> xml results we've become accustomed to, but this would not bubble up
> to being a priority for my time if I wasn't doing it anyway.  If it
> takes me a single day to set all this up (I estimate a couple of
> weeks), that costs my employer a lot more than sponsoring the costs of
> the inefficiencies of the system that has accumulated.
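
(A quick sanity check of Eric's numbers above, assuming each of the ~10
downstream jobs fetches the artifact once:

  60 MB x 10 downloads x 100 pipelines/day x 30 days = ~1.8 TB/month
  ~1800 GB x ~$0.10/GB egress = ~$180/month

which matches the estimate he quotes.)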

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-05 Thread Peter Hutterer
On Sat, Apr 04, 2020 at 11:16:08AM -0700, Rob Clark wrote:
> On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  wrote:
> >
> > On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > > strictly pre-merge CI.
> > > >
> > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > >
> > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > > >
> > >
> > > I wouldn't mind manually triggering pipelines, but unless there is
> > > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > > click first the container jobs.. then wait.. then the build jobs..
> > > then wait some more.. and then finally the actual runners.  That would
> > > be a real step back in terms of usefulness of CI.. one might call it a
> > > regression :-(
> >
> > On the GStreamer side we have moved some existing pipelines to manual
> > mode. As we use needs: between jobs, we could simply set the first job
> > to manual (in our case it's a single job called manifest; in your case
> > it would be the N container jobs). This way you can have a manual
> > pipeline that is triggered in a single click (or a few). Here's an
> > example:
> >
> > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> >
> > Those are our post-merge pipelines; we only trigger them if we suspect
> > a problem.
> >
> 
> I'm not sure that would work for mesa since the hierarchy of jobs
> branches out pretty far.. ie. if I just clicked the arm64 build + test
> container jobs, and everything else ran automatically after that, it
> would end up running all the CI jobs for all the arm devices (or at
> least all the 64b ones)

Generate your gitlab-ci from a template so each pipeline has its own job
dependency. The duplication won't hurt you if it's expanded through
templating and it gives you fine-grained running of the manual jobs.

We're using this in ci-templates/libevdev/libinput for the various
distributions and their versions so each distribution+version is effectively
its own pipeline. But we only need to maintain one job in the actual
template file.

https://freedesktop.pages.freedesktop.org/ci-templates/ci-fairy.html#templating-gitlab-ci-yml
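
The effect is roughly the following, sketched here with plain YAML
extends and made-up names (in the actual ci-templates setup these
per-distribution jobs are generated from a single template rather than
written out by hand):

  .default-build:
    when: manual
    script:
      - meson build
      - ninja -C build

  build-fedora-31:
    extends: .default-build
    image: fedora:31

  build-ubuntu-20.04:
    extends: .default-build
    image: ubuntu:20.04

Each distribution+version gets its own manual job (and from there its
own dependency chain), but the job body only has to be maintained once.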

Cheers,
   Peter


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-05 Thread Peter Hutterer
On Sat, Apr 04, 2020 at 08:11:23AM -0700, Rob Clark wrote:
> On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> >
> > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > > pre-merge CI.
> >
> > Thanks for the suggestion! I implemented something like this for Mesa:
> >
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> >
> 
> I wouldn't mind manually triggering pipelines, but unless there is
> some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> click first the container jobs.. then wait.. then the build jobs..
> then wait some more.. and then finally the actual runners.  That would
> be a real step back in terms of usefulness of CI.. one might call it a
> regression :-(

I *think* this should work though, if you set up the right job dependencies.
A very simple example:
https://gitlab.freedesktop.org/whot/ci-playground/pipelines/128601

job1 is "when:manual", job2 has "needs: job1", job3 has "needs: job2".
Nothing runs at first; if you trigger job1, it'll cascade down to jobs 2
and 3.

The main limit you have here is the stages: where a job is part of a stage
but does not have an explicit "needs:", it will wait for the previous stage
to complete. That will never happen if one job in that stage has a manual
dependency. See this pipeline as an example:
https://gitlab.freedesktop.org/whot/ci-playground/pipelines/128605

So basically: if you set up all your jobs with the correct "needs" you could
even have a noop stage for user interface purposes. Here's an example:
https://gitlab.freedesktop.org/whot/ci-playground/pipelines/128606

It has a UI stage with "test-arm" and "test-x86" manual jobs. It has other
stages with dependent jobs on those (cascading down) but it also has 
a set of autorun jobs that run independently of the manual triggers. When you
push, the autorun jobs run. When you trigger "test-arm" manually, it
triggers the various dependent jobs.

So I think what you want to do is possible, it just requires some tweaking
of the "needs" entries.

Cheers,
   Peter



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Sat, Apr 4, 2020 at 11:41 AM Rob Clark  wrote:
>
> On Sat, Apr 4, 2020 at 11:16 AM Rob Clark  wrote:
> >
> > On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  wrote:
> > >
> > > On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > > > strictly pre-merge CI.
> > > > >
> > > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > > >
> > > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > > > >
> > > >
> > > > I wouldn't mind manually triggering pipelines, but unless there is
> > > > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > > > click first the container jobs.. then wait.. then the build jobs..
> > > > then wait some more.. and then finally the actual runners.  That would
> > > > be a real step back in terms of usefulness of CI.. one might call it a
> > > > regression :-(
> > >
> > > On the GStreamer side we have moved some existing pipelines to manual
> > > mode. As we use needs: between jobs, we could simply set the first job
> > > to manual (in our case it's a single job called manifest; in your case
> > > it would be the N container jobs). This way you can have a manual
> > > pipeline that is triggered in a single click (or a few). Here's an
> > > example:
> > >
> > > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> > >
> > > Those are our post-merge pipelines; we only trigger them if we suspect
> > > a problem.
> > >
> >
> > I'm not sure that would work for mesa since the hierarchy of jobs
> > branches out pretty far.. ie. if I just clicked the arm64 build + test
> > container jobs, and everything else ran automatically after that, it
> > would end up running all the CI jobs for all the arm devices (or at
> > least all the 64b ones)
>
> update: pepp pointed out on #dri-devel that the path-based rules
> should still apply to prune out hw CI jobs for hw not affected by the
> MR.  If that is the case, and we only need to click the container jobs
> (without then doing the wait dance), then this doesn't sound as
> bad as I feared.


PS. I should add that in these wfh days, I'm relying on CI to be able
to test changes on some generations of hw that I don't physically have
with me.  It's easy to take for granted; I did, until I thought about
what I'd do without CI.  So big thanks to all the people who are
working on CI, it's more important these days than you might realize
:-)

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Sat, Apr 4, 2020 at 11:16 AM Rob Clark  wrote:
>
> On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  wrote:
> >
> > On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > > strictly pre-merge CI.
> > > >
> > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > >
> > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > > >
> > >
> > > I wouldn't mind manually triggering pipelines, but unless there is
> > > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > > click first the container jobs.. then wait.. then the build jobs..
> > > then wait some more.. and then finally the actual runners.  That would
> > > be a real step back in terms of usefulness of CI.. one might call it a
> > > regression :-(
> >
> > On the GStreamer side we have moved some existing pipelines to manual
> > mode. As we use needs: between jobs, we could simply set the first job
> > to manual (in our case it's a single job called manifest; in your case
> > it would be the N container jobs). This way you can have a manual
> > pipeline that is triggered in a single click (or a few). Here's an
> > example:
> >
> > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> >
> > Those are our post-merge pipelines; we only trigger them if we suspect
> > a problem.
> >
>
> I'm not sure that would work for mesa since the hierarchy of jobs
> branches out pretty far.. ie. if I just clicked the arm64 build + test
> container jobs, and everything else ran automatically after that, it
> would end up running all the CI jobs for all the arm devices (or at
> least all the 64b ones)

update: pepp pointed out on #dri-devel that the path-based rules
should still apply to prune out hw CI jobs for hw not affected by the
MR.  If that is the case, and we only need to click the container jobs
(without then doing the wait dance), then this doesn't sound as
bad as I feared.

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne  wrote:
>
> On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > strictly pre-merge CI.
> > >
> > > Thanks for the suggestion! I implemented something like this for Mesa:
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > >
> >
> > I wouldn't mind manually triggering pipelines, but unless there is
> > some trick I'm not realizing, it is super cumbersome.  Ie. you have to
> > click first the container jobs.. then wait.. then the build jobs..
> > then wait some more.. and then finally the actual runners.  That would
> > be a real step back in terms of usefulness of CI.. one might call it a
> > regression :-(
>
> On the GStreamer side we have moved some existing pipelines to manual
> mode. As we use needs: between jobs, we could simply set the first job
> to manual (in our case it's a single job called manifest; in your case
> it would be the N container jobs). This way you can have a manual
> pipeline that is triggered in a single click (or a few). Here's an
> example:
>
> https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
>
> Those are our post-merge pipelines; we only trigger them if we suspect
> a problem.
>
>

I'm not sure that would work for mesa since the hierarchy of jobs
branches out pretty far.. ie. if I just clicked the arm64 build + test
container jobs, and everything else ran automatically after that, it
would end up running all the CI jobs for all the arm devices (or at
least all the 64b ones)

I'm not sure why gitlab works this way; a more sensible approach would
be to click on the last jobs you want to run and for that to
automatically propagate up, running whatever jobs are needed to run the
clicked job.

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Rob Clark
On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
>
> On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > pre-merge CI.
>
> Thanks for the suggestion! I implemented something like this for Mesa:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
>

I wouldn't mind manually triggering pipelines, but unless there is
some trick I'm not realizing, it is super cumbersome.  Ie. you have to
click first the container jobs.. then wait.. then the build jobs..
then wait some more.. and then finally the actual runners.  That would
be a real step back in terms of usefulness of CI.. one might call it a
regression :-(

Is there a possible middle ground where pre-marge pipelines that touch
a particular driver trigger that driver's CI jobs, but MRs that don't
touch that driver but do touch shared code don't until triggered by
marge?  Ie. if I have a MR that only touches nir, it's probably ok to
not run freedreno jobs until marge triggers it.  But if I have a MR
that is touching freedreno, I'd really rather not have to wait until
marge triggers the freedreno CI jobs.
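
That middle ground might be expressible with per-job rules, something
along these lines (made-up job name and paths; whether marge can press
the manual button for you is a separate question):

  freedreno-a630-test:
    rules:
      # run automatically when freedreno itself is touched
      - changes:
          - src/freedreno/**/*
        when: on_success
      # otherwise (e.g. a shared nir change), wait for a manual trigger
      - when: manual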

Btw, I was under the impression (from periodically skimming the logs
in #freedesktop, so I could well be missing or misunderstanding
something) that caching/etc had been improved and mesa's part of the
egress wasn't the bigger issue at this point?

BR,
-R


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-03 Thread Michel Dänzer
On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> pre-merge CI.

Thanks for the suggestion! I implemented something like this for Mesa:

https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-02 Thread Timur Kristóf
On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > 
> > 1. I think we should completely disable running the CI on MRs which are
> > marked WIP. Speaking from personal experience, I usually make a lot of
> > changes to my MRs before they are merged, so it is a waste of CI
> > resources.
> 
> In the mean time, you can help by taking the habit to use:
> 
>   git push -o ci.skip

Thanks for the advice, I wasn't aware such an option exists. Does this
also work on the mesa gitlab or is this a GStreamer only thing?

How hard would it be to make this the default?

> That's a much more difficult goal than it looks. Let each project
> manage its own CI graph and content, as each case is unique. Running
> more tests, or building more code, isn't the main issue, as the CPU
> time is mostly sponsored. The data transfers between the gitlab cloud
> and the runners (which are external), along with sending OS images to
> Lava labs, are likely the most expensive part.
> 
> As was already mentioned in the thread, what we are missing now, and
> what is being worked on, is per-group/project statistics that give us
> the hotspots so we can better target the optimization work.

Yes, would be nice to know what the hotspot is, indeed.

As far as I understand, the problem is not CI itself, but the bandwidth
needed by the build artifacts, right? Would it be possible to not host
the build artifacts on the gitlab, but rather only at the place where the
build actually happened? Or at least, only transfer the build artifacts
on-demand?

I'm not exactly familiar with how the system works, so sorry if this is
a silly question.



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-02 Thread Nicolas Dufresne
On Saturday, 29 February 2020 at 19:14 +0100, Timur Kristóf wrote:
> On Fri, 2020-02-28 at 10:43 +, Daniel Stone wrote:
> > On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
> >  wrote:
> > > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > > Yeah, changes on vulkan drivers or backend compilers should be
> > > > fairly
> > > > sandboxed.
> > > > 
> > > > We also have tools that only work for intel stuff, that should
> > > > never
> > > > trigger anything on other people's HW.
> > > > 
> > > > Could something be worked out using the tags?
> > > 
> > > I think so! We have the pre-defined environment variable
> > > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> > > 
> > > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> > > 
> > > That sounds like a pretty neat middle-ground to me. I just hope
> > > that
> > > new pipelines are triggered if new labels are added, because not
> > > everyone is allowed to set labels, and sometimes people forget...
> > 
> > There's also this which is somewhat more robust:
> > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
> 
> My 20 cents:
> 
> 1. I think we should completely disable running the CI on MRs which are
> marked WIP. Speaking from personal experience, I usually make a lot of
> changes to my MRs before they are merged, so it is a waste of CI
> resources.

In the mean time, you can help by taking the habit to use:

  git push -o ci.skip

CI is in fact run for all branches that you push. When we (GStreamer
Project) started our CI we wanted to limit this to MRs, but we haven't
found a good way yet (and Gitlab is not helping much). The main issue
is that it's nearly impossible to use the gitlab web API from a runner
(it requires a private key, in an all-or-nothing manner). But with the
current situation we are revisiting this.
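
For what it's worth, with recent gitlab a top-level workflow block is
one way to express "merge request pipelines only", with an escape hatch
for mesa-style ci-* branches. A sketch, not something we have deployed:

  workflow:
    rules:
      - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      - if: '$CI_COMMIT_BRANCH =~ /^ci-/'
      - when: never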

The truth is that probably every CI setup has a lot of room for
optimization, but optimizing can be really time consuming. So until we
have a reason to, we live with inefficiencies: over-sized artifacts,
unused artifacts, over-sized docker images, etc. Doing a new round of
optimization is obviously a clear short-term goal for projects,
including the GStreamer project. We have discussions going on and are
trying to find solutions. Notably, we would like to get rid of the
post-merge CI, as in a rebase flow like we have in GStreamer it's a
really minor risk.

> 
> 2. Maybe we could take this one step further and only allow the CI to
> be triggered manually instead of automatically on every push.
> 
> 3. I completely agree with Pierre-Eric on MR 2569: let's not run the
> full CI pipeline on every change, only those parts which are affected
> by the change. It not only costs money, but is also frustrating when
> you submit a change and get unrelated failures from a completely
> unrelated driver.

That's a much more difficult goal than it looks. Let each project
manage its own CI graph and content, as each case is unique. Running
more tests, or building more code, isn't the main issue, as the CPU
time is mostly sponsored. The data transfers between the gitlab cloud
and the runners (which are external), along with sending OS images to
Lava labs, are likely the most expensive part.

As was already mentioned in the thread, what we are missing now, and
what is being worked on, is per-group/project statistics that give us
the hotspots so we can better target the optimization work.

> 
> Best regards,
> Timur
> 


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-02 Thread Nicolas Dufresne
On Saturday, 29 February 2020 at 15:54 -0600, Jason Ekstrand wrote:
> On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf  wrote:
> > On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > > > 1. I think we should completely disable running the CI on MRs which are
> > > > marked WIP. Speaking from personal experience, I usually make a lot of
> > > > changes to my MRs before they are merged, so it is a waste of CI
> > > > resources.
> > > 
> > > In the mean time, you can help by taking the habit to use:
> > > 
> > >   git push -o ci.skip
> > 
> > Thanks for the advice, I wasn't aware such an option exists. Does this
> > also work on the mesa gitlab or is this a GStreamer only thing?
> 
> Mesa is already set up so that it only runs on MRs and branches named
> ci-* (or maybe it's ci/*; I can't remember).
> 
> > How hard would it be to make this the default?
> 
> I strongly suggest looking at how Mesa does it and doing that in
> GStreamer if you can.  It seems to work pretty well in Mesa.

You are right, they added CI_MERGE_REQUEST_SOURCE_BRANCH_NAME in 11.6
(we started our CI a while ago). But there is something even better now;
you can do:

  only:
    refs:
      - merge_requests

Thanks for the hint, I'll suggest that. I've looked at some of the
backend of mesa, and I think it's really nice, though there are a lot of
concepts that won't work in a multi-repo CI. Again, I need to refresh on
what was moved from the enterprise to the community version in this
regard.
> 
> --Jason
> 
> 
> > > That's a much more difficult goal than it looks. Let each project
> > > manage its own CI graph and content, as each case is unique. Running
> > > more tests, or building more code, isn't the main issue, as the CPU
> > > time is mostly sponsored. The data transfers between the gitlab cloud
> > > and the runners (which are external), along with sending OS images to
> > > Lava labs, are likely the most expensive part.
> > > 
> > > As was already mentioned in the thread, what we are missing now, and
> > > what is being worked on, is per-group/project statistics that give us
> > > the hotspots so we can better target the optimization work.
> > 
> > Yes, would be nice to know what the hotspot is, indeed.
> > 
> > As far as I understand, the problem is not CI itself, but the bandwidth
> > needed by the build artifacts, right? Would it be possible to not host
> > the build artifacts on the gitlab, but rather only at the place where the
> > build actually happened? Or at least, only transfer the build artifacts
> > on-demand?
> > 
> > I'm not exactly familiar with how the system works, so sorry if this is
> > a silly question.
> > 


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-02 Thread Timur Kristóf
On Fri, 2020-02-28 at 10:43 +, Daniel Stone wrote:
> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
>  wrote:
> > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > Yeah, changes on vulkan drivers or backend compilers should be
> > > fairly
> > > sandboxed.
> > > 
> > > We also have tools that only work for intel stuff, that should
> > > never
> > > trigger anything on other people's HW.
> > > 
> > > Could something be worked out using the tags?
> > 
> > I think so! We have the pre-defined environment variable
> > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> > 
> > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> > 
> > That sounds like a pretty neat middle-ground to me. I just hope
> > that
> > new pipelines are triggered if new labels are added, because not
> > everyone is allowed to set labels, and sometimes people forget...
> 
> There's also this which is somewhat more robust:
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
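
For reference, the label-based condition Erik describes would look
roughly like this; the job name and label here are made up:

  anv-test:
    only:
      variables:
        - $CI_MERGE_REQUEST_LABELS =~ /intel/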

My 20 cents:

1. I think we should completely disable running the CI on MRs which are
marked WIP. Speaking from personal experience, I usually make a lot of
changes to my MRs before they are merged, so it is a waste of CI
resources.

2. Maybe we could take this one step further and only allow the CI to
be triggered manually instead of automatically on every push.

3. I completely agree with Pierre-Eric on MR 2569: let's not run the
full CI pipeline on every change, only those parts which are affected
by the change. It not only costs money, but is also frustrating when
you submit a change and get unrelated failures from a completely
unrelated driver.

Best regards,
Timur



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-02 Thread Nicolas Dufresne
On Sunday, 1 March 2020 at 15:14 +0100, Michel Dänzer wrote:
> On 2020-02-29 8:46 p.m., Nicolas Dufresne wrote:
> > On Saturday, 29 February 2020 at 19:14 +0100, Timur Kristóf wrote:
> > > 1. I think we should completely disable running the CI on MRs which are
> > > marked WIP. Speaking from personal experience, I usually make a lot of
> > > changes to my MRs before they are merged, so it is a waste of CI
> > > resources.
> 
> Interesting idea, do you want to create an MR implementing it?
> 
> 
> > In the mean time, you can help by taking the habit to use:
> > 
> >   git push -o ci.skip
> 
> That breaks Marge Bot.
> 
> 
> > Notably, we would like to get rid of the post merge CI, as in a rebase
> > flow like we have in GStreamer, it's a really minor risk.
> 
> That should be pretty easy, see Mesa and
> https://docs.gitlab.com/ce/ci/variables/predefined_variables.html.
> Something like this should work:
> 
>   rules:
>     - if: '$CI_PROJECT_NAMESPACE != "gstreamer"'
>       when: never
> 
> This is another interesting idea we could consider for Mesa as well. It
> would however require (mostly) banning direct pushes to the main repository.

We already have this policy in the GStreamer group. We rely on
maintainers to make the right call though, as we have a few cases in
multi-repo usage where pushing manually is the only way to reduce the
breakage time (e.g. when we undo a new API in a development branch).
(We have implemented support so that CI is run across user repositories
with the same branch name, which allows doing CI with all the changes,
but the merge remains non-atomic.)

> 
> 
> > > 2. Maybe we could take this one step further and only allow the CI to
> > > be triggered manually instead of automatically on every push.
> 
> That would again break Marge Bot.

Marge is just software; we can update it to trigger CI on rebases, or
when CI hasn't been run. There was a proposal to actually do that and
let marge trigger CI on merges from maintainers. Though, from my point
of view, having a longer delay between submission and the author
becoming aware of CI breakage has some side effects. Authors are often
less available a week later, when someone reviews and tries to merge,
which makes merging patches take a lot longer.



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Jacob Lifshay
One idea for Marge-bot (don't know if you already do this):
Rust-lang has their bot (bors) automatically group a few merge requests
into a single merge commit, which it then tests; if the tests pass, it
merges. This could help reduce CI runs to once a day (or some other
rate). If the tests fail, it can automatically deduce which MR failed,
by recursive subdivision or similar. There's also a mechanism to adjust
priority and grouping behavior when the defaults aren't sufficient.

Jacob


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Michel Dänzer
On 2020-02-29 8:46 p.m., Nicolas Dufresne wrote:
> On Saturday, 29 February 2020 at 19:14 +0100, Timur Kristóf wrote:
>>
>> 1. I think we should completely disable running the CI on MRs which are
>> marked WIP. Speaking from personal experience, I usually make a lot of
>> changes to my MRs before they are merged, so it is a waste of CI
>> resources.

Interesting idea, do you want to create an MR implementing it?


> In the mean time, you can help by taking the habit to use:
> 
>   git push -o ci.skip

That breaks Marge Bot.


> Notably, we would like to get rid of the post merge CI, as in a rebase
> flow like we have in GStreamer, it's a really minor risk.

That should be pretty easy, see Mesa and
https://docs.gitlab.com/ce/ci/variables/predefined_variables.html.
Something like this should work:

  rules:
    - if: '$CI_PROJECT_NAMESPACE != "gstreamer"'
      when: never

This is another interesting idea we could consider for Mesa as well. It
would however require (mostly) banning direct pushes to the main repository.


>> 2. Maybe we could take this one step further and only allow the CI to
>> be triggered manually instead of automatically on every push.

That would again break Marge Bot.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-29 Thread Marek Olšák
For Mesa, we could run CI only when Marge pushes, so that it's a strictly
pre-merge CI.

Marek

On Sat., Feb. 29, 2020, 17:20 Nicolas Dufresne,  wrote:

> On Saturday, 29 February 2020 at 15:54 -0600, Jason Ekstrand wrote:
> > On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf  wrote:
> > > On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > > > > 1. I think we should completely disable running the CI on MRs which are
> > > > > marked WIP. Speaking from personal experience, I usually make a lot of
> > > > > changes to my MRs before they are merged, so it is a waste of CI
> > > > > resources.
> > > >
> > > > In the mean time, you can help by taking the habit to use:
> > > >
> > > >   git push -o ci.skip
> > >
> > > Thanks for the advice, I wasn't aware such an option exists. Does this
> > > also work on the mesa gitlab or is this a GStreamer only thing?
> >
> > Mesa is already set up so that it only runs on MRs and branches named
> > ci-* (or maybe it's ci/*; I can't remember).
> >
> > > How hard would it be to make this the default?
> >
> > I strongly suggest looking at how Mesa does it and doing that in
> > GStreamer if you can.  It seems to work pretty well in Mesa.
>
> You are right, they added CI_MERGE_REQUEST_SOURCE_BRANCH_NAME in 11.6
> (we started our CI a while ago). But there is something even better now;
> you can do:
>
>   only:
>     refs:
>       - merge_requests
>
> Thanks for the hint, I'll suggest that. I've looked at some of the
> backend of mesa, and I think it's really nice, though there are a lot of
> concepts that won't work in a multi-repo CI. Again, I need to refresh on
> what was moved from the enterprise to the community version in this
> regard.
>
> >
> > --Jason
> >
> >
> > > > That's a much more difficult goal than it looks. Let each project
> > > > manage its own CI graph and content, as each case is unique. Running
> > > > more tests, or building more code, isn't the main issue, as the CPU
> > > > time is mostly sponsored. The data transfers between the gitlab cloud
> > > > and the runners (which are external), along with sending OS images to
> > > > Lava labs, are likely the most expensive part.
> > > >
> > > > As was already mentioned in the thread, what we are missing now, and
> > > > what is being worked on, is per-group/project statistics that give us
> > > > the hotspots so we can better target the optimization work.
> > >
> > > Yes, would be nice to know what the hotspot is, indeed.
> > >
> > > As far as I understand, the problem is not CI itself, but the bandwidth
> > > needed by the build artifacts, right? Would it be possible to not host
> > > the build artifacts on the gitlab, but rather only at the place where the
> > > build actually happened? Or at least, only transfer the build artifacts
> > > on-demand?
> > >
> > > I'm not exactly familiar with how the system works, so sorry if this is
> > > a silly question.
> > >


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-29 Thread Jason Ekstrand
On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf  wrote:
>
> On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > >
> > > 1. I think we should completely disable running the CI on MRs which are
> > > marked WIP. Speaking from personal experience, I usually make a lot of
> > > changes to my MRs before they are merged, so it is a waste of CI
> > > resources.
> >
> > In the mean time, you can help by taking the habit to use:
> >
> >   git push -o ci.skip
>
> Thanks for the advice, I wasn't aware such an option exists. Does this
> also work on the mesa gitlab or is this a GStreamer only thing?

Mesa is already set up so that it only runs on MRs and branches named
ci-* (or maybe it's ci/*; I can't remember).

> How hard would it be to make this the default?

I strongly suggest looking at how Mesa does it and doing that in
GStreamer if you can.  It seems to work pretty well in Mesa.

--Jason


> > That's a much more difficult goal than it looks. Let each project
> > manage its own CI graph and content, as each case is unique. Running
> > more tests, or building more code, isn't the main issue, as the CPU
> > time is mostly sponsored. The data transfers between the gitlab cloud
> > and the runners (which are external), along with sending OS images to
> > Lava labs, are likely the most expensive part.
> >
> > As was already mentioned in the thread, what we are missing now, and
> > what is being worked on, is per-group/project statistics that give us
> > the hotspots so we can better target the optimization work.
>
> Yes, would be nice to know what the hotspot is, indeed.
>
> As far as I understand, the problem is not CI itself, but the bandwidth
> needed by the build artifacts, right? Would it be possible to not host
> the build artifacts on the gitlab, but rather only at the place where the
> build actually happened? Or at least, only transfer the build artifacts
> on-demand?
>
> I'm not exactly familiar with how the system works, so sorry if this is
> a silly question.
>


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-29 Thread Nicholas Krause




On 2/28/20 4:22 PM, Daniel Vetter wrote:

On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie  wrote:

On Sat, 29 Feb 2020 at 05:34, Eric Anholt  wrote:

On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:

On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:

On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:

b) we probably need to take a large step back here.

Look at this from a sponsor POV, why would I give X.org/fd.o
sponsorship money that they are just giving straight to google to pay
for hosting credits? Google are profiting in some minor way from these
hosting credits being bought by us, and I assume we aren't getting any
sort of discounts here. Having google sponsor the credits costs google
substantially less than having any other company give us money to do
it.

The last I looked, Google GCP / Amazon AWS / Azure were all pretty
comparable in terms of what you get and what you pay for them.
Obviously providers like Packet and Digital Ocean who offer bare-metal
services are cheaper, but then you need to find someone who is going
to properly administer the various machines, install decent
monitoring, make sure that more storage is provisioned when we need
more storage (which is basically all the time), make sure that the
hardware is maintained in decent shape (pretty sure one of the fd.o
machines has had a drive in imminent-failure state for the last few
months), etc.

Given the size of our service, that's a much better plan (IMO) than
relying on someone who a) isn't an admin by trade, b) has a million
other things to do, and c) hasn't wanted to do it for the past several
years. But as long as that's the resources we have, then we're paying
the cloud tradeoff, where we pay more money in exchange for fewer
problems.

Admin for gitlab and CI is a full time role anyways. The system is
definitely not self sustaining without time being put in by you and
anholt still. If we have $75k to burn on credits, and it was diverted
to just pay an admin to admin the real hw + gitlab/CI would that not
be a better use of the money? I didn't know if we can afford $75k for
an admin, but suddenly we can afford it for gitlab credits?

As I think about the time that I've spent at google in less than a
year on trying to keep the lights on for CI and optimize our
infrastructure in the current cloud environment, that's more than the
entire yearly budget you're talking about here.  Saying "let's just
pay for people to do more work instead of paying for full-service
cloud" is not a cost optimization.



Yes, we could federate everything back out so everyone runs their own
builds and executes those. Tinderbox did something really similar to
that IIRC; not sure if Buildbot does as well. Probably rules out
pre-merge testing, mind.

Why? does gitlab not support the model? having builds done in parallel
on runners closer to the test runners seems like it should be a thing.
I guess artifact transfer would cost less then as a result.

Let's do some napkin math.  The biggest artifacts cost we have in Mesa
is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
makes ~1.8TB/month ($180 or so).  We could build a local storage next
to the lava dispatcher so that the artifacts didn't have to contain
the rootfs that came from the container (~2/3 of the insides of the
zip file), but that's another service to build and maintain.  Building
the drivers once locally and storing it would save downloading the
other ~1/3 of the inside of the zip file, but that requires a big
enough system to do builds in time.

I'm planning on doing a local filestore for google's lava lab, since I
need to be able to move our xml files off of the lava DUTs to get the
xml results we've become accustomed to, but this would not bubble up
to being a priority for my time if I wasn't doing it anyway.  If it
takes me a single day to set all this up (I estimate a couple of
weeks), that costs my employer a lot more than sponsoring the costs of
the inefficiencies of the system that has accumulated.

I'm not trying to knock the engineering works the CI contributors have
done at all, but I've never seen a real discussion about costs until
now. Engineers aren't accountants.

The thing we seem to be missing here is fiscal responsibility. I know
this email is us being fiscally responsible, but it's kinda after the
fact.

I cannot commit my employer to spending a large amount of money (> 0
actually) without a lengthy process with checks and bounds.
Can you?

The X.org board has budgets and procedures as well. I as a developer
of Mesa should not be able to commit the X.org foundation to spending
large amounts of money without checks and bounds.

The CI infrastructure lacks any checks and bounds. There is no link
between editing .gitlab-ci/* and cashflow. Nor is there a link between
me adding support for a new feature to llvmpipe that blows out test
times and the cost of doing so (granted, that won't affect the CI
budget, but it's just an example).


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Nuritzi Sanchez
Hi All,


I know there's been a lot of discussion already, but I wanted to respond to
Daniel's original post.

I joined GitLab earlier this month as their new Open Source Program Manager
[1] and wanted to introduce myself here since I’ll be involved from the
GitLab side as we work together to problem-solve the financial situation
here. My role at GitLab is to help make it easier for Open Source
organizations to migrate (by helping to smooth out some of the current pain
points), and to help advocate internally for changes to the product and our
workflows to make GitLab better for Open Source orgs. We want to make sure
that our Open Source community feels supported beyond just migration. As
such, I’ll be running the GitLab Open Source Program [2].

My background is that I’m the former President and Chairperson of the GNOME
Foundation, which is one of the earliest Free Software projects to migrate
to GitLab. GNOME initially faced some limitations with the CI runner costs
too, but thanks to generous support from donors, it has not experienced
those issues in recent times. I know there's already a working relationship
between our communities, but it could be good to examine what GNOME and KDE
have done and see if there's anything we can apply here. We've reached out
to Daniel Stone, our main contact for the freedesktop.org migration, and he
has gotten us in touch with Daniel V. and the X.Org Foundation Board to
learn more about what's already been done and what we can do next.

Please bear with me as I continue to get ramped up in my new job, but I’d
like to offer as much support as possible with this issue. We’ll be
exploring ways for GitLab to help make sure there isn’t a gap in coverage
during the time that freedesktop looks for sponsors. I know that on
GitLab’s side, supporting our Open Source user community is a priority.

Best,

Nuritzi

[1] https://about.gitlab.com/company/team/#nuritzi
[2]
https://about.gitlab.com/handbook/marketing/community-relations/opensource-program/


On Fri, Feb 28, 2020 at 1:22 PM Daniel Vetter 
wrote:

> On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie  wrote:
> >
> > On Sat, 29 Feb 2020 at 05:34, Eric Anholt  wrote:
> > >
> > > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie 
> wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone 
> wrote:
> > > > >
> > > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie 
> wrote:
> > > > > > b) we probably need to take a large step back here.
> > > > > >
> > > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > > sponsorship money that they are just giving straight to google
> to pay
> > > > > > for hosting credits? Google are profiting in some minor way from
> these
> > > > > > hosting credits being bought by us, and I assume we aren't
> getting any
> > > > > > sort of discounts here. Having google sponsor the credits costs
> google
> > > > > > substantially less than having any other company give us money
> to do
> > > > > > it.
> > > > >
> > > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > > comparable in terms of what you get and what you pay for them.
> > > > > Obviously providers like Packet and Digital Ocean who offer
> bare-metal
> > > > > services are cheaper, but then you need to find someone who is
> going
> > > > > to properly administer the various machines, install decent
> > > > > monitoring, make sure that more storage is provisioned when we need
> > > > > more storage (which is basically all the time), make sure that the
> > > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > > machines has had a drive in imminent-failure state for the last few
> > > > > months), etc.
> > > > >
> > > > > Given the size of our service, that's a much better plan (IMO) than
> > > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > > other things to do, and c) hasn't wanted to do it for the past
> several
> > > > > years. But as long as that's the resources we have, then we're
> paying
> > > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > > problems.
> > > >
> > > > Admin for gitlab and CI is a full time role anyways. The system is
> > > > definitely not self sustaining without time being put in by you and
> > > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > > to just pay an admin to admin the real hw + gitlab/CI would that not
> > > > be a better use of the money? I didn't know if we can afford $75k for
> > > > an admin, but suddenly we can afford it for gitlab credits?
> > >
> > > As I think about the time that I've spent at google in less than a
> > > year on trying to keep the lights on for CI and optimize our
> > > infrastructure in the current cloud environment, that's more than the
> > > entire yearly budget you're talking about here.  Saying "let's just
> > > pay for people to do more work instead of paying for full-service
> > > cloud" is not a cost 

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Vetter
On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie  wrote:
>
> On Sat, 29 Feb 2020 at 05:34, Eric Anholt  wrote:
> >
> > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
> > >
> > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > > > b) we probably need to take a large step back here.
> > > > >
> > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > sponsorship money that they are just giving straight to google to pay
> > > > > for hosting credits? Google are profiting in some minor way from these
> > > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > > sort of discounts here. Having google sponsor the credits costs google
> > > > > substantially less than having any other company give us money to do
> > > > > it.
> > > >
> > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > comparable in terms of what you get and what you pay for them.
> > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > services are cheaper, but then you need to find someone who is going
> > > > to properly administer the various machines, install decent
> > > > monitoring, make sure that more storage is provisioned when we need
> > > > more storage (which is basically all the time), make sure that the
> > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > machines has had a drive in imminent-failure state for the last few
> > > > months), etc.
> > > >
> > > > Given the size of our service, that's a much better plan (IMO) than
> > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > years. But as long as that's the resources we have, then we're paying
> > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > problems.
> > >
> > > Admin for gitlab and CI is a full time role anyways. The system is
> > > definitely not self sustaining without time being put in by you and
> > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > to just pay an admin to admin the real hw + gitlab/CI would that not
> > > be a better use of the money? I didn't know if we can afford $75k for
> > > an admin, but suddenly we can afford it for gitlab credits?
> >
> > As I think about the time that I've spent at google in less than a
> > year on trying to keep the lights on for CI and optimize our
> > infrastructure in the current cloud environment, that's more than the
> > entire yearly budget you're talking about here.  Saying "let's just
> > pay for people to do more work instead of paying for full-service
> > cloud" is not a cost optimization.
> >
> >
> > > > Yes, we could federate everything back out so everyone runs their own
> > > > builds and executes those. Tinderbox did something really similar to
> > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > pre-merge testing, mind.
> > >
> > > Why? does gitlab not support the model? having builds done in parallel
> > > on runners closer to the test runners seems like it should be a thing.
> > > I guess artifact transfer would cost less then as a result.
> >
> > Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> > makes ~1.8TB/month ($180 or so).  We could build a local storage next
> > to the lava dispatcher so that the artifacts didn't have to contain
> > the rootfs that came from the container (~2/3 of the insides of the
> > zip file), but that's another service to build and maintain.  Building
> > the drivers once locally and storing it would save downloading the
> > other ~1/3 of the inside of the zip file, but that requires a big
> > enough system to do builds in time.
> >
> > I'm planning on doing a local filestore for google's lava lab, since I
> > need to be able to move our xml files off of the lava DUTs to get the
> > xml results we've become accustomed to, but this would not bubble up
> > to being a priority for my time if I wasn't doing it anyway.  If it
> > takes me a single day to set all this up (I estimate a couple of
> > weeks), that costs my employer a lot more than sponsoring the costs of
> > the inefficiencies of the system that has accumulated.
>
> I'm not trying to knock the engineering works the CI contributors have
> done at all, but I've never seen a real discussion about costs until
> now. Engineers aren't accountants.
>
> The thing we seem to be missing here is fiscal responsibility. I know
> this email is us being fiscally responsible, but it's kinda after the
> fact.
>
> I cannot commit my employer to spending a large amount of money (> 0
> actually) without a long and lengthy process with checks and bounds.
> Can you?
>
> The X.org board has budgets and procedures as well. I as a developer
> of Mesa should not be able to commit the X.org foundation to spending
> large amounts of money without checks and bounds.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Dave Airlie
On Sat, 29 Feb 2020 at 05:34, Eric Anholt  wrote:
>
> On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
> >
> > On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > >
> > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > > b) we probably need to take a large step back here.
> > > >
> > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > sponsorship money that they are just giving straight to google to pay
> > > > for hosting credits? Google are profiting in some minor way from these
> > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > sort of discounts here. Having google sponsor the credits costs google
> > > > substantially less than having any other company give us money to do
> > > > it.
> > >
> > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > comparable in terms of what you get and what you pay for them.
> > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > services are cheaper, but then you need to find someone who is going
> > > to properly administer the various machines, install decent
> > > monitoring, make sure that more storage is provisioned when we need
> > > more storage (which is basically all the time), make sure that the
> > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > machines has had a drive in imminent-failure state for the last few
> > > months), etc.
> > >
> > > Given the size of our service, that's a much better plan (IMO) than
> > > relying on someone who a) isn't an admin by trade, b) has a million
> > > other things to do, and c) hasn't wanted to do it for the past several
> > > years. But as long as that's the resources we have, then we're paying
> > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > problems.
> >
> > Admin for gitlab and CI is a full time role anyways. The system is
> > definitely not self sustaining without time being put in by you and
> > anholt still. If we have $75k to burn on credits, and it was diverted
> > to just pay an admin to admin the real hw + gitlab/CI would that not
> > be a better use of the money? I didn't know if we can afford $75k for
> > an admin, but suddenly we can afford it for gitlab credits?
>
> As I think about the time that I've spent at google in less than a
> year on trying to keep the lights on for CI and optimize our
> infrastructure in the current cloud environment, that's more than the
> entire yearly budget you're talking about here.  Saying "let's just
> pay for people to do more work instead of paying for full-service
> cloud" is not a cost optimization.
>
>
> > > Yes, we could federate everything back out so everyone runs their own
> > > builds and executes those. Tinderbox did something really similar to
> > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > pre-merge testing, mind.
> >
> > Why? does gitlab not support the model? having builds done in parallel
> > on runners closer to the test runners seems like it should be a thing.
> > I guess artifact transfer would cost less then as a result.
>
> Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> makes ~1.8TB/month ($180 or so).  We could build a local storage next
> to the lava dispatcher so that the artifacts didn't have to contain
> the rootfs that came from the container (~2/3 of the insides of the
> zip file), but that's another service to build and maintain.  Building
> the drivers once locally and storing it would save downloading the
> other ~1/3 of the inside of the zip file, but that requires a big
> enough system to do builds in time.
>
> I'm planning on doing a local filestore for google's lava lab, since I
> need to be able to move our xml files off of the lava DUTs to get the
> xml results we've become accustomed to, but this would not bubble up
> to being a priority for my time if I wasn't doing it anyway.  If it
> takes me a single day to set all this up (I estimate a couple of
> weeks), that costs my employer a lot more than sponsoring the costs of
> the inefficiencies of the system that has accumulated.

I'm not trying to knock the engineering work the CI contributors have
done at all, but I've never seen a real discussion about costs until
now. Engineers aren't accountants.

The thing we seem to be missing here is fiscal responsibility. I know
this email is us being fiscally responsible, but it's kinda after the
fact.

I cannot commit my employer to spending a large amount of money (> 0
actually) without a lengthy process with checks and bounds.
Can you?

The X.org board has budgets and procedures as well. I as a developer
of Mesa should not be able to commit the X.org foundation to spending
large amounts of money without checks and bounds.

The CI infrastructure lacks any checks and bounds.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Eric Anholt
On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
>
> On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> >
> > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > b) we probably need to take a large step back here.
> > >
> > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > sponsorship money that they are just giving straight to google to pay
> > > for hosting credits? Google are profiting in some minor way from these
> > > hosting credits being bought by us, and I assume we aren't getting any
> > > sort of discounts here. Having google sponsor the credits costs google
> > > substantially less than having any other company give us money to do
> > > it.
> >
> > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > comparable in terms of what you get and what you pay for them.
> > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > services are cheaper, but then you need to find someone who is going
> > to properly administer the various machines, install decent
> > monitoring, make sure that more storage is provisioned when we need
> > more storage (which is basically all the time), make sure that the
> > hardware is maintained in decent shape (pretty sure one of the fd.o
> > machines has had a drive in imminent-failure state for the last few
> > months), etc.
> >
> > Given the size of our service, that's a much better plan (IMO) than
> > relying on someone who a) isn't an admin by trade, b) has a million
> > other things to do, and c) hasn't wanted to do it for the past several
> > years. But as long as that's the resources we have, then we're paying
> > the cloud tradeoff, where we pay more money in exchange for fewer
> > problems.
>
> Admin for gitlab and CI is a full time role anyways. The system is
> definitely not self sustaining without time being put in by you and
> anholt still. If we have $75k to burn on credits, and it was diverted
> to just pay an admin to admin the real hw + gitlab/CI would that not
> be a better use of the money? I didn't know if we can afford $75k for
> an admin, but suddenly we can afford it for gitlab credits?

As I think about the time that I've spent at google in less than a
year on trying to keep the lights on for CI and optimize our
infrastructure in the current cloud environment, that's more than the
entire yearly budget you're talking about here.  Saying "let's just
pay for people to do more work instead of paying for full-service
cloud" is not a cost optimization.


> > Yes, we could federate everything back out so everyone runs their own
> > builds and executes those. Tinderbox did something really similar to
> > that IIRC; not sure if Buildbot does as well. Probably rules out
> > pre-merge testing, mind.
>
> Why? does gitlab not support the model? having builds done in parallel
> on runners closer to the test runners seems like it should be a thing.
> I guess artifact transfer would cost less then as a result.

Let's do some napkin math.  The biggest artifacts cost we have in Mesa
is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day),
which makes ~1.8TB/month ($180 or so).  We could build a local storage next
to the lava dispatcher so that the artifacts didn't have to contain
the rootfs that came from the container (~2/3 of the insides of the
zip file), but that's another service to build and maintain.  Building
the drivers once locally and storing them would save downloading the
other ~1/3 of the inside of the zip file, but that requires a big
enough system to do builds in time.
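
(Spelling that arithmetic out as a sanity check, assuming ~30 days per
month and ballpark $0.10/GB egress pricing, which is roughly what the
big cloud providers charge:

  60 MB/zip x 10 downloads x 100 pipelines/day = ~60 GB/day
  60 GB/day x 30 days = ~1.8 TB/month
  1,800 GB x $0.10/GB = ~$180/month)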

I'm planning on doing a local filestore for google's lava lab, since I
need to be able to move our xml files off of the lava DUTs to get the
xml results we've become accustomed to, but this would not bubble up
to being a priority for my time if I wasn't doing it anyway.  Even if
setting all this up takes me only a single day (I estimate a couple of
weeks), that costs my employer a lot more than sponsoring the costs of
the inefficiencies of the system that has accumulated.


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Erik Faye-Lund
On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> On 28/02/2020 11:28, Erik Faye-Lund wrote:
> > On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> > > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter <
> > > daniel.vet...@ffwll.ch>
> > > wrote:
> > > > Hi all,
> > > > 
> > > > You might have read the short take in the X.org board meeting
> > > > minutes
> > > > already, here's the long version.
> > > > 
> > > > The good news: gitlab.fd.o has become very popular with our
> > > > communities, and is used extensively. This especially includes
> > > > all
> > > > the
> > > > CI integration. Modern development process and tooling, yay!
> > > > 
> > > > The bad news: The cost in growth has also been tremendous, and
> > > > it's
> > > > breaking our bank account. With reasonable estimates for
> > > > continued
> > > > growth we're expecting hosting expenses totalling 75k USD this
> > > > year,
> > > > and 90k USD next year. With the current sponsors we've set up
> > > > we
> > > > can't
> > > > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > > > without any of the CI features enabled would total 30k USD,
> > > > which
> > > > is
> > > > within X.org's ability to support through various sponsorships,
> > > > mostly
> > > > through XDC.
> > > > 
> > > > Note that X.org does no longer sponsor any CI runners
> > > > themselves,
> > > > we've stopped that. The huge additional expenses are all just
> > > > in
> > > > storing and serving build artifacts and images to outside CI
> > > > runners
> > > > sponsored by various companies. A related topic is that with
> > > > the
> > > > growth in fd.o it's becoming infeasible to maintain it all on
> > > > volunteer admin time. X.org is therefore also looking for admin
> > > > sponsorship, at least medium term.
> > > > 
> > > > Assuming that we want cash flow reserves for one year of
> > > > gitlab.fd.o
> > > > (without CI support) and a trimmed XDC and assuming no sponsor
> > > > payment
> > > > meanwhile, we'd have to cut CI services somewhere between May
> > > > and
> > > > June
> > > > this year. The board is of course working on acquiring
> > > > sponsors,
> > > > but
> > > > filling a shortfall of this magnitude is neither easy nor quick
> > > > work,
> > > > and we therefore decided to give an early warning as soon as
> > > > possible.
> > > > Any help in finding sponsors for fd.o is very much appreciated.
> > > a) Ouch.
> > > 
> > > b) we probably need to take a large step back here.
> > > 
> > I kinda agree, but maybe the step doesn't have to be *too* large?
> > 
> > I wonder if we could solve this by restructuring the project a bit.
> > I'm
> > talking purely from a Mesa point of view here, so it might not
> > solve
> > the full problem, but:
> > 
> > 1. It feels silly that we need to test changes to e.g the i965
> > driver
> > on dragonboards. We only have a big "do not run CI at all" escape-
> > hatch.
> 
> Yeah, changes on vulkan drivers or backend compilers should be
> fairly 
> sandboxed.
> 
> We also have tools that only work for intel stuff, that should never 
> trigger anything on other people's HW.
> 
> Could something be worked out using the tags?
> 

I think so! We have the pre-defined environment variable
CI_MERGE_REQUEST_LABELS, and we can do variable conditions:

https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables

That sounds like a pretty neat middle-ground to me. I just hope that
new pipelines are triggered if new labels are added, because not
everyone is allowed to set labels, and sometimes people forget...
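
For illustration, a minimal sketch of how that could look in
.gitlab-ci.yml (the job name, the .test-job template and the
"freedreno" label are invented here, and this assumes merge request
pipelines, since CI_MERGE_REQUEST_LABELS is only set for those):

  # Run this hardware job only when the MR carries a "freedreno" label.
  freedreno-a630-test:
    extends: .test-job
    only:
      variables:
        - $CI_MERGE_REQUEST_LABELS =~ /freedreno/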



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Erik Faye-Lund
On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:
> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
>  wrote:
> > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > Yeah, changes on vulkan drivers or backend compilers should be
> > > fairly
> > > sandboxed.
> > > 
> > > We also have tools that only work for intel stuff, that should
> > > never
> > > trigger anything on other people's HW.
> > > 
> > > Could something be worked out using the tags?
> > 
> > I think so! We have the pre-defined environment variable
> > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> > 
> > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> > 
> > That sounds like a pretty neat middle-ground to me. I just hope
> > that
> > new pipelines are triggered if new labels are added, because not
> > everyone is allowed to set labels, and sometimes people forget...
> 
> There's also this which is somewhat more robust:
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
> 
> 

I'm not sure it's more robust, but yeah, that's a useful tool too.

The reason I'm skeptical about the robustness is that we'll miss
testing if this misses a path. That can of course be fixed by testing
everything once things are in master, and fixing up that list when
something breaks on master.

The person who wrote a change knows more about the intricacies of the
changes than a computer ever will. But humans are also good at
making mistakes, so I'm not sure which one is better. Maybe the union
of both?

As long as we have both rigorous testing after something landed in
master (it doesn't necessarily need to happen right after, but for now
that's probably fine), as well as a reasonable heuristic for what
testing is needed pre-merge, I think we're good.



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Erik Faye-Lund
On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> On Fri, 28 Feb 2020 at 07:27, Daniel Vetter 
> wrote:
> > Hi all,
> > 
> > You might have read the short take in the X.org board meeting
> > minutes
> > already, here's the long version.
> > 
> > The good news: gitlab.fd.o has become very popular with our
> > communities, and is used extensively. This especially includes all
> > the
> > CI integration. Modern development process and tooling, yay!
> > 
> > The bad news: The cost in growth has also been tremendous, and it's
> > breaking our bank account. With reasonable estimates for continued
> > growth we're expecting hosting expenses totalling 75k USD this
> > year,
> > and 90k USD next year. With the current sponsors we've set up we
> > can't
> > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > without any of the CI features enabled would total 30k USD, which
> > is
> > within X.org's ability to support through various sponsorships,
> > mostly
> > through XDC.
> > 
> > Note that X.org does no longer sponsor any CI runners themselves,
> > we've stopped that. The huge additional expenses are all just in
> > storing and serving build artifacts and images to outside CI
> > runners
> > sponsored by various companies. A related topic is that with the
> > growth in fd.o it's becoming infeasible to maintain it all on
> > volunteer admin time. X.org is therefore also looking for admin
> > sponsorship, at least medium term.
> > 
> > Assuming that we want cash flow reserves for one year of
> > gitlab.fd.o
> > (without CI support) and a trimmed XDC and assuming no sponsor
> > payment
> > meanwhile, we'd have to cut CI services somewhere between May and
> > June
> > this year. The board is of course working on acquiring sponsors,
> > but
> > filling a shortfall of this magnitude is neither easy nor quick
> > work,
> > and we therefore decided to give an early warning as soon as
> > possible.
> > Any help in finding sponsors for fd.o is very much appreciated.
> 
> a) Ouch.
> 
> b) we probably need to take a large step back here.
> 

I kinda agree, but maybe the step doesn't have to be *too* large?

I wonder if we could solve this by restructuring the project a bit. I'm
talking purely from a Mesa point of view here, so it might not solve
the full problem, but:

1. It feels silly that we need to test changes to e.g. the i965 driver
on dragonboards. We only have a big "do not run CI at all" escape-
hatch.

2. A lot of us are working for companies that can probably pay for
their own needs in terms of CI. Perhaps moving some costs "up front" to
the company that needs it can help secure the future of CI for those
who can't do this.

3. I think we need a much more detailed break-down of the cost to make
educated changes. For instance, how expensive are Docker image
uploads/downloads (e.g. intermediary artifacts) compared to build logs
and final test-results? What kind of artifacts?

One suggestion would be to do something more similar to what the kernel
does, and separate into different repos for different subsystems. This
could allow us to have separate testing-pipelines for these repos,
which would mean that for instance a change to RADV didn't trigger a
full Panfrost test-run.

This would probably require us to accept using a more branch-heavy
work-flow. I don't personally think that would be a bad thing.

But this is all kinda based on an assumption that running hardware-
testing is the expensive part. I think that's quite possibly the case,
but some more numbers would be helpful. I mean, it might turn out that
just throwing up a Docker cache inside the organizations that host
runners might be sufficient for all I know...

We could also do stuff like reducing the amount of tests we run on each
commit, and punt some testing to a per-weekend test-run or something
like that. We don't *need* to know about every problem up front, just
the stuff that's about to be released, really. The other stuff is just
nice to have. If it's too expensive, I would say drop it.
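
In GitLab CI terms that split could look something like this sketch
(the job name and .test-job template are invented; "schedules" means
pipelines started from a pipeline schedule, e.g. a weekly one):

  # Run the expensive full matrix only from scheduled pipelines.
  full-hw-matrix:
    extends: .test-job
    only:
      - schedules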

I would really hope that we can consider approaches like this before we
throw out the baby with the bathwater...



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Erik Faye-Lund
On Fri, 2020-02-28 at 10:47 +0100, Daniel Vetter wrote:
> On Fri, Feb 28, 2020 at 10:29 AM Erik Faye-Lund
>  wrote:
> > On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> > > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter <
> > > daniel.vet...@ffwll.ch>
> > > wrote:
> > > > Hi all,
> > > > 
> > > > You might have read the short take in the X.org board meeting
> > > > minutes
> > > > already, here's the long version.
> > > > 
> > > > The good news: gitlab.fd.o has become very popular with our
> > > > communities, and is used extensively. This especially includes
> > > > all
> > > > the
> > > > CI integration. Modern development process and tooling, yay!
> > > > 
> > > > The bad news: The cost in growth has also been tremendous, and
> > > > it's
> > > > breaking our bank account. With reasonable estimates for
> > > > continued
> > > > growth we're expecting hosting expenses totalling 75k USD this
> > > > year,
> > > > and 90k USD next year. With the current sponsors we've set up
> > > > we
> > > > can't
> > > > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > > > without any of the CI features enabled would total 30k USD,
> > > > which
> > > > is
> > > > within X.org's ability to support through various sponsorships,
> > > > mostly
> > > > through XDC.
> > > > 
> > > > Note that X.org does no longer sponsor any CI runners
> > > > themselves,
> > > > we've stopped that. The huge additional expenses are all just
> > > > in
> > > > storing and serving build artifacts and images to outside CI
> > > > runners
> > > > sponsored by various companies. A related topic is that with
> > > > the
> > > > growth in fd.o it's becoming infeasible to maintain it all on
> > > > volunteer admin time. X.org is therefore also looking for admin
> > > > sponsorship, at least medium term.
> > > > 
> > > > Assuming that we want cash flow reserves for one year of
> > > > gitlab.fd.o
> > > > (without CI support) and a trimmed XDC and assuming no sponsor
> > > > payment
> > > > meanwhile, we'd have to cut CI services somewhere between May
> > > > and
> > > > June
> > > > this year. The board is of course working on acquiring
> > > > sponsors,
> > > > but
> > > > filling a shortfall of this magnitude is neither easy nor quick
> > > > work,
> > > > and we therefore decided to give an early warning as soon as
> > > > possible.
> > > > Any help in finding sponsors for fd.o is very much appreciated.
> > > 
> > > a) Ouch.
> > > 
> > > b) we probably need to take a large step back here.
> > > 
> > 
> > I kinda agree, but maybe the step doesn't have to be *too* large?
> > 
> > I wonder if we could solve this by restructuring the project a bit.
> > I'm
> > talking purely from a Mesa point of view here, so it might not
> > solve
> > the full problem, but:
> > 
> > 1. It feels silly that we need to test changes to e.g the i965
> > driver
> > on dragonboards. We only have a big "do not run CI at all" escape-
> > hatch.
> > 
> > 2. A lot of us are working for a company that can probably pay for
> > their own needs in terms of CI. Perhaps moving some costs "up
> > front" to
> > the company that needs it can make the future of CI for those who
> > can't
> > do this
> > 
> > 3. I think we need a much more detailed break-down of the cost to
> > make
> > educated changes. For instance, how expensive is Docker image
> > uploads/downloads (e.g intermediary artifacts) compared to build
> > logs
> > and final test-results? What kind of artifacts?
> 
> We have logs somewhere, but no one yet got around to analyzing that.
> Which will be quite a bit of work to do since the cloud storage is
> totally disconnected from the gitlab front-end, making the connection
> to which project or CI job caused something is going to require
> scripting. Volunteers definitely very much welcome I think.
> 

Fair enough, but just keep in mind that the same principle as in
optimizing software applies here: working blindly reduces the impact.
So if we want to fix the current situation, this is going to have to be
a priority, I think.

> > One suggestion would be to do something more similar to what the
> > kernel
> > does, and separate into different repos for different subsystems.
> > This
> > could allow us to have separate testing-pipelines for these repos,
> > which would mean that for instance a change to RADV didn't trigger
> > a
> > full Panfrost test-run.
> 
> Uh as someone who lives the kernel multi-tree model daily, there's a
> _lot_ of pain involved.

Could you please elaborate a bit? We're not the kernel, so I'm not sure
all of the kernel-pains apply to us. But we probably have other pains
as well ;-)

But yeah, it might be better to take smaller steps first, and see if
that helps.

> I think much better to look at filtering out
> CI targets for when nothing relevant happened. But that gets somewhat
> tricky, since "nothing relevant" is always only relative to some
> baseline, so bit of scripting and all involved to make sure you don't
> run stuff too often or (probably worse) not often enough.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Lionel Landwerlin

On 28/02/2020 13:46, Michel Dänzer wrote:

> On 2020-02-28 12:02 p.m., Erik Faye-Lund wrote:
>> On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:
>>> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
>>>  wrote:
>>>> On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
>>>>> Yeah, changes on vulkan drivers or backend compilers should be
>>>>> fairly
>>>>> sandboxed.
>>>>>
>>>>> We also have tools that only work for intel stuff, that should
>>>>> never
>>>>> trigger anything on other people's HW.
>>>>>
>>>>> Could something be worked out using the tags?
>>>>
>>>> I think so! We have the pre-defined environment variable
>>>> CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
>>>>
>>>> https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
>>>>
>>>> That sounds like a pretty neat middle-ground to me. I just hope
>>>> that
>>>> new pipelines are triggered if new labels are added, because not
>>>> everyone is allowed to set labels, and sometimes people forget...
>>>
>>> There's also this which is somewhat more robust:
>>> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
>>
>> I'm not sure it's more robust, but yeah that a useful tool too.
>>
>> The reason I'm skeptical about the robustness is that we'll miss
>> testing if this misses a path.
>
> Surely missing a path will be less likely / often to happen compared to
> an MR missing a label. (Users which aren't members of the project can't
> even set labels for an MR)



Sounds like a good alternative to tags.


-Lionel



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Michel Dänzer
On 2020-02-28 12:02 p.m., Erik Faye-Lund wrote:
> On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:
>> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
>>  wrote:
>>> On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
>>>> Yeah, changes on vulkan drivers or backend compilers should be
>>>> fairly
>>>> sandboxed.
>>>>
>>>> We also have tools that only work for intel stuff, that should
>>>> never
>>>> trigger anything on other people's HW.
>>>>
>>>> Could something be worked out using the tags?
>>>
>>> I think so! We have the pre-defined environment variable
>>> CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
>>>
>>> https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
>>>
>>> That sounds like a pretty neat middle-ground to me. I just hope
>>> that
>>> new pipelines are triggered if new labels are added, because not
>>> everyone is allowed to set labels, and sometimes people forget...
>>
>> There's also this which is somewhat more robust:
>> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
> 
> I'm not sure it's more robust, but yeah that a useful tool too.
> 
> The reason I'm skeptical about the robustness is that we'll miss
> testing if this misses a path.

Surely missing a path will be less likely / often to happen compared to
an MR missing a label. (Users which aren't members of the project can't
even set labels for an MR)


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Michel Dänzer
On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
> 
> We could also do stuff like reducing the amount of tests we run on each
> commit, and punt some testing to a per-weekend test-run or someting
> like that. We don't *need* to know about every problem up front, just
> the stuff that's about to be released, really. The other stuff is just
> nice to have. If it's too expensive, I would say drop it.

I don't agree that pre-merge testing is just nice to have. A problem
which is only caught after it lands in mainline has a much bigger impact
than one which is already caught earlier.


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Stone
On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
 wrote:
> On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > Yeah, changes on vulkan drivers or backend compilers should be
> > fairly
> > sandboxed.
> >
> > We also have tools that only work for intel stuff, that should never
> > trigger anything on other people's HW.
> >
> > Could something be worked out using the tags?
>
> I think so! We have the pre-defined environment variable
> CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
>
> https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
>
> That sounds like a pretty neat middle-ground to me. I just hope that
> new pipelines are triggered if new labels are added, because not
> everyone is allowed to set labels, and sometimes people forget...

There's also this which is somewhat more robust:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
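
As a rough sketch of the general shape of that path-based filtering
(the job name, .test-job template and paths here are illustrative, not
Mesa's actual layout):

  # Run panfrost tests only when panfrost or shared code changed.
  panfrost-test:
    extends: .test-job
    only:
      changes:
        - src/gallium/drivers/panfrost/**/*
        - src/compiler/**/*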

Cheers,
Daniel


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Lucas Stach
On Fr, 2020-02-28 at 10:47 +0100, Daniel Vetter wrote:
> On Fri, Feb 28, 2020 at 10:29 AM Erik Faye-Lund
>  wrote:
> > On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> > > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter 
> > > wrote:
> > > > Hi all,
> > > > 
> > > > You might have read the short take in the X.org board meeting
> > > > minutes
> > > > already, here's the long version.
> > > > 
> > > > The good news: gitlab.fd.o has become very popular with our
> > > > communities, and is used extensively. This especially includes all
> > > > the
> > > > CI integration. Modern development process and tooling, yay!
> > > > 
> > > > The bad news: The cost in growth has also been tremendous, and it's
> > > > breaking our bank account. With reasonable estimates for continued
> > > > growth we're expecting hosting expenses totalling 75k USD this
> > > > year,
> > > > and 90k USD next year. With the current sponsors we've set up we
> > > > can't
> > > > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > > > without any of the CI features enabled would total 30k USD, which
> > > > is
> > > > within X.org's ability to support through various sponsorships,
> > > > mostly
> > > > through XDC.
> > > > 
> > > > Note that X.org does no longer sponsor any CI runners themselves,
> > > > we've stopped that. The huge additional expenses are all just in
> > > > storing and serving build artifacts and images to outside CI
> > > > runners
> > > > sponsored by various companies. A related topic is that with the
> > > > growth in fd.o it's becoming infeasible to maintain it all on
> > > > volunteer admin time. X.org is therefore also looking for admin
> > > > sponsorship, at least medium term.
> > > > 
> > > > Assuming that we want cash flow reserves for one year of
> > > > gitlab.fd.o
> > > > (without CI support) and a trimmed XDC and assuming no sponsor
> > > > payment
> > > > meanwhile, we'd have to cut CI services somewhere between May and
> > > > June
> > > > this year. The board is of course working on acquiring sponsors,
> > > > but
> > > > filling a shortfall of this magnitude is neither easy nor quick
> > > > work,
> > > > and we therefore decided to give an early warning as soon as
> > > > possible.
> > > > Any help in finding sponsors for fd.o is very much appreciated.
> > > 
> > > a) Ouch.
> > > 
> > > b) we probably need to take a large step back here.
> > > 
> > 
> > I kinda agree, but maybe the step doesn't have to be *too* large?
> > 
> > I wonder if we could solve this by restructuring the project a bit. I'm
> > talking purely from a Mesa point of view here, so it might not solve
> > the full problem, but:
> > 
> > 1. It feels silly that we need to test changes to e.g the i965 driver
> > on dragonboards. We only have a big "do not run CI at all" escape-
> > hatch.
> > 
> > 2. A lot of us are working for a company that can probably pay for
> > their own needs in terms of CI. Perhaps moving some costs "up front" to
> > the company that needs it can make the future of CI for those who can't
> > do this
> > 
> > 3. I think we need a much more detailed break-down of the cost to make
> > educated changes. For instance, how expensive is Docker image
> > uploads/downloads (e.g intermediary artifacts) compared to build logs
> > and final test-results? What kind of artifacts?
> 
> We have logs somewhere, but no one yet got around to analyzing that.
> Which will be quite a bit of work to do since the cloud storage is
> totally disconnected from the gitlab front-end, making the connection
> to which project or CI job caused something is going to require
> scripting. Volunteers definitely very much welcome I think.

It's very surprising to me that this kind of cost monitoring is treated
as an afterthought, especially since one of the main jobs of the X.Org
board is to keep spending under control and transparent.

Also from all the conversations it's still unclear to me if the google
hosting costs are already over the sponsored credits (so it is burning a
hole in the X.org bank account right now) or if this is only going to
happen at a later point in time.

Even with CI disabled it seems that the board estimates a cost of 30k
annually for the plain gitlab hosting. Is this covered by the credits
sponsored by google? If not, why wasn't there a board vote on this
spending? All other spending seems to require pre-approval by the board.
Why wasn't gitlab hosting cost discussed much earlier in the public
board meetings, especially if it's going to be such a big chunk of the
overall spending of the X.Org foundation?

Regards,
Lucas



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Vetter
On Fri, Feb 28, 2020 at 10:29 AM Erik Faye-Lund
 wrote:
>
> On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter 
> > wrote:
> > > Hi all,
> > >
> > > You might have read the short take in the X.org board meeting
> > > minutes
> > > already, here's the long version.
> > >
> > > The good news: gitlab.fd.o has become very popular with our
> > > communities, and is used extensively. This especially includes all
> > > the
> > > CI integration. Modern development process and tooling, yay!
> > >
> > > The bad news: The cost in growth has also been tremendous, and it's
> > > breaking our bank account. With reasonable estimates for continued
> > > growth we're expecting hosting expenses totalling 75k USD this
> > > year,
> > > and 90k USD next year. With the current sponsors we've set up we
> > > can't
> > > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > > without any of the CI features enabled would total 30k USD, which
> > > is
> > > within X.org's ability to support through various sponsorships,
> > > mostly
> > > through XDC.
> > >
> > > Note that X.org does no longer sponsor any CI runners themselves,
> > > we've stopped that. The huge additional expenses are all just in
> > > storing and serving build artifacts and images to outside CI
> > > runners
> > > sponsored by various companies. A related topic is that with the
> > > growth in fd.o it's becoming infeasible to maintain it all on
> > > volunteer admin time. X.org is therefore also looking for admin
> > > sponsorship, at least medium term.
> > >
> > > Assuming that we want cash flow reserves for one year of
> > > gitlab.fd.o
> > > (without CI support) and a trimmed XDC and assuming no sponsor
> > > payment
> > > meanwhile, we'd have to cut CI services somewhere between May and
> > > June
> > > this year. The board is of course working on acquiring sponsors,
> > > but
> > > filling a shortfall of this magnitude is neither easy nor quick
> > > work,
> > > and we therefore decided to give an early warning as soon as
> > > possible.
> > > Any help in finding sponsors for fd.o is very much appreciated.
> >
> > a) Ouch.
> >
> > b) we probably need to take a large step back here.
> >
>
> I kinda agree, but maybe the step doesn't have to be *too* large?
>
> I wonder if we could solve this by restructuring the project a bit. I'm
> talking purely from a Mesa point of view here, so it might not solve
> the full problem, but:
>
> 1. It feels silly that we need to test changes to e.g the i965 driver
> on dragonboards. We only have a big "do not run CI at all" escape-
> hatch.
>
> 2. A lot of us are working for a company that can probably pay for
> their own needs in terms of CI. Perhaps moving some costs "up front" to
> the company that needs it can make the future of CI for those who can't
> do this
>
> 3. I think we need a much more detailed break-down of the cost to make
> educated changes. For instance, how expensive is Docker image
> uploads/downloads (e.g intermediary artifacts) compared to build logs
> and final test-results? What kind of artifacts?

We have logs somewhere, but no one has yet got around to analyzing them.
That will be quite a bit of work, since the cloud storage is totally
disconnected from the gitlab front-end, so connecting a given cost to
the project or CI job that caused it is going to require scripting.
Volunteers definitely very much welcome I think.

> One suggestion would be to do something more similar to what the kernel
> does, and separate into different repos for different subsystems. This
> could allow us to have separate testing-pipelines for these repos,
> which would mean that for instance a change to RADV didn't trigger a
> full Panfrost test-run.

Uh, as someone who lives the kernel multi-tree model daily, there's a
_lot_ of pain involved. I think it's much better to look at filtering out
CI targets for when nothing relevant happened. But that gets somewhat
tricky, since "nothing relevant" is always only relative to some
baseline, so a bit of scripting is involved to make sure you don't
run stuff too often or (probably worse) not often enough.
-Daniel

> This would probably require us to accept using a more branch-heavy
> work-flow. I don't personally think that would be a bad thing.
>
> But this is all kinda based on an assumption that running hardware-
> testing is the expensive part. I think that's quite possibly the case,
> but some more numbers would be helpful. I mean, it might turn out that
> just throwing up a Docker cache inside the organizations that host
> runners might be sufficient for all I know...
>
> We could also do stuff like reducing the amount of tests we run on each
> commit, and punt some testing to a per-weekend test-run or someting
> like that. We don't *need* to know about every problem up front, just
> the stuff that's about to be released, really. The other stuff is just
> nice to have. If it's too expensive, I would say drop it.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Lionel Landwerlin

On 28/02/2020 11:28, Erik Faye-Lund wrote:

> On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
>> On Fri, 28 Feb 2020 at 07:27, Daniel Vetter 
>> wrote:
>>> Hi all,
>>>
>>> You might have read the short take in the X.org board meeting
>>> minutes
>>> already, here's the long version.
>>>
>>> The good news: gitlab.fd.o has become very popular with our
>>> communities, and is used extensively. This especially includes all
>>> the
>>> CI integration. Modern development process and tooling, yay!
>>>
>>> The bad news: The cost in growth has also been tremendous, and it's
>>> breaking our bank account. With reasonable estimates for continued
>>> growth we're expecting hosting expenses totalling 75k USD this
>>> year,
>>> and 90k USD next year. With the current sponsors we've set up we
>>> can't
>>> sustain that. We estimate that hosting expenses for gitlab.fd.o
>>> without any of the CI features enabled would total 30k USD, which
>>> is
>>> within X.org's ability to support through various sponsorships,
>>> mostly
>>> through XDC.
>>>
>>> Note that X.org does no longer sponsor any CI runners themselves,
>>> we've stopped that. The huge additional expenses are all just in
>>> storing and serving build artifacts and images to outside CI
>>> runners
>>> sponsored by various companies. A related topic is that with the
>>> growth in fd.o it's becoming infeasible to maintain it all on
>>> volunteer admin time. X.org is therefore also looking for admin
>>> sponsorship, at least medium term.
>>>
>>> Assuming that we want cash flow reserves for one year of
>>> gitlab.fd.o
>>> (without CI support) and a trimmed XDC and assuming no sponsor
>>> payment
>>> meanwhile, we'd have to cut CI services somewhere between May and
>>> June
>>> this year. The board is of course working on acquiring sponsors,
>>> but
>>> filling a shortfall of this magnitude is neither easy nor quick
>>> work,
>>> and we therefore decided to give an early warning as soon as
>>> possible.
>>> Any help in finding sponsors for fd.o is very much appreciated.
>>
>> a) Ouch.
>>
>> b) we probably need to take a large step back here.
>
> I kinda agree, but maybe the step doesn't have to be *too* large?
>
> I wonder if we could solve this by restructuring the project a bit. I'm
> talking purely from a Mesa point of view here, so it might not solve
> the full problem, but:
>
> 1. It feels silly that we need to test changes to e.g the i965 driver
> on dragonboards. We only have a big "do not run CI at all" escape-
> hatch.



Yeah, changes on vulkan drivers or backend compilers should be fairly 
sandboxed.


We also have tools that only work for intel stuff, that should never 
trigger anything on other people's HW.


Could something be worked out using the tags?


-Lionel
