Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-06 Thread Nicolas Dufresne
On Saturday, 4 April 2020 at 15:55 +0200, Andreas Bergmeier wrote:
> The problem of data transfer costs is not new in cloud environments. At work
> we usually just opt for paying for it, since dev time is scarcer. For private
> projects, though, I opt for aggressive (remote) caching.
> So you can set up a global cache in Google Cloud Storage and more local caches
> wherever your executors are (this reduces egress as much as possible).
> This setup works great with Bazel and Pants, among others. Note that these
> systems are pretty hermetic, in contrast to Meson.
> IIRC Eric by now works at Google. They internally use Blaze, which AFAIK does
> aggressive caching too.
> So maybe using any of these systems would be a way of not having to sacrifice
> any of the current functionality.
> The downside is that you lose a bit of dev productivity, since you can no
> longer eyeball your build definitions.
> 
Did you mean Bazel [0]? I'm not sure I follow your reasoning; why is
Meson vs. Bazel related to this issue?

Nicolas

[0] https://bazel.build/

> ym2c
> 
> 
> On Fri, 28 Feb 2020 at 20:34, Eric Anholt  wrote:
> > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
> > >
> > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > > > b) we probably need to take a large step back here.
> > > > >
> > > > > Look at this from a sponsor POV: why would I give X.org/fd.o
> > > > > sponsorship money that they are just giving straight to Google to
> > > > > pay for hosting credits? Google are profiting in some minor way
> > > > > from these hosting credits being bought by us, and I assume we
> > > > > aren't getting any sort of discount here. Having Google sponsor
> > > > > the credits costs Google substantially less than having any other
> > > > > company give us money to do it.
> > > >
> > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > comparable in terms of what you get and what you pay for them.
> > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > services are cheaper, but then you need to find someone who is going
> > > > to properly administer the various machines, install decent
> > > > monitoring, make sure that more storage is provisioned when we need
> > > > more storage (which is basically all the time), make sure that the
> > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > machines has had a drive in imminent-failure state for the last few
> > > > months), etc.
> > > >
> > > > Given the size of our service, that's a much better plan (IMO) than
> > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > years. But as long as that's the resources we have, then we're paying
> > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > problems.
> > >
> > > Admin for GitLab and CI is a full-time role anyway. The system is
> > > definitely not self-sustaining without time still being put in by you
> > > and anholt. If we have $75k to burn on credits, and it were instead
> > > diverted to pay an admin to administer the real hardware plus
> > > GitLab/CI, would that not be a better use of the money? I don't know
> > > if we can afford $75k for an admin, but suddenly we can afford it for
> > > GitLab credits?
> > 
> > As I think about the time that I've spent at Google in less than a
> > year trying to keep the lights on for CI and optimize our
> > infrastructure in the current cloud environment, that's more than the
> > entire yearly budget you're talking about here.  Saying "let's just
> > pay for people to do more work instead of paying for full-service
> > cloud" is not a cost optimization.
> > 
> > 
> > > > Yes, we could federate everything back out so everyone runs their own
> > > > builds and executes those. Tinderbox did something really similar to
> > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > pre-merge testing, mind.
> > >
> > > Why? Does GitLab not support that model? Having builds done in
> > > parallel on runners closer to the test runners seems like it should
> > > be a thing. I guess artifact transfer would then cost less as a
> > > result.
> > 
> > Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > downloaded by 4 freedreno and 6ish lava runners, about 100
> > pipelines/day, which makes ~1.8TB/month, or $180 or so).  We could
> > build local storage next to the lava dispatcher so that the artifacts
> > didn't have to contain the rootfs that came from the container (~2/3
> > of the insides of the zip file), but that's another service to build
> > and maintain.  Building the drivers once locally and storing them
> > would save downloading the other ~1/3 of the inside of the zip file,
> > but that requires a big enough system to do 
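
For reference, the napkin math above works out as follows, assuming
the ~10 downloads per pipeline Eric lists (4 freedreno + ~6 LAVA) and
a typical ~$0.10/GB cloud egress rate (the exact rate is an
assumption):

  60 MB/download x 10 downloads/pipeline x 100 pipelines/day = 60 GB/day
  60 GB/day x 30 days = 1.8 TB/month
  1,800 GB/month x ~$0.10/GB = ~$180/month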

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-04-04 Thread Nicolas Dufresne
On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer  wrote:
> > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > > pre-merge CI.
> > 
> > Thanks for the suggestion! I implemented something like this for Mesa:
> > 
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > 
> 
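
A minimal sketch of what such a gate can look like in .gitlab-ci.yml,
assuming the merge bot runs pipelines under the account name
"marge-bot" (the account name and job are illustrative; Mesa's actual
implementation is in the MR linked above):

  test-job:
    stage: test
    script:
      - ./run-tests.sh   # stand-in for the real test suite
    rules:
      # Run automatically only for pipelines created by the merge bot.
      - if: '$GITLAB_USER_LOGIN == "marge-bot"'
        when: on_success
      # For everyone else, keep the job manual and non-blocking.
      - when: manual
        allow_failure: true
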
> I wouldn't mind manually triggering pipelines, but unless there is
> some trick I'm not realizing, it is super cumbersome.  I.e. you have to
> first click the container jobs.. then wait.. then the build jobs..
> then wait some more.. and then finally the actual runners.  That would
> be a real step back in terms of usefulness of CI.. one might call it a
> regression :-(

On the GStreamer side we have moved some existing pipelines to manual
mode. As we use needs: between jobs, we can simply set the first job to
manual (in our case it's a single job called "manifest"; in your case
it would be the N container jobs). This way you can have a manual
pipeline that is triggered in a single click (or at least fewer
clicks). Here's an example:

https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292

Those are our post-merge pipelines; we only trigger them if we suspect
a problem.
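
A minimal sketch of that pattern, with made-up job names rather than
GStreamer's actual configuration: only the first job is manual, and
everything downstream hangs off it via needs:, so one click starts the
whole graph.

  stages: [prepare, build, test]

  manifest:
    stage: prepare
    when: manual                 # the one click that starts everything
    script:
      - ./generate-manifest.sh

  build:
    stage: build
    needs: ["manifest"]          # runs once manifest has succeeded
    script:
      - ./build.sh

  test:
    stage: test
    needs: ["build"]
    script:
      - ./run-tests.sh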

> 
> Is there a possible middle ground where pre-marge pipelines that touch
> a particular driver trigger that driver's CI jobs, but MRs that don't
> touch that driver but do touch shared code don't, until triggered by
> marge?  I.e. if I have an MR that only touches nir, it's probably ok to
> not run freedreno jobs until marge triggers it.  But if I have an MR
> that is touching freedreno, I'd really rather not have to wait until
> marge triggers the freedreno CI jobs.
> 
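A sketch of how the first half of that idea can be expressed with
rules:changes (the freedreno job name and path are illustrative): the
driver's jobs run automatically when the driver's own code changes,
and stay manual otherwise.

  freedreno-tests:
    stage: test
    script:
      - ./run-freedreno-tests.sh
    rules:
      # Run automatically when freedreno code itself changes...
      - changes:
          - src/freedreno/**/*
        when: on_success
      # ...otherwise (e.g. only shared code changed), stay manual.
      - when: manual
        allow_failure: true
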
> Btw, I was under the impression (from periodically skimming the logs
> in #freedesktop, so I could well be missing or misunderstanding
> something) that caching etc. had been improved and Mesa's part of the
> egress wasn't the biggest issue at this point?
> 
> BR,
> -R



Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-19 Thread Nicolas Dufresne
On Wednesday, 18 March 2020 at 11:05 +0100, Michel Dänzer wrote:
> On 2020-03-17 6:21 p.m., Lucas Stach wrote:
> > That's one of the issues with implicit sync that explicit may solve: 
> > a single client taking way too much time to render something can 
> > block the whole pipeline up until the display flip. With explicit 
> > sync the compositor can just decide to use the last client buffer if 
> > the latest buffer isn't ready by some deadline.
> 
> FWIW, the compositor can do this with implicit sync as well, by polling
> a dma-buf fd for the buffer. (Currently, it has to poll for writable,
> because waiting for the exclusive fence only isn't enough with amdgpu)

That is very interesting, thanks for sharing. It could allow fixing
some issues in userspace for backward compatibility.

thanks,
Nicolas



Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-12 Thread Nicolas Dufresne
(I know I'm going to be spammed by so many mailing lists ...)

On Wednesday, 11 March 2020 at 14:21 -0500, Jason Ekstrand wrote:
> On Wed, Mar 11, 2020 at 12:31 PM Jason Ekstrand  wrote:
> > All,
> > 
> > Sorry for casting such a broad net with this one. I'm sure most people
> > who reply will get at least one mailing list rejection.  However, this
> > is an issue that affects a LOT of components and that's why it's
> > thorny to begin with.  Please pardon the length of this e-mail as
> > well; I promise there's a concrete point/proposal at the end.
> > 
> > 
> > Explicit synchronization is the future of graphics and media.  At
> > least, that seems to be the consensus among all the graphics people
> > I've talked to.  I had a chat with one of the lead Android graphics
> > engineers recently who told me that doing explicit sync from the start
> > was one of the best engineering decisions Android ever made.  It's
> > also the direction being taken by more modern APIs such as Vulkan.
> > 
> > 
> > ## What are implicit and explicit synchronization?
> > 
> > For those that aren't familiar with this space, GPUs, media encoders,
> > etc. are massively parallel and synchronization of some form is
> > required to ensure that everything happens in the right order and
> > avoid data races.  Implicit synchronization is when bits of work (3D,
> > compute, video encode, etc.) are implicitly based on the absolute
> > CPU-time order in which API calls occur.  Explicit synchronization is
> > when the client (whatever that means in any given context) provides
> > the dependency graph explicitly via some sort of synchronization
> > primitives.  If you're still confused, consider the following
> > examples:
> > 
> > With OpenGL and EGL, almost everything is implicit sync.  Say you have
> > two OpenGL contexts sharing an image where one writes to it and the
> > other textures from it.  The way the OpenGL spec works, the client has
> > to make the API calls to render to the image before (in CPU time) it
> > makes the API calls which texture from the image.  As long as it does
> > this (and maybe inserts a glFlush?), the driver will ensure that the
> > rendering completes before the texturing happens and you get correct
> > contents.
> > 
> > Implicit synchronization can also happen across processes.  Wayland,
> > for instance, is currently built on implicit sync where the client
> > does their rendering and then does a hand-off (via wl_surface::commit)
> > to tell the compositor it's done at which point the compositor can now
> > texture from the surface.  The hand-off ensures that the client's
> > OpenGL API calls happen before the server's OpenGL API calls.
> > 
> > A good example of explicit synchronization is the Vulkan API.  There,
> > a client (or multiple clients) can simultaneously build command
> > buffers in different threads where one of those command buffers
> > renders to an image and the other textures from it and then submit
> > both of them at the same time with instructions to the driver for
> > which order to execute them in.  The execution order is described via
> > the VkSemaphore primitive.  With the new VK_KHR_timeline_semaphore
> > extension, you can even submit the work which does the texturing
> > BEFORE the work which does the rendering and the driver will sort it
> > out.
> > 
> > The #1 problem with implicit synchronization (which explicit solves)
> > is that it leads to a lot of over-synchronization both in client space
> > and in driver/device space.  The client has to synchronize a lot more
> > because it has to ensure that the API calls happen in a particular
> > order.  The driver/device have to synchronize a lot more because they
> > never know what is going to end up being a synchronization point as an
> > API call on another thread/process may occur at any time.  As we move
> > to more and more multi-threaded programming this synchronization (on
> > the client-side especially) becomes more and more painful.
> > 
> > 
> > ## Current status in Linux
> > 
> > Implicit synchronization in Linux works via the kernel's internal
> > dma_buf and dma_fence data structures.  A dma_fence is a tiny object
> > which represents the "done" status for some bit of work.  Typically,
> > dma_fences are created as a by-product of someone submitting some bit
> > of work (say, 3D rendering) to the kernel.  The dma_buf object has a
> > set of dma_fences on it representing shared (read) and exclusive
> > (write) access to the object.  When work is submitted which, for
> > instance, renders to the dma_buf, it's queued waiting on all the fences
> > on the dma_buf and a dma_fence is created representing the end of
> > said rendering work and it's installed as the dma_buf's exclusive
> > fence.  This way, the kernel can manage all its internal queues (3D
> > rendering, display, video encode, etc.) and know which things to
> > submit in what order.
> > 
> > For the last few years, we've had sync_file in the kernel 

Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

2020-03-02 Thread Nicolas Dufresne
Hi Jason,

I personally think the suggestions are still relatively good
brainstorming data for those involved. For those not involved in the
CI scripting itself, I'd say just keep in mind that nothing is black
and white and every change ends up being time consuming.

On Sunday, 1 March 2020 at 14:18 -0600, Jason Ekstrand wrote:
> I've seen a number of suggestions which will do one or both of those things 
> including:
> 
>  - Batching merge requests

Agreed. Or at least I foresee quite complicated code to handle the
case of one batched merge failing the tests, or worse, failing due to
flaky tests.

>  - Not running CI on the master branch

A small clarification: this depends on the chosen workflow. In
GStreamer, we use a rebase flow, so the "merge" button isn't really
merging. It means that to merge, you need your branch to be rebased on
top of the latest. As it is multi-repo, there is always a tiny chance
of breakage due to a mid-air collision with changes in other repos.
What we see is that the post-"merge" CI cannot even catch them all (as
we already observed once). In fact, it usually does not catch anything.
And each time it did catch something, we only noticed on the next MR.
So we are really considering doing this; for this specific
workflow/project, we found very little gain in having it.

With a real merge, the code being tested before and after the merge is
different, and in that case I agree with you.

>  - Shutting off CI

Of course :-). Especially since we had CI before GitLab in GStreamer
(just not pre-commit), we don't want to regress that far into the
past.

>  - Preventing CI on other non-MR branches

Another small nuance: Mesa does not prevent CI, it only makes it
manual on non-MR branches. Users can go and click "run" to get CI
results. We could also have an option to trigger the CI (the opposite
of ci.skip) from the git command line, as sketched below.
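
A hedged sketch of what that opt-in could look like; the RUN_CI
variable name is made up, and the ci.variable push option assumes a
reasonably recent GitLab (12.5+):

  build:
    stage: build
    script:
      - ./build.sh
    rules:
      # MR pipelines run normally.
      - if: '$CI_MERGE_REQUEST_IID'
        when: on_success
      # Opt-in from the command line:
      #   git push -o ci.variable="RUN_CI=true"
      - if: '$RUN_CI == "true"'
        when: on_success
      # Otherwise, keep the job manual and non-blocking.
      - when: manual
        allow_failure: true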

>  - Disabling CI on WIP MRs

I have mixed feelings about that one.

>  - I'm sure there are more...


regards,
Nicolas



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Nicolas Dufresne
On Sunday, 1 March 2020 at 15:14 +0100, Michel Dänzer wrote:
> On 2020-02-29 8:46 p.m., Nicolas Dufresne wrote:
> > On Saturday, 29 February 2020 at 19:14 +0100, Timur Kristóf wrote:
> > > 1. I think we should completely disable running the CI on MRs which are
> > > marked WIP. Speaking from personal experience, I usually make a lot of
> > > changes to my MRs before they are merged, so it is a waste of CI
> > > resources.
> 
> Interesting idea, do you want to create an MR implementing it?
> 
> 
> > In the meantime, you can help by getting into the habit of using:
> > 
> >   git push -o ci.skip
> 
> That breaks Marge Bot.
> 
> 
> > Notably, we would like to get rid of the post-merge CI, as in a
> > rebase flow like we have in GStreamer, it's a really minor risk.
> 
> That should be pretty easy, see Mesa and
> https://docs.gitlab.com/ce/ci/variables/predefined_variables.html.
> Something like this should work:
> 
>   rules:
>     - if: '$CI_PROJECT_NAMESPACE != "gstreamer"'
>       when: never
> 
> This is another interesting idea we could consider for Mesa as well. It
> would however require (mostly) banning direct pushes to the main repository.

We already have this policy in the GStreamer group. We rely on
maintainers to make the right call though, as we have a few cases in
multi-repo usage where pushing manually is the only way to reduce the
breakage time (e.g. when we undo a new API in a development branch).
(We have implemented support so that CI is run across user
repositories with the same branch name, which allows doing CI with all
the changes, but the merge remains non-atomic.)

> 
> 
> > > 2. Maybe we could take this one step further and only allow the CI
> > > to be triggered manually instead of automatically on every push.
> 
> That would again break Marge Bot.

Marge is just software; we can update it to trigger CI on rebases, or
when the CI hasn't been run. There was a proposal to actually do that
and let Marge trigger CI on merges from maintainers. Though, from my
point of view, having a longer delay between submission and the author
becoming aware of CI breakage has side effects. Authors are often less
available a week later, when someone reviews and tries to merge, which
makes merging patches take a lot longer.

> 
> 



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Nicolas Dufresne
On Saturday, 29 February 2020 at 15:54 -0600, Jason Ekstrand wrote:
> On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf  wrote:
> > On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > > > 1. I think we should completely disable running the CI on MRs
> > > > which are marked WIP. Speaking from personal experience, I usually
> > > > make a lot of changes to my MRs before they are merged, so it is a
> > > > waste of CI resources.
> > > 
> > > In the meantime, you can help by getting into the habit of using:
> > > 
> > >   git push -o ci.skip
> > 
> > Thanks for the advice, I wasn't aware such an option exists. Does this
> > also work on the Mesa GitLab, or is it a GStreamer-only thing?
> 
> Mesa is already set up so that it only runs on MRs and branches named
> ci-* (or maybe it's ci/*; I can't remember).
> 
> > How hard would it be to make this the default?
> 
> I strongly suggest looking at how Mesa does it and doing that in
> GStreamer if you can.  It seems to work pretty well in Mesa.

You are right, they added CI_MERGE_REQUEST_SOURCE_BRANCH_NAME in 11.6
(we started our CI a while ago). But there is something even better
now; you can do:

  only:
    refs:
      - merge_requests

Thanks for the hint, I'll suggest that. I've looked at some of Mesa's
CI backend; I think it's really nice, though there are a lot of
concepts that won't work in a multi-repo CI. Again, I need to refresh
my memory on what was moved from the enterprise to the community
version in this regard.
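
For reference, newer GitLab versions can express the same thing once
for the whole pipeline with workflow:rules instead of per-job only:;
a sketch:

  workflow:
    rules:
      # Create pipelines for merge requests...
      - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      # ...and, if post-merge CI is still wanted, for master.
      - if: '$CI_COMMIT_BRANCH == "master"'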

> 
> --Jason
> 
> 
> > > That's a much more difficult goal than it looks like. Let each
> > > project manage its own CI graph and content, as each case is unique.
> > > Running more tests, or building more code, isn't the main issue, as
> > > the CPU time is mostly sponsored. The data transfers between GitLab's
> > > cloud and the runners (which are external), along with sending OS
> > > images to LAVA labs, are likely the most expensive part.
> > > 
> > > As was already mentioned in the thread, what we are missing now, and
> > > what is being worked on, is per-group/project statistics that give us
> > > the hotspots so we can better target the optimization work.
> > 
> > Yes, would be nice to know what the hotspot is, indeed.
> > 
> > As far as I understand, the problem is not CI itself, but the bandwidth
> > needed by the build artifacts, right? Would it be possible to not host
> > the build artifacts on GitLab, but rather only where the build actually
> > happened? Or at least, only transfer the build artifacts on demand?
> > 
> > I'm not exactly familiar with how the system works, so sorry if this is
> > a silly question.
> > 



Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-03-01 Thread Nicolas Dufresne
On Saturday, 29 February 2020 at 19:14 +0100, Timur Kristóf wrote:
> On Fri, 2020-02-28 at 10:43 +, Daniel Stone wrote:
> > On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
> >  wrote:
> > > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > > Yeah, changes on vulkan drivers or backend compilers should be
> > > > fairly sandboxed.
> > > > 
> > > > We also have tools that only work for intel stuff, that should
> > > > never trigger anything on other people's HW.
> > > > 
> > > > Could something be worked out using the tags?
> > > 
> > > I think so! We have the pre-defined environment variable
> > > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> > > 
> > > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> > > 
> > > That sounds like a pretty neat middle-ground to me. I just hope that
> > > new pipelines are triggered if new labels are added, because not
> > > everyone is allowed to set labels, and sometimes people forget...
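
A sketch of that label-gating idea using the newer rules: syntax
rather than only:variables (the "intel" label and job name are
illustrative):

  intel-tests:
    stage: test
    script:
      - ./run-intel-tests.sh
    rules:
      # Run only when the MR carries the "intel" label.
      - if: '$CI_MERGE_REQUEST_LABELS =~ /intel/'
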
> > 
> > There's also this which is somewhat more robust:
> > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
> 
> My 20 cents:
> 
> 1. I think we should completely disable running the CI on MRs which are
> marked WIP. Speaking from personal experience, I usually make a lot of
> changes to my MRs before they are merged, so it is a waste of CI
> resources.

In the meantime, you can help by getting into the habit of using:

  git push -o ci.skip

CI is in fact run for all branches that you push. When we (the
GStreamer project) started our CI, we wanted to limit it to MRs, but we
haven't found a good way yet (and GitLab is not helping much). The main
issue is that it's nearly impossible to use the GitLab web API from a
runner (it requires a private key, in an all-or-nothing manner). But
with the current situation we are revisiting this.

The truth is that probably every CI setup has a lot of room for
optimization, but optimizing can be really time consuming. So until we
have a reason to, we live with inefficiency: oversized artifacts,
unused artifacts, oversized Docker images, etc. Doing a new round of
optimization is obviously a clear short-term goal for projects,
including the GStreamer project. We have discussions going on and are
trying to find solutions. Notably, we would like to get rid of the
post-merge CI, as in a rebase flow like we have in GStreamer, it's a
really minor risk.

> 
> 2. Maybe we could take this one step further and only allow the CI to
> be triggered manually instead of automatically on every push.
> 
> 3. I completely agree with Pierre-Eric on MR 2569: let's not run the
> full CI pipeline on every change, only those parts which are affected
> by the change. It not only costs money, but is also frustrating when
> you submit a change and get unrelated failures from a completely
> unrelated driver.

That's a much more difficult goal than it looks like. Let each project
manage its own CI graph and content, as each case is unique. Running
more tests, or building more code, isn't the main issue, as the CPU
time is mostly sponsored. The data transfers between GitLab's cloud and
the runners (which are external), along with sending OS images to LAVA
labs, are likely the most expensive part.

As was already mentioned in the thread, what we are missing now, and
what is being worked on, is per-group/project statistics that give us
the hotspots so we can better target the optimization work.

> 
> Best regards,
> Timur
> 
