Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services
Hi Jason,

I personally think the suggestions are still relatively good brainstorming material for those involved. For those not involved in the CI scripting itself, I'd say just keep in mind that nothing is black and white and every change ends up being time-consuming.

On Sunday, 1 March 2020 at 14:18 -0600, Jason Ekstrand wrote:
> I've seen a number of suggestions which will do one or more of those things
> including:
>
> - Batching merge requests

Agreed. Or at least I foresee quite complicated code to handle the case of one batched merge failing the tests, or worse, with flaky tests.

> - Not running CI on the master branch

A small clarification: this depends on the chosen workflow. In GStreamer, we use a rebase flow, so the "merge" button isn't really merging. It means that to merge you need your branch to be rebased on top of the latest. As it is multi-repo, there is always a tiny chance of breakage due to mid-air collisions with changes in other repos. What we see is that the post-"merge" run cannot even catch them all (as we already observed once). In fact, it usually does not catch anything. Or each time it caught something, we only noticed on the next MR. So we are seriously considering doing this, as for this specific workflow/project we found very little gain in having it.

With a real merge, the code being tested before/after the merge is different, and for that I agree with you.

> - Shutting off CI

Of course :-), especially since we had CI before GitLab in GStreamer (just not pre-commit); we don't want to regress that far into the past.

> - Preventing CI on other non-MR branches

Another small nuance: Mesa does not prevent CI, it only makes it manual on non-MR branches. Users can go click "run" to get CI results. We could also have an option to trigger the CI (the opposite of ci.skip) from the git command line.

> - Disabling CI on WIP MRs

I also have mixed feelings about that one.

> - I'm sure there are more...
regards, Nicolas ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services
On Sun, Mar 1, 2020 at 2:49 PM Nicolas Dufresne wrote:
>
> Hi Jason,
>
> I personally think the suggestions are still relatively good
> brainstorming material for those involved. For those not involved
> in the CI scripting itself, I'd say just keep in mind that nothing is
> black and white and every change ends up being time-consuming.

Sorry. I didn't intend to stop a useful brainstorming session. I'm just trying to say that CI is useful and we shouldn't hurt our development flows just to save a little money unless we're truly desperate. From what I understand, I don't think we're that desperate yet. So I was mostly trying to re-focus the discussion towards straightforward things we can do to get rid of pointless waste (there probably is some pretty low-hanging fruit) and away from "OMG X.org is running out of money; CI as little as possible". I don't think you're saying those things; but I've sensed a good bit of fear in this thread. (I could just be totally misreading people, but I don't think so.)

One of the things that someone pointed out on this thread is that we need data. Some has been provided here but it's still a bit unclear exactly what the breakdown is, so it's hard for people to come up with good solutions beyond "just do less CI". We do know that the biggest cost is egress web traffic and that's something we didn't know before. My understanding is that people on the X.org board and/or Daniel are working to get better data. I'm fairly hopeful that, once we understand better what the costs are (or even with just the new data we have), we can bring them down to a reasonable level and/or come up with money to pay for them in fairly short order.

Again, sorry I was so terse. I was just trying to slow the panic.

> On Sunday, 1 March 2020 at 14:18 -0600, Jason Ekstrand wrote:
> > I've seen a number of suggestions which will do one or more of those things
> > including:
> >
> > - Batching merge requests
>
> Agreed. Or at least I foresee quite complicated code to handle the case
> of one batched merge failing the tests, or worse, with flaky tests.
>
> > - Not running CI on the master branch
>
> A small clarification: this depends on the chosen workflow. In
> GStreamer, we use a rebase flow, so the "merge" button isn't really
> merging. It means that to merge you need your branch to be rebased on
> top of the latest. As it is multi-repo, there is always a tiny chance
> of breakage due to mid-air collisions with changes in other repos. What we
> see is that the post-"merge" run cannot even catch them all (as we already
> observed once). In fact, it usually does not catch anything. Or each
> time it caught something, we only noticed on the next MR. So we are
> seriously considering doing this, as for this specific workflow/project we
> found very little gain in having it.
>
> With a real merge, the code being tested before/after the merge is
> different, and for that I agree with you.

Even with a rebase model, it's still potentially different; though Marge re-runs CI before merging. I agree the risk is low, however, and if you have GitLab set up to block MRs that don't pass CI, then you may be able to drop the master branch to a daily run or something like that. Again, this should be decided project by project.

> > - Shutting off CI
>
> Of course :-), especially since we had CI before GitLab in GStreamer
> (just not pre-commit); we don't want to regress that far into the past.
>
> > - Preventing CI on other non-MR branches
>
> Another small nuance: Mesa does not prevent CI, it only makes it manual
> on non-MR branches. Users can go click "run" to get CI results. We could
> also have an option to trigger the CI (the opposite of ci.skip) from the
> git command line.

Hence my use of "prevent". :-) It's very useful but, IMO, it should be opt-in and not opt-out. I think we agree here. :-)

--Jason
Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services
[AMD Official Use Only - Internal Distribution Only]

The one suggestion I saw that definitely seemed worth looking at was adding download caches if the larger CI systems didn't already have them.

Then again, do we know that CI traffic is generating the bulk of the costs? My guess would have been that individual developers and users would be generating as much traffic as the CI rigs.

From: amd-gfx on behalf of Jason Ekstrand
Sent: March 1, 2020 3:18 PM
To: Jacob Lifshay; Nicolas Dufresne
Cc: Erik Faye-Lund; Daniel Vetter; Michel Dänzer; X.Org development; amd-gfx list; wayland; X.Org Foundation Board; Xorg Members List; dri-devel; Mesa Dev; intel-gfx; Discussion of the development of and with GStreamer
Subject: Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

I don't think we need to worry so much about the cost of CI that we need to micro-optimize to get the minimal number of CI runs. We especially shouldn't if it begins to impact code quality, people's ability to merge patches in a timely manner, or visibility into what went wrong when CI fails. I've seen a number of suggestions which will do one or more of those things including:

- Batching merge requests
- Not running CI on the master branch
- Shutting off CI
- Preventing CI on other non-MR branches
- Disabling CI on WIP MRs
- I'm sure there are more...

I think there are things we can do to make CI runs more efficient with some sort of end-point caching and we can probably find some truly wasteful CI to remove. Most of the things in the list above, I've seen presented by people who are only lightly involved in the project to my knowledge (no offense to anyone intended).

Developers depend on the CI system for their day-to-day work and hampering it will only slow down development, reduce code quality, and ultimately hurt our customers and community. If we're so desperate as to be considering painful solutions which will have a negative impact on development, we're better off trying to find more money.

--Jason

On March 1, 2020 13:51:32 Jacob Lifshay wrote:

One idea for Marge-bot (don't know if you already do this): Rust-lang has their bot (bors) automatically group together a few merge requests into a single merge commit, which it then tests; then, when the tests pass, it merges. This could help reduce CI runs to once a day (or some other rate). If the tests fail, then it could automatically deduce which one failed, by recursive subdivision or similar. There's also a mechanism to adjust priority and grouping behavior when the defaults aren't sufficient.

Jacob

___
Intel-gfx mailing list
intel-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services
I don't think we need to worry so much about the cost of CI that we need to micro-optimize to get the minimal number of CI runs. We especially shouldn't if it begins to impact code quality, people's ability to merge patches in a timely manner, or visibility into what went wrong when CI fails. I've seen a number of suggestions which will do one or more of those things including:

- Batching merge requests
- Not running CI on the master branch
- Shutting off CI
- Preventing CI on other non-MR branches
- Disabling CI on WIP MRs
- I'm sure there are more...

I think there are things we can do to make CI runs more efficient with some sort of end-point caching and we can probably find some truly wasteful CI to remove. Most of the things in the list above, I've seen presented by people who are only lightly involved in the project to my knowledge (no offense to anyone intended).

Developers depend on the CI system for their day-to-day work and hampering it will only slow down development, reduce code quality, and ultimately hurt our customers and community. If we're so desperate as to be considering painful solutions which will have a negative impact on development, we're better off trying to find more money.

--Jason

On March 1, 2020 13:51:32 Jacob Lifshay wrote:

One idea for Marge-bot (don't know if you already do this): Rust-lang has their bot (bors) automatically group together a few merge requests into a single merge commit, which it then tests; then, when the tests pass, it merges. This could help reduce CI runs to once a day (or some other rate). If the tests fail, then it could automatically deduce which one failed, by recursive subdivision or similar. There's also a mechanism to adjust priority and grouping behavior when the defaults aren't sufficient.

Jacob
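
The "recursive subdivision" step in Jacob's description can be sketched roughly as follows. This is only an illustration, not how bors or Marge-bot are actually implemented: `find_failing` and the `passes` callback are hypothetical names, and `passes(mrs)` stands in for one complete CI run with all of `mrs` applied together. The sketch assumes deterministic tests and MRs that fail independently of each other; with flaky tests, or two MRs that only break in combination, it can return a wrong or empty answer.

```python
from typing import Callable, List, Sequence


def find_failing(batch: Sequence[str],
                 passes: Callable[[Sequence[str]], bool]) -> List[str]:
    """Return the merge requests in `batch` that break CI.

    `passes` is a hypothetical callback: one call models one CI run on a
    branch with every MR in the given group applied.
    """
    if not batch or passes(batch):
        # Empty group, or the whole group is green: one CI run covered it.
        return []
    if len(batch) == 1:
        # A single MR that fails on its own is a culprit.
        return list(batch)
    # The group fails: split it in half and test each half separately.
    mid = len(batch) // 2
    return find_failing(batch[:mid], passes) + find_failing(batch[mid:], passes)
```

When every MR in a batch is good, the whole batch costs a single CI run. A failing batch costs extra runs to locate the culprit, so the saving depends on how often batches fail.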
Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services
On Fri, Feb 28, 2020 at 11:00 AM Rob Clark wrote:
>
> On Fri, Feb 28, 2020 at 3:43 AM Michel Dänzer wrote:
> >
> > On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
> > >
> > > We could also do stuff like reducing the amount of tests we run on each
> > > commit, and punt some testing to a per-weekend test-run or something
> > > like that. We don't *need* to know about every problem up front, just
> > > the stuff that's about to be released, really. The other stuff is just
> > > nice to have. If it's too expensive, I would say drop it.
> >
> > I don't agree that pre-merge testing is just nice to have. A problem
> > which is only caught after it lands in mainline has a much bigger impact
> > than one which is already caught earlier.
> >
>
> one thought.. since with mesa+margebot we effectively get at least
> two(ish) CI runs per MR, i.e. one when it is initially pushed, and one
> when margebot rebases and tries to merge, could we leverage this to
> have trimmed down pre-margebot CI which tries to just target affected
> drivers, with margebot doing a full CI run (when it is potentially
> batching together multiple MRs)?
>
> Seems like a way to reduce our CI runs with a good safety net to
> prevent things from slipping through the cracks.

Here are a couple more hopefully constructive but possibly bogus ideas:

1. Suggest people put their CI farms behind a Squid transparent caching proxy. There seem to be many how-tos on the internet for doing this and it shouldn't be terribly hard. Maybe GitLab uses too much HTTPS and that messes things up? If not, this would cut downloads to one per farm rather than one per machine.

2. Add -Dstrip=true to the meson config. We want asserts but do we really need those debug symbols? In quick testing on my machine, it seems to reduce the size of build artifacts by about 60%.

Feel free to tell the peanut gallery (me) why I'm wrong. :-)

--Jason
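
The "one per farm rather than one per machine" effect of a shared cache can be modelled with a toy example. This is not a Squid configuration and does not speak HTTP; `FarmCache` and the fetch callback are hypothetical names used only to show why a pull-through cache in front of a farm cuts upstream downloads.

```python
from typing import Callable, Dict


class FarmCache:
    """Toy pull-through cache shared by every machine in one CI farm."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # `fetch_upstream` is a hypothetical stand-in for a real download.
        self._fetch = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.upstream_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Cache miss: only this request generates upstream egress.
            self.upstream_requests += 1
            self._store[url] = self._fetch(url)
        # Cache hit: served locally, no traffic to the upstream host.
        return self._store[url]
```

If ten machines in a farm request the same artifact through one `FarmCache`, `upstream_requests` ends at 1 instead of 10. A real deployment would also need cache expiry and a way to handle HTTPS, which is the caveat raised in the message above.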
Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services
On Fri, Feb 28, 2020 at 3:43 AM Michel Dänzer wrote:
>
> On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
> >
> > We could also do stuff like reducing the amount of tests we run on each
> > commit, and punt some testing to a per-weekend test-run or something
> > like that. We don't *need* to know about every problem up front, just
> > the stuff that's about to be released, really. The other stuff is just
> > nice to have. If it's too expensive, I would say drop it.
>
> I don't agree that pre-merge testing is just nice to have. A problem
> which is only caught after it lands in mainline has a much bigger impact
> than one which is already caught earlier.
>

One thought: since with mesa+margebot we effectively get at least two(ish) CI runs per MR, i.e. one when it is initially pushed, and one when margebot rebases and tries to merge, could we leverage this to have a trimmed-down pre-margebot CI which tries to just target affected drivers, with margebot doing a full CI run (when it is potentially batching together multiple MRs)?

Seems like a way to reduce our CI runs with a good safety net to prevent things from slipping through the cracks.

(Not sure how much that would help reduce bandwidth costs, but I guess it should help a bit.)

BR,
-R
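
The "target affected drivers" idea above could look roughly like the following sketch. The path prefixes and job names in `DRIVER_PATHS` are illustrative assumptions, not the project's actual CI rules, and `jobs_for` is a hypothetical helper. The design choice shown is to fall back to the full job set whenever a change touches code that is not clearly owned by one driver.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping from source-tree prefixes to driver CI jobs.
# Real prefixes and job names would come from the project's CI config.
DRIVER_PATHS: Dict[str, str] = {
    "src/freedreno/": "freedreno",
    "src/gallium/drivers/freedreno/": "freedreno",
    "src/intel/": "intel",
    "src/amd/": "amd",
}
ALL_JOBS: Set[str] = set(DRIVER_PATHS.values())


def jobs_for(changed: Iterable[str], full_run: bool = False) -> Set[str]:
    """Pick the driver jobs to run for a set of changed file paths.

    `full_run=True` models the pre-merge run done by the merge bot.
    """
    if full_run:
        return set(ALL_JOBS)
    selected: Set[str] = set()
    for path in changed:
        hits = {job for prefix, job in DRIVER_PATHS.items()
                if path.startswith(prefix)}
        if not hits:
            # Shared or unknown code: be conservative and run everything.
            return set(ALL_JOBS)
        selected |= hits
    return selected
```

For example, `jobs_for(["src/intel/vulkan/anv_device.c"])` selects only the `intel` job, while a change under a shared directory such as `src/util/` selects every job.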
Re: [Mesa-dev] gitlab.fd.o financial situation and impact on services
On 02/27/2020 05:00 PM, Tom Stellard wrote:
> On 02/27/2020 01:27 PM, Daniel Vetter wrote:
>> Hi all,
>>
>> You might have read the short take in the X.org board meeting minutes
>> already, here's the long version.
>>
>> The good news: gitlab.fd.o has become very popular with our
>> communities, and is used extensively. This especially includes all the
>> CI integration. Modern development process and tooling, yay!
>>
>> The bad news: The cost in growth has also been tremendous, and it's
>> breaking our bank account. With reasonable estimates for continued
>> growth we're expecting hosting expenses totalling 75k USD this year,
>> and 90k USD next year. With the current sponsors we've set up we can't
>> sustain that. We estimate that hosting expenses for gitlab.fd.o
>> without any of the CI features enabled would total 30k USD, which is
>> within X.org's ability to support through various sponsorships, mostly
>> through XDC.
>>
>
> Have you looked into applying for free credits from Amazon:
>
> https://aws.amazon.com/blogs/opensource/aws-promotional-credits-open-source-projects/
>

Also, Fastly provides free CDN services to some Open Source projects:

https://www.fastly.com/open-source?utm_medium=social_source=t.co_campaign=FY17Q4_WebPage_OpenSource

It might also be worth looking into whether the main costs are coming from data transfers.

-Tom

> -Tom
>
>> Note that X.org no longer sponsors any CI runners itself,
>> we've stopped that. The huge additional expenses are all just in
>> storing and serving build artifacts and images to outside CI runners
>> sponsored by various companies. A related topic is that with the
>> growth in fd.o it's becoming infeasible to maintain it all on
>> volunteer admin time. X.org is therefore also looking for admin
>> sponsorship, at least medium term.
>>
>> Assuming that we want cash flow reserves for one year of gitlab.fd.o
>> (without CI support) and a trimmed XDC and assuming no sponsor payment
>> meanwhile, we'd have to cut CI services somewhere between May and June
>> this year. The board is of course working on acquiring sponsors, but
>> filling a shortfall of this magnitude is neither easy nor quick work,
>> and we therefore decided to give an early warning as soon as possible.
>> Any help in finding sponsors for fd.o is very much appreciated.
>>
>> Thanks, Daniel
>>
>
> ___
> mesa-dev mailing list
> mesa-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
Re: [Mesa-dev] gitlab.fd.o financial situation and impact on services
On 02/27/2020 01:27 PM, Daniel Vetter wrote:
> Hi all,
>
> You might have read the short take in the X.org board meeting minutes
> already, here's the long version.
>
> The good news: gitlab.fd.o has become very popular with our
> communities, and is used extensively. This especially includes all the
> CI integration. Modern development process and tooling, yay!
>
> The bad news: The cost in growth has also been tremendous, and it's
> breaking our bank account. With reasonable estimates for continued
> growth we're expecting hosting expenses totalling 75k USD this year,
> and 90k USD next year. With the current sponsors we've set up we can't
> sustain that. We estimate that hosting expenses for gitlab.fd.o
> without any of the CI features enabled would total 30k USD, which is
> within X.org's ability to support through various sponsorships, mostly
> through XDC.
>

Have you looked into applying for free credits from Amazon:

https://aws.amazon.com/blogs/opensource/aws-promotional-credits-open-source-projects/

-Tom

> Note that X.org no longer sponsors any CI runners itself,
> we've stopped that. The huge additional expenses are all just in
> storing and serving build artifacts and images to outside CI runners
> sponsored by various companies. A related topic is that with the
> growth in fd.o it's becoming infeasible to maintain it all on
> volunteer admin time. X.org is therefore also looking for admin
> sponsorship, at least medium term.
>
> Assuming that we want cash flow reserves for one year of gitlab.fd.o
> (without CI support) and a trimmed XDC and assuming no sponsor payment
> meanwhile, we'd have to cut CI services somewhere between May and June
> this year. The board is of course working on acquiring sponsors, but
> filling a shortfall of this magnitude is neither easy nor quick work,
> and we therefore decided to give an early warning as soon as possible.
> Any help in finding sponsors for fd.o is very much appreciated.
>
> Thanks, Daniel
>
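
For a sense of scale, the figures quoted from Daniel's message can be turned into a rough CI share of the hosting bill. This is back-of-the-envelope arithmetic only: `ci_share` is a hypothetical helper, and it assumes the 30k USD no-CI estimate applies unchanged to both years, which the message does not state explicitly.

```python
from typing import Tuple


def ci_share(total_usd: int, baseline_usd: int) -> Tuple[int, float]:
    """Return (cost attributable to CI, that cost as a fraction of total)."""
    ci_cost = total_usd - baseline_usd
    return ci_cost, ci_cost / total_usd


# Figures from the quoted message: 75k USD this year, 90k USD next year,
# 30k USD estimated without any CI features (assumed constant across years).
this_year = ci_share(75_000, 30_000)
next_year = ci_share(90_000, 30_000)
```

Under that assumption, CI-related hosting accounts for 45k USD (60%) of this year's estimate and 60k USD (about two thirds) of next year's.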