What are the estimated yearly costs?

On Tue, Oct 13, 2020 at 9:17 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> Yep, we can do it: *docker build --cpu-shares=100 --memory=1024m*
>
> On Tue, Oct 13, 2020 at 6:15 PM Jarek Potiuk <jarek.pot...@polidea.com>
> wrote:
>
>> Plus the "workflow_run" (image building) jobs for all PRs can also be done in
>> the self-hosted workers. They are safe as they are using master scripts.
>> The only potentially dangerous part in them is that someone could do some
>> "mining" in a malicious Docker image building step. This is the only part
>> that comes from the PR for "workflow_run", but it would be isolated within
>> the docker build process, which I believe has rather limited resources, or we
>> can additionally limit it to a single processor and limited memory.
>>
>> J.
>>
>>
>> On Tue, Oct 13, 2020 at 6:12 PM Jarek Potiuk <jarek.pot...@polidea.com>
>> wrote:
>>
>>> I think this part is easy:
>>>
>>> * First of all - it is similar to GA - someone could have used all the
>>> 180 workers of Apache by submitting PRs to various projects. So we just
>>> need a limited worker queue. All those can run as workers in GKE and it
>>> should be easy to manage (we could have an auto-scaling GKE cluster with an
>>> upper limit)
>>> * Secondly - we can - likely - continue using the GA public workers for
>>> all incoming PRs and only use the self-hosted ones for master pushes. Or we
>>> could also use them for PRs coming from maintainers.
>>>
>>> J.
>>>
>>>
>>>
>>> On Tue, Oct 13, 2020 at 5:52 PM Ash Berlin-Taylor <a...@apache.org>
>>> wrote:
>>>
>>>> And a magic security sandbox :D
>>>>
>>>> On Oct 13 2020, at 4:51 pm, Jarek Potiuk <jarek.pot...@polidea.com>
>>>> wrote:
>>>>
>>>> Yep. Now we just need credits :)
>>>>
>>>> On Tue, Oct 13, 2020 at 5:30 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>>
>>>> That's ace, we should go ahead with self-hosted runners then.
>>>>
>>>> On Tue, Oct 13, 2020 at 4:06 PM Ash Berlin-Taylor <a...@apache.org>
>>>> wrote:
>>>>
>>>> Confirmed, we *can* do it - Arrow has done it already
>>>> https://issues.apache.org/jira/browse/INFRA-19875
>>>>
>>>> But let's have a think on how to not be a bot net :)
>>>>
>>>> On Oct 13 2020, at 3:59 pm, Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>
>>>> I've spoken to a few members of ASF Infra directly, and they are just
>>>> confirming, but they are okay with the idea of us adding self-hosted runners
>>>> to our repo, and also okay that we can manage those nodes ourselves. Should
>>>> get final confirmation today.
>>>>
>>>> I wanted to double check that we could use the credits before we get
>>>> anyone to stump up the VMs/credits etc.
>>>>
>>>> -ash
>>>>
>>>> On Oct 13 2020, at 2:16 pm, Jarek Potiuk <jarek.pot...@polidea.com>
>>>> wrote:
>>>>
>>>> This is also a slight problem, as mentioned in the build@ thread:
>>>> https://lists.apache.org/thread.html/r1708881f52adbdae722afb8fea16b23325b739b254b60890e72375e1%40%3Cbuilds.apache.org%3E
>>>> - managing self-hosted runners has to be done through Infrastructure, and
>>>> they have not been very responsive recently (I have tickets waiting for
>>>> weeks now).
>>>>
>>>> But as I've learned recently that we can manage our own secrets via API
>>>> without INFRA (and completely legitimately according to GitHub
>>>> documentation), maybe self-hosted runners will also be possible to
>>>> self-manage :D
>>>>
>>>> J.
>>>>
>>>> On Tue, Oct 13, 2020 at 2:22 PM Ash Berlin-Taylor <a...@apache.org>
>>>> wrote:
>>>>
>>>> I've thought about private/self-hosted runners, and I think long term
>>>> that's the way to go to alleviate our CI bottlenecks.
>>>>
>>>> There's a bit of work we need to do around security of builds - as
>>>> mentioned here
>>>> https://docs.github.com/en/free-pro-team@latest/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories
>>>>
>>>> > We recommend that you do not use self-hosted runners with public
>>>> > repositories.
>>>> >
>>>> > Forks of your public repository can potentially run dangerous code on
>>>> > your self-hosted runner machine by creating a pull request that executes
>>>> > the code in a workflow.
>>>> >
>>>> > This is not an issue with GitHub-hosted runners because each
>>>> > GitHub-hosted runner is always a clean isolated virtual machine, and it is
>>>> > destroyed at the end of the job execution.
>>>>
>>>> So we'd need to do something similar.
>>>>
>>>> All for this and happy to help out once 2.0 is out (or at least once it
>>>> starts to quieten down)
>>>>
>>>> -ash
>>>>
>>>> On Oct 13 2020, at 1:12 pm, Jarek Potiuk <jarek.pot...@polidea.com>
>>>> wrote:
>>>>
>>>> Hello Aizhamal, Everyone,
>>>>
>>>> We've had some problems recently with concurrency for GitHub Actions,
>>>> and the suggested solution for now is to use self-hosted runners (this was
>>>> suggested by GitHub Support).
>>>>
>>>> I made some comments in the issue here:
>>>>
>>>> https://github.com/apache/airflow/issues/11496
>>>>
>>>> And also opened a build@ discussion
>>>> https://lists.apache.org/thread.html/r1708881f52adbdae722afb8fea16b23325b739b254b60890e72375e1%40%3Cbuilds.apache.org%3E
>>>> and opened an accompanying ticket in JIRA:
>>>> https://issues.apache.org/jira/projects/INFRA/issues/INFRA-20978
>>>>
>>>> Regardless of those discussions, it would be great if we came back to
>>>> the idea of Google donating some credits to Apache Airflow to set up its
>>>> own runners.
>>>>
>>>> We did not use them last time because GitLab did not manage to
>>>> implement the needed fork support (they have still not implemented it,
>>>> after more than 1.5 years!)
>>>> but with GitHub I am quite certain we can switch
>>>> and start using such runners pretty much immediately if we had some
>>>> credits.
>>>>
>>>> Or maybe some other companies could donate some credits to us?
>>>>
>>>> J.
>>>>
>>>> --
>>>>
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>
>>>> M: +48 660 796 129 <+48660796129>
>>>> [image: Polidea] <https://www.polidea.com/>
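
For reference, the routing idea from the 6:12 PM message (self-hosted workers for master pushes and, optionally, for PRs coming from maintainers, with GA public workers for everything else) could be sketched roughly as below. This is only an illustration under stated assumptions, not anything the project has implemented: `BuildEvent`, `pick_runner`, `trust_maintainer_prs` and the `TRUSTED_ASSOCIATIONS` set are hypothetical names, and treating GitHub's OWNER/MEMBER/COLLABORATOR author associations as "maintainers" is an assumption.

```python
from dataclasses import dataclass

# Runner labels. "self-hosted" and "ubuntu-latest" are standard GitHub labels;
# using exactly these two here is an assumption made for the sketch.
GITHUB_HOSTED = "ubuntu-latest"
SELF_HOSTED = "self-hosted"

# Hypothetical definition of "maintainer": the author associations we trust.
TRUSTED_ASSOCIATIONS = frozenset({"OWNER", "MEMBER", "COLLABORATOR"})


@dataclass(frozen=True)
class BuildEvent:
    """Minimal description of the event that triggered a CI run."""
    event_name: str                    # "push" or "pull_request"
    ref: str                           # e.g. "refs/heads/master"
    author_association: str = "NONE"   # as reported by GitHub for PR authors


def pick_runner(event: BuildEvent, trust_maintainer_prs: bool = False) -> str:
    """Return the runner label a job for this event should use."""
    # Pushes to master only run code that was already merged.
    if event.event_name == "push" and event.ref == "refs/heads/master":
        return SELF_HOSTED
    # Optionally allow PRs from maintainers onto the self-hosted workers.
    if (
        trust_maintainer_prs
        and event.event_name == "pull_request"
        and event.author_association in TRUSTED_ASSOCIATIONS
    ):
        return SELF_HOSTED
    # Everything else (including PRs from forks) stays on public workers.
    return GITHUB_HOSTED
```

The default keeps every PR on the public workers, so untrusted code from forks never reaches the self-hosted machines unless the maintainer option is switched on explicitly.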