Sorry, I meant Naveen Swamy and not Suneel. -Marco
On Sat, Jan 6, 2018 at 3:29 PM, Marco de Abreu <[email protected]> wrote:

> @Kellen: From my experience, it was a bit unclear which exact steps have to
> be executed in order to create a valid environment to run ci_build.sh with
> tests. For example, the script expects the source code in a specific
> directory, which is then softlinked into the docker container. The
> build artefacts are copied into another specific softlinked
> directory as a result of the build process. In order to get to the test
> stage, these specific directories have to be in place. In general, I've got
> the feeling that too many undocumented requirements exist before
> ci_build.sh can be executed properly - which makes sense to some extent, as
> it is supposed to be used only on a Jenkins slave.
> I'd like to see the script revamped in such a way that it can be run out
> of the box on a developer computer as well as on CI, telling the user if
> anything is missing or expected.
>
> @Pedro: Thank you! I've already had the opportunity to let Eric, Sheng and
> Suneel test the authentication mechanism, and so far everything has worked
> flawlessly. At the moment, the roles have to be assigned manually, but I'd
> like to invite everybody to try it out themselves on our test system,
> available at http://jenkins.mxnet-ci-dev.amazon-ml.com/. Feel free to
> break it and let me know if anything is missing. If this system passes
> review and test, I'd like to migrate it to prod.
>
> -Marco
>
> On Sat, Jan 6, 2018 at 3:12 PM, Pedro Larroy <[email protected]>
> wrote:
>
>> Agree that committers should have access to Jenkins.
>>
>> I would like to ask for some patience due to the ongoing progress
>> on the CI work and thank Amazon for providing the resources for
>> running the new CI and the great job done by Marco and the infra team.
>>
>> Are there any volunteers to help with the authentication mechanism
>> for committers?
>>
>> Pedro.
>>
>> On Sat, Jan 6, 2018 at 1:15 PM, Marco de Abreu
>> <[email protected]> wrote:
>> > While compile errors may be reproduced that way, it's hard to run any
>> > tests due to the required softlink to the compiled binaries. It is
>> > possible, but very inconvenient, and thus it makes reproducing test
>> > results very hard.
>> >
>> > I have been thinking about making it possible to download the
>> > artefacts generated during the build stage - for debugging reasons only!
>> > This way, they can be used in conjunction with unit tests to reproduce a
>> > test failure, but this still needs some discussion and a security review.
>> >
>> > -Marco
>> >
>> > On Jan 6, 2018 at 12:59 PM, "kellen sunderland" <
>> > [email protected]> wrote:
>> >
>> >> Regarding the comments around reproducibility, what parts of the CI are
>> >> people having trouble reproducing now? I'm in favour of making our AMIs
>> >> public for transparency reasons (and so that people can provide
>> >> suggestions on how to improve them), but I'm not sure it would help in
>> >> terms of reproducibility for any system other than Windows. When I have
>> >> an error in CI I generally just do a `make clean` in my mxnet root source
>> >> dir, then copy the failing command from CI, e.g.
>> >> `tests/ci_build/ci_build.sh cpu --dockerbinary docker make DEV=1 USE_PROFILER=1 USE_CPP_PACKAGE=1 USE_BLAS=openblas -j$(nproc)`.
>> >> Are there CI tasks (other than Windows) that don't work for people? If
>> >> so, maybe we can help fix those?
>> >>
>> >>
>> >> On Sat, Jan 6, 2018 at 11:50 AM, kellen sunderland <
>> >> [email protected]> wrote:
>> >>
>> >> > +1, thanks for the work Marco.
>> >> >
>> >> > On Sat, Jan 6, 2018 at 12:24 AM, Naveen Swamy <[email protected]>
>> >> > wrote:
>> >> >
>> >> >> this sounds fine to me as long as there is at least one MXNet
>> >> >> committer who is also an admin.
>> >> >>
>> >> >> thanks Marco for making this happen :)
>> >> >>
>> >> >> - Naveen
>> >> >>
>> >> >> On Fri, Jan 5, 2018 at 2:54 PM, Marco de Abreu <
>> >> >> [email protected]> wrote:
>> >> >>
>> >> >> > I'm proposing the following permissions:
>> >> >> > https://i.imgur.com/uiFBtuW.png. The meaning of every permission is
>> >> >> > explained at
>> >> >> > https://wiki.jenkins.io/display/JENKINS/Matrix-based+security.
>> >> >> >
>> >> >> > Any objections?
>> >> >> >
>> >> >> > On Fri, Jan 5, 2018 at 11:03 PM, Marco de Abreu <
>> >> >> > [email protected]> wrote:
>> >> >> >
>> >> >> > > I'm currently working on a prototype of SSO based on GitHub and a
>> >> >> > > few issues arose:
>> >> >> > >
>> >> >> > > We are not able to use the permission strategy which determines
>> >> >> > > the access rights based on the read/write permission to a project,
>> >> >> > > as the Jenkins plugin is not able to resolve the link between
>> >> >> > > Jenkins jobs and GitHub repositories. Instead, I would propose to
>> >> >> > > use a role-based approach using
>> >> >> > > https://wiki.jenkins.io/display/JENKINS/Role+Strategy+Plugin. In
>> >> >> > > this case we would have three roles: Anonymous, Administrator and
>> >> >> > > Committer. While everybody would authenticate using their regular
>> >> >> > > GitHub account, the role assignment would have to happen manually.
>> >> >> > > Considering that the number of administrators and committers
>> >> >> > > doesn't change that frequently, this shouldn't be too much of an
>> >> >> > > issue - auto-populating the status is unfortunately not possible.
>> >> >> > >
>> >> >> > > The reason for splitting Administrators and Committers into two
>> >> >> > > separate roles is security. At the moment, we're using Chris
>> >> >> > > Olivier's GitHub credentials to populate the commit status.
>> >> >> > > If all committers gained full admin rights, they would have access
>> >> >> > > to these credentials. Chris is not fine with this approach and
>> >> >> > > would like to limit the number of people with access to his
>> >> >> > > credentials as much as possible.
>> >> >> > >
>> >> >> > > In order to address his concerns, I propose to add Chris to the
>> >> >> > > committer as well as to the admin role, while all other committers
>> >> >> > > will only receive the committer role without read access to the
>> >> >> > > credentials. In a later email, I will make a proposal for the
>> >> >> > > detailed committer role rights. You can check all available
>> >> >> > > options at
>> >> >> > > https://wiki.jenkins.io/display/JENKINS/Matrix-based+security.
>> >> >> > >
>> >> >> > > All people who have access to the underlying AWS account would be
>> >> >> > > granted the Administrator role with full access. At the moment,
>> >> >> > > this would be Meghna Baijal, Gautam Kumar and myself.
>> >> >> > >
>> >> >> > > An alternative solution would be to create a bot account
>> >> >> > > specifically for MXNet CI and use its credentials instead of
>> >> >> > > Chris'. This account requires write permission to the repository,
>> >> >> > > but would give us the advantage that these credentials could be
>> >> >> > > shared among the committers, thus making the restrictions
>> >> >> > > regarding credentials obsolete (and Chris would be happy not to
>> >> >> > > see his face within every single PR :P ).
>> >> >> > > I've asked around and received the feedback from multiple people
>> >> >> > > that Apache Infra does not want to grant bot accounts write
>> >> >> > > permission to a repository, but I would like to double-check,
>> >> >> > > considering that AppVeyor, for example, has a bot account with
>> >> >> > > write permission. I would like to check back with a mentor and
>> >> >> > > create an Apache Infra ticket to request details and permission.
>> >> >> > >
>> >> >> > > I would propose to take both approaches at the same time, meaning
>> >> >> > > we can start with Chris in the committer AND admin role while
>> >> >> > > trying to get permission for a bot account in the meantime.
>> >> >> > >
>> >> >> > > wdyt?
>> >> >> > >
>> >> >> > > On Fri, Jan 5, 2018 at 8:21 PM, Chris Olivier <[email protected]>
>> >> >> > > wrote:
>> >> >> > >
>> >> >> > >> I am fine without a vote unless a vote is required? Any
>> >> >> > >> objections, anyone? You're sort of adding functionality here, not
>> >> >> > >> changing or restricting... We can always change to Apache later.
>> >> >> > >>
>> >> >> > >> On Fri, Jan 5, 2018 at 11:18 AM, Marco de Abreu <
>> >> >> > >> [email protected]> wrote:
>> >> >> > >>
>> >> >> > >> > I'd be in favour of GitHub. Shall we open a vote, or would you
>> >> >> > >> > like me to create a POC with GitHub first, and afterwards we can
>> >> >> > >> > check if that's enough?
>> >> >> > >> >
>> >> >> > >> > -Marco
>> >> >> > >> >
>> >> >> > >> > On Fri, Jan 5, 2018 at 8:13 PM, Chris Olivier <
>> >> >> > >> > [email protected]> wrote:
>> >> >> > >> >
>> >> >> > >> > > Apparently Apache supports OAuth, so I am open to either.
>> >> >> > >> > > Good idea for the docker thing.
>> >> >> > >> > >
>> >> >> > >> > > On Fri, Jan 5, 2018 at 11:02 AM, Marco de Abreu <
>> >> >> > >> > > [email protected]> wrote:
>> >> >> > >> > >
>> >> >> > >> > > > GitHub SSO has the neat feature that login and permissions
>> >> >> > >> > > > can be selected depending on the access rights a user has to
>> >> >> > >> > > > a project. Somebody with write access (committers) would get
>> >> >> > >> > > > different permissions than somebody with only read access.
>> >> >> > >> > > >
>> >> >> > >> > > > We could check back with Apache for SSO, but this would
>> >> >> > >> > > > involve Apache infra. We could put it up to a vote whether to
>> >> >> > >> > > > use GitHub or Apache SSO.
>> >> >> > >> > > >
>> >> >> > >> > > > In order to reproduce a build failure, we have been thinking
>> >> >> > >> > > > about changing ci_build.sh in such a way that it can be run
>> >> >> > >> > > > manually without Jenkins. The setup I took over binds the
>> >> >> > >> > > > Jenkins work directory into the docker containers and uses a
>> >> >> > >> > > > few hacks which are hard to reproduce locally. We plan to
>> >> >> > >> > > > reengineer this script to make it easier to run manually.
>> >> >> > >> > > > But making the AMI public is a good idea! We plan to make the
>> >> >> > >> > > > whole infrastructure code (based on Terraform) completely
>> >> >> > >> > > > public - at the moment it's in a private repository as it
>> >> >> > >> > > > contains credentials, but they will be moved to KMS soon. It
>> >> >> > >> > > > would definitely be a good approach to just supply the AMI so
>> >> >> > >> > > > everybody could recreate the environment in their own account.
>> >> >> > >> > > >
>> >> >> > >> > > > -Marco
>> >> >> > >> > > >
>> >> >> > >> > > > On Jan 5, 2018 at 7:51 PM, "Chris Olivier" <
>> >> >> > >> > > > [email protected]> wrote:
>> >> >> > >> > > >
>> >> >> > >> > > > Well, login to the Jenkins server, I would imagine.
>> >> >> > >> > > >
>> >> >> > >> > > > GitHub or Apache SSO (does Apache support OAuth?) seems like
>> >> >> > >> > > > a good idea as long as there's a way to not let everyone with
>> >> >> > >> > > > a GitHub account log in.
>> >> >> > >> > > >
>> >> >> > >> > > > Access to actual slave machines could be more restricted, I
>> >> >> > >> > > > imagine.
>> >> >> > >> > > >
>> >> >> > >> > > > Eventually, a public, current AMI for a build slave would be
>> >> >> > >> > > > good in order to reproduce build or test problems that can't
>> >> >> > >> > > > be reproduced locally.
>> >> >> > >> > > >
>> >> >> > >> > > > wdyt?
>> >> >> > >> > > >
>> >> >> > >> > > >
>> >> >> > >> > > > On Fri, Jan 5, 2018 at 10:41 AM, Marco de Abreu <
>> >> >> > >> > > > [email protected]> wrote:
>> >> >> > >> > > >
>> >> >> > >> > > > > Would it be an acceptable solution if we add SSO, or do you
>> >> >> > >> > > > > also want access to the actual AWS account and all machines?
>> >> >> > >> > > > >
>> >> >> > >> > > > > Yes, the build jobs are automatically created for new
>> >> >> > >> > > > > branches.
>> >> >> > >> > > > >
>> >> >> > >> > > > > -Marco
>> >> >> > >> > > > >
>> >> >> > >> > > > > On Jan 5, 2018 at 7:35 PM, "Marco de Abreu" <
>> >> >> > >> > > > > [email protected]> wrote:
>> >> >> > >> > > > >
>> >> >> > >> > > > > I totally agree, this is not the way it should work in an
>> >> >> > >> > > > > Apache Project.
>> >> >> > >> > > > > It's running on an Isengard account, meaning it is only
>> >> >> > >> > > > > accessible to Amazon employees. The problem is that a
>> >> >> > >> > > > > compromised account could cause damage of up to $170,000 per
>> >> >> > >> > > > > day. There are alarms in place to notice those cases, but we
>> >> >> > >> > > > > still have to be very careful. These high limits have been
>> >> >> > >> > > > > chosen because auto scaling will be added within the next
>> >> >> > >> > > > > weeks.
>> >> >> > >> > > > >
>> >> >> > >> > > > > I'd be happy to introduce a committer to the CI process and
>> >> >> > >> > > > > all the necessary steps, as well as granting them permission.
>> >> >> > >> > > > > The only restrictions are that it has to be an Amazon
>> >> >> > >> > > > > employee and that access to console, master and slave is only
>> >> >> > >> > > > > possible from the Corp network.
>> >> >> > >> > > > >
>> >> >> > >> > > > > There is no open ticket. What would you like to request?
>> >> >> > >> > > > >
>> >> >> > >> > > > > -Marco
>> >> >> > >> > > > >
>> >> >> > >> > > > >
>> >> >> > >> > > > > On Jan 5, 2018 at 7:22 PM, "Chris Olivier" <
>> >> >> > >> > > > > [email protected]> wrote:
>> >> >> > >> > > > >
>> >> >> > >> > > > > Like John and other mentors were saying, it's not proper for
>> >> >> > >> > > > > CI to be a closed/inaccessible environment. Is it running on
>> >> >> > >> > > > > an Isengard account or in PROD or CORP or just generic EC2?
>> >> >> > >> > > > > I think that we should remedy this. It's very strange that no
>> >> >> > >> > > > > committers have access at all.
>> >> >> > >> > > > > Is there a ticket open to IPSEC?
>> >> >> > >> > > > >
>> >> >> > >> > > > > On Fri, Jan 5, 2018 at 10:17 AM, Marco de Abreu <
>> >> >> > >> > > > > [email protected]> wrote:
>> >> >> > >> > > > >
>> >> >> > >> > > > > > Hello Chris,
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > At the moment this is not possible due to Amazon AppSec
>> >> >> > >> > > > > > (Application Security) restrictions, which do not permit
>> >> >> > >> > > > > > user data and credentials on these machines.
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > I have been thinking about adding single sign-on bound to
>> >> >> > >> > > > > > GitHub, but we would have to check back with AppSec.
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > Is the reason for your request still the ability to start
>> >> >> > >> > > > > > and stop running builds?
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > Best regards,
>> >> >> > >> > > > > > Marco
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > On Jan 5, 2018 at 7:11 PM, "Chris Olivier" <
>> >> >> > >> > > > > > [email protected]> wrote:
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > Marco,
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > Are all committers able to get login access to the Jenkins
>> >> >> > >> > > > > > Server? If not, why?
>> >> >> > >> > > > > >
>> >> >> > >> > > > > > -Chris
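
For illustration only: the thread asks for ci_build.sh to tell the user when something it expects is missing before a run starts. A minimal sketch of such a pre-flight check is shown below. It is not part of ci_build.sh; the function names are hypothetical, and the directories in the demo are placeholders, since the thread does not say which paths the script actually requires.

```python
import os
import tempfile


def find_missing_paths(required_paths):
    """Return the entries of required_paths that do not exist on disk."""
    return [path for path in required_paths if not os.path.exists(path)]


def preflight_report(required_paths):
    """Return one message per missing path; an empty list means all is in place."""
    return ["missing: " + path for path in find_missing_paths(required_paths)]


if __name__ == "__main__":
    # Demo with placeholder directories (assumptions, not the real CI layout).
    workdir = tempfile.mkdtemp()
    source_dir = os.path.join(workdir, "source")            # exists
    artefact_dir = os.path.join(workdir, "build-artefacts")  # deliberately absent
    os.mkdir(source_dir)

    for message in preflight_report([source_dir, artefact_dir]):
        print(message)
```

Returning the list of missing paths instead of exiting on the first one lets a developer fix everything in one pass, which seems closer to the "run out of the box" goal described above.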
