Some updates here: we have migrated most of the jobs to ci-hadoop.a.o. There is a known issue that our flaky dashboard is broken, due to this new feature of Jenkins:
https://wiki.jenkins.io/display/JENKINS/Configuring+Content+Security+Policy

Josh is contacting the infra team to see if they can relax the policy, but I do not think it will be easy, as the policy is per site, not per job... Anyway, there is a Chrome plugin to temporarily disable CSP, so you can view the correct flaky dashboard:

https://chrome.google.com/webstore/detail/disable-content-security/ieelmcmcagommplceebfedjlakkhpden

Thanks.

Andor Molnar <an...@apache.org> wrote on Thu, Jul 30, 2020 at 3:12 PM:

> https://issues.apache.org/jira/browse/INFRA-20613
>
>
> On 2020. Jul 30., at 1:47, 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
>>
>> This never worked in the past...
>>
>> But it would be great if you could kick the infra team to get this done :)
>>
>> File an infra issue?
>>
>> Andor Molnar <an...@apache.org> wrote on Wed, Jul 29, 2020 at 18:36:
>>
>>> You're having the same issue with HBase Robot, btw. At the end of the
>>> console outputs:
>>>
>>> "Could not update commit status, please check if your scan credentials
>>> belong to a member of the organization or a collaborator of the repository
>>> and repo:status scope is selected"
>>>
>>> ...and shortly after that:
>>>
>>> "GitHub has been notified of this commit's build result"
>>>
>>> Whatever that means.
>>>
>>> Andor
>>>
>>>
>>> On 2020. Jul 29., at 11:57, Andor Molnar <an...@apache.org> wrote:
>>>
>>>> Yep, we've finally received it. It's done.
>>>>
>>>> The current issue is that Jenkins is unable to set the GitHub build
>>>> status. I've added the repo:status permission, but it is also asking me
>>>> to be a member of the project/organization, and I am not sure how to do
>>>> that.
>>>>
>>>> Andor
>>>>
>>>>
>>>> On 2020. Jul 29., at 4:10, 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
>>>>
>>>>> Seems you have already made it?
>>>>>
>>>>> Usually there are several moderators for the private list; you need to
>>>>> ask them to let the GitHub registration go through.
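The "Could not update commit status" error quoted above involves the GitHub commit-status API, which a CI job calls with a personal access token carrying the repo:status scope. As a rough illustration, the sketch below only builds the request; the owner, repo, sha, and token values are placeholders, and nothing is sent unless you call `urlopen()` yourself.

```python
import json
from urllib import request

def build_status_request(owner, repo, sha, token, state, description):
    """Build (but do not send) a POST to GitHub's commit-status endpoint."""
    url = f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}"
    payload = {
        "state": state,              # "success", "failure", "error" or "pending"
        "description": description,
        "context": "jenkins/precommit",  # hypothetical status context name
    }
    return request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"token {token}",  # token needs the repo:status scope
            "Accept": "application/vnd.github.v3+json",
        },
        method="POST",
    )

# Placeholder values; a real job would use the tested commit's actual sha.
req = build_status_request("apache", "hbase", "0" * 40, "TOKEN",
                           "success", "precommit passed")
print(req.full_url)
```

The "collaborator of the repository" part of the error is a separate permission check on the account that owns the token, which is why adding the scope alone was not enough.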
>>>>>
>>>>> Andor Molnar <an...@apache.org> wrote on Wed, Jul 29, 2020 at 1:03 AM:
>>>>>
>>>>>> Thanks Duo, that's very helpful.
>>>>>> I cannot set private@zookeeper as a verified e-mail address, because the
>>>>>> verification e-mail cannot be sent to the list. Isn't that restricted to
>>>>>> members only (by default)?
>>>>>>
>>>>>> Andor
>>>>>>
>>>>>>
>>>>>> On 2020. Jul 28., at 3:15, 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Andor,
>>>>>>>
>>>>>>> The Apache-HBase account is registered by me, using the private@hbase
>>>>>>> mailing list, so all the PMC members can maintain the password.
>>>>>>>
>>>>>>> I generated an access token and added it to our Jenkins, so we can use
>>>>>>> it to post comments back to GitHub.
>>>>>>>
>>>>>>> I think you could do the same to register an Apache-ZooKeeper account?
>>>>>>> Or, if you want to use the hadoop-yetus account, you had better ask the
>>>>>>> Hadoop PMC members or Gavin to add the token to Jenkins so you can use
>>>>>>> it.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Andor Molnar <an...@apache.org> wrote on Tue, Jul 28, 2020 at 3:56 AM:
>>>>>>>
>>>>>>>> Hi Duo,
>>>>>>>>
>>>>>>>> I'm trying to create a similar job for Apache ZooKeeper, but
>>>>>>>> unfortunately I haven't got much help on the Apache builds@ list so
>>>>>>>> far, so I'm rather asking you, if you don't mind.
>>>>>>>>
>>>>>>>> First, how have you set up the HBase GitHub account that you use in
>>>>>>>> this job to access the repo?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Andor
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2020. Jul 27., at 2:22, 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> The pre commit job has been migrated to ci-hadoop.a.o.
>>>>>>>>>
>>>>>>>>> I have disabled the periodical scan for the old job on builds.a.o; as
>>>>>>>>> we still need to view the pre commit results on it, do not delete it
>>>>>>>>> for now.
>>>>>>>>> Will delete it later, maybe after several weeks.
>>>>>>>>>
>>>>>>>>> The new job is here:
>>>>>>>>>
>>>>>>>>> https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Sat, Jul 25, 2020 at 9:44 PM:
>>>>>>>>>
>>>>>>>>>> https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/5/console
>>>>>>>>>>
>>>>>>>>>> We successfully finished a nightly build.
>>>>>>>>>>
>>>>>>>>>> But it seems the jiraComment step did not work. I haven't seen the
>>>>>>>>>> comment on HBASE-24757...
>>>>>>>>>>
>>>>>>>>>> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Sat, Jul 25, 2020 at 4:51 PM:
>>>>>>>>>>
>>>>>>>>>>> After installing two new Jenkins plugins, the pre commit job seems
>>>>>>>>>>> fine now.
>>>>>>>>>>>
>>>>>>>>>>> The last failure was because of a timeout; I assume the problem is
>>>>>>>>>>> that we do not have enough executors, so all the jobs are executed
>>>>>>>>>>> sequentially.
>>>>>>>>>>>
>>>>>>>>>>> Maybe we could move the pre commit job to the new env first? The
>>>>>>>>>>> nightly job and flaky job require more resources, and we need the
>>>>>>>>>>> output of these Jenkins jobs (the flaky test list).
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Fri, Jul 24, 2020 at 4:36 PM:
>>>>>>>>>>>
>>>>>>>>>>>> The problem seems to be because of this:
>>>>>>>>>>>>
>>>>>>>>>>>> https://issues.jenkins-ci.org/browse/JENKINS-48556
>>>>>>>>>>>>
>>>>>>>>>>>> I triggered the job again; it passed the timestamps call, and I
>>>>>>>>>>>> will keep an eye on it.
>>>>>>>>>>>>
>>>>>>>>>>>> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Tue, Jul 21, 2020 at 11:18 AM:
>>>>>>>>>>>>
>>>>>>>>>>>>> On the sponsors, we could have a try.
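For readers unfamiliar with the "timestamps call" Duo says the stuck job finally got past, it normally lives in a declarative pipeline's options block. The fragment below is a hypothetical sketch, not HBase's actual Jenkinsfile; the stage name and script path are placeholders, and timestamps() depends on the Timestamper plugin being installed on the controller.

```groovy
// Hypothetical declarative-pipeline sketch (not the real HBase Jenkinsfile).
pipeline {
    agent any
    options {
        timestamps()                     // the step the thread's stuck job was blocked at
        timeout(time: 6, unit: 'HOURS')  // bound runaway runs when executors are scarce
    }
    stages {
        stage('precommit') {
            steps {
                sh './run-precommit.sh'  // placeholder for the real precommit script
            }
        }
    }
}
```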
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem here is the process of the donation? IIRC there is a
>>>>>>>>>>>>> thread on the infra mailing list about how to donate machines to
>>>>>>>>>>>>> a specific project, and the discussion did not go well...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sean Busbey <bus...@apache.org> wrote on Tue, Jul 21, 2020 at 11:13 AM:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> We could check with ASF infra for the current state of things
>>>>>>>>>>>>>> wrt GitHub Actions. I believe there is a queue set up across ASF
>>>>>>>>>>>>>> projects.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It has the same resource issue Travis had; things are fine until
>>>>>>>>>>>>>> some critical mass of projects seeking better perf realize some
>>>>>>>>>>>>>> new option is available, and then quickly all available
>>>>>>>>>>>>>> resources are consumed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> AFAICT the only option that gets us the same as or better than
>>>>>>>>>>>>>> the H* nodes will be finding sponsors and running our own.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jul 20, 2020, 21:55 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think our nightly, flaky, and pre commit jobs should be
>>>>>>>>>>>>>>> transferred as a whole? They depend on each other.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I offer my help on the transition.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And on GitHub CI, does the ASF have a special deal with GitHub?
>>>>>>>>>>>>>>> If not, I do not think the default resources can fit our
>>>>>>>>>>>>>>> requirements...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sean Busbey <bus...@apache.org> wrote on Tue, Jul 21, 2020 at 1:49 AM:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi folks!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Back in April there was a brief discussion[1] about ASF
>>>>>>>>>>>>>>>> Infra's notification that builds.a.o is going away and we are
>>>>>>>>>>>>>>>> currently slated to migrate to a set of CI servers for "Hadoop
>>>>>>>>>>>>>>>> and related projects". This is the CI farm that will contain
>>>>>>>>>>>>>>>> the bulk of the H* worker nodes that are donated by Yahoo!,
>>>>>>>>>>>>>>>> which are the nodes we've been running on for ages[2].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Migration discussion still happens on the
>>>>>>>>>>>>>>>> hadoop-migrations@i.a.o list[3], and recently ASF Infra set a
>>>>>>>>>>>>>>>> target date of August 15th for turning off the existing
>>>>>>>>>>>>>>>> builds.a.o server[4].
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That gives us a little under 4 weeks to have things up and
>>>>>>>>>>>>>>>> working on the new ci-hadoop.a.o Jenkins coordinator[5]. It's
>>>>>>>>>>>>>>>> not clear to me that the level of effort we'll need to spend
>>>>>>>>>>>>>>>> is worth what we get out of a continuation of the status quo
>>>>>>>>>>>>>>>> on builds.a.o. I did a quick test by updating the nightly job
>>>>>>>>>>>>>>>> on ci-hadoop.a.o to run just branch-2, since that has been
>>>>>>>>>>>>>>>> stable on builds.a.o. It failed with a Jenkins pipeline DSL
>>>>>>>>>>>>>>>> syntax error[6], so I'm assuming migrating will be a slog.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As far as I can see, our options are:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> * Do nothing. Have no testing or automated website publication
>>>>>>>>>>>>>>>> in mid August.
>>>>>>>>>>>>>>>> * Transition website publication and nothing else (probably
>>>>>>>>>>>>>>>> can be done in a day)
>>>>>>>>>>>>>>>> * Transition just precommit testing for various repos
>>>>>>>>>>>>>>>> (probably can be done in a few days)
>>>>>>>>>>>>>>>> * Transition everything (no idea how long it takes due to
>>>>>>>>>>>>>>>> nightly, flaky stuff, etc.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The alternatives if we do not transition any given job to
>>>>>>>>>>>>>>>> ci-hadoop:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> * Try to move to GitHub Actions
>>>>>>>>>>>>>>>> * Try to move to Travis CI
>>>>>>>>>>>>>>>> * Try to move to Jenkins infra we maintain ourselves
>>>>>>>>>>>>>>>> (presumably by soliciting project-specific donations for
>>>>>>>>>>>>>>>> worker nodes on cloud vendors)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's important to remember that as a project we have a heavy
>>>>>>>>>>>>>>>> footprint wherever our nightly tests run. For context, a given
>>>>>>>>>>>>>>>> branch's nightly can keep 3-4 executors busy for 6+ hours on
>>>>>>>>>>>>>>>> the current builds.a.o setup. There's been a bunch of great
>>>>>>>>>>>>>>>> work lately on bringing down what it takes to run the full
>>>>>>>>>>>>>>>> test suite, but applying that work to nightly is itself a
>>>>>>>>>>>>>>>> significant undertaking.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What are folks thinking? Most importantly, who is ready to
>>>>>>>>>>>>>>>> work towards any given approach?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] [DISCUSS] Migrating HBase to new CI Master
>>>>>>>>>>>>>>>> https://s.apache.org/fux1o
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2] https://builds.apache.org/view/H-L/view/HBase/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [3] https://lists.apache.org/list.html?hadoop-migrati...@infra.apache.org
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [4] [IMPORTANT] - 2 more HADOOP nodes migrated over to ci-hadoop
>>>>>>>>>>>>>>>> https://s.apache.org/7e1nq
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [5] https://ci-hadoop.apache.org/job/HBase/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [6] https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/2/console
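Sean's resource figures in the thread (a branch's nightly keeping 3-4 executors busy for 6+ hours) give a rough sense of the daily load any replacement CI would have to absorb. A back-of-envelope estimate, assuming four active branches (the branch count is an assumption, not a figure from the thread):

```python
# Back-of-envelope load estimate from the numbers in Sean's message.
branches = 4                 # assumed: e.g. master, branch-2, branch-2.x, branch-1
executors_per_nightly = 4    # upper end of the quoted "3-4 executors"
hours_per_nightly = 6        # lower bound of the quoted "6+ hours"

executor_hours_per_day = branches * executors_per_nightly * hours_per_nightly
print(executor_hours_per_day)  # 96 executor-hours per day, before precommit jobs
```

Even at this conservative estimate, the load is far beyond what free shared CI tiers typically offer, which is the point behind the "finding sponsors and running our own" suggestion.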