Thank you so much for taking on these migrations! I very much appreciate it!
-n

On Tue, Aug 11, 2020 at 8:08 AM 张铎(Duo Zhang) <[email protected]> wrote:

> Some updates here: we have migrated most of the jobs to ci-hadoop.a.o.
>
> There is a known issue that our flaky dashboard is broken, due to this new
> feature of Jenkins:
>
> https://wiki.jenkins.io/display/JENKINS/Configuring+Content+Security+Policy
>
> Josh is contacting the infra team to see if they can relax the policy, but I
> do not think it is easy as the policy is per site, not per job...
>
> Anyway, there is a Chrome plugin to temporarily disable CSP, so you can
> view the correct flaky dashboard.
>
> https://chrome.google.com/webstore/detail/disable-content-security/ieelmcmcagommplceebfedjlakkhpden
>
> Thanks.
>
> Andor Molnar <[email protected]> wrote on Thu, Jul 30, 2020 at 3:12 PM:
>
> > https://issues.apache.org/jira/browse/INFRA-20613
> >
> > > On 2020. Jul 30., at 1:47, 张铎(Duo Zhang) <[email protected]> wrote:
> > >
> > > This never worked in the past...
> > >
> > > But it would be great if you can kick the infra team to get this done :)
> > >
> > > File an infra issue?
> > >
> > > Andor Molnar <[email protected]> wrote on Wed, Jul 29, 2020 at 18:36:
> > >
> > >> You’re having the same issue with HBase Robot btw. At the end of
> > >> console outputs:
> > >>
> > >> "Could not update commit status, please check if your scan credentials
> > >> belong to a member of the organization or a collaborator of the
> > >> repository and repo:status scope is selected"
> > >>
> > >> ...and shortly after that:
> > >>
> > >> "GitHub has been notified of this commit’s build result"
> > >>
> > >> Whatever that means.
> > >>
> > >> Andor
> > >>
> > >>> On 2020. Jul 29., at 11:57, Andor Molnar <[email protected]> wrote:
> > >>>
> > >>> Yep, we’ve finally received it. It’s done.
> > >>>
> > >>> Current issue is that Jenkins is unable to set GitHub build status.
> > >>> I’ve added repo:status permission, but it’s also asking to be a member
> > >>> of the project/organization and I’m not sure how to do that.
> > >>>
> > >>> Andor
> > >>>
> > >>>> On 2020. Jul 29., at 4:10, 张铎(Duo Zhang) <[email protected]> wrote:
> > >>>>
> > >>>> Seems you have already made it?
> > >>>>
> > >>>> Usually there are several moderators for the private list; you need to
> > >>>> ask them to let the GitHub registration go through.
> > >>>>
> > >>>> Andor Molnar <[email protected]> wrote on Wed, Jul 29, 2020 at 1:03 AM:
> > >>>>
> > >>>>> Thanks Duo, that’s very helpful.
> > >>>>> I cannot set private@zookeeper as a verified e-mail address, because
> > >>>>> the verification e-mail cannot be sent to the list. Isn’t that
> > >>>>> restricted for members only (by default)?
> > >>>>>
> > >>>>> Andor
> > >>>>>
> > >>>>>> On 2020. Jul 28., at 3:15, 张铎(Duo Zhang) <[email protected]> wrote:
> > >>>>>>
> > >>>>>> Hi Andor,
> > >>>>>>
> > >>>>>> The Apache-HBase account is registered by me, using the private@hbase
> > >>>>>> mailing list, so all the PMC members can maintain the password.
> > >>>>>>
> > >>>>>> I generated an access token and added it to our jenkins, so we can
> > >>>>>> use it to post comments back to GitHub.
> > >>>>>>
> > >>>>>> I think you could do the same to register an Apache-ZooKeeper
> > >>>>>> account? Or if you want to use the hadoop-yetus account, you'd better
> > >>>>>> ask the hadoop PMC members or Gavin to add the token to jenkins so
> > >>>>>> you can use it.
> > >>>>>>
> > >>>>>> Thanks.
> > >>>>>>
> > >>>>>> Andor Molnar <[email protected]> wrote on Tue, Jul 28, 2020 at 3:56 AM:
> > >>>>>>
> > >>>>>>> Hi Duo,
> > >>>>>>>
> > >>>>>>> I’m trying to create a similar job for Apache ZooKeeper, but
> > >>>>>>> unfortunately haven’t got too much help on the Apache builds@ list
> > >>>>>>> so far, so I’d rather ask you, if you don’t mind.
> > >>>>>>>
> > >>>>>>> First, how have you set up the HBase GitHub account that you use in
> > >>>>>>> this job to access the repo?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Andor
> > >>>>>>>
> > >>>>>>>> On 2020. Jul 27., at 2:22, 张铎(Duo Zhang) <[email protected]> wrote:
> > >>>>>>>>
> > >>>>>>>> The pre commit job has been migrated to ci-hadoop.a.o.
> > >>>>>>>>
> > >>>>>>>> I have disabled periodical scan for the old job on builds.a.o. As
> > >>>>>>>> we still need to view the pre commit results on it, it is not
> > >>>>>>>> deleted for now. Will delete it later, maybe after several weeks.
> > >>>>>>>>
> > >>>>>>>> The new job is here:
> > >>>>>>>>
> > >>>>>>>> https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/
> > >>>>>>>>
> > >>>>>>>> Thanks.
> > >>>>>>>>
> > >>>>>>>> 张铎(Duo Zhang) <[email protected]> wrote on Sat, Jul 25, 2020 at 9:44 PM:
> > >>>>>>>>
> > >>>>>>>>> https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/5/console
> > >>>>>>>>>
> > >>>>>>>>> We successfully finished a nightly build.
> > >>>>>>>>>
> > >>>>>>>>> But it seems the jiraComment did not work. I haven't seen the
> > >>>>>>>>> comment on HBASE-24757...
> > >>>>>>>>>
> > >>>>>>>>> 张铎(Duo Zhang) <[email protected]> wrote on Sat, Jul 25, 2020 at 4:51 PM:
> > >>>>>>>>>
> > >>>>>>>>>> After installing two new jenkins plugins, the pre commit job
> > >>>>>>>>>> seems fine now.
> > >>>>>>>>>>
> > >>>>>>>>>> The last failure is because of a timeout. I assume the problem is
> > >>>>>>>>>> that we do not have enough executors so all the jobs are executed
> > >>>>>>>>>> sequentially.
> > >>>>>>>>>>
> > >>>>>>>>>> Maybe we could move the pre commit job to the new env first?
> > >>>>>>>>>> The nightly job and flaky job require more resources, and we need
> > >>>>>>>>>> the output of these jenkins jobs (the flaky test list).
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks.
> > >>>>>>>>>>
> > >>>>>>>>>> 张铎(Duo Zhang) <[email protected]> wrote on Fri, Jul 24, 2020 at 4:36 PM:
> > >>>>>>>>>>
> > >>>>>>>>>>> The problem seems to be caused by this:
> > >>>>>>>>>>>
> > >>>>>>>>>>> https://issues.jenkins-ci.org/browse/JENKINS-48556
> > >>>>>>>>>>>
> > >>>>>>>>>>> I triggered the job again, it passed the timestamps call, and I
> > >>>>>>>>>>> will keep an eye on it.
> > >>>>>>>>>>>
> > >>>>>>>>>>> 张铎(Duo Zhang) <[email protected]> wrote on Tue, Jul 21, 2020 at 11:18 AM:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On the sponsors, we could have a try.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> The problem here is the process of the donation? IIRC there is
> > >>>>>>>>>>>> a thread on the infra mailing list about how to donate machines
> > >>>>>>>>>>>> to a specific project and the discussion did not go well...
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Sean Busbey <[email protected]> wrote on Tue, Jul 21, 2020 at 11:13 AM:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> We could check with ASF infra for the current state of things
> > >>>>>>>>>>>>> wrt GitHub Actions. I believe there is a queue set up across
> > >>>>>>>>>>>>> ASF projects.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> It has the same resource issue Travis had; things are fine
> > >>>>>>>>>>>>> until some critical mass of projects seeking better perf
> > >>>>>>>>>>>>> realize some new option is available and then quickly all
> > >>>>>>>>>>>>> available resources are consumed.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> AFAICT the only option that gets us the same or better as the
> > >>>>>>>>>>>>> H* nodes will be finding sponsors and running our own.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Mon, Jul 20, 2020, 21:55 张铎(Duo Zhang) <[email protected]> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I think our nightly, flaky, and pre commit jobs should be
> > >>>>>>>>>>>>>> transferred as a whole? They depend on each other.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I offer my help on the transition.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> And on GitHub CI, does ASF have a special deal with GitHub?
> > >>>>>>>>>>>>>> If not, I do not think the default resource can fit our
> > >>>>>>>>>>>>>> requirements...
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Sean Busbey <[email protected]> wrote on Tue, Jul 21, 2020 at 1:49 AM:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Hi folks!
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Back in April there was a brief discussion[1] about ASF
> > >>>>>>>>>>>>>>> Infra's notification that builds.a.o is going away and we
> > >>>>>>>>>>>>>>> are currently slated to migrate to a set of CI servers for
> > >>>>>>>>>>>>>>> "Hadoop and related projects". This is the ci farm that
> > >>>>>>>>>>>>>>> will contain the bulk of the H* worker nodes that are
> > >>>>>>>>>>>>>>> donated by Yahoo!, which are the nodes we've been running
> > >>>>>>>>>>>>>>> on for ages[2].
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Migration discussion still happens on the
> > >>>>>>>>>>>>>>> [email protected] list[3] and recently ASF Infra set a
> > >>>>>>>>>>>>>>> target date of August 15th for turning off the existing
> > >>>>>>>>>>>>>>> builds.a.o server[4].
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> That gives us a little under 4 weeks to have things up and
> > >>>>>>>>>>>>>>> working on the new ci-hadoop.a.o jenkins coordinator[5].
> > >>>>>>>>>>>>>>> It’s not clear to me that the level of effort we’ll need to
> > >>>>>>>>>>>>>>> spend is worth what we get out of a continuation of the
> > >>>>>>>>>>>>>>> status quo on builds.a.o. I did a quick test by updating
> > >>>>>>>>>>>>>>> the nightly job on ci-hadoop.a.o to run just branch-2,
> > >>>>>>>>>>>>>>> since that has been stable on builds.a.o. It failed with a
> > >>>>>>>>>>>>>>> Jenkins pipeline DSL syntax error[6] so I'm assuming
> > >>>>>>>>>>>>>>> migrating will be a slog.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> As far as I can see our options are:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Do nothing. Have no testing or automated website
> > >>>>>>>>>>>>>>>   publication in mid August.
> > >>>>>>>>>>>>>>> * Transition website publication and nothing else (probably
> > >>>>>>>>>>>>>>>   can be done in a day)
> > >>>>>>>>>>>>>>> * Transition just precommit testing for various repos
> > >>>>>>>>>>>>>>>   (probably can be done in a few days)
> > >>>>>>>>>>>>>>> * Transition everything (no idea how long it takes due to
> > >>>>>>>>>>>>>>>   nightly, flaky stuff, etc)
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> The alternatives if we do not transition any given job to
> > >>>>>>>>>>>>>>> ci-hadoop:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> * Try to move to GitHub Actions
> > >>>>>>>>>>>>>>> * Try to move to Travis CI
> > >>>>>>>>>>>>>>> * Try to move to Jenkins infra we maintain ourselves
> > >>>>>>>>>>>>>>>   (presumably by soliciting project specific donations for
> > >>>>>>>>>>>>>>>   worker nodes on cloud vendors)
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> It's important to remember that as a project we have a
> > >>>>>>>>>>>>>>> heavy footprint wherever our nightly tests run.
> > >>>>>>>>>>>>>>> For context, a given branch's nightly can keep 3-4
> > >>>>>>>>>>>>>>> executors busy for 6+ hours on the current builds.a.o
> > >>>>>>>>>>>>>>> setup. There's been a bunch of great work lately on
> > >>>>>>>>>>>>>>> bringing down what it takes to run the full test suite, but
> > >>>>>>>>>>>>>>> applying that work to nightly is itself a significant
> > >>>>>>>>>>>>>>> undertaking.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> What are folks thinking? Most importantly who is ready to
> > >>>>>>>>>>>>>>> work towards any given approach?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [1] [DISCUSS] Migrating HBase to new CI Master
> > >>>>>>>>>>>>>>> https://s.apache.org/fux1o
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [2] https://builds.apache.org/view/H-L/view/HBase/
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [3] https://lists.apache.org/[email protected]
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [4] [IMPORTANT] - 2 more HADOOP nodes migrated over to
> > >>>>>>>>>>>>>>> ci-hadoop
> > >>>>>>>>>>>>>>> https://s.apache.org/7e1nq
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [5] https://ci-hadoop.apache.org/job/HBase/
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [6] https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/2/console
