This dashboard on executors helps: https://builds.apache.org/label/Hadoop/load-statistics
It is hard to figure out who is occupying the executors other than by
clicking on each machine in turn along the top, e.g. H0, H1, H10, etc.
When I do, I see lots of hbase at the moment.

For flakey runs, I changed master, branch-2, and branch-2.1 to twice a
day instead of every four hours. I left branch-2.3, branch-2.2, and
branch-1 at every four hours since these are probably the most closely
watched. I changed branch-1.4 and branch-1.3 to @daily (1.3 was running
every hour).
https://issues.apache.org/jira/browse/HBASE-24017

I also disabled this job, since it is probably not being looked at, even
though it was running infrequently (it was set to run only on change):
https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.3-IT

Will keep an eye on it.
S

On Wed, Mar 18, 2020 at 2:54 PM Stack <st...@duboce.net> wrote:

> On Wed, Mar 18, 2020 at 11:34 AM Sean Busbey <bus...@apache.org> wrote:
>
>> I agree we should be triggering on SCM poll. Whomever updates that
>> please monitor the results to make sure we weren't avoiding that
>> before due to a bug in jenkins polling.
>>
> I changed all branches to poll once a day and only run nightly if
> change (s/cron/pollSCM/).
> https://issues.apache.org/jira/browse/HBASE-24016.
> Will keep an eye on how it plays out going forward
>
>> If we switch the various builds to rely on flaky lists from their
>> minor release that could be fine, but we'll have to be more proactive
>> in backporting things that stabilize tests and EOM branches where we
>> don't do those backports. That sounds great to me personally.
>>
> Let me just down frequency on the less active branches for now. Will
> report back. If someone is game for Sean's prescription, they can go
> after I'm done.
>
>> Can we switch our jobs to do sequential steps instead of parallel?
>> that will make them take a lot longer but it will mean we don't
>> overwhelm the executors all at once.
>>
> Suggest we wait on result of above work before we go all serial.
> By all means if we are still hogs when above is done, lets go in
> series. I am reluctant too because then a nightly will take a good
> part of a day instead of 3-4 hours.
>
> S
>
>> On Wed, Mar 18, 2020 at 12:45 AM Stack <st...@duboce.net> wrote:
>>
>> > On Tue, Mar 17, 2020 at 3:36 PM Nick Dimiduk <ndimi...@apache.org>
>> > wrote:
>> >
>> > > "Poll SCM: daily" sounds like a good default for everything.
>> > >
>> > This sounds good to me. Let me try it for: 1.3, 1.4, 2.1 (unless
>> > objection).
>> >
>> > There are two other branches that run nightly:
>> >
>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/HBASE-22114-branch-1/configure
>> >
>> > https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/HBASE-23162-branch-1/
>> >
>> > The top-most I changed today seemingly successfully from @daily to
>> > @weekly (at Sean's request). I should change it and
>> > HBASE-23162-branch-1 to 'Poll SCM: daily' too (If I can figure out
>> > how to do it).
>> >
>> > On flakeys, if they are not being looked at/addressed, they just
>> > burning CPU. Up on slack, he asked that we not turn off flakeys. I
>> > could turn down frequency on the old builds (I think I know how to
>> > do this...).
>> >
>> > Will mess w/ this starting tomorrow (out loud on slack) time
>> > permitting.
>> > S
>> >
>> > > On Tue, Mar 17, 2020 at 11:44 AM Bharath Vissapragada
>> > > <bhara...@apache.org> wrote:
>> > >
>> > > > Switch the build trigger for branches from "Build periodically:
>> > > > daily" to "Poll SCM: daily" ?
>> > > > This may not help master much because commits land almost every
>> > > > day but for branches like branch-2.2 where the commits are not
>> > > > that often the nightlies are triggered only when there are new
>> > > > commits
>> > > > <https://stackoverflow.com/questions/41114251/what-is-the-difference-between-poll-scm-build-periodically-in-jenkins/41114307>.
>> > > >
>> > > > Also +1 to reducing the frequency of 1.3/1.4/2.1 and feature
>> > > > branches.
>> > > >
>> > > > On Tue, Mar 17, 2020 at 10:58 AM Stack <st...@duboce.net> wrote:
>> > > >
>> > > > > On Tue, Mar 17, 2020 at 10:35 AM Nick Dimiduk
>> > > > > <ndimi...@apache.org> wrote:
>> > > > >
>> > > > > > Heya,
>> > > > > >
>> > > > > > We've been dinged on HBASE-24001 for our unneighborly
>> > > > > > Jenkins usage. I haven't dug into the report to verify or
>> > > > > > diagnose, I hear it as a reasonable accusation. I recently
>> > > > > > made things worse during the JDK11 work by using parallel
>> > > > > > stages. I'm about to make it worse by adding branch-2.3 to
>> > > > > > the nightly matrix.
>> > > > > >
>> > > > > > How should we approach this?
>> > > > > >
>> > > > > > I would suggest we reduce our flakey build runner to only
>> > > > > > look at master, branch-2, and branch-1. We'd need to update
>> > > > > > the nightly and PR jobs to use their respective branch's
>> > > > > > flakey list as well.
>> > > > > >
>> > > > > Could turn down or off the frequency at which some of the
>> > > > > nightlies run. Do we need 1.3 and 1.4 run every night? 2.1?
>> > > > > On flakies, could at least turn off 2.1, 1.3, and 1.4 runs
>> > > > > unless they are being studied?
>> > > > >
>> > > > > S
>> > > > >
>> > > > > > Other idea?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Nick