You select which nodes the component jobs run on as part of configuring
the combinations that form the matrix (so you could, for example, pick
multiple node types, though we only specify Ubuntu). In the past I don't
think there was an option to further restrict where the overall matrix
itself initiates, but digging through the options again, there is one
now; I have updated that too so it only uses the Ubuntu nodes.

The mails following my message last night, and a look at the actual jobs
since then, do suggest it really was stuck though: it was not getting past
the svn update for the overall matrix (which I believe is really just used
to list the changes before queuing the first sub task to actually run on
the Ubuntu nodes).

Robbie

On 15 April 2014 22:23, Dennis Lundberg <[email protected]> wrote:

> Hi Robbie,
>
> Yes, sorry about that. Those matrix jobs are tricky. I was so into the
> idea that the windows slaves were the bottleneck, that I didn't think
> about the possibility that it might be the other way around. My bad.
>
> How come you can't select a node or label to use for a matrix job?
>
>
> On Tue, Apr 15, 2014 at 2:21 AM, Robbie Gemmell
> <[email protected]> wrote:
> > '1' was possibly not stuck. It is a matrix project, and while the
> > matrix itself can launch on any node including the Windows ones (something
> > we apparently can't control), it doesn't use a numbered executor on the
> > slave while doing so, which is how you killed 3 things when the node only
> > has 2 executors. The individual jobs within those matrix projects are
> > restricted to only run on the Ubuntu nodes, with each sub part getting
> > scheduled individually at the end of the job queue after the previous sub
> > part completes. Most of the time for the matrix running is simply spent
> > waiting for its parts to get to the front of the queue again.
> >
> > The project was defined that way to ensure we didn't effectively use a
> > larger single block of time (2 to 2.5hrs depending on the particular
> > Ubuntu nodes used and what else is running) the way many jobs do seem to,
> > though it means it can take a very long time for the matrix as a whole to
> > complete if the job queue is long, due to the number of times it has to
> > wait for each part to get to the front of the queue. This seemed fairer
> > than either running the parts in a group of separate jobs or a single job
> > and effectively only queuing once, but it does mean people see the matrix
> > sitting there doing not very much for quite some time.
> >
> > Though they weren't using any executors on the Windows nodes, I have
> > regardless disabled the periodic build on the job which triggers '1'.
> >
> > Robbie
> >
> > On 14 April 2014 20:37, Dennis Lundberg <[email protected]> wrote:
> >
> >> I have just killed the following jobs on windows1, they had been stuck
> >> for 23+ hours:
> >> 1. https://builds.apache.org/job/Qpid-Java-Java-BDB-TestMatrix/
> >> 2. https://builds.apache.org/job/river-qa-refactor-win6/
> >> 3. https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008_java/
> >>
> >> Together they were effectively blocking all other projects that needed
> >> a windows slave.
> >>
> >> The problem with 1 is that it is triggered by
> >> https://builds.apache.org/job/Qpid-Java-Java-MMS-TestMatrix
> >> which in turn is on a periodical schedule (once a day, 0 9 * * *) as
> >> well as an SCM poll schedule (once every 15 minutes, */15 * * * *)
> >>
> >> The same problem goes for 3 which is on a periodical schedule (once a
> >> day, 30 8 * * *)
> >>
> >> In my opinion we should not allow periodical schedules.
> >>
> >> On Sun, Apr 13, 2014 at 10:46 AM, Gavin McDonald <[email protected]> wrote:
> >> > Managed to kill 3 of them, looking into why.
> >> >
> >> > Gav…
> >> >
> >> > On 13/04/2014, at 7:01 AM, Erik de Bruin <[email protected]> wrote:
> >> >
> >> >> Currently there are 4 builds stuck on the windows1 slave. They seem to
> >> >> have stopped on the SCM step right at the beginning of their builds.
> >> >>
> >> >> Can you please take a look?
> >> >>
> >> >> EdB
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Fri, Apr 11, 2014 at 4:43 PM, Alex Harui <[email protected]> wrote:
> >> >>
> >> >>> Hi Jake,
> >> >>>
> >> >>> Thanks for restarting.  I can't help but wonder if there is still some
> >> >>> configuration issue with Jenkins and Git that is causing Windows1 to run
> >> >>> out of memory.  Is there an investigation going on in that regard?
> >> >>>
> >> >>> Thanks,
> >> >>> -Alex
> >> >>>
> >> >>> On 4/11/14 7:38 AM, "Jake Farrell" <[email protected]> wrote:
> >> >>>
> >> >>>> Hey Erik
> >> >>>> Windows1 ran out of memory, restarted and builds in the queue have been
> >> >>>> picked up and are running
> >> >>>>
> >> >>>> -Jake
> >> >>>>
> >> >>>>
> >> >>>> On Fri, Apr 11, 2014 at 10:17 AM, Erik de Bruin <[email protected]> wrote:
> >> >>>>
> >> >>>>> Same week, second time... The 'windows1' slave is offline. There are
> >> >>>>> builds that have been in the queue for over 12 hours, so it's not
> >> >>>>> 'idling'.
> >> >>>>>
> >> >>>>> Can someone look at this, please?
> >> >>>>>
> >> >>>>> Thanks,
> >> >>>>>
> >> >>>>> EdB
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Tue, Apr 8, 2014 at 1:08 AM, David Nalley <[email protected]> wrote:
> >> >>>>>
> >> >>>>>> Jan and I discussed this briefly at ApacheCon and are tossing around
> >> >>>>>> the idea of having Circonus monitor the status of the slave (according
> >> >>>>>> to Jenkins) and perhaps to take corrective action automagically. We're
> >> >>>>>> going to continue to think and work on this. Neither of us has admin
> >> >>>>>> privs on the Windows slaves, so we'd want folks that do (and are thus
> >> >>>>>> responsible for maintaining them) to bless this approach.
> >> >>>>>>
> >> >>>>>> --David
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Mon, Apr 7, 2014 at 11:17 AM, Alex Harui <[email protected]> wrote:
> >> >>>>>>> Hi Jake,
> >> >>>>>>>
> >> >>>>>>> Is there some way you could create a "button" that we could hit to
> >> >>>>>>> restart the Windows slave so we don't have to keep bothering you?  Or
> >> >>>>>>> does it require human intervention to get it to come back up?
> >> >>>>>>>
> >> >>>>>>> Maybe some script we can get at from people.a.o, or a custom Jenkins
> >> >>>>>>> task that we kick, or a button on the wiki that runs some script code?
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>> -Alex
> >> >>>>>>>
> >> >>>>>>> On 4/7/14 8:13 AM, "Erik de Bruin" <[email protected]> wrote:
> >> >>>>>>>
> >> >>>>>>>> Good news.
> >> >>>>>>>>
> >> >>>>>>>> Excellent service, thank you!
> >> >>>>>>>>
> >> >>>>>>>> EdB
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On Mon, Apr 7, 2014 at 4:22 PM, Jake Farrell <[email protected]> wrote:
> >> >>>>>>>>
> >> >>>>>>>>> Hey Erik
> >> >>>>>>>>> I just restarted windows 1 and it has picked up the Apache Flex
> >> >>>>>>>>> build and is running it right now.
> >> >>>>>>>>>
> >> >>>>>>>>> -Jake
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> On Mon, Apr 7, 2014 at 10:08 AM, Erik de Bruin <[email protected]> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> Hi,
> >> >>>>>>>>>>
> >> >>>>>>>>>> This is becoming a weekly event... both 'windows' slaves are
> >> >>>>>>>>>> offline, again.
> >> >>>>>>>>>>
> >> >>>>>>>>>> You might want to seriously consider accepting the offers to help
> >> >>>>>>>>>> from the friendly people in the "volunteering for ASF Jenkins farm
> >> >>>>>>>>>> service maintenance" thread.
> >> >>>>>>>>>>
> >> >>>>>>>>>> EdB
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> On Thu, Apr 3, 2014 at 7:22 PM, Jake Farrell <[email protected]> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>>> restarted, builds should start getting picked up shortly
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> -Jake
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> On Thu, Apr 3, 2014 at 1:05 PM, Erik de Bruin <[email protected]> wrote:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>> Hi,
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Both Windows slaves seem to be offline. There are several 'windows'
> >> >>>>>>>>>>>> builds in the queue, so it seems they are not simply idling. Can you
> >> >>>>>>>>>>>> please take a look?
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> EdB
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> On Tue, Apr 1, 2014 at 9:20 AM, Jake Farrell <[email protected]> wrote:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>> Hey Justin
> >> >>>>>>>>>>>>> The builds look like they are working, not sure why java is giving
> >> >>>>>>>>>>>>> you that error for the latest java path since
> >> >>>>>>>>>>>>> /f/hudson/tools/java/latest-1.6-64/jre/bin/java.exe -version gives
> >> >>>>>>>>>>>>> me a print out of 1.6.0_27. If you wouldn't mind creating a ticket
> >> >>>>>>>>>>>>> for this so someone can investigate it I would appreciate it, it's
> >> >>>>>>>>>>>>> 3am for me and I need to call it a night
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> -Jake
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>> On Tue, Apr 1, 2014 at 3:09 AM, Justin Mclean <[email protected]> wrote:
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Hi,
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> Flex-sdk_1 and flex-sdk_release fixed and started, looking
> >> >>>>>>>>>>>>>>> through the other flex builds now
> >> >>>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>> https://builds.apache.org/view/E-G/view/Flex/job/flex-sdk_1/60/
> >> >>>>>>>>>>>>>>> https://builds.apache.org/view/E-G/view/Flex/job/flex-sdk_release/539/
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> While it looks like they are compiling I noticed this:
> >> >>>>>>>>>>>>>> java.io.IOException: Cannot run program
> >> >>>>>>>>>>>>>> "f:\hudson\tools\java\latest-1.6-64\jre\bin\java.exe
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> So it looks like the version of java it expects to use is missing??
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>> Justin
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>>
> >> >>>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> --
> >> >>>>>>>>>>>> Ix Multimedia Software
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Jan Luykenstraat 27
> >> >>>>>>>>>>>> 3521 VB Utrecht
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> T. 06-51952295
> >> >>>>>>>>>>>> I. www.ixsoftware.nl
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>
> >> >>>
> >> >>
> >> >>
> >> >
> >>
> >>
> >>
> >> --
> >> Dennis Lundberg
> >>
>
>
>
> --
> Dennis Lundberg
>
