Short version
I'm experiencing somewhat slow slave startup when nodes are offline.
Launch method is "Launch slave via execution of command on the Master"
(bash script), availability is "Take this slave on-line when in demand and
off-line when idle" and "In demand delay" is 0. When I launch a new job
while slaves are offline, they take about a minute to start launching. If I
start slaves manually they come online immediately, so it is not our bash
script that is slow. How do I make on-demand slaves start launching as soon
as they are needed? Can we do some Groovy hack to listen for the need for a
specific Jenkins node and launch it? Or do we need to write our own plugin?
Would that even be possible? See the end of this post for a plugin idea.
Longer version
I want (some of) my Jenkins jobs to be executed on a build farm (Oracle
Grid Engine). The build farm does load balancing between its servers and
handles things such as requests for a specific OS or architecture. I would
like it to work like this:
- A job is triggered somehow. For this example, assume that the job is
  restricted to run on the build farm and on RHEL6.4.
- A new Jenkins node is launched immediately and connects to the build
  farm. The build farm schedules a job to be run on a RHEL6.4 server (it
  may have to wait if no servers are available, or if the Jenkins user has
  already scheduled too many jobs).
- Preferably the Jenkins node receives information about the Jenkins job
  name and build number that caused it to be launched (so that the
  information can be logged for any troubleshooting later).
- When the Jenkins job has finished, the Jenkins node is disconnected.
- No "# of executors" is needed, since the build farm has its own limit. (?)
The build farm is configured to disconnect automatically after 72 hours,
regardless of whether any jobs are still running, which means I cannot use
"Keep this slave on-line as much as possible" (I risk e.g.
DiagnosedStreamCorruptionException). This is not a build farm setting that
I can change.
Currently, we have nodes with
- Launch method "Launch slave via execution of command on the Master"
- Availability "Take this slave on-line when in demand and off-line when
  idle"
- In demand delay 0
- Idle delay 1 (since it cannot be set to 0).
The launch command runs a bash script on the master. The script sets up a
connection to the build farm and listens to it using netcat. When the build
farm assigns a server to the connection, slave.jar is launched and the
Jenkins job is executed. When the slave has been idle for at least 1 minute
(or after at most 72 hours) it is disconnected.
If we click the "Launch slave agent" button on the node's page, the node is
launched immediately and is ready for Jenkins jobs. But when a disconnected
node is needed by a job, it takes about a minute (sometimes up to two
minutes) before the launch even starts.
Problems with this approach:
- hudson.slaves.ComputerRetentionWork only checks whether a slave is needed
  once per minute, so there is normally a pretty long delay before a slave
  is launched, that is, before the build farm can even start to find a host
  to execute on. Users of our Jenkins setup get very annoyed by this
  behavior.
- More than one job can use the node while it is connected, and we do not
  know which jobs do so, which makes it difficult to troubleshoot any
  problems.
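To put a rough number on the first problem, here is a minimal sketch of the
arithmetic, assuming the check really runs on a fixed 60-second period. The
class and method names are made up for illustration; this is not Jenkins
code, and it only models the wait until the next check.

```java
class PollDelay {
    /** Seconds from a job arriving at arrivalSec until the next periodic check. */
    static long waitUntilNextCheck(long arrivalSec, long periodSec) {
        long intoPeriod = arrivalSec % periodSec;
        // Arriving exactly on a tick means no wait; otherwise wait out the period.
        return intoPeriod == 0 ? 0 : periodSec - intoPeriod;
    }

    /** Average wait over every whole-second arrival offset within one period. */
    static double averageWait(long periodSec) {
        long total = 0;
        for (long t = 0; t < periodSec; t++) {
            total += waitUntilNextCheck(t, periodSec);
        }
        return (double) total / periodSec;
    }

    public static void main(String[] args) {
        // A job arriving 1 s after a check waits almost the whole period.
        System.out.println(waitUntilNextCheck(61, 60)); // prints 59
        // Averaged over all arrival offsets the wait is about half a period.
        System.out.println(averageWait(60));            // prints 29.5
    }
}
```

Under this model the wait is never longer than one period, so it does not
account for the delays of up to two minutes mentioned above.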
Plugin idea
One idea is to write a plugin that uses the QueueListener's
onEnterBuildable method. When onEnterBuildable is called, it would basically
mimic ComputerRetentionWork's doRun method, something like this:
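    // Needs hudson.model.Computer, hudson.model.Node, jenkins.model.Jenkins.
    for (Computer c : Jenkins.getInstance().getComputers()) {
        Node n = c.getNode();
        // Skip nodes whose launch is held off until their config is saved.
        if (n != null && n.isHoldOffLaunchUntilSave())
            continue;
        // Let the retention strategy decide whether to launch or disconnect.
        c.getRetentionStrategy().check(c);
    }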
Would this interfere with the scheduled ComputerRetentionWork? Is it a good
idea at all? It doesn't fulfill all our requirements, but it would at least
improve launch times.