With appropriate tuning of various parameters, we have seen customers get
over 1000 executors on-line (with an average of 2-3 per slave).

With a large number of slaves, you are usually better off using one of the
on-demand retention strategies.

Getting all 1000 executors running builds can be problematic though.

I am currently working on a scalability framework for testing how Jenkins
scales. It will be here
https://github.com/jenkinsci/scalability-test-framework once I finish some
refactoring that I identified as required - I want people to have a mostly
stable API when I publish this framework.

Some other tooling I have built:

* https://github.com/jenkinsci/mock-load-builder-plugin which will create
build jobs that put traffic on the remoting channel representative of a
chatty build

* https://github.com/jenkinsci/random-job-builder-plugin which will build
jobs selected at random at a specified rate.

With that tooling you can set up a Jenkins instance with a load of jobs and
have those jobs queue up in a semi-realistic - if stressed - way.
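
To make "jobs selected at random at a specified rate" concrete, here is a
small sketch of the idea in Python. It is not the plugin's implementation:
the function name build_schedule, the job names and the evenly spaced
trigger times are all assumptions made for illustration.

```python
import random


def build_schedule(jobs, builds_per_minute, duration_minutes, seed=None):
    """Return a list of (seconds_from_start, job_name) trigger points.

    Hypothetical helper: triggers are evenly spaced at the requested rate
    and each trigger picks one job uniformly at random.
    """
    if builds_per_minute <= 0:
        raise ValueError("builds_per_minute must be positive")
    rng = random.Random(seed)
    interval = 60.0 / builds_per_minute  # seconds between triggers
    count = int(builds_per_minute * duration_minutes)
    return [(i * interval, rng.choice(jobs)) for i in range(count)]


# Illustrative job names only.
jobs = ["folder-a/mock-1", "folder-a/mock-2", "folder-b/mock-3"]
for when, job in build_schedule(jobs, builds_per_minute=30,
                                duration_minutes=1, seed=42)[:5]:
    print(f"t+{when:5.1f}s  trigger {job}")
```

Raising builds_per_minute step by step gives the kind of ramp used in the
tests described further down.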

Using all the above I can report that:

A 1.553 Jenkins master on an m3.large can support:

* Connecting 60 JNLP slaves and having them idle (but the system will be
unresponsive for 2-3 minutes after startup); OR
* Connecting 60 SSH slaves and having them idle (but the system will be
unresponsive for 5-6 minutes after startup); OR
* Connecting 60 CloudBees NIO SSH slaves and having them idle

All with 2 executors per slave.

For both of the SSH slave options above, you need to tell the JVM to use
/dev/./urandom as the entropy source. I suspect the newer NIO JNLP mode
will have removed or reduced the unresponsiveness of the master after
startup with JNLP slaves. That unresponsiveness is also a side-effect of me
starting all the JNLP slaves at the same time. Real systems will likely have
JNLP slaves connect in a more staggered way and not suffer the thread
contention that locks up the Web UI.
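
A minimal sketch of what setting the entropy source usually looks like,
assuming it is done through the standard java.security.egd system property
on the JVM command line. The helper name jvm_command and the jenkins.war
path are illustrative, and which JVM needs the flag in a given setup is not
spelled out above, so treat this as a starting point only.

```python
def jvm_command(war_path="jenkins.war",
                entropy_source="file:/dev/./urandom"):
    """Build a java launch command with the entropy source overridden.

    Assumption: the process is started with 'java -jar <war>'; adapt the
    argument list to however your JVM is actually launched.
    """
    return [
        "java",
        f"-Djava.security.egd={entropy_source}",
        "-jar",
        war_path,
    ]


print(" ".join(jvm_command()))
```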

On each of those test systems I created 3000 mock jobs organized in
folders. I then upped the rate of builds until the Web UI became
unresponsive.

* JNLP hits system load > 5 and web UI is unusable at somewhere between 50
and 55 concurrent builds on an m3.large
* Traditional SSH hits system load > 5 and web UI is unusable at between 10
and 12 concurrent builds on an m3.large
* CloudBees NIO SSH hits system load of 4 at 15 concurrent builds, but Web
UI remains usable all the way up to 120 concurrent builds. The build
duration, however, is increased once you go past 15 concurrent builds. So
the cause and effect here is that back-pressure is being forced on the
builds in order to allow the master to remain usable.
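
For reading the figures above: 60 slaves with 2 executors each gives 120
executors, which matches the highest concurrency reported, so one reading
is that the NIO SSH test simply ran out of executors before the Web UI gave
up. The sketch below does that arithmetic and also relates build rate to
concurrency via Little's law (concurrency = arrival rate x mean duration).
The rate and duration in the example are made up, since build durations are
not reported here.

```python
def max_concurrent_builds(slaves, executors_per_slave):
    """Upper bound on concurrent builds: every executor busy."""
    return slaves * executors_per_slave


def expected_concurrency(builds_per_minute, mean_build_minutes):
    """Little's law: average builds in flight at a steady build rate."""
    return builds_per_minute * mean_build_minutes


print(max_concurrent_builds(60, 2))  # 120

# Hypothetical numbers: 10 builds/minute, 1.5 minute average build.
print(expected_concurrency(10, 1.5))
```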

I picked an m3.large as being a reasonably cost-effective machine type that
would let me scale up to a size where you should start seeing problems of
scalability but not so large that I need a massive army of machines to
saturate it.

HTH


On 28 July 2014 23:51, Maureen Barger <[email protected]> wrote:

> Hi - I am wondering if there is a limit to how many slaves can connect to
> one master.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
