With appropriate tuning of various parameters, we have seen customers get over 1000 executors on-line (with an average of 2-3 per slave).
With a large number of slaves, you are usually better off using one of the on-demand retention strategies. Getting all 1000 executors running builds can be problematic though.

I am currently working on a scalability framework for testing how Jenkins scales. It will be here https://github.com/jenkinsci/scalability-test-framework once I finish some refactoring that I identified as required - I want people to have a mostly stable API when I publish this framework.

Some other tooling I have built:

* https://github.com/jenkinsci/mock-load-builder-plugin which will create build jobs that should load up the remoting channel with load representative of a chatty build
* https://github.com/jenkinsci/random-job-builder-plugin which will build jobs selected at random at a specified rate.

With that tooling you can set up a Jenkins instance with a load of jobs and have those jobs queue up in a semi-realistic - if stressed - way.

Using all the above I can report that:

A 1.553 Jenkins master on an m3.large can support:

* Connecting 60 JNLP slaves and having them idle (but the system will be unresponsive for 2-3 minutes after startup); OR
* Connecting 60 SSH slaves and having them idle (but the system will be unresponsive for 5-6 minutes after startup); OR
* Connecting 60 CloudBees NIO SSH slaves and having them idle

All with 2 executors per slave.

For both of the SSH slave options above, you need to tell the JVM to use /dev/./urandom as the entropy source.

I suspect the newer NIO JNLP mode will have removed / reduced the unresponsiveness of the JNLP master after startup... also, that unresponsiveness is a side-effect of me starting all the JNLP slaves at the same time. Real systems will likely have JNLP slaves connect in a more staggered way and not suffer the thread contention that locks up the Web UI.

On each of those test systems I created 3000 mock jobs organized in folders. I then upped the rate of builds until the Web UI became unresponsive.
* JNLP hits system load > 5 and the Web UI is unusable at somewhere between 50 and 55 concurrent builds on an m3.large
* Traditional SSH hits system load > 5 and the Web UI is unusable at between 10 and 12 concurrent builds on an m3.large
* CloudBees NIO SSH hits system load of 4 at 15 concurrent builds, but the Web UI remains usable all the way up to 120 concurrent builds. The build duration - however - is increased once you go past 15 concurrent builds. So the cause and effect here is that back-pressure is being forced on the builds in order to allow the master to remain usable...

I picked an m3.large as being a reasonably cost-effective machine type that would let me scale up to a size where you should start seeing problems of scalability but not so large that I need a massive army of machines to saturate it.

HTH

On 28 July 2014 23:51, Maureen Barger <[email protected]> wrote:

> Hi - I am wondering if there is a limit to how many slaves can connect to
> one master.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
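
For anyone wanting to try the entropy source setting mentioned above for the SSH slave options, here is a minimal sketch in Python of how the master's command line might be put together. It assumes the JVM in question is the master's and that Jenkins is started directly from a war file; the `jenkins.war` path and both function names are hypothetical. The only real knob used is the standard `java.security.egd` system property. The second helper just does the executor arithmetic: 60 slaves with 2 executors each gives 120 executors, which would match the 120 concurrent build ceiling in the NIO SSH test.

```python
import shlex

# Entropy source value taken from the message above.
ENTROPY_SOURCE = "file:/dev/./urandom"


def jenkins_master_command(war_path="jenkins.war", entropy_source=ENTROPY_SOURCE):
    """Build the argv list for starting the master JVM.

    Assumption: Jenkins is launched with `java -jar`; war_path is a
    hypothetical location, adjust it for your installation.
    """
    return [
        "java",
        # java.security.egd selects the entropy source used for seeding SecureRandom
        f"-Djava.security.egd={entropy_source}",
        "-jar",
        war_path,
    ]


def total_executors(slaves, executors_per_slave):
    """Upper bound on concurrent builds for a given number of slaves."""
    return slaves * executors_per_slave


if __name__ == "__main__":
    # Print the command as a single shell-style line.
    print(shlex.join(jenkins_master_command()))
    # Executors available in the test setup described above.
    print(total_executors(60, 2))
```

If your master runs under a service wrapper or a servlet container, the same `-D` option would go wherever that setup takes its JVM arguments.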
