Re: [Question] Distributed Load Testing with Mesos and Gatling

Joao Ribeiro Thu, 02 Jul 2015 09:35:17 -0700

This sounds like a really cool project.

I am still a very green user of Mesos and have never used Gatling at all, but a 
quick search took me to http://gatling.io/docs/2.1.6/cookbook/scaling_out.html


With this it shouldn’t be too difficult to create a master/slave or 
scheduler/executors approach where you would have the master launch several 
slaves to do the work, wait for them to finish, collect the logs and generate 
the report.
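
As a rough illustration of the "collect logs" step, here is a minimal Python 
sketch. It assumes each slave leaves a file named simulation.log in its own 
directory; the function name, the directory layout and the simulation-<n>.log 
naming are all made up for the example and are not part of Gatling or Mesos.

```python
import shutil
from pathlib import Path


def collect_simulation_logs(worker_dirs, results_dir):
    """Copy each worker's simulation.log into results_dir under a unique name.

    Hypothetical helper: the file name and directory layout are assumptions.
    """
    results = Path(results_dir)
    results.mkdir(parents=True, exist_ok=True)
    collected = []
    for index, worker_dir in enumerate(worker_dirs):
        source = Path(worker_dir) / "simulation.log"
        if not source.is_file():
            # This worker produced no log (for example, it never started).
            continue
        # Rename on copy so logs from different workers do not overwrite
        # each other in the shared results directory.
        target = results / f"simulation-{index}.log"
        shutil.copyfile(source, target)
        collected.append(target)
    return collected
```

Once all the logs sit in one folder, a single Gatling process can be pointed at 
it to build the report, as the linked scaling-out page describes.
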
For better synchronisation you could make the slaves register with ZooKeeper 
while the master waits for all slaves to be up and then triggers a “start test” 
command on all slaves simultaneously.
You could then easily time out if it takes too long to get all slaves up, or 
use other, more fault-tolerant strategies, e.g.: run with the slaves that you 
got, or bump each slave that is up with more load to try to make up for the 
missing slaves.
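
To make the timeout and "make up for missing slaves" ideas concrete, here is a 
small Python sketch. It does not talk to ZooKeeper: count_registered is a 
hypothetical callable standing in for "how many slaves have registered so 
far", and both function names and all parameters are assumptions made for the 
example.

```python
import time


def wait_for_workers(count_registered, expected, timeout_s,
                     poll_s=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll until `expected` workers are registered or the timeout expires.

    Returns the number of workers that were up when waiting stopped.
    `count_registered` is a hypothetical callable (for instance one that
    counts registration nodes in a coordination service).
    """
    deadline = clock() + timeout_s
    while True:
        ready = count_registered()
        if ready >= expected or clock() >= deadline:
            return ready
        sleep(poll_s)


def split_load(total_users, ready_workers):
    """Spread the full load over the workers that actually came up."""
    if ready_workers <= 0:
        raise ValueError("no workers are up")
    base, extra = divmod(total_users, ready_workers)
    # The first `extra` workers take one additional user so the total is kept.
    return [base + 1 if i < extra else base for i in range(ready_workers)]
```

For example, if 4 slaves were requested for 1000 users but only 3 registered 
before the timeout, split_load(1000, 3) gives each of the 3 a larger share 
instead of dropping a quarter of the load.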

It might be a naive approach but would be a starting point in my opinion.

> On 02 Jul 2015, at 18:00, CCAAT <cc...@tampabay.rr.com> wrote:
> 
> On 07/01/2015 01:17 PM, Carlos Torres wrote:
>> Hi all,
>> 
>> In the past weeks, I've been thinking of leveraging Mesos to schedule 
>> distributed load tests.
> 
> An excellent idea.
>> 
>> One problem, at least for me, with this approach is that the load testing 
>> tool needs to coordinate the distributed scenario and combine the data. If it 
>> doesn't, the load clients will trigger at different times, and the data will 
>> later have to be aggregated by the user, some external batch job, or a script. 
>> This is not a problem for load generators like Tsung or Locust, since they 
>> already provide a distributed model and coordinate the distributed tasks, but 
>> it could be a little more complicated for Gatling, which does not. To me, the 
>> approach the Kubernetes team suggests is really a hack that uses the 
>> 'Replication Controller' to spawn multiple replicas, which could easily be 
>> achieved using the same approach with Marathon (or Kubernetes on Mesos).
> 
>> I was thinking of building a Mesos framework that would take the input, or 
>> load simulation file, and schedule jobs across the cluster (perhaps with 
>> dedicated resources to minimize variance) using Gatling. A Mesos framework 
>> would be able to provide a UI/API to take the input jobs and report the 
>> status of multiple jobs. It could also provide a way to sync/orchestrate the 
>> simulation, and finally provide a way to aggregate the simulation data in one 
>> place and serve the generated HTML report.
> 
>> Boiled down to its primitive parts, it would spin up multiple Gatling (Java) 
>> processes across the cluster, use something like a barrier (not sure what to 
>> use here) to wait for all processes to be ready to execute, and finally copy 
>> and rename the generated simulation logs from each Gatling process to one 
>> node/place, where they are aggregated and compiled into an HTML report by a 
>> single Gatling process.
> 
>> First of all, is there anything in the Mesos community that does this 
>> already? If not, do you think this is feasible to accomplish with a Mesos 
>> framework, and would you recommend going with this approach? Does Mesos offer 
>> barrier-like features to coordinate jobs, and can I somehow move files to a 
>> single node to be processed?
> 
> This all sounds workable, but I do not have all the experience necessary to 
> qualify your ideas. What I would suggest is a solution that lends itself to 
> testing similarly configured cloud/cluster offerings, so that we, the 
> cloud/cluster community, have a way to test and evaluate new releases, 
> substitute component codes, forks and even competitive offerings. A ubiquitous 
> and robust testing semantic based on your ideas does seem to be an 
> overwhelmingly positive idea, imho. As such, some organizational structure 
> that allows results to be maintained and quickly compared to other 'test-runs' 
> would greatly encourage usage.
> Hopefully 'Gatling' and such have many, if not most, of the features needed to 
> automate the evaluation of results.
> 
> 
>> Finally, I've never written a non-trivial Mesos framework. How should I go 
>> about it, and where can I find more documentation to get started? I'm 
>> looking for best practices, pitfalls, etc.
>> 
>> 
>> Thank you for your time,
>> Carlos
> 
> hth,
> James
>
