Re: [Question] Distributed Load Testing with Mesos and Gatling

CCAAT Thu, 02 Jul 2015 08:56:11 -0700

On 07/01/2015 01:17 PM, Carlos Torres wrote:

Hi all,


In the past weeks, I've been thinking about leveraging Mesos to schedule
distributed load tests.


An excellent idea.


One problem, at least for me, with this approach is that the load testing tool
needs to coordinate the distributed scenario and combine the data. If it
doesn't, the load clients will trigger at different times, and aggregating the
data is left to the user, some external batch job, or a script. This is not a
problem for load generators like Tsung or Locust, since they already provide a
distributed model and coordinate the distributed tasks, but it is a little more
complicated for Gatling, which does not. To me, the approach the Kubernetes team
suggests is really a hack that uses the 'Replication Controller' to spawn
multiple replicas, which could easily be achieved the same way with Marathon
(or Kubernetes on Mesos).

I was thinking of building a Mesos framework that would take the input, or load
simulation file, and schedule jobs across the cluster (perhaps with dedicated
resources to minimize variance) using Gatling. A Mesos framework would be able
to provide a UI/API to take the input jobs and report the status of multiple
jobs. It could also provide a way to sync/orchestrate the simulation, and
finally a way to aggregate the simulation data in one place and serve the
generated HTML report.

Boiled down to its primitive parts, it would spin up multiple Gatling (Java)
processes across the cluster, use something like a barrier (not sure what to
use here) to wait for all processes to be ready to execute, and finally copy
and rename the generated simulation logs from each Gatling process to one
node/place, where they are aggregated and compiled into an HTML report by a
single Gatling process.
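
The barrier and log-collection steps described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not a Mesos
framework and not Gatling's own mechanism: the workers are threads in a single
process, Python's standard `threading.Barrier` stands in for whatever
cluster-wide barrier would really be used, and the names `run_injector`,
`collect_logs`, `demo`, the `worker-N` directories, and the renamed log files
are all hypothetical. No Gatling process is launched; each worker just writes a
placeholder `simulation.log`.

```python
import shutil
import tempfile
import threading
from pathlib import Path


def run_injector(worker_id, barrier, work_dir, started):
    """Stand-in for one load-generator process: prepare, wait, then 'run'."""
    out_dir = work_dir / f"worker-{worker_id}"
    out_dir.mkdir()
    # Block here until every worker has finished preparing.
    barrier.wait()
    started.append(worker_id)
    # A real worker would start its load generator now (assumption);
    # this sketch only writes a placeholder log file.
    (out_dir / "simulation.log").write_text(f"placeholder from worker {worker_id}\n")


def collect_logs(work_dir, dest_dir):
    """Copy every worker's simulation.log into dest_dir under a unique name."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    collected = []
    for log in sorted(work_dir.glob("worker-*/simulation.log")):
        target = dest_dir / f"simulation-{log.parent.name}.log"
        shutil.copy(log, target)
        collected.append(target.name)
    return collected


def demo(num_workers=3):
    """Run the workers behind a barrier, then gather their logs in one place."""
    root = Path(tempfile.mkdtemp())
    work_dir = root / "work"
    dest_dir = root / "results"
    work_dir.mkdir()
    barrier = threading.Barrier(num_workers)
    started = []
    threads = [
        threading.Thread(target=run_injector, args=(i, barrier, work_dir, started))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(started), collect_logs(work_dir, dest_dir)
```

Calling `demo(3)` returns the ids of the workers that passed the barrier and the
names of the collected log files. In a real cluster the barrier would have to
span machines, and the copy step would have to move files between nodes; both
are left out of this sketch.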

First of all, is there anything in the Mesos community that does this already?
If not, do you think this is feasible to accomplish with a Mesos framework, and
would you recommend going with this approach? Does Mesos offer barrier-like
features to coordinate jobs, and can I somehow move files to a single node to
be processed?

This all sounds workable, but I do not have all the experience necessary to
qualify your ideas. What I would suggest is a solution that lends itself to
testing similarly configured cloud/cluster offerings, so the cloud/cluster
community has a way to test and evaluate new releases, substitute components,
forks and even competitive offerings. A ubiquitous and robust testing semantic
based on your ideas does seem to be an overwhelmingly positive idea, imho. As
such, some organizational structure that allows results to be maintained and
quickly compared to other 'test-runs' would greatly encourage usage. Hopefully
'Gatling' and such have many, if not most, of the features needed to automate
the evaluation of results.

Finally, I've never written a non-trivial Mesos framework. How should I go
about it, and where can I find more documentation to get started? I'm looking
for best practices, pitfalls, etc.


Thank you for your time,
Carlos


hth,
James
