I think a basic tool that is relatively independent of cluster specifics could be pretty useful. I'm imagining a tool that lets you do load testing against any cluster you point it at.

Test indexing by selecting the complexity of the data objects you're interested in - i.e., create N test indices with X shards and Y replicas each, and send either a custom object whose fields can be defined by length and type of random variables, or basic objects at various sizes (an example tweet, log record, simple data point, etc.). As a concrete example, if I wanted to see how my system did with objects that looked like

    {"test1":18, "test2":[{"test3":"XnksdjfknSeorjOJosdimkn skjnfds sdfsidfun ifsdfosdmfo"}, {"test3":"dslkmlkdsfnUFIDNSiufndsfn DFISnu"}]}

we could just pass something like a mapping:

    {"test1":{"type":"int", "maxSize":10000000, "minSize":-1029420},"test2":{"type":"nested", "minSize":0, "maxSize":20, "properties":{"test3":{"type":"string", "maxSize": 160, "minSize":20}}}}

Random objects would be generated according to the specifications of the template. Maybe you could also pick a type like "english", "french", "russian", etc., to generate strings that are actual language, based on a dictionary of terms (or be able to define custom "types" pointing at text files).

You could then indicate how many objects you want created in any given index (or a min/max range), the number of workers writing to any given index, the total number of workers you want writing data, the total number of indices, and so on. Data could all be sent over the native API (I'm a Python guy, so I'll say Python) or over HTTP using something like the requests module. That could allow for interesting comparisons of the APIs.
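To make the generator side concrete, here's a rough sketch of what I have in mind in Python. The template format (type/minSize/maxSize/properties) is just the one I made up above, not anything Elasticsearch defines:

    import random
    import string

    def random_value(spec):
        """Generate one random value according to a field spec."""
        ftype = spec["type"]
        if ftype == "int":
            return random.randint(spec["minSize"], spec["maxSize"])
        if ftype == "string":
            # an "english"/"french"/etc. type could instead pick words from a dictionary file
            length = random.randint(spec["minSize"], spec["maxSize"])
            return "".join(random.choice(string.ascii_letters + " ") for _ in range(length))
        if ftype == "nested":
            count = random.randint(spec["minSize"], spec["maxSize"])
            return [generate_doc(spec["properties"]) for _ in range(count)]
        raise ValueError("unknown type: %s" % ftype)

    def generate_doc(template):
        """Build one random document from a template of field specs."""
        return {field: random_value(spec) for field, spec in template.items()}

    template = {
        "test1": {"type": "int", "minSize": -1029420, "maxSize": 10000000},
        "test2": {"type": "nested", "minSize": 0, "maxSize": 20,
                  "properties": {"test3": {"type": "string", "minSize": 20, "maxSize": 160}}},
    }

    print(generate_doc(template))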
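And the indexing workers could just be plain processes pushing batches at the bulk endpoint over HTTP with requests. This is only a sketch: the URL, index name, type name, doc counts, batch size, and worker count are all placeholders, error handling is skipped, and it leans on generate_doc/template from the snippet above:

    import json
    import multiprocessing

    import requests

    ES_URL = "http://localhost:9200"   # placeholder; point it at whatever cluster you're testing

    def index_worker(index_name, num_docs, batch_size, template):
        """One worker process: generate random docs and send them in batches via the bulk API."""
        session = requests.Session()
        for start in range(0, num_docs, batch_size):
            lines = []
            for _ in range(min(batch_size, num_docs - start)):
                lines.append(json.dumps({"index": {"_index": index_name, "_type": "doc"}}))
                lines.append(json.dumps(generate_doc(template)))
            resp = session.post(ES_URL + "/_bulk", data="\n".join(lines) + "\n",
                                headers={"Content-Type": "application/x-ndjson"})
            resp.raise_for_status()

    if __name__ == "__main__":
        procs = [multiprocessing.Process(target=index_worker,
                                         args=("test-0", 100000, 500, template))
                 for _ in range(8)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()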
Test queries by doing pretty much the same thing: track the query response times per worker, plus any other relevant stats. Maybe have a definable "max requests per second per worker" so you can replicate your worst-case user behavior in each process.

That would be step 1. Step 2 would be developing something so that a central test system could allocate testing jobs and collect stats across a number of client test systems. So I'd set up a test service on host A and test clients on hosts B and C. A would send job properties to B and C, which would then launch, track their own stats, and send them back to A to aggregate. Scale out to as many systems as you like.

This is just a first pass at the idea (there may be some dumb mistakes in logic or oversights about test cases), but I think an app like this could be pretty useful. Heck, you could put a GUI on it, or just make it run off a yaml file or something. If it gets into my head enough, maybe I'll try to write this up, though like I said, it'd be in Python since that's my language of choice. So it wouldn't be as optimal a testing platform as a native Java app, I guess, but still useful as a proof of concept. A couple of rough sketches of the query and step-2 pieces are below.
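For the query side, each worker could look something like this. Again just a sketch: the index name, the canned query list, and the rate cap are whatever you configure, and the stats collected here are only latencies and status codes:

    import json
    import time

    import requests

    def query_worker(es_url, index, queries, duration_s, max_rps):
        """Run canned queries against an index for duration_s seconds, capped at
        max_rps requests per second, recording each response time."""
        session = requests.Session()
        latencies = []
        interval = 1.0 / max_rps
        deadline = time.time() + duration_s
        i = 0
        while time.time() < deadline:
            body = queries[i % len(queries)]
            start = time.time()
            resp = session.post("%s/%s/_search" % (es_url, index), data=json.dumps(body))
            elapsed = time.time() - start
            latencies.append((elapsed, resp.status_code))
            i += 1
            # crude throttle: sleep off whatever is left of this request's time slot
            time.sleep(max(0.0, interval - elapsed))
        return latencies

    # example: one worker, placeholder index and query, 60 s at up to 10 req/s
    results = query_worker("http://localhost:9200", "test-0",
                           [{"query": {"match_all": {}}}], 60, 10)
    times = sorted(t for t, status in results)
    print("requests: %d  median: %.3fs  95th: %.3fs"
          % (len(times), times[len(times) // 2], times[int(len(times) * 0.95)]))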
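For step 2, the client half on hosts B and C could be as simple as pulling a job description from the test service on host A and posting its numbers back when it finishes. Everything here is invented just to show the shape of it (the /job and /results endpoints and the job fields aren't real), and it reuses query_worker from the sketch above:

    import json

    import requests

    def run_test_client(coordinator_url):
        """Pull a job from the central test service, run it locally, and post the
        stats back so the service can aggregate across clients."""
        job = requests.get(coordinator_url + "/job").json()       # made-up endpoint
        results = query_worker(job["cluster"], job["index"], job["queries"],
                               job["duration"], job["max_rps"])
        requests.post(coordinator_url + "/results",               # made-up endpoint
                      data=json.dumps({"client": job["client_id"],
                                       "latencies": [t for t, status in results]}))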
On Thursday, January 30, 2014 4:41:06 PM UTC-8, Josh Harrison wrote:
>
> In our case, we're just interested in query stress testing. We've got a web app that queries our indexes, which are organized by weeks of the year, with a bunch of aliases making it so specific portions of the data can be reached easily. Questions about scaling the app have come up. In our case, that means testing through the app itself, which so far only makes queries. I figure we should load test our cluster directly too, so we can see whether any eventual bottleneck is somewhere in the app or on the cluster itself.
>
> So far I haven't been able to really max out the indexing rate on a system that is adequately equipped with resources, as far as I can tell. I've had 32 sub-process Python workers happily sending, I think, ~5+ million records an hour to our cluster with no problem in indexing speed or other response time when backloading some data.
>
> My current strategy is to take the ugliest heavy queries the application runs and simply use ABS or something similar to run them over HTTP with variables in a reasonable range. If I can make my cluster crash by doing that, I know that'll be my upper limit!
>
> On Thursday, January 30, 2014 3:59:19 PM UTC-8, Jörg Prante wrote:
>>
>> Just a few questions, because I'm also interested in load testing.
>>
>> What kind of stress do you think of? Random data? Wikipedia? Logfiles? Just queries? What about indexing? And what client? Java? Other scripting languages? How should the cluster be configured: one node? Two or more nodes? Index shards? Replicas? Etc., etc.
>>
>> There are so many variants and options out there; I believe this is one of the reasons why a compelling load testing tool is still missing.
>>
>> It would be nice to have a tool to upload ES performance profiles to a public web site, for checking how well an ES cluster is tuned in comparison to others. A unit of measure for comparing performance would need to be defined, e.g. "this cluster performs with a power factor of 1.0, this cluster has power factor 1.5, 2.0, ...".
>>
>> That's only possible when all software and hardware characteristics are properly taken into account, plus "application profiles" for typical workloads, so it can be decided which configuration is best for what purpose.
>>
>> Jörg