Hi,

A few pointers are in this Kafka user ML thread:
http://search-hadoop.com/m/uyzND177HP92xnm4e

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

On Sun, Sep 20, 2015 at 1:14 AM, David Luu <manga...@gmail.com> wrote:

> I'd like to generate load against a system we have that uses kafka as the
> message bus. We have a custom JSON message format, and to properly load
> test the system, each set of messages for a particular scenario (i.e. user)
> needs to have a unique identifier, which it normally does.
>
> I think of using record & playback technique to capture messages that
> correspond to a few users. Then play back those messages to generate load
> but to be realistic simulation and to scale up the load, I would:
>
> * re-use the captured user set to simulate additional users to scale up #
> of users against the system
>
> * for original set of users and when scaling beyond that for more users, I
> would dynamically replace the identifier in the captured messages with a
> unique one generated at runtime for each user. also replacing anything else
> that needs to like timestamps.
>
> As such, this would have to be a scripted solution. I don't think there is
> existing kafka-centric tool to assist with such testing is there?
>
> If not, I'd likely have to build my own. In which case, my question is what
> technology stack to use would be most suitable so that I can generate the
> highest amount of load with the least amount of load generating
> machines/hardware. Using threads, processes, or whatever. node.js, python,
> scala, ruby, java, .net, etc.
>
> thoughts, suggestions appreciated,
> David
>
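
For the identifier and timestamp replacement step described in the quoted message, a minimal sketch in Python might look like the following. It only covers rewriting a captured message set for one simulated user; sending the result to Kafka is left to whichever client library is chosen. The field names `user_id` and `timestamp`, the flat JSON layout, and numeric epoch-second timestamps are all assumptions for illustration, since the actual message format is not shown in the thread.

```python
import json
import time
import uuid


def rewrite_for_new_user(captured, id_field="user_id", ts_field="timestamp", now=None):
    """Return (new_id, messages): copies of the captured JSON messages with a
    fresh identifier and timestamps shifted to start at `now`.

    `id_field` and `ts_field` are hypothetical field names; adjust them to the
    real message format.
    """
    new_id = str(uuid.uuid4())  # unique identifier generated at runtime
    now = time.time() if now is None else now
    parsed = [json.loads(m) for m in captured]
    # Shift timestamps so the earliest message lands at `now`,
    # preserving the original gaps between messages.
    base = min(m[ts_field] for m in parsed) if parsed else 0
    rewritten = []
    for m in parsed:
        m = dict(m)  # shallow copy is enough for a flat message
        m[id_field] = new_id
        m[ts_field] = now + (m[ts_field] - base)
        rewritten.append(json.dumps(m))
    return new_id, rewritten
```

Calling `rewrite_for_new_user` once per simulated user gives each replayed scenario its own identifier, so a small captured set can be reused for any number of users.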