One thing I think would be really helpful is an easy way to spin up a cluster. Hadoop does this and it is nice. It helps companies that don't have standardized deployment infrastructure, and it helps with testing of all kinds.

Basically it would be nice if you could make a file containing one host:port per line and there were scripts like

  bin/kafka-cluster-deploy.sh machine-list.txt  # rsync the kafka directory and config around to the given list of machines
  bin/kafka-cluster-start.sh machine-list.txt   # ssh to each machine in the list and start it
  bin/kafka-cluster-stop.sh machine-list.txt    # ssh to each machine and kill the kafka process
  bin/kafka-cluster-delete.sh machine-list.txt  # ssh around to each machine in the list and delete the code and log directory
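
As a rough illustration, the start script might look something like the sketch below. It is a dry run: it only prints the ssh command it would issue for each host:port line. None of this exists yet. The function name, the /opt/kafka install path, and the "--brokerid" / "--port" overrides are all assumptions made for the example; kafka-server-start.sh does not accept such overrides today.

```shell
# Dry-run sketch of a possible bin/kafka-cluster-start.sh (hypothetical script).
# It prints one ssh command per "host:port" line instead of executing it.

# Assumed install location on the remote machines.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"

cluster_start_commands() {
  machine_list="$1"
  node_id=0
  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines and comment lines.
    case "$line" in
      ''|'#'*) continue ;;
    esac
    host="${line%%:*}"   # text before the first colon
    port="${line##*:}"   # text after the last colon
    # "--brokerid" and "--port" are hypothetical command-line overrides,
    # not flags the current start script understands.
    echo "ssh $host $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties --brokerid $node_id --port $port"
    node_id=$((node_id + 1))
  done < "$machine_list"
}
```

Calling `cluster_start_commands machine-list.txt` prints the commands, giving each machine a node id based on its position in the list. A real script would run them rather than echo them.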

To do this I think you would need a way to override properties on the command line. This would make it so you could rsync out the kafka code to a bunch of machines but give a different node id and (if needed) port. I think something like "--property-name value" to give properties that override what is in the properties file would work.

One other minor and unrelated issue we have is that our stop script actually kills all the kafka processes on the machine, which means if you are trying to run multiple nodes on the same machine it is a little dangerous.

-Jay

On Thu, Aug 4, 2011 at 4:59 PM, Chris Burroughs <chris.burrou...@gmail.com> wrote:

> I've been looking at how kafka is packaged and the provided scripts for
> running it. There are a few things I want to improve: it should be
> easier to run in the foreground or background, convincing java to keep
> the executable bits set, and reasonable log4j defaults. There are also
> a few plain old bugs detailed in KAFKA-81 that need to be fixed.
>
> If there are other things that would make the results of ./sbt package
> or release-zip more useful, or the bash scripts easier to work with,
> please let me know.
>
>
> Thanks,
> Chris Burroughs
>