Hi Alec,

A single-node Kafka cluster is not recommended for anything other than development. I highly recommend using a multi-node cluster and creating a partitioned topic with replication. This not only lets the cluster take in more data at higher rates, it also keeps the cluster running if a node fails, and because the topic is replicated there would not be significant data loss.
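As a rough illustration of how a partitioned topic spreads incoming messages, here is a minimal Python sketch of the two selection strategies a producer can use: round-robin and a key-based partition function. It does not use the Kafka client at all. The partition count of 12, the function names, and the CRC32 hash are assumptions made for the example, not Kafka's actual implementation.

```python
import itertools
import zlib

NUM_PARTITIONS = 12  # assumed partition count, for illustration only


def round_robin_partitioner(num_partitions):
    """Return a function that cycles through partitions 0..num_partitions-1."""
    counter = itertools.cycle(range(num_partitions))
    return lambda: next(counter)


def keyed_partition(key, num_partitions):
    """Map a key to a partition with a stable hash.

    CRC32 is an assumption chosen for this sketch; it is not the hash
    that Kafka's own partitioner uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Round-robin: consecutive messages land on consecutive partitions.
next_partition = round_robin_partitioner(NUM_PARTITIONS)
round_robin_choices = [next_partition() for _ in range(24)]

# Key-based: the same key always maps to the same partition.
keyed_choices = [keyed_partition("user-42", NUM_PARTITIONS) for _ in range(3)]
```

Round-robin evens out the load across partitions, while a key-based function keeps all messages for one key in the same partition, which matters if you need ordering per key.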
"If I am using multiple-nodes, the tradeoff is the connection time among different nodes?"

The Kafka producer API sends each message to a broker either round-robin or based on a partition function. Please go through the Kafka docs [1] for the simple consumer and for how replication works among multiple nodes.

-Harsha

On Tue, Sep 16, 2014, at 02:06 PM, Sa Li wrote:

Hi, All

I have been using a Kafka cluster on a single server with three brokers, but I am thinking of building a larger Kafka cluster, say 4 nodes (servers) with 3 brokers on each node, so 12 brokers in total. Would that be better than a single-node cluster? Or would a single node be good enough? The web API may push a million rows into the Kafka cluster every day, so I am a bit worried about whether the cluster can take that much data without losing any. If I am using multiple nodes, is the tradeoff the connection time among different nodes?

thanks

Alec

References

1. http://kafka.apache.org/documentation.html