@Jay, my bad. I mistook batch.size to be a number of messages rather than bytes. Below are revised measurements based on computing batch.size in bytes.
@Jun, with an explicit flush(), linger.ms should not have an impact, should it?

@Wang, larger batches are not necessarily giving better numbers, as you can see below. The two problems I noted earlier still exist in the batched sync mode (using flush()):
* batch.size still seems to be a factor even when set to a value larger than the bytes generated by the client
* 4 and 8 partitions see a big slowdown

Revised measurements for the new Producer API. All cases: single threaded, 1k event size.

Batched SYNC using flush(), acks=1

1 partition
                            Batch=4k   Batch=8k   Batch=16k
batch.size == clientBatch     140        124
batch.size = 10MB             140        123        124
batch.size = 20MB              31         30         42

4 partitions
                            Batch=4k   Batch=8k   Batch=16k
batch.size == clientBatch      60          8          6
batch.size = 10MB               7          7          7
batch.size = 20MB               6          6          5

8 partitions
                            Batch=4k   Batch=8k   Batch=16k
batch.size == clientBatch       7          8          8
batch.size = 10MB               7          8          7
batch.size = 20MB               6          6          6

For reference, I also took numbers for the default ASYNC mode with acks=1:

               batch.size=default   batch.size=4MB   batch.size=8MB   batch.size=16MB
1 partition           53                 130              113               76
4 partitions          84                 126                9                7
8 partitions           9                  12               10                5
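For reproducibility, here is a minimal sketch of the batched-sync loop I am measuring, using the new producer's send() plus flush(). The broker address, topic name, and total record count are placeholders, and I am assuming "Batch=8k" means 8 KB of client batch, i.e. eight 1k records per flush().

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FlushBenchSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        props.put("acks", "1");
        props.put("batch.size", Integer.toString(10 * 1024 * 1024)); // e.g. the 10MB row
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");

        int clientBatch = 8;             // records per flush(); Batch=8k at 1k events
        byte[] payload = new byte[1024]; // 1k event size
        int total = 100_000;             // placeholder record count

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        long start = System.currentTimeMillis();
        for (int i = 0; i < total; i += clientBatch) {
            for (int j = 0; j < clientBatch; j++) {
                producer.send(new ProducerRecord<byte[], byte[]>("test", payload));
            }
            producer.flush(); // block until the batch is acked -> "batched SYNC"
        }
        long elapsedMs = System.currentTimeMillis() - start;
        System.out.printf("%.1f MB/s%n",
                (total * 1024.0) / (1024.0 * 1024.0) / (elapsedMs / 1000.0));
        producer.close();
    }
}

As I understand it, the default ASYNC numbers above correspond to the same loop with the flush() call removed, leaving batching to the producer's own settings.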