@Jay,
My bad. I mistook batch.size to be a number of messages instead of bytes.
Below are revised measurements based on computing batch.size in bytes.
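Just to make the bytes-vs-messages distinction concrete at our 1k event size (a sketch; the helper name is mine):

```java
public class BatchSizing {
    // batch.size in the new producer is a byte budget per partition batch,
    // not a message count. At a 1 KB event size, a 16 KB batch.size caps a
    // batch at roughly 16 events (ignoring per-record overhead).
    static int eventsPerBatch(int batchSizeBytes, int eventSizeBytes) {
        return batchSizeBytes / eventSizeBytes;
    }

    public static void main(String[] args) {
        System.out.println(eventsPerBatch(16 * 1024, 1024)); // prints 16
    }
}
```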

@Jun,

   With explicit flush(), linger should have no impact. Isn't that right?

@Wang,
   Larger batches are not necessarily giving better numbers, as you can see
below.


The two problems I noted earlier still exist in the batched sync mode (using
flush()):

  *   batch.size still seems to be a factor even when set to a larger value
than the bytes generated by the client
  *   4 and 8 partitions see a big slowdown
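For reference, the batched-sync runs correspond roughly to producer settings along these lines (a sketch; the broker address is a placeholder, and linger left at its default is my assumption):

```
# producer config sketch for the batched-sync (flush()) runs
bootstrap.servers=localhost:9092   # placeholder
acks=1
linger.ms=0                        # rely on explicit flush(), not time-based batching
batch.size=16384                   # bytes, not messages; varied across the runs below
```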



Revised measurements for the new Producer API:

- All cases: single-threaded, 1k event size


Batched SYNC using flush(), acks=1

1 partition:

                                Batch=4k   Batch=8k   Batch=16k
    batch.size == clientBatch      140        124
    batch.size = 10MB              140        123        124
    batch.size = 20MB               31         30         42

4 partitions:

                                Batch=4k   Batch=8k   Batch=16k
    batch.size == clientBatch       60          8          6
    batch.size = 10M                 7          7          7
    batch.size = 20M                 6          6          5

8 partitions:

                                Batch=4k   Batch=8k   Batch=16k
    batch.size == clientBatch        7          8          8
    batch.size = 10M                 7          8          7
    batch.size = 20M                 6          6          6



Just for reference, I also took the numbers for the default ASYNC mode with acks=1:

                  batch.size=default   batch.size=4MB   batch.size=8MB   batch.size=16MB
1 partition               53                130              113                76
4 partitions              84                126                9                 7
8 partitions               9                 12               10                 5







