Hi Ewen,
Thank you for your response. It’s much faster after changing to async.
Cheers,
Huy, Le Van
On Thursday, Dec 11, 2014 at 7:08 p.m., Ewen Cheslack-Postava
e...@confluent.io, wrote:
Did you set producer.type to async when creating your producer? The console
producer uses async by default, but the default producer config is sync.
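
To illustrate the setting mentioned above, here is a minimal sketch of a producer configuration with producer.type set to async. It assumes the old 0.8-era producer API used in the code quoted below; the class name AsyncProducerConfig, the helper asyncProducerProps, the broker address, and the batching values are placeholders for illustration, not recommendations.

```java
import java.util.Properties;

public class AsyncProducerConfig {
    // Hypothetical helper: builds settings for the old (0.8-era) producer.
    static Properties asyncProducerProps(String brokerList) {
        Properties props = new Properties();
        props.put("metadata.broker.list", brokerList);
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Buffer messages and send them in batches from a background thread,
        // instead of blocking on every send() as sync mode does.
        props.put("producer.type", "async");
        // Illustrative batching values (assumptions, tune for your workload).
        props.put("batch.num.messages", "200");
        props.put("queue.buffering.max.ms", "100");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = asyncProducerProps("localhost:9092");
        System.out.println("producer.type=" + props.getProperty("producer.type"));
    }
}
```

With the Kafka client library on the classpath, these properties would typically be wrapped in a ProducerConfig and passed to the Producer constructor. Because async mode buffers messages in memory, it is worth calling producer.close() before the program exits so that queued messages are not lost.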
-Ewen
On Thu, Dec 11, 2014 at 6:08 AM, Huy Le Van
wrote:
Hi,
I’m writing my own producer that reads text files and sends them line by
line to a Kafka cluster. I've noticed that the producer is extremely slow:
it currently sends at about 57 KB/s per node, which is roughly 50-100 times
slower than using bin/kafka-console-producer.sh.
Here’s my producer:
    final File dir = new File(dataDir);
    List<File> files = new ArrayList<File>(Arrays.asList(dir.listFiles()));
    int key = 0;
    for (final File file : files) {
        try {
            BufferedReader br = new BufferedReader(new FileReader(file));
            for (String line = br.readLine(); line != null; line = br.readLine()) {
                KeyedMessage<String, String> data =
                    new KeyedMessage<String, String>(topic, Integer.toString(key++), line);
                producer.send(data);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
And partitioner:
    public int partition(Object key, int numPartitions) {
        String stringKey = (String) key;
        return Integer.parseInt(stringKey) % numPartitions;
    }
The only difference between kafka-console-producer.sh code and my code is
that I use a custom partitioner. I have no idea why it’s so slow.
Best regards,
Huy, Le Van
--
Thanks,
Ewen