Did you set producer.type to async when creating your producer? The console
producer uses async by default, but the default producer config is sync.
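
For reference, here is a rough sketch of what an async setup could look like with the old (0.8-era) producer config. The broker address, the batch values and the class name AsyncProducerProps are placeholders I made up for illustration, not tuned recommendations. The snippet only builds the Properties object so it runs without the Kafka jar; the comment in main shows where it would be handed to ProducerConfig.

```java
import java.util.Properties;

class AsyncProducerProps {
    // Builds settings for the old Scala producer (kafka.javaapi.producer.Producer).
    public static Properties build(String brokerList) {
        Properties props = new Properties();
        props.put("metadata.broker.list", brokerList);
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Default is "sync"; "async" queues messages and sends them in batches.
        props.put("producer.type", "async");
        // Illustrative batching values (assumptions, adjust for your workload).
        props.put("batch.num.messages", "200");
        props.put("queue.buffering.max.ms", "500");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("localhost:9092"); // placeholder broker address
        // With the Kafka jar on the classpath you would then create the producer:
        //   Producer<String, String> producer =
        //       new Producer<>(new ProducerConfig(props));
        props.list(System.out);
    }
}
```

If you also use a custom partitioner, it would be registered in the same Properties object through partitioner.class.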

-Ewen

On Thu, Dec 11, 2014 at 6:08 AM, Huy Le Van <huy.le...@insight-centre.org>
wrote:

> Hi,
>
>
> I’m writing my own producer that reads text files and sends them line by
> line to a Kafka cluster. I've noticed that the producer is extremely slow:
> it currently sends about 57 KB/s per node, roughly 50-100 times slower than
> bin/kafka-console-producer.sh.
>
>
> Here’s my producer:
> final File dir = new File(dataDir);
> List<File> files = new ArrayList<>(Arrays.asList(dir.listFiles()));
> int key = 0;
> for (final File file : files) {
>     // try-with-resources closes the reader even if a read fails
>     try (BufferedReader br = new BufferedReader(new FileReader(file))) {
>         for (String line = br.readLine(); line != null; line = br.readLine()) {
>             KeyedMessage<String, String> data =
>                 new KeyedMessage<>(topic, Integer.toString(key++), line);
>             producer.send(data);
>         }
>     } catch (IOException e) {
>         e.printStackTrace();
>     }
> }
>
>
>
> And partitioner:
> public int partition(Object key, int numPartitions) {
>     String stringKey = (String)key;
>     return Integer.parseInt(stringKey) % numPartitions;
> }
>
>
> The only difference between kafka-console-producer.sh code and my code is
> that I use a custom partitioner. I have no idea why it’s so slow.
>
> Best regards,
> Huy, Le Van




-- 
Thanks,
Ewen
