Hi,

[2017-04-17 18:14:05,868] ERROR Error when sending message to topic
big_ptns1_repl1_nozip with key: null, value: 55 bytes with error:
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Batch containing 8
record(s) expired due to timeout while requesting metadata from
brokers for big_ptns1_repl1_nozip-0

The data never left the producer: the batch expired before it could be
sent to a broker. So why should it appear in the consumer?
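One way to check this is to total the failures in the producer's own error output and compare that with what is missing on the consumer side. Below is a minimal Python sketch, assuming the producer's stderr was saved to a text file and that one "Error when sending message" entry is logged per failed record (my understanding of the callback, not something confirmed in this thread). The function name summarize_failures is made up for illustration.

```python
import re

# Assumption: one "Error when sending message" entry is logged per failed record.
FAILED = re.compile(
    r"ERROR Error when sending message to topic (\S+) "
    r"with key: .*?, value: (\d+) bytes"
)

def summarize_failures(log_text):
    """Return (failed_record_count, failed_value_bytes) for a producer error log."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    matches = FAILED.findall(flat)
    return len(matches), sum(int(size) for _topic, size in matches)

sample = """\
[2017-04-17 18:14:05,868] ERROR Error when sending message to topic
big_ptns1_repl1_nozip with key: null, value: 55 bytes with error:
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Batch containing 8
record(s) expired due to timeout while requesting metadata from
brokers for big_ptns1_repl1_nozip-0
"""

count, size = summarize_failures(sample)
print(count, size)
```

If the byte total from the log is close to the size difference between the input and output files, the loss is accounted for on the producer side.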
The loss rate is more or less similar in both runs: about 0.02 (130 KB /
5400 KB) vs. 0.03 (150 MB / 5000 MB), i.e. roughly 2-3%. Not so bad.
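For reference, the arithmetic behind those two ratios, as a small Python sketch. The sizes are the approximate figures quoted in the original message, and decimal units (1 MB = 1000 KB) are an assumption.

```python
KB, MB, GB = 10**3, 10**6, 10**9  # decimal units, assumed

def loss_rate(lost_bytes, total_bytes):
    """Fraction of the input that did not show up in the output."""
    return lost_bytes / total_bytes

small = loss_rate(130 * KB, 5400 * KB)  # complete works, about 5.4 MB
large = loss_rate(150 * MB, 5 * GB)     # replicated file, about 5 GB

print(f"{small:.1%}")  # 2.4%
print(f"{large:.1%}")  # 3.0%
```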


2017-04-18 21:46 GMT+02:00 jan <rtm4...@googlemail.com>:

> Hi all, I'm something of a kafka n00b.
> I posted the following in the Google newsgroup but haven't had a reply
> or even a single read, so I'll try here. My original msg, slightly
> edited, was:
>
> ----
>
> (windows 2K8R2 fully patched, 16GB ram, fairly modern dual core xeon
> server, latest version of java)
>
> I've spent several days trying to sort out unexpected behaviour
> involving kafka and the kafka console producer and consumer.
>
> If I set the console producer and console consumer to look at the
> same topic then I can type lines into the producer window and see them
> appear in the consumer window, so it works.
>
> If I try to pipe in large amounts of data to the producer, some gets
> lost and the producer reports errors, e.g.:
>
> [2017-04-17 18:14:05,868] ERROR Error when sending message to topic
> big_ptns1_repl1_nozip with key: null, value: 55 bytes with error:
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.TimeoutException: Batch containing 8
> record(s) expired due to timeout while requesting metadata from
> brokers for big_ptns1_repl1_nozip-0
>
> As input I'm using either a file of Shakespeare's full works (about 5.4
> meg ASCII), or a much larger file of Shakespeare's full works
> replicated 900 times to make it about 5GB. Lines are ASCII and short,
> and each line should be a single record when read in by the console
> producer. I need to do some benchmarking on time and space and this
> was my first try.
>
> As mentioned, data gets lost. I presume it is expected that any data
> we pipe into the producer should arrive in the consumer, so if I do
> this in one windows console:
>
> kafka-console-consumer.bat --bootstrap-server localhost:9092  --topic
> big_ptns1_repl1_nozip --zookeeper localhost:2181 >
> F:\Users\me\Desktop\shakespear\single_all_shakespear_OUT.txt
>
> and this in another:
>
> kafka-console-producer.bat --broker-list localhost:9092  --topic
> big_ptns1_repl1_nozip <
> F:\Users\me\Desktop\shakespear\complete_works_no_bare_lines.txt
>
> then the output file "single_all_shakespear_OUT.txt" should be
> identical to the input file "complete_works_no_bare_lines.txt" except
> it's not. For the complete works (about 5.4 meg uncompressed) I lost
> about 130K in the output.
> For the replicated shakespeare, which is about 5GB, I lost about 150 meg.
>
> This can't be right, surely. It's repeatable, but the point in the
> file at which errors start to be produced differs from run to run, it
> seems.
>
> I've done this using all 3 versions of Kafka in the 0.10.x.y branch
> and I get the same problem (the above commands were using the 0.10.0.0
> branch so they look a little obsolete but they are right for that
> branch I think). It's cost me some days.
> So, am I making a mistake, and if so, what?
>
> thanks
>
> jan
>
