Kafka can be tuned for stronger delivery guarantees, but such guarantees
come at the cost of latency and throughput (as they do in many similar
systems). If you are doing a simple end-to-end test, you may want to tune
the producer's "acks" configuration setting to ensure you aren't dropping
any messages during the test.
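
For what it's worth, here is a minimal sketch of producer settings that
favor delivery guarantees over throughput. The config keys (acks, retries,
request.timeout.ms, max.block.ms) are standard producer settings, but the
values and the helper name render_properties are only assumptions for
illustration, so tune them for your own setup. The script prints the
settings in properties-file form; I'm assuming you can save that output
and pass it to your producer (for example via the console producer's
--producer.config option, if your version has one).

```python
# Sketch only: the values below are illustrative assumptions, not recommendations.
DURABLE_PRODUCER_SETTINGS = {
    "bootstrap.servers": "localhost:9092",  # broker address used in this thread
    "acks": "all",                   # wait for all in-sync replicas to acknowledge
    "retries": "5",                  # retry transient send failures instead of dropping
    "request.timeout.ms": "60000",   # allow slow brokers more time before giving up
    "max.block.ms": "120000",        # let send() block longer when the buffer is full
}


def render_properties(settings):
    """Render a dict of settings as Java-style properties text (key=value lines)."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Print the properties text; redirect it to a file such as producer.properties.
    print(render_properties(DURABLE_PRODUCER_SETTINGS), end="")
```

With acks=all each send waits for acknowledgement from the in-sync
replicas, so expect lower throughput than with the defaults.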

On Tue, Apr 18, 2017 at 5:02 PM, Serega Sheypak <serega.shey...@gmail.com>
wrote:

> > err, isn't it supposed to? Isn't the loss of data a very serious error?
> Kafka can't fix networking issues like latency, blinking, unavailability
> or any other weird stuff. Kafka promises to persist data once the data
> reaches Kafka. Delivering the data to Kafka is your responsibility, and
> according to the logs that delivery failed.
>
> 0.02% not 2%
> You should check the broker logs to figure out what went wrong. Everything
> runs on one machine, as far as I understand. Maybe your brokers don't have
> enough memory, get stuck in GC and don't respond to the producer. The
> async producer then fails to send data. That is why you observe data loss
> on the consumer side.
>
>
> 2017-04-18 23:32 GMT+02:00 jan <rtm4...@googlemail.com>:
>
> > Hi Serega,
> >
> > > data didn't reach producer. So why should data appear in consumer?
> >
> > err, isn't it supposed to? Isn't the loss of data a very serious error?
> >
> > > loss rate is more or less similar [...] Not so bad.
> >
> > That made me laugh at least.  Is kafka intended to be a reliable
> > message delivery system, or is a 2% data loss officially acceptable?
> >
> > I've been reading the other threads and one says windows is really not
> > supported, and certainly not for production. Perhaps that's the root
> > of it. Well I'm hoping to try it on linux shortly so I'll see if I can
> > replicate the issue but I would like to know whether it *should* work
> > in windows.
> >
> > cheers
> >
> > jan
> >
> > On 18/04/2017, Serega Sheypak <serega.shey...@gmail.com> wrote:
> > > Hi,
> > >
> > > [2017-04-17 18:14:05,868] ERROR Error when sending message to topic
> > > big_ptns1_repl1_nozip with key: null, value: 55 bytes with error:
> > > > (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> > > org.apache.kafka.common.errors.TimeoutException: Batch containing 8
> > > record(s) expired due to timeout while requesting metadata from
> > > brokers for big_ptns1_repl1_nozip-0
> > >
> > > data didn't reach producer. So why should data appear in consumer?
> > > > loss rate is more or less similar : 0.02 (130k / 5400mb) ~ 0.03%
> > > > (150mb / 5000gb) Not so bad.
> > >
> > >
> > > 2017-04-18 21:46 GMT+02:00 jan <rtm4...@googlemail.com>:
> > >
> > >> Hi all, I'm something of a kafka n00b.
> > >> I posted the following in the  google newsgroup, haven't had a reply
> > >> or even a single read so I'll try here. My original msg, slightly
> > >> edited, was:
> > >>
> > >> ----
> > >>
> > >> (windows 2K8R2 fully patched, 16GB ram, fairly modern dual core xeon
> > >> server, latest version of java)
> > >>
> > >> I've spent several days trying to sort out unexpected behaviour
> > >> involving kafka and the kafka console producer and consumer.
> > >>
> > > >> If I set the console producer and console consumer to look at the
> > >> same topic then I can type lines into the producer window and see them
> > >> appear in the consumer window, so it works.
> > >>
> > >> If I try to pipe in large amounts of data to the producer, some gets
> > >> lost and the producer reports errors eg.
> > >>
> > >> [2017-04-17 18:14:05,868] ERROR Error when sending message to topic
> > >> big_ptns1_repl1_nozip with key: null, value: 55 bytes with error:
> > > >> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> > >> org.apache.kafka.common.errors.TimeoutException: Batch containing 8
> > >> record(s) expired due to timeout while requesting metadata from
> > >> brokers for big_ptns1_repl1_nozip-0
> > >>
> > > >> I'm using as input either a file of Shakespeare's full works (about
> > > >> 5.4 meg ascii), or a much larger file of Shakespeare's full works
> > > >> replicated 900 times to make it about 5GB. Lines are ascii and short,
> > >> and each line should be a single record when read in by the console
> > >> producer. I need to do some benchmarking on time and space and this
> > >> was my first try.
> > >>
> > >> As mentioned, data gets lost. I presume it is expected that any data
> > >> we pipe into the producer should arrive in the consumer, so if I do
> > >> this in one windows console:
> > >>
> > >> kafka-console-consumer.bat --bootstrap-server localhost:9092  --topic
> > >> big_ptns1_repl1_nozip --zookeeper localhost:2181 >
> > >> F:\Users\me\Desktop\shakespear\single_all_shakespear_OUT.txt
> > >>
> > >> and this in another:
> > >>
> > >> kafka-console-producer.bat --broker-list localhost:9092  --topic
> > >> big_ptns1_repl1_nozip <
> > >> F:\Users\me\Desktop\shakespear\complete_works_no_bare_lines.txt
> > >>
> > >> then the output file "single_all_shakespear_OUT.txt" should be
> > >> identical to the input file "complete_works_no_bare_lines.txt" except
> > > >> it's not. For the complete works (about 5.4 meg uncompressed) I lost
> > > >> about 130K in the output.
> > > >> For the replicated shakespeare, which is about 5GB, I lost about 150
> > > >> meg.
> > >>
> > >> This can't be right surely and it's repeatable but happens at
> > >> different places in the file when errors start to be produced, it
> > >> seems.
> > >>
> > > >> I've done this using all 3 versions of kafka in the 0.10.x.y branch
> > > >> and I get the same problem (the above commands were using the 0.10.0.0
> > > >> branch so they look a little obsolete but they are right for that
> > > >> branch I think). It's cost me some days.
> > > >> So, am I making a mistake, and if so, what?
> > >>
> > >> thanks
> > >>
> > >> jan
> > >>
> > >
> >
>



-- 
Robert Quinlivan
Software Engineer, Signal
