Hi
Is the message CRC also stored on disk, and does the server verify it when
reading messages back from disk?
Also, how should errors be handled when we use async publish?
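
A minimal sketch of the check described in the quoted reply below: a CRC is
stored next to the message bytes, and the reader recomputes it and compares.
This is not Kafka's actual on-disk or wire format. The 4-byte big-endian
prefix and the function names attach_crc and verify_crc are assumptions made
only for illustration.

```python
import zlib


def attach_crc(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 as a 4-byte big-endian integer.

    Hypothetical framing for illustration; not Kafka's message layout.
    """
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    return crc.to_bytes(4, "big") + payload


def verify_crc(framed: bytes) -> bytes:
    """Return the payload if the stored and computed CRCs match, else raise."""
    if len(framed) < 4:
        raise ValueError("frame too short to contain a CRC")
    stored = int.from_bytes(framed[:4], "big")
    payload = framed[4:]
    computed = zlib.crc32(payload) & 0xFFFFFFFF
    if stored != computed:
        raise ValueError(
            f"CRC mismatch: stored={stored:#010x} computed={computed:#010x}"
        )
    return payload
```

With this framing, verify_crc(attach_crc(data)) returns data unchanged, and
flipping any single bit of the payload makes verify_crc raise ValueError.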

On Fri, Mar 25, 2016 at 4:17 AM, Becket Qin <becket....@gmail.com> wrote:

> You mentioned that you saw a few corrupted messages (< 0.1%). If so, are you
> able to see some corrupted messages if you produce, say, 10M messages?
>
> On Wed, Mar 23, 2016 at 9:40 PM, sunil kalva <kalva.ka...@gmail.com>
> wrote:
>
> > I am using the Java client and Kafka 0.8.2. Since the events are corrupted
> > in the Kafka broker, I can't read and replay them again.
> >
> > On Thu, Mar 24, 2016 at 9:42 AM, Becket Qin <becket....@gmail.com>
> > wrote:
> >
> > > Hi Sunil,
> > >
> > > Messages in Kafka have a CRC stored with each of them. When the consumer
> > > receives a message, it computes the CRC from the message bytes and
> > > compares it to the stored CRC. If the computed CRC and the stored CRC do
> > > not match, that indicates the message has been corrupted. I am not sure
> > > why the message is corrupted in your case. Corrupted messages seem to be
> > > pretty rare, because the broker actually validates the CRC before it
> > > stores the messages on disk.
> > >
> > > Is this problem reproducible? If so, can you find out which messages
> > > are corrupted? Also, are you using the Java clients or some other
> > > clients?
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Mar 23, 2016 at 8:28 PM, sunil kalva <kalva.ka...@gmail.com>
> > > wrote:
> > >
> > > > Can someone help me out here?
> > > >
> > > > On Wed, Mar 23, 2016 at 7:36 PM, sunil kalva <kalva.ka...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi
> > > > > I am seeing a few messages getting corrupted in Kafka. It is not
> > > > > happening frequently, and the percentage is very small (less than
> > > > > 0.1%).
> > > > >
> > > > > Basically, I am publishing Thrift events as byte arrays to Kafka
> > > > > topics (without an encoding like base64), and I also see more events
> > > > > than I publish (I confirm this by looking at the offset for that
> > > > > topic). For example, if I publish 100 events, I see 110 as the offset
> > > > > for that topic. (Since it is in production, I could not get the exact
> > > > > messages that are causing this problem, and we only notice the
> > > > > problem when we consume, because our Thrift deserialization fails.)
> > > > >
> > > > > So my question is: is there a magic byte that determines the
> > > > > boundary of a message, and which could be the same as a byte I am
> > > > > sending? Or, because of network issues, can one message get chopped
> > > > > and stored as multiple messages on the server side?
> > > > >
> > > > > tx
> > > > > SunilKalva
> > > > >
> > > >
> > >
> >
>
