Metric for follower's replication latency

2018-05-23 Thread Uddhav Arote
Hello all,

I have a set of brokers, and I want to know how much time my brokers
spend replicating a message.
I have two questions:
1. Is there any available metric for replication latency (avg, max)? I could not find one.
2. Why is there no metric for understanding a broker's replication behavior?
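
A rough proxy does exist today: the broker exposes the follower's maximum
fetch lag (in messages, not time) over JMX. A minimal sketch, assuming JMX
is enabled on the broker at localhost:9999:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Reads the documented broker gauge for the max lag in messages between
    // follower and leader replicas. This is backlog, not latency, but it is
    // the closest built-in signal for replication delay.
    public class ReplicaLagProbe {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = connector.getMBeanServerConnection();
                ObjectName maxLag = new ObjectName(
                        "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica");
                System.out.println("MaxLag (messages) = " + mbs.getAttribute(maxLag, "Value"));
            } finally {
                connector.close();
            }
        }
    }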

Uddhav


Re: Keeping track of ingest time of messages in pipeline (cluster 1-> mm -> cluster 2 -> ..)

2018-04-23 Thread Uddhav Arote
Any thoughts here?


Keeping track of ingest time of messages in pipeline (cluster 1-> mm -> cluster 2 -> ..)

2018-04-22 Thread Uddhav Arote
Hi,

The v1 message format is:

   v1 (supported since 0.10.0)
   Message => Crc MagicByte Attributes Timestamp Key Value
     Crc => int32
     MagicByte => int8
     Attributes => int8
     Timestamp => int64
     Key => bytes
     Value => bytes


Would it be a good suggestion to have a message format like this?

   v1 (proposed modification)
   Message => Crc MagicByte Attributes Timestamp Key Value
     Crc => int32
     MagicByte => int8
     Attributes => int8
     Timestamp => int64[] <-- array to hold the ingest time from each broker
     Key => bytes
     Value => bytes


This would be a useful feature for keeping track of the ingest time of a
message (set) downstream, and for detailed latency calculation: knowing
exactly where a message was at a given time.
Is there any plan to include this, or is there a reason for not adding it
to the message format?

Yes, the header would become variable length, but couldn't there be a
configuration parameter to limit the size of this array?
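
As a side note, record headers (KIP-82, available since 0.11) could carry
per-hop timestamps without any message format change. A rough sketch,
assuming byte-array serializers and a made-up header key "hop-ts":

    import java.nio.ByteBuffer;
    import java.util.Map;
    import org.apache.kafka.clients.producer.ProducerInterceptor;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    // Hypothetical interceptor: stamps each outgoing record with this hop's
    // send time. A downstream consumer reads all "hop-ts" headers to see
    // where the message was at each point in the pipeline.
    public class HopTimestampInterceptor implements ProducerInterceptor<byte[], byte[]> {

        @Override
        public ProducerRecord<byte[], byte[]> onSend(ProducerRecord<byte[], byte[]> record) {
            byte[] now = ByteBuffer.allocate(Long.BYTES)
                                   .putLong(System.currentTimeMillis())
                                   .array();
            record.headers().add("hop-ts", now); // duplicate header keys are allowed
            return record;
        }

        @Override
        public void onAcknowledgement(RecordMetadata metadata, Exception exception) {}

        @Override
        public void close() {}

        @Override
        public void configure(Map<String, ?> configs) {}
    }

Each producer in the pipeline (including MirrorMaker's) would register this
via interceptor.classes, so a message accumulates one "hop-ts" header per hop.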

Uddhav


Re: Compression in Kafka

2018-02-14 Thread Uddhav Arote
Oh, that makes sense.
So, to summarize:
1. Producer and broker compression codecs differ: the broker decompresses
and re-compresses the message batches.
2. Producer and broker compression codecs are the same (lz4 & lz4): the
broker retains the producer compression. **
3. Broker compression codec is 'producer' (lz4 and 'producer'): the broker
retains the original compression codec.

** Are cases 2 and 3 the same process?
Why does the broker need to recompress when the codecs differ?
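
For reference, the outputs quoted below come from the log dump tool; a
command along these lines (log path is illustrative) prints the stored
codec per message:

    bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
        --files /tmp/kafka-logs/test-0/00000000000000000000.log \
        --deep-iteration --print-data-log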

Thanks

On Wed, Feb 14, 2018 at 8:53 PM, Manikumar <manikumar.re...@gmail.com>
wrote:

> It is not double compression. When I say re-compression, the brokers
> decompress the messages and compress them again with the new codec.
>
> On Wed, Feb 14, 2018 at 5:18 PM, Uddhav Arote <aroteudd...@gmail.com>
> wrote:
>
> > Thanks.
> >
> > I am using the console-producer with the following settings, with lz4 as
> > the broker compression codec:
> > 1. no producer compression codec (none)
> > 2. snappy producer compression codec
> > 3. lz4 producer compression codec
> >
> > I send a 354-byte message with each of the above settings. However, I do
> > not see any kind of double compression happening when the producer and
> > broker compression codecs are different.
> >
> > *Output 1:*
> > offset: 0 position: 0 CreateTime: 1518607686194 isvalid: true payloadsize: 61 magic: 1 compresscodec: LZ4CompressionCodec crc: 3693540371 payload: compressed 354-byte message
> >
> > using --deep-iteration
> > offset: 0 position: 0 CreateTime: 1518607686194 isvalid: true payloadsize: 354 magic: 1 compresscodec: NoCompressionCodec crc: 4190573446 payload: 354-byte message
> >
> > *Output 2:*
> > offset: 7 position: 517 CreateTime: 1518608039723 isvalid: true payloadsize: 61 magic: 1 compresscodec: LZ4CompressionCodec crc: 4075439033 payload: compressed 354-byte message
> >
> > using --deep-iteration
> > offset: 7 position: 517 CreateTime: 1518608039723 isvalid: true payloadsize: 354 magic: 1 compresscodec: NoCompressionCodec crc: 4061704088 payload: same 354-byte message
> >
> > *Output 3:*
> > offset: 11 position: 883 CreateTime: 1518608269618 isvalid: true payloadsize: 61 magic: 1 compresscodec: LZ4CompressionCodec crc: 981370250 payload: compressed 354-byte message
> >
> > using --deep-iteration
> > offset: 11 position: 883 CreateTime: 1518608269618 isvalid: true payloadsize: 354 magic: 1 compresscodec: NoCompressionCodec crc: 468622988 payload: same 354-byte message
> >
> > Please note the compression codecs in the --deep-iteration case.
> > Case 1 is OK, but in case 2 shouldn't it be SnappyCompressionCodec, and
> > in case 3 LZ4CompressionCodec?
> >
> > Or is this only visible when messages are batched into larger batches?
> >
> > Thanks
> > Uddhav
> >
> > On Wed, Feb 14, 2018 at 6:05 PM, Manikumar <manikumar.re...@gmail.com>
> > wrote:
> >
> > > If the broker "compression.type" is "producer", then the broker
> > > retains the original compression codec set by the producer.
> > > If the producer and broker codecs are different, then the broker
> > > recompresses the data using the broker "compression.type".


Compression in Kafka

2018-02-13 Thread Uddhav Arote
Hi Kafka users,

I am trying to understand the behavior of compression in Kafka. Consider a
scenario where the producer sets compression.codec to "snappy" and the
broker's compression.codec to "lz4".
In this scenario, what is the behavior of the compression?
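
For concreteness, here is a minimal sketch of the producer side of this
scenario (bootstrap address and topic name are made up; the broker side
would set compression.type=lz4 in server.properties):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    // Batches leave this client snappy-compressed; with broker
    // compression.type=lz4, the broker decompresses each batch on arrival
    // and recompresses it with lz4 before writing it to the log.
    public class SnappyProducerDemo {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("test-topic", "key", "hello"));
            }
        }
    }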

As far as I have understood, the following happens: the messages compressed
by the producer are wrapped in a wrapper message and sent to the broker. If
the broker compression.codec is "producer", the message is written as-is to
the log.
In the code,

https://github.com/apache/kafka/blob/962bc638f9c2ab249e5008a587ee78e3ba35fcb9/core/src/main/scala/kafka/log/LogValidator.scala#L218


what I understand is that if the producer and broker codecs are not the
same, then the compression should happen again.
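
To restate my understanding as a runnable sketch (a paraphrase of the cases
above, not the actual LogValidator code):

    public class CompressionDecision {
        // "producer" as the broker codec means: keep whatever the producer used.
        static String effectiveCodec(String producerCodec, String brokerCodec) {
            return brokerCodec.equals("producer") ? producerCodec : brokerCodec;
        }

        // The broker should decompress and recompress only when the effective
        // target codec differs from the codec the producer sent.
        static boolean needsRecompression(String producerCodec, String brokerCodec) {
            return !effectiveCodec(producerCodec, brokerCodec).equals(producerCodec);
        }

        public static void main(String[] args) {
            System.out.println(needsRecompression("snappy", "lz4"));   // true
            System.out.println(needsRecompression("lz4", "lz4"));      // false
            System.out.println(needsRecompression("lz4", "producer")); // false
        }
    }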

But I am not sure about this. Can somebody tell me how this works?

Thanks,
Uddhav