Re: data loss - replicas

2015-06-24 Thread Nirmal ram
I ran the following on both servers:

  kafka-run-class.sh kafka.tools.DumpLogSegments --files /tmp/kafka-logs/jun8-6/21764229.log

> It seems you might have run that on the last log segment. Can you run it
> on 21764229.log on both brokers and compare? I'm ...
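For reference, a minimal sketch of capturing and diffing that dump on both brokers (the output file names are illustrative, not from the thread):

  # on each broker, write the dump of the same segment to a file
  kafka-run-class.sh kafka.tools.DumpLogSegments \
      --files /tmp/kafka-logs/jun8-6/21764229.log > /tmp/dump-broker1.txt
  # copy one dump across, then compare entry by entry
  diff /tmp/dump-broker1.txt /tmp/dump-broker2.txt | head -20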

Re: data loss - replicas

2015-06-23 Thread Joel Koshy
It seems you might have run that on the last log segment. Can you run it on 21764229.log on both brokers and compare? I'm guessing there may be a message-set with a different compression codec that may be causing this.

Thanks,
Joel

On Tue, Jun 23, 2015 at 01:06:16PM +0530, nirmal wrote: ...
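If that hypothesis is right, the codec of each message set shows up directly in the dump output. A quick sketch for surfacing a codec mismatch between the two dumps (file names are illustrative):

  # count message sets per codec in each dump; a mismatch points at the divergent batch
  grep -o 'compresscodec: [A-Za-z]*' /tmp/dump-broker1.txt | sort | uniq -c
  grep -o 'compresscodec: [A-Za-z]*' /tmp/dump-broker2.txt | sort | uniq -c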

Re: data loss - replicas

2015-06-23 Thread Todd Palino
Thanks, Joel. I know I remember a case where we had a difference like this between two brokers, and it was not due to retention settings or some other problem, but I can't remember exactly what we determined it was.

-Todd

On Mon, Jun 22, 2015 at 4:22 PM, Joel Koshy wrote:
> The replicas do not ...

Re: data loss - replicas

2015-06-23 Thread nirmal
Hi, I ran DumpLogSegments.

*Broker 1*
offset: 23077447 position: 1073722324 isvalid: true payloadsize: 431 magic: 0 compresscodec: NoCompressionCodec crc: 895349554

*Broker 2*
offset: 23077447 position: 1073740131 isvalid: true payloadsize: 431 magic: 0 compresscodec: NoCompressionCodec crc: ...
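Note what the dump already shows: the same offset (23077447) with the same payload size and codec sits at different byte positions on the two brokers, a difference of 1073740131 - 1073722324 = 17807 bytes. That is consistent with identical messages being laid out differently on disk rather than with missing data. A sketch for finding where the positions first drift apart (assumes header lines are identical in both dumps and both cover the same offset range):

  # print offset/position pairs from each dump side by side and report the first divergence
  paste <(awk '{print $2, $4}' /tmp/dump-broker1.txt) \
        <(awk '{print $2, $4}' /tmp/dump-broker2.txt) | \
    awk '$1 == $3 && $2 != $4 {print "offset", $1, "position delta", $4 - $2; exit}'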

Re: data loss - replicas

2015-06-22 Thread Joel Koshy
The replicas do not have to decompress/recompress so I don't think that would contribute to this. There may be some corner cases such as:

- Multiple unclean leadership elections in sequence
- Changing the compression codec for a topic on the fly (see the sketch below) - different brokers may see this config change at ...
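The "on the fly" codec change refers to the per-topic compression override. A sketch of what such a change looks like, using kafka-configs.sh syntax from later broker releases (on 0.8.x the equivalent was kafka-topics.sh --alter --config; the ZooKeeper address and codec are illustrative):

  # override the topic's codec; per Joel's point above, different brokers
  # may notice the change at different times
  kafka-configs.sh --zookeeper zkhost:2181 --alter \
      --entity-type topics --entity-name jun8 \
      --add-config compression.type=gzip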

Re: data loss - replicas

2015-06-22 Thread Todd Palino
I assume that you are considering the data loss to be the difference in size between the two directories? This is generally not a good guideline, as the batching and compression will be different between the two replicas.

-Todd

On Mon, Jun 22, 2015 at 7:26 AM, Nirmal ram wrote:
> Hi,
>
> I not ...
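A more reliable check than directory size is to compare offsets rather than bytes. A minimal sketch with the stock GetOffsetShell tool (the broker address is illustrative; --time -1 asks for the latest offset):

  # log end offset per partition of the topic
  kafka-run-class.sh kafka.tools.GetOffsetShell \
      --broker-list broker1:9092 --topic jun8 --time -1

This reports the leader's view per partition, so it complements, rather than replaces, the per-segment dump comparison discussed above.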

data loss - replicas

2015-06-22 Thread Nirmal ram
Hi,

I noticed a data loss while storing in Kafka logs. Generally the leader hands the request to the followers; is there data loss in that process?

Topic 'jun8' with 2 replicas and 8 partitions.

*Broker 1*
[user@ jun8-6]$ ls -ltr
total 7337500
-rw-rw-r-- 1 user user 1073741311 Jun 22 12:45 000 ...
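A useful first check before comparing bytes on disk is whether both replicas are actually in sync. A sketch with the stock topic tool (the ZooKeeper address is illustrative):

  # partitions whose ISR lists both brokers have followers caught up
  # to the leader, which argues against actual data loss
  kafka-topics.sh --zookeeper zkhost:2181 --describe --topic jun8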