Re: Segments being deleted too early after upgrading 0.9.0.1 to 0.10.1.0

2016-11-01 Thread Jun Rao
Hi, James, That's a good point. KAFKA-3802 should cause the log segments to be kept longer, not shorter, so there is probably something else causing this behavior. Could you see if you can reproduce this? When you do, one thing you could try is to set log.segment.delete.delay.m…
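The setting Jun names is cut off in the archive preview; assuming it is the broker property log.segment.delete.delay.ms (the delay between a segment being scheduled for deletion and its file actually being removed from disk), raising it would give time to inspect segments before they disappear. A hedged server.properties sketch:

```properties
# server.properties (sketch; assumes the truncated setting is
# log.segment.delete.delay.ms, whose default is 60000 ms)
# Keep scheduled-for-deletion segment files around for 1 hour
# so they can be examined before removal:
log.segment.delete.delay.ms=3600000
```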

Re: Segments being deleted too early after upgrading 0.9.0.1 to 0.10.1.0

2016-10-31 Thread James Brown
KAFKA-3802 does seem plausible; I had to restart the brokers again after the 0.10.1.0 upgrade to change some JVM settings; maybe that touched the mtime on the files? Not sure why that would make them *more* likely to be deleted, though, since their mtime should've gone into the future, not into the…
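James's reasoning can be made concrete with a sketch of the 0.9-style, mtime-based retention check (function names here are hypothetical, for illustration only): a segment expires when `now - mtime` exceeds the retention window, so a *refreshed* (newer) mtime shrinks that difference and should delay deletion rather than trigger it.

```python
import time

# log.retention.hours=336 from the original report, in milliseconds
RETENTION_MS = 336 * 60 * 60 * 1000

def mtime_expired(segment_mtime_ms: int, now_ms: int) -> bool:
    """0.9-style check: expire a segment when now - mtime > retention.

    Touching a file moves its mtime forward, making this difference
    smaller, so a restart that touches segment files should make them
    *less* likely to be deleted, not more.
    """
    return now_ms - segment_mtime_ms > RETENTION_MS

now = int(time.time() * 1000)
old_segment = now - 400 * 60 * 60 * 1000  # last modified 400 hours ago
touched_segment = now                      # mtime refreshed just now

assert mtime_expired(old_segment, now)         # past the 336 h window
assert not mtime_expired(touched_segment, now) # freshly touched, kept
```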

Re: Segments being deleted too early after upgrading 0.9.0.1 to 0.10.1.0

2016-10-31 Thread Jun Rao
Hi, James, Thanks for testing and reporting this. What you observed is actually not the expected behavior in 0.10.1 based on the design. The way that retention works in 0.10.1 is that if a log segment has at least one message with a timestamp, we will use the largest timestamp in that segment to d…
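The rule Jun describes can be sketched as follows (a hypothetical illustration of the design, not Kafka's actual implementation): if any message in the segment carries a timestamp, retention is measured from the largest such timestamp; otherwise the broker falls back to the file's mtime, as 0.9 did.

```python
from typing import Optional

def segment_base_time_ms(largest_msg_timestamp_ms: Optional[int],
                         mtime_ms: int) -> int:
    """Pick the reference time for retention, per the 0.10.1 design:
    largest message timestamp if the segment has one, else file mtime."""
    if largest_msg_timestamp_ms is not None:
        return largest_msg_timestamp_ms
    return mtime_ms

def expired(largest_ts: Optional[int], mtime_ms: int,
            now_ms: int, retention_ms: int) -> bool:
    """A segment is eligible for deletion once its base time falls
    outside the retention window."""
    return now_ms - segment_base_time_ms(largest_ts, mtime_ms) > retention_ms
```

One consequence of this design: segments written by a 0.9 producer carry no message timestamps, so after the upgrade their retention should still be governed by mtime, which is why Jun says early deletion is not the expected behavior.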

Re: Segments being deleted too early after upgrading 0.9.0.1 to 0.10.1.0

2016-10-31 Thread James Brown
Incidentally, I'd like to note that this did *not* occur in my testing environment (which didn't expire any unexpected segments after upgrading), so if it is a feature, it's certainly a hit-or-miss one. On Mon, Oct 31, 2016 at 4:14 PM, James Brown wrote: > I just finished upgrading our main prod…

Segments being deleted too early after upgrading 0.9.0.1 to 0.10.1.0

2016-10-31 Thread James Brown
I just finished upgrading our main production cluster to 0.10.1.0 (from 0.9.0.1) with an online rolling upgrade, and I noticed something strange: the leader for one of our big partitions just decided to expire all of the logs from before the upgrade. I have log.retention.hours set to 336 in my co…
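For reference, the retention window in the report works out to exactly two weeks; a quick sketch of the cutoff arithmetic (the variable names are illustrative, not Kafka internals):

```python
import time

RETENTION_HOURS = 336  # log.retention.hours from the report
retention_ms = RETENTION_HOURS * 3600 * 1000  # 1,209,600,000 ms = 14 days

now_ms = int(time.time() * 1000)
# Segments whose reference time predates this cutoff are eligible
# for deletion; pre-upgrade segments well inside the 14-day window
# should NOT have been expired.
cutoff_ms = now_ms - retention_ms

assert RETENTION_HOURS / 24 == 14.0
```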