Jan,

Currently, there is no switch to disable the time-based index.

There are quite a few use cases for the time-based index.

1. From KIP-33's wiki, it allows us to do time-based retention accurately.
Before KIP-33, time-based retention was based on the last modified time
of each log segment. The main issue is that the last modified time can change
over time. For example, if a broker loses storage and has to re-replicate
all data, those re-replicated segments will be retained much longer since
their last modified time is more recent. Having a time-based index allows
us to retain segments based on the message time, not the last modified
time. This can also benefit KIP-71, where we want to combine time-based
retention and compaction.

2. In KIP-58, we want to delay log compaction by a configurable
amount of time. The time-based index allows us to do this more accurately.

3. We plan to add an API in the consumer to allow seeking to an offset
based on a timestamp. The time-based index allows us to do this more
accurately and quickly.
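
To make points 1 and 3 concrete, here is a rough Python sketch of a sparse
time index. It is only an illustration, not the broker code: the function
names, the tuple layouts, and the every_n parameter are all made up for
this example, and the real index is a file on disk next to each segment.

```python
from bisect import bisect_right

def build_time_index(messages, every_n):
    """Build a sparse index of (max_timestamp_ms, offset) entries.

    `messages` is a list of (offset, timestamp_ms) in offset order.
    Each entry records the largest timestamp seen so far and the offset
    of the first message carrying it. All names here are hypothetical.
    """
    index = []
    max_ts, max_ts_offset = -1, -1
    for count, (offset, ts) in enumerate(messages, start=1):
        if ts > max_ts:
            max_ts, max_ts_offset = ts, offset
        # Only append when the timestamp grows, so the index stays sorted.
        if count % every_n == 0 and (not index or index[-1][0] < max_ts):
            index.append((max_ts, max_ts_offset))
    return index

def offset_for_timestamp(messages, index, target_ts):
    """Return the first offset whose timestamp is >= target_ts, or None."""
    timestamps = [ts for ts, _ in index]
    pos = bisect_right(timestamps, target_ts) - 1
    # Start at the indexed offset, or at the segment start if no entry fits.
    start_offset = index[pos][1] if pos >= 0 else messages[0][0]
    # A real log would seek straight to start_offset; this sketch filters.
    for offset, ts in messages:
        if offset >= start_offset and ts >= target_ts:
            return offset
    return None

def expired_by_mtime(last_modified_ms, now_ms, retention_ms):
    """Old-style check: depends on the segment file's modification time."""
    return now_ms - last_modified_ms > retention_ms

def expired_by_timestamp(index, now_ms, retention_ms):
    """Index-based check: depends on the largest message timestamp."""
    return now_ms - index[-1][0] > retention_ms
```

With something like this, a freshly re-replicated segment that holds old
messages has a recent modification time but an old largest timestamp, so
only the second check expires it on schedule.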

Now for the impact.

a. There is a slight change in how time-based rolling works. Before KIP-33,
rolling was based on the time when a segment was loaded in the broker.
After KIP-33, rolling is based on the time of the first message of a
segment. Not sure if this is your concern. In the common case, the two
behave more or less the same. The latter is actually more deterministic
since it's not sensitive to broker restarts.

b. The time-based index potentially adds overhead to producing messages and
loading segments. Our experiments show that the impact on producing is
insignificant. The time to load segments when restarting a broker can be
doubled. However, the absolute time is still reasonable. For example,
loading 10K log segments with time-based index takes about 5 seconds.
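
To illustrate the rolling change in (a), here is a small Python sketch of
the two rules as described above. The function and parameter names are
invented for this example and the comparison is simplified; it is not the
broker's actual logic.

```python
def should_roll_by_load_time(segment_loaded_ms, now_ms, roll_ms):
    """Pre-KIP-33 style: measured from when the broker loaded the segment."""
    return now_ms - segment_loaded_ms > roll_ms

def should_roll_by_first_message(first_message_ts_ms, incoming_ts_ms, roll_ms):
    """Post-KIP-33 style: measured from the segment's first message timestamp."""
    return incoming_ts_ms - first_message_ts_ms > roll_ms

# Hypothetical scenario: one-hour roll interval, first message at t=0,
# broker restarted (segment reloaded) at t=50min, new message at t=~67min.
ROLL_MS = 3_600_000
rolled_old_rule = should_roll_by_load_time(3_000_000, 4_000_000, ROLL_MS)
rolled_new_rule = should_roll_by_first_message(0, 4_000_000, ROLL_MS)
```

In this scenario the restart resets the clock for the first rule, so the
segment is not rolled yet, while the second rule is unaffected by the
restart and rolls the segment.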

Because the time-based index is useful in several cases and the impact seems
small, we didn't consider making it optional. Although it's possible to
make the time-based index optional, doing so would add more complexity to
the code base. So, we probably should only consider it if it's truly
needed.

Thanks,

Jun

On Mon, Aug 22, 2016 at 12:24 AM, Jan Filipiak <jan.filip...@trivago.com>
wrote:

> Hello everyone,
>
> I stumbled across KIP-33 and the time-based index. While briefly checking
> the wiki and commits, I failed to find a way to opt out.
> I saw it having quite some impact on when logs are rolled and was hoping
> not to have to deal with all of that. Is there a disable switch I
> overlooked?
>
> Does anybody have a good use case where the time-based index comes in handy?
> I made a custom console consumer for myself
> that can bisect a log based on time. It's just a quick probabilistic shot
> into the log, but it is sometimes quite useful for debugging.
>
> Best Jan
>
