Re: [VOTE] KIP-33 - Add a time based log index

Jun Rao Sun, 10 Apr 2016 21:27:02 -0700

Hi, Jiangjie,

Thanks for the update. Looks good to me overall. Just a few minor comments
below.

10. On broker startup, it's not clear to me why we need to scan the log
segment to retrieve the largest timestamp since the time index always has
an entry for the largest timestamp. Is that only for restarting after a
hard failure?

11. On broker startup, if a log segment is missing its time index, do we
always rebuild it? This can happen when the broker is upgraded.

12. Related to Guozhang's question #1: it seems simpler to add time index
entries independently of the offset index, since an index entry may not be
added to the offset index and the time index at the same time. Also, this
allows the time index to be rebuilt independently if needed.
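
To illustrate points 11 and 12, here is a minimal Python sketch, not Kafka's
code. The names `TimeIndex`, `on_append`, `rebuild_time_index`, and the
`interval_bytes` parameter are hypothetical and chosen only for illustration.
It assumes each time index entry records the largest timestamp seen so far
and the offset of the message carrying it. The index decides when to add an
entry on its own, without consulting the offset index, so the same logic can
rebuild a missing time index by rescanning the segment.

```python
class TimeIndex:
    """Hypothetical sparse time index, maintained independently of any offset index."""

    def __init__(self, interval_bytes):
        self.interval_bytes = interval_bytes  # assumed spacing between entries
        self.entries = []                     # list of (timestamp, offset)
        self._pending_bytes = 0               # bytes appended since the last entry
        self._max_ts = -1                     # largest timestamp seen so far
        self._max_ts_offset = -1              # offset of the message with that timestamp

    def on_append(self, offset, timestamp, size_bytes):
        # Track the largest timestamp in the segment and where it occurred.
        if timestamp > self._max_ts:
            self._max_ts, self._max_ts_offset = timestamp, offset
        self._pending_bytes += size_bytes
        last_ts = self.entries[-1][0] if self.entries else -1
        # Add an entry only when enough bytes have accumulated and the
        # largest timestamp has advanced past the last indexed timestamp.
        if self._pending_bytes >= self.interval_bytes and self._max_ts > last_ts:
            self.entries.append((self._max_ts, self._max_ts_offset))
            self._pending_bytes = 0


def rebuild_time_index(messages, interval_bytes):
    """Rebuild a time index by scanning (offset, timestamp, size_bytes) tuples."""
    index = TimeIndex(interval_bytes)
    for offset, timestamp, size_bytes in messages:
        index.on_append(offset, timestamp, size_bytes)
    return index


# Example segment with out-of-order timestamps.
segment = [(0, 100, 10), (1, 90, 10), (2, 120, 10), (3, 110, 10)]
rebuilt = rebuild_time_index(segment, interval_bytes=20)
```

Since `rebuild_time_index` reads only the segment's messages, it produces the
same entries whether or not an offset index exists for that segment.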

Thanks,

Jun

On Wed, Apr 6, 2016 at 5:44 PM, Becket Qin <becket....@gmail.com> wrote:

> Hi all,
>
> I updated KIP-33 based on the initial implementation. Per the discussion
> in yesterday's KIP hangout, I would like to start a new vote thread for
> KIP-33.
>
> The KIP wiki:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index
>
> Here is a brief summary of the KIP:
> 1. We propose to add a time index for each log segment.
> 2. The time indices are going to be used for log retention, log rolling and
> message search by timestamp.
>
> There was an earlier voting thread with some discussion of this KIP. The
> link to that mail thread follows:
>
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E
>
> I have the following WIP patch for reference. It needs a few more unit
> tests and documentation. Other than that it should run fine.
>
> https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
