Hi Guozhang and Jun,

Thanks for the comments. Please see the responses below.

Regarding Guozhang's question #1 and Jun's question #12: I was inserting
the time index and offset index entries together mostly for simplicity, as
Guozhang mentioned. The purpose of using index.interval.bytes for the time
index is to control the density of the time index, which is the same
purpose it serves for the offset index, so it seems reasonable to keep the
two aligned. We could track the physical position of the last time index
entry separately (my original code did that), but when I wrote the code it
seemed unnecessary. Another minor benefit is that searching by timestamp
could potentially be faster if the time index and offset index are aligned.
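To make the aligned insertion a bit more concrete, here is a minimal Scala
sketch. The class and field names (AlignedIndexSegmentSketch,
bytesSinceLastIndexEntry, the two index stubs) are made up for illustration
and are not the actual patch:

import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

// Minimal stand-ins for the real index classes, just so the sketch compiles.
class OffsetIndexStub {
  val entries = ArrayBuffer.empty[(Long, Int)]
  def append(offset: Long, position: Int): Unit = entries += ((offset, position))
}

class TimeIndexStub {
  val entries = ArrayBuffer.empty[(Long, Long)]
  def append(timestamp: Long, offset: Long): Unit = entries += ((timestamp, offset))
}

class AlignedIndexSegmentSketch(indexIntervalBytes: Int) {
  val offsetIndex = new OffsetIndexStub
  val timeIndex = new TimeIndexStub
  private var physicalPosition = 0
  private var bytesSinceLastIndexEntry = 0

  // Append one message batch. Because entries go into BOTH indices at the
  // same time, the two indices always point at the same offsets and one
  // bytes-accumulated counter per segment is enough.
  def append(largestOffset: Long, largestTimestamp: Long, batch: ByteBuffer): Unit = {
    if (bytesSinceLastIndexEntry > indexIntervalBytes) {
      offsetIndex.append(largestOffset, physicalPosition)
      timeIndex.append(largestTimestamp, largestOffset)
      bytesSinceLastIndexEntry = 0
    }
    val size = batch.remaining()
    physicalPosition += size        // where the next batch starts in the log file
    bytesSinceLastIndexEntry += size
  }
}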
It is possible that only the time index or only the offset index is
corrupted, not both. Although we could choose to rebuild only the corrupted
one, given that we have to scan the entire log segment anyway, rebuilding
both adds little overhead. So the current patch rebuilds both of them
together.

10. Yes, it should only happen after a hard failure. The last time index
entry of a normally closed segment already points to the log end offset
(LEO), so there is no scan during startup.
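As a rough illustration of that clean-close behavior (hypothetical names,
not the actual code in the patch):

import scala.collection.mutable.ArrayBuffer

class SegmentCloseSketch {
  private var largestTimestamp = -1L
  private var logEndOffset = 0L
  val timeIndexEntries = ArrayBuffer.empty[(Long, Long)]  // (timestamp, offset)

  def append(offset: Long, timestamp: Long): Unit = {
    logEndOffset = offset + 1
    largestTimestamp = math.max(largestTimestamp, timestamp)
  }

  // On a clean close the final entry (largestTimestamp -> LEO) is written,
  // so a restart can read the largest timestamp straight from the index;
  // only a hard failure skips this step and forces a segment scan.
  def close(): Unit = timeIndexEntries += ((largestTimestamp, logEndOffset))
}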

11. On broker startup, if a time index does not exist, an empty one will be
created first. If the message format version is 0.9.0, we will append a
time index entry of (last modification time -> base offset of the next
segment) to the time index of each inactive segment, so no actual rebuild
happens during the upgrade. However, if the message format version is
0.10.0, we will rebuild the time index if it does not exist. (I actually
have a question about how we are loading the log segments; we can discuss
it in the PR.)
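To summarize the startup behavior in #11, here is a hedged sketch of the
decision logic; the names (TimeIndexStartupSketch, actionFor, the Action
cases) are illustrative only and may differ from the PR:

import java.io.File

object TimeIndexStartupSketch {
  sealed trait Action
  case object UseExistingIndex extends Action        // time index file already present
  case object AppendLastModifiedEntry extends Action // 0.9.0 format: no scan needed
  case object RebuildByScanningLog extends Action    // 0.10.0 format: rebuild from messages

  def actionFor(timeIndexFile: File, messageFormatVersion: String): Action = {
    if (timeIndexFile.exists()) UseExistingIndex
    else {
      // In either branch an empty time index file is created first.
      if (messageFormatVersion == "0.9.0")
        // For each inactive segment, append (last modification time ->
        // base offset of the next segment) instead of scanning the log.
        AppendLastModifiedEntry
      else
        // 0.10.0 messages carry timestamps, so the index can be rebuilt
        // by scanning the log segment.
        RebuildByScanningLog
    }
  }
}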

I will update the wiki to clarify the questions raised in the comments and
submit a PR by tomorrow. I am currently cleaning up the documentation.

Thanks,

Jiangjie (Becket) Qin



On Sun, Apr 10, 2016 at 9:25 PM, Jun Rao <j...@confluent.io> wrote:

> Hi, Jiangjie,
>
> Thanks for the update. Looks good to me overall. Just a few minor comments
> below.
>
> 10. On broker startup, it's not clear to me why we need to scan the log
> segment to retrieve the largest timestamp since the time index always has
> an entry for the largest timestamp. Is that only for restarting after a
> hard failure?
>
> 11. On broker startup, if a log segment is missing the time index, do we
> always rebuild it? This can happen when the broker is upgraded.
>
> 12. Related to Guozhang's question #1. It seems it's simpler to add time
> index entries independently of the offset index, since an index entry may
> not be added to the offset index and the time index at the same time. Also,
> this allows the time index to be rebuilt independently if needed.
>
> Thanks,
>
> Jun
>
>
> On Wed, Apr 6, 2016 at 5:44 PM, Becket Qin <becket....@gmail.com> wrote:
>
> > Hi all,
> >
> > I updated KIP-33 based on the initial implementation. Per discussion on
> > yesterday's KIP hangout, I would like to initiate the new vote thread for
> > KIP-33.
> >
> > The KIP wiki:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index
> >
> > Here is a brief summary of the KIP:
> > 1. We propose to add a time index for each log segment.
> > 2. The time indices are going to be used for log retention, log rolling,
> > and message search by timestamp.
> >
> > There was an old voting thread which has some discussions on this KIP.
> > The mail thread link is the following:
> >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E
> >
> > I have the following WIP patch for reference. It needs a few more unit
> > tests and documentation. Other than that it should run fine.
> >
> >
> https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
>
