Re: KIP-33 Opt out from Time Based indexing

2016-09-08 Thread Jan Filipiak
Hi Jun, thanks a lot for the hint, I'll check it out when I get a free minute! Best Jan On 07.09.2016 00:35, Jun Rao wrote: Jan, For the time rolling issue, Jiangjie has committed a fix ( https://issues.apache.org/jira/browse/KAFKA-4099) to trunk. Perhaps you can help test out trunk and see

Re: KIP-33 Opt out from Time Based indexing

2016-09-06 Thread Jun Rao
Jan, For the time rolling issue, Jiangjie has committed a fix ( https://issues.apache.org/jira/browse/KAFKA-4099) to trunk. Perhaps you can help test out trunk and see if there are any other issues related to time-based index? Thanks, Jun On Mon, Sep 5, 2016 at 11:52 PM, Jan Filipiak

Re: KIP-33 Opt out from Time Based indexing

2016-09-06 Thread Jan Filipiak
Hi Jun, sorry for the late reply. Regarding B, my main concern was just the complexity of understanding what's going on. As you can see, it took me probably 2 days or so to fully grasp all the details in the implementation and what the impacts are. Usually I prefer to turn things I don't use

Re: KIP-33 Opt out from Time Based indexing

2016-08-29 Thread Becket Qin
Hi Jun, I just created KAFKA-4099 and will submit a patch soon. Thanks, Jiangjie (Becket) Qin On Mon, Aug 29, 2016 at 11:55 AM, Jun Rao wrote: > Jiangjie, > > Good point on the time index format related to uncompressed messages. It > does seem that indexing based on file

Re: KIP-33 Opt out from Time Based indexing

2016-08-29 Thread Jun Rao
Jiangjie, Good point on the time index format related to uncompressed messages. It does seem that indexing based on file position requires a bit more complexity. Since the time index is going to be used infrequently, having a level of indirection doesn't seem a big concern. So, we can leave the

Re: KIP-33 Opt out from Time Based indexing

2016-08-29 Thread Jun Rao
Jan, For the usefulness of time index, it's ok if you don't plan to use it. However, I do think there are other people who will want to use it. Fixing an application bug always requires some additional work. Intuitively, being able to seek back to a particular point of time for replay is going to
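
The idea of seeking back to a point in time can be illustrated with a minimal sketch. It assumes a sparse index of (timestamp, offset) pairs sorted by timestamp; the data and the function name `lookup_start_offset` are hypothetical and are not taken from the Kafka code base.

```python
from bisect import bisect_right

# Hypothetical sparse time index: (timestamp_ms, offset) pairs sorted by timestamp.
time_index = [(1000, 0), (2000, 40), (3000, 95), (4000, 130)]


def lookup_start_offset(index, target_ts):
    """Return the offset of the last entry whose timestamp is <= target_ts.

    Falls back to the first indexed offset when the target precedes every
    entry, and to 0 when the index is empty.
    """
    if not index:
        return 0
    timestamps = [ts for ts, _ in index]
    pos = bisect_right(timestamps, target_ts)
    if pos == 0:
        return index[0][1]
    return index[pos - 1][1]


# A replaying consumer would start scanning from this offset instead of offset 0.
start = lookup_start_offset(time_index, 2500)
```

Under these assumptions, a replay from timestamp 2500 starts at offset 40 rather than rereading the whole log.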

Re: KIP-33 Opt out from Time Based indexing

2016-08-28 Thread Becket Qin
Jan, Thanks for the example of reprocessing the messages. I think in any case, reconsuming all the messages will definitely work. What we want to do here is to see if we can avoid doing that by only reconsuming necessary messages. In the scenario you mentioned, can you store an

Re: KIP-33 Opt out from Time Based indexing

2016-08-26 Thread Jan Filipiak
Hi Jun, thanks for taking the time to answer on such a detailed level. You are right, Log.fetchOffsetByTimestamp works; the comment is just confusing: "// Get all the segments whose largest timestamp is smaller than target timestamp", which apparently is not what takeWhile does (I am more on
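
The gap between that comment and takeWhile semantics can be shown with a small sketch. It uses Python's `itertools.takewhile` as a stand-in for Scala's `takeWhile`; the segment timestamps are hypothetical values chosen to be non-monotonic, not data from Kafka.

```python
from itertools import takewhile

# Hypothetical largest timestamp per segment, in log order (illustrative only).
segment_max_timestamps = [100, 400, 180, 250]
target = 300

# takewhile stops at the first element that fails the predicate,
# so it yields only the leading run of matching segments.
leading_prefix = list(takewhile(lambda ts: ts < target, segment_max_timestamps))

# "All the segments whose largest timestamp is smaller than target"
# would instead be a filter over the whole list.
all_smaller = [ts for ts in segment_max_timestamps if ts < target]
```

Here `leading_prefix` is `[100]` while `all_smaller` is `[100, 180, 250]`; the two only coincide when the per-segment timestamps are monotonically increasing.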

Re: KIP-33 Opt out from Time Based indexing

2016-08-26 Thread Becket Qin
Jun, Good point about the new log rolling behavior issue when moving replicas. Keeping the old behavior sounds reasonable to me. Currently the time index entry points to the exact shallow message with the indexed timestamp. Are you suggesting we change it to point to the starting offset of the

Re: KIP-33 Opt out from Time Based indexing

2016-08-26 Thread Jun Rao
Jiangjie, I am not sure about changing the default to LogAppendTime since CreateTime is probably what most people want. It also doesn't solve the problem completely. For example, if you do partition reassignment and need to copy a bunch of old log segments to a new broker, this may cause log

Re: KIP-33 Opt out from Time Based indexing

2016-08-25 Thread Becket Qin
Hi Jan, It seems your main concern is the changed behavior of time-based log rolling and time-based retention. That is actually why we have two timestamp types. If users set log.message.timestamp.type to LogAppendTime, the broker will behave exactly the same as before, except the

Re: KIP-33 Opt out from Time Based indexing

2016-08-25 Thread Jun Rao
Jan, Thanks a lot for the feedback. Now I understand your concern better. The following are my comments. The first odd thing that you pointed out could be a real concern. Basically, if a producer publishes messages with really old timestamps, our default log.roll.hours (7 days) will indeed cause

Re: KIP-33 Opt out from Time Based indexing

2016-08-24 Thread Jan Filipiak
Hey Jun, I'll go and try again :), I wrote the first one in quite a stressful environment. The bottom line is that I, for our use cases, see too small a use/effort ratio in this time index. We do not bootstrap new consumers for key-less logs so frequently and when we do it, they usually want

Re: KIP-33 Opt out from Time Based indexing

2016-08-22 Thread Jay Kreps
Can you describe the behavior you saw that you didn't like? -Jay On Mon, Aug 22, 2016 at 12:24 AM, Jan Filipiak wrote: > Hello everyone, > > I stumbled across KIP-33 and the time based index, while briefly checking > the wiki and commits, I fail to find a way to opt