Joel,

Thanks for the comments. I updated the KIP page and added the canary
procedure.

Thanks,

Jiangjie (Becket) Qin

On Wed, Sep 30, 2015 at 6:26 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> The Phase 2 2.* sub-steps don't seem to be right. Can you look them
> over carefully? Also, by "definitive" you mean "absolute", i.e., not
> relative offsets, right?
>
> One more thing that may be worth mentioning is that it is technically
> possible to canary the new message format on at most one broker (or
> multiple brokers if they host mutually disjoint partitions). Basically,
> turn on the new message format on one broker and leave it on for an
> extended period - if we hit some unanticipated bug and something goes
> terribly wrong with the feature, then just kill that broker, switch it
> back to the v0 on-disk format and reseed it from the leaders. Most
> people may not want such a long deployment plan, but at least it is an
> option for those who want to tread very carefully given that the change
> is backwards incompatible.
>
> Joel
>
> On Tue, Sep 29, 2015 at 4:50 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> wrote:
> > Hi Joel and other folks.
> >
> > I updated the KIP page with the two-phase rollout, which avoids the
> > conversion for the majority of users.
> >
> > To do that we need to add a message.format.version configuration to the
> > broker. Other than that there is no interface change from the previous
> > proposal. Please let me know if you have any concerns about the updated
> > proposal.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Fri, Sep 25, 2015 at 11:26 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> >> Hey Becket,
> >>
> >> I do think we need the interim deployment phase: set
> >> message.format.version and down-convert for producer request v2.
> >> Down-conversion for v2 is no worse than what the broker is doing now.
> >> I don't think we want a prolonged phase where we down-convert for
> >> every v1 fetch - in fact I'm less concerned about losing zero-copy for
> >> those fetch requests than about the overhead of decompress/recompress
> >> for those fetches, as that would increase your CPU usage by 4x, 5x or
> >> whatever the average consumer fan-out is. The
> >> decompression/recompression will put further pressure on memory as
> >> well.
> >>
> >> It is true that clients send the latest request version that they are
> >> compiled with, and that does not need to change. The broker can
> >> continue to respond with zero-copy for fetch request version 2 as well
> >> (even during the interim phase in which it down-converts producer
> >> request v2). The consumer iterator (for the old consumer) or the
> >> Fetcher (for the new consumer) needs to be able to handle messages in
> >> both the original and the new (relative offset) format.
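> >>
> >> For illustration, a minimal sketch of how the new-format case could map
> >> relative offsets back to absolute ones (hypothetical names; it assumes
> >> inner messages carry consecutive 0-based relative offsets and the
> >> wrapper carries the absolute offset of the last inner message, as
> >> discussed further down the thread):
> >>
> >> // Sketch only, not the actual iterator code.
> >> public final class RelativeOffsets {
> >>     // absolute = wrapperOffset - lastRelativeOffset + relativeOffset
> >>     static long absoluteOffset(long wrapperOffset, int relativeOffset,
> >>                                int innerCount) {
> >>         return wrapperOffset - (innerCount - 1) + relativeOffset;
> >>     }
> >>
> >>     public static void main(String[] args) {
> >>         // Wrapper at offset 48 holding 4 inner messages -> 45, 46, 47, 48.
> >>         for (int r = 0; r < 4; r++) {
> >>             System.out.println(absoluteOffset(48L, r, 4));
> >>         }
> >>     }
> >> }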
> >>
> >> Thanks,
> >>
> >> Joel
> >>
> >>
> >> On Thu, Sep 24, 2015 at 7:56 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> >> wrote:
> >> > Hi Joel,
> >> >
> >> > That is a valid concern. And that is actually why we had the
> >> > message.format.version before.
> >> >
> >> > My original thinking was:
> >> > 1. Upgrade the brokers to support both V1 and V2 of the
> >> > consumer/producer requests.
> >> > 2. Configure the brokers to store V1 on disk (message.format.version
> >> > = 1).
> >> > 3. Upgrade the consumers to support both V1 and V2 of the consumer
> >> > request.
> >> > 4. Meanwhile some producers might also be upgraded to use producer
> >> > request V2.
> >> > 5. At this point, for producer request V2, the broker will do down
> >> > conversion. Regardless of whether the consumers are upgraded or not,
> >> > the broker will always use zero-copy transfer, because both old and
> >> > upgraded consumers should be able to understand the on-disk V1 format.
> >> > 6. After most of the consumers are upgraded, we set
> >> > message.format.version = 2 and only do down conversion for old
> >> > consumers.
> >> >
> >> > This way we don't need to reject producer request V2, and we only do
> >> > version conversion for the minority of the consumers. However, I have
> >> > a few concerns about this approach; I'm not sure if they actually
> >> > matter.
> >> >
> >> > A. (5) is not true for now. Today the clients only use the highest
> >> > version, i.e. a producer/consumer wouldn't parse a lower version of a
> >> > response even if the code exists there. I think that, supposedly, a
> >> > consumer should stick to one version and the broker should do the
> >> > conversion.
> >> > B. Let's say (A) is not a concern and we make all the clients support
> >> > all the versions they know. At step (6), there will be a transitional
> >> > period during which users will see messages in both the new and the
> >> > old version. For KIP-31 alone it might be OK because we are not adding
> >> > anything to the message. But if the message has different fields
> >> > (e.g. KIP-32), that means people will get those fields from some
> >> > messages but not from others. Would that be a problem?
> >> >
> >> > If (A) and (B) are not a problem, is the above procedure able to
> >> > address your concern?
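> >> >
> >> > For concreteness, a rough sketch of the broker-side decision in steps
> >> > (5) and (6) above - purely illustrative, with hypothetical names; the
> >> > only real knob referenced is the proposed message.format.version:
> >> >
> >> > // Sketch: decide how to store an incoming batch based on the
> >> > // configured on-disk message format version.
> >> > final class ProducePathSketch {
> >> >     private final int messageFormatVersion; // from message.format.version
> >> >
> >> >     ProducePathSketch(int messageFormatVersion) {
> >> >         this.messageFormatVersion = messageFormatVersion;
> >> >     }
> >> >
> >> >     byte[] toOnDiskFormat(int produceRequestVersion, byte[] batch) {
> >> >         if (produceRequestVersion == 2 && messageFormatVersion == 1) {
> >> >             // Steps (2)-(5): store V1 on disk, so rewrite the relative
> >> >             // offsets back to absolute offsets (decompress + recompress).
> >> >             return downConvertToV1(batch);
> >> >         }
> >> >         // Step (6): message.format.version = 2, append V2 batches as-is;
> >> >         // down conversion then only happens on fetch for old consumers.
> >> >         return batch;
> >> >     }
> >> >
> >> >     private byte[] downConvertToV1(byte[] batch) {
> >> >         // decompress, rewrite offsets, recompress (elided in this sketch)
> >> >         return batch;
> >> >     }
> >> > }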
> >> >
> >> > Thanks,
> >> >
> >> > Jiangjie (Becket) Qin
> >> >
> >> > On Thu, Sep 24, 2015 at 6:32 PM, Joel Koshy <jjkosh...@gmail.com>
> wrote:
> >> >
> >> >> The upgrade plan works, but the potentially long interim phase of
> >> >> skipping zero-copy for down-conversion could be problematic,
> >> >> especially for large deployments with large consumer fan-out. It is
> >> >> not only going to be memory overhead but CPU as well - since you need
> >> >> to decompress, write absolute offsets, then recompress for every v1
> >> >> fetch. i.e., it may be safer (but obviously more tedious) to have a
> >> >> multi-step upgrade process. For example:
> >> >>
> >> >> 1 - Upgrade brokers, but disable the feature. i.e., either reject
> >> >> producer requests v2 or down-convert to old message format (with
> >> >> absolute offsets)
> >> >> 2 - Upgrade clients, but they should only use v1 requests
> >> >> 3 - Switch (all or most) consumers to use v2 fetch format (which will
> >> >> use zero-copy).
> >> >> 4 - Turn on the feature on the brokers to allow producer requests v2
> >> >> 5 - Switch producers to use v2 produce format
> >> >>
> >> >> (You may want a v1 fetch rate metric and decide to proceed to step 4
> >> >> only when that comes down to a trickle)
> >> >>
> >> >> I'm not sure if the prolonged upgrade process is viable in every
> >> >> scenario. I think it should work at LinkedIn, for example, but may
> >> >> not for other environments.
> >> >>
> >> >> Joel
> >> >>
> >> >>
> >> >> On Tue, Sep 22, 2015 at 12:55 AM, Jiangjie Qin
> >> >> <j...@linkedin.com.invalid> wrote:
> >> >> > Thanks for the explanation, Jay.
> >> >> > Agreed. We have to keep the offset as the offset of the last inner
> >> >> > message.
> >> >> >
> >> >> > Jiangjie (Becket) Qin
> >> >> >
> >> >> > On Mon, Sep 21, 2015 at 6:21 PM, Jay Kreps <j...@confluent.io>
> wrote:
> >> >> >
> >> >> >> For (3) I don't think we can change the offset in the outer
> >> >> >> message from what it is today, as it is relied upon in the search
> >> >> >> done in the log layer. The reason it is the offset of the last
> >> >> >> message rather than the first is to make the offset a least upper
> >> >> >> bound (i.e. the smallest offset >= fetch_offset). This needs to
> >> >> >> work the same for both gaps due to compacted topics and gaps due
> >> >> >> to compressed messages.
> >> >> >>
> >> >> >> So imagine you had a compressed set with offsets {45, 46, 47, 48}:
> >> >> >> if you assigned this compressed set the offset 45, a fetch for 46
> >> >> >> would actually skip ahead to 49 (the least upper bound).
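> >> >> >>
> >> >> >> A tiny illustration of that least-upper-bound behavior (a
> >> >> >> hypothetical sketch that just models the lookup as a sorted map
> >> >> >> keyed by wrapper offset):
> >> >> >>
> >> >> >> import java.util.TreeMap;
> >> >> >>
> >> >> >> public class LeastUpperBoundDemo {
> >> >> >>     public static void main(String[] args) {
> >> >> >>         // Wrappers keyed by the offset of their LAST inner message.
> >> >> >>         TreeMap<Long, String> byLast = new TreeMap<>();
> >> >> >>         byLast.put(48L, "{45,46,47,48}");
> >> >> >>         byLast.put(52L, "{49,50,51,52}");
> >> >> >>         // Fetch 46 -> smallest key >= 46 is 48, the set holding 46.
> >> >> >>         System.out.println(byLast.ceilingEntry(46L));
> >> >> >>
> >> >> >>         // Wrappers keyed by the offset of their FIRST inner message.
> >> >> >>         TreeMap<Long, String> byFirst = new TreeMap<>();
> >> >> >>         byFirst.put(45L, "{45,46,47,48}");
> >> >> >>         byFirst.put(49L, "{49,50,51,52}");
> >> >> >>         // Fetch 46 -> smallest key >= 46 is 49, so 46-48 get skipped.
> >> >> >>         System.out.println(byFirst.ceilingEntry(46L));
> >> >> >>     }
> >> >> >> }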
> >> >> >>
> >> >> >> -Jay
> >> >> >>
> >> >> >> On Mon, Sep 21, 2015 at 5:17 PM, Jun Rao <j...@confluent.io>
> wrote:
> >> >> >>
> >> >> >> > Jiangjie,
> >> >> >> >
> >> >> >> > Thanks for the writeup. A few comments below.
> >> >> >> >
> >> >> >> > 1. We will need to be a bit careful with fetch requests from the
> >> >> >> > followers. Basically, as we are doing a rolling upgrade of the
> >> >> >> > brokers, the follower can't start issuing V2 of the fetch request
> >> >> >> > until the rest of the brokers are ready to process it. So, we
> >> >> >> > probably need to make use of inter.broker.protocol.version to do
> >> >> >> > the rolling upgrade. In step 1, we set
> >> >> >> > inter.broker.protocol.version to 0.9 and do a round of rolling
> >> >> >> > upgrade of the brokers. At this point, all brokers are capable of
> >> >> >> > processing V2 of fetch requests, but no broker is using it yet.
> >> >> >> > In step 2, we set inter.broker.protocol.version to 0.10 and do
> >> >> >> > another round of rolling restart of the brokers. In this step,
> >> >> >> > the upgraded brokers will start issuing V2 of the fetch request.
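> >> >> >> >
> >> >> >> > A rough sketch of that gating (hypothetical helper; the only real
> >> >> >> > knob is the inter.broker.protocol.version config mentioned above):
> >> >> >> >
> >> >> >> > // After the first bounce every broker can PARSE fetch request V2;
> >> >> >> > // only after the config is bumped and the second bounce do the
> >> >> >> > // followers start SENDING V2.
> >> >> >> > final class FetchVersionGate {
> >> >> >> >     static short fetchRequestVersion(String interBrokerProtocolVersion) {
> >> >> >> >         return isAtLeast(interBrokerProtocolVersion, "0.10")
> >> >> >> >             ? (short) 2 : (short) 1;
> >> >> >> >     }
> >> >> >> >
> >> >> >> >     private static boolean isAtLeast(String version, String minimum) {
> >> >> >> >         String[] v = version.split("\\."), m = minimum.split("\\.");
> >> >> >> >         for (int i = 0; i < Math.min(v.length, m.length); i++) {
> >> >> >> >             int c = Integer.parseInt(v[i]) - Integer.parseInt(m[i]);
> >> >> >> >             if (c != 0) return c > 0;
> >> >> >> >         }
> >> >> >> >         return v.length >= m.length;
> >> >> >> >     }
> >> >> >> > }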
> >> >> >> >
> >> >> >> > 2. If we do #1, I am not sure if there is still a need for
> >> >> >> > message.format.version since the broker can start writing
> >> >> >> > messages in the new format after inter.broker.protocol.version
> >> >> >> > is set to 0.10.
> >> >> >> >
> >> >> >> > 3. It wasn't clear from the wiki whether the base offset in the
> >> >> >> > shallow message is the offset of the first or the last inner
> >> >> >> > message. It's better to use the offset of the last inner message.
> >> >> >> > This way, the followers don't have to decompress messages to
> >> >> >> > figure out the next fetch offset.
> >> >> >> >
> >> >> >> > 4. I am not sure that I understand the following sentence in the
> >> >> >> > wiki. It seems that the relative offsets in a compressed message
> >> >> >> > don't have to be consecutive. If so, why do we need to update the
> >> >> >> > relative offsets in the inner messages?
> >> >> >> > "When the log cleaner compacts log segments, it needs to update
> >> >> >> > the inner message's relative offset values."
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> >
> >> >> >> > Jun
> >> >> >> >
> >> >> >> > On Thu, Sep 17, 2015 at 12:54 PM, Jiangjie Qin
> >> >> <j...@linkedin.com.invalid
> >> >> >> >
> >> >> >> > wrote:
> >> >> >> >
> >> >> >> > > Hi folks,
> >> >> >> > >
> >> >> >> > > Thanks a lot for the feedback on KIP-31 - moving to relative
> >> >> >> > > offsets (not including the timestamp and index discussion).
> >> >> >> > >
> >> >> >> > > I updated the migration plan section as we discussed on the KIP
> >> >> >> > > hangout. I think it was the only concern raised so far. Please
> >> >> >> > > let me know if there are further comments about the KIP.
> >> >> >> > >
> >> >> >> > > Thanks,
> >> >> >> > >
> >> >> >> > > Jiangjie (Becket) Qin
> >> >> >> > >
> >> >> >> > > On Mon, Sep 14, 2015 at 5:13 PM, Jiangjie Qin <
> j...@linkedin.com
> >> >
> >> >> >> wrote:
> >> >> >> > >
> >> >> >> > > > I just updated KIP-33 to explain the indexing on CreateTime
> >> >> >> > > > and LogAppendTime respectively. I also used some use cases to
> >> >> >> > > > compare the two solutions.
> >> >> >> > > > Although this is for KIP-33, it does give some insight into
> >> >> >> > > > whether it makes sense to have a per-message LogAppendTime.
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index
> >> >> >> > > >
> >> >> >> > > > As a short summary of the conclusions we have already reached
> >> >> >> > > > on timestamps:
> >> >> >> > > > 1. It is good to add a timestamp to the message.
> >> >> >> > > > 2. LogAppendTime should be used for broker policy enforcement
> >> >> >> > > > (log retention / rolling).
> >> >> >> > > > 3. It is useful to have a CreateTime in the message format,
> >> >> >> > > > which is immutable after the producer sends the message.
> >> >> >> > > >
> >> >> >> > > > The following questions are still under discussion:
> >> >> >> > > > 1. Should we also add LogAppendTime to the message format?
> >> >> >> > > > 2. Which timestamp should we use to build the index?
> >> >> >> > > >
> >> >> >> > > > Let's talk about question 1 first, because question 2 is
> >> >> >> > > > actually a follow-up to question 1.
> >> >> >> > > > Here is what I think:
> >> >> >> > > > 1a. To enforce broker log policy, theoretically we don't need
> >> >> >> > > > a per-message LogAppendTime. But if we don't include
> >> >> >> > > > LogAppendTime in the message, we still need to implement a
> >> >> >> > > > separate solution to pass log segment timestamps among
> >> >> >> > > > brokers. That means if we don't include the LogAppendTime in
> >> >> >> > > > the message, there will be further complications in
> >> >> >> > > > replication.
> >> >> >> > > > 1b. LogAppendTime has some advantages over CreateTime (KIP-33
> >> >> >> > > > has a detailed comparison).
> >> >> >> > > > 1c. We have already exposed the offset, which is essentially
> >> >> >> > > > an internal concept of a message in terms of position.
> >> >> >> > > > Exposing LogAppendTime means we expose another internal
> >> >> >> > > > concept of a message, in terms of time.
> >> >> >> > > >
> >> >> >> > > > Considering the above reasons, personally I think it is worth
> >> >> >> > > > adding the LogAppendTime to each message.
> >> >> >> > > >
> >> >> >> > > > Any thoughts?
> >> >> >> > > >
> >> >> >> > > > Thanks,
> >> >> >> > > >
> >> >> >> > > > Jiangjie (Becket) Qin
> >> >> >> > > >
> >> >> >> > > > On Mon, Sep 14, 2015 at 11:44 AM, Jiangjie Qin <
> >> j...@linkedin.com
> >> >> >
> >> >> >> > > wrote:
> >> >> >> > > >
> >> >> >> > > >> I was trying to send last email before KIP hangout so maybe
> >> did
> >> >> not
> >> >> >> > > think
> >> >> >> > > >> it through completely. By the way, the discussion is
> actually
> >> >> more
> >> >> >> > > related
> >> >> >> > > >> to KIP-33, i.e. whether we should index on CreateTime or
> >> >> >> > LogAppendTime.
> >> >> >> > > >> (Although it seems all the discussion are still in this
> >> mailing
> >> >> >> > > thread...)
> >> >> >> > > >> This solution in last email is for indexing on CreateTime.
> It
> >> is
> >> >> >> > > >> essentially what Jay suggested except we use a timestamp
> map
> >> >> instead
> >> >> >> > of
> >> >> >> > > a
> >> >> >> > > >> memory mapped index file. Please ignore the proposal of
> using
> >> a
> >> >> log
> >> >> >> > > >> compacted topic. The solution can be simplified to:
> >> >> >> > > >>
> >> >> >> > > >> Each broker keeps:
> >> >> >> > > >> 1. A timestamp index map - Map[TopicPartitionSegment,
> >> >> >> > > >> Map[Timestamp, Offset]]. The timestamp is on a minute
> >> >> >> > > >> boundary.
> >> >> >> > > >> 2. A timestamp index file for each segment.
> >> >> >> > > >> When a broker receives a message (as either leader or
> >> >> >> > > >> follower), it checks whether the timestamp index map already
> >> >> >> > > >> contains the timestamp for the current segment. If the
> >> >> >> > > >> timestamp does not exist, the broker adds the offset to the
> >> >> >> > > >> map and appends an entry to the timestamp index file, i.e. we
> >> >> >> > > >> only use the index file as a persistent copy of the timestamp
> >> >> >> > > >> index map.
> >> >> >> > > >>
> >> >> >> > > >> When a log segment is deleted, we need to:
> >> >> >> > > >> 1. delete the TopicPartitionSegment key in the timestamp
> >> >> >> > > >> index map.
> >> >> >> > > >> 2. delete the timestamp index file.
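> >> >> >> > > >>
> >> >> >> > > >> A minimal sketch of that in-memory structure (hypothetical
> >> >> >> > > >> names; it ignores the on-disk index file and concurrency
> >> >> >> > > >> details):
> >> >> >> > > >>
> >> >> >> > > >> import java.util.Map;
> >> >> >> > > >> import java.util.TreeMap;
> >> >> >> > > >> import java.util.concurrent.ConcurrentHashMap;
> >> >> >> > > >>
> >> >> >> > > >> public class TimestampIndexMapSketch {
> >> >> >> > > >>     // segment key -> (minute timestamp -> first offset in minute)
> >> >> >> > > >>     private final Map<String, TreeMap<Long, Long>> index =
> >> >> >> > > >>         new ConcurrentHashMap<>();
> >> >> >> > > >>
> >> >> >> > > >>     // Called on append (leader or follower); only the first
> >> >> >> > > >>     // offset per minute is recorded, keeping the index sparse.
> >> >> >> > > >>     public void maybeIndex(String segmentKey, long timestampMs,
> >> >> >> > > >>                            long offset) {
> >> >> >> > > >>         long minute = timestampMs - (timestampMs % 60_000L);
> >> >> >> > > >>         index.computeIfAbsent(segmentKey, k -> new TreeMap<>())
> >> >> >> > > >>              .putIfAbsent(minute, offset);
> >> >> >> > > >>         // A real implementation would also append the entry to
> >> >> >> > > >>         // the segment's timestamp index file here.
> >> >> >> > > >>     }
> >> >> >> > > >>
> >> >> >> > > >>     // Search: first indexed offset at or after the target minute.
> >> >> >> > > >>     public Long lookup(String segmentKey, long targetTimestampMs) {
> >> >> >> > > >>         TreeMap<Long, Long> m = index.get(segmentKey);
> >> >> >> > > >>         if (m == null) return null;
> >> >> >> > > >>         long minute = targetTimestampMs - (targetTimestampMs % 60_000L);
> >> >> >> > > >>         Map.Entry<Long, Long> e = m.ceilingEntry(minute);
> >> >> >> > > >>         return e == null ? null : e.getValue();
> >> >> >> > > >>     }
> >> >> >> > > >>
> >> >> >> > > >>     // On segment deletion: drop the key (and the index file).
> >> >> >> > > >>     public void onSegmentDelete(String segmentKey) {
> >> >> >> > > >>         index.remove(segmentKey);
> >> >> >> > > >>     }
> >> >> >> > > >> }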
> >> >> >> > > >>
> >> >> >> > > >> This solution assumes we only keep CreateTime in the message.
> >> >> >> > > >> There are a few trade-offs in this solution:
> >> >> >> > > >> 1. The granularity of search will be per minute.
> >> >> >> > > >> 2. The entire timestamp index map has to be in memory all the
> >> >> >> > > >> time.
> >> >> >> > > >> 3. We need to think about another way to honor log retention
> >> >> >> > > >> time and time-based log rolling.
> >> >> >> > > >> 4. We lose the benefits of including LogAppendTime in the
> >> >> >> > > >> message mentioned earlier.
> >> >> >> > > >>
> >> >> >> > > >> I am not sure whether this solution is necessarily better
> than
> >> >> >> > indexing
> >> >> >> > > >> on LogAppendTime.
> >> >> >> > > >>
> >> >> >> > > >> I will update KIP-33 to explain the solution to index on
> >> >> CreateTime
> >> >> >> > and
> >> >> >> > > >> LogAppendTime respectively and put some more concrete use
> >> cases
> >> >> as
> >> >> >> > well.
> >> >> >> > > >>
> >> >> >> > > >> Thanks,
> >> >> >> > > >>
> >> >> >> > > >> Jiangjie (Becket) Qin
> >> >> >> > > >>
> >> >> >> > > >>
> >> >> >> > > >> On Mon, Sep 14, 2015 at 9:40 AM, Jiangjie Qin <
> >> j...@linkedin.com
> >> >> >
> >> >> >> > > wrote:
> >> >> >> > > >>
> >> >> >> > > >>> Hi Joel,
> >> >> >> > > >>>
> >> >> >> > > >>> Good point about rebuilding index. I agree that having a
> per
> >> >> >> message
> >> >> >> > > >>> LogAppendTime might be necessary. About time adjustment,
> the
> >> >> >> solution
> >> >> >> > > >>> sounds promising, but it might be better to make it as a
> >> follow
> >> >> up
> >> >> >> of
> >> >> >> > > the
> >> >> >> > > >>> KIP because it seems a really rare use case.
> >> >> >> > > >>>
> >> >> >> > > >>> I have another thought on how to manage the out of order
> >> >> >> timestamps.
> >> >> >> > > >>> Maybe we can do the following:
> >> >> >> > > >>> Create a special log compacted topic __timestamp_index
> >> similar
> >> >> to
> >> >> >> > > topic,
> >> >> >> > > >>> the key would be (TopicPartition,
> >> TimeStamp_Rounded_To_Minute),
> >> >> the
> >> >> >> > > value
> >> >> >> > > >>> is offset. In memory, we keep a map for each
> TopicPartition,
> >> the
> >> >> >> > value
> >> >> >> > > is
> >> >> >> > > >>> (timestamp_rounded_to_minute ->
> >> smallest_offset_in_the_minute).
> >> >> >> This
> >> >> >> > > way we
> >> >> >> > > >>> can search out of order message and make sure no message
> is
> >> >> >> missing.
> >> >> >> > > >>>
> >> >> >> > > >>> Thoughts?
> >> >> >> > > >>>
> >> >> >> > > >>> Thanks,
> >> >> >> > > >>>
> >> >> >> > > >>> Jiangjie (Becket) Qin
> >> >> >> > > >>>
> >> >> >> > > >>> On Fri, Sep 11, 2015 at 12:46 PM, Joel Koshy <
> >> >> jjkosh...@gmail.com>
> >> >> >> > > >>> wrote:
> >> >> >> > > >>>
> >> >> >> > > >>>> Jay had mentioned the scenario of mirror-maker bootstrap
> >> which
> >> >> >> would
> >> >> >> > > >>>> effectively reset the logAppendTimestamps for the
> >> bootstrapped
> >> >> >> data.
> >> >> >> > > >>>> If we don't include logAppendTimestamps in each message
> >> there
> >> >> is a
> >> >> >> > > >>>> similar scenario when rebuilding indexes during recovery.
> >> So it
> >> >> >> > seems
> >> >> >> > > >>>> it may be worth adding that timestamp to messages. The
> >> >> drawback to
> >> >> >> > > >>>> that is exposing a server-side concept in the protocol
> >> >> (although
> >> >> >> we
> >> >> >> > > >>>> already do that with offsets). logAppendTimestamp really
> >> >> should be
> >> >> >> > > >>>> decided by the broker so I think the first scenario may
> have
> >> >> to be
> >> >> >> > > >>>> written off as a gotcha, but the second may be worth
> >> addressing
> >> >> >> (by
> >> >> >> > > >>>> adding it to the message format).
> >> >> >> > > >>>>
> >> >> >> > > >>>> The other point that Jay raised which needs to be addressed
> >> >> >> > > >>>> in the proposal (since we require monotonically increasing
> >> >> >> > > >>>> timestamps in the index) is changing time on the server (I'm
> >> >> >> > > >>>> a little less concerned about NTP clock skews than about a
> >> >> >> > > >>>> user explicitly changing the server's time - i.e., big clock
> >> >> >> > > >>>> skews). We would at least want to "set back" all the
> >> >> >> > > >>>> existing timestamps to guarantee non-decreasing timestamps
> >> >> >> > > >>>> with future messages. I'm not sure at this point how best to
> >> >> >> > > >>>> handle that, but we could perhaps have an epoch/base-time
> >> >> >> > > >>>> (or time-correction) stored in the log directories and base
> >> >> >> > > >>>> all log index timestamps off that base-time (or corrected).
> >> >> >> > > >>>> So if at any time you determine that time has changed
> >> >> >> > > >>>> backwards you can adjust that base-time without having to
> >> >> >> > > >>>> fix up all the entries. Without knowing the exact diff
> >> >> >> > > >>>> between the previous clock and the new clock we cannot
> >> >> >> > > >>>> adjust the times exactly, but we can at least ensure
> >> >> >> > > >>>> increasing timestamps.
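> >> >> >> > > >>>>
> >> >> >> > > >>>> A small sketch of that idea (hypothetical; the correction is
> >> >> >> > > >>>> kept in memory here, whereas it would really be persisted in
> >> >> >> > > >>>> the log directory):
> >> >> >> > > >>>>
> >> >> >> > > >>>> // Keep a correction so index timestamps never go backwards
> >> >> >> > > >>>> // even if the wall clock does. The exact real time is
> >> >> >> > > >>>> // unknowable, but non-decreasing timestamps are preserved.
> >> >> >> > > >>>> final class MonotonicIndexClock {
> >> >> >> > > >>>>     private long correctionMs = 0L;
> >> >> >> > > >>>>     private long lastTimestampMs = Long.MIN_VALUE;
> >> >> >> > > >>>>
> >> >> >> > > >>>>     synchronized long nextIndexTimestamp() {
> >> >> >> > > >>>>         long now = System.currentTimeMillis() + correctionMs;
> >> >> >> > > >>>>         if (now < lastTimestampMs) {
> >> >> >> > > >>>>             // Clock jumped backwards: grow the base-time
> >> >> >> > > >>>>             // correction instead of rewriting index entries.
> >> >> >> > > >>>>             correctionMs += (lastTimestampMs - now);
> >> >> >> > > >>>>             now = lastTimestampMs;
> >> >> >> > > >>>>         }
> >> >> >> > > >>>>         lastTimestampMs = now;
> >> >> >> > > >>>>         return now;
> >> >> >> > > >>>>     }
> >> >> >> > > >>>> }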
> >> >> >> > > >>>>
> >> >> >> > > >>>> On Fri, Sep 11, 2015 at 10:52 AM, Jiangjie Qin
> >> >> >> > > >>>> <j...@linkedin.com.invalid> wrote:
> >> >> >> > > >>>> > Ewen and Jay,
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > They way I see the LogAppendTime is another format of
> >> >> "offset".
> >> >> >> It
> >> >> >> > > >>>> serves
> >> >> >> > > >>>> > the following purpose:
> >> >> >> > > >>>> > 1. Locate messages not only by position, but also by
> time.
> >> >> The
> >> >> >> > > >>>> difference
> >> >> >> > > >>>> > from offset is timestamp is not unique for all messags.
> >> >> >> > > >>>> > 2. Allow broker to manage messages based on time, e.g.
> >> >> >> retention,
> >> >> >> > > >>>> rolling
> >> >> >> > > >>>> > 3. Provide convenience for user to search message not
> >> only by
> >> >> >> > > offset,
> >> >> >> > > >>>> but
> >> >> >> > > >>>> > also by timestamp.
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > For purpose (2) we don't need per message server
> >> timestamp.
> >> >> We
> >> >> >> > only
> >> >> >> > > >>>> need
> >> >> >> > > >>>> > per log segment server timestamp and propagate it among
> >> >> brokers.
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > For (1) and (3), we need per message timestamp. Then
> the
> >> >> >> question
> >> >> >> > is
> >> >> >> > > >>>> > whether we should use CreateTime or LogAppendTime?
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > I completely agree that an application timestamp is
> very
> >> >> useful
> >> >> >> > for
> >> >> >> > > >>>> many
> >> >> >> > > >>>> > use cases. But it seems to me that having Kafka to
> >> understand
> >> >> >> and
> >> >> >> > > >>>> maintain
> >> >> >> > > >>>> > application timestamp is a bit over demanding. So I
> think
> >> >> there
> >> >> >> is
> >> >> >> > > >>>> value to
> >> >> >> > > >>>> > pass on CreateTime for application convenience, but I
> am
> >> not
> >> >> >> sure
> >> >> >> > it
> >> >> >> > > >>>> can
> >> >> >> > > >>>> > replace LogAppendTime. Managing out-of-order
> CreateTime is
> >> >> >> > > equivalent
> >> >> >> > > >>>> to
> >> >> >> > > >>>> > allowing producer to send their own offset and ask
> broker
> >> to
> >> >> >> > manage
> >> >> >> > > >>>> the
> >> >> >> > > >>>> > offset for them, It is going to be very hard to
> maintain
> >> and
> >> >> >> could
> >> >> >> > > >>>> create
> >> >> >> > > >>>> > huge performance/functional issue because of
> complicated
> >> >> logic.
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > About whether we should expose LogAppendTime to
> broker, I
> >> >> agree
> >> >> >> > that
> >> >> >> > > >>>> server
> >> >> >> > > >>>> > timestamp is internal to broker, but isn't offset also
> an
> >> >> >> internal
> >> >> >> > > >>>> concept?
> >> >> >> > > >>>> > Arguably it's not provided by producer so consumer
> >> >> application
> >> >> >> > logic
> >> >> >> > > >>>> does
> >> >> >> > > >>>> > not have to know offset. But user needs to know offset
> >> >> because
> >> >> >> > they
> >> >> >> > > >>>> need to
> >> >> >> > > >>>> > know "where is the message" in the log. LogAppendTime
> >> >> provides
> >> >> >> the
> >> >> >> > > >>>> answer
> >> >> >> > > >>>> > of "When was the message appended" to the log. So
> >> personally
> >> >> I
> >> >> >> > think
> >> >> >> > > >>>> it is
> >> >> >> > > >>>> > reasonable to expose the LogAppendTime to consumers.
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > I can see some use cases of exposing the
> LogAppendTime, to
> >> >> name
> >> >> >> > > some:
> >> >> >> > > >>>> > 1. Let's say broker has 7 days of log retention, some
> >> >> >> application
> >> >> >> > > >>>> wants to
> >> >> >> > > >>>> > reprocess the data in past 3 days. User can simply
> provide
> >> >> the
> >> >> >> > > >>>> timestamp
> >> >> >> > > >>>> > and start consume.
> >> >> >> > > >>>> > 2. User can easily know lag by time.
> >> >> >> > > >>>> > 3. Cross cluster fail over. This is a more complicated
> use
> >> >> case,
> >> >> >> > > >>>> there are
> >> >> >> > > >>>> > two goals: 1) Not lose message; and 2) do not reconsume
> >> tons
> >> >> of
> >> >> >> > > >>>> messages.
> >> >> >> > > >>>> > Only knowing offset of cluster A won't help with
> finding
> >> fail
> >> >> >> over
> >> >> >> > > >>>> point in
> >> >> >> > > >>>> > cluster B  because an offset of a cluster means
> nothing to
> >> >> >> another
> >> >> >> > > >>>> cluster.
> >> >> >> > > >>>> > Timestamp however is a good cross cluster reference in
> >> this
> >> >> >> case.
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > Thanks,
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > Jiangjie (Becket) Qin
> >> >> >> > > >>>> >
> >> >> >> > > >>>> > On Thu, Sep 10, 2015 at 9:28 PM, Ewen Cheslack-Postava
> <
> >> >> >> > > >>>> e...@confluent.io>
> >> >> >> > > >>>> > wrote:
> >> >> >> > > >>>> >
> >> >> >> > > >>>> >> Re: MM preserving timestamps: Yes, this was how I
> >> >> interpreted
> >> >> >> the
> >> >> >> > > >>>> point in
> >> >> >> > > >>>> >> the KIP and I only raised the issue because it
> restricts
> >> the
> >> >> >> > > >>>> usefulness of
> >> >> >> > > >>>> >> timestamps anytime MM is involved. I agree it's not a
> >> deal
> >> >> >> > breaker,
> >> >> >> > > >>>> but I
> >> >> >> > > >>>> >> wanted to understand exact impact of the change. Some
> >> users
> >> >> >> seem
> >> >> >> > to
> >> >> >> > > >>>> want to
> >> >> >> > > >>>> >> be able to seek by application-defined timestamps
> >> (despite
> >> >> the
> >> >> >> > many
> >> >> >> > > >>>> obvious
> >> >> >> > > >>>> >> issues involved), and the proposal clearly would not
> >> support
> >> >> >> that
> >> >> >> > > >>>> unless
> >> >> >> > > >>>> >> the timestamps submitted with the produce requests
> were
> >> >> >> > respected.
> >> >> >> > > >>>> If we
> >> >> >> > > >>>> >> ignore client submitted timestamps, then we probably
> >> want to
> >> >> >> try
> >> >> >> > to
> >> >> >> > > >>>> hide
> >> >> >> > > >>>> >> the timestamps as much as possible in any public
> >> interface
> >> >> >> (e.g.
> >> >> >> > > >>>> never
> >> >> >> > > >>>> >> shows up in any public consumer APIs), but expose it
> just
> >> >> >> enough
> >> >> >> > to
> >> >> >> > > >>>> be
> >> >> >> > > >>>> >> useful for operational purposes.
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >> Sorry if my devil's advocate position / attempt to map
> >> the
> >> >> >> design
> >> >> >> > > >>>> space led
> >> >> >> > > >>>> >> to some confusion!
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >> -Ewen
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >> On Thu, Sep 10, 2015 at 5:48 PM, Jay Kreps <
> >> >> j...@confluent.io>
> >> >> >> > > wrote:
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >> > Ah, I see, I think I misunderstood about MM, it was
> >> called
> >> >> >> out
> >> >> >> > in
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > proposal and I thought you were saying you'd retain
> the
> >> >> >> > timestamp
> >> >> >> > > >>>> but I
> >> >> >> > > >>>> >> > think you're calling out that you're not. In that
> case
> >> >> you do
> >> >> >> > > have
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > opposite problem, right? When you add mirroring for
> a
> >> >> topic
> >> >> >> all
> >> >> >> > > >>>> that data
> >> >> >> > > >>>> >> > will have a timestamp of now and retention won't be
> >> right.
> >> >> >> Not
> >> >> >> > a
> >> >> >> > > >>>> blocker
> >> >> >> > > >>>> >> > but a bit of a gotcha.
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >> > -Jay
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >> > On Thu, Sep 10, 2015 at 5:40 PM, Joel Koshy <
> >> >> >> > jjkosh...@gmail.com
> >> >> >> > > >
> >> >> >> > > >>>> wrote:
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >> > > > Don't you see all the same issues you see with
> >> >> >> > client-defined
> >> >> >> > > >>>> >> > timestamp's
> >> >> >> > > >>>> >> > > > if you let mm control the timestamp as you were
> >> >> >> proposing?
> >> >> >> > > >>>> That means
> >> >> >> > > >>>> >> > > time
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > Actually I don't think that was in the proposal
> (or
> >> was
> >> >> >> it?).
> >> >> >> > > >>>> i.e., I
> >> >> >> > > >>>> >> > > think it was always supposed to be controlled by
> the
> >> >> broker
> >> >> >> > > (and
> >> >> >> > > >>>> not
> >> >> >> > > >>>> >> > > MM).
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > > Also, Joel, can you just confirm that you guys
> have
> >> >> >> talked
> >> >> >> > > >>>> through
> >> >> >> > > >>>> >> the
> >> >> >> > > >>>> >> > > > whole timestamp thing with the Samza folks at
> LI?
> >> The
> >> >> >> > reason
> >> >> >> > > I
> >> >> >> > > >>>> ask
> >> >> >> > > >>>> >> > about
> >> >> >> > > >>>> >> > > > this is that Samza and Kafka Streams (KIP-28)
> are
> >> both
> >> >> >> > trying
> >> >> >> > > >>>> to rely
> >> >> >> > > >>>> >> > on
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > We have not. This is a good point - we will
> >> follow-up.
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > > WRT your idea of a FollowerFetchRequestI had
> >> thought
> >> >> of a
> >> >> >> > > >>>> similar
> >> >> >> > > >>>> >> idea
> >> >> >> > > >>>> >> > > > where we use the leader's timestamps to
> >> approximately
> >> >> set
> >> >> >> > the
> >> >> >> > > >>>> >> > follower's
> >> >> >> > > >>>> >> > > > timestamps. I had thought of just adding a
> >> partition
> >> >> >> > metadata
> >> >> >> > > >>>> request
> >> >> >> > > >>>> >> > > that
> >> >> >> > > >>>> >> > > > would subsume the current offset/time lookup and
> >> >> could be
> >> >> >> > > used
> >> >> >> > > >>>> by the
> >> >> >> > > >>>> >> > > > follower to try to approximately keep their
> >> timestamps
> >> >> >> > > kosher.
> >> >> >> > > >>>> It's a
> >> >> >> > > >>>> >> > > > little hacky and doesn't help with MM but it is
> >> also
> >> >> >> maybe
> >> >> >> > > less
> >> >> >> > > >>>> >> > invasive
> >> >> >> > > >>>> >> > > so
> >> >> >> > > >>>> >> > > > that approach could be viable.
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > That would also work, but perhaps responding with
> the
> >> >> >> actual
> >> >> >> > > >>>> leader
> >> >> >> > > >>>> >> > > offset-timestamp entries (corresponding to the
> >> fetched
> >> >> >> > portion)
> >> >> >> > > >>>> would
> >> >> >> > > >>>> >> > > be exact and it should be small as well. Anyway,
> the
> >> >> main
> >> >> >> > > >>>> motivation
> >> >> >> > > >>>> >> > > in this was to avoid leaking server-side
> timestamps
> >> to
> >> >> the
> >> >> >> > > >>>> >> > > message-format if people think it is worth it so
> the
> >> >> >> > > >>>> alternatives are
> >> >> >> > > >>>> >> > > implementation details. My original instinct was
> >> that it
> >> >> >> also
> >> >> >> > > >>>> avoids a
> >> >> >> > > >>>> >> > > backwards incompatible change (but it does not
> >> because
> >> >> we
> >> >> >> > also
> >> >> >> > > >>>> have
> >> >> >> > > >>>> >> > > the relative offset change).
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > Thanks,
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > Joel
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > >
> >> >> >> > > >>>> >> > > >
> >> >> >> > > >>>> >> > > >
> >> >> >> > > >>>> >> > > > On Thu, Sep 10, 2015 at 3:36 PM, Joel Koshy <
> >> >> >> > > >>>> jjkosh...@gmail.com>
> >> >> >> > > >>>> >> > wrote:
> >> >> >> > > >>>> >> > > >
> >> >> >> > > >>>> >> > > >> I just wanted to comment on a few points made
> >> >> earlier in
> >> >> >> > > this
> >> >> >> > > >>>> >> thread:
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> Concerns on clock skew: at least for the
> original
> >> >> >> > proposal's
> >> >> >> > > >>>> scope
> >> >> >> > > >>>> >> > > >> (which was more for honoring retention
> >> broker-side)
> >> >> this
> >> >> >> > > >>>> would only
> >> >> >> > > >>>> >> be
> >> >> >> > > >>>> >> > > >> an issue when spanning leader movements right?
> >> i.e.,
> >> >> >> > leader
> >> >> >> > > >>>> >> migration
> >> >> >> > > >>>> >> > > >> latency has to be much less than clock skew for
> >> this
> >> >> to
> >> >> >> > be a
> >> >> >> > > >>>> real
> >> >> >> > > >>>> >> > > >> issue wouldn’t it?
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> Client timestamp vs broker timestamp: I’m not
> sure
> >> >> Kafka
> >> >> >> > > >>>> (brokers)
> >> >> >> > > >>>> >> are
> >> >> >> > > >>>> >> > > >> the right place to reason about client-side
> >> >> timestamps
> >> >> >> > > >>>> precisely due
> >> >> >> > > >>>> >> > > >> to the nuances that have been discussed at
> length
> >> in
> >> >> >> this
> >> >> >> > > >>>> thread. My
> >> >> >> > > >>>> >> > > >> preference would have been to the timestamp
> (now
> >> >> called
> >> >> >> > > >>>> >> > > >> LogAppendTimestamp) have nothing to do with the
> >> >> >> > > applications.
> >> >> >> > > >>>> Ewen
> >> >> >> > > >>>> >> > > >> raised a valid concern about leaking such
> >> >> >> > > >>>> “private/server-side”
> >> >> >> > > >>>> >> > > >> timestamps into the protocol spec. i.e., it is
> >> fine
> >> >> to
> >> >> >> > have
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > > >> CreateTime which is expressly client-provided
> and
> >> >> >> > immutable
> >> >> >> > > >>>> >> > > >> thereafter, but the LogAppendTime is also going
> >> part
> >> >> of
> >> >> >> > the
> >> >> >> > > >>>> protocol
> >> >> >> > > >>>> >> > > >> and it would be good to avoid exposure (to
> client
> >> >> >> > > developers)
> >> >> >> > > >>>> if
> >> >> >> > > >>>> >> > > >> possible. Ok, so here is a slightly different
> >> >> approach
> >> >> >> > that
> >> >> >> > > I
> >> >> >> > > >>>> was
> >> >> >> > > >>>> >> just
> >> >> >> > > >>>> >> > > >> thinking about (and did not think too far so it
> >> may
> >> >> not
> >> >> >> > > >>>> work): do
> >> >> >> > > >>>> >> not
> >> >> >> > > >>>> >> > > >> add the LogAppendTime to messages. Instead,
> build
> >> the
> >> >> >> > > >>>> time-based
> >> >> >> > > >>>> >> index
> >> >> >> > > >>>> >> > > >> on the server side on message arrival time
> alone.
> >> >> >> > Introduce
> >> >> >> > > a
> >> >> >> > > >>>> new
> >> >> >> > > >>>> >> > > >> ReplicaFetchRequest/Response pair.
> >> >> ReplicaFetchResponses
> >> >> >> > > will
> >> >> >> > > >>>> also
> >> >> >> > > >>>> >> > > >> include the slice of the time-based index for
> the
> >> >> >> follower
> >> >> >> > > >>>> broker.
> >> >> >> > > >>>> >> > > >> This way we can at least keep timestamps
> aligned
> >> >> across
> >> >> >> > > >>>> brokers for
> >> >> >> > > >>>> >> > > >> retention purposes. We do lose the append
> >> timestamp
> >> >> for
> >> >> >> > > >>>> mirroring
> >> >> >> > > >>>> >> > > >> pipelines (which appears to be the case in
> KIP-32
> >> as
> >> >> >> > well).
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> Configurable index granularity: We can do this
> but
> >> >> I’m
> >> >> >> not
> >> >> >> > > >>>> sure it
> >> >> >> > > >>>> >> is
> >> >> >> > > >>>> >> > > >> very useful and as Jay noted, a major change
> from
> >> the
> >> >> >> old
> >> >> >> > > >>>> proposal
> >> >> >> > > >>>> >> > > >> linked from the KIP is the sparse time-based
> index
> >> >> which
> >> >> >> > we
> >> >> >> > > >>>> felt was
> >> >> >> > > >>>> >> > > >> essential to bound memory usage (and having
> >> >> timestamps
> >> >> >> on
> >> >> >> > > >>>> each log
> >> >> >> > > >>>> >> > > >> index entry was probably a big waste since in
> the
> >> >> common
> >> >> >> > > case
> >> >> >> > > >>>> >> several
> >> >> >> > > >>>> >> > > >> messages span the same timestamp). BTW another
> >> >> benefit
> >> >> >> of
> >> >> >> > > the
> >> >> >> > > >>>> second
> >> >> >> > > >>>> >> > > >> index is that it makes it easier to roll-back
> or
> >> >> throw
> >> >> >> > away
> >> >> >> > > if
> >> >> >> > > >>>> >> > > >> necessary (vs. modifying the existing index
> >> format) -
> >> >> >> > > >>>> although that
> >> >> >> > > >>>> >> > > >> obviously does not help with rolling back the
> >> >> timestamp
> >> >> >> > > >>>> change in
> >> >> >> > > >>>> >> the
> >> >> >> > > >>>> >> > > >> message format, but it is one less thing to
> worry
> >> >> about.
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> Versioning: I’m not sure everyone is saying the
> >> same
> >> >> >> thing
> >> >> >> > > >>>> wrt the
> >> >> >> > > >>>> >> > > >> scope of this. There is the record format
> change,
> >> >> but I
> >> >> >> > also
> >> >> >> > > >>>> think
> >> >> >> > > >>>> >> > > >> this ties into all of the API versioning that
> we
> >> >> already
> >> >> >> > > have
> >> >> >> > > >>>> in
> >> >> >> > > >>>> >> > > >> Kafka. The current API versioning approach
> works
> >> fine
> >> >> >> for
> >> >> >> > > >>>> >> > > >> upgrades/downgrades across official Kafka
> >> releases,
> >> >> but
> >> >> >> > not
> >> >> >> > > >>>> so well
> >> >> >> > > >>>> >> > > >> between releases. (We almost got bitten by
> this at
> >> >> >> > LinkedIn
> >> >> >> > > >>>> with the
> >> >> >> > > >>>> >> > > >> recent changes to various requests but were
> able
> >> to
> >> >> work
> >> >> >> > > >>>> around
> >> >> >> > > >>>> >> > > >> these.) We can clarify this in the follow-up
> KIP.
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> Thanks,
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> Joel
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > > >> On Thu, Sep 10, 2015 at 3:00 PM, Jiangjie Qin
> >> >> >> > > >>>> >> > <j...@linkedin.com.invalid
> >> >> >> > > >>>> >> > > >
> >> >> >> > > >>>> >> > > >> wrote:
> >> >> >> > > >>>> >> > > >> > Hi Jay,
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> > I just changed the KIP title and updated the
> KIP
> >> >> page.
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> > And yes, we are working on a general version
> >> >> control
> >> >> >> > > >>>> proposal to
> >> >> >> > > >>>> >> > make
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> > protocol migration like this more smooth. I
> will
> >> >> also
> >> >> >> > > >>>> create a KIP
> >> >> >> > > >>>> >> > for
> >> >> >> > > >>>> >> > > >> that
> >> >> >> > > >>>> >> > > >> > soon.
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> > Thanks,
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> > Jiangjie (Becket) Qin
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> > On Thu, Sep 10, 2015 at 2:21 PM, Jay Kreps <
> >> >> >> > > >>>> j...@confluent.io>
> >> >> >> > > >>>> >> > wrote:
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> >> Great, can we change the name to something
> >> >> related to
> >> >> >> > the
> >> >> >> > > >>>> >> > > >> change--"KIP-31:
> >> >> >> > > >>>> >> > > >> >> Move to relative offsets in compressed
> message
> >> >> sets".
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >> >> Also you had mentioned before you were
> going to
> >> >> >> expand
> >> >> >> > on
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > > mechanics
> >> >> >> > > >>>> >> > > >> of
> >> >> >> > > >>>> >> > > >> >> handling these log format changes, right?
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >> >> -Jay
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >> >> On Thu, Sep 10, 2015 at 12:42 PM, Jiangjie
> Qin
> >> >> >> > > >>>> >> > > >> <j...@linkedin.com.invalid>
> >> >> >> > > >>>> >> > > >> >> wrote:
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >> >> > Neha and Jay,
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > Thanks a lot for the feedback. Good point
> >> about
> >> >> >> > > >>>> splitting the
> >> >> >> > > >>>> >> > > >> >> discussion. I
> >> >> >> > > >>>> >> > > >> >> > have split the proposal to three KIPs and
> it
> >> >> does
> >> >> >> > make
> >> >> >> > > >>>> each
> >> >> >> > > >>>> >> > > discussion
> >> >> >> > > >>>> >> > > >> >> more
> >> >> >> > > >>>> >> > > >> >> > clear:
> >> >> >> > > >>>> >> > > >> >> > KIP-31 - Message format change (Use
> relative
> >> >> >> offset)
> >> >> >> > > >>>> >> > > >> >> > KIP-32 - Add CreateTime and LogAppendTime
> to
> >> >> Kafka
> >> >> >> > > >>>> message
> >> >> >> > > >>>> >> > > >> >> > KIP-33 - Build a time-based log index
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > KIP-33 can be a follow up KIP for KIP-32,
> so
> >> we
> >> >> can
> >> >> >> > > >>>> discuss
> >> >> >> > > >>>> >> about
> >> >> >> > > >>>> >> > > >> KIP-31
> >> >> >> > > >>>> >> > > >> >> > and KIP-32 first for now. I will create a
> >> >> separate
> >> >> >> > > >>>> discussion
> >> >> >> > > >>>> >> > > thread
> >> >> >> > > >>>> >> > > >> for
> >> >> >> > > >>>> >> > > >> >> > KIP-32 and reply the concerns you raised
> >> >> regarding
> >> >> >> > the
> >> >> >> > > >>>> >> timestamp.
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > So far it looks there is no objection to
> >> KIP-31.
> >> >> >> > Since
> >> >> >> > > I
> >> >> >> > > >>>> >> removed
> >> >> >> > > >>>> >> > a
> >> >> >> > > >>>> >> > > few
> >> >> >> > > >>>> >> > > >> >> part
> >> >> >> > > >>>> >> > > >> >> > from previous KIP and only left the
> relative
> >> >> offset
> >> >> >> > > >>>> proposal,
> >> >> >> > > >>>> >> it
> >> >> >> > > >>>> >> > > >> would be
> >> >> >> > > >>>> >> > > >> >> > great if people can take another look to
> see
> >> if
> >> >> >> there
> >> >> >> > > is
> >> >> >> > > >>>> any
> >> >> >> > > >>>> >> > > concerns.
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > Thanks,
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > Jiangjie (Becket) Qin
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > On Tue, Sep 8, 2015 at 1:28 PM, Neha
> >> Narkhede <
> >> >> >> > > >>>> >> n...@confluent.io
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> > > >> wrote:
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > Becket,
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > Nice write-up. Few thoughts -
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > I'd split up the discussion for
> simplicity.
> >> >> Note
> >> >> >> > that
> >> >> >> > > >>>> you can
> >> >> >> > > >>>> >> > > always
> >> >> >> > > >>>> >> > > >> >> > group
> >> >> >> > > >>>> >> > > >> >> > > several of these in one patch to reduce
> the
> >> >> >> > protocol
> >> >> >> > > >>>> changes
> >> >> >> > > >>>> >> > > people
> >> >> >> > > >>>> >> > > >> >> have
> >> >> >> > > >>>> >> > > >> >> > to
> >> >> >> > > >>>> >> > > >> >> > > deal with.This is just a suggestion,
> but I
> >> >> think
> >> >> >> > the
> >> >> >> > > >>>> >> following
> >> >> >> > > >>>> >> > > split
> >> >> >> > > >>>> >> > > >> >> > might
> >> >> >> > > >>>> >> > > >> >> > > make it easier to tackle the changes
> being
> >> >> >> > proposed -
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > >    - Relative offsets
> >> >> >> > > >>>> >> > > >> >> > >    - Introducing the concept of time
> >> >> >> > > >>>> >> > > >> >> > >    - Time-based indexing (separate the
> >> usage
> >> >> of
> >> >> >> the
> >> >> >> > > >>>> timestamp
> >> >> >> > > >>>> >> > > field
> >> >> >> > > >>>> >> > > >> >> from
> >> >> >> > > >>>> >> > > >> >> > >    how/whether we want to include a
> >> timestamp
> >> >> in
> >> >> >> > the
> >> >> >> > > >>>> message)
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > I'm a +1 on relative offsets, we
> should've
> >> >> done
> >> >> >> it
> >> >> >> > > >>>> back when
> >> >> >> > > >>>> >> we
> >> >> >> > > >>>> >> > > >> >> > introduced
> >> >> >> > > >>>> >> > > >> >> > > it. Other than reducing the CPU
> overhead,
> >> this
> >> >> >> will
> >> >> >> > > >>>> also
> >> >> >> > > >>>> >> reduce
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> >> > garbage
> >> >> >> > > >>>> >> > > >> >> > > collection overhead on the brokers.
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > On the timestamp field, I generally
> agree
> >> >> that we
> >> >> >> > > >>>> should add
> >> >> >> > > >>>> >> a
> >> >> >> > > >>>> >> > > >> >> timestamp
> >> >> >> > > >>>> >> > > >> >> > to
> >> >> >> > > >>>> >> > > >> >> > > a Kafka message but I'm not quite sold
> on
> >> how
> >> >> >> this
> >> >> >> > > KIP
> >> >> >> > > >>>> >> suggests
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> >> > > timestamp be set. Will avoid repeating
> the
> >> >> >> > downsides
> >> >> >> > > >>>> of a
> >> >> >> > > >>>> >> > broker
> >> >> >> > > >>>> >> > > >> side
> >> >> >> > > >>>> >> > > >> >> > > timestamp mentioned previously in this
> >> >> thread. I
> >> >> >> > > think
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > topic
> >> >> >> > > >>>> >> > > of
> >> >> >> > > >>>> >> > > >> >> > > including a timestamp in a Kafka message
> >> >> >> requires a
> >> >> >> > > >>>> lot more
> >> >> >> > > >>>> >> > > thought
> >> >> >> > > >>>> >> > > >> >> and
> >> >> >> > > >>>> >> > > >> >> > > details than what's in this KIP. I'd
> >> suggest
> >> >> we
> >> >> >> > make
> >> >> >> > > >>>> it a
> >> >> >> > > >>>> >> > > separate
> >> >> >> > > >>>> >> > > >> KIP
> >> >> >> > > >>>> >> > > >> >> > that
> >> >> >> > > >>>> >> > > >> >> > > includes a list of all the different use
> >> cases
> >> >> >> for
> >> >> >> > > the
> >> >> >> > > >>>> >> > timestamp
> >> >> >> > > >>>> >> > > >> >> (beyond
> >> >> >> > > >>>> >> > > >> >> > > log retention) including stream
> processing
> >> and
> >> >> >> > > discuss
> >> >> >> > > >>>> >> > tradeoffs
> >> >> >> > > >>>> >> > > of
> >> >> >> > > >>>> >> > > >> >> > > including client and broker side
> >> timestamps.
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > Agree with the benefit of time-based
> >> indexing,
> >> >> >> but
> >> >> >> > > >>>> haven't
> >> >> >> > > >>>> >> had
> >> >> >> > > >>>> >> > a
> >> >> >> > > >>>> >> > > >> chance
> >> >> >> > > >>>> >> > > >> >> > to
> >> >> >> > > >>>> >> > > >> >> > > dive into the design details yet.
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > Thanks,
> >> >> >> > > >>>> >> > > >> >> > > Neha
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > On Tue, Sep 8, 2015 at 10:57 AM, Jay
> Kreps
> >> <
> >> >> >> > > >>>> j...@confluent.io
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >> > > >> wrote:
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > > Hey Beckett,
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > I was proposing splitting up the KIP
> just
> >> >> for
> >> >> >> > > >>>> simplicity of
> >> >> >> > > >>>> >> > > >> >> discussion.
> >> >> >> > > >>>> >> > > >> >> > > You
> >> >> >> > > >>>> >> > > >> >> > > > can still implement them in one
> patch. I
> >> >> think
> >> >> >> > > >>>> otherwise it
> >> >> >> > > >>>> >> > > will
> >> >> >> > > >>>> >> > > >> be
> >> >> >> > > >>>> >> > > >> >> > hard
> >> >> >> > > >>>> >> > > >> >> > > to
> >> >> >> > > >>>> >> > > >> >> > > > discuss/vote on them since if you like
> >> the
> >> >> >> offset
> >> >> >> > > >>>> proposal
> >> >> >> > > >>>> >> > but
> >> >> >> > > >>>> >> > > not
> >> >> >> > > >>>> >> > > >> >> the
> >> >> >> > > >>>> >> > > >> >> > > time
> >> >> >> > > >>>> >> > > >> >> > > > proposal what do you do?
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > Introducing a second notion of time
> into
> >> >> Kafka
> >> >> >> > is a
> >> >> >> > > >>>> pretty
> >> >> >> > > >>>> >> > > massive
> >> >> >> > > >>>> >> > > >> >> > > > philosophical change so it kind of
> >> warrants
> >> >> >> it's
> >> >> >> > > own
> >> >> >> > > >>>> KIP I
> >> >> >> > > >>>> >> > > think
> >> >> >> > > >>>> >> > > >> it
> >> >> >> > > >>>> >> > > >> >> > isn't
> >> >> >> > > >>>> >> > > >> >> > > > just "Change message format".
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > WRT time I think one thing to clarify
> in
> >> the
> >> >> >> > > >>>> proposal is
> >> >> >> > > >>>> >> how
> >> >> >> > > >>>> >> > MM
> >> >> >> > > >>>> >> > > >> will
> >> >> >> > > >>>> >> > > >> >> > have
> >> >> >> > > >>>> >> > > >> >> > > > access to set the timestamp?
> Presumably
> >> this
> >> >> >> will
> >> >> >> > > be
> >> >> >> > > >>>> a new
> >> >> >> > > >>>> >> > > field
> >> >> >> > > >>>> >> > > >> in
> >> >> >> > > >>>> >> > > >> >> > > > ProducerRecord, right? If so then any
> >> user
> >> >> can
> >> >> >> > set
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > > timestamp,
> >> >> >> > > >>>> >> > > >> >> > right?
> >> >> >> > > >>>> >> > > >> >> > > > I'm not sure you answered the
> questions
> >> >> around
> >> >> >> > how
> >> >> >> > > >>>> this
> >> >> >> > > >>>> >> will
> >> >> >> > > >>>> >> > > work
> >> >> >> > > >>>> >> > > >> for
> >> >> >> > > >>>> >> > > >> >> > MM
> >> >> >> > > >>>> >> > > >> >> > > > since when MM retains timestamps from
> >> >> multiple
> >> >> >> > > >>>> partitions
> >> >> >> > > >>>> >> > they
> >> >> >> > > >>>> >> > > >> will
> >> >> >> > > >>>> >> > > >> >> > then
> >> >> >> > > >>>> >> > > >> >> > > be
> >> >> >> > > >>>> >> > > >> >> > > > out of order and in the past (so the
> >> >> >> > > >>>> >> > max(lastAppendedTimestamp,
> >> >> >> > > >>>> >> > > >> >> > > > currentTimeMillis) override you
> proposed
> >> >> will
> >> >> >> not
> >> >> >> > > >>>> work,
> >> >> >> > > >>>> >> > > right?).
> >> >> >> > > >>>> >> > > >> If
> >> >> >> > > >>>> >> > > >> >> we
> >> >> >> > > >>>> >> > > >> >> > > > don't do this then when you set up
> >> mirroring
> >> >> >> the
> >> >> >> > > >>>> data will
> >> >> >> > > >>>> >> > all
> >> >> >> > > >>>> >> > > be
> >> >> >> > > >>>> >> > > >> new
> >> >> >> > > >>>> >> > > >> >> > and
> >> >> >> > > >>>> >> > > >> >> > > > you have the same retention problem
> you
> >> >> >> > described.
> >> >> >> > > >>>> Maybe I
> >> >> >> > > >>>> >> > > missed
> >> >> >> > > >>>> >> > > >> >> > > > something...?
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > My main motivation is that given that
> >> both
> >> >> >> Samza
> >> >> >> > > and
> >> >> >> > > >>>> Kafka
> >> >> >> > > >>>> >> > > streams
> >> >> >> > > >>>> >> > > >> >> are
> >> >> >> > > >>>> >> > > >> >> > > > doing work that implies a mandatory
> >> >> >> > client-defined
> >> >> >> > > >>>> notion
> >> >> >> > > >>>> >> of
> >> >> >> > > >>>> >> > > >> time, I
> >> >> >> > > >>>> >> > > >> >> > > really
> >> >> >> > > >>>> >> > > >> >> > > > think introducing a different
> mandatory
> >> >> notion
> >> >> >> of
> >> >> >> > > >>>> time in
> >> >> >> > > >>>> >> > > Kafka is
> >> >> >> > > >>>> >> > > >> >> > going
> >> >> >> > > >>>> >> > > >> >> > > to
> >> >> >> > > >>>> >> > > >> >> > > > be quite odd. We should think hard
> about
> >> how
> >> >> >> > > >>>> client-defined
> >> >> >> > > >>>> >> > > time
> >> >> >> > > >>>> >> > > >> >> could
> >> >> >> > > >>>> >> > > >> >> > > > work. I'm not sure if it can, but I'm
> >> also
> >> >> not
> >> >> >> > sure
> >> >> >> > > >>>> that it
> >> >> >> > > >>>> >> > > can't.
> >> >> >> > > >>>> >> > > >> >> > Having
> >> >> >> > > >>>> >> > > >> >> > > > both will be odd. Did you chat about
> this
> >> >> with
> >> >> >> > > >>>> Yi/Kartik on
> >> >> >> > > >>>> >> > the
> >> >> >> > > >>>> >> > > >> Samza
> >> >> >> > > >>>> >> > > >> >> > > side?
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > When you are saying it won't work you
> are
> >> >> >> > assuming
> >> >> >> > > >>>> some
> >> >> >> > > >>>> >> > > particular
> >> >> >> > > >>>> >> > > >> >> > > > implementation? Maybe that the index
> is a
> >> >> >> > > >>>> monotonically
> >> >> >> > > >>>> >> > > increasing
> >> >> >> > > >>>> >> > > >> >> set
> >> >> >> > > >>>> >> > > >> >> > of
> >> >> >> > > >>>> >> > > >> >> > > > pointers to the least record with a
> >> >> timestamp
> >> >> >> > > larger
> >> >> >> > > >>>> than
> >> >> >> > > >>>> >> the
> >> >> >> > > >>>> >> > > >> index
> >> >> >> > > >>>> >> > > >> >> > time?
> >> >> >> > > >>>> >> > > >> >> > > > In other words a search for time X
> gives
> >> the
> >> >> >> > > largest
> >> >> >> > > >>>> offset
> >> >> >> > > >>>> >> > at
> >> >> >> > > >>>> >> > > >> which
> >> >> >> > > >>>> >> > > >> >> > all
> >> >> >> > > >>>> >> > > >> >> > > > records are <= X?
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > For retention, I agree with the
> problem
> >> you
> >> >> >> point
> >> >> >> > > >>>> out, but
> >> >> >> > > >>>> >> I
> >> >> >> > > >>>> >> > > think
> >> >> >> > > >>>> >> > > >> >> what
> >> >> >> > > >>>> >> > > >> >> > > you
> >> >> >> > > >>>> >> > > >> >> > > > are saying in that case is that you
> want
> >> a
> >> >> size
> >> >> >> > > >>>> limit too.
> >> >> >> > > >>>> >> If
> >> >> >> > > >>>> >> > > you
> >> >> >> > > >>>> >> > > >> use
> >> >> >> > > >>>> >> > > >> >> > > > system time you actually hit the same
> >> >> problem:
> >> >> >> > say
> >> >> >> > > >>>> you do a
> >> >> >> > > >>>> >> > > full
> >> >> >> > > >>>> >> > > >> dump
> >> >> >> > > >>>> >> > > >> >> > of
> >> >> >> > > >>>> >> > > >> >> > > a
> >> >> >> > > >>>> >> > > >> >> > > > DB table with a setting of 7 days
> >> retention,
> >> >> >> your
> >> >> >> > > >>>> retention
> >> >> >> > > >>>> >> > > will
> >> >> >> > > >>>> >> > > >> >> > actually
> >> >> >> > > >>>> >> > > >> >> > > > not get enforced for the first 7 days
> >> >> because
> >> >> >> the
> >> >> >> > > >>>> data is
> >> >> >> > > >>>> >> > "new
> >> >> >> > > >>>> >> > > to
> >> >> >> > > >>>> >> > > >> >> > Kafka".
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > -Jay
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > On Mon, Sep 7, 2015 at 10:44 AM,
> Jiangjie
> >> >> Qin
> >> >> >> > > >>>> >> > > >> >> > <j...@linkedin.com.invalid
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > wrote:
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > > > > Jay,
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > Thanks for the comments. Yes, there
> are
> >> >> >> > actually
> >> >> >> > > >>>> three
> >> >> >> > > >>>> >> > > >> proposals as
> >> >> >> > > >>>> >> > > >> >> > you
> >> >> >> > > >>>> >> > > >> >> > > > > pointed out.
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > We will have a separate proposal for
> >> (1) -
> >> >> >> > > version
> >> >> >> > > >>>> >> control
> >> >> >> > > >>>> >> > > >> >> mechanism.
> >> >> >> > > >>>> >> > > >> >> > > We
> >> >> >> > > >>>> >> > > >> >> > > > > actually thought about whether we
> want
> >> to
> >> >> >> > > separate
> >> >> >> > > >>>> 2 and
> >> >> >> > > >>>> >> 3
> >> >> >> > > >>>> >> > > >> >> internally
> >> >> >> > > >>>> >> > > >> >> > > > > before creating the KIP. The reason
> we
> >> >> put 2
> >> >> >> > and
> >> >> >> > > 3
> >> >> >> > > >>>> >> together
> >> >> >> > > >>>> >> > > is
> >> >> >> > > >>>> >> > > >> it
> >> >> >> > > >>>> >> > > >> >> > will
> >> >> >> > > >>>> >> > > >> >> > > > > saves us another cross board wire
> >> protocol
> >> >> >> > > change.
> >> >> >> > > >>>> Like
> >> >> >> > > >>>> >> you
> >> >> >> > > >>>> >> > > >> said,
> >> >> >> > > >>>> >> > > >> >> we
> >> >> >> > > >>>> >> > > >> >> > > have
> >> >> >> > > >>>> >> > > >> >> > > > > to migrate all the clients in all
> >> >> languages.
> >> >> >> To
> >> >> >> > > >>>> some
> >> >> >> > > >>>> >> > extent,
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> >> > effort
> >> >> >> > > >>>> >> > > >> >> > > > to
> >> >> >> > > >>>> >> > > >> >> > > > > spend on upgrading the clients can
> be
> >> even
> >> >> >> > bigger
> >> >> >> > > >>>> than
> >> >> >> > > >>>> >> > > >> implementing
> >> >> >> > > >>>> >> > > >> >> > the
> >> >> >> > > >>>> >> > > >> >> > > > new
> >> >> >> > > >>>> >> > > >> >> > > > > feature itself. So there are some
> >> >> attractions
> >> >> >> > if
> >> >> >> > > >>>> we can
> >> >> >> > > >>>> >> do
> >> >> >> > > >>>> >> > 2
> >> >> >> > > >>>> >> > > >> and 3
> >> >> >> > > >>>> >> > > >> >> > > > together
> >> >> >> > > >>>> >> > > >> >> > > > > instead of separately. Maybe after
> (1)
> >> is
> >> >> >> done
> >> >> >> > it
> >> >> >> > > >>>> will be
> >> >> >> > > >>>> >> > > >> easier to
> >> >> >> > > >>>> >> > > >> >> > do
> >> >> >> > > >>>> >> > > >> >> > > > > protocol migration. But if we are
> able
> >> to
> >> >> >> come
> >> >> >> > to
> >> >> >> > > >>>> an
> >> >> >> > > >>>> >> > > agreement
> >> >> >> > > >>>> >> > > >> on
> >> >> >> > > >>>> >> > > >> >> the
> >> >> >> > > >>>> >> > > >> >> > > > > timestamp solution, I would prefer
> to
> >> >> have it
> >> >> >> > > >>>> together
> >> >> >> > > >>>> >> with
> >> >> >> > > >>>> >> > > >> >> relative
> >> >> >> > > >>>> >> > > >> >> > > > offset
> >> >> >> > > >>>> >> > > >> >> > > > > in the interest of avoiding another
> >> wire
> >> >> >> > protocol
> >> >> >> > > >>>> change
> >> >> >> > > >>>> >> > (the
> >> >> >> > > >>>> >> > > >> >> process
> >> >> >> > > >>>> >> > > >> >> > > to
> >> >> >> > > >>>> >> > > >> >> > > > > migrate to relative offset is
> exactly
> >> the
> >> >> >> same
> >> >> >> > as
> >> >> >> > > >>>> migrate
> >> >> >> > > >>>> >> > to
> >> >> >> > > >>>> >> > > >> >> message
> >> >> >> > > >>>> >> > > >> >> > > with
> >> >> >> > > >>>> >> > > >> >> > > > > timestamp).
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > In terms of timestamp. I completely
> >> agree
> >> >> >> that
> >> >> >> > > >>>> having
> >> >> >> > > >>>> >> > client
> >> >> >> > > >>>> >> > > >> >> > timestamp
> >> >> >> > > >>>> >> > > >> >> > > is
> >> >> >> > > >>>> >> > > >> >> > > > > more useful if we can make sure the
> >> >> timestamp
> >> >> >> > is
> >> >> >> > > >>>> good.
> >> >> >> > > >>>> >> But
> >> >> >> > > >>>> >> > in
> >> >> >> > > >>>> >> > > >> >> reality
> >> >> >> > > >>>> >> > > >> >> > > > that
> >> >> >> > > >>>> >> > > >> >> > > > > can be a really big *IF*. I think
> the
> >> >> problem
> >> >> >> > is
> >> >> >> > > >>>> exactly
> >> >> >> > > >>>> >> as
> >> >> >> > > >>>> >> > > Ewen
> >> >> >> > > >>>> >> > > >> >> > > > mentioned,
> >> >> >> > > >>>> >> > > >> >> > > > > if we let the client to set the
> >> >> timestamp, it
> >> >> >> > > >>>> would be
> >> >> >> > > >>>> >> very
> >> >> >> > > >>>> >> > > hard
> >> >> >> > > >>>> >> > > >> >> for
> >> >> >> > > >>>> >> > > >> >> > > the
> >> >> >> > > >>>> >> > > >> >> > > > > broker to utilize it. If broker
> apply
> >> >> >> retention
> >> >> >> > > >>>> policy
> >> >> >> > > >>>> >> > based
> >> >> >> > > >>>> >> > > on
> >> >> >> > > >>>> >> > > >> the
> >> >> >> > > >>>> >> > > >> >> > > > client
> >> >> >> > > >>>> >> > > >> >> > > > > timestamp. One misbehave producer
> can
> >> >> >> > potentially
> >> >> >> > > >>>> >> > completely
> >> >> >> > > >>>> >> > > >> mess
> >> >> >> > > >>>> >> > > >> >> up
> >> >> >> > > >>>> >> > > >> >> > > the
> >> >> >> > > >>>> >> > > >> >> > > > > retention policy on the broker.
> >> Although
> >> >> >> people
> >> >> >> > > >>>> don't
> >> >> >> > > >>>> >> care
> >> >> >> > > >>>> >> > > about
> >> >> >> > > >>>> >> > > >> >> > server
> >> >> >> > > >>>> >> > > >> >> > > > > side timestamp. People do care a lot
> >> when
> >> >> >> > > timestamp
> >> >> >> > > >>>> >> breaks.
> >> >> >> > > >>>> >> > > >> >> Searching
> >> >> >> > > >>>> >> > > >> >> > > by
> >> >> >> > > >>>> >> > > >> >> > > > > timestamp is a really important use
> >> case
> >> >> even
> >> >> >> > > >>>> though it
> >> >> >> > > >>>> >> is
> >> >> >> > > >>>> >> > > not
> >> >> >> > > >>>> >> > > >> used
> >> >> >> > > >>>> >> > > >> >> > as
> >> >> >> > > >>>> >> > > >> >> > > > > often as searching by offset. It has
> >> >> >> > significant
> >> >> >> > > >>>> direct
> >> >> >> > > >>>> >> > > impact
> >> >> >> > > >>>> >> > > >> on
> >> >> >> > > >>>> >> > > >> >> RTO
> >> >> >> > > >>>> >> > > >> >> > > > when
> >> >> >> > > >>>> >> > > >> >> > > > > there is a cross cluster failover as
> >> Todd
> >> >> >> > > >>>> mentioned.
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > The trick using
> >> max(lastAppendedTimestamp,
> >> >> >> > > >>>> >> > currentTimeMillis)
> >> >> >> > > >>>> >> > > >> is to
> >> >> >> > > >>>> >> > > >> >> > > > > guarantee monotonic increase of the
> >> >> >> timestamp.
> >> >> >> > > Many
> >> >> >> > > >>>> >> > > commercial
> >> >> >> > > >>>> >> > > >> >> system
> >> >> >> > > >>>> >> > > >> >> > > > > actually do something similar to
> this
> >> to
> >> >> >> solve
> >> >> >> > > the
> >> >> >> > > >>>> time
> >> >> >> > > >>>> >> > skew.
> >> >> >> > > >>>> >> > > >> About
> >> >> >> > > >>>> >> > > >> >> > > > > changing the time, I am not sure if
> >> people
> >> >> >> use
> >> >> >> > > NTP
> >> >> >> > > >>>> like
> >> >> >> > > >>>> >> > > using a
> >> >> >> > > >>>> >> > > >> >> watch
> >> >> >> > > >>>> >> > > >> >> > > to
> >> >> >> > > >>>> >> > > >> >> > > > > just set it forward/backward by an
> >> hour or
> >> >> >> so.
> >> >> >> > > The
> >> >> >> > > >>>> time
> >> >> >> > > >>>> >> > > >> adjustment
> >> >> >> > > >>>> >> > > >> >> I
> >> >> >> > > >>>> >> > > >> >> > > used
> >> >> >> > > >>>> >> > > >> >> > > > > to do is typically to adjust
> something
> >> >> like a
> >> >> >> > > >>>> minute  /
> >> >> >> > > >>>> >> > > week. So
> >> >> >> > > >>>> >> > > >> >> for
> >> >> >> > > >>>> >> > > >> >> > > each
> >> >> >> > > >>>> >> > > >> >> > > > > second, there might be a few
> >> mircoseconds
> >> >> >> > > >>>> slower/faster
> >> >> >> > > >>>> >> but
> >> >> >> > > >>>> >> > > >> should
> >> >> >> > > >>>> >> > > >> >> > not
> >> >> >> > > >>>> >> > > >> >> > > > > break the clock completely to make
> sure
> >> >> all
> >> >> >> the
> >> >> >> > > >>>> >> time-based
> >> >> >> > > >>>> >> > > >> >> > transactions
> >> >> >> > > >>>> >> > > >> >> > > > are
> >> >> >> > > >>>> >> > > >> >> > > > > not affected. The one minute change
> >> will
> >> >> be
> >> >> >> > done
> >> >> >> > > >>>> within a
> >> >> >> > > >>>> >> > > week
> >> >> >> > > >>>> >> > > >> but
> >> >> >> > > >>>> >> > > >> >> > not
> >> >> >> > > >>>> >> > > >> >> > > > > instantly.
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > Personally, I think having client
> side
> >> >> >> > timestamp
> >> >> >> > > >>>> will be
> >> >> >> > > >>>> >> > > useful
> >> >> >> > > >>>> >> > > >> if
> >> >> >> > > >>>> >> > > >> >> we
> >> >> >> > > >>>> >> > > >> >> > > > don't
> >> >> >> > > >>>> >> > > >> >> > > > > need to put the broker and data
> >> integrity
> >> >> >> under
> >> >> >> > > >>>> risk. If
> >> >> >> > > >>>> >> we
> >> >> >> > > >>>> >> > > >> have to
> >> >> >> > > >>>> >> > > >> >> > > > choose
> >> >> >> > > >>>> >> > > >> >> > > > > from one of them but not both. I
> would
> >> >> prefer
> >> >> >> > > >>>> server side
> >> >> >> > > >>>> >> > > >> timestamp
> >> >> >> > > >>>> >> > > >> >> > > > because
> >> >> >> > > >>>> >> > > >> >> > > > > for client side timestamp there is
> >> always
> >> >> a
> >> >> >> > plan
> >> >> >> > > B
> >> >> >> > > >>>> which
> >> >> >> > > >>>> >> is
> >> >> >> > > >>>> >> > > >> putting
> >> >> >> > > >>>> >> > > >> >> > the
> >> >> >> > > >>>> >> > > >> >> > > > > timestamp into payload.
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > Another reason I am reluctant to use
> >> the
> >> >> >> client
> >> >> >> > > >>>> side
> >> >> >> > > >>>> >> > > timestamp
> >> >> >> > > >>>> >> > > >> is
> >> >> >> > > >>>> >> > > >> >> > that
> >> >> >> > > >>>> >> > > >> >> > > it
> >> >> >> > > >>>> >> > > >> >> > > > > is always dangerous to mix the
> control
> >> >> plane
> >> >> >> > with
> >> >> >> > > >>>> data
> >> >> >> > > >>>> >> > > plane. IP
> >> >> >> > > >>>> >> > > >> >> did
> >> >> >> > > >>>> >> > > >> >> > > this
> >> >> >> > > >>>> >> > > >> >> > > > > and it has caused so many different
> >> >> breaches
> >> >> >> so
> >> >> >> > > >>>> people
> >> >> >> > > >>>> >> are
> >> >> >> > > >>>> >> > > >> >> migrating
> >> >> >> > > >>>> >> > > >> >> > to
> >> >> >> > > >>>> >> > > >> >> > > > > something like MPLS. An example in
> >> Kafka
> >> >> is
> >> >> >> > that
> >> >> >> > > >>>> any
> >> >> >> > > >>>> >> client
> >> >> >> > > >>>> >> > > can
> >> >> >> > > >>>> >> > > >> >> > > > construct a
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>>
> >> >> >>
> LeaderAndIsrRequest/UpdateMetadataRequest/ContorlledShutdownRequest
> >> >> >> > > >>>> >> > > >> >> > > (you
> >> >> >> > > >>>> >> > > >> >> > > > > name it) and send it to the broker
> to
> >> >> mess up
> >> >> >> > the
> >> >> >> > > >>>> entire
> >> >> >> > > >>>> >> > > >> cluster,
> >> >> >> > > >>>> >> > > >> >> > also
> >> >> >> > > >>>> >> > > >> >> > > as
> >> >> >> > > >>>> >> > > >> >> > > > > we already noticed a busy cluster
> can
> >> >> respond
> >> >> >> > > >>>> quite slow
> >> >> >> > > >>>> >> to
> >> >> >> > > >>>> >> > > >> >> > controller
> >> >> >> > > >>>> >> > > >> >> > > > > messages. So it would really be nice
> >> if we
> >> >> >> can
> >> >> >> > > >>>> avoid
> >> >> >> > > >>>> >> giving
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> >> power
> >> >> >> > > >>>> >> > > >> >> > > to
> >> >> >> > > >>>> >> > > >> >> > > > > clients to control the log
> retention.
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > Thanks,
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > Jiangjie (Becket) Qin
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > On Sun, Sep 6, 2015 at 9:54 PM, Todd
> >> >> Palino <
> >> >> >> > > >>>> >> > > tpal...@gmail.com>
> >> >> >> > > >>>> >> > > >> >> > wrote:
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > So, with regards to why you want
> to
> >> >> search
> >> >> >> by
> >> >> >> > > >>>> >> timestamp,
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> >> > biggest
> >> >> >> > > >>>> >> > > >> >> > > > > > problem I've seen is with
> consumers
> >> who
> >> >> >> want
> >> >> >> > to
> >> >> >> > > >>>> reset
> >> >> >> > > >>>> >> > their
> >> >> >> > > >>>> >> > > >> >> > > timestamps
> >> >> >> > > >>>> >> > > >> >> > > > > to a
> >> >> >> > > >>>> >> > > >> >> > > > > > specific point, whether it is to
> >> replay
> >> >> a
> >> >> >> > > certain
> >> >> >> > > >>>> >> amount
> >> >> >> > > >>>> >> > of
> >> >> >> > > >>>> >> > > >> >> > messages,
> >> >> >> > > >>>> >> > > >> >> > > > or
> >> >> >> > > >>>> >> > > >> >> > > > > to
> >> >> >> > > >>>> >> > > >> >> > > > > > rewind to before some problem
> state
> >> >> >> existed.
> >> >> >> > > This
> >> >> >> > > >>>> >> happens
> >> >> >> > > >>>> >> > > more
> >> >> >> > > >>>> >> > > >> >> > often
> >> >> >> > > >>>> >> > > >> >> > > > than
> >> >> >> > > >>>> >> > > >> >> > > > > > anyone would like.
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > To handle this now we need to
> >> constantly
> >> >> >> > export
> >> >> >> > > >>>> the
> >> >> >> > > >>>> >> > > broker's
> >> >> >> > > >>>> >> > > >> >> offset
> >> >> >> > > >>>> >> > > >> >> > > for
> >> >> >> > > >>>> >> > > >> >> > > > > > every partition to a time-series
> >> >> database
> >> >> >> and
> >> >> >> > > >>>> then use
> >> >> >> > > >>>> >> > > >> external
> >> >> >> > > >>>> >> > > >> >> > > > processes
> >> >> >> > > >>>> >> > > >> >> > > > > > to query this. I know we're not
> the
> >> only
> >> >> >> ones
> >> >> >> > > >>>> doing
> >> >> >> > > >>>> >> this.
> >> >> >> > > >>>> >> > > The
> >> >> >> > > >>>> >> > > >> way
> >> >> >> > > >>>> >> > > >> >> > the
> >> >> >> > > >>>> >> > > >> >> > > > > > broker handles requests for
> offsets
> >> by
> >> >> >> > > timestamp
> >> >> >> > > >>>> is a
> >> >> >> > > >>>> >> > > little
> >> >> >> > > >>>> >> > > >> >> obtuse
> >> >> >> > > >>>> >> > > >> >> > > > > > (explain it to anyone without
> >> intimate
> >> >> >> > > knowledge
> >> >> >> > > >>>> of the
> >> >> >> > > >>>> >> > > >> internal
> >> >> >> > > >>>> >> > > >> >> > > > workings
> >> >> >> > > >>>> >> > > >> >> > > > > > of the broker - every time I do I
> see
> >> >> >> this).
> >> >> >> > In
> >> >> >> > > >>>> >> addition,
> >> >> >> > > >>>> >> > > as
> >> >> >> > > >>>> >> > > >> >> Becket
> >> >> >> > > >>>> >> > > >> >> > > > > pointed
> >> >> >> > > >>>> >> > > >> >> > > > > > out, it causes problems
> specifically
> >> >> with
> >> >> >> > > >>>> retention of
> >> >> >> > > >>>> >> > > >> messages
> >> >> >> > > >>>> >> > > >> >> by
> >> >> >> > > >>>> >> > > >> >> > > time
> >> >> >> > > >>>> >> > > >> >> > > > > > when you move partitions around.
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > I'm deliberately avoiding the
> >> >> discussion of
> >> >> >> > > what
> >> >> >> > > >>>> >> > timestamp
> >> >> >> > > >>>> >> > > to
> >> >> >> > > >>>> >> > > >> >> use.
> >> >> >> > > >>>> >> > > >> >> > I
> >> >> >> > > >>>> >> > > >> >> > > > can
> >> >> >> > > >>>> >> > > >> >> > > > > > see the argument either way,
> though I
> >> >> tend
> >> >> >> to
> >> >> >> > > >>>> lean
> >> >> >> > > >>>> >> > towards
> >> >> >> > > >>>> >> > > the
> >> >> >> > > >>>> >> > > >> >> idea
> >> >> >> > > >>>> >> > > >> >> > > > that
> >> >> >> > > >>>> >> > > >> >> > > > > > the broker timestamp is the only
> >> viable
> >> >> >> > source
> >> >> >> > > >>>> of truth
> >> >> >> > > >>>> >> > in
> >> >> >> > > >>>> >> > > >> this
> >> >> >> > > >>>> >> > > >> >> > > > > situation.
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > -Todd
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > On Sun, Sep 6, 2015 at 7:08 PM,
> Ewen
> >> >> >> > > >>>> Cheslack-Postava <
> >> >> >> > > >>>> >> > > >> >> > > > e...@confluent.io
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > On Sun, Sep 6, 2015 at 4:57 PM,
> Jay
> >> >> >> Kreps <
> >> >> >> > > >>>> >> > > j...@confluent.io
> >> >> >> > > >>>> >> > > >> >
> >> >> >> > > >>>> >> > > >> >> > > wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > 2. Nobody cares what time it
> is
> >> on
> >> >> the
> >> >> >> > > >>>> server.
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > This is a good way of
> summarizing
> >> the
> >> >> >> > issue I
> >> >> >> > > >>>> was
> >> >> >> > > >>>> >> > trying
> >> >> >> > > >>>> >> > > to
> >> >> >> > > >>>> >> > > >> get
> >> >> >> > > >>>> >> > > >> >> > at,
> >> >> >> > > >>>> >> > > >> >> > > > > from
> >> >> >> > > >>>> >> > > >> >> > > > > > an
> >> >> >> > > >>>> >> > > >> >> > > > > > > app's perspective. Of the 3
> stated
> >> >> goals
> >> >> >> of
> >> >> >> > > >>>> the KIP,
> >> >> >> > > >>>> >> #2
> >> >> >> > > >>>> >> > > (lot
> >> >> >> > > >>>> >> > > >> >> > > > retention)
> >> >> >> > > >>>> >> > > >> >> > > > > > is
> >> >> >> > > >>>> >> > > >> >> > > > > > > reasonably handled by a
> server-side
> >> >> >> > > timestamp.
> >> >> >> > > >>>> I
> >> >> >> > > >>>> >> really
> >> >> >> > > >>>> >> > > just
> >> >> >> > > >>>> >> > > >> >> care
> >> >> >> > > >>>> >> > > >> >> > > > that
> >> >> >> > > >>>> >> > > >> >> > > > > a
> >> >> >> > > >>>> >> > > >> >> > > > > > > message is there long enough
> that I
> >> >> have
> >> >> >> a
> >> >> >> > > >>>> chance to
> >> >> >> > > >>>> >> > > process
> >> >> >> > > >>>> >> > > >> >> it.
> >> >> >> > > >>>> >> > > >> >> > #3
> >> >> >> > > >>>> >> > > >> >> > > > > > > (searching by timestamp) only
> seems
> >> >> >> useful
> >> >> >> > if
> >> >> >> > > >>>> we can
> >> >> >> > > >>>> >> > > >> guarantee
> >> >> >> > > >>>> >> > > >> >> > the
> >> >> >> > > >>>> >> > > >> >> > > > > > > server-side timestamp is close
> >> enough
> >> >> to
> >> >> >> > the
> >> >> >> > > >>>> original
> >> >> >> > > >>>> >> > > >> >> client-side
> >> >> >> > > >>>> >> > > >> >> > > > > > > timestamp, and any mirror maker
> >> step
> >> >> >> seems
> >> >> >> > to
> >> >> >> > > >>>> break
> >> >> >> > > >>>> >> > that
> >> >> >> > > >>>> >> > > >> (even
> >> >> >> > > >>>> >> > > >> >> > > > ignoring
> >> >> >> > > >>>> >> > > >> >> > > > > > any
> >> >> >> > > >>>> >> > > >> >> > > > > > > issues with broker
> availability).
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > I'm also wondering whether
> >> optimizing
> >> >> for
> >> >> >> > > >>>> >> > > >> search-by-timestamp
> >> >> >> > > >>>> >> > > >> >> on
> >> >> >> > > >>>> >> > > >> >> > > the
> >> >> >> > > >>>> >> > > >> >> > > > > > broker
> >> >> >> > > >>>> >> > > >> >> > > > > > > is really something we want to
> do
> >> >> given
> >> >> >> > that
> >> >> >> > > >>>> messages
> >> >> >> > > >>>> >> > > aren't
> >> >> >> > > >>>> >> > > >> >> > really
> >> >> >> > > >>>> >> > > >> >> > > > > > > guaranteed to be ordered by
> >> >> >> > application-level
> >> >> >> > > >>>> >> > timestamps
> >> >> >> > > >>>> >> > > on
> >> >> >> > > >>>> >> > > >> the
> >> >> >> > > >>>> >> > > >> >> > > > broker.
> >> >> >> > > >>>> >> > > >> >> > > > > > Is
> >> >> >> > > >>>> >> > > >> >> > > > > > > part of the need for this just
> due
> >> to
> >> >> the
> >> >> >> > > >>>> current
> >> >> >> > > >>>> >> > > consumer
> >> >> >> > > >>>> >> > > >> APIs
> >> >> >> > > >>>> >> > > >> >> > > being
> >> >> >> > > >>>> >> > > >> >> > > > > > > difficult to work with? For
> >> example,
> >> >> >> could
> >> >> >> > > you
> >> >> >> > > >>>> >> > implement
> >> >> >> > > >>>> >> > > >> this
> >> >> >> > > >>>> >> > > >> >> > > pretty
> >> >> >> > > >>>> >> > > >> >> > > > > > easily
> >> >> >> > > >>>> >> > > >> >> > > > > > > client side just the way you
> would
> >> >> >> > > >>>> broker-side? I'd
> >> >> >> > > >>>> >> > > imagine
> >> >> >> > > >>>> >> > > >> a
> >> >> >> > > >>>> >> > > >> >> > > couple
> >> >> >> > > >>>> >> > > >> >> > > > of
> >> >> >> > > >>>> >> > > >> >> > > > > > > random seeks + reads during very
> >> rare
> >> >> >> > > >>>> occasions (i.e.
> >> >> >> > > >>>> >> > > when
> >> >> >> > > >>>> >> > > >> the
> >> >> >> > > >>>> >> > > >> >> > app
> >> >> >> > > >>>> >> > > >> >> > > > > starts
> >> >> >> > > >>>> >> > > >> >> > > > > > > up) wouldn't be a problem
> >> >> >> performance-wise.
> >> >> >> > > Or
> >> >> >> > > >>>> is it
> >> >> >> > > >>>> >> > also
> >> >> >> > > >>>> >> > > >> that
> >> >> >> > > >>>> >> > > >> >> > you
> >> >> >> > > >>>> >> > > >> >> > > > need
> >> >> >> > > >>>> >> > > >> >> > > > > > the
> >> >> >> > > >>>> >> > > >> >> > > > > > > broker to enforce things like
> >> >> >> monotonically
> >> >> >> > > >>>> >> increasing
> >> >> >> > > >>>> >> > > >> >> timestamps
> >> >> >> > > >>>> >> > > >> >> > > > since
> >> >> >> > > >>>> >> > > >> >> > > > > > you
> >> >> >> > > >>>> >> > > >> >> > > > > > > can't do the query properly and
> >> >> >> efficiently
> >> >> >> > > >>>> without
> >> >> >> > > >>>> >> > that
> >> >> >> > > >>>> >> > > >> >> > guarantee,
> >> >> >> > > >>>> >> > > >> >> > > > and
> >> >> >> > > >>>> >> > > >> >> > > > > > > therefore what applications are
> >> >> actually
> >> >> >> > > >>>> looking for
> >> >> >> > > >>>> >> > *is*
> >> >> >> > > >>>> >> > > >> >> > > broker-side
> >> >> >> > > >>>> >> > > >> >> > > > > > > timestamps?
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > -Ewen
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > Consider cases where data is
> >> being
> >> >> >> copied
> >> >> >> > > >>>> from a
> >> >> >> > > >>>> >> > > database
> >> >> >> > > >>>> >> > > >> or
> >> >> >> > > >>>> >> > > >> >> > from
> >> >> >> > > >>>> >> > > >> >> > > > log
> >> >> >> > > >>>> >> > > >> >> > > > > > > > files. In steady-state the
> server
> >> >> time
> >> >> >> is
> >> >> >> > > >>>> very
> >> >> >> > > >>>> >> close
> >> >> >> > > >>>> >> > to
> >> >> >> > > >>>> >> > > >> the
> >> >> >> > > >>>> >> > > >> >> > > client
> >> >> >> > > >>>> >> > > >> >> > > > > time
> >> >> >> > > >>>> >> > > >> >> > > > > > > if
> >> >> >> > > >>>> >> > > >> >> > > > > > > > their clocks are sync'd (see
> 1)
> >> but
> >> >> >> there
> >> >> >> > > >>>> will be
> >> >> >> > > >>>> >> > > times of
> >> >> >> > > >>>> >> > > >> >> > large
> >> >> >> > > >>>> >> > > >> >> > > > > > > divergence
> >> >> >> > > >>>> >> > > >> >> > > > > > > > when the copying process is
> >> stopped
> >> >> or
> >> >> >> > > falls
> >> >> >> > > >>>> >> behind.
> >> >> >> > > >>>> >> > > When
> >> >> >> > > >>>> >> > > >> >> this
> >> >> >> > > >>>> >> > > >> >> > > > occurs
> >> >> >> > > >>>> >> > > >> >> > > > > > it
> >> >> >> > > >>>> >> > > >> >> > > > > > > is
> >> >> >> > > >>>> >> > > >> >> > > > > > > > clear that the time the data
> >> >> arrived on
> >> >> >> > the
> >> >> >> > > >>>> server
> >> >> >> > > >>>> >> is
> >> >> >> > > >>>> >> > > >> >> > irrelevant,
> >> >> >> > > >>>> >> > > >> >> > > > it
> >> >> >> > > >>>> >> > > >> >> > > > > is
> >> >> >> > > >>>> >> > > >> >> > > > > > > the
> >> >> >> > > >>>> >> > > >> >> > > > > > > > source timestamp that matters.
> >> This
> >> >> is
> >> >> >> > the
> >> >> >> > > >>>> problem
> >> >> >> > > >>>> >> > you
> >> >> >> > > >>>> >> > > are
> >> >> >> > > >>>> >> > > >> >> > trying
> >> >> >> > > >>>> >> > > >> >> > > > to
> >> >> >> > > >>>> >> > > >> >> > > > > > fix
> >> >> >> > > >>>> >> > > >> >> > > > > > > by
> >> >> >> > > >>>> >> > > >> >> > > > > > > > retaining the mm timestamp but
> >> >> really
> >> >> >> the
> >> >> >> > > >>>> client
> >> >> >> > > >>>> >> > should
> >> >> >> > > >>>> >> > > >> >> always
> >> >> >> > > >>>> >> > > >> >> > > set
> >> >> >> > > >>>> >> > > >> >> > > > > the
> >> >> >> > > >>>> >> > > >> >> > > > > > > time
> >> >> >> > > >>>> >> > > >> >> > > > > > > > with the use of server-side
> time
> >> as
> >> >> a
> >> >> >> > > >>>> fallback. It
> >> >> >> > > >>>> >> > > would
> >> >> >> > > >>>> >> > > >> be
> >> >> >> > > >>>> >> > > >> >> > worth
> >> >> >> > > >>>> >> > > >> >> > > > > > talking
> >> >> >> > > >>>> >> > > >> >> > > > > > > > to the Samza folks and reading
> >> >> through
> >> >> >> > this
> >> >> >> > > >>>> blog
> >> >> >> > > >>>> >> > post (
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >>
> >> >> >> > > >>>>
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >>
> >>
> http://radar.oreilly.com/2015/08/the-world-beyond-batch-streaming-101.html
> >> >> >> > > >>>> >> > > >> >> > > > > > > > )
> >> >> >> > > >>>> >> > > >> >> > > > > > > > on this subject since we went
> >> >> through
> >> >> >> > > similar
> >> >> >> > > >>>> >> > > learnings on
> >> >> >> > > >>>> >> > > >> >> the
> >> >> >> > > >>>> >> > > >> >> > > > stream
> >> >> >> > > >>>> >> > > >> >> > > > > > > > processing side.
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > I think the implication of
> these
> >> >> two is
> >> >> >> > > that
> >> >> >> > > >>>> we
> >> >> >> > > >>>> >> need
> >> >> >> > > >>>> >> > a
> >> >> >> > > >>>> >> > > >> >> proposal
> >> >> >> > > >>>> >> > > >> >> > > > that
> >> >> >> > > >>>> >> > > >> >> > > > > > > > handles potentially very
> >> >> out-of-order
> >> >> >> > > >>>> timestamps in
> >> >> >> > > >>>> >> > > some
> >> >> >> > > >>>> >> > > >> kind
> >> >> >> > > >>>> >> > > >> >> > of
> >> >> >> > > >>>> >> > > >> >> > > > > sanish
> >> >> >> > > >>>> >> > > >> >> > > > > > > way
> >> >> >> > > >>>> >> > > >> >> > > > > > > > (buggy clients will set
> something
> >> >> >> totally
> >> >> >> > > >>>> wrong as
> >> >> >> > > >>>> >> > the
> >> >> >> > > >>>> >> > > >> time).
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > -Jay
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > On Sun, Sep 6, 2015 at 4:22
> PM,
> >> Jay
> >> >> >> > Kreps <
> >> >> >> > > >>>> >> > > >> j...@confluent.io>
> >> >> >> > > >>>> >> > > >> >> > > > wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > The magic byte is used to
> >> version
> >> >> >> > message
> >> >> >> > > >>>> format
> >> >> >> > > >>>> >> so
> >> >> >> > > >>>> >> > > >> we'll
> >> >> >> > > >>>> >> > > >> >> > need
> >> >> >> > > >>>> >> > > >> >> > > to
> >> >> >> > > >>>> >> > > >> >> > > > > > make
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > sure that check is in
> place--I
> >> >> >> actually
> >> >> >> > > >>>> don't see
> >> >> >> > > >>>> >> > it
> >> >> >> > > >>>> >> > > in
> >> >> >> > > >>>> >> > > >> the
> >> >> >> > > >>>> >> > > >> >> > > > current
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > consumer code which I think
> is
> >> a
> >> >> bug
> >> >> >> we
> >> >> >> > > >>>> should
> >> >> >> > > >>>> >> fix
> >> >> >> > > >>>> >> > > for
> >> >> >> > > >>>> >> > > >> the
> >> >> >> > > >>>> >> > > >> >> > next
> >> >> >> > > >>>> >> > > >> >> > > > > > release
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > (filed KAFKA-2523). The
> >> purpose of
> >> >> >> that
> >> >> >> > > >>>> field is
> >> >> >> > > >>>> >> so
> >> >> >> > > >>>> >> > > >> there
> >> >> >> > > >>>> >> > > >> >> is
> >> >> >> > > >>>> >> > > >> >> > a
> >> >> >> > > >>>> >> > > >> >> > > > > clear
> >> >> >> > > >>>> >> > > >> >> > > > > > > > check
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > on the format rather than
> the
> >> >> >> scrambled
> >> >> >> > > >>>> scenarios
> >> >> >> > > >>>> >> > > Becket
> >> >> >> > > >>>> >> > > >> >> > > > describes.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > Also, Becket, I don't think
> >> just
> >> >> >> fixing
> >> >> >> > > >>>> the java
> >> >> >> > > >>>> >> > > client
> >> >> >> > > >>>> >> > > >> is
> >> >> >> > > >>>> >> > > >> >> > > > > sufficient
> >> >> >> > > >>>> >> > > >> >> > > > > > > as
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > that would break other
> >> >> clients--i.e.
> >> >> >> if
> >> >> >> > > >>>> anyone
> >> >> >> > > >>>> >> > > writes a
> >> >> >> > > >>>> >> > > >> v1
> >> >> >> > > >>>> >> > > >> >> > > > > messages,
> >> >> >> > > >>>> >> > > >> >> > > > > > > even
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > by accident, any
> non-v1-capable
> >> >> >> > consumer
> >> >> >> > > >>>> will
> >> >> >> > > >>>> >> > break.
> >> >> >> > > >>>> >> > > I
> >> >> >> > > >>>> >> > > >> >> think
> >> >> >> > > >>>> >> > > >> >> > we
> >> >> >> > > >>>> >> > > >> >> > > > > > > probably
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > need a way to have the
> server
> >> >> ensure
> >> >> >> a
> >> >> >> > > >>>> particular
> >> >> >> > > >>>> >> > > >> message
> >> >> >> > > >>>> >> > > >> >> > > format
> >> >> >> > > >>>> >> > > >> >> > > > > > either
> >> >> >> > > >>>> >> > > >> >> > > > > > > > at
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > read or write time.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > -Jay
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > On Thu, Sep 3, 2015 at 3:47
> PM,
> >> >> >> > Jiangjie
> >> >> >> > > >>>> Qin
> >> >> >> > > >>>> >> > > >> >> > > > > > <j...@linkedin.com.invalid
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > > wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> Hi Guozhang,
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> I checked the code again.
> >> >> Actually
> >> >> >> CRC
> >> >> >> > > >>>> check
> >> >> >> > > >>>> >> > > probably
> >> >> >> > > >>>> >> > > >> >> won't
> >> >> >> > > >>>> >> > > >> >> > > > fail.
> >> >> >> > > >>>> >> > > >> >> > > > > > The
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> newly
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> added timestamp field
> might be
> >> >> >> treated
> >> >> >> > > as
> >> >> >> > > >>>> >> > keyLength
> >> >> >> > > >>>> >> > > >> >> instead,
> >> >> >> > > >>>> >> > > >> >> > > so
> >> >> >> > > >>>> >> > > >> >> > > > we
> >> >> >> > > >>>> >> > > >> >> > > > > > are
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> likely to receive an
> >> >> >> > > >>>> IllegalArgumentException
> >> >> >> > > >>>> >> when
> >> >> >> > > >>>> >> > > try
> >> >> >> > > >>>> >> > > >> to
> >> >> >> > > >>>> >> > > >> >> > read
> >> >> >> > > >>>> >> > > >> >> > > > the
> >> >> >> > > >>>> >> > > >> >> > > > > > > key.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> I'll update the KIP.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> Thanks,
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> Jiangjie (Becket) Qin
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> On Thu, Sep 3, 2015 at
> 12:48
> >> PM,
> >> >> >> > > Jiangjie
> >> >> >> > > >>>> Qin <
> >> >> >> > > >>>> >> > > >> >> > > > j...@linkedin.com>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > Hi, Guozhang,
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > Thanks for reading the
> KIP.
> >> By
> >> >> >> "old
> >> >> >> > > >>>> >> consumer", I
> >> >> >> > > >>>> >> > > >> meant
> >> >> >> > > >>>> >> > > >> >> the
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> ZookeeperConsumerConnector
> >> in
> >> >> >> trunk
> >> >> >> > > >>>> now, i.e.
> >> >> >> > > >>>> >> > > without
> >> >> >> > > >>>> >> > > >> >> this
> >> >> >> > > >>>> >> > > >> >> > > bug
> >> >> >> > > >>>> >> > > >> >> > > > > > > fixed.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> If we
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > fix the
> >> >> ZookeeperConsumerConnector
> >> >> >> > > then
> >> >> >> > > >>>> it
> >> >> >> > > >>>> >> will
> >> >> >> > > >>>> >> > > throw
> >> >> >> > > >>>> >> > > >> >> > > > exception
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> complaining
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > about the unsupported
> >> version
> >> >> when
> >> >> >> > it
> >> >> >> > > >>>> sees
> >> >> >> > > >>>> >> > message
> >> >> >> > > >>>> >> > > >> >> format
> >> >> >> > > >>>> >> > > >> >> > > V1.
> >> >> >> > > >>>> >> > > >> >> > > > > > What I
> >> >> >> > > >>>> >> > > >> >> > > > > > > > was
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > trying to say is that if
> we
> >> >> have
> >> >> >> > some
> >> >> >> > > >>>> >> > > >> >> > > > ZookeeperConsumerConnector
> >> >> >> > > >>>> >> > > >> >> > > > > > > > running
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > without the fix, the
> >> consumer
> >> >> will
> >> >> >> > > >>>> complain
> >> >> >> > > >>>> >> > about
> >> >> >> > > >>>> >> > > CRC
> >> >> >> > > >>>> >> > > >> >> > > mismatch
> >> >> >> > > >>>> >> > > >> >> > > > > > > instead
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> of
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > unsupported version.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > Thanks,
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > Jiangjie (Becket) Qin
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> > On Thu, Sep 3, 2015 at
> 12:15
> >> >> PM,
> >> >> >> > > >>>> Guozhang
> >> >> >> > > >>>> >> Wang <
> >> >> >> > > >>>> >> > > >> >> > > > > > wangg...@gmail.com>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> Thanks for the write-up
> >> >> Jiangjie.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> One comment about
> migration
> >> >> plan:
> >> >> >> > > "For
> >> >> >> > > >>>> old
> >> >> >> > > >>>> >> > > >> consumers,
> >> >> >> > > >>>> >> > > >> >> if
> >> >> >> > > >>>> >> > > >> >> > > they
> >> >> >> > > >>>> >> > > >> >> > > > > see
> >> >> >> > > >>>> >> > > >> >> > > > > > > the
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> new
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> protocol the CRC check
> will
> >> >> >> fail"..
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> Do you mean this bug in
> the
> >> >> old
> >> >> >> > > >>>> consumer
> >> >> >> > > >>>> >> cannot
> >> >> >> > > >>>> >> > > be
> >> >> >> > > >>>> >> > > >> >> fixed
> >> >> >> > > >>>> >> > > >> >> > > in a
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> backward-compatible way?
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> Guozhang
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> On Thu, Sep 3, 2015 at
> 8:35
> >> >> AM,
> >> >> >> > > >>>> Jiangjie Qin
> >> >> >> > > >>>> >> > > >> >> > > > > > > > <j...@linkedin.com.invalid
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> wrote:
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > Hi,
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > We just created
> KIP-31 to
> >> >> >> > propose a
> >> >> >> > > >>>> message
> >> >> >> > > >>>> >> > > format
> >> >> >> > > >>>> >> > > >> >> > change
> >> >> >> > > >>>> >> > > >> >> > > > in
> >> >> >> > > >>>> >> > > >> >> > > > > > > Kafka.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >>
> >> >> >> > > >>>>
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Message+format+change+proposal
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > As a summary, the
> >> >> motivations
> >> >> >> > are:
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > 1. Avoid server side
> >> message
> >> >> >> > > >>>> re-compression
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > 2. Honor time-based
> log
> >> roll
> >> >> >> and
> >> >> >> > > >>>> retention
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > 3. Enable offset
> search
> >> by
> >> >> >> > > timestamp
> >> >> >> > > >>>> at a
> >> >> >> > > >>>> >> > finer
> >> >> >> > > >>>> >> > > >> >> > > > granularity.
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > Feedback and comments
> are
> >> >> >> > welcome!
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > Thanks,
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> > Jiangjie (Becket) Qin
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> --
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >> -- Guozhang
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >> >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >>
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > > > --
> >> >> >> > > >>>> >> > > >> >> > > > > > > Thanks,
> >> >> >> > > >>>> >> > > >> >> > > > > > > Ewen
> >> >> >> > > >>>> >> > > >> >> > > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > > >
> >> >> >> > > >>>> >> > > >> >> > > > >
> >> >> >> > > >>>> >> > > >> >> > > >
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> > > --
> >> >> >> > > >>>> >> > > >> >> > > Thanks,
> >> >> >> > > >>>> >> > > >> >> > > Neha
> >> >> >> > > >>>> >> > > >> >> > >
> >> >> >> > > >>>> >> > > >> >> >
> >> >> >> > > >>>> >> > > >> >>
> >> >> >> > > >>>> >> > > >>
> >> >> >> > > >>>> >> > >
> >> >> >> > > >>>> >> >
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >>
> >> >> >> > > >>>> >> --
> >> >> >> > > >>>> >> Thanks,
> >> >> >> > > >>>> >> Ewen
> >> >> >> > > >>>> >>
> >> >> >> > > >>>>
> >> >> >> > > >>>>
> >> >> >> > > >>>
> >> >> >> > > >>
> >> >> >> > > >
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >>
> >>
>
>

Reply via email to