Not really.. there are dozens (over 50) producer instances using
librdkafka, it is not feasible to update them all.
I set log.message.timestamp.type to LogAppendTime in server.properties and
it seems to be working (at least --reset-offsets returns the same offset
for the same datetime for partitions where the leader is a broker with the
new setting applied), but I won't be able to truly validate before I
upgrade the entire cluster to 0.11.0.1 and apply this setting to all
brokers.

Is there a reason LogAppendTime is not the default setting for this?

On 26 October 2017 at 11:42, Ted Yu <yuzhih...@gmail.com> wrote:

> Can you update librdkafka and try again ?
> Thanks
> -------- Original message --------From: Dan Markhasin <minimi...@gmail.com>
> Date: 10/25/17  11:26 PM  (GMT-08:00) To: users@kafka.apache.org Subject:
> Re: Failure to reset consumer offsets for specific topics
> For beats (the topic where timestamps are OK) the producers are Metricbeat
> / Logstash which I assume use a relatively recent producer version.
> For the topics with missing timestamps the producers are based on
> librdkafka though I'm not sure which version exactly - I wouldn't be
> surprised if it's pretty old and doesn't set timestamp on the messages it
> produces.
>
> On 26 October 2017 at 08:17, Elyahou Ittah <elyaho...@fiverr.com> wrote:
>
> > Which driver is used yo produce these messages ?
> >
> > On Thu, Oct 26, 2017 at 8:14 AM, Dan Markhasin <minimi...@gmail.com>
> > wrote:
> >
> > > Furthermore, when looking at messages produced to the data1_log topic
> > with
> > > print.timestamp=true they all have CreateTime:-1 whereas messages
> > produced
> > > to the beats topic have valid timestamps. The producers that are
> sending
> > to
> > > data1_log are older than the producers that are sending to beats - if
> > this
> > > is where the broker takes the timestamps from, it explains why they are
> > all
> > > empty for data1_log.
> > >
> > > On 26 October 2017 at 08:07, Dan Markhasin <minimi...@gmail.com>
> wrote:
> > >
> > > > After a bit more checking it seems that Kafka isn't writing
> timestamps
> > at
> > > > all in the .timeindex file for the topics where offset rewind is not
> > > > working.
> > > >
> > > > The following output is from * a different 0.11.0.0 * which also has
> a
> > > > topic called data1_log (this cluster has not experienced any issues
> > > lately):
> > > >
> > > > bash-4.2$ ls -ltr | tail
> > > > -rw-rw-r-- 1 ibiuser users 1073739867 Oct 24 20:42
> > > 00000000000723010092.log
> > > > -rw-rw-r-- 1 ibiuser users     926216 Oct 24 20:42
> > > > 00000000000723010092.index
> > > > -rw-rw-r-- 1 ibiuser users         16 Oct 25 05:09
> > > leader-epoch-checkpoint
> > > > -rw-rw-r-- 1 ibiuser users 1073741239 Oct 25 09:52
> > > 00000000000725186047.log
> > > > -rw-rw-r-- 1 ibiuser users   10485756 Oct 25 09:52
> > > > 00000000000727395414.timeindex
> > > > -rw-rw-r-- 1 ibiuser users         10 Oct 25 09:52
> > > > 00000000000727395414.snapshot
> > > > -rw-rw-r-- 1 ibiuser users          0 Oct 25 09:52
> > > > 00000000000725186047.timeindex
> > > > -rw-rw-r-- 1 ibiuser users     961144 Oct 25 09:52
> > > > 00000000000725186047.index
> > > > -rw-rw-r-- 1 ibiuser users   10485760 Oct 25 21:59
> > > > 00000000000727395414.index
> > > > -rw-rw-r-- 1 ibiuser users 1028809456 Oct 25 21:59
> > > 00000000000727395414.log
> > > >
> > > >
> > > > bash-4.2$ /kafka/latest/bin/kafka-run-class.sh
> > > > kafka.tools.DumpLogSegments --files 00000000000727395414.timeindex >
> > > > /tmp/timestamps 2>&1
> > > >
> > > > The output looks like this:
> > > >
> > > > timestamp: 0 offset: 727395414
> > > > timestamp: 0 offset: 727395414
> > > > timestamp: 0 offset: 727395414
> > > > timestamp: 0 offset: 727395414
> > > > timestamp: 0 offset: 727395414
> > > > timestamp: 0 offset: 727395414
> > > > timestamp: 0 offset: 727395414
> > > >
> > > >
> > > > The offset is the same on all of the records and not a single line
> has
> > a
> > > > timestamp greater than zero.
> > > >
> > > > bash-4.2$ cat /tmp/timestamps | grep timestamp | awk '{print $2}' |
> > grep
> > > > -v 0
> > > > bash-4.2$
> > > >
> > > > For comparison, the .timeindex for the beats topic on the same
> cluster
> > > > looks like this:
> > > >
> > > > Dumping 00000000000254410745.timeindex
> > > > timestamp: 1508978010544 offset: 254410759
> > > > timestamp: 1508978011084 offset: 254410763
> > > > timestamp: 1508978012080 offset: 254410770
> > > > timestamp: 1508978012789 offset: 254410796
> > > > timestamp: 1508978013981 offset: 254410808
> > > > timestamp: 1508978014526 offset: 254410812
> > > > timestamp: 1508978014698 offset: 254410828
> > > > timestamp: 1508978014959 offset: 254410834
> > > > timestamp: 1508978015677 offset: 254410854
> > > > timestamp: 1508978016980 offset: 254410870
> > > > timestamp: 1508978017212 offset: 254410883
> > > >
> > > >
> > > > On 26 October 2017 at 07:26, Elyahou Ittah <elyaho...@fiverr.com>
> > wrote:
> > > >
> > > >> Which driver is used yo produce these messages ?
> > > >>
> > > >> On Oct 26, 2017 07:11, "Dan Markhasin" <minimi...@gmail.com> wrote:
> > > >>
> > > >> > No, that flag doesn't affect which offsets are returned, only
> > executes
> > > >> the
> > > >> > action (and resets the consumer to latest offset when used,
> > regardless
> > > >> of
> > > >> > datetime value I provide).
> > > >> >
> > > >> > On 25 October 2017 at 23:44, Hans Jespersen <h...@confluent.io>
> > > wrote:
> > > >> >
> > > >> > > I think you are just missing the —execute flag.
> > > >> > >
> > > >> > > -hans
> > > >> > >
> > > >> > > > On Oct 25, 2017, at 1:24 PM, Ted Yu <yuzhih...@gmail.com>
> > wrote:
> > > >> > > >
> > > >> > > > I wonder if you have hit KAFKA-5600.
> > > >> > > >
> > > >> > > > Is it possible that you try out 0.11.0.1 ?
> > > >> > > >
> > > >> > > > Thanks
> > > >> > > >
> > > >> > > >> On Wed, Oct 25, 2017 at 1:15 PM, Dan Markhasin <
> > > >> minimi...@gmail.com>
> > > >> > > wrote:
> > > >> > > >>
> > > >> > > >> I am using 0.11.0.0.
> > > >> > > >>
> > > >> > > >> There is no difference configuration-wise - both have 10
> > > partitions
> > > >> > and
> > > >> > > 2
> > > >> > > >> replicas. There are no errors in the logs, but looking in the
> > > data
> > > >> > > folder
> > > >> > > >> it seems like Kafka is not updating the timeindex file for
> > > >> data1_log -
> > > >> > > >> notice how the timeindex file for the current log segment is
> > not
> > > >> being
> > > >> > > >> updated.
> > > >> > > >>
> > > >> > > >> bash-4.2$ pwd
> > > >> > > >> /kafka/data/data1_log-1
> > > >> > > >> bash-4.2$ ls -ltr | tail
> > > >> > > >> -rw-rw-r-- 1 ibiuser it 1073731573 Oct 25 01:21
> > > >> > 00000000000337554984.log
> > > >> > > >> -rw-rw-r-- 1 ibiuser it     943616 Oct 25 01:21
> > > >> > > 00000000000337554984.index
> > > >> > > >> -rw-rw-r-- 1 ibiuser it 1073734199 Oct 25 13:38
> > > >> > 00000000000339816017.log
> > > >> > > >> -rw-rw-r-- 1 ibiuser it   10485756 Oct 25 13:38
> > > >> > > >> 00000000000341934289.timeindex
> > > >> > > >> -rw-rw-r-- 1 ibiuser it         10 Oct 25 13:38
> > > >> > > >> 00000000000341934289.snapshot
> > > >> > > >> -rw-rw-r-- 1 ibiuser it          0 Oct 25 13:38
> > > >> > > >> 00000000000339816017.timeindex
> > > >> > > >> -rw-rw-r-- 1 ibiuser it     566712 Oct 25 13:38
> > > >> > > 00000000000339816017.index
> > > >> > > >> -rw-rw-r-- 1 ibiuser it         17 Oct 25 20:23
> > > >> > leader-epoch-checkpoint
> > > >> > > >> -rw-rw-r-- 1 ibiuser it   10485760 Oct 25 23:03
> > > >> > > 00000000000341934289.index
> > > >> > > >> -rw-rw-r-- 1 ibiuser it  461590419 Oct 25 23:04
> > > >> > 00000000000341934289.log
> > > >> > > >>
> > > >> > > >> For comparison, the beats topic:
> > > >> > > >>
> > > >> > > >> bash-4.2$ cd ../beats-1
> > > >> > > >> bash-4.2$ ls -ltr
> > > >> > > >> total 3212088
> > > >> > > >> -rw-rw-r-- 1 ibiuser it         17 Oct 25 00:23
> > > >> > leader-epoch-checkpoint
> > > >> > > >> -rw-rw-r-- 1 ibiuser it         10 Oct 25 20:04
> > > >> > > >> 00000000000188672034.snapshot
> > > >> > > >> -rw-rw-r-- 1 ibiuser it    2773008 Oct 25 20:04
> > > >> > > >> 00000000000185224087.timeindex
> > > >> > > >> -rw-rw-r-- 1 ibiuser it 1073741779 Oct 25 20:04
> > > >> > 00000000000185224087.log
> > > >> > > >> -rw-rw-r-- 1 ibiuser it    1967440 Oct 25 20:04
> > > >> > > 00000000000185224087.index
> > > >> > > >> -rw-rw-r-- 1 ibiuser it   10485760 Oct 25 23:03
> > > >> > > 00000000000188672034.index
> > > >> > > >> -rw-rw-r-- 1 ibiuser it   10485756 Oct 25 23:04
> > > >> > > >> 00000000000188672034.timeindex
> > > >> > > >> -rw-rw-r-- 1 ibiuser it   50166645 <5016%206645> Oct 25 23:04
> > > >> > 00000000000188672034.log
> > > >> > > >>
> > > >> > > >>
> > > >> > > >> To give some context to why I'm even trying to reset the
> > offsets,
> > > >> we
> > > >> > had
> > > >> > > >> encountered a strange situation earlier today:
> > > >> > > >>
> > > >> > > >> 1) One of the brokers had a hardware failure, and had to be
> > > rebuilt
> > > >> > from
> > > >> > > >> scratch (data partition was gone)
> > > >> > > >> 2) When it went down, we noticed a spike in lag in one
> > particular
> > > >> > > consumer
> > > >> > > >> group - it seems to have reset its offset to an earlier point
> > in
> > > >> time
> > > >> > > (but
> > > >> > > >> not the earliest offset of the topic); I have read other
> > messages
> > > >> on
> > > >> > > this
> > > >> > > >> mailing list of users who experienced the same behavior with
> > > >> 0.11.0.0
> > > >> > > >> 3) The broker was reinstalled and rejoined the cluster with
> the
> > > >> same
> > > >> > > >> broker.id (but with no data on it) - it rebalanced and
> > > eventually
> > > >> all
> > > >> > > >> replicas became synced and the cluster was functioning
> > normally.
> > > >> > > >> 4) I then decided to bounce the same broker again to see if I
> > can
> > > >> > > reproduce
> > > >> > > >> the issue I saw in #2 - and as soon as the broker was
> > restarted,
> > > >> the
> > > >> > > exact
> > > >> > > >> same consumer group had its offset reset again and was
> lagging
> > > with
> > > >> > > >> millions of records behind the current offset.
> > > >> > > >> 5) I then tried to manually reset the consumer group's offset
> > to
> > > a
> > > >> few
> > > >> > > >> minutes before I restarted the broker, only to discover this
> > > >> strange
> > > >> > > >> behavior where no matter which datetime value I provided, it
> > kept
> > > >> > > resetting
> > > >> > > >> to the latest offset.
> > > >> > > >>
> > > >> > > >>
> > > >> > > >>> On 25 October 2017 at 22:48, Ted Yu <yuzhih...@gmail.com>
> > > wrote:
> > > >> > > >>>
> > > >> > > >>> Do you mind providing a bit more information ?
> > > >> > > >>>
> > > >> > > >>> Release of Kafka you use
> > > >> > > >>>
> > > >> > > >>> Any difference between data1_log and the other, normal
> topic ?
> > > >> > > >>>
> > > >> > > >>> Probably check the broker log where data1_log is hosted -
> see
> > if
> > > >> > there
> > > >> > > is
> > > >> > > >>> some clue.
> > > >> > > >>>
> > > >> > > >>> Thanks
> > > >> > > >>>
> > > >> > > >>> On Wed, Oct 25, 2017 at 12:11 PM, Dan Markhasin <
> > > >> minimi...@gmail.com
> > > >> > >
> > > >> > > >>> wrote:
> > > >> > > >>>
> > > >> > > >>>> I'm trying to use the kafka-consumer-groups.sh tool in
> order
> > to
> > > >> > rewind
> > > >> > > >> a
> > > >> > > >>>> consumer group's offset, however it seems to be returning
> the
> > > >> latest
> > > >> > > >>> offset
> > > >> > > >>>> regarding of the requested offset.
> > > >> > > >>>>
> > > >> > > >>>> You can see in the below example that two consecutive
> > commands
> > > to
> > > >> > > reset
> > > >> > > >>> the
> > > >> > > >>>> offset to a specific point in time return different
> > > (increasing)
> > > >> > > >> offsets,
> > > >> > > >>>> which are actually the latest offsets for the topic.
> > > >> > > >>>>
> > > >> > > >>>> - The consumer group ("test_consumer") is a console
> consumer
> > > that
> > > >> > was
> > > >> > > >>>> started with --from-beginning and terminated after a few
> > > seconds,
> > > >> > just
> > > >> > > >>>> enough for it to commit its offsets.
> > > >> > > >>>> - The topic data1_log is very busy with thousands of
> incoming
> > > >> > messages
> > > >> > > >>> per
> > > >> > > >>>> second
> > > >> > > >>>> - The datetime value provided is approx. 5 hours earlier
> than
> > > the
> > > >> > > >> current
> > > >> > > >>>> UTC time
> > > >> > > >>>>
> > > >> > > >>>> [admin@broker01] ~> /kafka/latest/bin/kafka-
> > consumer-groups.sh
> > > >> > > >>>> --bootstrap-server localhost:9092 --reset-offsets --group
> > > >> > > test_consumer
> > > >> > > >>>> --topic data1_log --to-datetime '2017-10-25T13:40:00.000'
> > > >> > > >>>> Note: This will only show information about consumers that
> > use
> > > >> the
> > > >> > > Java
> > > >> > > >>>> consumer API (non-ZooKeeper-based consumers).
> > > >> > > >>>>
> > > >> > > >>>>
> > > >> > > >>>> TOPIC                          PARTITION  NEW-OFFSET
> > > >> > > >>>> data1_log                      8          301485420
> > > >> > > >>>> data1_log                      1          342788637
> > > >> > > >>>> data1_log                      7          287621428
> > > >> > > >>>> data1_log                      3          268612266
> > > >> > > >>>> data1_log                      0          201860717
> > > >> > > >>>> data1_log                      9          202749553
> > > >> > > >>>> data1_log                      4          188974032
> > > >> > > >>>> data1_log                      6          234308481
> > > >> > > >>>> data1_log                      2          263507741
> > > >> > > >>>> data1_log                      5          232707238
> > > >> > > >>>>
> > > >> > > >>>> [admin@broker01] ~> /kafka/latest/bin/kafka-
> > consumer-groups.sh
> > > >> > > >>>> --bootstrap-server localhost:9092 --reset-offsets --group
> > > >> > > test_consumer
> > > >> > > >>>> --topic data1_log --to-datetime '2017-10-25T13:40:00.000'
> > > >> > > >>>> Note: This will only show information about consumers that
> > use
> > > >> the
> > > >> > > Java
> > > >> > > >>>> consumer API (non-ZooKeeper-based consumers).
> > > >> > > >>>>
> > > >> > > >>>>
> > > >> > > >>>> TOPIC                          PARTITION  NEW-OFFSET
> > > >> > > >>>> data1_log                      8          301485491
> > > >> > > >>>> data1_log                      1          342788779
> > > >> > > >>>> data1_log                      7          287621534
> > > >> > > >>>> data1_log                      3          268612364
> > > >> > > >>>> data1_log                      0          201860796
> > > >> > > >>>> data1_log                      9          202749620
> > > >> > > >>>> data1_log                      4          188974068
> > > >> > > >>>> data1_log                      6          234308564
> > > >> > > >>>> data1_log                      2          263507823
> > > >> > > >>>> data1_log                      5          232707293
> > > >> > > >>>>
> > > >> > > >>>> This issue seems to be topic-specific - there is a
> different
> > > >> topic
> > > >> > > >> (also
> > > >> > > >>>> very active) where the same command consistently returns
> the
> > > >> correct
> > > >> > > >>>> offsets fixed in the time for the requested datetime.
> > > >> > > >>>>
> > > >> > > >>>> What could be the issue here?
> > > >> > > >>>>
> > > >> > > >>>> Thanks,
> > > >> > > >>>> Dan
> > > >> > > >>>>
> > > >> > > >>>
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

Reply via email to