Hi, I've created a ticket for the situation we have now:
https://issues.apache.org/jira/browse/KAFKA-6003. I will file a ticket for
the original exception that took down the replication fetcher thread after
some initial investigation - it might be the same issue after all.

I would still appreciate any hints on how to get those topics into a fully
replicated state without losing all data.
Would turning off idempotence on the producers and waiting until all old data
is cleaned up help?

Best regards,
Stas.

2017-10-02 20:08 GMT+02:00 Apurva Mehta <apu...@confluent.io>:

> Hi Stas,
>
> Thanks for reporting this. It would be helpful to have a JIRA with more of
> the server logs from the leaders and followers in the time leading up to this
> OutOfOrderSequenceException.
>
> The answers to the following questions would help when you file the JIRA:
>
> What are the retention settings for this topic? Is it configured for
> compaction? Compaction and deletion? What is the retention.ms setting?
> What is the retention.bytes setting? What messages are being written to the
> topic? In particular, do they have a create time explicitly set by the
> application?
>
> Thanks,
> Apurva
>
> On Mon, Oct 2, 2017 at 4:40 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Hi Stas,
> >
> > Thank you for reporting this. Can you please file an issue? Even if
> > KAFKA-5793 has fixed it for 1.0.0 (which needs to be verified), we should
> > consider whether a fix is needed for the 0.11.0 branch as well.
> >
> > Ismael
> >
> > On Mon, Oct 2, 2017 at 11:28 AM, Stas Chizhov <schiz...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > We run 0.11.0.1 and there was a problem with one replication fetcher on
> > > one of the brokers - it experienced an out-of-order sequence problem for
> > > one topic/partition and was stopped. It stayed stopped over the weekend.
> > > During this time log cleanup was working, and by now it has cleaned up
> > > all the data in the partitions that this fetcher was responsible for -
> > > including other partitions that did not have the out-of-order sequence
> > > problem in the first place. It is not completely clear to me why this
> > > initial problem occurred, but at this moment there is a broker with no
> > > data for a few partitions, and the replication fetcher fails upon restart
> > > with
> > > "org.apache.kafka.common.errors.OutOfOrderSequenceException: Invalid
> > > sequence number for new epoch: 0 (request epoch), 154277489 (seq.
> > > number)". I believe this is
> > > https://issues.apache.org/jira/browse/KAFKA-5793.
> > > However, I wonder what the easiest way of bringing these replicas back
> > > online is?
> > >
> > > Best regards,
> > > Stanislav.
> > >
> >
>
