Guozhang,
 Will do. If it gets stuck in this loop again I'll inspect the broker log
dirs.  I'm running the 0.11 release right now.

On Mon, Aug 14, 2017 at 4:22 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Garrett,
>
> What confuses me is that you mentioned it starts spamming the logs, meaning
> that it falls into this endless loop of:
>
> 1) getting an out-of-range exception
> 2) resetting the offset by querying the broker for the earliest offset
> 3) getting offset 0 back from the broker
> 4) sending a fetch request starting at offset 0, and getting the
> out-of-range exception again.
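The loop above can be sketched as a small simulation (a hypothetical illustration, not Kafka's actual Fetcher code; the function name and shape are assumptions):

```python
def simulate_reset_loop(log_start_offset, log_end_offset, max_iterations=5):
    """Simulate a consumer whose reset target still falls outside the
    broker's valid offset range, e.g. when a partition has no data at all
    (log_start_offset == log_end_offset)."""
    fetch_offset = 0
    attempts = []
    for _ in range(max_iterations):
        if log_start_offset <= fetch_offset < log_end_offset:
            # Fetch succeeds; the loop ends normally.
            attempts.append(("fetched", fetch_offset))
            break
        # 1) out-of-range exception; 2)-3) query broker for earliest offset.
        attempts.append(("out_of_range", fetch_offset))
        # 4) retry from the reset offset; if the partition is empty this
        # lands right back on an invalid offset and the loop never exits.
        fetch_offset = log_start_offset
    return attempts
```

With an empty partition (start == end == 0) every reset lands back on offset 0 and the loop spins forever; with any data present the first reset succeeds.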
>
> So I'd like to ask you to do me a favor if you see this again in your test
> environment: go into the broker log directory and check whether there are
> still any log segment files for partition foo-0, and if so, whether those
> segment files contain any data (using the DumpLogSegments tool) and what
> offset ranges they cover.
> From the code path the broker should maintain at least one empty segment
> even if all data gets truncated, at least on trunk, but I'm not sure if you
> are running an older version that may have a bug in the broker log handling.
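For reference, a segment can be inspected with the DumpLogSegments tool shipped with Kafka; the log directory and segment file name below are assumptions for a default local setup and should be adjusted to your `log.dirs`:

```shell
# Print each record's offset in the segment so you can see what offset
# range (if any) the foo-0 partition actually covers.
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
  --deep-iteration \
  --files /tmp/kafka-logs/foo-0/00000000000000000000.log
```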
>
> Guozhang
>
>
>
>
> On Mon, Aug 14, 2017 at 6:28 AM, Garrett Barton <garrett.bar...@gmail.com>
> wrote:
>
> > Guozhang,
> >  Thanks for the reply!  Based on what you said I am going to increase the
> > log.retention.hours a bunch and see what happens. Things typically break
> > long before 48 hours, but you're right, the data could have expired by
> > then too.  I'll pay attention to that as well.
> >
> >  As far as messing with the offsets goes, I do nothing external to reset
> > offsets; Streams is managing things itself. This really seems to happen
> > only when Streams does not have data to process; in the real system,
> > which is fed data all the time, I don't have this issue.
> >
> >
> > On Sun, Aug 13, 2017 at 8:46 PM, Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> > > Hi Garrett,
> > >
> > > Since your error message says "offset X" is out of range, it means that
> > > the offset was reset because there was no more data on topic partition
> > > "foo-0". I suspect that is because all the log segments got truncated
> > > and the topic partition contains an empty segment list. It is less
> > > likely caused by KAFKA-5510, and hence offsets.retention.minutes may
> > > not help here.
> > >
> > > Since you mentioned that setting log.retention.hours=48 does not help,
> > > and that the input sample data may be a day or two old before the new
> > > build goes out, I suspect there may be some messages with timestamps
> > > older than 48 hours published to the log, causing it to roll new
> > > segments that get deleted immediately: note that the Kafka brokers use
> > > the current system time to determine the difference from the message
> > > timestamps. If that is the case, it is not a Streams issue, not even a
> > > general Consumer issue, but a Kafka broker-side log retention operation.
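The broker-side behavior described above is configurable. A hedged sketch of the relevant broker properties (the LogAppendTime mitigation is an assumption on my part, not something proposed in this thread; verify the property names against your Kafka version):

```properties
# Retention is compared against message timestamps, so records published
# with timestamps older than the retention window can land in segments
# that roll and get deleted almost immediately.
log.retention.hours=48

# Possible mitigation: have the broker stamp records with its own append
# time instead of trusting the producer-supplied timestamp.
log.message.timestamp.type=LogAppendTime
```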
> > >
> > > What I'm not clear on is that in your error message "X" is actually 0:
> > > it is quite weird for a consumer to auto-reset its position to 0. Did
> > > you run some tool periodically that resets the offset to 0?
> > >
> > >
> > > Guozhang
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Aug 9, 2017 at 7:16 AM, Garrett Barton <garrett.bar...@gmail.com>
> > > wrote:
> > >
> > > > I have a small test setup with a local zk/kafka server and a streams
> > > > app that loads sample data.  The test setup is usually up for a day
> > > > or two before a new build goes out, and then it's blown away and
> > > > reloaded from scratch.
> > > >
> > > > Lately I've seen that after a few hours the stream app will stop
> > > > processing and start spamming the logs with:
> > > >
> > > > org.apache.kafka.clients.consumer.internals.Fetcher: Fetch Offset 0 is out of range for partition foo-0, resetting offset
> > > > org.apache.kafka.clients.consumer.internals.Fetcher: Fetch Offset 0 is out of range for partition foo-0, resetting offset
> > > > org.apache.kafka.clients.consumer.internals.Fetcher: Fetch Offset 0 is out of range for partition foo-0, resetting offset
> > > >
> > > > Pretty much sinks a core into spamming the logs.
> > > >
> > > > Restarting the application puts it right back into that broken state.
> > > >
> > > > I thought it was because of this:
> > > > https://issues.apache.org/jira/browse/KAFKA-5510
> > > > So I set my log.retention.hours=48 and offsets.retention.minutes=10081,
> > > > which is huge compared to the total data retention time.  Yet the
> > > > same error occurred.
> > > >
> > > > Any ideas?
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
