Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Matthias J. Sax

Great!


Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Pushkar Deole
I think I figured out a way. Certain commands can be executed from
kafka-cli to disassociate a consumer group from topics that are no
longer being consumed.
With this sort of command, I could delete the consumer offsets of a
consumer group for a specific topic, and that resolved the lag problem:

kafka-consumer-groups --bootstrap-server $KAFKA_BOOTSTRAP_SERVERS \
  --command-config ~/kafka.properties --delete-offsets \
  --group "<group>" --topic "<topic>"

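When more than one topic was removed, the command above has to be repeated per topic. A minimal sketch of how those invocations could be assembled, using only the flags shown in the command; the helper name `delete_offsets_commands`, the fallback broker address, and the topic names are hypothetical, and nothing is executed here:

```python
import os
import shlex


def delete_offsets_commands(bootstrap_servers, command_config, group, topics):
    """Build one kafka-consumer-groups argv list per topic (dry run only)."""
    return [
        [
            "kafka-consumer-groups",
            "--bootstrap-server", bootstrap_servers,
            "--command-config", command_config,
            "--delete-offsets",
            "--group", group,
            "--topic", topic,
        ]
        for topic in topics
    ]


commands = delete_offsets_commands(
    # Fallback address is an assumption for local testing.
    bootstrap_servers=os.environ.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
    command_config=os.path.expanduser("~/kafka.properties"),
    group="applicationV1",
    topics=["removed-topic-a", "removed-topic-b"],  # hypothetical topic names
)

# Print the commands so they can be reviewed before anyone runs them.
for argv in commands:
    print(shlex.join(argv))
```

Printing rather than running keeps the sketch safe to try; the reviewed lines could then be pasted into a shell.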


Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-04 Thread Matthias J. Sax
As long as the consumer group is active, nothing will be deleted. That 
is the reason why you get those incorrect alerts -- Kafka cannot know 
that you stopped consuming from those topics. (That is what I tried to 
explain -- seems I did a bad job...)


Changing the group.id is tricky because Kafka Streams uses it to
derive internal topic names (for repartition and changelog topics), and
thus your app would start with newly created (and thus empty) topics. --
You might want to restart the app with `auto.offset.reset = "earliest"` 
and reprocess all available input to re-create state.
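
To illustrate why a new id means empty internal topics: Kafka Streams prefixes its internal topic names with the `application.id`. A minimal sketch of that naming scheme; the helper name, the store and repartition names, and `applicationV2` are hypothetical:

```python
def internal_topic_names(application_id, store_names, repartition_names):
    """Return the internal topic names Kafka Streams would use for this id."""
    changelogs = [f"{application_id}-{name}-changelog" for name in store_names]
    repartitions = [f"{application_id}-{name}-repartition" for name in repartition_names]
    return changelogs + repartitions


# Same topology, two different application ids.
old = internal_topic_names("applicationV1", ["counts-store"], ["by-key"])
new = internal_topic_names("applicationV2", ["counts-store"], ["by-key"])

print(old)
print(new)
# No topic name is shared, so the renamed app would not see the old state.
print(set(old).isdisjoint(new))
```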



-Matthias


Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-19 Thread Pushkar Deole
@matthias

What are the alternatives to get rid of this issue? We have alerts
configured in our Datadog monitoring system, and when the lag starts
increasing it sends alerts and alarms to the reliability teams. I know that
in Kafka an inactive consumer group is cleaned up after 7 days; however, I am
not sure whether that also applies to topics that were consumed previously
and are not consumed now.

Is creating a new consumer group (by setting a different application.id) on
the streams application an option here?




Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Matthias J. Sax

Well, it's kinda expected behavior. It's a split brain problem.

In the end, you use the same `application.id / group.id` and thus the
committed offsets for the removed topics are still in the
`__consumer_offsets` topic and associated with the consumer group.


If a tool inspects lag and compares the latest committed offsets to
end-offsets, it looks at everything it finds in the `__consumer_offsets`
topic for the group in question -- the tool cannot know that you
changed the application and that it does not read from those topics any
longer (and thus does not commit any longer).
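
A minimal sketch of that comparison, assuming a tool that simply subtracts each committed offset from the partition's end offset. All topic names and offsets here are made up for illustration:

```python
# Hypothetical snapshot: (topic, partition) -> last offset committed by the group.
committed = {
    ("orders", 0): 120,
    ("payments", 0): 75,
    ("legacy-events", 0): 40,  # removed from the topology, so this never advances
}

# Hypothetical snapshot: (topic, partition) -> current log end offset.
end_offsets = {
    ("orders", 0): 120,
    ("payments", 0): 80,
    ("legacy-events", 0): 500,  # producers still write to the removed topic
}

# Topics the application reads after the change (assumed).
current_sources = {"orders", "payments"}


def compute_lag(committed, end_offsets):
    """Lag per partition for every offset the group has ever committed."""
    return {tp: end_offsets[tp] - offset for tp, offset in committed.items()}


def stale_topics(committed, current_sources):
    """Topics that still have committed offsets but are no longer read."""
    return sorted({topic for topic, _ in committed} - set(current_sources))


lag = compute_lag(committed, end_offsets)
stale = stale_topics(committed, current_sources)
print(lag)
print(stale)
```

The removed topic keeps a frozen committed offset while its end offset grows, so its computed lag only increases. Comparing the committed topics against the current source list is one way to spot such leftovers.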


I am not sure off the top of my head if you could do a manual cleanup for
the `application.id` and topics in question and delete the committed
offsets from the `__consumer_offsets` topic -- try to check out the `Admin`
client and/or the command line tools...


I know that it's possible to delete committed offsets for a consumer
group (if a group becomes inactive, the broker would also clean up all
group metadata after a configurable timeout), but I am not sure if
that's for the entire consumer group (i.e., all topics) or if you can do it
on a per-topic basis, too.



HTH,
  -Matthias



Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Pushkar Deole
Hi streams Dev community  @matthias, @bruno

Any input on the above issue? Is this a bug in the streams library, wherein
the underlying consumer group still reports lag against input topics that
were removed from the streams processor topology?

On Wed, Aug 9, 2023 at 4:38 PM Pushkar Deole  wrote:

> Hi All,
>
> I have a streams application with 3 instances with application-id set to
> applicationV1. The application uses the Processor API: it reads from source
> topics, processes the data, and writes to a destination topic.
> Currently it consumes from 6 source topics; however, we no longer need to
> process data from 2 of those topics, so we removed them from
> the source topics list. We have configured a Datadog dashboard to report and
> alert on consumer lag, so after removing the 2 source topics and deploying
> the application, we started getting several alerts about consumer lag on
> the applicationV1 consumer group, which is the underlying consumer group of
> the streams application. When we looked at the consumer group from kafka-cli,
> we could see that the consumer group is reporting lag against the topics
> removed from the source topic list, which shows up as increasing lag in
> Datadog monitoring.
>
> Can someone advise if this is expected behavior? In my opinion, this is
> not expected: since the streams application no longer has those topics as
> sources, it should not report lag on them.
>