Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Matthias J. Sax

Great!

On 9/5/23 1:23 AM, Pushkar Deole wrote:

I think I figured out a way. Certain commands can be executed from
kafka-cli to disassociate a consumer group from topics that are no
longer being consumed.
With a command like the following, I could delete the consumer offsets of
a consumer group for a specific topic, and that resolved the lag problem:

kafka-consumer-groups --bootstrap-server $KAFKA_BOOTSTRAP_SERVERS \
  --command-config ~/kafka.properties --delete-offsets \
  --group "" --topic ""

Matthias J. Sax wrote:


As long as the consumer group is active, nothing will be deleted. That
is the reason why you get those incorrect alerts -- Kafka cannot know
that you stopped consuming from those topics. (That is what I tried to
explain -- seems I did a bad job...)

Changing the group.id is tricky because Kafka Streams uses it to
identify internal topic names (for repartition and changelog topics), and
thus your app would start with newly created (and thus empty) topics. --
You might want to restart the app with `auto.offset.reset = "earliest"`
and reprocess all available input to re-create state.
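
To illustrate why a new id means empty internal topics: Kafka Streams
documents the internal topic naming pattern as
`<application.id>-<name>-<suffix>`, where the suffix is `changelog` or
`repartition`. A minimal shell sketch, in which the helper
`internal_topic`, the store name `my-store`, and the id `applicationV2`
are hypothetical and chosen only for illustration:

```shell
# Hypothetical helper: builds an internal topic name from its parts.
# $1 = application.id, $2 = store or operator name,
# $3 = suffix (changelog or repartition)
internal_topic() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

internal_topic applicationV1 my-store changelog   # prints applicationV1-my-store-changelog
internal_topic applicationV2 my-store changelog   # prints applicationV2-my-store-changelog
```

The two names differ only in the prefix, so an app restarted under the
new id would not find the changelog written under the old one.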


-Matthias

On 8/19/23 8:07 AM, Pushkar Deole wrote:

@matthias

What are the alternatives to get rid of this issue? When the lag starts
increasing, the alerts configured on our monitoring system in Datadog
start sending alerts and alarms to the reliability teams. I know that in
Kafka an inactive consumer group is cleaned up after 7 days; however, I am
not sure whether that also applies to topics that were consumed
previously and are not consumed now.

Is creating a new consumer group (by setting a different application.id)
on the streams application an option here?


On Thu, Aug 17, 2023 at 7:03 AM Matthias J. Sax wrote:



Well, it's kinda expected behavior. It's a split brain problem.

In the end, you use the same `application.id / group.id` and thus the
committed offsets for the removed topics are still in the
`__consumer_offsets` topic and associated with the consumer group.

If a tool inspects lag and compares the latest committed offsets to the
end-offsets, it looks at everything it finds in the `__consumer_offsets`
topic for the group in question -- the tool cannot know that you
changed the application and that it does not read from those topics any
longer (and thus does not commit any longer).
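
A minimal sketch of the comparison such a tool makes, assuming lag is
computed per partition as end-offset minus last committed offset. The
helper `partition_lag` and all numbers are hypothetical:

```shell
# Lag for one partition = log end offset - last committed offset.
# $1 = last committed offset, $2 = current log end offset
partition_lag() {
  committed=$1
  end_offset=$2
  echo $(( end_offset - committed ))
}

# The group stopped committing at offset 1000 when the topic was removed
# from the topology, while producers keep appending to the topic.
partition_lag 1000 1000   # prints 0
partition_lag 1000 1500   # prints 500
partition_lag 1000 9000   # prints 8000
```

The committed offset never moves again, so the reported lag grows with
every record written to the removed topic.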

I am not sure off the top of my head if you could do a manual cleanup for
the `application.id` and topics in question and delete the committed
offsets from the `__consumer_offsets` topic -- try checking out the `Admin`
client and/or the command line tools...

I know that it's possible to delete committed offsets for a consumer
group (if a group becomes inactive, the broker would also clean up all
group metadata after a configurable timeout), but I am not sure if
that's for the entire consumer group (i.e., all topics) or if you can do it
on a per-topic basis, too.
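
The command quoted earlier in this thread deletes offsets per topic via
`--delete-offsets` together with `--topic`. A hedged dry-run sketch of
finding candidate topics first: it assumes the usual column layout of
`kafka-consumer-groups --describe` output (GROUP, TOPIC, PARTITION,
CURRENT-OFFSET, LOG-END-OFFSET, LAG, CONSUMER-ID, HOST, CLIENT-ID) and
that partitions without an assigned member show `-` as CONSUMER-ID. The
helper names and the sample rows are hypothetical, and the script only
prints commands; it does not run them:

```shell
# Reads `kafka-consumer-groups --describe` output on stdin and prints each
# topic that has lag on a partition with no assigned consumer.
stale_topics() {
  awk '$1 != "GROUP" && NF >= 7 && $7 == "-" && $6 ~ /^[0-9]+$/ && $6 > 0 { print $2 }' | sort -u
}

# Dry run: prints one per-topic delete command per topic name read from
# stdin instead of executing it. $1 = consumer group name.
print_delete_commands() {
  while read -r topic; do
    printf 'kafka-consumer-groups --bootstrap-server $KAFKA_BOOTSTRAP_SERVERS --command-config ~/kafka.properties --delete-offsets --group "%s" --topic "%s"\n' "$1" "$topic"
  done
}

# Hypothetical sample of describe output: topic-a is still consumed,
# topic-e and topic-f were removed from the topology.
sample='GROUP         TOPIC   PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST      CLIENT-ID
applicationV1 topic-a 0         500            500            0   consumer-1  /10.0.0.1 client-1
applicationV1 topic-e 0         1000           1500           500 -           -         -
applicationV1 topic-e 1         1000           1800           800 -           -         -
applicationV1 topic-f 0         200            950            750 -           -         -'

printf '%s\n' "$sample" | stale_topics | print_delete_commands applicationV1
```

A partition can also show no assigned consumer while the app is stopped
or rebalancing, so the printed list should be reviewed against the
current topology before any command is run.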


HTH,
 -Matthias


On 8/16/23 2:11 AM, Pushkar Deole wrote:

Hi streams Dev community  @matthias, @bruno

Any inputs on the above issue? Is this a bug in the streams library,
wherein the underlying consumer group still reports lag against input
topics that were removed from the streams processor topology?

On Wed, Aug 9, 2023 at 4:38 PM Pushkar Deole wrote:



Hi All,

I have a streams application with 3 instances, with the application-id set
to applicationV1. The application uses the Processor API to read from
source topics, process the data, and write to a destination topic.
Currently it consumes from 6 source topics; however, we no longer need to
process data from 2 of those topics, so we removed them from the source
topics list. We have configured a Datadog dashboard to report and alert on
consumer lag, so after removing the 2 source topics and deploying the
application, we started getting several alerts about consumer lag on the
applicationV1 consumer group, which is the underlying consumer group of the
streams application. When we looked at the consumer group from kafka-cli,
we could see that the consumer group is reporting lag against the topics
removed from the source topic list, which shows up as increasing lag in
Datadog monitoring.

Can someone advise if this is expected behavior? In my opinion, this is
not expected: since the streams application no longer has those topics as
part of its source, it should not report lag on those.

RE: Need more clarity in documentation for upgrade/downgrade procedures and limitations across releases

2023-09-05 Thread Kaushik Srinivas (Nokia)
Hi Team,

Any help on this query?

From: Kaushik Srinivas (Nokia)
Sent: Tuesday, August 22, 2023 10:26 AM
To: users@kafka.apache.org
Subject: Need more clarity in documentation for upgrade/downgrade procedures 
and limitations across releases


Hi Team,

I am referring to the upgrade documentation for Apache Kafka:

https://kafka.apache.org/34/documentation.html#upgrade_3_4_0

The following statements from the section linked above are confusing.

"If you are upgrading from a version prior to 2.1.x, please see the note below 
about the change to the schema used to store consumer offsets. Once you have 
changed the inter.broker.protocol.version to the latest version, it will not be 
possible to downgrade to a version prior to 2.1."

The above statement says that downgrading to a version prior to "2.1" is
not possible after "upgrading the inter.broker.protocol.version to the
latest version".

But point 4 of the documentation makes another statement:

"Restart the brokers one by one for the new protocol version to take effect. 
Once the brokers begin using the latest protocol version, it will no longer be 
possible to downgrade the cluster to an older version."



These two statements are repeated across many prior releases of Kafka and
are confusing.

Below are the questions:

  1.  Once the inter.broker.protocol.version is updated to the latest
version, is downgrade impossible to "any" older version of Kafka, or only
to versions "<2.1"?
  2.  Suppose one takes an approach similar to the upgrade for the downgrade
path, i.e. first downgrade the inter.broker.protocol.version to the previous
version, then downgrade the Kafka software/code to the previous release.
Does downgrade work with this approach?

Can the answers to these two questions be documented if they are already
known?

Regards,
Kaushik.


Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Pushkar Deole
I think I figured out a way. Certain commands can be executed from
kafka-cli to disassociate a consumer group from topics that are no
longer being consumed.
With a command like the following, I could delete the consumer offsets of
a consumer group for a specific topic, and that resolved the lag problem:

kafka-consumer-groups --bootstrap-server $KAFKA_BOOTSTRAP_SERVERS \
  --command-config ~/kafka.properties --delete-offsets \
  --group "" --topic ""

Matthias J. Sax wrote:

> As long as the consumer group is active, nothing will be deleted. That
> is the reason why you get those incorrect alerts -- Kafka cannot know
> that you stopped consuming from those topics. (That is what I tried to
> explain -- seems I did a bad job...)
>
> Changing the group.id is tricky because Kafka Streams uses it to
> identify internal topic names (for repartition and changelog topics), and
> thus your app would start with newly created (and thus empty) topics. --
> You might want to restart the app with `auto.offset.reset = "earliest"`
> and reprocess all available input to re-create state.
>
>
> -Matthias
>
> On 8/19/23 8:07 AM, Pushkar Deole wrote:
> > @matthias
> >
> > What are the alternatives to get rid of this issue? When the lag starts
> > increasing, the alerts configured on our monitoring system in Datadog
> > start sending alerts and alarms to the reliability teams. I know that in
> > Kafka an inactive consumer group is cleaned up after 7 days; however, I am
> > not sure whether that also applies to topics that were consumed
> > previously and are not consumed now.
> >
> > Is creating a new consumer group (by setting a different application.id)
> > on the streams application an option here?
> >
> >
> > On Thu, Aug 17, 2023 at 7:03 AM Matthias J. Sax wrote:
> >
> >> Well, it's kinda expected behavior. It's a split brain problem.
> >>
> >> In the end, you use the same `application.id / group.id` and thus the
> >> committed offsets for the removed topics are still in the
> >> `__consumer_offsets` topic and associated with the consumer group.
> >>
> >> If a tool inspects lag and compares the latest committed offsets to the
> >> end-offsets, it looks at everything it finds in the `__consumer_offsets`
> >> topic for the group in question -- the tool cannot know that you
> >> changed the application and that it does not read from those topics any
> >> longer (and thus does not commit any longer).
> >>
> >> I am not sure off the top of my head if you could do a manual cleanup for
> >> the `application.id` and topics in question and delete the committed
> >> offsets from the `__consumer_offsets` topic -- try checking out the `Admin`
> >> client and/or the command line tools...
> >>
> >> I know that it's possible to delete committed offsets for a consumer
> >> group (if a group becomes inactive, the broker would also clean up all
> >> group metadata after a configurable timeout), but I am not sure if
> >> that's for the entire consumer group (i.e., all topics) or if you can do it
> >> on a per-topic basis, too.
> >>
> >>
> >> HTH,
> >> -Matthias
> >>
> >>
> >> On 8/16/23 2:11 AM, Pushkar Deole wrote:
> >>> Hi streams Dev community  @matthias, @bruno
> >>>
> >>> Any inputs on the above issue? Is this a bug in the streams library,
> >>> wherein the underlying consumer group still reports lag against input
> >>> topics that were removed from the streams processor topology?
> >>>
> >>> On Wed, Aug 9, 2023 at 4:38 PM Pushkar Deole wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> I have a streams application with 3 instances, with the application-id set
> >>>> to applicationV1. The application uses the Processor API to read from
> >>>> source topics, process the data, and write to a destination topic.
> >>>> Currently it consumes from 6 source topics; however, we no longer need to
> >>>> process data from 2 of those topics, so we removed them from the source
> >>>> topics list. We have configured a Datadog dashboard to report and alert on
> >>>> consumer lag, so after removing the 2 source topics and deploying the
> >>>> application, we started getting several alerts about consumer lag on the
> >>>> applicationV1 consumer group, which is the underlying consumer group of the
> >>>> streams application. When we looked at the consumer group from kafka-cli,
> >>>> we could see that the consumer group is reporting lag against the topics
> >>>> removed from the source topic list, which shows up as increasing lag in
> >>>> Datadog monitoring.
> >>>>
> >>>> Can someone advise if this is expected behavior? In my opinion, this is
> >>>> not expected: since the streams application no longer has those topics as
> >>>> part of its source, it should not report lag on those.
> >>>>
> >>>
> >>
> >
>