Re: Kafka streams stores key in multiple state store instances

2024-05-16 Thread Matthias J. Sax
Hello Kay, What you describe is "by design" -- unfortunately. The problem is, that when we build the `Topology` we don't know the partition count of the input topics, and thus, StreamsBuilder cannot insert a repartition topic for this case (we always assume that the partition count is the

Re: Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-13 Thread Sophie Blee-Goldman
Ah. Well this isn't anything new then since it's been the case since 2.6, but the default task assignor in Kafka Streams will sometimes assign partitions unevenly for a time if it's trying to move around stateful tasks and there's no copy of that task's state on the local disk attached to the

Re: Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-09 Thread Penumarthi Durga Prasad Chowdary
Kafka upgraded from 3.5.1 to 3.7.0 version On Fri, May 10, 2024 at 2:13 AM Sophie Blee-Goldman wrote: > What version did you upgrade from? > > On Wed, May 8, 2024 at 10:32 PM Penumarthi Durga Prasad Chowdary < > prasad.penumar...@gmail.com> wrote: > > > Hi Team, > > I'm utilizing Kafka

Re: Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-09 Thread Sophie Blee-Goldman
What version did you upgrade from? On Wed, May 8, 2024 at 10:32 PM Penumarthi Durga Prasad Chowdary < prasad.penumar...@gmail.com> wrote: > Hi Team, > I'm utilizing Kafka Streams to handle data from Kafka topics, running > multiple instances with the same application ID. This enables

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-14 Thread Venkatesh Nagarajan
issue (which has most likely been resolved by correcting the METADATA_MAX_AGE_CONFIG setting) from surfacing. Thank you. Kind regards, Venkatesh From: Matthias J. Sax Date: Friday, 5 April 2024 at 3:59 AM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-04 Thread Matthias J. Sax
in metrics which can give insights. Thank you very much. Kind regards, Venkatesh From: Bruno Cadonna Date: Friday, 22 March 2024 at 9:53 PM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Hi Venkatesh, The 1 core 1 stream thread reco

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-03 Thread Venkatesh Nagarajan
@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Hi Venkatesh, The 1 core 1 stream thread recommendation is just s starting point. You need to set the number of stream thread as it fits you by monitoring the app. Maybe this blog post might be interesting

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-22 Thread Bruno Cadonna
. Kind regards, Venkatesh From: Bruno Cadonna Date: Friday, 15 March 2024 at 8:47 PM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Hi Venkatesh, As you discovered, in Kafka Streams 3.5.1 there is no stop-the-world rebalancing. Static group

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-18 Thread Venkatesh Nagarajan
om: Bruno Cadonna Date: Friday, 15 March 2024 at 8:47 PM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Hi Venkatesh, As you discovered, in Kafka Streams 3.5.1 there is no stop-the-world rebalancing. Static group member is helpful when Kafk

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-15 Thread Bruno Cadonna
users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Apologies for the delay in responding to you, Bruno. Thank you very much for your important inputs. Just searched for log messages in the MSK broker logs pertaining to rebalancing and up

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-14 Thread Venkatesh Nagarajan
. Kind regards, Venkatesh From: Venkatesh Nagarajan Date: Friday, 15 March 2024 at 8:30 AM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Apologies for the delay in responding to you, Bruno. Thank you very much for your important inputs

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-14 Thread Venkatesh Nagarajan
? Thank you very much. Kind regards, Venkatesh From: Bruno Cadonna Date: Wednesday, 13 March 2024 at 8:29 PM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Hi Venkatesh, Extending on what Matthias replied, a metadata refresh might

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-13 Thread Bruno Cadonna
u have any suggestions on how to reduce such rebalancing, that will be very helpful. Thank you very much. Kind regards, Venkatesh From: Matthias J. Sax Date: Tuesday, 12 March 2024 at 1:31 pm To: users@kafka.apache.org Subject: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stall

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-12 Thread Venkatesh Nagarajan
any suggestions on how to reduce such rebalancing, that will be very helpful. Thank you very much. Kind regards, Venkatesh From: Matthias J. Sax Date: Tuesday, 12 March 2024 at 1:31 pm To: users@kafka.apache.org Subject: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Without

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-12 Thread Venkatesh Nagarajan
, that will be very helpful. Thank you very much. Kind regards, Venkatesh From: Matthias J. Sax Date: Tuesday, 12 March 2024 at 1:31 pm To: users@kafka.apache.org Subject: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Without detailed logs (maybe even DEBUG) hard to say

Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-11 Thread Matthias J. Sax
Without detailed logs (maybe even DEBUG) hard to say. But from what you describe, it could be a metadata issue? Why are you setting METADATA_MAX_AGE_CONFIG (consumer and producer): 5 hours in millis (to make rebalances rare) Refreshing metadata has nothing to do with rebalances, and a

Re: Kafka Streams: understanding re-key operations for joins

2024-02-23 Thread Karsten Stöckmann
Case closed, behaviour is actually as expected. - The source topic contains multiplied data that gets propagated into the join just as it should. I'm leveraging a stream processor for deduplication now. Best wishes Karsten Vikram Singh schrieb am Fr., 23. Feb. 2024, 12:13: > +Ajit Kharpude >

Re: Kafka Streams: understanding re-key operations for joins

2024-02-23 Thread Vikram Singh
+Ajit Kharpude On Fri, Feb 23, 2024 at 1:14 PM Karsten Stöckmann < karsten.stoeckm...@gmail.com> wrote: > Hi, > > I am observing somewhat unexpected (from my point of view) behaviour > while ke-key / re-partitioning operations in order to prepare a > KTable-KTable join. > > Assume two

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Slathia p
Hi Team, Greetings, Apologies for the delay in reply as I was down with flu. We actually reached out to you for IT/ SAP/ Oracle/ Infor / Microsoft “VOTEC IT SERVICE PARTNERSHIP” “IT SERVICE OUTSOURCING” “ “PARTNER SERVICE SUBCONTRACTING” We have very attractive newly introduce

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Slathia p
Hi Team, Greetings, Apologies for the delay in reply as I was down with flu. We actually reached out to you for IT/ SAP/ Oracle/ Infor / Microsoft “VOTEC IT SERVICE PARTNERSHIP” “IT SERVICE OUTSOURCING” “ “PARTNER SERVICE SUBCONTRACTING” We have very attractive newly introduce

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Karsten Stöckmann
Hi Dharin, thanks so much for getting back and for your suggestions. At the moment I'm not quite sure if aggregation in our database is a viable option. Creating aggregate views seemed like an obvious solution at first, yet Debezium does not support subscribing to publications based on views.

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Dharin Shah
Hi Karsten, Before delving deeper into Kafka Streams, it's worth considering if direct aggregation in the database might be a more straightforward solution, unless there's a compelling reason to avoid it. Aggregating data at the database level often leads to more efficient and maintainable

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-17 Thread Matthias J. Sax
You cannot add a `Processor`. You can only use `aggregate() / reduce() / count()` (which of course will add a pre-defined processor). `groupByKey()` is really just a "meta operation" that checks if the key was changes upstream, and to insert a repartition/shuffle step if necessary. Thus, if

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-13 Thread Igor Maznitsa
Thanks a lot for explanation but could you provide a bit more details about KGroupedStream? It is just interface and not extends KStream so how I can add processor in the case below? /   KStream someStream = / /  someStream / /     .groupByKey() / */how to add processor for resulted grouped

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-12 Thread Matthias J. Sax
`KGroupedStream` is just an "intermediate representation" to get a better flow in the DSL. It's not a "top level" abstraction like KStream/KTable. For `KTable` there is `transformValue()` -- there is no `transform()` because keying must be preserved -- if you want to change the keying you

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-11-02 Thread Debraj Manna
Matthias It happened again yesterday during another rolling update. The first error log I can see on the client side is below. It was there in PENDING_ERROR state for sometime and then went into ERROR state. Caused by: java.lang.IllegalStateException: KafkaStreams is not running. State is

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-10-02 Thread Matthias J. Sax
I did mean client side... If KS goes into ERROR state, it should log the reason. If the logs are indeed empty, try to register an uncaught-exception-handler via KafkaStreamssetUncaughtExceptionHandler(...) -Matthias On 10/2/23 12:11 PM, Debraj Manna wrote: Are you suggesting to check

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-10-02 Thread Debraj Manna
Are you suggesting to check the Kafka broker logs? I do not see any other errors logs on the client / application side. On Fri, 29 Sep, 2023, 22:01 Matthias J. Sax, wrote: > In general, Kafka Streams should keep running. > > Can you inspect the logs to figure out why it's going into ERROR state

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-09-29 Thread Matthias J. Sax
In general, Kafka Streams should keep running. Can you inspect the logs to figure out why it's going into ERROR state to begin with? Maybe you need to increase/change some timeouts/retries configs. The stack trace you shared, is a symptom, but not the root cause. -Matthias On 9/21/23 12:56

Re: Kafka Streams, read standby time window store

2023-09-07 Thread Bruno Cadonna
Thanks Igor for the insights! If you feel this should be changed, feel free to open a JIRA ticket. Best, Bruno On 9/6/23 9:07 PM, Igor Maznitsa wrote: Hi Bruno Looks like that I have found error in my code. The error was that I split create of  StoreQueryParameters and my code looked like

Re: Kafka Streams, read standby time window store

2023-09-06 Thread Igor Maznitsa
Hi Bruno Looks like that I have found error in my code. The error was that I split create of  StoreQueryParameters and my code looked like snippet below /*var query = StoreQueryParameters.fromNameAndType(TABLE_NAME, queryableStoreType);*/ /*if (useStale) {*/ /*    query.enableStaleStore();

Re: Kafka Streams, read standby time window store

2023-09-06 Thread Bruno Cadonna
Hi Igor, Sorry to hear you have issues with querying standbys! I have two questions to clarify the situation: 1. Did you enable querying stale stores with StoreQueryParameters.fromNameAndType(TABLE_NAME, queryableStoreType).enableStaleStores() as described in the blog post? 2. Since you

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Matthias J. Sax
Great! On 9/5/23 1:23 AM, Pushkar Deole wrote: I think I could figure out a way. There are certain commands that can be executed from kafka-cli to disassociate a consumer group from the topic that are not more being consumed. With this sort of command, I could delete the consumer offsets for a

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Pushkar Deole
I think I could figure out a way. There are certain commands that can be executed from kafka-cli to disassociate a consumer group from the topic that are not more being consumed. With this sort of command, I could delete the consumer offsets for a consumer group for a specific topic and that

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-04 Thread Matthias J. Sax
As long as the consumer group is active, nothing will be deleted. That is the reason why you get those incorrect alerts -- Kafka cannot know that you stopped consuming from those topics. (That is what I tried to explain -- seems I did a bad job...) Changing the group.id is tricky because

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-19 Thread Pushkar Deole
@matthias what are the alternatives to get rid of this issue? When the lag starts increasing, we have alerts configured on our monitoring system in Datadog which starts sending alerts and alarms to reliability teams. I know in kafka the inactive consumer group is cleared up after 7 days however

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Matthias J. Sax
Well, it's kinda expected behavior. It's a split brain problem. In the end, you use the same `application.id / group.id` and thus the committed offsets for the removed topics are still in `__consumer_offsets` topics and associated with the consumer group. If a tool inspects lags and compares

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Pushkar Deole
Hi streams Dev community @matthias, @bruno Any inputs on above issue? Is this a bug in the streams library wherein the input topic removed from streams processor topology, the underlying consumer group still reporting lag against those? On Wed, Aug 9, 2023 at 4:38 PM Pushkar Deole wrote: > Hi

Re: kafka streams re-partitioning on incoming events

2023-07-25 Thread Pushkar Deole
Thanks a lot Bruno! I am just trying the Processor API as you mentioned above, so the processor will write record to another kafka topic with new key. I am just having difficulty to read in another processor from that kafka topic and wondering if I need to create another stream with source as

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Bruno Cadonna
Hi Pushkar, The events after repartitioning are processed by a different task than the task that read the events from the source topic. The task assignor assigns those tasks to stream threads. So events with the same key will be processed by the same task. As far as I understood from your

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Pushkar Deole
Thanks Bruno.. What do you mean exactly with "...and then process them in that order"? By this, I mean to say if the order of events in partition will be processed after repartition. Probably I don't need to go through internal details but does the partitions of topic are again assigned to

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Bruno Cadonna
Hi Pushkar, you can use repartition() for repartition your data. Method through() is actually deprecated in favor of repartition(). Before you repartition you need to specify the new key with selectKey(). What do you mean exactly with "...and then process them in that order"? The order of

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Pushkar Deole
Hello, *Kafka dev community, @matthiasJsax* Can you comment on below question? It is very important for us since we are getting inconsistencies due to current design On Sun, Jul 9, 2023 at 6:15 PM Pushkar Deole wrote: > Hi, > > We have a kafka streams application that consumes from multiple

Re: kafka streams partition assignor strategy for version 2.5.1 - does it use sticky assignment

2023-04-16 Thread Pushkar Deole
Thanks John... however I have few more questions: How does this configuration work along with static group membership protocol? Or does this work only with dynamic group membership and not work well when static membership is configured? Secondly, I gather that streams doesn't immediately trigger

Re: kafka streams partition assignor strategy for version 2.5.1 - does it use sticky assignment

2023-04-15 Thread John Roesler
Hi Pushkar, In 2.5, Kafka Streams used an assignor that tried to strike a compromise between stickiness and workload balance, so you would observe some stickiness, but not all the time. In 2.6, we introduced the "high availability task assignor" (see KIP-441

Re: kafka streams partition assignor strategy for version 2.5.1 - does it use sticky assignment

2023-04-14 Thread Pushkar Deole
Any inputs on below query? On Wed, Apr 12, 2023 at 2:22 PM Pushkar Deole wrote: > Hi All, > > We are using version 2.5.1 of kafka-streams with 3 application instances > deployed as 3 kubernetes pods. > It consumes from multiple topics, each with 6 partitions. > I would like to know if streams

Re: Kafka Streams 2.7.1 to 3.3.1 rolling upgrade

2023-02-27 Thread Matthias J. Sax
Hmmm... that's interesting... It seems that Kafka Streams "version probing" does not play well static group membership... Sounds like a "bug" to me -- well, more like a missing integration. Not sure right now, if/how we could fix it. Can you file a ticket? For now, I don't think you can

Re: Kafka Streams possible partitioner bug

2022-11-20 Thread Upesh Desai
: Kafka Streams possible partitioner bug Hey Upesh, are you trying to plug in the custom partitioner via the `partitioner.class` ProducerConfig? That won't work in Streams for the exact reason you highlighted, which is why Streams has its own version of the interface called StreamPartitioner

Re: Kafka Streams possible partitioner bug

2022-11-18 Thread Sophie Blee-Goldman
Hey Upesh, are you trying to plug in the custom partitioner via the `partitioner.class` ProducerConfig? That won't work in Streams for the exact reason you highlighted, which is why Streams has its own version of the interface called StreamPartitioner -- this is what you need to implement instead.

RE: Kafka Streams - Producer attempted to produce with an old epoch.

2022-10-27 Thread Andrew Muraco
Subject: Re: Kafka Streams - Producer attempted to produce with an old epoch. CAUTION: This email originated from outside of ShopHQ. Do not click links or open attachments unless you recognize the sender and know the content is safe! I'm not one of the real experts on the Producer and even

Re: Kafka Streams - Producer attempted to produce with an old epoch.

2022-10-27 Thread Sophie Blee-Goldman
I'm not one of the real experts on the Producer and even further from one with broker performance, so someone else may need to chime in for that, but I did have a few questions: What specifically are you unsatisfied with w.r.t the performance? Are you hoping for a higher throughput of your

Re: Kafka Streams Topology State

2022-08-18 Thread Sophie Blee-Goldman
Hey Peter Try clearing the local state -- if you have stateful tasks then by default Streams will use rocksdb to store records locally in directories specific to/named after that task. This is presumably why you're seeing errors related to "the task for peek node missing in old nodes" You can

Re: Kafka streams and user authentication

2022-02-25 Thread Guozhang Wang
KStream plain(StreamsBuilder builder) { > KStream stream = builder.stream( "A" ); > stream.map( ... ).to( "B" ); > return stream; > } > > Thanks > Alessandro > > > -Original Message- > From: Guozhang Wang > S

RE: Kafka streams and user authentication

2022-02-24 Thread Alessandro Ernesto Mascherpa
hanks Alessandro -Original Message- From: Guozhang Wang Sent: mercoledì 23 febbraio 2022 19:20 To: Users Subject: Re: Kafka streams and user authentication Hello Alessandro, Could you elaborate a bit more on what authN methanisms you are using, and by `account` what do you mean expl

Re: Kafka streams uneven task allocation

2022-02-23 Thread Guozhang Wang
Hello Navneeth, Just to verify some behaviors, could you try 1) not using instance.id config, hence no static members, 2) upgrade to the latest version of Kafka, respectively (i.e. do not do them at the same time) and see if either one of them help with the imbalance issue? On Sun, Feb 20, 2022

Re: Kafka streams and user authentication

2022-02-23 Thread Guozhang Wang
Hello Alessandro, Could you elaborate a bit more on what authN methanisms you are using, and by `account` what do you mean explicitly? Guozhang On Wed, Feb 23, 2022 at 5:10 AM Alessandro Ernesto Mascherpa < alessandro.masche...@piksel.com> wrote: > Hi All, > I'm facing a problem with user

Re: Kafka streams uneven task allocation

2022-02-20 Thread Luke Chen
Hi Navneeth, To know the reason why there's more than one partition in the same stream task, we should know why the rebalance triggered. That might have to look into the logs. > I have configured standby to be 1 which means there will be one more copy of the state store and warm up by default is

Re: Kafka streams uneven task allocation

2022-02-18 Thread Navneeth Krishnan
Hi Guozhang, Thanks and sorry for the late reply. I'm overriding the GROUP_INSTANCE_ID_CONFIG & APPLICATION_SERVER_CONFIG. Rest all are defaults. Even then I see more than one partition being allocated to the same stream task. Also I have an additional question regarding the replicas. The

Re: Kafka streams usecase

2022-02-17 Thread Chad Preisler
It can be done with the consumer API. However, you're just going to end up re-implementing what is already there in the streams DSL. It will be far easier to use the Stream DSL join functionality to accomplish this. I've never tried to do it with a simple consumer. On Wed, Feb 16, 2022 at 6:45 PM

Re: Kafka streams usecase

2022-02-16 Thread pradeep s
Thanks Chad! if we want to consume from multiple topics and persist to a database , can i go with a consumer and lookup the record and update .Requirement is to consume from item topic and price topic and create a record in postgress . Both topic have item id in message which is the key in

Re: Kafka streams uneven task allocation

2022-02-03 Thread Guozhang Wang
Hello Navneeth, Could you describe how you ended up with more than one partition assigned to the same thread after certain rebalance(s)? Do you override any default config values such as instance.id (for static consumer members), etc? Also I'd suggest upgrading to a newer version --- we just

Re: Kafka streams usecase

2022-01-13 Thread Chad Preisler
Yes Kafka streams can be used to do this. There are probably several ways to implement this. We did something like this in Java using a groupByKey() and reduce() functions. The three topics we wanted to combine into one topic had different schemas and different java class types. So to combine them

Re: Kafka Streams - Stream threads processing two input topics

2022-01-10 Thread Guozhang Wang
Hi Miguel, I suspect it's due to the timestamps in your topic A, which are earlier than topic B. Note that Kafka Streams tries to synchronize joining topics by processing records with smaller timestamps, and hence if topic A's messages have smaller timestamps, they will be selected over the

Re: Kafka Streams - one topic moves faster the other one

2022-01-04 Thread Matthias J. Sax
If you observer timestamps based synchronization issues, you might also consider to switch to 3.0 release, that closes a few more gaps to this end. Cf https://cwiki.apache.org/confluence/display/KAFKA/KIP-695%3A+Further+Improve+Kafka+Streams+Timestamp+Synchronization -Matthias On 12/29/21

Re: Kafka Streams - one topic moves faster the other one

2021-12-29 Thread Luke Chen
Hi Miguel, Yes, the grace period is the solution to fix the problem. Alternatively, you can try to set a higher value for "max.task.idle.ms" configuration, because this is some kind of out-of-order data. Let's say, A topic has 1 record per second (fast), B topic has 1 record per minute (slow).

Re: Kafka Streams threads sometimes fail transaction and are fenced after broker restart

2021-12-23 Thread Pieter Hameete
! -- Pieter Van: Guozhang Wang Verzonden: dinsdag 21 december 2021 19:50 Aan: Users Onderwerp: Re: Kafka Streams threads sometimes fail transaction and are fenced after broker restart Hello Pieter, Thanks for bringing this to the community's attention. After

Re: Kafka Streams threads sometimes fail transaction and are fenced after broker restart

2021-12-21 Thread Guozhang Wang
Hello Pieter, Thanks for bringing this to the community's attention. After reading your description I suspect you're hitting this issue: https://cwiki.apache.org/confluence/display/KAFKA/KIP-588%3A+Allow+producers+to+recover+gracefully+from+transaction+timeouts Basically today we did not try to

Re: Kafka Streams - start consuming from specific offset

2021-12-17 Thread Luke Chen
Hi Miguel, > Is there a way to configure KafkaStreams or the consumer it uses to start from a specific offset? You can try to use `bin/kafka-streams-application-reset.sh` to reset the offset to a specific date time. REF:

Re: Kafka Streams - left join behavior

2021-12-13 Thread Luke Chen
Hi Miguel, Kafka v3.1.0 has already code freezed, and is dealing with some blocker issues. It should be released soon. For this feature status in v3.0.0, I think Matthias knows it the most. As far as I know, it was ready for v3.0.0 originally, but there's a regression bug found(KAFKA-13216

Re: Kafka Streams - left join behavior

2021-12-13 Thread Miguel González
Hi Matthias Do you know when the 3.1 version is going to be released? I noticed the JoinWindows class has a boolean property called enableSpuriousResultFix If I extend the class the set that flag to true will it eliminate spurious messages in kafka streams 3.0.0 ? thanks - Miguel On Mon,

Re: Kafka Streams app process records until certain date

2021-12-08 Thread Matthias J. Sax
Hard to achieve. I guess a naive approach would be to use a `flatMapTransform()` to implement a filter that drops all record that are not in the desired time range. pause() and resume() are not available in Kafka Streams, but only on the KafkaConsumer (The Spring docs you cite is also about

Re: Kafka Streams - left join behavior

2021-12-06 Thread Matthias J. Sax
It's fixed in upcoming 3.1 release. Cf https://issues.apache.org/jira/browse/KAFKA-10847 A stream-(global)table join has different semantics, so I am not sure if it would help. One workaround would be to apply a stateful` faltTransformValues()` after the join to "buffer" all NULL-results

Re: Kafka Streams - left join behavior

2021-11-30 Thread Luke Chen
Hi Miguel, > Is there a way to force the behavior I need, meaning... using left join and a JoinWindows output only one message (A,B) or (A, null) I think you can try to achieve it by using *KStream-GlobalKTable left join*, where the GlobalKTable should read all records at the right topic, and

Re: Kafka streams event deduplication keeping last event in window

2021-11-03 Thread Matthias J. Sax
You could do something similar to what the WindowStore does and store a key-timestamp pair as actual key. Given current wall-clock time, you can compute the time for closed windows and do corresponding lookups (either per key, or using range scans). -Matthias On 11/3/21 12:40 AM, Luigi

RE: Re: Kafka streams event deduplication keeping last event in window

2021-11-03 Thread Luigi Cerone
Hello Matthias, thanks for your reply. > Using a plain kv-store, whenever the punctuation runs you can find closed windows, forward the result and also delete the row explicitly, which give you more control. What is the best way to find closed windows? Have you got any examples? Thanks! :) On

Re: Kafka streams event deduplication keeping last event in window

2021-11-03 Thread Luigi Cerone
Hello Matthias, thanks for your reply. > Using a plain kv-store, whenever the punctuation runs you can find closed windows, forward the result and also delete the row explicitly, which give you more control. What is the best way to find closed windows? Have you got any examples? Thanks! :) Il

Re: Kafka streams event deduplication keeping last event in window

2021-11-02 Thread Matthias J. Sax
I did not study your code snippet, but yes, it sounds like a valid approach from your description. How can I be sure that the start of the window will coincide with the Punctuator's scheduled interval? For punctuations, there is always some jitter, because it's not possible to run a

Re: Kafka streams - message not materialized in intermediate topics

2021-10-28 Thread Bill Bejeck
Hi Tomer, To dump the topology you can do `System.out.println(topology.describe().toString())`. But if you can post just the code that would be fine as well. I understand about the logs, one thing to do is grep out any sensitive information, but I get it if you can't do that. Thanks, Bill On

Re: Kafka streams - message not materialized in intermediate topics

2021-10-28 Thread Tomer Cohen
Hi Bill, Is there an easy way to dump the topology to share? The logs contain sensitive information, is there something else that can be provided? Thanks, Tomer On Thu, Oct 28, 2021 at 12:23 PM Bill Bejeck wrote: > Hi Tomer, > > Can you share your topology and any log files? > > Thanks, >

Re: Kafka streams - message not materialized in intermediate topics

2021-10-28 Thread Bill Bejeck
Hi Tomer, Can you share your topology and any log files? Thanks, Bill On Thu, Oct 28, 2021 at 12:07 PM Tomer Cohen wrote: > Hi Bill/Matthias, > > Thanks for the replies. > > The issue is I never see a result, I have a log that shows the message > coming in, but the adder/subtractor is never

Re: Kafka streams - message not materialized in intermediate topics

2021-10-28 Thread Tomer Cohen
Hi Bill/Matthias, Thanks for the replies. The issue is I never see a result, I have a log that shows the message coming in, but the adder/subtractor is never invoked for it even though it should. So no result gets published to the intermediate topic I have. Thanks, Tomer On Thu, Oct 28, 2021

Re: Kafka streams - message not materialized in intermediate topics

2021-10-28 Thread Bill Bejeck
Tomer, As Matthias pointed out for a single, final result you need to use the `suppress()` operator. But back to your original question, they are processed by the adder/subtractor and are not > materialized in the intermediate topics which causes them not to be > outputted in the final topic >

Re: Kafka streams - message not materialized in intermediate topics

2021-10-27 Thread Matthias J. Sax
For this case, you can call `aggregate(...).suppress()`. -Matthias On 10/27/21 12:42 PM, Tomer Cohen wrote: Hi Bill, Thanks for the prompt reply. Setting to 0 forces a no collection window, so if I get 10 messages to aggregate for example, it will send 10 updates. But I only want to publish

Re: Kafka streams - message not materialized in intermediate topics

2021-10-27 Thread Tomer Cohen
Hi Bill, Thanks for the prompt reply. Setting to 0 forces a no collection window, so if I get 10 messages to aggregate for example, it will send 10 updates. But I only want to publish the final state only. Thanks, Tomer On Wed, Oct 27, 2021 at 2:10 PM Bill Bejeck wrote: > Hi Tomer, > > From

Re: Kafka streams - message not materialized in intermediate topics

2021-10-27 Thread Bill Bejeck
Hi Tomer, >From the description you've provided, it sounds to me like you have a stateful operation. The thing to keep in mind with stateful operations in Kafka Streams is that every result is not written to the changelog and forwarded downstream. Kafka Streams uses a cache for stateful

Re: Kafka Streams Testing Suite Question

2021-10-13 Thread Matthias J. Sax
I assume you are referring to the `TopologyTestDriver`? How do you pass/specify the serdes? For `TestInputTopic` and `TestOutputTopic` you would pass (de)serializer instances and thus you will need to call `configure()` in your test code explicitly. Similarly, if you pass `Serde` instances

Re: kafka streams commit.interval.ms for at-least-once too high

2021-10-06 Thread Matthias J. Sax
If you build it manually / from-scratch using plain consumer/producer, it is your responsibility to avoid duplicates and/or data loss for a clean shutdown case or a rebalance. That is, why we recommend to use Kafka Streams for a consumer-process-produce pattern, as it does a lot of heavy

Re: kafka streams commit.interval.ms for at-least-once too high

2021-10-06 Thread Pushkar Deole
Matthias, Good to hear on this part that kafka streams handle this internally : "If a rebalance/shutdown is triggered, Kafka Streams will stop processing new records and just finish processing all in-flight records. Afterwards, a commit happens right away for all fully processed records." Since

Re: kafka streams commit.interval.ms for at-least-once too high

2021-10-05 Thread Matthias J. Sax
- By producer config, i hope you mean batching and other settings that will hold off producing of events. Correct me if i'm wrong Correct. - Not sure what you mean by throughput here, which configuration would dictate that? I referred to input topic throughput. If you have higher/lower

Re: kafka streams commit.interval.ms for at-least-once too high

2021-10-05 Thread Pushkar Deole
Matthias, On your response "For at-least-once, you would still get output continuously, depending on throughput and producer configs" - Not sure what you mean by throughput here, which configuration would dictate that? - By producer config, i hope you mean batching and other settings that will

Re: kafka streams commit.interval.ms for at-least-once too high

2021-10-05 Thread Matthias J. Sax
The main motivation for a shorter commit interval for EOS is end-to-end-latency. A Topology could consist of multiple sub-topologies and the end-to-end-latency for the EOS case is roughly commit-interval times number-of-subtopologies. For regular rebalances/restarts, a longer commit interval

Re: Kafka Streams Handling uncaught exceptions REPLACE_THREAD

2021-08-24 Thread Yoda Jedi Master
Thank you for your help, I will check it and try it :-) On Mon, Aug 16, 2021 at 11:45 AM Bruno Cadonna wrote: > Hi Yoda, > > for certain cases, Kafka Streams allows you to specify handlers that > skip the problematic record. Those handlers are: > > 1. deserialization exception handler

Re: Kafka Streams leave group behaviour

2021-08-18 Thread Sophie Blee-Goldman
As Boyang mentioned, Kafka Streams intentionally does not send a LeaveGroup request when shutting down. This is because often the shutdown is not due to a scaling down event but instead some transient closure, such as during a rolling bounce. In cases where the instance is expected to start up

Re: Kafka Streams Handling uncaught exceptions REPLACE_THREAD

2021-08-16 Thread Bruno Cadonna
Hi Yoda, for certain cases, Kafka Streams allows you to specify handlers that skip the problematic record. Those handlers are: 1. deserialization exception handler configured in default.deserialization.exception.handler 2. time extractor set in default.timestamp.extractor and in the Consumed

Re: Kafka Streams leave group behaviour

2021-08-12 Thread Boyang Chen
You are right Uwe, Kafka Streams won't leave group no matter dynamic or static membership. If you want to have fast scale down, consider trying static membership and use the admin command `removeMemberFromGroup` when you need to rescale. Boyang On Thu, Aug 12, 2021 at 4:37 PM Lerh Chuan Low

Re: Kafka Streams leave group behaviour

2021-08-12 Thread Lerh Chuan Low
I think you may have stumbled upon this: https://issues.apache.org/jira/browse/KAFKA-4881. 1 thing that you could try is using static membership - we have yet to try that though so can't comment yet on how that might work out. On Thu, Aug 12, 2021 at 11:29 PM c...@uweeisele.eu wrote: > Hello

Re: Kafka Streams Handling uncaught exceptions REPLACE_THREAD

2021-08-10 Thread Yoda Jedi Master
Hi Bruno, thank you for your answer. I mean that the message that caused the exception was consumed and replaced thread will continue from the next message. How then does it handle uncaught exceptions, if it will fail again? On Tue, Aug 10, 2021 at 12:33 PM Bruno Cadonna wrote: > Hi Yoda, > >

Re: Kafka Streams Handling uncaught exceptions REPLACE_THREAD

2021-08-10 Thread Yoda Jedi Master
Hi Luke, thank you for your answer. I will try it, I think I will set an alert if there are too many messages. To ignore the message should I simply return "replace_thread" in the handler? On Tue, Aug 10, 2021 at 12:16 PM Luke Chen wrote: > Hi Yoda, > For your question: > > If an application

Re: Kafka Streams Handling uncaught exceptions REPLACE_THREAD

2021-08-10 Thread Bruno Cadonna
Hi Yoda, What do you mean exactly with "skipping that failed message"? Do you mean a record consumed from a topic that caused an exception that killed the stream thread? If the record killed the stream thread due to an exception, for example, a deserialization exception, it will probably

Re: Kafka Streams Handling uncaught exceptions REPLACE_THREAD

2021-08-10 Thread Luke Chen
Hi Yoda, For your question: > If an application gets an uncaught exception, then the failed thread will be replaced with another thread and it will continue processing messages, skipping that failed message? --> Yes, if everything goes well after `replace thread`, you can ignore this failed

  1   2   3   4   5   6   7   8   9   10   >