Re: kafka streams partition assignor strategy for version 2.5.1 - does it use sticky assignment

2023-04-15 Thread John Roesler
Hi Pushkar, In 2.5, Kafka Streams used an assignor that tried to strike a compromise between stickiness and workload balance, so you would observe some stickiness, but not all the time. In 2.6, we introduced the "high availability task assignor" (see KIP-441

Re: same keys appearing in state stores on different pods when using branches in kafka streams

2022-12-05 Thread John Roesler
ined on local state store on same pod >> >> 3rd event with CLOSED status, with key xyz came in and processed. The >> state is stored in 'record' state store, it is expected to be stored in >> state store on same pod. >> Why it would go to some other pod? >> >>

Re: Kafka service unavailable

2022-12-05 Thread John Roesler
Hi Rakesh, I'm sorry for your trouble. The mailing list doesn't transmit embedded images, so we can't see the information you provided. Maybe you can create a Github Gist or open a Jira ticket at https://issues.apache.org/jira/projects/KAFKA/issues ? Thanks, -John On Sun, Dec 4, 2022, at

Re: Stream sinks are not constructed when application starts up before Kafka broker

2022-11-23 Thread John Roesler
Hi Alexander, I’m sorry to hear that. It certainly sounds like a hard one to debug. To clarify, do you mean that when you observe this problem, the sink node is not in the topology at all, or that it is in the topology, but does not function properly? Also, are you using Spring to construct

Re: same keys appearing in state stores on different pods when using branches in kafka streams

2022-11-23 Thread John Roesler
Hi Pushkar, Thanks for the question. I think that what’s happening is that, even though both branches use the same grouping logic, Streams can’t detect that they are the same. It just sees two group-bys and therefore introduces two repartitions, with a separate downstream task for each. You

Re: [ANNOUNCE] New Kafka PMC Member: Bruno Cadonna

2022-11-01 Thread John Roesler
Congratulations, Bruno!!! On Tue, Nov 1, 2022, at 15:16, Lucas Brutschy wrote: > Wow, congratulations! > > On Tue, Nov 1, 2022 at 8:55 PM Chris Egerton wrote: >> >> Congrats! >> >> On Tue, Nov 1, 2022, 15:44 Bill Bejeck wrote: >> >> > Congrats Bruno! Well deserved. >> > >> > -Bill >> > >> > On

Re: Reprocessing messsages in kafka streams vs Transformer init

2022-10-07 Thread John Roesler
Hello Tomasz, Thanks for the question! Streams should always call init() before passing any records to transform(...). When we talk about "reprocessing", we just mean that some record was processed, but then there was a failure before its offset was committed, and therefore we have to process

Re: [VOTE] 3.3.1 RC0

2022-09-30 Thread John Roesler
Hi José, I verified the signatures and ran all the unit tests, as well as the Streams integration tests with: > ./gradlew -version > > > Gradle 7.4.2 > > > Build time:

Re: Out of order messages when kafka streams application catches up

2022-09-30 Thread John Roesler
and try adjusting the test setup until you're able to reproduce the behavior you're seeing? If you can do that, I think we will get to the bottom of it. Thanks, -John On Fri, Sep 30, 2022, at 09:51, John Roesler wrote: > Hi Tomasz, > > Thanks for trying that out. It’s not the way I

Re: Out of order messages when kafka streams application catches up

2022-09-30 Thread John Roesler
0.0 with positive task.max.idle.ms and it did > not help. > When lag is large, the application still consumes data batches without > interleaving. > > > > wt., 27 wrz 2022 o 05:51 John Roesler napisał(a): > >> Hi Tomasz, >> >> Thanks for asking. This soun

Re: Out of order messages when kafka streams application catches up

2022-09-26 Thread John Roesler
Hi Tomasz, Thanks for asking. This sounds like the situation that we fixed in Apache Kafka 3.0, with KIP-695 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-695%3A+Further+Improve+Kafka+Streams+Timestamp+Synchronization). Can you try upgrading and let us know if this fixes the problem?

Re: [VOTE] 3.3.0 RC2

2022-09-26 Thread John Roesler
Thanks for running this, David! I've verified the signatures, looked at the docs, and run the quickstart (ZK and KRaft). I also ran the unit tests, as well as all the tests for Streams locally. The docs look a little malformed (the "collapse/expand" button floats over the text, the collapsed

Re: UnsupportedOperationException: this should not happen: timestamp() is not supported in standby tasks.

2022-08-23 Thread John Roesler
Hi Suresh, Sorry for the trouble! Are you able to provide the rest of the stack trace? It shouldn’t be possible to call put() on a store in a standby task, so we need to see the stack frames that show what is calling it. Thanks, John On Tue, Aug 23, 2022, at 05:08, Suresh Rukmangathan

Re: [ANNOUNCE] New Kafka PMC Member: A. Sophie Blee-Goldman

2022-08-02 Thread John Roesler
Congratulations, Sophie! -John On Tue, Aug 2, 2022, at 06:40, Chris Egerton wrote: > Congrats, Sophie! > > On Mon, Aug 1, 2022 at 9:21 PM Luke Chen wrote: > >> Congrats Sophie! :) >> >> Luke >> >> On Tue, Aug 2, 2022 at 7:56 AM Adam Bellemare >> wrote: >> >> > Congratulations Sophie! I’m glad

Re: question: kafka stream Tumbling Window can't close when no producer sending message

2022-07-28 Thread John Roesler
Hello, Yes, this is correct. There is a difference between what we call “stream time” and regular “wall-clock time”. All the windowing operations need to be deterministic, otherwise your results would depend on when you run your program. For that reason, we have “stream time”, which takes its

Re: [ANNOUNCE] New Committer: Chris Egerton

2022-07-25 Thread John Roesler
Congratulations, Chris!! -John On Mon, Jul 25, 2022, at 20:22, Luke Chen wrote: > Congratulations Chris! Well deserved! > > Luke > > On Tue, Jul 26, 2022 at 5:39 AM Anna McDonald > wrote: > >> Congratulations Chris! Time to Cellobrate! >> >> anna >> >> On Mon, Jul 25, 2022 at 4:23 PM Martin

Re: KStreams State Store - state.dir does not have .checkpoint file

2022-06-01 Thread John Roesler
Hi Neeraj, Thanks for all that detail! Your expectation is correct. You should see the checkpoint files after a _clean_ shutdown, and then you should not see it bootstrap from the beginning of the changelog on the next startup. How are you shutting down the application? You'll want to call

Re: Subscribing to users@kafka.apache.org

2022-04-15 Thread John Roesler
Hi Lorcan, Thanks for your interest! The instructions for subscribing are available here: https://kafka.apache.org/contact Thanks, John On Fri, Apr 15, 2022, at 11:28, Lorcan Cooke wrote: > Hi, > > > I would like to subscribe to users@kafka.apache.org please. > > > Regards, > > — > Lorcan

Re: [kafka-clients] [ANNOUNCE] Apache Kafka 3.0.1

2022-03-14 Thread John Roesler
Yes, thank you, Mickael! -John On Mon, 2022-03-14 at 12:19 +0100, Bruno Cadonna wrote: > Thanks Mickael for driving this release! > > Best, > Bruno > > On 14.03.22 11:42, Mickael Maison wrote: > > The Apache Kafka community is pleased to announce the release for > > Apache Kafka 3.0.1 > > > >

[ANNOUNCE] Apache Kafka 2.8.0

2021-04-19 Thread John Roesler
, Ismael Juma, Ivan Ponomarev, Ivan Yurchenko, jackyoh, James Cheng, James Yuzawa, Jason Gustafson, Jesse Gorzinski, Jim Galasyn, John Roesler, Jorge Esteban Quilcate Otoya, José Armando García Sancio, Julien Chanaud, Julien Jean Paul Sirocchi, Justine Olshan, Kengo Seki, Kowshik Prakasam, leah, Lee

Re: [kafka-clients] Re: Subject: [VOTE] 2.8.0 RC1

2021-04-14 Thread John Roesler
n, Apr 12, 2021 at 8:47 PM John Roesler > wrote: > > Good catch, Israel! > > > > I’ll make sure that gets fixed. > > > > Thanks, > > John > > > > On Mon, Apr 12, 2021, at 19:30, Israel Ekpo wrote: > > > I just noticed that with the la

[VOTE] 2.8.0 RC2

2021-04-14 Thread John Roesler
Hello Kafka users, developers and client-developers, This is the third candidate for release of Apache Kafka 2.8.0.This is a major release that includes many new features, including: * Early-access release of replacing Zookeeper with a self-managed quorum * Add Describe Cluster API *

Re: [kafka-clients] Re: Subject: [VOTE] 2.8.0 RC1

2021-04-12 Thread John Roesler
t; > On Fri, Apr 9, 2021 at 4:52 PM Bill Bejeck wrote: > > > Hi John, > > > > Thanks for running the 2.8.0 release! > > > > I've started to validate it and noticed the site-docs haven't been > > installed to https://kafka.apache.org/28/documentation.html yet. >

Subject: [VOTE] 2.8.0 RC1

2021-04-06 Thread John Roesler
Hello Kafka users, developers and client-developers, This is the second candidate for release of Apache Kafka 2.8.0. This is a major release that includes many new features, including: * Early-access release of replacing Zookeeper with a self- managed quorum * Add Describe Cluster API * Support

Re: enforceRebalance using kafka admin APIs anf not consumer client API.

2021-04-01 Thread John Roesler
Hi Mazen, This sounds like a good use case. If you’d like, you can start a KIP to add an enforceRebalance() method to the admin client interface. Feel free to ask here for any guidance on the KIP process itself. Regarding your usage, you will actually have to call poll(). That is the point at

Re: [VOTE] 2.8.0 RC0

2021-03-30 Thread John Roesler
release, but wanted to get RC0 out asap for testing. Thank you, John On Tue, 2021-03-30 at 16:37 -0500, John Roesler wrote: > Hello Kafka users, developers and client-developers, > > This is the first candidate for release of Apache Kafka > 2.8.0. This is a major release that inclu

[VOTE] 2.8.0 RC0

2021-03-30 Thread John Roesler
Hello Kafka users, developers and client-developers, This is the first candidate for release of Apache Kafka 2.8.0. This is a major release that includes many new features, including: * Early-access release of replacing Zookeeper with a self- managed quorum * Add Describe Cluster API * Support

Re: Processor API in 2.7

2021-03-22 Thread John Roesler
Hi Ross, Thanks for the feedback! For some "context," the change you mention was: https://cwiki.apache.org/confluence/display/KAFKA/KIP-478+-+Strongly+typed+Processor+API In addition to the spec on that page, there are links to the discussion and voting mailing list threads. The primary

Re: Kafka 2.7.0 processor API and streams-test-utils changes

2021-02-06 Thread John Roesler
Hello Upesh, I’m sorry for the trouble. This was my feature, and my oversight. I will update the docs on Monday. The quick answer is that there is also a new api.MockProcessorContext, which is compatible with the new interface. That class also provides getStateStoreContext to use with the

Re: Event Sourcing with Kafka Streams and processing order of a re-entrant pipeline

2021-01-31 Thread John Roesler
Hi David, Thank you for the question. If I can confirm, it looks like the "operations" topic is the only input to the topology, and the topology reads the "operations" topic joined with the "account" table and generates a "movements" stream. It reads (and aggregates) the "movements" stream to

Re: kafka-streams: interaction between max.poll.records and window expiration ?

2020-12-21 Thread John Roesler
Hi Mathieu, I don’t think there would be any problem. Note that window expiry is computed against an internal clock called “stream time”, which is the max timestamp yet observed. This time is advanced per each record when that record is processed. There is a separate clock for each partition,

Re: In Memory State Store

2020-12-21 Thread John Roesler
Hi Navneeth, Yes, you are correct. I think there are some opportunities for improvement there, but there are also reasons for it to be serialized in the in-memory store. Off the top of my head, we need to serialize stored data anyway to send it to the changelog. Also, even though the store

Re: Punctuate NPE

2020-12-15 Thread John Roesler
Hi Navneeth, I'm sorry for the trouble. Which version of Streams are you using? Also, this doesn't look like the full stacktrace, since we can't see the NPE itself. Can you share the whole thing? Thanks, -John On Tue, 2020-12-15 at 00:30 -0800, Navneeth Krishnan wrote: > Hi All, > > I have a

Re: Apache Kafka Powered By : Proposition La Redoute

2020-12-15 Thread John Roesler
Hi Antoine, Thanks for your message! I couldn't see the logo; I think the mailing list software doesn't transmit attachments. Perhaps you could just send a pull request to add La Redoute to https://github.com/apache/kafka-site/edit/asf-site/powered-by.html ? Feel free to reply here if you have

Re: [VOTE] 2.7.0 RC5

2020-12-14 Thread John Roesler
Thanks for this release, Bill, I ran though the quickstart (just the zk, broker, and console clients part), verified the signatures, and also built and ran the tests. I'm +1 (binding). Thanks, -John On Mon, 2020-12-14 at 14:58 -0800, Guozhang Wang wrote: > I checked the docs and ran unit

Re: Kafka Streams: Unexpected data loss

2020-11-20 Thread John Roesler
Hello Jeffrey, I’m sorry for the trouble. I appreciate your diligence in tracking this down. In reading your description, nothing jumps out to me as problematic. I’m a bit at a loss as to what may have been the problem. >- Is there a realistic scenario (e.g. crash, rebalance) which you

Re: kafka-streams / window expiration because of shuffling

2020-11-20 Thread John Roesler
Hi Mathieu, Ah, that is unfortunate. I believe your analysis is correct. In general, we have no good solution to the problem of upstream tasks moving ahead of each other and causing disorder in the repartition topics. Guozhang has done a substantial amount of thinking on this subject, though,

Re: Kafka Streams: SessionWindowedSerde Vs TimeWindowedSerde. Ambiguous implicit values

2020-11-19 Thread John Roesler
)) > .count() > .toStream > .to("streams-pipe-output") > > builder.build() > } > } > > > > > On Thu, Nov 19, 2020 at 7:24 AM John Roesler wrote: > > > Hi Eric, > > > > Sure thing. Assuming the

Re: Kafka Streams: SessionWindowedSerde Vs TimeWindowedSerde. Ambiguous implicit values

2020-11-19 Thread John Roesler
, can you please > show me how? > > def to(topic: String)(implicit produced: Produced[K, V]): Unit = > inner.to(topic, produced) > > > Also not sure how to use a self documenting format like JSON. Any > examples to share? > > > On Wed, Nov 18, 2020 at 5:14

Re: Kafka Streams: SessionWindowedSerde Vs TimeWindowedSerde. Ambiguous implicit values

2020-11-18 Thread John Roesler
Hi Eric, Ah, that’s a bummer. The correct serde is the session windowed serde, as I can see you know. I’m afraid I’m a bit rusty on implicit resolution rules, so I can’t be much help there. But my general recommendation for implicits is that when things get weird, just don’t use them at all.

Re: StreamBuilder#stream multiple topic and consumed api

2020-11-13 Thread John Roesler
Hi Uirco, The method that doesn’t take Consumed will fall back to the configured “default serdes”. If you don’t have that confit set, it will just keep them as byte arrays, which will probably give you an exception at runtime. You’ll probably want to use the Consumed argument to set your

Re: started getting TopologyException: Invalid topology after moving to streams-2.5.1

2020-10-05 Thread John Roesler
Hi Pushkar, Sorry for the trouble. Can you share your config and topology description? If I read your error message correctly, it says that your app is configured with no source topics and no threads. Is that accurate? Thanks, -John On Mon, 2020-10-05 at 15:04 +0530, Pushkar Deole wrote: > Hi

Re: Kafka stream error - Consumer is not subscribed to any topics or assigned any partitions

2020-09-14 Thread John Roesler
Hi Pushkar, I'd recommend always keeping Streams and the Clients at the same version, since we build, test, and release them together. FWIW, I think there were some bugfixes for the clients in 2.5.1 anyway. Thanks, -John On Mon, 2020-09-14 at 20:08 +0530, Pushkar Deole wrote: > Sophie, one more

Re: Handle exception in kafka stream

2020-09-01 Thread John Roesler
Hi Deepak, It sounds like you're saying that the exception handler is correctly indicating that Streams should "Continue", and that if you stop the app after handling an exceptional record but before the next commit, Streams re-processes the record? If that's what you're seeing, then it's how

Re: JNI linker issue on ARM (Raspberry PI)

2020-08-24 Thread John Roesler
ated the dependency > > to > > get this fix in 2.6. See KAFKA-9225 > > <https://issues.apache.org/jira/browse/KAFKA-9225> > > > > If you already were running 2.6, then, that's unfortunate. You might have > > some luck > > asking the rocksdb folks if all els

Re: JNI linker issue on ARM (Raspberry PI)

2020-08-24 Thread John Roesler
Hi Steve, Which version of Streams is this? I vaguely recall that we updated to a version of Rocks that’s compiled for ARM, and I think some people have used it on ARM, but I might be misremembering. I’m afraid I can’t be much help in debugging this, but maybe some others on the list have

Re: Request to add me to the contributors list

2020-08-17 Thread John Roesler
Hello Sanjay, I've just added you to the contributor list in Jira, so you should be able to assign tickets now. Thanks for your interest in the project! -John On Sun, 2020-08-16 at 21:27 -0500, Sanjay Y R wrote: > Hello, > > I am Sanjay. I am a Kafka user intending to contribute to Kafka Open

Re: Documentation suggestion / issue and question about ticket openning

2020-08-12 Thread John Roesler
Hello Ahmed, Thanks for this feedback. I can see what you mean. I know that there is a redesign currently in progress for the site, but I'm not sure if the API/Config documentation is planned as part of that effort. Here's the PR to re- design the home page:

[ANNOUNCE] Apache Kafka 2.5.1

2020-08-11 Thread John Roesler
“Andy” Fang, Dima Reznik, Ego, Evelyn Bayes, Ewen Cheslack-Postava, Greg Harris, Guozhang Wang, Ismael Juma, Jason Gustafson, Jeff Widman, Jeremy Custenborder, jiameixie, John Roesler, Jorge Esteban Quilcate Otoya, Konstantine Karantasis, Lucent-Wong, Mario Molina, Matthias J. Sax, Navinder Pal

Re: join not working in relation to how StreamsBuilder builds the topology

2020-08-11 Thread John Roesler
er. > (aaand I spent entire days on it) > > Problem solved > Thanks > > Mathieu > > Le mar. 11 août 2020 à 07:18, John Roesler a écrit : > > > Hi Mathieu, > > > > That sounds frustrating. I’m sorry for the trouble. > > > > From what you

Re: join not working in relation to how StreamsBuilder builds the topology

2020-08-10 Thread John Roesler
Hi Mathieu, That sounds frustrating. I’m sorry for the trouble. >From what you described, it does sound like something wacky is going on with >the partitioning. In particular, the fact that both joins work when you set >everything to 1 partition. You mentioned that you’re using the default

[RESULTS] [VOTE] Release Kafka version 2.5.1

2020-08-04 Thread John Roesler
Hello all, This vote passes with four +1 votes (3 binding) and no 0 or -1 votes. +1 votes PMC Members (in voting order): * Ismael Juma * Manikumar Reddy * Mickael Maison Committers (in voting order): * John Roesler Community: * No votes 0 votes * No votes -1 votes * No votes Vote thread

Re: [VOTE] 2.5.1 RC0

2020-07-30 Thread John Roesler
Hello again all, Just a reminder that the 2.5.1 RC0 is available for verification. Thanks, John On Thu, Jul 23, 2020, at 21:39, John Roesler wrote: > Hello Kafka users, developers and client-developers, > > This is the first candidate for release of Apache Kafka 2.5.1. > > Apa

[VOTE] 2.5.1 RC0

2020-07-23 Thread John Roesler
Hello Kafka users, developers and client-developers, This is the first candidate for release of Apache Kafka 2.5.1. Apache Kafka 2.5.1 is a bugfix release and fixes 72 issues since the 2.5.0 release. Please see the release notes for more information. Release notes for the 2.5.1 release:

Re: Confluent Platform- KTable clarification

2020-07-22 Thread John Roesler
Hello Nag, Yes, your conclusion sounds right. “Sum the values per key” is a statement that doesn’t really make sense in a KTable context, since there is always just one value per key (the latest update). I think the docs are just trying to drive the point home that in a KTable, there is just

Re: [ANNOUNCE] New committer: Boyang Chen

2020-06-22 Thread John Roesler
That’s great news! Congratulations, Boyang. It’s well deserved. -John On Mon, Jun 22, 2020, at 18:26, Guozhang Wang wrote: > The PMC for Apache Kafka has invited Boyang Chen as a committer and we are > pleased to announce that he has accepted! > > Boyang has been active in the Kafka community

Re: Clients may fetch incomplete set of topic partitions during cluster startup

2020-05-29 Thread John Roesler
Hello, Thanks for the question. It looks like the ticket is still open, so I think it's safe to say it's not fixed. If you're affected by the issue, it would be helpful to leave a comment on the ticket to that effect. Thanks, -John On Fri, May 29, 2020, at 00:05, Debraj Manna wrote: > Anyone

Re: NEED HELP : OutOfMemoryError: Java heap space error while starting KafkaStream with a simple topology

2020-05-28 Thread John Roesler
Woah, that's a nasty bug. I've just pinged the Jira ticket. Please feel free to do the same. Thanks, -John On Thu, May 28, 2020, at 02:55, Pushkar Deole wrote: > Thanks for the help Guozhang! > however i realized that the exception and actual problem is totally > different. The problem was the

Re: Kafka Streams, WindowBy, Grace Period, Late events, Suppres operation

2020-05-11 Thread John Roesler
Hello Baki, It looks like option 2 is really what you want. The purpose of the time window stores is to allow deleting old data when you need to group by a time dimension, which naturally results in an infinite key space. If you don’t want to wait for the final result, can you just not add

Re: records with key as string and value as java ArrayList in topic

2020-05-11 Thread John Roesler
Oh, my mistake. I thought this was a different thread :) You might want to check, but I don’t think there is a kip for a map serde. Of course, you’re welcome to start one. Thanks, John On Mon, May 11, 2020, at 09:14, John Roesler wrote: > Hi Pushkar, > > I don’t think there i

Re: records with key as string and value as java ArrayList in topic

2020-05-11 Thread John Roesler
:47, Pushkar Deole wrote: > John, > is there KIP in progress for supporting Java HashMap also? > > On Sun, May 10, 2020, 00:47 John Roesler wrote: > > > Yes, that’s correct. It’s only for serializing the java type ‘byte[]’. > > > > On Thu, May 7, 2020, at 10:37,

Re: can kafka state stores be used as a application level cache by application to modify it from outside the stream topology?

2020-05-11 Thread John Roesler
or it gets it from the backed topic everytime? > Secondly, what kind of internal data structure does it use? Is it good for > constant time performance? > > On Thu, May 7, 2020 at 7:27 PM John Roesler wrote: > > > Hi Pushkar, > > > > To answer your question about

Re: records with key as string and value as java ArrayList in topic

2020-05-09 Thread John Roesler
he way, what is the byteArrayserializer? As the name suggests, it is > for byte arrays so won't work for java ArrayList, right? > > On Thu, May 7, 2020 at 8:44 PM John Roesler wrote: > > > Hi Pushkar, > > > > If you’re not too concerned about compactness, I think Jackson

Re: records with key as string and value as java ArrayList in topic

2020-05-07 Thread John Roesler
Hi Pushkar, If you’re not too concerned about compactness, I think Jackson json serialization is the easiest way to serialize complex types. There’s also a kip in progress to add a list serde. You might take a look at that proposal for ideas to write your own. Thanks, John On Thu, May 7,

Re: can kafka state stores be used as a application level cache by application to modify it from outside the stream topology?

2020-05-07 Thread John Roesler
efully. > > Thanks again! > > On Mon, May 4, 2020 at 11:18 PM Pushkar Deole wrote: > > > Thanks John... what parameters would affect the latency in case > > GlobalKTable will be used and is there any configurations that could be > > tuned to minimize the latency

Re: can kafka state stores be used as a application level cache by application to modify it from outside the stream topology?

2020-05-04 Thread John Roesler
nce and it tried to read from store cache then it doesn't > > get the data, so the event passed on without enriched data. > > That's pretty much about the use case. > > > > > > On Sun, May 3, 2020 at 9:42 PM John Roesler wrote: > > > >> Hi Pushkar, >

Re: can kafka state stores be used as a application level cache by application to modify it from outside the stream topology?

2020-05-03 Thread John Roesler
Hi Pushkar, I’ve been wondering if we should add writable tables to the Streams api. Can you explain more about your use case and how it would integrate with your application? Incidentally, this would also help us provide more concrete advice. Thanks! John On Fri, May 1, 2020, at 15:28,

Re: Deserialization Exception when performing state store operations

2020-04-20 Thread John Roesler
Hi Carl, That sounds pretty frustrating; sorry about that. I think I got a hint, but I'm not totally clear on the situation. It shouldn't be possible for data to get into the store if it can't be handled by the serde. There is a specific issue with global stores, but it doesn't sound like that's

Re: Kafka Streams - issues with windowing and suppress

2020-04-20 Thread John Roesler
t; PR submitted :) https://github.com/apache/kafka/pull/8520 > > > > On Mon, Apr 20, 2020 at 2:34 PM John Roesler wrote: > > > >> Hi Liam, > >> > >> That sounds like a good idea to me. In fact, I’d go so far as to say we > >> should just chang

Re: Unexpected behaviour on windowing aggregations

2020-04-19 Thread John Roesler
< I'm deploying a fixed > version of it as we speak. Thanks for the reply though :) > > Kind regards, > > Liam Clarke > > > > On Mon, 20 Apr. 2020, 2:08 am John Roesler, wrote: > > > Hi Liam, > > > > I took a quick look. On the output side, it

Re: Kafka Streams - issues with windowing and suppress

2020-04-19 Thread John Roesler
ne. To prepare for this change, we could start to log a WARN > > message, if a user does not set the grace period explicitly for now. > > > > Just my 2 ct. Thoughts? > > > > -Matthias > > > > On 4/19/20 7:40 AM, John Roesler wrote: > > > Oh, ma

Re: Kafka Streams - issues with windowing and suppress

2020-04-19 Thread John Roesler
> Liam Clarke-Hutchinson > > On Thu, Apr 16, 2020 at 1:59 AM John Roesler wrote: > > > Boom, you got it, Liam! Nice debugging work. > > > > This is a pretty big bummer, but I had to do it that way for > > compatibility. I added a log message to try and help red

Re: Unexpected behaviour on windowing aggregations

2020-04-19 Thread John Roesler
Hi Liam, I took a quick look. On the output side, it looks like you’re adding the count to the prior count. Should that just set the outbound vale to the new count? Maybe I misunderstood the situation. What I mean is, suppose you get two events for the same window: Inbound map := 0+1 = 1

Re: How to add partitions to an existing kafka topic

2020-04-17 Thread John Roesler
ion the kafka completely re-distributes all > the older messages too among all the partitions. > And if it does that then does it ensure that in this re-distributions it > keeps messages of same key in same partition. > > Thanks > Sachin > > > > > On Wed, Apr 15, 2020 at

Re: How to add partitions to an existing kafka topic

2020-04-15 Thread John Roesler
Hi Sachin, Just to build on Boyang’s answer a little, when designing Kafka’s partition expansion operation, we did consider making it work also for dynamically repartitioning in a way that would work for Streams as well, but it added too much complexity, and the contributor had some other use

Re: Kafka Streams - issues with windowing and suppress

2020-04-15 Thread John Roesler
Boom, you got it, Liam! Nice debugging work. This is a pretty big bummer, but I had to do it that way for compatibility. I added a log message to try and help reduce the risk, but it’s still kind of a trap. I’d like to do a KIP at some point to consider changing the default grace period,

Re: Kafka Streams endless rebalancing

2020-04-10 Thread John Roesler
ould the > OVERALL length of time needed to fully restore the state stores (which > could be multiple topics with multiple partitions) be exceeding some > timeout or threshold? Thanks again for any ideas, > > > > Alex C > > > On Thu, Apr 9, 2020 at 9:36 AM John

Re: Kafka Streams endless rebalancing

2020-04-09 Thread John Roesler
Hi Alex, It sounds like your theory is plausible. After a rebalance, Streams needs to restore its stores from the changelog topics. Currently, Streams performs this restore operation in the same loop that does processing and polls the consumer for more records. If the restore batches (or the

Re: Passing states stores around

2020-03-10 Thread John Roesler
each. Since the code is fairly large per function, I have > > them split into classes by functionalities. and some method in the call > > stack would need to access this state. What do you recommend in such > > scenarios? > > > > Thanks > > > > On Mon, Mar 9, 2020

Re: Passing states stores around

2020-03-09 Thread John Roesler
Hi Navneeth, This sounds like an unusual use case. Can you provide more information on why this is required? Thanks, John On Mon, Mar 9, 2020, at 12:48, Navneeth Krishnan wrote: > Hi All, > > Any suggestions? > > Thanks > > On Sat, Mar 7, 2020 at 10:13 AM Navneeth Krishnan > wrote: > > >

Re: KafkaStreams GroupBy with new key. Can I skip repartition?

2020-03-02 Thread John Roesler
ap. But we > > would need to do a KIP to introduce some API to allow user to tell > > Kafka Streams that repartitioning is not necessary. > > > > In Apache Flink, there is an operator called > > `reinterpretAsKeyedStream`. We could introduce something similar. >

Re: KafkaStreams GroupBy with new key. Can I skip repartition?

2020-02-29 Thread John Roesler
Hi all, The KIP is accepted and implemented already, but is blocked on code review: https://github.com/apache/kafka/pull/7170 A quick note on the lack of recent progress... It's completely our fault, the reviews fell by the wayside during the 2.5.0 release cycle, and we haven't gotten back to

Re: subscribe kafka user mail

2020-02-27 Thread John Roesler
Hi there! To subscribe to the list, you have to email a different address: users-subscr...@kafka.apache.org. (see https://kafka.apache.org/contact.html). This also applies to the message you sent to dev (should have been dev-subscr...@kafka.apache.org). Thanks for joining the conversation!

Re: [ANNOUNCE] New committer: Konstantine Karantasis

2020-02-26 Thread John Roesler
Congrats, Konstantine! Awesome news. -John On Wed, Feb 26, 2020, at 16:39, Bill Bejeck wrote: > Congratulations Konstantine! Well deserved. > > -Bill > > On Wed, Feb 26, 2020 at 5:37 PM Jason Gustafson wrote: > > > The PMC for Apache Kafka has invited Konstantine Karantasis as a committer > >

Re: KTable in Compact Topic takes too long to be updated

2020-02-20 Thread John Roesler
in the streams, but the order is not updated yet. So we add the > item2 to order and instead of having: > Order(item1, item2) > > we have > > Order(item2) > I hope I made more clear our scenario. > Regards, > > Renato de Melo > >Em quarta-feir

Re: KTable in Compact Topic takes too long to be updated

2020-02-19 Thread John Roesler
Hi Renato, Can you describe a little more about the nature of the join+aggregation logic? It sounds a little like the KTable represents the result of aggregating messages from the KStream? If that's the case, the operation you probably wanted was like: > KStream.groupBy().aggregate() which

Re: Using Kafka AdminUtils

2020-02-16 Thread John Roesler
c functionality (like > createTopics) remains pretty stable. > Is it considered stable enough for production? > > Thanks, > Victoria > > > On 16/02/2020, 20:15, "John Roesler" wrote: > > Hi Victoria, > > I’ve used the AdminClient for this kind

Re: Using Kafka AdminUtils

2020-02-16 Thread John Roesler
Hi Victoria, I’ve used the AdminClient for this kind of thing before. It’s the official java client for administrative actions like creating topics. You can create topics with any partition count, replication, or any other config. I hope this helps, John On Sat, Feb 15, 2020, at 22:41,

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-11 Thread John Roesler
their timestamps > from given partition of the changelog topic via a command line tool (or in > some other way) - to confirm if they are really stored this way. If you > have a tip on how to do it, please let me know. > > That is all I have for now. I would like to resolve it. I will po

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread John Roesler
ough, or > how likely it is that anyone really pays attention to it? > > On Mon, Feb 10, 2020 at 9:53 AM John Roesler wrote: > > > Hi, > > > > I’m sorry for the trouble. It looks like it was a mistake during > > > > https://github.com/apache/kafka

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread John Roesler
Hi, I’m sorry for the trouble. It looks like it was a mistake during https://github.com/apache/kafka/pull/6521 Specifically, while addressing code review comments to change a bunch of other logs from debugs to warnings, that one seems to have been included by accident:

Re: Resource based kafka assignor

2020-01-31 Thread John Roesler
Hi Srinivas, Your approach sounds fine, as long as you don’t need the view of the assignment to be strictly consistent. As a roughy approximation, it could work. On the other hand, if you’re writing a custom assignor, you could consider using the SubscriptionInfo field of the joinGroup

Re: stop

2020-01-22 Thread John Roesler
Hey Sowjanya, That won't work. The "welcome" email you got when you signed up for the mailing list has instructions for unsubscribing: > To remove your address from the list, send a message to: > Cheers, -John On Wed, Jan 22, 2020, at 10:12, Sowjanya Karangula wrote: > stop >

Re: Does Merging two kafka-streams preserve co-partitioning

2020-01-20 Thread John Roesler
Hi Yair, You should be fine! Merging does preserve copartitioning. Also processing on that partition is single-threaded, so you don’t have to worry about races on the same key in your transformer. Actually, you might want to use transformValues to inform Streams that you haven’t changed the

Re: KTable Suppress not working

2020-01-19 Thread John Roesler
is flushed? > > Thanks, > Sushrut > > > > > > > On Sat, Jan 18, 2020 at 12:00 PM Sushrut Shivaswamy < > sushrut.shivasw...@gmail.com> wrote: > > > Thanks John, > > I'll try increasing the "CACHE_MAX_BYTES_BUFFERING_CONFIG" &

Re: KTable Suppress not working

2020-01-17 Thread John Roesler
to directly solve the given error instead of trying something different. Thanks, -John On Fri, Jan 17, 2020, at 23:33, John Roesler wrote: > Hi Sushrut, > > That's frustrating... I haven't seen that before, but looking at the error > in combination with what you say happens without suppres

Re: KTable Suppress not working

2020-01-17 Thread John Roesler
Hi Sushrut, That's frustrating... I haven't seen that before, but looking at the error in combination with what you say happens without suppress makes me think there's a large volume of data involved here. Probably, the problem isn't specific to suppression, but it's just that the interactions on

Re: [ANNOUNCE] New Kafka PMC Members: Colin, Vahid and Manikumar

2020-01-14 Thread John Roesler
Congrats, Colin, Vahid, and Manikumar! A great accomplishment, reflecting your great work. -John On Tue, Jan 14, 2020, at 11:33, Bill Bejeck wrote: > Congrats Colin, Vahid and Manikumar! Well deserved. > > -Bill > > On Tue, Jan 14, 2020 at 12:30 PM Gwen Shapira wrote: > > > Hi everyone, > >

Re: designing a streaming task for count and event time difference

2020-01-05 Thread John Roesler
Hey Chris, Yeah, I think what you’re really looking for is data-driven windowing, which we haven’t implemented yet. In lieu of that, you’ll want to build on top of session windows. What you can do is define an aggregate object similar to what Sachin proposed. After the aggregation, you can

Re: Kafka trunk vs master branch

2019-12-25 Thread John Roesler
Hi Sachin, Trunk is the basis for development. I’m not sure what master is for, if anything. I’ve never used it for anything or even checked it out. The numbered release branches are used to develop patch releases. Releases are created from trunk, PRs should be made against trunk, etc.

  1   2   >