Re: Request for Comments/Resolution - KAFKA-10465

2020-09-10 Thread Guozhang Wang
Hello, Thanks for pinging the community. I've taken a look at your code at KAFKA-10465 and replied in the ticket. Guozhang On Mon, Sep 7, 2020 at 8:46 AM M. Manna wrote: > Hello, > > We do appreciate that release 2.7 is keeping us occupied, but this bug (or > not) is holding us back from

Kafka Meetup hosted Online, Wednesday 5:00pm, Sep 2nd, 2020

2020-09-01 Thread Guozhang Wang
Hello folks, The Bay Area Kafka meetup will continue to be hosted online this month, tomorrow (Sep 2nd) at 5:00pm. This time we will have guest speakers from Twitter to talk about their journey to adopt Apache Kafka. *RSVP*: https://www.meetup.com/KafkaBayArea/events/272643868/ *Date* 5:00pm,

Re: Can't trace any sendfile system call from Kafka process

2020-08-31 Thread Guozhang Wang
Hi Ming, Maybe this ticket could be useful to you: https://issues.apache.org/jira/browse/KAFKA-7504 Guozhang On Fri, Aug 28, 2020 at 8:21 AM Ming Liu wrote: > Hi > One major reason that Kafka is fast is because it is using sendfile() > for zero copy, as what it described at >

Re: JNI linker issue on ARM (Raspberry PI)

2020-08-27 Thread Guozhang Wang
Thanks for the update Steve! This is very helpful and I find the blog is a good read too! Appreciate your contribution to the community. Guozhang On Thu, Aug 27, 2020 at 11:06 AM Steve Jones wrote: > Ok so an update here, it's just a learning thing alongside the day job so > it took a little

Re: [VOTE] KIP-657: Add Customized Kafka Streams Logo

2020-08-19 Thread Guozhang Wang
al with regard to the Kafka logo. > > > Would it be possible to change Design B accordingly? > > > > > > I am not a font expert, but the fonts in both design are different and > I > > > am wondering if there is an official Apache Kafka font that we should

Re: [ANNOUNCE] Apache Kafka 2.5.1

2020-08-11 Thread Guozhang Wang
Bejeck, Boyang > Chen, Bruno Cadonna, Chia-Ping Tsai, Chris Egerton, David > Arthur, David Jacot, Dezhi “Andy” Fang, Dima Reznik, Ego, > Evelyn Bayes, Ewen Cheslack-Postava, Greg Harris, Guozhang > Wang, Ismael Juma, Jason Gustafson, Jeff Widman, Jeremy > Custenborder, jiameixie, John Roe

Re: Preliminary blog post about the Apache 2.6.0 release

2020-08-05 Thread Guozhang Wang
Thanks Randall, I made a pass over the gdoc and it looks great. Guozhang On Wed, Aug 5, 2020 at 9:48 AM Randall Hauch wrote: > Thanks, Jason. We haven't done that for a few releases, but I think it's a > great idea. I've updated the blog post draft and the Google doc to mention > the 127

[ANNOUNCE] New committer: Xi Hu

2020-06-24 Thread Guozhang Wang
The PMC for Apache Kafka has invited Xi Hu as a committer and we are pleased to announce that he has accepted! Xi Hu has been actively contributing to Kafka since 2016, and is well recognized especially for his non-code contributions: he maintains a tech blog post evangelizing Kafka in the

[ANNOUNCE] New committer: Boyang Chen

2020-06-22 Thread Guozhang Wang
The PMC for Apache Kafka has invited Boyang Chen as a committer and we are pleased to announce that he has accepted! Boyang has been active in the Kafka community more than two years ago. Since then he has presented his experience operating with Kafka Streams at Pinterest as well as several

Re: NEED HELP : OutOfMemoryError: Java heap space error while starting KafkaStream with a simple topology

2020-05-28 Thread Guozhang Wang
ers, I think the problem is very misleading and should be > > fixed as soon as possible, or a proper exception should be thrown. > > > > On Thu, May 28, 2020 at 9:46 AM Guozhang Wang > wrote: > > > > > Hello Pushkar, > > > > > > I think the

Re: How to handle RebalanceInProgressException?

2020-04-28 Thread Guozhang Wang
Apr 28, 2020 at 1:34 AM Benoit Delbosc wrote: > Hi Guozhang, > thanks for your reply > > On 27.04.20 19:17, Guozhang Wang wrote: > > Hello Ben, > > > > First of all, just to clarify your versioning should be (4.5.0 -> 2.5.0) > > and (4.3.1 -> 2.3.1) etc, rig

Re: How to handle RebalanceInProgressException?

2020-04-27 Thread Guozhang Wang
Hello Ben, First of all, just to clarify your versioning should be (4.5.0 -> 2.5.0) and (4.3.1 -> 2.3.1) etc, right? Currently Apache Kafka's latest release is 2.5.0. The RebalanceInProgressException is only introduced in the most recent releases (2.5.0+), previously it is only used internally

Kafka Meetup hosted by Confluent Online, Tuesday 4:00pm, April 21st, 2020

2020-04-17 Thread Guozhang Wang
Hello folks, The Bay Area Kafka meetup will continue to be hosted online this month, next Tuesday (April 21st) 4:00pm. We will be presenting the current on going work for a Kafkaesque Raft Protocol:

Re: New CoGroup, how to do a left join

2020-04-17 Thread Guozhang Wang
I think it is an appropriate to support multi-way joins of KTables (including outer, left), but it is not necessarily an extension of the "co-group" syntax. We can start this discussion as a separate KIP. Guozhang

Re: Cannot terminate query on KSQL

2020-04-17 Thread Guozhang Wang
Hi Federico, I'd think you could re-post this question here: https://github.com/confluentinc/ksql/issues. Cheers, Guozhang On Fri, Apr 17, 2020 at 8:57 AM Federico D'Ambrosio wrote: > So, I tried with another query which hangs and I get this, after its > corresponding TERMINATE command: > >

Re: New CoGroup, how to do a left join

2020-04-16 Thread Guozhang Wang
Hello Murilo, Thanks for your interests in KIP-150. As we discussed in the KIP, the scope of this co-group is for stream co-aggregation. For your case, the first joining table is not from the aggregation but is a source table itself, in this case it cannot be included in the co-group of KIP-150.

Re: logical delete from a log compacted topic; tombstone message? with kafka-streams

2020-04-16 Thread Guozhang Wang
Hello Nicolae, If your output topic is configured as log compacted, then sending a record with null-bytes effectively serves as a tombstone. Note that you'd need to make sure in your sink node's serializer that the serialized bytes are null when the unserialized object typed value indicates to be

Re: CVEs for the dependency software guava and rocksdbjni of Kafka

2020-04-14 Thread Guozhang Wang
Thanks for the reported issue. For guava I think we should just upgrade version to 24.1.1 or newer to resolve 10237. For rocksdbjni, I saw that at the moment even current master is still using bzip version 1.0.6 so 3189 and 12900 would be existed in newest rocksDB version. I'd suggest you post

Re: EOL

2020-04-07 Thread Guozhang Wang
Hello Tim, The following link should be the source of truth for EOL policy: https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan#TimeBasedReleasePlan-WhatIsOurEOLPolicy? Guozhang On Tue, Apr 7, 2020 at 5:54 AM Gaudet, Timothy wrote: > Hello, > > I am trying to find the

Re: [VOTE] 2.5.0 RC2

2020-03-21 Thread Guozhang Wang
I think it is a regression, since it was introduced in KAFKA-9437: KIP-559, which is in 2.5.0, so it means for the same scenario described in Boyang's PR, 2.5 would fail while 2.4 would not. Guozhang On Fri, Mar 20, 2020 at 6:59 PM Ismael Juma wrote: > Hi Boyang, > > Is this a regression? > >

Re: Kafka Streams - partition assignment for the input topic

2020-03-19 Thread Guozhang Wang
Hi Stephen, We've deprecated the partition-grouper API due to its drawbacks in upgrading compatibility (consider if you want to change the num.partitions while evolving your application), and instead we're working on KIP-221 for the same purpose of your use case:

Re: KafkaStreams GroupBy with new key. Can I skip repartition?

2020-02-29 Thread Guozhang Wang
wrote: > Guozhang > The ticket definitely describes what I’m trying to achieve. > And should I be hopeful with the fact it’s in progress? :) > Thanks for pointing that out. > Murilo > > On Fri, Feb 28, 2020 at 2:57 PM Guozhang Wang wrote: > > > Hi Murilo, > >

Re: KafkaStreams GroupBy with new key. Can I skip repartition?

2020-02-28 Thread Guozhang Wang
Hi Murilo, Would this be helping your case? https://issues.apache.org/jira/browse/KAFKA-4835 Guozhang On Fri, Feb 28, 2020 at 7:01 AM Murilo Tavares wrote: > Hi > I am currently doing a simple KTable groupby().aggregate() in KafkaStreams. > In the groupBy I do need to select a new key, but I

Re: [ANNOUNCE] New committer: Konstantine Karantasis

2020-02-26 Thread Guozhang Wang
Congrats Konstantine! Guozhang On Wed, Feb 26, 2020 at 3:09 PM John Roesler wrote: > Congrats, Konstantine! Awesome news. > -John > > On Wed, Feb 26, 2020, at 16:39, Bill Bejeck wrote: > > Congratulations Konstantine! Well deserved. > > > > -Bill > > > > On Wed, Feb 26, 2020 at 5:37 PM Jason

Re: KStream.groupByKey().aggregate() is always emitting record, even if nothing changed

2020-02-25 Thread Guozhang Wang
.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17044602 > > > . > > > > > > With the current behavior, the commands should be: > > > > > > `stream.transformValues(...).filter((k,v) -> return v != > > > null).groupByK

Re: KStream.groupByKey().aggregate() is always emitting record, even if nothing changed

2020-02-24 Thread Guozhang Wang
Hello Adam, It seems your intention is to not "avoid emitting if the new aggregation result is the same as the old aggregation" but to "avoid processing the aggregation at all if it state is already some certain value", right? In this case I think you can try sth. like this:

Re: Can we use transform to exchange data between two streams

2020-02-23 Thread Guozhang Wang
gt; KStream otherStream2, > ... > (va, vb, vc, ...) -> ... > ) > This way we can join multiple streams and whose resultant stream if of type > (K, VA+VB+VC...) > > Let me know if something like this can be useful. > > Thanks > Sachin > &g

Re: Can we use transform to exchange data between two streams

2020-02-22 Thread Guozhang Wang
ms and order may not hold, then > what may be a good way to ensure streamB records are enriched with the > joined data of streamAA and streamAB. > > Thanks > Sachin > > > > On Sat, Feb 22, 2020 at 2:33 AM Guozhang Wang wrote: > > > From the description it seems there

Re: Query on use of Glassfish application server

2020-02-22 Thread Guozhang Wang
Hello, As far as I know Kafka connect's Rest server depends on Glassfish for web-app configuration; the Kafka brokers do not. Guozhang On Sat, Feb 22, 2020 at 5:29 AM ashish sood wrote: > Hello, > > This is more of a query rather than an issue. I am working with java based > application &

Re: Can we use transform to exchange data between two streams

2020-02-21 Thread Guozhang Wang
that since streamAA and streamAB types of records are before > streamB type of record in the input topic the join will also happen before. > If this assumption is not safe then is there any other way of ensuring. > > For now lets assume there is single partition of the input topic. > > Thanks

Re: wanted to understand WindowStore#fetchAll(timeFrom, timeTo)

2020-02-20 Thread Guozhang Wang
Hi Sachin, The javadoc has a good explanation that you can refer to: https://kafka.apache.org/24/javadoc/index.html?org/apache/kafka/streams/state/ReadOnlyWindowStore.html As in our example, both these two would be returned. Guozhang On Tue, Feb 18, 2020 at 6:56 AM Sachin Mittal wrote: >

Re: Can we use transform to exchange data between two streams

2020-02-20 Thread Guozhang Wang
Hello Sachin, 1) It seems from your source code, that in the stream2.transform you are generating a new value and return a new key-value pair: mutate value = enrich(value, result) return new KeyValue(key, value); --- Anyways, if you do not want to generate a new value object, and

Kafka Meetup hosted by Confluent at Mountain View, Thursday 5:30pm, Feb 20th, 2020

2020-02-18 Thread Guozhang Wang
Hello folks, This is a kind reminder of the Bay Area Kafka® meetup this Thursday (Feb. 20th) 5:30pm, at Confluent's Mountain View HQ office. *RSVP and Register* (if you intend to attend in person): https://www.meetup.com/KafkaBayArea/events/268427599/ *Date* 5:30pm, Thursday, February 20th,

Re: [HELP]

2020-02-12 Thread Guozhang Wang
I've added francis lee (francislee) to the contribution list. On Wed, Feb 12, 2020 at 1:41 AM 萨尔卡 <1026203...@qq.com> wrote: > hiGuozhang Wang, > info in cwiki: > > > full name: francis lee > mail: francislee...@outlook.com > > > Have a nice dayFrancis Lee > > > QQ : 1026203200 > > > -- --

Re: [HELP]

2020-02-09 Thread Guozhang Wang
Hi Francis, What's your apache id? Guozhang On Sun, Feb 9, 2020 at 5:50 AM 萨尔卡 <1026203...@qq.com> wrote: > hi , > i want to make a KIP for kafka-exporter by java. how can i do it ? > > > > > Have a nice dayFrancis Lee > > > QQ : 1026203200 > > > -- -- Guozhang

Re: Mistake in official documentation ?

2020-02-06 Thread Guozhang Wang
Hello Fares, I remember there's a JIRA ticket for this part, is that what you saw as well? https://issues.apache.org/jira/browse/KAFKA-9304 Guozhang On Thu, Feb 6, 2020 at 8:25 AM Fares Oueslati wrote: > Thanks for your quick feedback. > > Here is a link to the hosted screenshot

Re: Missing internal topic (partitions) after kafka-streams client upgrade (2.3.1 -> 2.4.0)

2019-12-28 Thread Guozhang Wang
a:340) > >> at > >> > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:471) > >> at > >> > org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1267) >

Re: Please help me with link to documentation ... (generics I think)

2019-12-28 Thread Guozhang Wang
Hello Aurel, Maybe this helps: https://kafka.apache.org/24/documentation/streams/developer-guide/dsl-api.html Guozhang On Fri, Dec 27, 2019 at 8:50 AM Aurel Sandu wrote: > Hi all of you, > > I am reading the following code : > .. > KTable wordCounts = textLines >

Re: Missing internal topic (partitions) after kafka-streams client upgrade (2.3.1 -> 2.4.0)

2019-12-23 Thread Guozhang Wang
Guozhang Wang wrote: > Hello Nitay, > > Could you share the topology description on both 2.4 and 2.3.1, and also > could you elaborate on the feature flag you turned on / off? > > > Guozhang > > On Mon, Dec 23, 2019 at 9:32 AM Nitay Kufert wrote: > >> Hel

Re: Missing internal topic (partitions) after kafka-streams client upgrade (2.3.1 -> 2.4.0)

2019-12-23 Thread Guozhang Wang
Hello Nitay, Could you share the topology description on both 2.4 and 2.3.1, and also could you elaborate on the feature flag you turned on / off? Guozhang On Mon, Dec 23, 2019 at 9:32 AM Nitay Kufert wrote: > Hello, > > So as the title says, I am trying to upgrade our streams client to the

Re: KafkaProducer - Oversized batches

2019-12-18 Thread Guozhang Wang
hanks, > > Tomoyuki > > > > On Mon, Dec 9, 2019 at 1:52 PM Tomoyuki Saito > wrote: > > > > > > Hi Guozhang, > > > > > > Thank you for your suggestion! > > > I'll create a JIRA within a few days and consider submitting a PR. > > >

Re: Missing link in online document

2019-12-16 Thread Guozhang Wang
ou are launching a Kafka Streams instance > of your application. You can run multiple instances of your application. > >A common scenario is that there are multiple instances of your application > running in parallel. For more information, see Parallelism Model. > > On Mon, Dec 16, 20

Re: Missing link in online document

2019-12-15 Thread Guozhang Wang
Hello Yu, Could you point to me which page has the reference to this link? Guozhang On Sun, Dec 15, 2019 at 2:24 AM Yu Watanabe wrote: > Hello. > > I was walking through kafka streams document and below link seems to be > invalid. It returns below page. > > > Not Found > > The requested URL

Re: Static Membership AND Invalid IP addresses forConsumers

2019-12-08 Thread Guozhang Wang
Hello David, The host information of the group membership is inferred from the socket channel's ip address, I'm not certain how EC2 spot instance's application.server config is used, but if it is not dynamically reflected on the socket channel's inet address then it would be the case you

Re: KafkaProducer - Oversized batches

2019-12-08 Thread Guozhang Wang
Hello Tomoyuki, It seems that issue in 6494 is indeed valid, and I'd personally suggest we do option 3) to fix the flush() behavior. Please feel free to create a JIRA (and also submit your PR if you are interested in contributing :). Guozhang On Sat, Dec 7, 2019 at 7:59 AM Tomoyuki Saito

Re: How are custom keys compared during joins

2019-12-08 Thread Guozhang Wang
Hello Sachin, Thanks for the detailed description. Your find is right that for stream-table join the table-side updates would not trigger a join since stream records are not "materialized" or buffered during the processing. The community has requested similar semantics to improve as table-table

Re: Kafka consumer group keeps moving to PreparingRebalance and stops consuming

2019-12-08 Thread Guozhang Wang
Hello Avshalom, I think the first question to answer is where are the new consumers coming from. From your description they seem to be not expected (i.e. you did not intentionally start up new instances), so looking at those VMs that suddenly start new consumers would be my first shot. Guozhang

Re: Case of joining multiple streams/tables

2019-12-06 Thread Guozhang Wang
Hi Sachin, As Patrik mentioned, KIP-150 is being actively worked on and is likely to be included in the next release. Guozhang On Fri, Dec 6, 2019 at 12:09 AM Patrik Kleindl wrote: > Hi > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup > might > be worth a

Kafka Meetup hosted by Confluent at Mountain View, Thursday 5:30pm, December 5th, 2019

2019-12-03 Thread Guozhang Wang
the starting time is *earlier at 5:30*. *Agenda* 5:30pm - 6:00pm: Networking, Pizza and drinks! 6:00pm - 6:40pm: The Silver Bullet for Endless Rebalances (Guozhang Wang and Sophie Blee-Goldman, Confluent) 6:40pm - 7:20pm: Kafka as a service at Dropbox (Peng Kang and Richi Gupta, Dropbox) 7:20pm

[ANNOUNCE] New committer: John Roesler

2019-11-12 Thread Guozhang Wang
Hi Everyone, The PMC of Apache Kafka is pleased to announce a new Kafka committer, John Roesler. John has been contributing to Apache Kafka since early 2018. His main contributions are primarily around Kafka Streams, but have also included improving our test coverage beyond Streams as well.

Re: Kafka Streams - StateRestoreListener called when new partitions assigned

2019-11-11 Thread Guozhang Wang
Hello Javier, When a rebalance happened and the new tasks (hence input partitions) are assigned that need to be restored, the state of the instance would also transit to REBALANCING, and would only be transit back to RUNNING after all tasks have been completed restoring and all are being

Re: [ANNOUNCE] New committer: Mickael Maison

2019-11-07 Thread Guozhang Wang
Congrats Mickael! Guozhang On Thu, Nov 7, 2019 at 1:53 PM Bill Bejeck wrote: > Congratulations Mickael! Well deserved! > > -Bill > > On Thu, Nov 7, 2019 at 4:38 PM Jun Rao wrote: > > > Hi, Everyone, > > > > The PMC of Apache Kafka is pleased to announce a new Kafka committer > > Mickael > >

Re: [VOTE] 2.3.1 RC2

2019-10-22 Thread Guozhang Wang
+1. I've ran the quick start and unit tests. Guozhang On Tue, Oct 22, 2019 at 12:57 PM David Arthur wrote: > Thanks, Jonathon and Jason. I've updated the release notes along with the > signature and checksums. KAFKA-9053 was also missing. > > On Tue, Oct 22, 2019 at 3:47 PM Jason Gustafson >

Re: Questions about Producer per Task/Partition in Streams EoS Impl.

2019-10-17 Thread Guozhang Wang
Hello Sean, Yes atm we have one producer per task when EOS is turned on compared to one producer per thread without EOS. There's an ongoing KIP-447 aiming to bring this back to one producer per thread with EOS as well which involves broker-side changes. To answer your question: 1. Assuming each

Kafka Meetup hosted by Confluent at San Francisco, Tuesday 6:30pm, October 1st, 2019

2019-09-25 Thread Guozhang Wang
Hello folks, This is a kind reminder of the Bay Area Kafka® meetup next Tuesday 6:30pm, at Confluent's San Francisco office. *RSVP and Register* (if you intend to attend in person): https://www.meetup.com/KafkaBayArea/events/264562779/ *Date* 6:30pm, Tuesday, October 1st, 2019 *Location*

Re: please add me to Kafka user list

2019-09-10 Thread Guozhang Wang
Hello Ning, I've added you to the contributors list. Cheers, Guozhang On Tue, Sep 10, 2019 at 10:27 AM Ning Liu wrote: > Sorry, I forget to add my profile here > > Account signup > > You have signed up for a Jira account at: > https://issues.apache.org/jira > If you forget your password, you

Re: Kafka Streams and broker compatibility

2019-08-26 Thread Guozhang Wang
ogy works fine on 0.10.2.1 with kafka-streams 2.2.0, > > but fails with the error above if I use 2.2.1. > > > > I haven't changed any part of the code, simply updated my gradle file > > updating the dependency. > > > > Thanks again > > > > On Tue,

Re: Kafka Streams and broker compatibility

2019-08-26 Thread Guozhang Wang
Hello Alisson, The root cause you've seen is the message header support, which is added in brokers as in 0.11.0 (KIP-82) and in streams client as in 2.0 (KIP-244). If your code does not add any more headers then it would only inherit the headers from source topics when trying to write to

Re: Partition assignment in kafka streams

2019-08-08 Thread Guozhang Wang
t; > trying to spawn up multiple worker and wire up the instance with some > > static data which will be used in the per message business logic. > > > > Thanks > > > > On Thu, Aug 1, 2019 at 9:51 AM Guozhang Wang wrote: > > > >> Hello Navneeth, > >

Re: Partition assignment in kafka streams

2019-08-01 Thread Guozhang Wang
Hello Navneeth, I may be misunderstanding your intent from the previous emails here, so just a quick summary: 1) if you just want to "know" which partitions are assigned to which instance, this can be retrieved in multiple ways (e.g. the one mentioned by Matthias, and also one can get this info

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-25 Thread Guozhang Wang
browse/KAFKA-8715. I don't see any > workarounds for this, so hopefully this can get resolved sooner rather > than later. > > Regards, > Raman > > > > On Mon, Jul 22, 2019 at 9:25 PM Guozhang Wang wrote: > > > > Hello Raman, since you are using Cons

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-07-23 Thread Guozhang Wang
ging rate to the rate at which records > are produced? > > Best, > > Pieter > > -Oorspronkelijk bericht- > Van: Guozhang Wang > Verzonden: Monday, 17 June 2019 22:14 > Aan: users@kafka.apache.org > Onderwerp: Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka

Re: Truncation

2019-07-22 Thread Guozhang Wang
Hi Jamie, The most relevant materials I can think of would be in KIP-101: https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Epoch+rather+than+High+Watermark+for+Truncation Although it is a bit out-dated it still contains most significant design

Re: Rebalancing algorithm is extremely suboptimal for long processing

2019-07-22 Thread Guozhang Wang
Hello Raman, since you are using Consumer and you are concerning about the member-failure triggered rebalance, I think KIP-429 is most relevant to your scenario. As Matthias mentioned we are working on getting it in to the next release 2.4. Guozhang On Sat, Jul 20, 2019 at 6:36 PM Matthias J.

Re: Questions about KIP-415: Incremental Cooperative Rebalancing in Kafka Connect

2019-07-22 Thread Guozhang Wang
Hello Shurong, What you bumped into seems to be the same issue as is reported tracked by Luying Liu here: https://issues.apache.org/jira/browse/KAFKA-8676. Guozhang On Thu, Jun 27, 2019 at 10:48 AM srlin wrote: > Hi, team, > > We are using Kafka Connect at present and have encountered a

Re: Print RocksDb Stats

2019-07-18 Thread Guozhang Wang
Hello Muhammed, The community is working on KIP-444 that expose rocksDB metrics. There's an on-going PR that you may find helpful for your own implementation: https://github.com/apache/kafka/pull/6884 Guozhang On Wed, Jul 17, 2019 at 6:26 AM Muhammed Ashik wrote: > Hi I'm trying to log the

Bay area Kafka Meetup hosted by Confluent at Palo Alto, Tuesday 6:30pm, July 16, 2019

2019-07-15 Thread Guozhang Wang
Hello folks, This is just a kind reminder of the Bay Area Kafka® meetup tomorrow, Tuesday 6:30pm, at Confluent's Palo Alto HQ. *RSVP and Register* (if you intend to attend in person): https://www.meetup.com/KafkaBayArea/events/262715611/ please read instructions within the link to register for

Re: [ANNOUNCE] Apache Kafka 2.3.0

2019-06-25 Thread Guozhang Wang
en, Colin Hicks, Colin Patrick > McCabe, commandini, cwildman, Cyrus Vafadari, Dan Norwood, David Arthur, > Dejan Stojadinović, Dhruvil Shah, Doroszlai, Attila, Ewen Cheslack-Postava, > Fangbin Sun, Filipe Agapito, Florian Hussonnois, Gardner Vickers, Guozhang > Wang, Gwen Shapira, Hai

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-17 Thread Guozhang Wang
ry retries that follow from it. If there will be a fix for > this eventually that would make us very happy :-) > > Thanks for your efforts so far, and let me know if I can do anything to > assist. We could continue this work via Slack if you prefer. > > Best, > > Pieter > > -Oors

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-12 Thread Guozhang Wang
a volume large enough to > > produce a record on each partition between every offset commit. > > > > Our only options seem to be to move from exactly once to at least > > once, or to wait for > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-360%3A+Im

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-10 Thread Guozhang Wang
min client based on committed input topic offsets. This would explain > producer id's not being known if the logs get cleaned up almost instantly > via this process. > > Best, > > Pieter > > -Oorspronkelijk bericht- > Van: Guozhang Wang > Verzonden: Thurs

Re: Streams reprocessing whole topic when deployed but not locally

2019-06-06 Thread Guozhang Wang
ymore. > > Regards > > -- > Alessandro Tagliapietra > > > On Thu, Jun 6, 2019 at 10:50 AM Guozhang Wang wrote: > > > That's right, but local state is used as a "materialized view" of your > > changelog topics: if you have nothing locally, then it has to boot

Re: Streams reprocessing whole topic when deployed but not locally

2019-06-06 Thread Guozhang Wang
> > Thank you > > -- > Alessandro Tagliapietra > > > On Thu, Jun 6, 2019 at 10:13 AM Guozhang Wang wrote: > > > If you deploy your streams app into a docker container, you'd need to > make > > sure local state directories are preserved, since otherwise when

Re: Streams reprocessing whole topic when deployed but not locally

2019-06-06 Thread Guozhang Wang
I'm trying to upgrade to kafka 2.2.1 to see if I get any > improvement. > > -- > Alessandro Tagliapietra > > > On Wed, Jun 5, 2019 at 4:45 PM Guozhang Wang wrote: > > > Hello Alessandro, > > > > What did you do for `restarting the app online`? I'm no

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-06 Thread Guozhang Wang
build up the set of > known producers from the log. If a message was produced on a partition 5 > seconds before by the same producer (confirmed via kafkacat), how can it be > the broker throws an UNKNOWN_PRODUCER_ID exception? > > Best, > > Pieter > > -Oorspronk

Re: Streams reprocessing whole topic when deployed but not locally

2019-06-05 Thread Guozhang Wang
Hello Alessandro, What did you do for `restarting the app online`? I'm not sure I follow the difference between "restart the streams app" and "restart the app online" from your description. Guozhang On Wed, Jun 5, 2019 at 10:42 AM Alessandro Tagliapietra < tagliapietra.alessan...@gmail.com>

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-05 Thread Guozhang Wang
Especially > because the cause seems different than the 'low traffic' cause in JIRA > issue KAFKA-7190 as the partitions for which errors are thrown are > receiving data. > > Best, > > Pieter > > -Oorspronkelijk bericht- > Van: Guozhang Wang > Verzonden: Wedn

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-04 Thread Guozhang Wang
Hello Pieter, If you only have one record every few seconds that may be too small given you have at least 25 partitions (as I saw you have a xxx--repartition-24 partition), which means that for a single partition, it may not see any records for a long time, and in this case you may need to

Re: What happens in Rebalancing state ?

2019-06-04 Thread Guozhang Wang
Hello Mohan, That is right, note though that it is transiting to RUNNING after all assigned tasks to its threads has completed restoration and starting running. This means even under REBALANCING state some tasks that finished restoration early may still be executed as well. Guozhang On Tue,

Re: How to prevent data loss in "read-process-write" application?

2019-06-03 Thread Guozhang Wang
Hello, Transactional messaging is actually designed to solve this scenario exactly (pun intended :). Although in your app you may not have stateful logic, it is still necessary to enable transactional messaging if you are using consumer/producer, or just enable EOS if you are using streams.

Re: Transaction support multiple producer instance

2019-05-31 Thread Guozhang Wang
Hello Wenxuan, One KIP that we are considering so far is KIP-447: https://cwiki.apache.org/confluence/display/KAFKA/KIP-447%3A+Producer+scalability+for+exactly+once+semantics It does not directly address your scenarios, but I'm wondering if you can adjust your code to group the producers if they

Re: Stream application :: State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED continuously

2019-05-24 Thread Guozhang Wang
memory buffer in mind first and not > > spilling to disk. I have read the KIP document, > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables > > , > > but this doesn't mention any specifics as to why the in-memo

Re: Streams configuration for a stream with long varied processing times

2019-05-24 Thread Guozhang Wang
Hi Raman, Since you are using `transformation` already which is a lower-level API in DSL, you can basically do arbitrary logic like threading pool to process the records within your `process()` or `transform()` function. Note that, like consumer docs mentioned `Typically, you must disable

Re: Stream application :: State transition from PARTITIONS_REVOKED to PARTITIONS_ASSIGNED continuously

2019-05-20 Thread Guozhang Wang
Hello Nayanjyoti, Did you find anything else from the streams log entries (is it enabled on DEBUG or TRACE?), and what version of Kafka are you using? Guozhang On Sun, May 19, 2019 at 1:04 PM Nayanjyoti Deka wrote: > Forgot to add that there is no transition to RUNNING state. > > On Mon, May

Re: LogCleaner is not removing Transaction Records

2019-05-10 Thread Guozhang Wang
For who interested in this thread, there's a ticket created for it and we believe it is a lurking bug and are trying to fix it before the 2.3 release: https://issues.apache.org/jira/browse/KAFKA-8335 Guozhang On Fri, May 10, 2019 at 10:39 AM Michael Jaschob wrote: > Weichu, > > while I don't

Re: Kafka transaction between 2 kafka clusters

2019-05-09 Thread Guozhang Wang
Hello Emmanuel, Yes I think it is do-able technically. Note that it means the offsets of cluster A would be stored on cluster B and hence upon restarting one need to talk to cluster B in order to get the committed position in cluster A. Guozhang On Thu, May 9, 2019 at 11:58 AM Emmanuel

Re: InvalidStateStoreException on KStream join using GlobalKtables

2019-05-07 Thread Guozhang Wang
Hello Ishita, Is it consistently reproducing? And which Kafka version are you using? Guozhang On Thu, May 2, 2019 at 5:24 PM Ishita Rakshit wrote: > Hi, > I have a Kafka Streams application where I am joining a KStream that reads > from "topic1" with a GlobalKTable that reads from "topic2"

[ANNOUNCE] New Kafka PMC member: Matthias J. Sax

2019-04-18 Thread Guozhang Wang
Hello Everyone, I'm glad to announce that Matthias J. Sax is now a member of Kafka PMC. Matthias has been a committer since Jan. 2018, and since then he continued to be active in the community and made significant contributions the project. Congratulations to Matthias! -- Guozhang

Re: [ANNOUNCE] New Kafka PMC member: Sriharsh Chintalapan

2019-04-18 Thread Guozhang Wang
Congrats Harsh! Guozhang On Thu, Apr 18, 2019 at 11:46 AM Jun Rao wrote: > Hi, Everyone, > > Sriharsh Chintalapan has been active in the Kafka community since he became > a Kafka committer in 2015. I am glad to announce that Harsh is now a member > of Kafka PMC. > > Congratulations, Harsh! >

Bay area Kafka Meetup hosted by Confluent at Palo Alto, Monday 6pm, April 29, 2019

2019-04-17 Thread Guozhang Wang
Hello folks, The Confluent engineering team would like to invite you all to attend the first-ever Confluent hosted Bay Area Kafka® meetup on Monday 6pm, at Confluent's Palo Alto HQ. *RSVP and Register* (if you intend to attend in person): https://www.meetup.com/KafkaBayArea/events/260657485/

Re: Race condition with stream use of Global KTable

2019-04-14 Thread Guozhang Wang
f a lot of hoops to jump through, and in any case, > even if I did this, I'm still unclear on how this actually solves the > race condition (question #3 in my previous message)? > > Regards, > Raman > > On Fri, Apr 12, 2019 at 1:57 PM Guozhang Wang wrote: > > > >

Re: Race condition with stream use of Global KTable

2019-04-12 Thread Guozhang Wang
lve the race condition? > > Regards, > Raman > > > > On Fri, Apr 5, 2019 at 12:00 PM Guozhang Wang wrote: > > > > I see. > > > > So back to your original question, yes there will be a race condition > since > > the global ktable is updated with a separ

Re: Race condition with stream use of Global KTable

2019-04-05 Thread Guozhang Wang
in topic-1 though, using some other information in the > payload of topic-2. > > Regards, > Raman > > On Thu, Apr 4, 2019 at 12:57 AM Guozhang Wang wrote: > > > > Hi Raman, > > > > What I'm not clear is that since topic-2 is a transformed topic of > top

Re: Race condition with stream use of Global KTable

2019-04-03 Thread Guozhang Wang
e keys. The > global-ktable allows me to easily look up the values I need from > topic-1 using an attribute from the payload of topic-2, and combine > those to write to topic-3. > > Regards, > Raman > > On Tue, Apr 2, 2019 at 6:56 PM Guozhang Wang wrote: > > > > H

Re: Race condition with stream use of Global KTable

2019-04-02 Thread Guozhang Wang
Hello Raman, It seems from your case that `topic-1` is used for both the global ktable as well as another stream, which then be transformed to a new stream that will be "joined" somehow with the global ktable. Could you elaborate your case a bit more on why do you want to use the same source

Re: Question on Kafka Streams

2019-04-01 Thread Guozhang Wang
Hello Mark, That's a very general question, and the answer depends on your hardware, your computational logic etc. Could you elaborate a bit more on your use case? Guozhang On Mon, Apr 1, 2019 at 11:16 AM Mark Fursht wrote: > Hello, > > I would like to know, how many opened windows Kafka

Re: Kafka Streams upgrade.from config while upgrading to 2.1.0

2019-03-27 Thread Guozhang Wang
Hello Anirudh, The config `upgrade.from` is recommended for safe and smooth upgrade. In your case it is possible that when rolling bounce the instances the first upgraded instance happen to be the leader of the group and hence even without the config it can recognize other instances; but if you

Re: KafkaStreams backoff for non-existing topic

2019-03-25 Thread Guozhang Wang
ams? > The Topology should know all topics and the existence of the topics could > be verified with the AdminClient. > This would allow to fail fast similar to when the state directory is not > available. > Or am I missing something? > best regards > Patrik > > On M

Re: KafkaStreams backoff for non-existing topic

2019-03-25 Thread Guozhang Wang
Hello Murilo, Just to give some more background to John's message and KAFKA-7970 here. The main reason of trickiness is around the scenario of "topics being partially available", e.g. say your application is joining to topics A and B, while topicA exists but topicB does not (it is surprisingly

Re: Kafka Streams Disk Usage on upgrade to 2.1.0

2019-03-01 Thread Guozhang Wang
Hello Adrian, What you described did sounds wired to me. I'm not aware of any regressions on rocksDB disk usage from 1.1 to 2.1. Could you file a JIRA ticket with more details like state dir snapshots, your code snippet and configs etc so we can find a way to reproduce it? Guozhang On Fri,

<    1   2   3   4   5   6   7   8   9   10   >