Re: Kafka capabilities
Dear Kafka experts, could anyone who has this material share the details, please? On Wed, Apr 3, 2024 at 3:42 PM Kafka Life wrote: > Hi Kafka users > > Does anyone have a document or PPT that showcases the capabilities of > Kafka along with any cost-management capability? > I have a customer who is still using IBM MQ and RabbitMQ. I want the > client to consider Kafka for messaging and data streaming. I wanted to seek > your expert help: if you have any document or PPT, I can propose it as an > example. Could you please help? > > Thanks and regards > KrisG >
Kafka capabilities
Hi Kafka users, does anyone have a document or PPT that showcases the capabilities of Kafka along with any cost-management capability? I have a customer who is still using IBM MQ and RabbitMQ. I want the client to consider Kafka for messaging and data streaming. I wanted to seek your expert help: if you have any document or PPT, I can propose it as an example. Could you please help? Thanks and regards, KrisG
RE: Re: Kafka trunk test & build stability
I wonder if we've considered adding a Gradle task timeout [0] on unitTest and integrationTest tasks. The timeout applies separately for each subproject and marks the currently running test as SKIPPED on timeout. This helped me find a test which stalls builds [1]. [0] https://docs.gradle.org/8.5/userguide/more_about_tasks.html#sec:task_timeouts [1] https://issues.apache.org/jira/browse/KAFKA-16219 Best, Gaurav On 2024/01/25 21:49:00 Justine Olshan wrote: > It looks like there was some server maintenance that shut down Jenkins. > Upon coming back up, the builds were expired but unable to stop. > > They all had similar logs: > > Cancelling nested steps due to timeout > Cancelling nested steps due to timeout > Body did not finish within grace period; terminating with extreme prejudice > Body did not finish within grace period; terminating with extreme prejudice > Pausing (Preparing for shutdown) > Resuming build at Thu Jan 25 06:56:23 UTC 2024 after Jenkins restart > Resuming build at Thu Jan 25 09:45:03 UTC 2024 after Jenkins restart > Pausing (Preparing for shutdown) > Resuming build at Thu Jan 25 10:37:41 UTC 2024 after Jenkins restart > Timeout expired 7 hr 39 min ago > Timeout expired 7 hr 39 min ago > Cancelling nested steps due to timeout > Cancelling nested steps due to timeout > *02:37:41* Waiting for reconnection of builds41 before proceeding with build > *02:37:41* Waiting for reconnection of builds32 before proceeding with build > Still paused > Body did not finish within grace period; terminating with extreme prejudice > Body did not finish within grace period; terminating with extreme prejudice > > I forcibly killed the builds running over one day to free resources. I > believe the rest are running as expected now. > > Justine > > On Thu, Jan 25, 2024 at 10:22 AM Justine Olshan > wrote: > > > Hey folks -- I noticed some builds have been running for a day or more. I > > thought we limited builds to 8 hours. Any ideas why this is happening? > > > > > > https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/activity/ > > I tried to abort the build for PR-15257, and it also still seems to be > > running. > > > > Justine > > > > On Sun, Jan 14, 2024 at 6:25 AM Qichao Chu > > wrote: > > > >> Hi Divij and all, > >> > >> Regarding the speeding up of the build & de-flaking tests, LinkedIn has > >> done some great work which we probably can borrow ideas from. > >> In the LinkedIn/Kafka repo, we can see one of their most recent PRs > >> <https://github.com/linkedin/kafka/pull/500/checks> only took < 9 min (unit > >> tests) + < 12 min (integration tests) + < 9 min (code checks) = < 30 min to > >> finish all the checks: > >> > >> 1. Similar to what David (mumrah) has mentioned/experimented with, the > >> LinkedIn team used GitHub Actions, which displayed the results in a cleaner > >> way directly from GitHub. > >> 2. Each top-level package is checked separately to increase the > >> concurrency. To further boost the speed for integration tests, the tests > >> inside one package are divided into sub-groups (A-Z) based on their > >> names (see this job > >> <https://github.com/linkedin/kafka/actions/runs/7303478151/> for > >> details). > >> 3. Once the tests are running at a smaller granularity with a decent > >> runner, heavy integration tests are less likely to be flaky, and flaky > >> tests are easier to catch.
> >> > >> -- > >> Qichao > >> > >> > >> On Wed, Jan 10, 2024 at 2:57 PM Divij Vaidya > >> wrote: > >> > >> > Hey folks > >> > > >> > We seem to have a handle on the OOM issues with the multiple fixes > >> > community members made. In > >> > https://issues.apache.org/jira/browse/KAFKA-16052, > >> > you can see the "before" profile in the description and the "after" > >> profile > >> > in the latest comment to see the difference. To prevent future > >> recurrence, > >> > we have an ongoing solution at > >> https://github.com/apache/kafka/pull/15101 > >> > and after that we will start another one to get rid of mockito mocks at > >> > the end of every test suite using a similar extension. Note that this > >> > doesn't solve the flaky test problems in the trunk but it removes the > >> > aspect of build failures due to OOM (one of the many problems).
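A minimal sketch of what the Gradle task timeout mentioned at the top of this thread could look like, assuming the Groovy DSL of Kafka's build.gradle and the unitTest/integrationTest task names (the two-hour value is purely illustrative):

import java.time.Duration

subprojects {
    tasks.matching { it.name in ['unitTest', 'integrationTest'] }.configureEach {
        // Gradle's built-in task timeout (Gradle >= 5.0): the task is
        // interrupted and marked as failed once the timeout expires.
        timeout = Duration.ofHours(2)
    }
}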
Re: [VOTE] 3.7.0 RC2
Apologies, I listed KAFKA-16157 twice in my previous message. I intended to mention KAFKA-16195, with the PR at https://github.com/apache/kafka/pull/15262, as the second JIRA. Thanks, Gaurav > On 26 Jan 2024, at 15:34, ka...@gnarula.com wrote: > > Hi Stan, > > I wanted to share some updates about the bugs you shared earlier. > > - KAFKA-14616: I've reviewed and tested the PR from Colin and have observed > the fix works as intended. > - KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed > fix. I've > therefore raised https://github.com/apache/kafka/pull/15270 following a > discussion with Luke in JIRA. > - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting > feedback/reviews at https://github.com/apache/kafka/pull/15136 > > In addition to the above, there are 2 JIRAs I'd like to bring everyone's > attention to: > > - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. > I've raised > https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it. > - KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. > This should > hopefully get merged soon. > > Regards, > Gaurav > > >> On 24 Jan 2024, at 11:51, ka...@gnarula.com wrote: >> >> Hi Stanislav, >> >> Thanks for bringing these JIRAs/PRs up. >> >> I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I >> hope to have some feedback >> by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's >> away. I'll try to build on his work in the meantime. >> >> As for KAFKA-16082, we haven't been able to deduce a data loss scenario. >> There's a PR open >> by me for promoting an abandoned future replica with approvals from Omnia >> and Proven, >> so I'd appreciate a committer reviewing it. >> >> Regards, >> Gaurav >> >> On 23 Jan 2024, at 20:17, Stanislav Kozlovski >> wrote: >>> >>> Hey all, I figured I'd give an update about what known blockers we have >>> right now: >>> >>> - KAFKA-16101: KRaft migration rollback documentation is incorrect - >>> https://github.com/apache/kafka/pull/15193; This need not block RC >>> creation, but we need the docs updated so that people can test properly >>> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs - >>> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that >>> this is blocking JBOD for 3.7 >>> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 - >>> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232 >>> - although I understand Proveen is out of office >>> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am >>> hearing mixed opinions on whether this is a blocker ( >>> https://github.com/apache/kafka/pull/15136) >>> >>> Given that there are 3 JBOD blocker bugs, and I am not confident they will >>> all be merged this week - I am on the edge of voting to revert JBOD from >>> this release, or mark it early access. >>> >>> By all accounts, it seems that if we keep with JBOD the release will have >>> to spill into February, which is a month extra from the time-based release >>> plan we had of start of January. >>> >>> Can I ask others for an opinion? >>> >>> Best, >>> Stan >>> >>> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen wrote: >>> >>>> Hi all, >>>> >>>> I think I've found another blocker issue: KAFKA-16162 >>>> <https://issues.apache.org/jira/browse/KAFKA-16162> .
>>>> The impact is after upgrading to 3.7.0, any new created topics/partitions >>>> will be unavailable. >>>> I've put my findings in the JIRA. >>>> >>>> Thanks. >>>> Luke >>>> >>>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax wrote: >>>> >>>>> Stan, thanks for driving this all forward! Excellent job. >>>>> >>>>> About >>>>> >>>>>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141 >>>>>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139 >>>>> >>>>> For `StreamsUpgradeTest` it was a test setup issue and should be fixed >>>>> now in trunk and 3.7 (and actually also in 3.6...) >>>>> >>>>> For `Stream
Re: [VOTE] 3.7.0 RC2
Hi Stan, I wanted to share some updates about the bugs you shared earlier. - KAFKA-14616: I've reviewed and tested the PR from Colin and have observed the fix works as intended. - KAFKA-16162: I reviewed Proven's PR and found some gaps in the proposed fix. I've therefore raised https://github.com/apache/kafka/pull/15270 following a discussion with Luke in JIRA. - KAFKA-16082: I don't think this is marked as a blocker anymore. I'm awaiting feedback/reviews at https://github.com/apache/kafka/pull/15136 In addition to the above, there are 2 JIRAs I'd like to bring everyone's attention to: - KAFKA-16157: This is similar to KAFKA-14616 and is marked as a blocker. I've raised https://github.com/apache/kafka/pull/15263 and am awaiting reviews on it. - KAFKA-16157: I raised this yesterday and have addressed feedback from Luke. This should hopefully get merged soon. Regards, Gaurav > On 24 Jan 2024, at 11:51, ka...@gnarula.com wrote: > > Hi Stanislav, > > Thanks for bringing these JIRAs/PRs up. > > I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I > hope to have some feedback > by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's > away. I'll try to build on his work in the meantime. > > As for KAFKA-16082, we haven't been able to deduce a data loss scenario. > There's a PR open > by me for promoting an abandoned future replica with approvals from Omnia and > Proven, > so I'd appreciate a committer reviewing it. > > Regards, > Gaurav > > On 23 Jan 2024, at 20:17, Stanislav Kozlovski > wrote: >> >> Hey all, I figured I'd give an update about what known blockers we have >> right now: >> >> - KAFKA-16101: KRaft migration rollback documentation is incorrect - >> https://github.com/apache/kafka/pull/15193; This need not block RC >> creation, but we need the docs updated so that people can test properly >> - KAFKA-14616: Topic recreation with offline broker causes permanent URPs - >> https://github.com/apache/kafka/pull/15230 ; I am of the understanding that >> this is blocking JBOD for 3.7 >> - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 - >> a strict blocker with an open PR https://github.com/apache/kafka/pull/15232 >> - although I understand Proveen is out of office >> - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am >> hearing mixed opinions on whether this is a blocker ( >> https://github.com/apache/kafka/pull/15136) >> >> Given that there are 3 JBOD blocker bugs, and I am not confident they will >> all be merged this week - I am on the edge of voting to revert JBOD from >> this release, or mark it early access. >> >> By all accounts, it seems that if we keep with JBOD the release will have >> to spill into February, which is a month extra from the time-based release >> plan we had of start of January. >> >> Can I ask others for an opinion? >> >> Best, >> Stan >> >> On Thu, Jan 18, 2024 at 1:21 PM Luke Chen wrote: >> >>> Hi all, >>> >>> I think I've found another blocker issue: KAFKA-16162 >>> <https://issues.apache.org/jira/browse/KAFKA-16162> . >>> The impact is after upgrading to 3.7.0, any new created topics/partitions >>> will be unavailable. >>> I've put my findings in the JIRA. >>> >>> Thanks. >>> Luke >>> >>> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax wrote: >>> >>>> Stan, thanks for driving this all forward! Excellent job. 
>>>> >>>> About >>>> >>>>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141 >>>>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139 >>>> >>>> For `StreamsUpgradeTest` it was a test setup issue and should be fixed >>>> now in trunk and 3.7 (and actually also in 3.6...) >>>> >>>> For `StreamsStandbyTask` the failing test exposes a regression bug, so >>>> it's a blocker. I updated the ticket accordingly. We already have an >>>> open PR that reverts the code introducing the regression. >>>> >>>> >>>> -Matthias >>>> >>>> On 1/17/24 9:44 AM, Proven Provenzano wrote: >>>>> We have another blocking issue for the RC : >>>>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar >>>> to >>>>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue >>> however >>>>> can lead to the new topic having partitions th
Re: [VOTE] 3.7.0 RC2
Hi Stanislav, Thanks for bringing these JIRAs/PRs up. I'll be testing the open PRs for KAFKA-14616 and KAFKA-16162 this week and I hope to have some feedback by Friday. I gather the latter JIRA is marked as a WIP by Proven and he's away. I'll try to build on his work in the meantime. As for KAFKA-16082, we haven't been able to deduce a data loss scenario. There's a PR open by me for promoting an abandoned future replica with approvals from Omnia and Proven, so I'd appreciate a committer reviewing it. Regards, Gaurav On 23 Jan 2024, at 20:17, Stanislav Kozlovski wrote: > > Hey all, I figured I'd give an update about what known blockers we have > right now: > > - KAFKA-16101: KRaft migration rollback documentation is incorrect - > https://github.com/apache/kafka/pull/15193; This need not block RC > creation, but we need the docs updated so that people can test properly > - KAFKA-14616: Topic recreation with offline broker causes permanent URPs - > https://github.com/apache/kafka/pull/15230 ; I am of the understanding that > this is blocking JBOD for 3.7 > - KAFKA-16162: New created topics are unavailable after upgrading to 3.7 - > a strict blocker with an open PR https://github.com/apache/kafka/pull/15232 > - although I understand Proveen is out of office > - KAFKA-16082: JBOD: Possible dataloss when moving leader partition - I am > hearing mixed opinions on whether this is a blocker ( > https://github.com/apache/kafka/pull/15136) > > Given that there are 3 JBOD blocker bugs, and I am not confident they will > all be merged this week - I am on the edge of voting to revert JBOD from > this release, or mark it early access. > > By all accounts, it seems that if we keep with JBOD the release will have > to spill into February, which is a month extra from the time-based release > plan we had of start of January. > > Can I ask others for an opinion? > > Best, > Stan > > On Thu, Jan 18, 2024 at 1:21 PM Luke Chen wrote: > >> Hi all, >> >> I think I've found another blocker issue: KAFKA-16162 >> <https://issues.apache.org/jira/browse/KAFKA-16162> . >> The impact is after upgrading to 3.7.0, any new created topics/partitions >> will be unavailable. >> I've put my findings in the JIRA. >> >> Thanks. >> Luke >> >> On Thu, Jan 18, 2024 at 9:50 AM Matthias J. Sax wrote: >> >>> Stan, thanks for driving this all forward! Excellent job. >>> >>> About >>> >>>> StreamsStandbyTask - https://issues.apache.org/jira/browse/KAFKA-16141 >>>> StreamsUpgradeTest - https://issues.apache.org/jira/browse/KAFKA-16139 >>> >>> For `StreamsUpgradeTest` it was a test setup issue and should be fixed >>> now in trunk and 3.7 (and actually also in 3.6...) >>> >>> For `StreamsStandbyTask` the failing test exposes a regression bug, so >>> it's a blocker. I updated the ticket accordingly. We already have an >>> open PR that reverts the code introducing the regression. >>> >>> >>> -Matthias >>> >>> On 1/17/24 9:44 AM, Proven Provenzano wrote: >>>> We have another blocking issue for the RC : >>>> https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar >>> to >>>> https://issues.apache.org/jira/browse/KAFKA-14616. The new issue >> however >>>> can lead to the new topic having partitions that a producer cannot >> write >>> to. >>>> >>>> --Proven >>>> >>>> On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano < >>> pprovenz...@confluent.io> >>>> wrote: >>>> >>>>> >>>>> I have a PR https://github.com/apache/kafka/pull/15197 for >>>>> https://issues.apache.org/jira/browse/KAFKA-16131 that is building >> now. 
>>>>> --Proven >>>>> >>>>> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz wrote: >>>>> >>>>>> *> Hi Jakub,> > Thanks for trying the RC. I think what you found is a >>>>>> blocker bug because it * >>>>>> *> will generate huge amount of logspam. I guess we didn't find it in >>>>>> junit >>>>>> tests * >>>>>> *> since logspam doesn't fail the automated tests. But certainly it's >>> not >>>>>> suitable * >>>>>> *> for production. Did you file a JIRA yet?* >>>>>> >>>>>> Hi Colin, >>>>>> >>>>>> I opened https://issues.apache.org/jira/browse/KAFKA-161
Re: [ANNOUNCE] New committer: Greg Harris
Congrats Greg! - Gaurav > On 10 Jul 2023, at 17:25, Hector Geraldino (BLOOMBERG/ 919 3RD A) > wrote: > > Congrats Greg! Well deserved > > From: dev@kafka.apache.org At: 07/10/23 12:18:48 UTC-4:00To: > dev@kafka.apache.org > Subject: Re: [ANNOUNCE] New committer: Greg Harris > > Congratulations! > > On Mon, Jul 10, 2023 at 9:17 AM Randall Hauch wrote: >> >> Congratulations, Greg. >> >> On Mon, Jul 10, 2023 at 11:13 AM Mickael Maison >> wrote: >> >>> Congratulations Greg! >>> >>> On Mon, Jul 10, 2023 at 6:08 PM Bill Bejeck >>> wrote: >>>> >>>> Congrats Greg! >>>> >>>> -Bill >>>> >>>> On Mon, Jul 10, 2023 at 11:53 AM Divij Vaidya >>>> wrote: >>>> >>>>> Congratulations Greg! I am going through a new committer teething >>> process >>>>> right now and would be happy to get you up to speed. Looking forward to >>>>> working with you in your new role. >>>>> >>>>> -- >>>>> Divij Vaidya >>>>> >>>>> >>>>> >>>>> On Mon, Jul 10, 2023 at 5:51 PM Josep Prat >>> >>>>> wrote: >>>>> >>>>>> Congrats Greg! >>>>>> >>>>>> >>>>>> ——— >>>>>> Josep Prat >>>>>> >>>>>> Aiven Deutschland GmbH >>>>>> >>>>>> Alexanderufer 3-7, 10117 Berlin >>>>>> >>>>>> Amtsgericht Charlottenburg, HRB 209739 B >>>>>> >>>>>> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen >>>>>> >>>>>> m: +491715557497 >>>>>> >>>>>> w: aiven.io >>>>>> >>>>>> e: josep.p...@aiven.io >>>>>> >>>>>> On Mon, Jul 10, 2023, 17:47 Matthias J. Sax >>> wrote: >>>>>> >>>>>>> Congrats! >>>>>>> >>>>>>> On 7/10/23 8:45 AM, Chris Egerton wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> The PMC for Apache Kafka has invited Greg Harris to become a >>>>> committer, >>>>>>> and >>>>>>>> we are happy to announce that he has accepted! >>>>>>>> >>>>>>>> Greg has been contributing to Kafka since 2019. He has made over >>> 50 >>>>>>> commits >>>>>>>> mostly around Kafka Connect and Mirror Maker 2. His most notable >>>>>>>> contributions include KIP-898: "Modernize Connect plugin >>> discovery" >>>>>> and a >>>>>>>> deep overhaul of the offset syncing logic in MM2 that addressed >>>>> several >>>>>>>> technically-difficult, long-standing, high-impact issues. >>>>>>>> >>>>>>>> He has also been an active participant in discussions and >>> reviews on >>>>>> the >>>>>>>> mailing lists and on GitHub. >>>>>>>> >>>>>>>> Thanks for all of your contributions, Greg. Congratulations! >>>>>>>> >>>>>>> >>>>>> >>>>> >>> > >
Permission to contribute to Apache Kafka
Hi, I'd like to request permissions to contribute to Apache Kafka. My account details are as follows: # Wiki Email: gaurav_naru...@apple.com Username: gnarula # JIRA Email: gaurav_naru...@apple.com Username: gnarula Regards, Gaurav
Consumer offset value -Apache kafka 3.2.3
Dear Kafka Experts, how can we check a particular offset number in Apache Kafka version 3.2.3? Could you please shed some light? The consumer offset checker tool is throwing a class-not-found error: ./kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --topic your-topic --group your-consumer-group --zookeeper localhost:2181
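For reference, ConsumerOffsetChecker was removed in Kafka 1.0, which would explain the class-not-found error on 3.2.3. A sketch of the replacement tooling; host/port values here are placeholders:

# Committed offsets, log-end offsets, and lag per partition for a group
# (no --zookeeper flag; the tool talks to the brokers directly):
./kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
    --describe --group your-consumer-group

# Earliest/latest offsets for a topic (-1 = latest, -2 = earliest;
# newer releases also accept --bootstrap-server here):
./kafka-run-class.sh kafka.tools.GetOffsetShell \
    --broker-list localhost:9092 --topic your-topic --time -1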
Re: Consumer Lag Metrics/ Topic level metrics
Many thanks Samuel. Will go through this. On Tue, Apr 25, 2023 at 9:03 PM Samuel Delepiere < samuel.delepi...@celer-tech.com> wrote: > Hi, > > I use a combination of the Prometheus JMX exporter ( > https://github.com/prometheus/jmx_exporter) and the Prometheus Kafka > exporter (https://github.com/danielqsj/kafka_exporter). > The consumer lag metrics come from the latter. > > I can then output the data in Grafana > > > Regards, > > Sam. > > > > On 25 Apr 2023, at 16:26, Kafka Life wrote: > > Dear Kafka Experts > > Could you please suggest good metrics exporter for consumer lag and topic > level metrics apart from Linkedin kafka burrow for the kafka broker > cluster. >
Consumer Lag Metrics/ Topic level metrics
Dear Kafka Experts, could you please suggest a good metrics exporter for consumer lag and topic-level metrics, apart from LinkedIn's Kafka Burrow, for the Kafka broker cluster?
Zookeeper upgrade 3.4.14 to 3.5.7
Hi Kafka and Zookeeper experts, is it possible to upgrade a 3.4.14 Zookeeper cluster in a rolling fashion (one node at a time) to Zookeeper 3.5.7? Would the cluster work with a mixed set of 3.4.14 and 3.5.7 nodes during the roll? Please advise.
Re: Kafka Metrics to Grafana Labs
Any help from the experts, please? On Sat, Apr 8, 2023 at 2:23 PM Kafka Life wrote: > Hello Kafka Experts > > I need some help. Currently the Grafana agent triggered by > kafka/3.2.3/config/kafka_metrics.yml is sending over five thousand > metrics. Is there a way to limit how many metrics are sent and send > only what is required? Any pointers or such a customized config are much > appreciated. >
Kafka Metrics to Grafana Labs
Hello Kafka Experts, I need some help. Currently the Grafana agent triggered by kafka/3.2.3/config/kafka_metrics.yml is sending over five thousand metrics. Is there a way to limit how many metrics are sent and send only what is required? Any pointers or such a customized config are much appreciated.
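One way to cut the volume at the source is an allowlist in the exporter config itself. A sketch of a trimmed kafka_metrics.yml, assuming the file follows the prometheus/jmx_exporter format; the MBean names and rule patterns below are illustrative, not a vetted production set:

lowercaseOutputName: true
whitelistObjectNames:
  - kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,*
  - kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,*
  - kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
  - kafka.controller:type=KafkaController,name=ActiveControllerCount
rules:
  # Meters expose rates, gauges expose Value; once an explicit rules list
  # is present, attributes not matched by any rule are not collected.
  - pattern: "kafka.server<type=BrokerTopicMetrics, name=(.+)><>OneMinuteRate"
    name: kafka_server_brokertopicmetrics_$1_rate
  - pattern: "kafka.(server|controller)<type=(.+), name=(.+)><>Value"
    name: kafka_$1_$2_$3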
Re: Kafka (Apache or Confluent) - Statement of Work (SOW)
Hi experts, any pointers or guidance on this? On Wed, Apr 5, 2023 at 8:35 PM Kafka Life wrote: > Respected Kafka experts/managers > > Does anyone have a Statement of Work (SOW) covering activities related to > Kafka cluster management, for Apache or Confluent Kafka? Something to > assess and propose to an enterprise for Kafka cluster management. Request > you to kindly share any such documentation, please. >
Kafka (Apache or Confluent) - Statement of Work (SOW)
Respected Kafka experts/managers, does anyone have a Statement of Work (SOW) covering activities related to Kafka cluster management, for Apache or Confluent Kafka? Something to assess and propose to an enterprise for Kafka cluster management. Request you to kindly share any such documentation, please.
Re: Kafka Cluster WITHOUT Zookeeper
This is really great information, Paul. Thank you. On Tue, Mar 28, 2023 at 4:01 AM Brebner, Paul wrote: > I have a recent 3-part blog series on KRaft (an expanded version of my ApacheCon > 2022 talk): > > > > > https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-1-partitions-and-data-performance/ > > > https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-2-partitions-and-meta-data-performance/ > > > https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-3-maximum-partitions-and-conclusions/ > > > > Regards, Paul > > > > *From: *Chia-Ping Tsai > *Date: *Monday, 27 March 2023 at 5:37 pm > *To: *dev@kafka.apache.org > *Cc: *us...@kafka.apache.org , > mmcfarl...@cavulus.com , Israel Ekpo < > israele...@gmail.com>, ranlupov...@gmail.com , > scante...@gmail.com , show...@gmail.com < > show...@gmail.com>, sunilmchaudhar...@gmail.com < > sunilmchaudhar...@gmail.com> > *Subject: *Re: Kafka Cluster WITHOUT Zookeeper > > hi > > You can use the keyword "kraft" to get the answer via Google or ChatGPT. > For example: > > Introduction: > > KRaft - Apache Kafka Without ZooKeeper > <https://developer.confluent.io/learn/kraft/> > > QuickStart: > > Apache Kafka <https://kafka.apache.org/quickstart> > > — > > Chia-Ping > > On 27 Mar 2023, at 13:33, Kafka Life wrote: > > Hello Kafka experts > > Is there a way where we can have a Kafka cluster be functional, serving > producers and consumers, without having a Zookeeper cluster manage the > instance? > > Is there a particular version of Kafka for this, or how can we achieve this, please? >
Re: Apache Kafka version - End of support
Many thanks Josep for your response. On Mon, Mar 27, 2023 at 4:50 PM Josep Prat wrote: > Hello there, > > You can find the general policy here: > > https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan#TimeBasedReleasePlan-WhatIsOurEOLPolicy > > Basically, the last 3 releases are covered and one can upgrade to the > latest version from the last previous 3. Bug fixes are ported to the last 3 > versions on a 'best effort basis'. The current latest release is 3.4.0. > Version 0.11 reached "end of support" (or EOL) a long time ago. > > Hope this helps, > > On Mon, Mar 27, 2023 at 12:44 PM Kafka Life > wrote: > > > Hello Kafka experts > > > > Where can I see the end of support for Apache Kafka versions? I would > > like to know when the 0.11 Kafka version was deprecated. > > -- > *Josep Prat* > Open Source Engineering Director, *Aiven* > josep.p...@aiven.io | +491715557497 > aiven.io <https://www.aiven.io> > *Aiven Deutschland GmbH* > Alexanderufer 3-7, 10117 Berlin > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen > Amtsgericht Charlottenburg, HRB 209739 B >
Apache Kafka version - End of support
Hello Kafka experts, where can I see the end of support for Apache Kafka versions? I would like to know when the 0.11 Kafka version was deprecated.
Kafka Cluster WITHOUT Zookeeper
Hello Kafka experts, is there a way we can have a Kafka cluster be functional, serving producers and consumers, without having a Zookeeper cluster manage the instance? Is there a particular version of Kafka for this, and how can we achieve it, please?
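For reference, this is what KRaft mode provides: available as early access since Kafka 2.8 and declared production-ready in 3.3. A minimal bring-up sketch, assuming a stock Kafka distribution layout:

# Generate a cluster ID, format storage, and start a combined
# broker/controller node with no Zookeeper involved:
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties
bin/kafka-server-start.sh config/kraft/server.properties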
Re: Consumer Lag-Apache_kafka_JMX metrics
Hello Experts, any info or pointers on my query, please? On Mon, Aug 15, 2022 at 11:36 PM Kafka Life wrote: > Dear Kafka Experts > We need to monitor consumer lag in Kafka clusters running versions 2.5.1 > and 2.8.0, in Grafana. > > 1/ What is the correct JMX metric path to evaluate consumer lag in a > Kafka cluster? > > 2/ I had thought it was FetcherLag, but per the link below it appears it > is not: > > https://www.instaclustr.com/support/documentation/kafka/monitoring-information/fetcher-lag-metrics/#:~:text=Aggregated%20Fetcher%20Consumer%20Lag%20This%20metric%20aggregates%20lag,in%20sync%20with%20partitions%20that%20it%20is%20replicating > . > > Could one of you experts please advise which JMX metric I should use for > consumer lag, apart from Kafka Burrow or similar intermediate tools? > > Thanking you in advance > >
Consumer Lag-Apache_kafka_JMX metrics
Dear Kafka Experts, we need to monitor consumer lag in Kafka clusters running versions 2.5.1 and 2.8.0, in Grafana. 1/ What is the correct JMX metric path to evaluate consumer lag in a Kafka cluster? 2/ I had thought it was FetcherLag, but per the link below it appears it is not: https://www.instaclustr.com/support/documentation/kafka/monitoring-information/fetcher-lag-metrics/#:~:text=Aggregated%20Fetcher%20Consumer%20Lag%20This%20metric%20aggregates%20lag,in%20sync%20with%20partitions%20that%20it%20is%20replicating . Could one of you experts please advise which JMX metric I should use for consumer lag, apart from Kafka Burrow or similar intermediate tools? Thanking you in advance
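For what it's worth, the broker-side FetcherLag metrics track replica fetchers (replication), not consumer groups; consumer lag as JMX lives in the consumer JVMs. A sketch of where to look, assuming the KIP-225 metric names that apply to 2.5.x/2.8.x (placeholders in angle brackets):

# Registered in each consumer process, not on the brokers:
kafka.consumer:type=consumer-fetch-manager-metrics,client-id=<client-id>
    attribute: records-lag-max
kafka.consumer:type=consumer-fetch-manager-metrics,client-id=<client-id>,topic=<topic>,partition=<partition>
    attribute: records-lag

# Broker-side alternative that needs no access to client JVMs:
bin/kafka-consumer-groups.sh --bootstrap-server <broker:9092> --describe --group <group>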
Re: HACKING vulnerability in SpringBoot (Java) for Apache Kafka
Dear Luke , Thank you for your kind and prompt response. On Mon, Apr 4, 2022 at 1:23 PM Luke Chen wrote: > Hi, > > The impact for the CVE-2022-22965? Since this is a RCE vulnerability, which > means the whole system (including Kafka and ZK) is under the attackers' > control, and can do whatever they want. > > The ideal fix for this is to upgrade Spring Framework 5.3.18 and 5.2.20 or > greater. Alternatively, you can have workarounds: > 1. Upgrading Tomcat > 2. Downgrading to Java 8 > 3. Disallowed Fields > > I think this blog from Spring community is very clear: > https://spring.io/blog/2022/03/31/spring-framework-rce-early-announcement > > Thank you. > Luke > > On Mon, Apr 4, 2022 at 3:32 PM Kafka Life wrote: > > > Hi Kafka Experts > > > > Regarding the recent threat of vulnerability in spring framework , > > CVE-2022-22965 vulnerability is SpringBoot (Java) for apache kafka and > > Zookeeper. Could one of you suggest how Apache kafka and zk are impacted > > and what should be the ideal fix for this . > > > > Vulnerability in the Spring Framework (CVE-2022-22965) | Information > > Security Office (berkeley.edu) > > < > > > https://security.berkeley.edu/news/vulnerability-spring-framework-cve-2022-22965 > > > > > > > Critical alert – Spring4Shell RCE (CVE-2022-22965 in Spring) | Acunetix > > < > > > https://www.acunetix.com/blog/web-security-zone/critical-alert-spring4shell-rce-cve-2022-22965-in-spring/ > > > > > > > > > Thanks in advance > > >
HACKING vulnerability in SpringBoot (Java) for Apache Kafka
Hi Kafka Experts, regarding the recent vulnerability threat in the Spring framework: CVE-2022-22965 is a SpringBoot (Java) vulnerability relevant to Apache Kafka and Zookeeper. Could one of you suggest how Apache Kafka and ZK are impacted, and what the ideal fix would be? Vulnerability in the Spring Framework (CVE-2022-22965) | Information Security Office (berkeley.edu) <https://security.berkeley.edu/news/vulnerability-spring-framework-cve-2022-22965> Critical alert – Spring4Shell RCE (CVE-2022-22965 in Spring) | Acunetix <https://www.acunetix.com/blog/web-security-zone/critical-alert-spring4shell-rce-cve-2022-22965-in-spring/> Thanks in advance
Re: KAFKA- UNDER_REPLICATED_PARTIONS - AUTOMATE
Thank you Malcolm. Will go through this. On Sat, Feb 26, 2022 at 2:22 AM Malcolm McFarland wrote: > Maybe this could help? > https://github.com/dimas/kafka-reassign-tool > > Cheers, > Malcolm McFarland > Cavulus > > > On Fri, Feb 25, 2022 at 9:00 AM Kafka Life wrote: > > > Dear Experts > > > > Do you have any solution for this, please? > > > > On Tue, Feb 22, 2022 at 8:31 PM Kafka Life > wrote: > > > > > Dear Kafka Experts > > > > > > Does anyone have a dynamically generated JSON file based on the under- > > > replicated partitions in the Kafka cluster? > > > Every time the URP count increases to over 500, it is a tedious job to > > > create a JSON file manually. > > > > > > I request you to share any such dynamically generated script/JSON file. > > > > > > Thanks in advance. > > >
Re: KAFKA- UNDER_REPLICATED_PARTIONS - AUTOMATE
Dear Experts, do you have any solution for this, please? On Tue, Feb 22, 2022 at 8:31 PM Kafka Life wrote: > Dear Kafka Experts > > Does anyone have a dynamically generated JSON file based on the under- > replicated partitions in the Kafka cluster? > Every time the URP count increases to over 500, it is a tedious job to > create a JSON file manually. > > I request you to share any such dynamically generated script/JSON file. > > Thanks in advance. >
KAFKA- UNDER_REPLICATED_PARTIONS - AUTOMATE
Dear Kafka Experts, does anyone have a dynamically generated JSON file based on the under-replicated partitions in the Kafka cluster? Every time the URP count increases to over 500, it is a tedious job to create a JSON file manually. I request you to share any such dynamically generated script/JSON file. Thanks in advance.
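A minimal sketch of one way to generate such a file, assuming the "Topic: <name> Partition: <n> ..." output format of kafka-topics.sh --describe on recent versions; the bootstrap address and broker IDs are placeholders:

#!/usr/bin/env bash
# Collect topics with under-replicated partitions into a topics-to-move JSON.
BOOTSTRAP="localhost:9092"
bin/kafka-topics.sh --bootstrap-server "$BOOTSTRAP" \
    --describe --under-replicated-partitions |
  awk '{print $2}' | sort -u |
  awk 'BEGIN { printf "{\"topics\":[" }
       { printf "%s{\"topic\":\"%s\"}", (NR>1 ? "," : ""), $0 }
       END { print "],\"version\":1}" }' > topics-to-move.json

# Then let Kafka propose the reassignment for the affected topics:
bin/kafka-reassign-partitions.sh --bootstrap-server "$BOOTSTRAP" \
    --topics-to-move-json-file topics-to-move.json \
    --broker-list "1,2,3" --generate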
Mirror Maker Issue : Huge difference in topic size on disk
Dear Kafka Experts, I need your advice, please. I am running a mirror maker on Kafka 2.8 to replicate a topic from a Kafka 0.11 instance. The size of each partition of the topic on 0.11 is always 5 to 6 GB, but on the 2.8 instance the replicated topic is 40 GB for the same partition. The topic configuration in terms of retention and max partition size is the same across both instances. Is there an issue with MirrorMaker on Kafka 2.8, or are there any other pointers to look at, please? Many thanks in advance.
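One common cause of this size pattern (hedged, since the thread doesn't confirm it): a mirror maker consumes decompressed batches and re-produces them with its own producer settings, so if the source topic is compressed and the MM producer has compression.type=none, the target copies balloon. A way to check, assuming default log directory paths:

# Compare the compresscodec field of a segment on each cluster:
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
    --files /var/kafka-logs/<topic>-0/00000000000000000000.log \
    --print-data-log | head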
Kafka : High avaialability settings
Dear Kafka experts, I have a 10-broker Kafka cluster with all topics having a replication factor of 3, at least 50 partitions, and min.insync.replicas set to 2. One broker went down due to a hardware failure, and many applications complained that they were not able to produce/consume messages. Please suggest how I can overcome this problem and make Kafka highly available even while a broker is down or during rolling restarts. Is there a topic-level configuration I can set so that new partitions are created on the active, running brokers when a node is down? I have read about setting ack=0/1/all at the application/producer end, but the applications are not ready to change to acks=all. Can you please suggest? Many thanks in advance.
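For context (hedged, since the exact failure mode isn't shown in the thread): Kafka does not re-create partitions on surviving brokers when a node dies; instead, partition leadership fails over to in-sync replicas, and with RF=3 plus min.insync.replicas=2 the cluster tolerates exactly one down replica per partition. The availability knobs therefore sit mostly on the clients. A producer-config sketch; the values are illustrative:

# producer.properties (sketch)
acks=all                      # with min.insync.replicas=2, still writable with 1 broker down
retries=2147483647            # retry through leader elections
delivery.timeout.ms=120000    # upper bound on send + retries
metadata.max.age.ms=30000     # pick up new leaders promptly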
Re: Automation Script : Kafka topic creation
Thank you Men and Ran. On Sat, Nov 6, 2021 at 7:23 PM Men Lim wrote: > I'm currently using kafka-gitops. > > On Sat, Nov 6, 2021 at 3:35 AM Kafka Life wrote: > > > Dear Kafka experts > > > > Does anyone have a ready-made/automated script to create/delete/alter > > topics in different environments, taking configuration parameters as input? > > > > If yes, I request you to kindly share it with me, please. >
Automation Script : Kafka topic creation
Dear Kafka experts, does anyone have a ready-made/automated script to create/delete/alter topics in different environments, taking configuration parameters as input? If yes, I request you to kindly share it with me, please.
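A small sketch of such a tool, assuming a made-up spec-file format of "<action> <topic> <partitions> <replication> [k=v ...]" per line; the format, file names, and defaults here are illustrative:

#!/usr/bin/env bash
BOOTSTRAP="${1:?usage: $0 <bootstrap-server> <spec-file>}"
SPEC="${2:?usage: $0 <bootstrap-server> <spec-file>}"
while read -r action topic partitions rf configs; do
  case "$action" in
    create) kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --create \
              --topic "$topic" --partitions "$partitions" --replication-factor "$rf" \
              $(for kv in $configs; do printf -- '--config %s ' "$kv"; done) ;;
    alter)  kafka-configs.sh --bootstrap-server "$BOOTSTRAP" --alter \
              --entity-type topics --entity-name "$topic" \
              --add-config "${configs// /,}" ;;
    delete) kafka-topics.sh --bootstrap-server "$BOOTSTRAP" --delete --topic "$topic" ;;
  esac
done < "$SPEC"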
Re: New Kafka Consumer : unknown member id
Hello Luke, I have built a new Kafka environment with Kafka 2.8.0. A new consumer set up against this environment is throwing the error below; the old consumers for the same applications in the same 2.8.0 environment are working fine. Could you please advise? 2021-11-02 12:25:24 DEBUG AbstractCoordinator:557 - [Consumer clientId=, groupId=] Attempt to join group failed due to unknown member id. On Fri, Oct 29, 2021 at 7:36 AM Luke Chen wrote: > Hi, > Which version of the Kafka client are you using? > I can't find this error message in the source code. > When googling this error message, it showed the error is from Kafka v0.9. > > Could you try v3.0.0 and see if that issue still exists? > > Thank you. > Luke > > On Thu, Oct 28, 2021 at 11:15 PM Kafka Life > wrote: > > > Dear Kafka Experts > > > > We have set up a group.id (consumer) = YYY, > > but when trying to connect to the Kafka instance I get this error message. I > > am sure this consumer (group id) does not exist in Kafka. We use the PLAINTEXT > > protocol to connect to Kafka 2.8.0. Please suggest how to resolve this > > issue. > > > > DEBUG AbstractCoordinator:557 - [Consumer clientId=X, groupId=YYY] > > Attempt to join group failed due to unknown member id. > >
New Kafka Consumer : unknown member id
Dear Kafka Experts, we have set up a group.id (consumer) = YYY, but when trying to connect to the Kafka instance I get this error message. I am sure this consumer (group id) does not exist in Kafka. We use the PLAINTEXT protocol to connect to Kafka 2.8.0. Please suggest how to resolve this issue. DEBUG AbstractCoordinator:557 - [Consumer clientId=X, groupId=YYY] Attempt to join group failed due to unknown member id.
Apache Kafka : start up scripts
Dear Kafka experts, when a broker is started using the start script, could any of you please describe the sequence of steps that happens in the background until the node is up? For instance, once the start script is initiated: 1/ does it check the indexes? 2/ does it check the ISR? 3/ are URPs brought down to zero? I tried to look in the server log but could not work out the sequence of events performed until the node was up. Could someone please help? Thanks
Re: Zookeeper : Throttling connections
Thank you very much, Mr. Israel Ekpo. I really appreciate it. We are using the 0.10 version of Kafka and are in the process of upgrading to 2.6.1; planning is in progress. And yes, these connections to the zookeepers are for Kafka functionality. There are frequent incidents where the zookeepers get bombarded with around 8,000 to 10,000 connections from multiple clients. Do you suggest we configure globalOutstandingLimit to 8000, and should this be set in the zoo.cfg file on Zookeeper? On Fri, Jul 16, 2021 at 7:11 PM Israel Ekpo wrote: > Hello, > > I am assuming you are using Zookeeper because of your Kafka brokers. What > version of Kafka are you using? > > I would like to start by stating that very soon this will no longer be an > issue as the project is taking steps to decouple Kafka from Zookeeper. Take > a look at KIP-500 for additional information. In Kafka 2.8.0 you should be > able to configure the early-access version of running Kafka without > Zookeeper. > > When running Kafka in KRaft mode (without Zookeeper) you do not need to > worry about this throttling since Zookeeper is not part of the > architecture. > > To answer your questions: some changes were made to prevent Zookeeper from > getting overwhelmed, hence to avoid recurring crashes during stressful loads > restrictions can be configured to manage the load on Zookeeper via the > throttling settings: > > https://issues.apache.org/jira/plugins/servlet/mobile#issue/ZOOKEEPER-3243 > > You should be able to see additional information in the administration > guides on how this is configured: > > https://zookeeper.apache.org/doc/r3.2.2/zookeeperAdmin.html > > For example, the configuration setting below: > > globalOutstandingLimit > > (Java system property: *zookeeper.globalOutstandingLimit*) > > Clients can submit requests faster than ZooKeeper can process them, > especially if there are a lot of clients. To prevent ZooKeeper from running > out of memory due to queued requests, ZooKeeper will throttle clients so > that there are no more than globalOutstandingLimit outstanding requests in > the system. The default limit is 1,000. > > I hope this gets you started. > > Feel free to reach out if you have additional questions. > > Have a great day. > > On Fri, Jul 16, 2021 at 7:44 AM Kafka Life wrote: > > > Dear KAFKA & Zookeeper experts. > > > > 1/ What is Zookeeper throttling? Is it done on Zookeeper? How is it > > configured? > > 2/ Is it helpful? > >
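A zoo.cfg sketch of the settings discussed above; the values are illustrative, not recommendations:

# zoo.cfg
# Caps queued-but-unprocessed requests across the ensemble; note it does
# not cap the number of client connections.
globalOutstandingLimit=8000
# To bound connection counts themselves, per client IP:
maxClientCnxns=200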
Zookeeper : Throttling connections
Dear KAFKA & Zookeeper experts. 1/ What is Zookeeper throttling? Is it done on Zookeeper? How is it configured? 2/ Is it helpful?
consumer group exit :explanation
Dear Kafka Experts, could one of you please help explain what the broker log lines below mean, and in what scenarios they occur when no change has been made? INFO [GroupCoordinator 9610]: Member webhooks-retry-app-840d3107-833f-4908-90bc-ea8c394c07c3-StreamThread-2-consumer-f87c3b85-5aa1-40f5-a42f-58927421b89e in group webhooks-retry-app has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator) INFO [GroupCoordinator 9611]: Member cm.consumer.9-d65d39d3-703f-408b-bf4b-fbf087321d8c in group cm_group_apac_sy_cu_01 has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator) Please help to explain. Thanks
Consumer Reporting error
Hello Kafka experts, the consumer team is reporting an issue while consuming data from the topic, relating to a "singularityheader" record header. Can someone please advise on how to resolve this issue? The error looks like: Starting offset: 1226716 offset: 1226716 position: 0 CreateTime: 1583780622665 isvalid: true keysize: -1 valuesize: 346 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 sequence: -1 isTransactional: false headerKeys: [singularityheader] payload: ^@^@^@^A^@^@^@ 5.0.0.1246^T5.0.0.1246???\^B,AzLJC6OMQFemD1qRiUdTbg^B^@^BH0aa8db3c-6967-41a3-a7e5-395a5b70a59b^B^P79637040^@^@^@?^C^@^@^B$137@10 @avor-abb113???\??^A^@^@?^A^B^@^BH0186bf9c-53d8-4ec1-ae0b-3f9f6c98c4f4^Rundefined^B?8?>^B^FSIC
Apache Version upgrade
Dear Kafka Experts, 1- Can anyone share an upgrade plan with steps/tracker or any other useful documentation, please? 2- We are upgrading Kafka from the old version 0.11 to 2.5. Any suggestions/directions are highly appreciated. Thanks
confluent-kafka python library is not working with ubutu14 and python3
I am using "confluent-kafka==1.0.1". It works fine when I am using py3 and ubuntu18, but fails with py3 and ubuntu14. I get the following error. Traceback (most recent call last): File "/usr/local/lib/python3.4/dist-packages/metrics_agent/kafka_writer.py", line 147, in enqueue_for_topic producer.produce(topic_name, msg, partition=_get_partition(producer, topic_name)) File "/usr/local/lib/python3.4/dist-packages/confluent_kafka/serializing_producer.py", line 168, in produce raise KeySerializationError(se) confluent_kafka.error.KeySerializationError: KafkaError{code=_KEY_SERIALIZATION,val=-162,str="'bytes' object has no attribute 'encode'"} Exception KafkaError{code=_KEY_SERIALIZATION,val=-162,str="'bytes' object has no attribute 'encode'"} Traceback (most recent call last): File "/usr/local/lib/python3.4/dist-packages/confluent_kafka/serializing_producer.py", line 166, in produce key = self._key_serializer(key, ctx) File "/usr/local/lib/python3.4/dist-packages/confluent_kafka/serialization/__init__.py", line 369, in __call__ return obj.encode(self.codec) AttributeError: 'bytes' object has no attribute 'encode'
Contribute to kafka
Hi, my JIRA id is sshil. I want to contribute to Kafka. Can you please add me to the contributors list?
Pristine Zookeeper and Kafka (via Docker) producing OFFSET_OUT_OF_RANGE for Fetch
Hi all, I'm implementing a custom client, and I was wondering whether anyone could explain the OFFSET_OUT_OF_RANGE error in this scenario. My test suite tears down and spins up a fresh Zookeeper and Kafka every time inside a pristine Docker container. The test suite runs as: 1. the producer runs first and finishes; 2. the consumer group members then run later in 3 separate threads. I write key/value pairs of "fruit", "animal" and "vegetable" with a round-robin algorithm for each partition. The consumer group process runs an OffsetCommit with offset=0 for each partition to kick off. (I found that if I just started with OffsetFetch I would get UNKNOWN_TOPIC_OR_PARTITION, and couldn't find docs about whether this was "normal" or not. But that's a tangent.) This page shows writing to three partitions within a topic, and each Produce request succeeding: https://chrisdone.com/consumer-groups-sink-out-of-range.html [fine] Each column is a thread in my test suite. However, when trying to fetch those three partitions, for some reason on partition 2 I get OFFSET_OUT_OF_RANGE. The other two partitions consume successfully. This can be seen on the consumer side shown on this page: https://chrisdone.com/consumer-groups-source-out-of-range.html [problem] (scroll to about half way through, as the first half is three threads trying to join the group; once they've joined, three new threads spin up to the right, one for each consumer in the consumer group.) Yet this is a nondeterministic error that seems to depend on timing. I have a random 1-500ms wait intentionally placed around every message log so that the program might exhibit real-world cases like this. If I remove the random timeouts, this process works every time (demonstrated here: https://chrisdone.com/consumer-groups-source-working.html). So there is some kind of timing issue that I cannot identify. Upon receiving an OFFSET_OUT_OF_RANGE error, you can see in the log that I wait, refresh metadata, and retry the request again (as I read elsewhere [1] that this "typically implies a leader change"), only to get another OFFSET_OUT_OF_RANGE.
I'm receiving a OffsetFetch response of partitionIndex = 2 and committedOffset = 0, ( ThreadId 46 , SourceRequestMsg (ReceivedResponse 0.006022722 s (OffsetFetchResponseV0 { topicsArray = ARRAY [ OffsetFetchResponseV0Topics { name = STRING "355b26d6-ccab-4b28-bd05-a44ac6326cb7" , partitionsArray = ARRAY [ OffsetFetchResponseV0TopicsPartitions { partitionIndex = 2 , committedOffset = 0 , metadata = NULLABLE_STRING (Just "") , errorCode = NONE } ] } ] }))) I sent a fetch request, ( ThreadId 49 , ConsumerGroupConsumerFor "myclientid-e79931cc-d6d4-479b-90d6-1b61aab85198" [PartitionId 2] (KafkaSourceMsg (SourceRequestMsg (SendingRequest (FetchRequestV4 { replicaId = -1 , maxWaitTime = 200 , minBytes = 5 , maxBytes = 1048576 , isolationLevel = 0 , topicsArray = ARRAY [ FetchRequestV4Topics { topic = STRING "355b26d6-ccab-4b28-bd05-a44ac6326cb7" , partitionsArray = ARRAY [ FetchRequestV4TopicsPartitions { partition = 2 , fetchOffset = 0 , partitionMaxBytes = 1048576 } ] } ] }) And yet it returns FetchResponseV4Responses { topic = STRING "355b26d6-ccab-4b28-bd05-a44ac6326cb7" , partitionResponsesArray = ARRAY [ FetchResponseV4ResponsesPartitionResponses { partitionHeader = FetchResponseV4ResponsesPartitionResponsesPartitionHeader { partition = 2 , errorCode = OFFSET_OUT_OF_RANGE , highWatermark = -1 , lastStableOffset = -1 , abortedTransactionsArray = ARRAY [] } , recordSet = RecordBatchV2Sequence {recordBatchV2Sequence = []} } ] } So I am very confused. Can someone who is more familiar with this process hazard a guess as to what's going on? Cheers, Chris [1]: https://issues.apache.org/jira/browse/KAFKA-7395?focusedCommentId=16640313=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16640313
Re: How to change schema registry port
On Wed, Dec 25, 2019 at 5:51 PM Kafka Shil wrote: > Hi, > I am using docker compose file to start schema-registry. I need to change > default port to 8084 instead if 8081. I made the following changes in > docker compose file. > > schema-registry: > image: confluentinc/cp-schema-registry:5.3.1 > hostname: schema-registry > container_name: schema-registry > depends_on: > - zookeeper > - broker > ports: > - "8084:8084" > environment: > SCHEMA_REGISTRY_HOST_NAME: schema-registry > SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181' > SCHEMA_REGISTRY_LISTENERS: http://localhost:8084 > > But If I execute "docker ps", I can see it still listens to 8081. > > 69511efd32d4confluentinc/cp-schema-registry:5.3.1 > "/etc/confluent/dock…" 9 minutes ago Up 9 minutes > 8081/tcp, 0.0.0.0:8084->8084/tcp schema-registry > > How do I change port to 8084? > > Thanks >
How to change schema registry port
Hi, I am using a docker compose file to start schema-registry. I need to change the default port to 8084 instead of 8081. I made the following changes in the docker compose file:

schema-registry:
  image: confluentinc/cp-schema-registry:5.3.1
  hostname: schema-registry
  container_name: schema-registry
  depends_on:
    - zookeeper
    - broker
  ports:
    - "8084:8084"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
    SCHEMA_REGISTRY_LISTENERS: http://localhost:8084

But if I execute "docker ps", I can see it still listens on 8081:

69511efd32d4  confluentinc/cp-schema-registry:5.3.1  "/etc/confluent/dock…"  9 minutes ago  Up 9 minutes  8081/tcp, 0.0.0.0:8084->8084/tcp  schema-registry

How do I change the port to 8084? Thanks
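A hedged note on the two symptoms above: the "8081/tcp" entry in docker ps is the port the image EXPOSEs in its Dockerfile, not necessarily a live listener, and a listener bound to localhost inside the container is unreachable through the published port. A compose sketch that addresses both, assuming the same image version:

schema-registry:
  image: confluentinc/cp-schema-registry:5.3.1
  hostname: schema-registry
  depends_on: [zookeeper, broker]
  ports:
    - "8084:8084"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_CONNECTION_URL: 'zookeeper:2181'
    # Bind to all interfaces so the published port works from outside:
    SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8084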
[jira] [Created] (KAFKA-7710) Poor Zookeeper ACL management with Kerberos
Mr Kafka created KAFKA-7710: --- Summary: Poor Zookeeper ACL management with Kerberos Key: KAFKA-7710 URL: https://issues.apache.org/jira/browse/KAFKA-7710 Project: Kafka Issue Type: Bug Reporter: Mr Kafka

I have seen many organizations run many Kafka clusters. The simplest scenario is you may have a kafka.dev.example.com cluster and a kafka.prod.example.com cluster. The more extreme example is teams within an organization running their own individual clusters. When you enable Zookeeper ACLs in Kafka, the ACL looks to be set to the principal (SPN) that is used to authenticate against Zookeeper. For example, I have brokers:
* 01.kafka.dev.example.com
* 02.kafka.dev.example.com
* 03.kafka.dev.example.com

On 01.kafka.dev.example.com I run the security-migration tool:

{code:java}
KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/kafka_server_jaas.conf -Dzookeeper.sasl.clientconfig=ZkClient" zookeeper-security-migration --zookeeper.acl=secure --zookeeper.connect=a01.zookeeper.dev.example.com:2181
{code}

I end up with ACLs in Zookeeper as below:

{code:java}
# [zk: localhost:2181(CONNECTED) 2] getAcl /cluster
# 'sasl,'kafka/01.kafka.dev.example.com@EXAMPLE
# : cdrwa
{code}

This ACL means no other broker in the cluster can access the znode in Zookeeper except broker 01. To resolve the issue you need to set the below properties in Zookeeper's config:

{code:java}
kerberos.removeHostFromPrincipal = true
kerberos.removeRealmFromPrincipal = true
{code}

Now when Kafka sets ACLs they are stored as: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7510) KStreams RecordCollectorImpl leaks data to logs on error
Mr Kafka created KAFKA-7510: --- Summary: KStreams RecordCollectorImpl leaks data to logs on error Key: KAFKA-7510 URL: https://issues.apache.org/jira/browse/KAFKA-7510 Project: Kafka Issue Type: Bug Components: streams Reporter: Mr Kafka

org.apache.kafka.streams.processor.internals.RecordCollectorImpl leaks data on error, as it dumps the *value* / message payload to the logs. This is problematic as it may contain personally identifiable information (PII) or other secret information, written to plain-text log files which can then be propagated to other log systems, i.e. Splunk. I suggest the *key* and *value* fields be moved to debug level, as they are useful to some people, while error level keeps the *errorMessage*, *timestamp*, *topic* and *stackTrace*.

{code:java}
private void recordSendError(
    final K key,
    final V value,
    final Long timestamp,
    final String topic,
    final Exception exception
) {
    String errorLogMessage = LOG_MESSAGE;
    String errorMessage = EXCEPTION_MESSAGE;
    if (exception instanceof RetriableException) {
        errorLogMessage += PARAMETER_HINT;
        errorMessage += PARAMETER_HINT;
    }
    log.error(errorLogMessage, key, value, timestamp, topic, exception.toString());
    sendException = new StreamsException(
        String.format(
            errorMessage,
            logPrefix,
            "an error caught",
            key,
            value,
            timestamp,
            topic,
            exception.toString()
        ),
        exception);
}
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
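A sketch of the split the reporter suggests (not the actual Kafka patch; class and method names here are illustrative): context fields stay at ERROR while record content moves to DEBUG, so payloads never reach production logs by default.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class RecordErrorLogging {
    private static final Logger log = LoggerFactory.getLogger(RecordErrorLogging.class);

    static <K, V> void logSendError(K key, V value, Long timestamp,
                                    String topic, Exception exception) {
        // Non-sensitive context plus stack trace at ERROR:
        log.error("Error sending record to topic {} with timestamp {}: {}",
                topic, timestamp, exception.toString(), exception);
        // Record content only at DEBUG, off by default in production:
        log.debug("Content of failed record: key {} value {}", key, value);
    }
}
{code}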
[jira] [Created] (KAFKA-7349) Long Disk Writes cause Zookeeper Disconnects
Adam Kafka created KAFKA-7349: - Summary: Long Disk Writes cause Zookeeper Disconnects Key: KAFKA-7349 URL: https://issues.apache.org/jira/browse/KAFKA-7349 Project: Kafka Issue Type: Bug Components: zkclient Affects Versions: 0.11.0.1 Reporter: Adam Kafka Attachments: SpikeInWriteTime.png We run our Kafka cluster on a cloud service provider. As a consequence, we notice a large tail latency write time that is out of our control. Some writes take on the order of seconds. We have noticed that often these long write times are correlated with subsequent Zookeeper disconnects from the brokers. It appears that during the long write time, the Zookeeper heartbeat thread does not get scheduled CPU time, resulting in a long gap of heartbeats sent. After the write, the ZK thread does get scheduled CPU time, but it detects that it has not received a heartbeat from Zookeeper in a while, so it drops its connection then rejoins the cluster. Note that the timeout reported is inconsistent with the timeout as set by the configuration ({{zookeeper.session.timeout.ms}} = default of 6 seconds). We have seen a range of values reported here, including 5950ms (less than threshold), 12032ms (double the threshold), 25999ms (much larger than the threshold). We noticed that during a service degradation of the storage service of our cloud provider, these Zookeeper disconnects increased drastically in frequency. We are hoping there is a way to decouple these components. Do you agree with our diagnosis that the ZK disconnects are occurring due to thread contention caused by long disk writes? Perhaps the ZK thread could be scheduled at a higher priority? Do you have any suggestions for how to avoid the ZK disconnects? Here is an example of one of these events: Logs on the Broker: {code} [2018-08-25 04:10:19,695] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:21,697] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:23,700] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:25,701] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:27,702] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:29,704] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:31,707] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:33,709] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:35,712] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:37,714] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:39,716] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:41,719] DEBUG Got ping response for sessionid: 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn) ... 
[2018-08-25 04:10:53,752] WARN Client session timed out, have not heard from server in 12032ms for sessionid 0x36202ab4337002c (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:53,754] INFO Client session timed out, have not heard from server in 12032ms for sessionid 0x36202ab4337002c, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) [2018-08-25 04:10:53,920] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient) [2018-08-25 04:10:53,920] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient) ... {code} GC logs during the same time (demonstrating this is not just a long GC): {code} 2018-08-25T04:10:36.434+: 35150.779: [GC (Allocation Failure) 3074119K->2529089K(6223360K), 0.0137342 secs] 2018-08-25T04:10:37.367+: 35151.713: [GC (Allocation Failure) 3074433K->2528524K(6223360K), 0.0127938 secs] 2018-08-25T04:10:38.274+: 35152.620: [GC (Allocation Failure) 3073868K->2528357K(6223360K), 0.0131040 secs] 2018-08-25T04:10:39.220+: 35153.566: [GC (Allocation Failure) 3073701K->2528885K(6223360K), 0.0133247 secs] 2018-08-25T04:10:40.175+: 35154.520: [GC (Allocation Failure) 3074229K->2528639K(6223360K), 0.0127870 secs] 2018-08-25T04:10:41.084+: 35155.429: [GC (Allocation Failure) 3073983K->2530769K(6223360K), 0.0135058 secs] 2018-08-25T04:10:42.012+: 35156.358: [GC (Allocation Failure) 3076113K->2531772K(6223360K), 0.0165919 secs] 2018-08-25T04:10:5
Client high CPU caused by delayedFetch operations returning immediately
Hi all,

We found that our consumers have a high CPU load in our production environment. Since fetch.min.bytes and fetch.wait.max.ms affect how often the broker returns fetch responses, we set both very high so that the broker would be hard-pressed to satisfy them. That did not solve the problem, so we checked Kafka's code. DelayedFetch's tryComplete() contains the following:

if (endOffset.messageOffset != fetchOffset.messageOffset) {
  if (endOffset.onOlderSegment(fetchOffset)) {
    // Case C, this can happen when the new fetch operation is on a truncated leader
    debug("Satisfying fetch %s since it is fetching later segments of partition %s.".format(fetchMetadata, topicAndPartition))
    return forceComplete()
  } else if (fetchOffset.onOlderSegment(endOffset)) {
    // Case C, this can happen when the fetch operation is falling behind the current segment
    // or the partition has just rolled a new segment
    debug("Satisfying fetch %s immediately since it is fetching older segments.".format(fetchMetadata))
    return forceComplete()
  } else if (fetchOffset.messageOffset < endOffset.messageOffset) {
    // we need take the partition fetch size as upper bound when accumulating the bytes
    accumulatedSize += math.min(endOffset.positionDiff(fetchOffset), fetchStatus.fetchInfo.fetchSize)
  }
}

So we can be sure that our fetchOffset's segmentBaseOffset is not the same as endOffset's segmentBaseOffset. When we checked our topic-partition's segments, we found that the data in the old segment had all been cleaned by log.retention, and we suspect that fetchOffset's segmentBaseOffset being smaller than endOffset's segmentBaseOffset causes this problem.

My question is: to make the client use less CPU, should this code instead be

if (endOffset.messageOffset != fetchOffset.messageOffset) {
  if (endOffset.onOlderSegment(fetchOffset)) {
    return false
  } else if (fetchOffset.onOlderSegment(endOffset)) {
    return false
  }
}

so that in this scenario the broker responds after fetch.wait.max.ms instead of returning immediately?

Feedback is greatly appreciated. Thanks.
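For anyone who wants to reproduce the symptom, a minimal sketch of a consumer that exercises these settings (broker address, group id, and topic name are placeholders; note that with the new Java consumer the config key is fetch.max.wait.ms rather than the older fetch.wait.max.ms):

import java.util.{Arrays, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer

// Sketch: raise fetch.min.bytes and fetch.max.wait.ms so the broker should
// batch fetch responses instead of replying to every request immediately.
object FetchTuningSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")   // placeholder
    props.put("group.id", "cpu-repro")               // placeholder
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("fetch.min.bytes", "1048576")  // wait for at least 1 MB of data ...
    props.put("fetch.max.wait.ms", "500")    // ... or 500 ms, whichever comes first

    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(Arrays.asList("test-topic"))  // placeholder topic
    try {
      while (true) {
        // with a stuck DelayedFetch, this loop spins and burns CPU even
        // though almost no data is returned
        val records = consumer.poll(1000)
        println(s"polled ${records.count()} records")
      }
    } finally {
      consumer.close()
    }
  }
}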
Re: Can Kafka 0.9 guarantee no data loss?
Oh, please ignore my last reply. I see that if leaderReplica.highWatermark.messageOffset >= requiredOffset, this already ensures that the LEO of every replica in curInSyncReplicas is >= requiredOffset.

> 在 2016年9月23日,下午3:39,Kafka <kafka...@126.com> 写道:
> 
> OK, the example before was not enough to expose the problem.
> What happens when numAcks is 1 and curInSyncReplicas.size >= minIsr, but in
> fact only one replica in curInSyncReplicas has caught up to the leader, and
> that replica is the leader itself? This is not safe when the machine hosting
> the leader partition's broker is restarted.
> 
> The current code is:
> if (minIsr <= curInSyncReplicas.size) {
>   (true, ErrorMapping.NoError)
> } else {
>   (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
> }
> 
> Why not:
> if (minIsr <= curInSyncReplicas.size && minIsr <= numAcks) {
>   (true, ErrorMapping.NoError)
> } else {
>   (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
> }
> 
> It seems that a single condition in the broker's code is not enough to be
> safe, because the replicas in curInSyncReplicas are not strongly
> synchronized.
> 
>> 在 2016年9月23日,下午1:45,Becket Qin <becket@gmail.com> 写道:
>> 
>> In order to satisfy a produce response, there are two conditions:
>> A. The leader's high watermark should be higher than the requiredOffset
>> (the max offset in that produce request for that partition).
>> B. The number of in-sync replicas is greater than min.isr.
>> 
>> The ultimate goal here is to make sure at least min.isr replicas have
>> caught up to requiredOffset. So the check is not only whether we have
>> enough replicas in the ISR, but also whether the replicas in the ISR have
>> caught up to the required offset.
>> 
>> In your example, if numAcks is 0 and curInSyncReplicas.size >= minIsr, the
>> produce response won't return if min.isr > 0, because
>> leaderReplica.highWatermark must be less than requiredOffset given that
>> numAcks is 0, i.e. condition A is not met.
>> 
>> We are actually doing a stronger-than-necessary check here.
>> Theoretically, as long as min.isr replicas have caught up to
>> requiredOffset we could return the response, but we also require those
>> replicas to be in the ISR.
>> 
>> On Thu, Sep 22, 2016 at 8:15 PM, Kafka <kafka...@126.com> wrote:
>> 
>>> @wangguozhang, could you give me some advice?
>>> 
>>>> 在 2016年9月22日,下午6:56,Kafka <kafka...@126.com> 写道:
>>>> 
>>>> Hi all,
>>>> In terms of the topic, we created a topic with 6 partitions, each with 3 replicas.
>>>> In terms of the producer, we send messages with acks=-1 using the sync interface.
>>>> In terms of the brokers, we set min.insync.replicas to 2.
>>>> 
>>>> After reviewing the Kafka broker's code, we know that when we send a
>>>> message with acks=-1, we get a response once the ISR of the partition is
>>>> greater than or equal to min.insync.replicas. What confuses me is that
>>>> the replicas in the ISR are not strongly consistent: in Kafka 0.9 the
>>>> replica.lag.time.max.ms parameter decides when to shrink the ISR, so a
>>>> replica in the ISR can lag the leader by up to that long. If we restart
>>>> the broker that owns the partition's leader, the controller starts a new
>>>> leader election, which chooses the first replica in the ISR that is not
>>>> the current leader as the new leader, and this can lose data.
>>>> 
>>>> The main produce-handling code is shown below:
>>>> val numAcks = curInSyncReplicas.count(r => {
>>>>   if (!r.isLocal)
>>>>     if (r.logEndOffset.messageOffset >= requiredOffset) {
>>>>       trace("Replica %d of %s-%d received offset %d".format(r.brokerId, topic, partitionId, requiredOffset))
>>>>       true
>>>>     }
>>>>     else
>>>>       false
>>>>   else
>>>>     true /* also count the local (leader) replica */
>>>> })
>>>> 
>>>> trace("%d acks satisfied for %s-%d with acks = -1".format(numAcks, topic, partitionId))
>>>> 
>>>> val minIsr = leaderReplica.log.get.config.minInSyncReplicas
Re: Can Kafka 0.9 guarantee no data loss?
OK, the example before was not enough to expose the problem.
What happens when numAcks is 1 and curInSyncReplicas.size >= minIsr, but in fact only one replica in curInSyncReplicas has caught up to the leader, and that replica is the leader itself? This is not safe when the machine hosting the leader partition's broker is restarted.

The current code is:

if (minIsr <= curInSyncReplicas.size) {
  (true, ErrorMapping.NoError)
} else {
  (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
}

Why not:

if (minIsr <= curInSyncReplicas.size && minIsr <= numAcks) {
  (true, ErrorMapping.NoError)
} else {
  (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
}

It seems that a single condition in the broker's code is not enough to be safe, because the replicas in curInSyncReplicas are not strongly synchronized.

> 在 2016年9月23日,下午1:45,Becket Qin <becket@gmail.com> 写道:
> 
> In order to satisfy a produce response, there are two conditions:
> A. The leader's high watermark should be higher than the requiredOffset
> (the max offset in that produce request for that partition).
> B. The number of in-sync replicas is greater than min.isr.
> 
> The ultimate goal here is to make sure at least min.isr replicas have
> caught up to requiredOffset. So the check is not only whether we have
> enough replicas in the ISR, but also whether the replicas in the ISR have
> caught up to the required offset.
> 
> In your example, if numAcks is 0 and curInSyncReplicas.size >= minIsr, the
> produce response won't return if min.isr > 0, because
> leaderReplica.highWatermark must be less than requiredOffset given that
> numAcks is 0, i.e. condition A is not met.
> 
> We are actually doing a stronger-than-necessary check here.
> Theoretically, as long as min.isr replicas have caught up to
> requiredOffset we could return the response, but we also require those
> replicas to be in the ISR.
> 
> On Thu, Sep 22, 2016 at 8:15 PM, Kafka <kafka...@126.com> wrote:
> 
>> @wangguozhang, could you give me some advice?
>> 
>>> 在 2016年9月22日,下午6:56,Kafka <kafka...@126.com> 写道:
>>> 
>>> Hi all,
>>> In terms of the topic, we created a topic with 6 partitions, each with 3 replicas.
>>> In terms of the producer, we send messages with acks=-1 using the sync interface.
>>> In terms of the brokers, we set min.insync.replicas to 2.
>>> 
>>> After reviewing the Kafka broker's code, we know that when we send a
>>> message with acks=-1, we get a response once the ISR of the partition is
>>> greater than or equal to min.insync.replicas. What confuses me is that
>>> the replicas in the ISR are not strongly consistent: in Kafka 0.9 the
>>> replica.lag.time.max.ms parameter decides when to shrink the ISR, so a
>>> replica in the ISR can lag the leader by up to that long. If we restart
>>> the broker that owns the partition's leader, the controller starts a new
>>> leader election, which chooses the first replica in the ISR that is not
>>> the current leader as the new leader, and this can lose data.
>>> 
>>> The main produce-handling code is shown below:
>>> val numAcks = curInSyncReplicas.count(r => {
>>>   if (!r.isLocal)
>>>     if (r.logEndOffset.messageOffset >= requiredOffset) {
>>>       trace("Replica %d of %s-%d received offset %d".format(r.brokerId, topic, partitionId, requiredOffset))
>>>       true
>>>     }
>>>     else
>>>       false
>>>   else
>>>     true /* also count the local (leader) replica */
>>> })
>>> 
>>> trace("%d acks satisfied for %s-%d with acks = -1".format(numAcks, topic, partitionId))
>>> 
>>> val minIsr = leaderReplica.log.get.config.minInSyncReplicas
>>> 
>>> if (leaderReplica.highWatermark.messageOffset >= requiredOffset) {
>>>   /*
>>>    * The topic may be configured not to accept messages if there are
>>>    * not enough replicas in ISR; in this scenario the request was already
>>>    * appended locally and then added to the purgatory before the ISR was shrunk
>>>    */
>>>   if (minIsr <= curInSyncReplicas.size) {
>>>     (true, ErrorMapping.NoError)
>>>   } else {
>>>     (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
>>>   }
>>> } else
>>>   (false, ErrorMapping.NoError)
>>> 
>>> Why is numAcks only logged and never compared with minIsr? If numAcks is 0
>>> but curInSyncReplicas.size >= minIsr, this will return; since the ISR
>>> shrink procedure is not real-time, can this lose data after a leader
>>> election?
>>> 
>>> Feedback is greatly appreciated. Thanks.
>>> meituan.inf
Re: Can Kafka 0.9 guarantee no data loss?
@wangguozhang, could you give me some advice?

> 在 2016年9月22日,下午6:56,Kafka <kafka...@126.com> 写道:
> 
> Hi all,
> In terms of the topic, we created a topic with 6 partitions, each with 3 replicas.
> In terms of the producer, we send messages with acks=-1 using the sync interface.
> In terms of the brokers, we set min.insync.replicas to 2.
> 
> After reviewing the Kafka broker's code, we know that when we send a
> message with acks=-1, we get a response once the ISR of the partition is
> greater than or equal to min.insync.replicas. What confuses me is that
> the replicas in the ISR are not strongly consistent: in Kafka 0.9 the
> replica.lag.time.max.ms parameter decides when to shrink the ISR, so a
> replica in the ISR can lag the leader by up to that long. If we restart
> the broker that owns the partition's leader, the controller starts a new
> leader election, which chooses the first replica in the ISR that is not
> the current leader as the new leader, and this can lose data.
> 
> The main produce-handling code is shown below:
> val numAcks = curInSyncReplicas.count(r => {
>   if (!r.isLocal)
>     if (r.logEndOffset.messageOffset >= requiredOffset) {
>       trace("Replica %d of %s-%d received offset %d".format(r.brokerId, topic, partitionId, requiredOffset))
>       true
>     }
>     else
>       false
>   else
>     true /* also count the local (leader) replica */
> })
> 
> trace("%d acks satisfied for %s-%d with acks = -1".format(numAcks, topic, partitionId))
> 
> val minIsr = leaderReplica.log.get.config.minInSyncReplicas
> 
> if (leaderReplica.highWatermark.messageOffset >= requiredOffset) {
>   /*
>    * The topic may be configured not to accept messages if there are
>    * not enough replicas in ISR; in this scenario the request was already
>    * appended locally and then added to the purgatory before the ISR was shrunk
>    */
>   if (minIsr <= curInSyncReplicas.size) {
>     (true, ErrorMapping.NoError)
>   } else {
>     (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
>   }
> } else
>   (false, ErrorMapping.NoError)
> 
> Why is numAcks only logged and never compared with minIsr? If numAcks is 0
> but curInSyncReplicas.size >= minIsr, this will return; since the ISR
> shrink procedure is not real-time, can this lose data after a leader
> election?
> 
> Feedback is greatly appreciated. Thanks.
> meituan.inf
Can Kafka 0.9 guarantee no data loss?
Hi all,
In terms of the topic, we created a topic with 6 partitions, each with 3 replicas.
In terms of the producer, we send messages with acks=-1 using the sync interface.
In terms of the brokers, we set min.insync.replicas to 2.

After reviewing the Kafka broker's code, we know that when we send a message with acks=-1, we get a response once the ISR of the partition is greater than or equal to min.insync.replicas. What confuses me is that the replicas in the ISR are not strongly consistent: in Kafka 0.9 the replica.lag.time.max.ms parameter decides when to shrink the ISR, so a replica in the ISR can lag the leader by up to that long. If we restart the broker that owns the partition's leader, the controller starts a new leader election, which chooses the first replica in the ISR that is not the current leader as the new leader, and this can lose data.

The main produce-handling code is shown below:

val numAcks = curInSyncReplicas.count(r => {
  if (!r.isLocal)
    if (r.logEndOffset.messageOffset >= requiredOffset) {
      trace("Replica %d of %s-%d received offset %d".format(r.brokerId, topic, partitionId, requiredOffset))
      true
    }
    else
      false
  else
    true /* also count the local (leader) replica */
})

trace("%d acks satisfied for %s-%d with acks = -1".format(numAcks, topic, partitionId))

val minIsr = leaderReplica.log.get.config.minInSyncReplicas

if (leaderReplica.highWatermark.messageOffset >= requiredOffset) {
  /*
   * The topic may be configured not to accept messages if there are
   * not enough replicas in ISR; in this scenario the request was already
   * appended locally and then added to the purgatory before the ISR was shrunk
   */
  if (minIsr <= curInSyncReplicas.size) {
    (true, ErrorMapping.NoError)
  } else {
    (true, ErrorMapping.NotEnoughReplicasAfterAppendCode)
  }
} else
  (false, ErrorMapping.NoError)

Why is numAcks only logged and never compared with minIsr? If numAcks is 0 but curInSyncReplicas.size >= minIsr, this will return; since the ISR shrink procedure is not real-time, can this lose data after a leader election?

Feedback is greatly appreciated. Thanks.
meituan.inf
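To make the setup concrete, here is a minimal sketch of the producer side described above (broker address, topic name, key, and value are placeholders; min.insync.replicas=2 is assumed to be set as a topic- or broker-level config):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Sketch: acks=-1 ("all") makes the broker wait for the ISR before
// acknowledging, which is the scenario the thread is analyzing.
object AcksAllSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // placeholder
    props.put("acks", "-1")                        // equivalent to "all"
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // send() returns a Future; get() makes the send synchronous, so a
    // NotEnoughReplicas error surfaces here as an exception.
    producer.send(new ProducerRecord[String, String]("test-topic", "key", "value")).get()
    producer.close()
  }
}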
Re: Compacted topic cannot accept message without key
Thanks for your answer. I understand the necessity of keys for compacted topics, but as you know, __consumer_offsets is an internal compacted topic in Kafka whose key is a triple of <group id, topic, partition>, and these errors occur when the consumer client commits group offsets. So why does this happen?

> 在 2016年7月19日,上午1:27,Dustin Cote <dus...@confluent.io> 写道:
> 
> Compacted topics require keyed messages in order for compaction to
> function. The solution is to provide a key for your messages. I would
> suggest reading the wiki on log compaction.
> <https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction>
> 
> On Mon, Jul 18, 2016 at 12:03 PM, Kafka <kafka...@126.com> wrote:
> 
>> Hi,
>> The server log shows the error below on broker 0.9.0.
>> ERROR [Replica Manager on Broker 0]: Error processing append
>> operation on partition [__consumer_offsets,5] (kafka.server.ReplicaManager)
>> kafka.message.InvalidMessageException: Compacted topic cannot accept
>> message without key.
>> 
>> Why does this happen and what's the solution?
> 
> -- 
> Dustin Cote
> confluent.io
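For user-created compacted topics (not the internal one, where the broker builds the keys itself), the fix Dustin describes is simply to set a key on every record. A minimal sketch, with placeholder broker address, topic, key, and value:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Compaction keeps the latest value per key, so every record needs a key;
// a null key on a compacted topic is what triggers InvalidMessageException.
object KeyedProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // placeholder
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // The three-argument constructor sets the key explicitly.
    producer.send(new ProducerRecord[String, String]("compacted-topic", "user-42", "latest-state"))
    producer.close()
  }
}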
Compacted topic cannot accept message without key
Hi,
The server log shows the error below on broker 0.9.0:

ERROR [Replica Manager on Broker 0]: Error processing append operation on partition [__consumer_offsets,5] (kafka.server.ReplicaManager)
kafka.message.InvalidMessageException: Compacted topic cannot accept message without key.

Why does this happen, and what is the solution?
__consumer_offsets leader is -1
Hi, for __consumer_offsets, partition 7 and partition 27 have leader -1 and an empty ISR. Can anyone tell me how to recover them? Thank you.

Topic: __consumer_offsets  Partition: 0   Leader: 3   Replicas: 3,4,5  Isr: 4,5,3
Topic: __consumer_offsets  Partition: 1   Leader: 4   Replicas: 4,5,0  Isr: 0,4,5
Topic: __consumer_offsets  Partition: 2   Leader: 5   Replicas: 5,0,1  Isr: 0,5,1
Topic: __consumer_offsets  Partition: 3   Leader: 0   Replicas: 0,1,3  Isr: 0,1,3
Topic: __consumer_offsets  Partition: 4   Leader: 1   Replicas: 1,3,4  Isr: 4,1,3
Topic: __consumer_offsets  Partition: 5   Leader: 3   Replicas: 3,5,0  Isr: 5,3,0
Topic: __consumer_offsets  Partition: 6   Leader: 4   Replicas: 4,0,1  Isr: 0,4,1
Topic: __consumer_offsets  Partition: 7   Leader: -1  Replicas: 5,1,3  Isr:
Topic: __consumer_offsets  Partition: 8   Leader: 0   Replicas: 0,3,4  Isr: 0,4,3
Topic: __consumer_offsets  Partition: 9   Leader: 1   Replicas: 1,4,5  Isr: 4,5,1
Topic: __consumer_offsets  Partition: 10  Leader: 3   Replicas: 3,0,1  Isr: 0,1,3
Topic: __consumer_offsets  Partition: 11  Leader: 4   Replicas: 4,1,3  Isr: 4,1,3
Topic: __consumer_offsets  Partition: 12  Leader: 5   Replicas: 5,3,4  Isr: 4,5,3
Topic: __consumer_offsets  Partition: 13  Leader: 0   Replicas: 0,4,5  Isr: 0,4,5
Topic: __consumer_offsets  Partition: 14  Leader: 1   Replicas: 1,5,0  Isr: 0,5,1
Topic: __consumer_offsets  Partition: 15  Leader: 3   Replicas: 3,1,4  Isr: 1,4,3
Topic: __consumer_offsets  Partition: 16  Leader: 4   Replicas: 4,3,5  Isr: 4,5,3
Topic: __consumer_offsets  Partition: 17  Leader: 5   Replicas: 5,4,0  Isr: 0,4,5
Topic: __consumer_offsets  Partition: 18  Leader: 0   Replicas: 0,5,1  Isr: 0,5,1
Topic: __consumer_offsets  Partition: 19  Leader: 1   Replicas: 1,0,3  Isr: 0,1,3
Topic: __consumer_offsets  Partition: 20  Leader: 3   Replicas: 3,4,5  Isr: 4,5,3
Topic: __consumer_offsets  Partition: 21  Leader: 4   Replicas: 4,5,0  Isr: 0,4,5
Topic: __consumer_offsets  Partition: 22  Leader: 5   Replicas: 5,0,1  Isr: 0,5,1
Topic: __consumer_offsets  Partition: 23  Leader: 0   Replicas: 0,1,3  Isr: 0,1,3
Topic: __consumer_offsets  Partition: 24  Leader: 1   Replicas: 1,3,4  Isr: 4,1,3
Topic: __consumer_offsets  Partition: 25  Leader: 3   Replicas: 3,5,0  Isr: 5,3,0
Topic: __consumer_offsets  Partition: 26  Leader: 4   Replicas: 4,0,1  Isr: 0,4,1
Topic: __consumer_offsets  Partition: 27  Leader: -1  Replicas: 5,1,3  Isr:
Re: Delay of producer and consumer in Kafka 0.9 is too large to accept
Can someone please explain why the latency is so high for me? Thanks.

> 在 2016年6月25日,下午11:16,Jay Kreps <j...@confluent.io> 写道:
> 
> Can you sanity check this with the end-to-end latency test that ships with
> Kafka in the tools package?
> 
> https://apache.googlesource.com/kafka/+/1769642bb779921267bd57d3d338591dbdf33842/core/src/main/scala/kafka/tools/TestEndToEndLatency.scala
> 
> On Saturday, June 25, 2016, Kafka <kafka...@126.com> wrote:
> 
>> Hi all,
>> My Kafka cluster is composed of three brokers, each with an 8-core CPU,
>> 8 GB of memory, and a 1 Gb network card.
>> With the Java async client, I sent 100 messages of 1024 bytes each, with
>> a 20 us gap between sends. The consumer config is: fetch.min.bytes is set
>> to 1 and fetch.wait.max.ms is set to 100.
>> To avoid clock discrepancies between two machines, I start the producer
>> and consumer on the same machine, which has enough resources for both
>> clients.
>> 
>> I start the consumer before the producer on each test. Each message
>> carries its sending timestamp, so when the consumer receives a message I
>> can compute the consumer delay by subtracting the sending timestamp from
>> the current timestamp.
>> With acks=0 and 2 replicas, the average producer delay is 2.98 ms and the
>> average consumer delay is 52.23 ms.
>> With acks=1 and 2 replicas, the average producer delay is 3.9 ms and the
>> average consumer delay is 44.88 ms.
>> With acks=-1 and 2 replicas, the average producer delay is 1782 ms and
>> the average consumer delay is 1786 ms.
>> 
>> I have two doubts. First, why is my consumer delay with acks=0 larger
>> than with acks=1? Second, why are the producer and consumer delays so
>> large when acks=-1? That delay cannot be accepted, and I found it grows
>> as more messages are sent.
>> 
>> Any feedback is appreciated.
>> Thanks
Delay of producer and consumer in Kafka 0.9 is too large to accept
Hi all,
My Kafka cluster is composed of three brokers, each with an 8-core CPU, 8 GB of memory, and a 1 Gb network card.

With the Java async client, I sent 100 messages of 1024 bytes each, with a 20 us gap between sends. The consumer config is: fetch.min.bytes is set to 1 and fetch.wait.max.ms is set to 100. To avoid clock discrepancies between two machines, I start the producer and consumer on the same machine, which has enough resources for both clients.

I start the consumer before the producer on each test. Each message carries its sending timestamp, so when the consumer receives a message I can compute the consumer delay by subtracting the sending timestamp from the current timestamp.

With acks=0 and 2 replicas, the average producer delay is 2.98 ms and the average consumer delay is 52.23 ms.
With acks=1 and 2 replicas, the average producer delay is 3.9 ms and the average consumer delay is 44.88 ms.
With acks=-1 and 2 replicas, the average producer delay is 1782 ms and the average consumer delay is 1786 ms.

I have two doubts. First, why is my consumer delay with acks=0 larger than with acks=1? Second, why are the producer and consumer delays so large when acks=-1? That delay cannot be accepted, and I found it grows as more messages are sent.

Any feedback is appreciated.
Thanks
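In case it helps others reproduce the numbers, a minimal sketch of the measurement (broker address and topic name are placeholders; the real test also pauses 20 us between sends, omitted here):

import java.util.{Arrays, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Each message carries its send timestamp as the payload; the consumer
// subtracts it from its own clock on receipt. Both clients run on the
// same machine, so clock skew between hosts is not a concern.
object LatencySketch {
  val Topic = "latency-test" // placeholder

  def produce(n: Int): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // placeholder
    props.put("acks", "1")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)
    for (_ <- 1 to n) {
      // the payload is just the send timestamp in milliseconds
      producer.send(new ProducerRecord[String, String](Topic, System.currentTimeMillis.toString))
    }
    producer.close()
  }

  def consume(): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // placeholder
    props.put("group.id", "latency-test")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    val consumer = new KafkaConsumer[String, String](props)
    consumer.subscribe(Arrays.asList(Topic))
    while (true) {
      val it = consumer.poll(100).iterator()
      while (it.hasNext) {
        // consumer delay = receive time minus embedded send time
        val delayMs = System.currentTimeMillis - it.next().value.toLong
        println(s"end-to-end delay: $delayMs ms")
      }
    }
  }
}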
Re: test of producer's delay and consumer's delay
To Christian Posta: I have taken the interpretation of time into account. My producer and consumer are deployed on the same machine, and the machine's configuration is very good, so it will not be the bottleneck.

> 在 2016年6月18日,下午10:29,Kafka <kafka...@126.com> 写道:
> 
> I send every message with a timestamp, and when I receive a message I
> subtract the message's timestamp from the current timestamp. That gives me
> the consumer's delay.
> 
>> 在 2016年6月18日,上午11:28,Kafka <kafka...@126.com> 写道:
>> 
>> 
Re: test of producer's delay and consumer's delay
I send every message with a timestamp, and when I receive a message I subtract the message's timestamp from the current timestamp. That gives me the consumer's delay.

> 在 2016年6月18日,上午11:28,Kafka <kafka...@126.com> 写道:
> 
> 
test of producer's delay and consumer's delay
Hello, I have run a series of tests on Kafka 0.9.0, and one of the results confused me.

Test environment:
kafka cluster: 3 brokers, each with 8-core CPU / 8 GB memory / 1 Gb network card
client: 4-core CPU / 4 GB memory
topic: 6 partitions, 2 replicas
total messages: 1
single message size: 1024 bytes
fetch.min.bytes: 1
fetch.wait.max.ms: 100 ms

All send tests use the Scala sync interface.
When I set acks to 0, the producer's delay is 0.3 ms and the consumer's delay is 7.7 ms.
When I set acks to 1, the producer's delay is 1.6 ms and the consumer's delay is 3.7 ms.
When I set acks to -1, the producer's delay is 3.5 ms and the consumer's delay is 4.2 ms.

But why does the consumer's delay decrease when I change acks from 0 to 1? That confuses me.
kafka 0.8.2.1 High CPU load
Hi, we are using kafka_2.10-0.8.2.1, and our VM config is 4 cores, 8 GB of memory. Our cluster consists of three brokers with the default broker config and 2 replicas. Our brokers' load is often very high once in a while: greater than 1.5 per core on average. We have about 70 topics on this cluster.

With top we see 280% CPU utilization; with jstack I found four threads that use the most CPU:

"kafka-network-thread-9092-0" prio=10 tid=0x7f46c8709000 nid=0x35dd runnable [0x7f46b73f2000]
   java.lang.Thread.State: RUNNABLE
"kafka-network-thread-9092-1" prio=10 tid=0x7f46c873c000 nid=0x35de runnable [0x7f46b75f4000]
   java.lang.Thread.State: RUNNABLE
"kafka-network-thread-9092-2" prio=10 tid=0x7f46c8756000 nid=0x35df runnable [0x7f46b7cfb000]
   java.lang.Thread.State: RUNNABLE
"kafka-network-thread-9092-3" prio=10 tid=0x7f46c876f800 nid=0x35e0 runnable [0x7f46b5adb000]
   java.lang.Thread.State: RUNNABLE

I found one issue that concerns this: https://issues.apache.org/jira/browse/KAFKA-493. But I don't think it fits my case, because we have other clusters deployed on VMs with the same configuration, and they do not show this behavior. The broker config is the same, except that we added:

replica.fetch.wait.max.ms=100
num.replica.fetchers=2

Could this be the main reason, or is there another cause of the high CPU load?