Re: BUG: eos KeyValueStore::delete() in Punctuator

2022-12-05 Thread Colt McNealy
I re-compiled with the current `trunk` branch and the bug was fixed. Thank
you for pointing that out, Matthias, and sorry for the false alarm!

Cheers,
Colt McNealy
*Founder, LittleHorse.io*


On Mon, Dec 5, 2022 at 7:42 PM Matthias J. Sax  wrote:

> Thanks for reporting this issue.
>
> It might have been fixed via
> https://issues.apache.org/jira/browse/KAFKA-14294 already.
>
>
> -Matthias
>
>
>
> On 12/3/22 7:05 PM, Colt McNealy wrote:
> > Hi all,
> >
> > I believe I've found a bug in Kafka Streams when:
> >
> > - Running an app in EOS
> > - Calling KeyValueStore::delete(...) on a nonexistent key
> > - This can cause a ProducerFencedException
> >
> > The expected behavior is that the call to delete() returns null (as per
> the
> > javadoc) and doesn't cause a ProducerFencedException.
> >
> > I've created a minimal reproducible example which reliably produces the
> > bug in my own environment, at this repository:
> >
> > https://github.com/littlehorse-eng/kafka-punctuator-fencing-issue
> >
> > Could someone please take a look and let me know if you can reliably
> > reproduce it on your end as well, and if so, how to file a bug?
> >
> > Thank you,
> > Colt McNealy
> > *Founder, LittleHorse.io*
> >
>


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1407

2022-12-05 Thread Apache Jenkins Server
See 




Re: BUG: eos KeyValueStore::delete() in Punctuator

2022-12-05 Thread Matthias J. Sax

Thanks for reporting this issue.

It might have been fixed via 
https://issues.apache.org/jira/browse/KAFKA-14294 already.



-Matthias



On 12/3/22 7:05 PM, Colt McNealy wrote:

Hi all,

I believe I've found a bug in Kafka Streams when:

- Running an app in EOS
- Calling KeyValueStore::delete(...) on a nonexistent key
- This can cause a ProducerFencedException

The expected behavior is that the call to delete() returns null (as per the
javadoc) and doesn't cause a ProducerFencedException.

I've created a minimal reproducible example which reliably produces the
bug in my own environment, at this repository:

https://github.com/littlehorse-eng/kafka-punctuator-fencing-issue

Could someone please take a look and let me know if you can reliably
reproduce it on your end as well, and if so, how to file a bug?

Thank you,
Colt McNealy
*Founder, LittleHorse.io*
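For anyone skimming the thread, the contract at issue can be shown with a
self-contained stand-in (this is not the real
`org.apache.kafka.streams.state.KeyValueStore` interface, just a sketch of the
documented behavior that the bug violated under EOS):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the KeyValueStore contract under discussion,
// not the real org.apache.kafka.streams.state.KeyValueStore interface.
// Per the Streams javadoc, delete(key) returns the previous value, or
// null if the key was absent; it should never throw for a missing key.
class DemoStore<K, V> {
    private final Map<K, V> map = new HashMap<>();

    V put(K key, V value) {
        return map.put(key, value);
    }

    V delete(K key) {
        // Deleting a nonexistent key is a no-op that returns null.
        return map.remove(key);
    }
}

class DeleteContractDemo {
    public static void main(String[] args) {
        DemoStore<String, String> store = new DemoStore<>();
        store.put("present", "v1");
        System.out.println(store.delete("present")); // v1
        System.out.println(store.delete("missing")); // null
    }
}
```

The reported bug was that the second kind of call, when made from a Punctuator
with EOS enabled, could surface a ProducerFencedException instead; per the
reference to KAFKA-14294 above, this appears to be fixed on trunk.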



[jira] [Resolved] (KAFKA-14440) Local state wipeout with EOS

2022-12-05 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-14440.
-
Resolution: Duplicate

> Local state wipeout with EOS
> 
>
> Key: KAFKA-14440
> URL: https://issues.apache.org/jira/browse/KAFKA-14440
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.2.3
>Reporter: Abdullah alkhawatrah
>Priority: Major
> Attachments: Screenshot 2022-12-02 at 09.26.27.png
>
>
> Hey,
> I have a Kafka Streams service that aggregates events from multiple input 
> topics (running in a k8s cluster). The topology has multiple FKJs. The input 
> topics had around 7 billion events when the service was started from 
> `earliest`.
> The service has EOS enabled and 
> {code:java}
> transaction.timeout.ms: 60{code}
> The problem I am having is with frequent local state wipe-outs, which lead to 
> very long rebalances. As can be seen from the attached images, local disk 
> sizes go to ~0 very often. These wipe-outs are part of the EOS guarantee, 
> based on this log message: 
> {code:java}
> State store transfer-store did not find checkpoint offsets while stores are 
> not empty, since under EOS it has the risk of getting uncommitted data in 
> stores we have to treat it as a task corruption error and wipe out the local 
> state of task 1_8 before re-bootstrapping{code}
>  
> I noticed that this happens as a result of one of the following:
>  * Process gets SIGKILL when running out of memory or on failure to shut down 
> gracefully (on pod rotation, for example). This explains the missing local 
> checkpoint file, but I thought local checkpoint updates were frequent, so I 
> expected part of the state to be reset, but not the whole local state.
>  * Although we have a long transaction timeout config, the following appears 
> many times in the logs, after which Kafka Streams enters an error state. On 
> startup, the local checkpoint file is not found:
> {code:java}
> Transiting to abortable error state due to 
> org.apache.kafka.common.errors.InvalidProducerEpochException: Producer 
> attempted to produce with an old epoch.{code}
> The service has 10 instances, all showing the same behaviour. The issue 
> disappears when EOS is disabled.
> The Kafka cluster runs Kafka 2.6, with a minimum ISR of 3.
>  
>  
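For context, these are the Streams-level settings involved (the property names
are real Kafka Streams/producer configs, but the values here are illustrative
assumptions, not a recommendation or a fix):

```properties
# EOS; the reporter notes the wipe-outs disappear when this is disabled
processing.guarantee=exactly_once_v2
# should comfortably exceed commit.interval.ms plus worst-case processing time
transaction.timeout.ms=60000
```

As far as I understand, under EOS the local checkpoint file is only written on
a clean close, so a SIGKILL leaves no checkpoint behind and forces a
wipe-and-restore of the task's entire local state, which matches the quoted
log message.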



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-893: The Kafka protocol should support nullable structs

2022-12-05 Thread deng ziming
+1 (binding)

--
Thanks,
Ziming


> On Dec 6, 2022, at 10:48, John Roesler  wrote:
> 
> +1 (binding)
> 
> Thanks,
> -John
> 
> On Mon, Dec 5, 2022, at 16:57, Kirk True wrote:
>> +1 (non-binding)
>> 
>> On Mon, Dec 5, 2022, at 10:05 AM, Colin McCabe wrote:
>>> +1 (binding)
>>> 
>>> best,
>>> Colin
>>> 
>>> On Mon, Dec 5, 2022, at 10:03, David Jacot wrote:
 Hi all,
 
 As this KIP-893 is trivial and non-controversial, I would like to
 start the vote on it. The KIP is here:
 https://cwiki.apache.org/confluence/x/YJIODg
 
 Thanks,
 David
>>> 



Re: [VOTE] KIP-893: The Kafka protocol should support nullable structs

2022-12-05 Thread John Roesler
+1 (binding)

Thanks,
-John

On Mon, Dec 5, 2022, at 16:57, Kirk True wrote:
> +1 (non-binding)
>
> On Mon, Dec 5, 2022, at 10:05 AM, Colin McCabe wrote:
>> +1 (binding)
>> 
>> best,
>> Colin
>> 
>> On Mon, Dec 5, 2022, at 10:03, David Jacot wrote:
>> > Hi all,
>> >
>> > As this KIP-893 is trivial and non-controversial, I would like to
>> > start the vote on it. The KIP is here:
>> > https://cwiki.apache.org/confluence/x/YJIODg
>> >
>> > Thanks,
>> > David
>>


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1406

2022-12-05 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-14444) Simplify user experience of customizing partitioning strategy in Streams

2022-12-05 Thread A. Sophie Blee-Goldman (Jira)
A. Sophie Blee-Goldman created KAFKA-14444:
--

 Summary: Simplify user experience of customizing partitioning 
strategy in Streams
 Key: KAFKA-14444
 URL: https://issues.apache.org/jira/browse/KAFKA-14444
 Project: Kafka
  Issue Type: New Feature
  Components: streams
Reporter: A. Sophie Blee-Goldman


The current process of plugging a custom partitioning scheme into a Streams 
application is fairly involved and extremely error prone. While defining their 
topology, users must pay close attention to every operator/node that may be 
connected to, or may create, a topic that will be produced to; otherwise they 
must print out their topology description and try to locate all sink nodes 
that way. If they miss passing their custom partitioner to one or more such 
locations in the topology, everything downstream will be affected by the 
inconsistent/unintended partitioning scheme.

It can also be easy for users to miss this process entirely and try to 
customize the partitioning scheme via the producer config. This does not work, 
and unfortunately results in a runtime exception that's difficult for users to 
interpret. Ideally we would provide a similar config for Streams where users 
could define a default implementation of the StreamPartitioner interface.

...unfortunately, this is not so straightforward. Unlike the case of the 
Producer config, where there is a clearly defined key and value type, there's 
no guarantee each sink node requiring the custom partitioner deals with the 
same key/value type as the others.

We could utilize the default.key/value configs for this, and only require users 
to plug in their partitioner where the key/value types differ from the default, 
but this would likely limit the usefulness of a default partitioner 
significantly. We could push this to the user to write a generic implementation 
class with type checking and handling, but this would be pretty awkward and 
error prone as well.

Either way this will take some thought, which is why the idea was pulled from 
the proposal in KIP-878 and left for a follow-up KIP.
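To make the pain point concrete, here is a sketch of a custom partitioner. The
interface below is a stand-in with the same shape as Streams'
`StreamPartitioner`, so the snippet compiles without the kafka-streams
dependency; all names and the prefix-based scheme are illustrative only:

```java
import java.util.Objects;

// Stand-in mirroring the shape of Streams' StreamPartitioner interface
// (org.apache.kafka.streams.processor.StreamPartitioner); not the real type.
interface Partitioner<K, V> {
    Integer partition(String topic, K key, V value, int numPartitions);
}

class CustomPartitionerSketch {
    public static void main(String[] args) {
        // A custom scheme bound to a concrete key type. The ticket's point:
        // this must be passed to *every* sink in the topology, or downstream
        // partitioning diverges; and a single default partitioner can't
        // easily cover sinks whose key/value types differ.
        Partitioner<String, Object> byPrefix =
            (topic, key, value, numPartitions) ->
                Math.abs(Objects.hash(key.substring(0, 2))) % numPartitions;

        // Same key prefix -> same partition, regardless of the full key.
        System.out.println(byPrefix.partition("orders", "EU-1234", null, 8));
        System.out.println(byPrefix.partition("orders", "EU-9999", null, 8));
    }
}
```

Both calls print the same partition number, since only the two-character
prefix feeds the hash.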



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1405

2022-12-05 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 477860 lines...]
[2022-12-05T16:34:37.477Z] 
[2022-12-05T16:34:37.477Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithNoCommittedOffsetsWithGlobalAutoOffsetResetLatest()
 PASSED
[2022-12-05T16:34:37.477Z] 
[2022-12-05T16:34:37.477Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldThrowExceptionOverlappingPattern() STARTED
[2022-12-05T16:34:37.477Z] 
[2022-12-05T16:34:37.477Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldThrowExceptionOverlappingPattern() PASSED
[2022-12-05T16:34:37.477Z] 
[2022-12-05T16:34:37.477Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldThrowExceptionOverlappingTopic() STARTED
[2022-12-05T16:34:37.477Z] 
[2022-12-05T16:34:37.477Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldThrowExceptionOverlappingTopic() PASSED
[2022-12-05T16:34:37.477Z] 
[2022-12-05T16:34:37.477Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithInvalidCommittedOffsets() STARTED
[2022-12-05T16:34:54.606Z] 
[2022-12-05T16:34:54.606Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithInvalidCommittedOffsets() PASSED
[2022-12-05T16:34:54.606Z] 
[2022-12-05T16:34:54.606Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithNoCommittedOffsetsWithDefaultGlobalAutoOffsetResetEarliest()
 STARTED
[2022-12-05T16:35:06.951Z] 
[2022-12-05T16:35:06.951Z] > Task :core:integrationTest
[2022-12-05T16:35:06.951Z] 
[2022-12-05T16:35:06.951Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 169 > ReassignPartitionsIntegrationTest > 
testAlterReassignmentThrottle(String) > 
kafka.admin.ReassignPartitionsIntegrationTest.testAlterReassignmentThrottle(String)[2]
 PASSED
[2022-12-05T16:35:06.951Z] 
[2022-12-05T16:35:06.951Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 169 > TransactionsTest > testFailureToFenceEpoch(String) > 
kafka.api.TransactionsTest.testFailureToFenceEpoch(String)[1] STARTED
[2022-12-05T16:35:06.951Z] 
[2022-12-05T16:35:06.951Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 170 > TransactionsExpirationTest > 
testTransactionAfterProducerIdExpires(String) > 
kafka.api.TransactionsExpirationTest.testTransactionAfterProducerIdExpires(String)[2]
 PASSED
[2022-12-05T16:35:29.473Z] 
[2022-12-05T16:35:29.473Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 169 > TransactionsTest > testFailureToFenceEpoch(String) > 
kafka.api.TransactionsTest.testFailureToFenceEpoch(String)[1] PASSED
[2022-12-05T16:35:29.473Z] 
[2022-12-05T16:35:29.473Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 169 > TransactionsTest > testFailureToFenceEpoch(String) > 
kafka.api.TransactionsTest.testFailureToFenceEpoch(String)[2] STARTED
[2022-12-05T16:35:37.953Z] 
[2022-12-05T16:35:37.953Z] > Task :streams:integrationTest
[2022-12-05T16:35:37.953Z] 
[2022-12-05T16:35:37.953Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithNoCommittedOffsetsWithDefaultGlobalAutoOffsetResetEarliest()
 PASSED
[2022-12-05T16:35:37.953Z] 
[2022-12-05T16:35:37.953Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldThrowStreamsExceptionNoResetSpecified() STARTED
[2022-12-05T16:35:39.049Z] 
[2022-12-05T16:35:39.049Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > FineGrainedAutoResetIntegrationTest > 
shouldThrowStreamsExceptionNoResetSpecified() PASSED
[2022-12-05T16:35:42.600Z] 
[2022-12-05T16:35:42.600Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > GlobalKTableIntegrationTest > 
shouldGetToRunningWithOnlyGlobalTopology() STARTED
[2022-12-05T16:35:43.529Z] 
[2022-12-05T16:35:43.529Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > GlobalKTableIntegrationTest > 
shouldGetToRunningWithOnlyGlobalTopology() PASSED
[2022-12-05T16:35:43.529Z] 
[2022-12-05T16:35:43.529Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 168 > GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin() STARTED
[2022-12-05T16:35:48.106Z] 
[2022-12-05T16:35:48.106Z] Gradle Test Run 

[jira] [Resolved] (KAFKA-14398) Update EndToEndAuthorizerTest.scala to test with ZK and KRAFT quorum servers

2022-12-05 Thread Proven Provenzano (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Proven Provenzano resolved KAFKA-14398.
---
Resolution: Resolved

> Update EndToEndAuthorizerTest.scala to test with ZK and KRAFT quorum servers
> 
>
> Key: KAFKA-14398
> URL: https://issues.apache.org/jira/browse/KAFKA-14398
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft, unit tests
>Reporter: Proven Provenzano
>Assignee: Proven Provenzano
>Priority: Major
> Fix For: 3.4.0
>
>
> KRAFT is a replacement for ZK for storing metadata.
> We should validate that ACLs work with KRAFT for the supported authentication 
> mechanisms. 
> I will update EndToEndAuthorizerTest.scala to test with ZK and KRAFT.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Apache Kafka 3.4.0 release

2022-12-05 Thread Greg Harris
Hi All,

Just notifying everyone of a regression introduced by KIP-787, currently
only present on trunk, but which may qualify as a blocker for the release.
It manifests as a moderate resource leak on MirrorMaker2 clusters. The fix
should have a small scope and low risk.

Here's the bug ticket: https://issues.apache.org/jira/browse/KAFKA-14443
Here's the tentative fix PR: https://github.com/apache/kafka/pull/12955

Thanks!
Greg

On Fri, Dec 2, 2022 at 8:06 AM David Jacot 
wrote:

> Hi Sophie,
>
> FYI - I just merged KIP-840
> (
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211884652
> )
> so it will be in 3.4.
>
> Best,
> David
>
> On Thu, Dec 1, 2022 at 3:01 AM Sophie Blee-Goldman
>  wrote:
> >
> > Hey all! It's officially *feature freeze for 3.4* so make sure you get
> that
> > feature work merged by the end of today.
> > After this point, only bug fixes and other work focused on stabilizing
> the
> > release should be merged to the release
> > branch. Also note that the *3.4 code freeze* will be in one week (*Dec
> 7th*)
> > so please make sure to stabilize and
> > thoroughly test any new features.
> >
> > I will wait until Friday to create the release branch to allow for any
> > existing PRs to be merged. After this point you'll
> > need to cherrypick any new commits to the 3.4 branch once a PR is merged.
> >
> > Finally, I've updated the list of KIPs targeted for 3.4. Please check out
> > the Planned KIP Content on the release
> > plan and let me know if there is anything missing or incorrect on there.
> >
> > Cheers,
> > Sophie
> >
> >
> > On Wed, Nov 30, 2022 at 12:29 PM David Arthur  wrote:
> >
> > > Sophie, KIP-866 has been accepted. Thanks!
> > >
> > > -David
> > >
> > > On Thu, Nov 17, 2022 at 12:21 AM Sophie Blee-Goldman
> > >  wrote:
> > > >
> > > > Thanks for the update Rajini, I've added this to the release page
> since
> > > it
> > > > looks like
> > > > it will pass but of course if anything changes, just let me know.
> > > >
> > > > David, I'm fine with aiming to include KIP-866 in the 3.4 release as
> well
> > > > since this
> > > > seems to be a critical part of the zookeeper removal/migration.
> Please
> > > let
> > > > me know
> > > > when it's been accepted
> > > >
> > > > On Wed, Nov 16, 2022 at 11:08 AM Rajini Sivaram <
> rajinisiva...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Sophie,
> > > > >
> > > > > KIP-881 has three binding votes (David Jacot, Jun and me) and one
> > > > > non-binding vote (Maulin). So it is good to go for 3.4.0 if there
> are
> > > no
> > > > > objections until the voting time of 72 hours completes on Friday.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Wed, Nov 16, 2022 at 3:15 PM David Arthur
> > > > >  wrote:
> > > > >
> > > > > > Sophie, the vote for KIP-866 is underway, but there is still some
> > > > > > discussion happening. I'm hopeful that the vote can close this
> week,
> > > but
> > > > > it
> > > > > > may fall into next week. Can we include this KIP in 3.4?
> > > > > >
> > > > > > Thanks,
> > > > > > David
> > > > > >
> > > > > > On Tue, Nov 15, 2022 at 6:52 AM Rajini Sivaram <
> > > rajinisiva...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Sophie,
> > > > > > >
> > > > > > > I was out of office and hence couldn't get voting started for
> > > KIP-881
> > > > > in
> > > > > > > time. I will start the vote for the KIP today. If there are
> > > sufficient
> > > > > > > votes by tomorrow (16th Nov), can we include this KIP in 3.4,
> even
> > > > > though
> > > > > > > voting will only complete on the 17th? It is a small KIP, so
> we can
> > > > > merge
> > > > > > > by feature freeze.
> > > > > > >
> > > > > > > Thank you,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Nov 10, 2022 at 4:02 PM Sophie Blee-Goldman
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > Hello again,
> > > > > > > >
> > > > > > > > This is a reminder that the KIP freeze deadline is
> approaching,
> > > all
> > > > > > KIPs
> > > > > > > > must be voted
> > > > > > > > and accepted by *next Wednesday* *(the 16th)*
> > > > > > > >
> > > > > > > > Keep in mind that to allow for the full voting period, this
> > > means you
> > > > > > > must
> > > > > > > > kick off the
> > > > > > > > vote for your KIP no later than* next Monday* (*the 14th*).
> > > > > > > >
> > > > > > > > The feature freeze deadline will be 2 weeks after this, so
> make
> > > sure
> > > > > to
> > > > > > > get
> > > > > > > > your KIPs in!
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Sophie
> > > > > > > >
> > > > > > > > On Tue, Oct 18, 2022 at 2:01 PM Sophie Blee-Goldman <
> > > > > > sop...@confluent.io
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey all,
> > > > > > > > >
> > > > > > > > > I've created the release page for 3.4.0 with the current
> plan,
> > > > > which
> > > > > > > you
> > > > > > > > > can find here:
> > > > > > > > >
> > > > > 

Re: Supported Kafka/Zookeeper Version with ELK 8.4.3

2022-12-05 Thread sunil chaudhari
Hi All,
While this question is better asked on discuss.elastic.co, I would like to
answer it here. I assume you need compatibility with Logstash.
You can go through this document for Logstash-Kafka integration:
https://www.elastic.co/guide/en/logstash/current/plugins-integrations-kafka.html

https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=73638194#content/view/73638194


Please feel free to contact me if you have any questions or concerns on
this matter.

Cheers,
Sunil Chaudhari.


On Mon, 5 Dec 2022 at 11:33 PM, Colin McCabe  wrote:

> Hi,
>
> Sorry, we do not develop ELK. In fact, I'm not sure what that acronym
> refers to. I would suggest checking in with support for that product /
> project, since it is not part of Apache Kafka.
>
> best,
> Colin
>
>
> On Fri, Oct 28, 2022, at 06:23, Kumar, Sudip wrote:
> > Hi Team,
> >
> > We are still waiting for a reply. Please update us; we must know what
> version of Kafka is compatible with ELK 8.4.
> >
> > Still, I can see that no one has replied on the user and dev community portals
> >
> >
> >
> >
> > Thanks
> > Sudip
> >
> >
> > *From:* Kumar, Sudip
> > *Sent:* Monday, October 17, 2022 5:23 PM
> > *To:* us...@kafka.apache.org; dev@kafka.apache.org
> > *Cc:* Rajendra Bangal, Nikhil ;
> Verma, Harshit ; Verma, Deepak Kumar <
> deepak-kumar.ve...@capgemini.com>; Arkal, Dinesh Balaji <
> dinesh-balaji.ar...@capgemini.com>; Saurabh, Shobhit <
> shobhit.saur...@capgemini.com>
> > *Subject:* Supported Kafka/Zookeeper Version with ELK 8.4.3
> > *Importance:* High
> >
> >
> > Hi Kafka Team,
> >
> > Currently we are planning to upgrade ELK from 7.16 to 8.4.3. In our
> ecosystem we use Kafka as middleware, ingesting data coming from different
> sources; a publisher (Logstash shipper) publishes data to different Kafka
> topics and a subscriber (Logstash indexer) consumes the data.
> >
> > We have an integration of ELK 7.16 with Kafka v2.5.1 and ZooKeeper
> 3.5.8. If we upgrade to ELK 8.4.3, please suggest which Kafka and
> ZooKeeper versions will be supported, and provide us with any helpful documents.
> >
> > Let me know if you any further questions.
> >
> > *Thanks*
> > *Sudip Kumar*
> > *Capgemini-India *
> >
> >
> > This message contains information that may be privileged or confidential
> and is the property of the Capgemini Group. It is intended only for the
> person to whom it is addressed. If you are not the intended recipient, you
> are not authorized to read, print, retain, copy, disseminate, distribute,
> or use this message or any part thereof. If you receive this message in
> error, please notify the sender immediately and delete all copies of this
> message.
>


[jira] [Created] (KAFKA-14443) Mirror Maker Connectors leak admin clients used for topic creation

2022-12-05 Thread Greg Harris (Jira)
Greg Harris created KAFKA-14443:
---

 Summary: Mirror Maker Connectors leak admin clients used for topic 
creation
 Key: KAFKA-14443
 URL: https://issues.apache.org/jira/browse/KAFKA-14443
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Reporter: Greg Harris
Assignee: Greg Harris


The MirrorMaker connectors are each responsible for creating internal topics.

For example, the Checkpoint connector creates a forwarding admin and passes it 
to a method to create the topic, but never closes the ForwardingAdmin or 
delegate objects: 
[https://github.com/apache/kafka/blob/13c9c78a1f4ad92023e8354069c6817b44c89ce6/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorCheckpointConnector.java#L161-L164]

Instead, this object should be intentionally closed when it is no longer 
needed, to prevent consuming resources in a running MM2 application.
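The fix pattern the ticket suggests is the usual try-with-resources one. The
class below is a stand-in for the real `ForwardingAdmin`; its methods and the
topic name are hypothetical, for illustration only:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for the ForwardingAdmin wrapper mentioned in the ticket; the
// real class delegates to Kafka Admin clients. What matters here is the
// pattern: the client that creates the internal topic must be closed.
class DemoForwardingAdmin implements AutoCloseable {
    final AtomicBoolean closed = new AtomicBoolean(false);

    void createTopic(String name) {
        // would delegate to Admin.createTopics(...) in the real class
        System.out.println("creating " + name);
    }

    @Override
    public void close() {
        closed.set(true);
    }
}

class TopicCreationDemo {
    public static void main(String[] args) {
        DemoForwardingAdmin admin = new DemoForwardingAdmin();
        // try-with-resources guarantees close() runs even if topic
        // creation throws, avoiding the leaked admin clients above.
        try (admin) {
            admin.createTopic("mm2-checkpoints.internal");
        }
        System.out.println(admin.closed.get()); // true
    }
}
```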



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-893: The Kafka protocol should support nullable structs

2022-12-05 Thread Colin McCabe
+1 (binding)

best,
Colin

On Mon, Dec 5, 2022, at 10:03, David Jacot wrote:
> Hi all,
>
> As this KIP-893 is trivial and non-controversial, I would like to
> start the vote on it. The KIP is here:
> https://cwiki.apache.org/confluence/x/YJIODg
>
> Thanks,
> David


Re: Ci stability

2022-12-05 Thread Dan S
Thanks Colin, I have a draft PR open which I occasionally check on and
disable the failing tests, I'll update it and see if it passes.

Thanks,

Daniel Scanteianu

On Mon, Dec 5, 2022, 18:02 Colin McCabe  wrote:

> FYI, there was a memory leak that affected some of the tests which was
> fixed recently, so hopefully stability will improve a bit. See KAFKA-14433
> for details.
>
> best,
> Colin
>
> On Thu, Nov 24, 2022, at 12:48, John Roesler wrote:
> > Hi Dan,
> >
> > I’m not sure if there’s a consistently used tag, but I’ve gotten good
> > mileage out of just searching for “flaky” or “flaky test” in Jira.
> >
> > If you’re thinking about filing a ticket for a specific test failure
> > you’ve seen, I’ve also usually been able to find out whether there’s
> > already a ticket by searching for the test class or method name.
> >
> > People seem to typically file tickets with “flaky” in the title and
> > then the test name.
> >
> > Thanks again for your interest in improving the situation!
> > -John
> >
> > On Thu, Nov 24, 2022, at 10:08, Dan S wrote:
> >> Thanks for the reply John! Is there a jira tag or view or something that
> >> can be used to find all the failing tests and maybe even try to fix them
> >> (even if fix just means extending a timeout)?
> >>
> >>
> >>
> >> On Thu, Nov 24, 2022, 16:03 John Roesler  wrote:
> >>
> >>> Hi Dan,
> >>>
> >>> Thanks for pointing this out. Flaky tests are a perennial problem. We
> >>> knock them out every now and then, but eventually more spring up.
> >>>
> >>> I’ve had some luck in the past filing Jira tickets for the failing
> tests
> >>> as they pop up in my PRs. Another thing that seems to motivate people
> is to
> >>> open a PR to disable the test in question, as you mention. That can be
> a
> >>> bit aggressive, though, so it wouldn’t be my first suggestion.
> >>>
> >>> I appreciate you bringing this up. I agree that flaky tests pose a
> risk to
> >>> the project because it makes it harder to know whether a PR breaks
> things
> >>> or not.
> >>>
> >>> Thanks,
> >>> John
> >>>
> >>> On Thu, Nov 24, 2022, at 02:38, Dan S wrote:
> >>> > Hello all,
> >>> >
> >>> > I've had a pr that has been open for a little over a month (several
> >>> > feedback cycles happened), and I've never seen a fully passing build
> >>> (tests
> >>> > in completely different parts of the codebase seemed to fail, often
> >>> > timeouts). A cursory look at open PRs seems to indicate that mine is
> not
> >>> > the only one. I was wondering if there is a place where all the flaky
> >>> tests
> >>> > are being tracked, and if it makes sense to fix (or at least
> temporarily
> >>> > disable) them so that confidence in new PRs could be increased.
> >>> >
> >>> > Thanks,
> >>> >
> >>> > Dan
> >>>
>


[VOTE] KIP-893: The Kafka protocol should support nullable structs

2022-12-05 Thread David Jacot
Hi all,

As this KIP-893 is trivial and non-controversial, I would like to
start the vote on it. The KIP is here:
https://cwiki.apache.org/confluence/x/YJIODg

Thanks,
David


Re: Supported Kafka/Zookeeper Version with ELK 8.4.3

2022-12-05 Thread Colin McCabe
Hi,

Sorry, we do not develop ELK. In fact, I'm not sure what that acronym refers 
to. I would suggest checking in with support for that product / project, since 
it is not part of Apache Kafka.

best,
Colin


On Fri, Oct 28, 2022, at 06:23, Kumar, Sudip wrote:
> Hi Team,
>  
> We are still waiting for a reply. Please update us; we must know what version 
> of Kafka is compatible with ELK 8.4.
>  
> Still, I can see that no one has replied on the user and dev community portals
>  
> 
>  
>  
> Thanks
> Sudip
>  
>  
> *From:* Kumar, Sudip 
> *Sent:* Monday, October 17, 2022 5:23 PM
> *To:* us...@kafka.apache.org; dev@kafka.apache.org
> *Cc:* Rajendra Bangal, Nikhil ; Verma, 
> Harshit ; Verma, Deepak Kumar 
> ; Arkal, Dinesh Balaji 
> ; Saurabh, Shobhit 
> 
> *Subject:* Supported Kafka/Zookeeper Version with ELK 8.4.3 
> *Importance:* High
> 
>  
> Hi Kafka Team,
>  
> Currently we are planning to upgrade ELK from 7.16 to 8.4.3. In our 
> ecosystem we use Kafka as middleware, ingesting data coming from different 
> sources; a publisher (Logstash shipper) publishes data to different Kafka 
> topics and a subscriber (Logstash indexer) consumes the data.
>  
> We have an integration of ELK 7.16 with Kafka v2.5.1 and ZooKeeper 3.5.8. 
> If we upgrade to ELK 8.4.3, please suggest which Kafka and ZooKeeper 
> versions will be supported, and provide us with any helpful documents.
>  
> Let me know if you any further questions.
>  
> *Thanks*
> *Sudip Kumar*
> *Capgemini-India *
>  
>  
> This message contains information that may be privileged or confidential and 
> is the property of the Capgemini Group. It is intended only for the person to 
> whom it is addressed. If you are not the intended recipient, you are not 
> authorized to read, print, retain, copy, disseminate, distribute, or use this 
> message or any part thereof. If you receive this message in error, please 
> notify the sender immediately and delete all copies of this message.


Re: Ci stability

2022-12-05 Thread Colin McCabe
FYI, there was a memory leak that affected some of the tests which was fixed 
recently, so hopefully stability will improve a bit. See KAFKA-14433 for 
details.

best,
Colin

On Thu, Nov 24, 2022, at 12:48, John Roesler wrote:
> Hi Dan,
>
> I’m not sure if there’s a consistently used tag, but I’ve gotten good 
> mileage out of just searching for “flaky” or “flaky test” in Jira. 
>
> If you’re thinking about filing a ticket for a specific test failure 
> you’ve seen, I’ve also usually been able to find out whether there’s 
> already a ticket by searching for the test class or method name. 
>
> People seem to typically file tickets with “flaky” in the title and 
> then the test name. 
>
> Thanks again for your interest in improving the situation!
> -John
>
> On Thu, Nov 24, 2022, at 10:08, Dan S wrote:
>> Thanks for the reply John! Is there a jira tag or view or something that
>> can be used to find all the failing tests and maybe even try to fix them
>> (even if fix just means extending a timeout)?
>>
>>
>>
>> On Thu, Nov 24, 2022, 16:03 John Roesler  wrote:
>>
>>> Hi Dan,
>>>
>>> Thanks for pointing this out. Flaky tests are a perennial problem. We
>>> knock them out every now and then, but eventually more spring up.
>>>
>>> I’ve had some luck in the past filing Jira tickets for the failing tests
>>> as they pop up in my PRs. Another thing that seems to motivate people is to
>>> open a PR to disable the test in question, as you mention. That can be a
>>> bit aggressive, though, so it wouldn’t be my first suggestion.
>>>
>>> I appreciate you bringing this up. I agree that flaky tests pose a risk to
>>> the project because it makes it harder to know whether a PR breaks things
>>> or not.
>>>
>>> Thanks,
>>> John
>>>
>>> On Thu, Nov 24, 2022, at 02:38, Dan S wrote:
>>> > Hello all,
>>> >
>>> > I've had a pr that has been open for a little over a month (several
>>> > feedback cycles happened), and I've never seen a fully passing build
>>> (tests
>>> > in completely different parts of the codebase seemed to fail, often
>>> > timeouts). A cursory look at open PRs seems to indicate that mine is not
>>> > the only one. I was wondering if there is a place where all the flaky
>>> tests
>>> > are being tracked, and if it makes sense to fix (or at least temporarily
>>> > disable) them so that confidence in new PRs could be increased.
>>> >
>>> > Thanks,
>>> >
>>> > Dan
>>>


Re: [DISCUSS] KIP-893: The Kafka protocol should support nullable structs

2022-12-05 Thread Colin McCabe
Hi David,

Thanks for posting this. I think it will be pretty useful. +1 for the idea

best,
Colin

On Thu, Dec 1, 2022, at 08:57, David Jacot wrote:
> Hi all,
>
> I have drafted a very small KIP which proposes to support nullable
> struct in the Kafka protocol. This is something that we plan to use
> for KIP-848.
>
> The KIP is here: https://cwiki.apache.org/confluence/x/YJIODg
>
> Please let me know what you think.
>
> Best,
> David


Re: Plase add me as a contributor to JIRA

2022-12-05 Thread Mickael Maison
Hi,

I've granted you contributor permissions. Thanks for your interest in
Apache Kafka!

Mickael

On Mon, Dec 5, 2022 at 4:31 PM Pie Land  wrote:
>
> Hello,
>
> Please add me as a contributor to JIRA.
> My JIRA username is: cookpieland
>
> Thanks,
> Cook


Re: [DISCUSS] KIP-883: Add delete callback method to Connector API

2022-12-05 Thread Chris Egerton
Hi Hector,

Thanks for the updates!

RE 1: This doesn't have the same user-facing behavior, though. Today,
failures in Connector::stop can be surfaced via the status endpoints in the
REST API. But when connectors are deleted, they won't be visible at all in
these endpoints.

RE 3: It seems like this approach would only provide guarantees on a
per-worker basis; I was wondering more about trying to wait for tasks on
all workers in the cluster to stop. One potential approach for this could
be to handle connector deletes by breaking them up into two separate
rebalances: the first would revoke all of the connector's tasks, and the
second would revoke the connector itself. But this is just an example, and
would come with its own non-negligible overhead; we can and should explore
other approaches too.

RE 4: Thanks for the clarification, makes sense. Worth noting for anyone
following along that KIP-419 is similar to this KIP, but on a much smaller
scale: it only focuses on the cleanup of resources allocated by single Task
instances, whereas this KIP focuses on the cleanup of resources that are
meant to be used across the entire lifetime of the connector, which may
encompass the lifetimes of several Connector and Task instances.

Cheers,

Chris

On Wed, Nov 30, 2022 at 10:58 AM Hector Geraldino (BLOOMBERG/ 919 3RD A) <
hgerald...@bloomberg.net> wrote:

> Thanks for your feedback Chris,
>
> 1. I think the behavior should remain the same as it is today. The worker
> stops the connector when its configuration is updated, and if the update is
> a deletion, it won't start the connector again. If an error happens during
> stop() today, the statusListener will update the backing store with a
> FAILED state. The only thing that changes on this path is that the
> Connector#stop() method will include an additional boolean parameter, so
> the connector knows that the reason it is being stopped is because of a
> deletion, and can perform additional actions if necessary.
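
A minimal sketch of how a connector might react to the proposed overload. The parameter name `isDeleted` and the `TenantConnector` example follow this discussion but are illustrative, not final API; `SketchConnector` is a stand-in for the real Connector base class:

```java
// Stand-in for the Connector base class; only the parts relevant to the
// proposed overload are shown.
abstract class SketchConnector {
    public abstract void stop();          // existing lifecycle hook

    public void stop(boolean isDeleted) { // proposed overload
        stop();                           // default: behaves like a plain stop
    }
}

class TenantConnector extends SketchConnector {
    boolean resourcesDeprovisioned = false;

    @Override
    public void stop() {
        // Release per-instance resources (connections, threads, ...).
    }

    @Override
    public void stop(boolean isDeleted) {
        stop();
        if (isDeleted) {
            // Only on deletion: tear down externally provisioned resources
            // that are meant to outlive individual Connector instances.
            resourcesDeprovisioned = true;
        }
    }
}

public class Kip883Sketch {
    public static void main(String[] args) {
        TenantConnector c = new TenantConnector();
        c.stop(false); // reconfiguration/rebalance: external resources kept
        System.out.println(c.resourcesDeprovisioned); // prints false
        c.stop(true);  // deletion: cleanup runs
        System.out.println(c.resourcesDeprovisioned); // prints true
    }
}
```

The default implementation keeps existing connectors source-compatible: a connector that never overrides `stop(boolean)` sees no behavior change.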
>
> 2. I agree; at first I thought it made sense, but after reading KIP-875
> and finding out that connectors can use custom offsets topics to store
> offsets, I think this idea needs more refinement. There's probably a way to
> reuse the work proposed by this KIP with the "Automatically delete offsets
> with connectors" feature mentioned on the "Future work" section of KIP-875,
> and am happy to explore it more.
>
> 3. I didn't consider that. There is some asymmetry here on how the
> StandaloneHerder handles this (tasks are stopped before the connector is)
> and the DistributedHerder. One option would be not to handle this on the
> #processConnectorConfigUpdates(...) method, but instead wait for the
> RebalanceListener#onRevoked(...) callback, which already stops the revoked
> connectors and tasks synchronously. The idea would be to enhance this to
> check the configState store and, if the configuration of the revoked
> connector(s) is gone, then we can let the connector know about that fact
> when stopping it (by the aforementioned mechanism). I'll update the KIP and
> PR if you think it is worth it.
>
> 4. That's correct. As the KIP motivates, we have connectors that need to
> do some provisioning/setup when they are deployed (we run connectors for
> internal clients), and when tenants delete a connector, we don't have a
> clear signal that allows us to cleanup those resources. The goal is
> probably similar to
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-419%3A+Safely+notify+Kafka+Connect+SourceTask+is+stopped,
> just took a different approach.
>
>
> From: dev@kafka.apache.org At: 11/29/22 15:31:31 UTC-5:00To:
> dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-883: Add delete callback method to Connector API
>
> Hi Hector,
>
> Thanks for the KIP! Here are my initial thoughts:
>
> 1. I like the simplicity of an overloaded stop method, but there is some
> asymmetry between stopping a connector and deleting one. If a connector is
> stopped (for rebalance, to be reconfigured, etc.) and a failure occurs
> then, the failure will be clearly visible in the REST API via, e.g., the
> GET /connectors/{connector}/status endpoint. If a connector is deleted and
> a failure occurs, with the current proposal, users won't have the same
> level of visibility. How can we clearly surface failures caused during the
> "destroy" phase of a connector's lifecycle to users?
>
> 2. I don't think that this new feature should be used to control (delete)
> offsets for connectors. We're addressing that separately in KIP-875, and it
> could be a source of headaches for users if they discover that some
> connectors' offsets persist across deletion/recreation while others do not.
> If anything, we should explicitly recommend against this kind of logic in
> the Javadocs for the newly-introduced method.
>
> 3. Is it worth trying to give all of the connector's tasks a chance to shut
> down before invoking "stop(true)" on the Connector? If so, any thoughts on
> how we can 

Please add me as a contributor to JIRA

2022-12-05 Thread Pie Land
Hello,

Please add me as a contributor to JIRA.
My JIRA username is: cookpieland

Thanks,
Cook


Re: [DISCUSS] KIP-864: Add End-To-End Latency Metrics to Connectors

2022-12-05 Thread Chris Egerton
Hi Jorge,

Thanks for indulging my paranoia. LGTM!

Cheers,

Chris

On Mon, Dec 5, 2022 at 10:06 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Sure! I have added the following to the proposed changes section:
>
> ```
> The per-record metrics will definitely be added to Kafka Connect as part of
> this KIP, but their metric level will be changed pending the performance
> testing described in KAFKA-14441, and will otherwise only be exposed at
> lower level (DEBUG instead of INFO, and TRACE instead of DEBUG)
> ```
>
> Let me know how it looks.
>
> Many thanks!
> Jorge.
>
> On Mon, 5 Dec 2022 at 14:11, Chris Egerton 
> wrote:
>
> > Hi Jorge,
> >
> > Thanks for filing KAFKA-14441! In the ticket description we mention that
> > "there will be more confidence whether to design metrics to be exposed
> at a
> > DEBUG or INFO level depending on their impact" but it doesn't seem like
> > this is called out in the KIP and, just based on what's in the KIP, the
> > proposal is still to have several per-record metrics exposed at INFO
> level.
> >
> > Could we explicitly call out that the per-record metrics will definitely
> be
> > added to Kafka Connect as part of this KIP, but they will only be exposed
> > at INFO level pending the performance testing described in
> > KAFKA-14441, and will otherwise only be exposed at DEBUG level?
> Otherwise,
> > it's possible that a vote for the KIP as it's written today would be a
> vote
> > in favor of unconditionally exposing these metrics at INFO level, even if
> > the performance testing reveals issues.
> >
> > Cheers,
> >
> > Chris
> >
> > On Sun, Dec 4, 2022 at 7:08 PM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks for the reminder Chris!
> > >
> > > I have added a note on the KIP to include this as part of the KIP as
> most
> > > of the metrics proposed are per-record and having all on DEBUG would
> > limit
> > > the benefits, and created
> > > https://issues.apache.org/jira/browse/KAFKA-14441
> > > to keep track of this task.
> > >
> > > Cheers,
> > > Jorge.
> > >
> > > On Tue, 29 Nov 2022 at 19:40, Chris Egerton 
> > > wrote:
> > >
> > > > Hi Jorge,
> > > >
> > > > Thanks! What were your thoughts on the possible benchmarking and/or
> > > > downgrading of per-record metrics to DEBUG?
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Thu, Nov 24, 2022 at 8:20 AM Jorge Esteban Quilcate Otoya <
> > > > quilcate.jo...@gmail.com> wrote:
> > > >
> > > > > Thanks Chris! I have updated the KIP with "transform" instead of
> > > "alias".
> > > > > Agree it's clearer.
> > > > >
> > > > > Cheers,
> > > > > Jorge.
> > > > >
> > > > > On Mon, 21 Nov 2022 at 21:36, Chris Egerton
>  > >
> > > > > wrote:
> > > > >
> > > > > > Hi Jorge,
> > > > > >
> > > > > > Thanks for the updates, and apologies for the delay. The new
> > diagram
> > > > > > directly under the "Proposed Changes" section is absolutely
> > gorgeous!
> > > > > >
> > > > > >
> > > > > > Follow-ups:
> > > > > >
> > > > > > RE 2: Good point. We can use the same level for these metrics,
> it's
> > > > not a
> > > > > > big deal.
> > > > > >
> > > > > > RE 3: As long as all the per-record metrics are kept at DEBUG
> > level,
> > > it
> > > > > > should be fine to leave JMH benchmarking for a follow-up. If we
> > want
> > > to
> > > > > add
> > > > > > new per-record, INFO-level metrics, I would be more comfortable
> > with
> > > > > > including benchmarking as part of the testing plan for the KIP.
> One
> > > > > > possible compromise could be to propose that these features be
> > merged
> > > > at
> > > > > > DEBUG level, and then possibly upgraded to INFO level in the
> future
> > > > > pending
> > > > > > benchmarks to guard against performance degradation.
> > > > > >
> > > > > > RE 4: I think for a true "end-to-end" metric, it'd be useful to
> > > include
> > > > > the
> > > > > > time taken by the task to actually deliver the record. However,
> > with
> > > > the
> > > > > > new metric names and descriptions provided in the KIP, I have no
> > > > > objections
> > > > > > with what's currently proposed, and a new "end-to-end" metric can
> > be
> > > > > taken
> > > > > > on later in a follow-up KIP.
> > > > > >
> > > > > > RE 6: You're right, existing producer metrics should be enough
> for
> > > now.
> > > > > We
> > > > > > can revisit this later if/when we add delivery-centric metrics
> for
> > > sink
> > > > > > tasks as well.
> > > > > >
> > > > > > RE 7: The new metric names in the KIP LGTM; I don't see any need
> to
> > > > > expand
> > > > > > beyond those but if you'd still like to pursue others, LMK.
> > > > > >
> > > > > >
> > > > > > New thoughts:
> > > > > >
> > > > > > One small thought: instead of "alias" in
> "alias="{transform_alias}"
> > > for
> > > > > the
> > > > > > per-transform metrics, could we use "transform"? IMO it's clearer
> > > since
> > > > > we
> > > > > > don't use "alias" in the names of 

Re: [DISCUSS] KIP-864: Add End-To-End Latency Metrics to Connectors

2022-12-05 Thread Jorge Esteban Quilcate Otoya
Sure! I have added the following to the proposed changes section:

```
The per-record metrics will definitely be added to Kafka Connect as part of
this KIP, but their metric level will be changed pending the performance
testing described in KAFKA-14441, and will otherwise only be exposed at
lower level (DEBUG instead of INFO, and TRACE instead of DEBUG)
```

Let me know how it looks.

Many thanks!
Jorge.

On Mon, 5 Dec 2022 at 14:11, Chris Egerton  wrote:

> Hi Jorge,
>
> Thanks for filing KAFKA-14441! In the ticket description we mention that
> "there will be more confidence whether to design metrics to be exposed at a
> DEBUG or INFO level depending on their impact" but it doesn't seem like
> this is called out in the KIP and, just based on what's in the KIP, the
> proposal is still to have several per-record metrics exposed at INFO level.
>
> Could we explicitly call out that the per-record metrics will definitely be
> added to Kafka Connect as part of this KIP, but they will only be exposed
> at INFO level pending the performance testing described in
> KAFKA-14441, and will otherwise only be exposed at DEBUG level? Otherwise,
> it's possible that a vote for the KIP as it's written today would be a vote
> in favor of unconditionally exposing these metrics at INFO level, even if
> the performance testing reveals issues.
>
> Cheers,
>
> Chris
>
> On Sun, Dec 4, 2022 at 7:08 PM Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks for the reminder Chris!
> >
> > I have added a note on the KIP to include this as part of the KIP as most
> > of the metrics proposed are per-record and having all on DEBUG would
> limit
> > the benefits, and created
> > https://issues.apache.org/jira/browse/KAFKA-14441
> > to keep track of this task.
> >
> > Cheers,
> > Jorge.
> >
> > On Tue, 29 Nov 2022 at 19:40, Chris Egerton 
> > wrote:
> >
> > > Hi Jorge,
> > >
> > > Thanks! What were your thoughts on the possible benchmarking and/or
> > > downgrading of per-record metrics to DEBUG?
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Thu, Nov 24, 2022 at 8:20 AM Jorge Esteban Quilcate Otoya <
> > > quilcate.jo...@gmail.com> wrote:
> > >
> > > > Thanks Chris! I have updated the KIP with "transform" instead of
> > "alias".
> > > > Agree it's clearer.
> > > >
> > > > Cheers,
> > > > Jorge.
> > > >
> > > > On Mon, 21 Nov 2022 at 21:36, Chris Egerton  >
> > > > wrote:
> > > >
> > > > > Hi Jorge,
> > > > >
> > > > > Thanks for the updates, and apologies for the delay. The new
> diagram
> > > > > directly under the "Proposed Changes" section is absolutely
> gorgeous!
> > > > >
> > > > >
> > > > > Follow-ups:
> > > > >
> > > > > RE 2: Good point. We can use the same level for these metrics, it's
> > > not a
> > > > > big deal.
> > > > >
> > > > > RE 3: As long as all the per-record metrics are kept at DEBUG
> level,
> > it
> > > > > should be fine to leave JMH benchmarking for a follow-up. If we
> want
> > to
> > > > add
> > > > > new per-record, INFO-level metrics, I would be more comfortable
> with
> > > > > including benchmarking as part of the testing plan for the KIP. One
> > > > > possible compromise could be to propose that these features be
> merged
> > > at
> > > > > DEBUG level, and then possibly upgraded to INFO level in the future
> > > > pending
> > > > > benchmarks to guard against performance degradation.
> > > > >
> > > > > RE 4: I think for a true "end-to-end" metric, it'd be useful to
> > include
> > > > the
> > > > > time taken by the task to actually deliver the record. However,
> with
> > > the
> > > > > new metric names and descriptions provided in the KIP, I have no
> > > > objections
> > > > > with what's currently proposed, and a new "end-to-end" metric can
> be
> > > > taken
> > > > > on later in a follow-up KIP.
> > > > >
> > > > > RE 6: You're right, existing producer metrics should be enough for
> > now.
> > > > We
> > > > > can revisit this later if/when we add delivery-centric metrics for
> > sink
> > > > > tasks as well.
> > > > >
> > > > > RE 7: The new metric names in the KIP LGTM; I don't see any need to
> > > > expand
> > > > > beyond those but if you'd still like to pursue others, LMK.
> > > > >
> > > > >
> > > > > New thoughts:
> > > > >
> > > > > One small thought: instead of "alias" in "alias="{transform_alias}"
> > for
> > > > the
> > > > > per-transform metrics, could we use "transform"? IMO it's clearer
> > since
> > > > we
> > > > > don't use "alias" in the names of transform-related properties, and
> > > > "alias"
> > > > > may be confused with the classloading term where you can use, e.g.,
> > > > > "FileStreamSource" as the name of a connector class in a connector
> > > config
> > > > > instead of
> "org.apache.kafka.connect.file.FileStreamSourceConnector".
> > > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > > > On Fri, Nov 18, 2022 at 12:06 PM Jorge Esteban Quilcate Otoya <
> > > > > quilcate.jo...@gmail.com> 

Re: [DISCUSS] KIP-864: Add End-To-End Latency Metrics to Connectors

2022-12-05 Thread Chris Egerton
Hi Jorge,

Thanks for filing KAFKA-14441! In the ticket description we mention that
"there will be more confidence whether to design metrics to be exposed at a
DEBUG or INFO level depending on their impact" but it doesn't seem like
this is called out in the KIP and, just based on what's in the KIP, the
proposal is still to have several per-record metrics exposed at INFO level.

Could we explicitly call out that the per-record metrics will definitely be
added to Kafka Connect as part of this KIP, but they will only be exposed
at INFO level pending the performance testing described in
KAFKA-14441, and will otherwise only be exposed at DEBUG level? Otherwise,
it's possible that a vote for the KIP as it's written today would be a vote
in favor of unconditionally exposing these metrics at INFO level, even if
the performance testing reveals issues.

Cheers,

Chris

On Sun, Dec 4, 2022 at 7:08 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks for the reminder Chris!
>
> I have added a note on the KIP to include this as part of the KIP as most
> of the metrics proposed are per-record and having all on DEBUG would limit
> the benefits, and created
> https://issues.apache.org/jira/browse/KAFKA-14441
> to keep track of this task.
>
> Cheers,
> Jorge.
>
> On Tue, 29 Nov 2022 at 19:40, Chris Egerton 
> wrote:
>
> > Hi Jorge,
> >
> > Thanks! What were your thoughts on the possible benchmarking and/or
> > downgrading of per-record metrics to DEBUG?
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Nov 24, 2022 at 8:20 AM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks Chris! I have updated the KIP with "transform" instead of
> "alias".
> > > Agree it's clearer.
> > >
> > > Cheers,
> > > Jorge.
> > >
> > > On Mon, 21 Nov 2022 at 21:36, Chris Egerton 
> > > wrote:
> > >
> > > > Hi Jorge,
> > > >
> > > > Thanks for the updates, and apologies for the delay. The new diagram
> > > > directly under the "Proposed Changes" section is absolutely gorgeous!
> > > >
> > > >
> > > > Follow-ups:
> > > >
> > > > RE 2: Good point. We can use the same level for these metrics, it's
> > not a
> > > > big deal.
> > > >
> > > > RE 3: As long as all the per-record metrics are kept at DEBUG level,
> it
> > > > should be fine to leave JMH benchmarking for a follow-up. If we want
> to
> > > add
> > > > new per-record, INFO-level metrics, I would be more comfortable with
> > > > including benchmarking as part of the testing plan for the KIP. One
> > > > possible compromise could be to propose that these features be merged
> > at
> > > > DEBUG level, and then possibly upgraded to INFO level in the future
> > > pending
> > > > benchmarks to guard against performance degradation.
> > > >
> > > > RE 4: I think for a true "end-to-end" metric, it'd be useful to
> include
> > > the
> > > > time taken by the task to actually deliver the record. However, with
> > the
> > > > new metric names and descriptions provided in the KIP, I have no
> > > objections
> > > > with what's currently proposed, and a new "end-to-end" metric can be
> > > taken
> > > > on later in a follow-up KIP.
> > > >
> > > > RE 6: You're right, existing producer metrics should be enough for
> now.
> > > We
> > > > can revisit this later if/when we add delivery-centric metrics for
> sink
> > > > tasks as well.
> > > >
> > > > RE 7: The new metric names in the KIP LGTM; I don't see any need to
> > > expand
> > > > beyond those but if you'd still like to pursue others, LMK.
> > > >
> > > >
> > > > New thoughts:
> > > >
> > > > One small thought: instead of "alias" in "alias="{transform_alias}"
> for
> > > the
> > > > per-transform metrics, could we use "transform"? IMO it's clearer
> since
> > > we
> > > > don't use "alias" in the names of transform-related properties, and
> > > "alias"
> > > > may be confused with the classloading term where you can use, e.g.,
> > > > "FileStreamSource" as the name of a connector class in a connector
> > config
> > > > instead of "org.apache.kafka.connect.file.FileStreamSourceConnector".
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Fri, Nov 18, 2022 at 12:06 PM Jorge Esteban Quilcate Otoya <
> > > > quilcate.jo...@gmail.com> wrote:
> > > >
> > > > > Thanks Mickael!
> > > > >
> > > > >
> > > > > On Wed, 9 Nov 2022 at 15:54, Mickael Maison <
> > mickael.mai...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Jorge,
> > > > > >
> > > > > > Thanks for the KIP, it is a nice improvement.
> > > > > >
> > > > > > 1) The per transformation metrics still have a question mark next
> > to
> > > > > > them in the KIP. Do you want to include them? If so we'll want to
> > tag
> > > > > > them, we should be able to include the aliases in
> > TransformationChain
> > > > > > and use them.
> > > > > >
> > > > >
> > > > > Yes, I have added the changes on TransformChain that will be needed
> > to
> > > > add
> > > > > these metrics.
> > > > >
> > > > >
> > > > > >
> > > > > > 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1404

2022-12-05 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-14442) GlobalKTable restoration waits requestTimeout during application restart

2022-12-05 Thread Jira
Gergo L��p created KAFKA-14442:
--

 Summary: GlobalKTable restoration waits requestTimeout during 
application restart
 Key: KAFKA-14442
 URL: https://issues.apache.org/jira/browse/KAFKA-14442
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.0.0
Reporter: Gergo L��p


Using "exactly_once_beta", the highWatermark "skips" an offset after a 
transaction, but in this case the global .checkpoint file contains a different 
value (smaller by 1) than the highWatermark.
During restoration, because of the difference between the checkpoint and the 
highWatermark, a poll will be attempted, but sometimes there is no new record on 
the partition and the GlobalStreamThread has to wait for the requestTimeout to 
continue.
If there is any new record on the partition, the problem does not occur.
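
The off-by-one described above is consistent with the transaction commit marker occupying an offset of its own: the marker advances the high watermark one past the last data record, while the checkpoint tracks only data records. A minimal sketch of that arithmetic (illustrative only; these methods are not real Kafka APIs):

```java
public class TxnOffsetGap {
    // After a committed transaction of `recordCount` records starting at
    // `firstOffset`, the commit marker (a control record) is appended at the
    // next offset, so the high watermark advances one past the marker.
    static long highWatermarkAfterCommit(long firstOffset, int recordCount) {
        long lastDataOffset = firstOffset + recordCount - 1;
        long markerOffset = lastDataOffset + 1; // control record
        return markerOffset + 1;                // HW = next offset to append
    }

    // A consumer that checkpoints its position after the last *data* record
    // stores lastDataOffset + 1, which is one below the high watermark.
    static long checkpointAfterConsuming(long firstOffset, int recordCount) {
        return firstOffset + recordCount;
    }

    public static void main(String[] args) {
        long hw = highWatermarkAfterCommit(0, 5);  // records 0..4, marker at 5
        long cp = checkpointAfterConsuming(0, 5);
        System.out.println("highWatermark=" + hw + " checkpoint=" + cp);
        // The restore loop sees cp < hw and keeps polling, but the only
        // "missing" offset is the commit marker, so no data record ever
        // arrives and the poll waits out the request timeout.
    }
}
```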



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1403

2022-12-05 Thread Apache Jenkins Server
See