Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1970

2023-07-04 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 386274 lines...]

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testOuterWithLeftVersionedOnly[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testOuterWithLeftVersionedOnly[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testLeftWithRightVersionedOnly[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testLeftWithRightVersionedOnly[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerInner[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerInner[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerOuter[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerOuter[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerLeft[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerLeft[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testOuterInner[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testOuterInner[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testOuterOuter[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testOuterOuter[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerWithRightVersionedOnly[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerWithRightVersionedOnly[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testLeftWithLeftVersionedOnly[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testLeftWithLeftVersionedOnly[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerWithLeftVersionedOnly[caching enabled = false] STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TableTableJoinIntegrationTest > [caching enabled = false] > 
testInnerWithLeftVersionedOnly[caching enabled = false] PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TaskAssignorIntegrationTest > shouldProperlyConfigureTheAssignor STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TaskAssignorIntegrationTest > shouldProperlyConfigureTheAssignor PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TaskMetadataIntegrationTest > shouldReportCorrectEndOffsetInformation STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TaskMetadataIntegrationTest > shouldReportCorrectEndOffsetInformation PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TaskMetadataIntegrationTest > shouldReportCorrectCommittedOffsetInformation 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
TaskMetadataIntegrationTest > shouldReportCorrectCommittedOffsetInformation 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
EmitOnChangeIntegrationTest > shouldEmitSameRecordAfterFailover() STARTED

Gradle Te

[jira] [Created] (KAFKA-15146) Flaky test ConsumerBounceTest.testConsumptionWithBrokerFailures

2023-07-04 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-15146:


 Summary: Flaky test 
ConsumerBounceTest.testConsumptionWithBrokerFailures
 Key: KAFKA-15146
 URL: https://issues.apache.org/jira/browse/KAFKA-15146
 Project: Kafka
  Issue Type: Test
  Components: unit tests
Reporter: Divij Vaidya


Flaky test that fails with the following error. Example build - 
[https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13953/2] 
{noformat}
Gradle Test Run :core:integrationTest > Gradle Test Executor 177 > 
ConsumerBounceTest > testConsumptionWithBrokerFailures() FAILED
org.apache.kafka.clients.consumer.CommitFailedException: Offset commit 
cannot be completed since the consumer is not part of an active group for auto 
partition assignment; it is likely that the consumer was kicked out of the 
group.
at 
app//org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:1351)
at 
app//org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:1188)
at 
app//org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1518)
at 
app//org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1417)
at 
app//org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1374)
at 
app//kafka.api.ConsumerBounceTest.consumeWithBrokerFailures(ConsumerBounceTest.scala:109)
at 
app//kafka.api.ConsumerBounceTest.testConsumptionWithBrokerFailures(ConsumerBounceTest.scala:81){noformat}
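
For context, the failing call is a plain poll-then-commitSync loop. A minimal, hedged sketch of that pattern (topic name, group id and configs below are placeholders, not the actual test code) shows where the CommitFailedException surfaces when a broker bounce gets the consumer evicted from its group:

{code:scala}
import java.time.Duration
import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.consumer.{CommitFailedException, ConsumerConfig, KafkaConsumer}

object CommitSyncSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // assumed address
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "bounce-test-group")       // assumed group id
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
      "org.apache.kafka.common.serialization.ByteArrayDeserializer")
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
      "org.apache.kafka.common.serialization.ByteArrayDeserializer")

    val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](props)
    consumer.subscribe(List("test-topic").asJava)
    try {
      while (true) {
        val records = consumer.poll(Duration.ofMillis(500))
        records.asScala.foreach(r => println(s"offset=${r.offset}"))
        try {
          // If a broker bounce delays this commit past the rebalance/session
          // deadlines, the group may already have evicted this member and the
          // commit fails exactly as in the stack trace above.
          consumer.commitSync()
        } catch {
          case e: CommitFailedException =>
            // The next poll() rejoins the group; already-processed records may
            // be redelivered, which is why the test is sensitive to timing.
            println(s"commit failed, will rejoin on next poll: ${e.getMessage}")
        }
      }
    } finally consumer.close()
  }
}
{code}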



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [kafka-site] divijvaidya commented on pull request #531: Add CVE-2023-34455 to cve-list

2023-07-04 Thread via GitHub


divijvaidya commented on PR #531:
URL: https://github.com/apache/kafka-site/pull/531#issuecomment-1620427926

   @ijuma @showuon @jlprat @mimaison please review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka-site] divijvaidya opened a new pull request, #531: Add CVE-2023-34455 to cve-list

2023-07-04 Thread via GitHub


divijvaidya opened a new pull request, #531:
URL: https://github.com/apache/kafka-site/pull/531

   ## Change
   Add details about CVE-2023-34455 to the cve-list section of the website, as an 
action item from the discussion on the secur...@kafka.apache.org mailing list.
   
   ## FAQs
   
   1. What versions of Kafka are impacted?
   A. All versions that support Snappy are impacted, starting from [Apache Kafka 
0.8.0](https://issues.apache.org/jira/browse/KAFKA-187), which shipped with 
[snappy-java version 
1.0.4.1](https://github.com/apache/kafka/blob/15bb3961d9171c1c54c4c840a554ce2c76168163/project/build/KafkaProject.scala#L249).
 Note that Apache Kafka (AK) has been using the same snappy-java API 
(SnappyInputStream) [since the 
beginning](https://github.com/apache/kafka/blob/15bb3961d9171c1c54c4c840a554ce2c76168163/core/src/main/scala/kafka/message/CompressionFactory.scala#L45).
   
   2. What versions of Snappy-Java have this vulnerability?
   A. [All versions prior to 
1.1.10.1](https://github.com/xerial/snappy-java/security/advisories/GHSA-qcwq-55hx-v3vh)
   
   3. Which users are impacted?
   A. All workloads that use snappy-based compression are impacted. Even if a 
user isn't currently using snappy compression, a malicious client can send a 
produce request containing a record with a malicious payload that is 
compressed using snappy.
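   
   For illustration only (not part of the website change): the snappy-java class named above can be exercised in a few lines. The stream classes are real snappy-java API; the payload and buffer size are arbitrary placeholders, and the vulnerability itself is described in the advisory linked in FAQ 2.
   
```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import org.xerial.snappy.{SnappyInputStream, SnappyOutputStream}

object SnappySketch {
  def main(args: Array[String]): Unit = {
    // Compress a small payload with the stream API Kafka has historically used.
    val out = new ByteArrayOutputStream()
    val snappyOut = new SnappyOutputStream(out)
    snappyOut.write("hello snappy".getBytes("UTF-8"))
    snappyOut.close()

    // Decompress it again via SnappyInputStream; per the advisory above,
    // snappy-java versions prior to 1.1.10.1 could be made to misbehave here
    // on crafted input.
    val in = new SnappyInputStream(new ByteArrayInputStream(out.toByteArray))
    val buf = new Array[Byte](64)
    val n = in.read(buf)
    println(new String(buf, 0, n, "UTF-8"))
    in.close()
  }
}
```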
   
   ## Testing
   
   Result of running the website locally after applying this PR is as follows:
   
   ![Screenshot 2023-07-04 at 16 55 
44](https://github.com/apache/kafka-site/assets/71267/47fc2057-298c-4028-be97-e82a5e4f849f)
   





[jira] [Resolved] (KAFKA-15140) Improve TopicCommandIntegrationTest to be less flaky

2023-07-04 Thread Divij Vaidya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Divij Vaidya resolved KAFKA-15140.
--
Fix Version/s: 3.6.0
 Reviewer: Divij Vaidya
   Resolution: Fixed

> Improve TopicCommandIntegrationTest to be less flaky
> 
>
> Key: KAFKA-15140
> URL: https://issues.apache.org/jira/browse/KAFKA-15140
> Project: Kafka
>  Issue Type: Test
>  Components: unit tests
>Reporter: Divij Vaidya
>Assignee: Lan Ding
>Priority: Minor
>  Labels: newbie
> Fix For: 3.6.0
>
>
> *This is a good Jira for folks who are new to contributing to Kafka.*
> Tests in TopicCommandIntegrationTest get flaky from time to time. The 
> objective of the task is to make them more robust by doing the following:
> 1. Replace usages of createAndWaitTopic() / adminClient.createTopics() 
> and other places where we are creating a topic (without waiting) with 
> TestUtils.createTopicWithAdmin(). The latter method already contains the 
> functionality to create a topic and wait for metadata to sync up.
> 2. Replace the number 6 at places such as 
> "adminClient.createTopics(
> Collections.singletonList(new NewTopic("foo_bar", 1, 6.toShort)))" with a 
> meaningful constant.
> 3. Add logs if an assertion fails; for example, lines such as 
> "assertTrue(rows(0).startsWith(s"\tTopic: $testTopicName"), output)" should 
> have a message argument which prints the actual output, so that we can see 
> in the test logs what the output was when the assertion failed.
> 4. Replace occurrences of "\n" with System.lineSeparator(), which is platform 
> independent.
> 5. Whenever we re-assign partitions (using alterconfig), we should wait for 
> the reassignment to complete before we call describe to validate it. We could 
> use TestUtils.waitForAllReassignmentsToComplete().
> *Motivation of this task*
> Try to fix the flaky test behaviour such as observed in 
> [https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13924/5/testReport/junit/kafka.admin/TopicCommandIntegrationTest/Build___JDK_11_and_Scala_2_13___testDescribeUnderMinIsrPartitionsMixed_String__quorum_zk/]
>  
> {noformat}
> org.opentest4j.AssertionFailedError: expected:  but was: 
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>   at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
>   at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:180)
>   at 
> app//kafka.admin.TopicCommandIntegrationTest.testDescribeUnderMinIsrPartitionsMixed(TopicCommandIntegrationTest.scala:794){noformat}
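
As a hedged illustration of the cleanup that items 1-4 above describe (the real test is TopicCommandIntegrationTest.scala; the names below are placeholders, not the actual patch):

{code:scala}
import java.util.Collections
import org.apache.kafka.clients.admin.{Admin, NewTopic}
import org.junit.jupiter.api.Assertions.assertTrue

object TopicCommandTestSketch {
  // Item 2: give the magic replication factor a name instead of a bare 6.
  val TestReplicationFactor: Short = 6

  def createTopic(adminClient: Admin, testTopicName: String): Unit = {
    // Item 1 suggests TestUtils.createTopicWithAdmin() instead, which also
    // waits for metadata to propagate; this is only the "before" shape.
    adminClient.createTopics(
      Collections.singletonList(new NewTopic(testTopicName, 1, TestReplicationFactor))).all().get()
  }

  def checkDescribeOutput(rows: Array[String], testTopicName: String, output: String): Unit = {
    // Item 3: pass the captured output as the assertion message so a flaky
    // failure in CI shows what the command actually printed.
    assertTrue(rows(0).startsWith(s"\tTopic: $testTopicName"),
      s"Unexpected describe output:${System.lineSeparator()}$output") // item 4: platform-independent separator
  }
}
{code}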





Re: [VOTE] KIP-793: Allow sink connectors to be used with topic-mutating SMTs

2023-07-04 Thread Mickael Maison
Hi,

+1 (binding)
Thanks for the KIP

Mickael


On Mon, Jul 3, 2023 at 8:18 PM Greg Harris  wrote:
>
> Hey Yash,
>
> Thanks so much for your effort in the design and discussion phase!
>
> +1 (non-binding)
>
> Greg
>
> On Mon, Jul 3, 2023 at 7:19 AM Chris Egerton  wrote:
> >
> > Hi Yash,
> >
> > Thanks for the KIP! +1 (binding)
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, Jul 3, 2023 at 7:02 AM Yash Mayya  wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start a vote on KIP-793 which enables sink connector
> > > implementations to be used with SMTs that mutate the topic / partition /
> > > offset information of a record.
> > >
> > > KIP -
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-793%3A+Allow+sink+connectors+to+be+used+with+topic-mutating+SMTs
> > >
> > > Discussion thread -
> > > https://lists.apache.org/thread/dfo3spv0xtd7vby075qoxvcwsgx5nkj8
> > >
> > > Thanks,
> > > Yash
> > >


[jira] [Resolved] (KAFKA-15144) MM2 Checkpoint downstreamOffset stuck to 1

2023-07-04 Thread Edoardo Comar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edoardo Comar resolved KAFKA-15144.
---
Resolution: Not A Bug

Closing as not a bug.

The "problem" arose as without config changes, by updating MM2 from 3.3.2 to a 
later release the observable content of the Checkpoint topic has changed 
considerably.

 

In 3.3.2 even without new records in the OffsetSync topic, the Checkpoint 
records were advancing often (and even contain many duplicates). 



Now gaps of up to offset.lag.max must be expected and more reprocessing of 
records downstream may occur

> MM2 Checkpoint downstreamOffset stuck to 1
> --
>
> Key: KAFKA-15144
> URL: https://issues.apache.org/jira/browse/KAFKA-15144
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Reporter: Edoardo Comar
>Assignee: Edoardo Comar
>Priority: Major
> Attachments: edo-connect-mirror-maker-sourcetarget.properties
>
>
> Steps to reproduce:
> 1. Start the source cluster.
> 2. Start the target cluster.
> 3. Start connect-mirror-maker.sh using a config like the attached.
> 4. Create a topic in the source cluster.
> 5. Produce a few messages.
> 6. Consume them all with autocommit enabled.
>  
> 7. Then dump the Checkpoint topic content, e.g.
> {{% bin/kafka-console-consumer.sh --bootstrap-server localhost:9192 --topic 
> source.checkpoints.internal --from-beginning --formatter 
> org.apache.kafka.connect.mirror.formatters.CheckpointFormatter}}
> Checkpoint{consumerGroupId=edogroup, 
> topicPartition=source.vf-mirroring-test-edo-0, upstreamOffset=3, 
> *downstreamOffset=1*, metadata=}
>  
> The downstreamOffset remains at 1, whereas in a fresh cluster pair (with the 
> source topic created while MM2 is running) I'd expect the downstreamOffset to 
> match the upstreamOffset.
> Note that dumping the offset sync topic shows matching initial offsets:
> {{% bin/kafka-console-consumer.sh --bootstrap-server localhost:9192 --topic 
> mm2-offset-syncs.source.internal --from-beginning --formatter 
> org.apache.kafka.connect.mirror.formatters.OffsetSyncFormatter}}
> OffsetSync{topicPartition=vf-mirroring-test-edo-0, upstreamOffset=0, 
> downstreamOffset=0}
>  
>  
>  
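
As a hedged sketch, the translated offsets can also be inspected programmatically instead of dumping the checkpoint topic. RemoteClusterUtils.translateOffsets is the public MirrorMaker client API for this; the bootstrap address, cluster alias and group name below simply mirror the reproduction steps and are otherwise assumptions, not verified code.

{code:scala}
import java.time.Duration
import scala.jdk.CollectionConverters._
import org.apache.kafka.connect.mirror.RemoteClusterUtils

object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    // Point this at the *target* cluster; "source" is the alias of the upstream
    // cluster as configured in connect-mirror-maker.properties.
    val props = Map[String, AnyRef](
      "bootstrap.servers" -> "localhost:9192"
    ).asJava

    val offsets = RemoteClusterUtils.translateOffsets(
      props, "source", "edogroup", Duration.ofSeconds(30))

    // For the scenario in this ticket, the translated (downstream) offset may
    // lag the upstream offset by up to offset.lag.max.
    offsets.asScala.foreach { case (tp, oam) =>
      println(s"$tp -> ${oam.offset()}")
    }
  }
}
{code}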





Re: [DISCUSS] KIP-910: Update Source offsets for Source Connectors without producing records

2023-07-04 Thread Yash Mayya
Hi Sagar,

Thanks for your continued work on this KIP! Here are my thoughts on your
updated proposal:

1) In the proposed changes section where you talk about modifying the
offsets, could you please clarify that tasks shouldn't modify the offsets
map that is passed as an argument? Currently, the distinction between the
offsets map passed as an argument and the offsets map that is returned is
not very clear in numerous places.

2) The default return value of Optional.empty() seems to be fairly
non-intuitive considering that the return value is supposed to be the
offsets that are to be committed. Can we consider simply returning the
offsets argument itself by default instead?

3) The KIP states that "It is also possible that a task might choose to
send a tombstone record as an offset. This is not recommended and to
prevent connectors shooting themselves in the foot due to this" - could you
please clarify why this is not recommended / supported?

4) The KIP states that "If a task returns an Optional of a null object or
an Optional of an empty map, even for such cases the behaviour would
be disabled." - since this is an optional API that source task
implementations don't necessarily need to implement, I don't think I fully
follow why the return type of the proposed "updateOffsets" method is an
Optional? Can we not simply use the Map as the return type instead?
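
To make points 2 and 4 concrete, here is a purely illustrative sketch of the
*proposed* (not existing) hook, with the default returning the argument itself
as suggested above. The method name comes from the KIP discussion, but the
signature is only my approximation and nothing below is released Connect API:

    import java.util.{Map => JMap, Optional}
    import org.apache.kafka.connect.source.SourceTask

    abstract class HeartbeatingSourceTask extends SourceTask {

      // Hypothetical hook: called by the framework before an offset commit with
      // the offsets it is about to commit, keyed by source partition.
      def updateOffsets(
          offsets: JMap[JMap[String, AnyRef], JMap[String, AnyRef]])
          : Optional[JMap[JMap[String, AnyRef], JMap[String, AnyRef]]] = {
        // Per point 2, returning the argument unchanged (rather than
        // Optional.empty()) would be a more intuitive default. Tasks that want
        // to advance an offset without producing records would return a
        // modified *copy* instead of mutating the argument (point 1).
        Optional.of(offsets)
      }
    }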

5) The KIP states that "The offsets passed to the updateOffsets  method
would be the offset from the latest source record amongst all source
records per partition. This way, if the source offset for a given source
partition is updated, that offset is the one that gets committed for the
source partition." - we should clarify that the "latest" offset refers to
the offsets that are about to be committed, and not the latest offsets
returned from SourceTask::poll so far (see related discussion in
https://issues.apache.org/jira/browse/KAFKA-15091 and
https://issues.apache.org/jira/browse/KAFKA-5716).

6) We haven't used the terminology of "Atleast Once Semantics" elsewhere in
Connect since the framework itself does not (and cannot) make any
guarantees on the delivery semantics. Depending on the source connector and
the source system, both at-least once and at-most once semantics (for
example - a source system where reads are destructive) are possible. We
should avoid introducing this terminology in the KIP and instead refer to
this scenario as exactly-once support being disabled.

7) Similar to the above point, we should remove the use of the term
"Exactly Once Semantics" and instead refer to exactly-once support being
enabled since the framework can't guarantee exactly-once semantics for all
possible source connectors (for example - a message queue source connector
where offsets are essentially managed in the source system via an ack
mechanism).

8) In a previous attempt to fix this gap in functionality, a significant
concern was raised on offsets ordering guarantees when we retry sending a
batch of records (ref -
https://github.com/apache/kafka/pull/5553/files#r213329307). It doesn't
look like this KIP addresses that concern either? In the case where
exactly-once support is disabled - if we update the committableOffsets with
the offsets provided by the task through the new updateOffsets method,
these offsets could be committed before older "regular" offsets are
committed due to producer retries which could then lead to an inconsistency
if the send operation eventually succeeds.

9) The KIP states that when exactly-once support is enabled, the new
SourceTask::updateOffsets method will be invoked only when an offset flush
is attempted. If the connector is configured to use a connector specified
transaction boundary rather than a poll or interval based boundary, isn't
it possible that we don't call SourceTask::updateOffsets until there are
actual records that are also being returned through poll (which would
defeat the primary motivation of the KIP)? Or are we making the assumption
that the connector defined transaction boundary should handle this case
appropriately if needed (i.e. source tasks should occasionally request for
a transaction commit via their transaction context if they want offsets to
be committed without producing records)? If so, I think we should
explicitly call that out in the KIP.

10) The Javadoc for SourceTask::updateOffsets in the section on public
interfaces also has the same issue with the definition of latest offsets
that I've mentioned above (latest offsets from poll versus latest offsets
that are about to be committed).

11) The Javadoc for SourceTask::updateOffsets also introduces the same
confusion w.r.t updating offsets that I've mentioned above (modifying the
offsets map argument versus returning a modified copy of the offsets map).

12) In the section on compatibility, we should explicitly mention that
connectors which implement the new method will still be compatible with
older Connect runtimes where the method will simp

Re: KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-07-04 Thread Luke Chen
Hi Colin,

Thanks for the answers to my previous questions.

> Yes, the common thread here is that all of these shell commands perform
operations that can be done without the broker. So it's reasonable to allow them
to be done without going through the broker. I don't know if we need a
separate note for each since the rationale is really the same for all (is
it reasonable? if so allow it.)

Yes, it makes sense. Could we make a note about the main rationale for
selecting these command-line tools in the KIP to make it clear?
Ex: The following command-line tools will get a new --bootstrap-controllers
argument (because these shell commands perform operations that can be done
without the broker):

> kafka-reassign-partitions.sh cannot be used to move the
__cluster_metadata topic. However, it can be used to move partitions that
reside on the brokers, even when using --bootstrap-controllers to talk
directly to the quorum.

Fair enough.


4. Do all the command-line tools with `--bootstrap-controllers` support
all of their options?
For example, for kafka-configs.sh, in addition to the `--alter` option you
mentioned in the example, do we also support the `--describe` or `--delete`
options?
If so, do we also support setting "quota" for users/clients/topics... via
`--bootstrap-controllers`? (Not intuitive, but maybe we just directly
commit the change into the metadata from the controller?)

5. Do we have any plan for when this feature will be completed? v3.6.0?
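
(As a sketch only of what question 4 is probing: the bootstrap.controllers
property below is the KIP's proposed AdminClient config, not released
behaviour, and the host names are placeholders.)

    import java.util.Properties
    import org.apache.kafka.clients.admin.Admin

    object ControllerAdminSketch {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        // Proposed by KIP-919: point the client at the controller quorum
        // instead of the brokers.
        props.put("bootstrap.controllers", "ctrl1:9093,ctrl2:9093,ctrl3:9093")

        val admin = Admin.create(props)
        try {
          // kafka-configs.sh --describe would map onto describe-style calls
          // like this one, sent to the active controller.
          val cluster = admin.describeCluster()
          println(s"controller: ${cluster.controller().get()}")
        } finally admin.close()
      }
    }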


Thank you.
Luke


On Fri, Apr 28, 2023 at 1:42 AM Colin McCabe  wrote:

> On Wed, Apr 26, 2023, at 22:08, Luke Chen wrote:
> > Hi Colin,
> >
> > Some comments:
> > 1. I agree we should set "top-level" errors for metadata response
> >
> > 2. In the "brokers" field of metadata response from controller, it'll
> > respond with "Controller endpoint information as given in
> > controller.quorum.voters", instead of the "alive" controllers(voters).
> That
> > will break the existing admin client because in admin client, we'll rely
> on
> > the metadata response to build the "current alive brokers" list, and
> choose
> > one from them to connect (either least load or other criteria). That
> means,
> > if now, we return the value in `controller.quorum.voters`, but one of
> them
> > is down. We might choose it to connect and get connection errors. Should
> we
> > return the "alive" controllers(voters) to client?
>
> Hi Luke,
>
> Good question. When talking to the controllers directly, the AdminClient
> needs to always send its RPCs to the active controller. There is one
> exception: configuring ephemeral log4j settings with
> incrementalAlterConfigs must be done by sending them to the specified
> controller node.
>
> I will add this to a section called "AdminClient Implementation Notes" so
> that it's captured in the KIP.
>
> >
> > 3. In the KIP, we list the command-line tools will get a new
> > --bootstrap-controllers argument, but without explaining why these tools
> > need to talk to controller directly. Could we add some explanation about
> > them? I tried but cannot know why some tools are listed here:
> > - kafka-acls.sh -> Allow clients to update ACLs via controller before
> > brokers up
> >
> > - kafka-cluster.sh -> Reasonable to get/update cluster info via
> > controller
> >
> > - kafka-configs.sh -> Allow clients to dynamically update
> > configs/describe configs from controller. But in this script, client can
> > still set quota for users/clients/topics... is client also able to update
> > via controllers? Or we only allow partial actions in the script to talk
> to
> > controllers?
> >
> > - kafka-delegation-tokens.sh -> Reasonable to update
> delegation-tokens
> > via controllers
> >
> > - kafka-features.sh -> Reasonable
> > - kafka-metadata-quorum.sh -> Reasonable
> > - kafka-metadata-shell.sh -> Reasonable
> >
> > - kafka-reassign-partitions.sh -> Why should we allow clients to move
> > metadata log partitions in controller nodes? What's the use-case?
> >
>
> Yes, the common thread here is that all of these shell commands perform
> operations can be done without the broker. So it's reasonable to allow them
> to be done without going through the broker. I don't know if we need a
> separate note for each since the rationale is really the same for all (is
> it reasonable? if so allow it.)
>
> kafka-reassign-partitions.sh cannot be used to move the __cluster_metadata
> topic. However, it can be used to move partitions that reside on the
> brokers, even when using --bootstrap-controllers to talk directly to the
> quorum.
>
> Colin
>
> >
> > Thank you.
> > Luke
> >
> > On Thu, Apr 27, 2023 at 8:04 AM Colin McCabe  wrote:
> >
> >> On Tue, Apr 25, 2023, at 04:59, Divij Vaidya wrote:
> >> > Thank you for the KIP Colin.
> >> >
> >> > In general, I like the idea of having the ability to interact directly
> >> with
> >> > the controllers. I agree with your observation that it helps in
> >> situations
> >> > where you would want to get data directly from the

Re: KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum

2023-07-04 Thread Tom Bentley
Hi Colin,

Thanks for the KIP.

1. It mentions kafka-configs.sh as one of the affected tools, but doesn't
mention that ControllerApis doesn't currently support DESCRIBE_CONFIGS. I
think this is worth noting as it is, in effect, a change to the wire
protocol accepted by the controller, even if it's an existing RPC.
2. The diff you show for the MetadataRequest.json doesn't show a change to
the top-level "listeners" key, presumably this should add "controller"?
Similarly, per the above point, I guess we'd also be updating the JSON for
DescribeConfigs.
3. Do you have any timeline for calling a vote for this KIP?

Many thanks,

Tom

On Thu, 27 Apr 2023 at 18:51, Colin McCabe  wrote:

> On Wed, Apr 26, 2023, at 22:08, Luke Chen wrote:
> > Hi Colin,
> >
> > Some comments:
> > 1. I agree we should set "top-level" errors for metadata response
> >
> > 2. In the "brokers" field of metadata response from controller, it'll
> > respond with "Controller endpoint information as given in
> > controller.quorum.voters", instead of the "alive" controllers(voters).
> That
> > will break the existing admin client because in admin client, we'll rely
> on
> > the metadata response to build the "current alive brokers" list, and
> choose
> > one from them to connect (either least load or other criteria). That
> means,
> > if now, we return the value in `controller.quorum.voters`, but one of
> them
> > is down. We might choose it to connect and get connection errors. Should
> we
> > return the "alive" controllers(voters) to client?
>
> Hi Luke,
>
> Good question. When talking to the controllers directly, the AdminClient
> needs to always send its RPCs to the active controller. There is one
> exception: configuring ephemeral log4j settings with
> incrementalAlterConfigs must be done by sending them to the specified
> controller node.
>
> I will add this to a section called "AdminClient Implementation Notes" so
> that it's captured in the KIP.
>
> >
> > 3. In the KIP, we list the command-line tools will get a new
> > --bootstrap-controllers argument, but without explaining why these tools
> > need to talk to controller directly. Could we add some explanation about
> > them? I tried but cannot know why some tools are listed here:
> > - kafka-acls.sh -> Allow clients to update ACLs via controller before
> > brokers up
> >
> > - kafka-cluster.sh -> Reasonable to get/update cluster info via
> > controller
> >
> > - kafka-configs.sh -> Allow clients to dynamically update
> > configs/describe configs from controller. But in this script, client can
> > still set quota for users/clients/topics... is client also able to update
> > via controllers? Or we only allow partial actions in the script to talk
> to
> > controllers?
> >
> > - kafka-delegation-tokens.sh -> Reasonable to update
> delegation-tokens
> > via controllers
> >
> > - kafka-features.sh -> Reasonable
> > - kafka-metadata-quorum.sh -> Reasonable
> > - kafka-metadata-shell.sh -> Reasonable
> >
> > - kafka-reassign-partitions.sh -> Why should we allow clients to move
> > metadata log partitions in controller nodes? What's the use-case?
> >
>
> Yes, the common thread here is that all of these shell commands perform
> operations can be done without the broker. So it's reasonable to allow them
> to be done without going through the broker. I don't know if we need a
> separate note for each since the rationale is really the same for all (is
> it reasonable? if so allow it.)
>
> kafka-reassign-partitions.sh cannot be used to move the __cluster_metadata
> topic. However, it can be used to move partitions that reside on the
> brokers, even when using --bootstrap-controllers to talk directly to the
> quorum.
>
> Colin
>
> >
> > Thank you.
> > Luke
> >
> > On Thu, Apr 27, 2023 at 8:04 AM Colin McCabe  wrote:
> >
> >> On Tue, Apr 25, 2023, at 04:59, Divij Vaidya wrote:
> >> > Thank you for the KIP Colin.
> >> >
> >> > In general, I like the idea of having the ability to interact directly
> >> with
> >> > the controllers. I agree with your observation that it helps in
> >> situations
> >> > where you would want to get data directly from the controller instead
> of
> >> > going via a broker. I have some general comments but the main concern
> I
> >> > have is with the piggy-backing of error code with response of
> >> > __cluster_metadata topic.
> >> >
> >> > 1. With this change, how are we guarding against the possibility of
> >> > misbehaving client traffic from disrupting the controller (that you
> >> > mentioned as a motivation of earlier behaviour)? One solution could
> be to
> >> > have default values set for request throttling on the controller.
> >>
> >> Hi Divij,
> >>
> >> Thanks for the comments.
> >>
> >> Our guards against client misbehavior remain the same:
> >> 1. our recommendation to put the clients on a separate network
> >> 2. the fact that producers and consumers can't interact directly with
> the
> >> controller
> >> 3. the authorizer.
> >>
> >> 

[jira] [Created] (KAFKA-15145) AbstractWorkerSourceTask re-processes records filtered out by SMTs on retriable exceptions

2023-07-04 Thread Yash Mayya (Jira)
Yash Mayya created KAFKA-15145:
--

 Summary: AbstractWorkerSourceTask re-processes records filtered 
out by SMTs on retriable exceptions
 Key: KAFKA-15145
 URL: https://issues.apache.org/jira/browse/KAFKA-15145
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Yash Mayya
Assignee: Yash Mayya


If a RetriableException is thrown from an admin client or producer client 
operation in 
[AbstractWorkerSourceTask::sendRecords|https://github.com/apache/kafka/blob/5c2492bca71200806ccf776ea31639a90290d43e/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractWorkerSourceTask.java#L388],
 the send operation is retried for the remaining records in the batch. There is 
a minor bug in the logic for computing the remaining records for a batch which 
causes records that are filtered out by the task's transformation chain to be 
re-processed. This will also result in the SourceTask::commitRecord method 
being called twice for the same record, which can cause certain types of source 
connectors to fail. This bug seems to have existed since SMTs were first 
introduced in 0.10.2.
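
For readers unfamiliar with how records get "filtered out by the transformation 
chain": an SMT drops a record by returning null from apply(). A minimal, hedged 
sketch (the class name and filtering rule are made up for illustration):

{code:scala}
import java.util.{Map => JMap}
import org.apache.kafka.common.config.ConfigDef
import org.apache.kafka.connect.source.SourceRecord
import org.apache.kafka.connect.transforms.Transformation

// Minimal filtering SMT: returning null tells the runtime to drop the record.
// Records dropped this way are the ones this ticket says can be re-processed
// (and commitRecord() called twice) when a RetriableException forces a resend.
class DropTombstones extends Transformation[SourceRecord] {
  override def configure(configs: JMap[String, _]): Unit = {}

  override def apply(record: SourceRecord): SourceRecord =
    if (record.value() == null) null else record

  override def config(): ConfigDef = new ConfigDef()

  override def close(): Unit = {}
}
{code}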





Re: [DISCUSS] KIP-942: Add Power(ppc64le) support

2023-07-04 Thread Mickael Maison
Hi,

Thanks for the KIP!
In the Test Plan section you mentioned a few unit and integration
tests are failing. Are these flaky tests or did you find issues
related to ppc64le? If they are just flaky tests, I think we can
remove that table and instead describe how often we intend to run the
tests in the CI.

Thanks,
Mickael

On Tue, Jul 4, 2023 at 12:04 AM Colin McCabe  wrote:
>
> I agree with Divij. A nightly Apache Kafka build for PowerPC would be 
> welcome. But it shouldn't run on every build, since the extra time and 
> complexity would not be worth it.
>
> By the way, are there any features or plugins we don't intend to support on 
> PPC? If there are, this KIP would be a good place to spell them out.
>
> Naively, I would think all of our Java and Scala code should work on PPC 
> without changes. However, there may be library dependencies that don't exist 
> on PPC. (We have to remember that the last desktop PowerPC chip that an 
> average user could buy shipped in 2005)
>
> best,
> Colin
>
>
> On Mon, Jun 19, 2023, at 23:12, Vaibhav Nazare wrote:
> > Thank you for response Divij.
> >
> > 1. We are going to use ASF infra provided nodes for better availability
> > and stability as there are 3 power9 nodes managed officially by ASF
> > infra team themselves.
> > Ref: https://issues.apache.org/jira/browse/INFRA-24663
> > https://jenkins-ccos.apache.org/view/Shared%20-%20ppc64le%20nodes/
> > previously used power node details for apache/kafka CI:
> > RAM- 16GB
> > VCPUs- 8 VCPU
> > Disk- 160GB
> > for shared VMs we need to check with ASF infra team to provide details
> >
> > 2. We can run nightly builds once or twice a day at specific times,
> > instead of on every commit.
> > 3. apache/camel (https://builds.apache.org/job/Camel/job/el/) has already
> > enabled CI for the Power platform; they are using the same H/W resources:
> > RAM- 16GB
> > VCPUs- 8 VCPU
> > Disk- 160GB
> >
> > -Original Message-
> > From: Divij Vaidya 
> > Sent: Monday, June 19, 2023 10:20 PM
> > To: dev@kafka.apache.org
> > Subject: [EXTERNAL] Re: [DISCUSS] KIP-942: Add Power(ppc64le) support
> >
> > Thank you for the KIP Vaibhav.
> >
> > 1. Builds for power architecture were intentionally disabled in the
> > past since the infrastructure was flaky [1]. Could you please add to
> > the KIP on what has changed since then?
> > 2. What do you think about an alternative solution where we run a
> > nightly build for this platform instead of running the CI with every
> > PR/commit?
> > 3. To bolster the case for this KIP, could you please add information
> > from other Apache projects who are already running CI for this
> > platform? Is their CI stable on Apache Infra hosts?
> >
> >
> > [1] https://github.com/apache/kafka/pull/12380
> >
> > --
> > Divij Vaidya
> >
> >
> >
> > On Mon, Jun 19, 2023 at 12:30 PM Vaibhav Nazare
> >  wrote:
> >
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-942%3A+Add+Power%28ppc64le%29+support
> >>