[GitHub] [kafka-site] VikasGite opened a new pull request, #462: Dream11 powered by Apache Kafka section added

2022-11-29 Thread GitBox


VikasGite opened a new pull request, #462:
URL: https://github.com/apache/kafka-site/pull/462

   Hello Team,
   
   Please review the changes made to the powered-by page. I have added a Dream11 
section to it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.3 #128

2022-11-29 Thread Apache Jenkins Server
See 




Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1386

2022-11-29 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-866 ZooKeeper to KRaft Migration

2022-11-29 Thread Luke Chen
Thanks for the reply.
Yes, I think adding a section in Rejected Alternatives to explain the
rationale for not supporting combined-mode upgrade in this KIP would be helpful.

Thank you.
Luke

On Wed, Nov 30, 2022 at 8:47 AM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply. No other comments from me.
>
> Jun
>
> On Tue, Nov 29, 2022 at 2:57 PM David Arthur
>  wrote:
>
> > Jun,
> >
> > 51. You're right, I missed that in the latest update. It's fixed now.
> >
> > 54. I was thinking we could update meta.properties to v1 as the brokers
> > were being restarted in KRaft mode without the migration config set.
> > However, as you mentioned, it is still possible to downgrade even then
> (as
> > long as the controller has not exited dual-write mode). Upgrading the
> > meta.properties after seeing the final ZkMigrationRecord sounds like a
> good
> > idea to me. I've updated the KIP to include this detail under
> > "Meta.Properties" section.
> >
> > 58. Yes, the metadata migration from ZK to KRaft will not migrate the
> > contents of /brokers.
> >
> > Thanks,
> > David
> >
> > On Tue, Nov 29, 2022 at 4:50 PM Jun Rao 
> wrote:
> >
> > > Hi, David,
> > >
> > > Thanks for the reply.
> > >
> > > 51. Is this reflected in the KIP? It seems that ZkMigrationState still
> > has
> > > the None value.
> > >
> > > 54. Supporting both v0 and v1 indefinitely in a KRaft broker could be a
> > bit
> > > confusing and may complicate future upgrades. Another approach is to
> let
> > > KRaft broker write the v1 meta.properties after the KRaft controller
> > exits
> > > the dual write mode. We could extend ZkMigrationRecord to 3 states like
> > > migration, dualWrite and KRaftOnly. Once a broker sees KRaftOnly, it
> will
> > > write meta.properties in v1 format. At that point, downgrade could
> cause
> > > metadata loss and require manual work. Will that work?
> > >
> > > 58. When copying metadata from ZK to KRaft, I guess we will ignore
> broker
> > > registration since the KRaft controller has already generated a
> > > BrokerRegistrationRecord based on BrokerRegistrationRequest?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Nov 29, 2022 at 7:14 AM David Arthur
> > >  wrote:
> > >
> > > > Jun, Thanks for the comments. Igor, please see 54 below for some
> > > additional
> > > > discussion on the meta.properties
> > > >
> > > > 50.1 Yes, that field name sounds fine to me.
> > > >
> > > > 50.2 Ok, I'll add something to the KIP under the Controller section.
> To
> > > > your other question, NoOpRecords are used as part of our liveness
> check
> > > for
> > > > the quorum. It doesn't produce any metadata really, so I don't think
> it
> > > > causes any harm to let it happen before the migration.  KIP-835 has
> the
> > > > details on the NoOpRecords
> > > >
> > > > 54. Colin and I discussed the meta.properties issue last night. How
> > about
> > > > we simply let the KRaft broker accept v0 or v1 meta.properties. At
> this
> > > > point, the two versions have the same contents, but different field
> > > names.
> > > > By leaving the meta.properties intact, we can simplify the downgrade
> > > > process. If we care to, we could rewrite meta.properties once a
> broker
> > is
> > > > restarted after the migration is finalized (migration config
> disabled).
> > > >
> > > > 57. If a ZK broker can't send a BrokerRegistrationRequest because the
> > > > quorum is unavailable, it should just continue operating normally.
> > Once a
> > > > leader is available, the broker will send the registration and start
> > > > heart-beating. Unlike KRaft mode, we won't block startup on a
> > successful
> > > > BrokerRegistration response. Concretely, BrokerLifecycleManager will
> > keep
> > > > trying to contact the quorum in its own thread until the
> > > > BrokerToChannelManager gets a controller ID from KafkaRaftManager.
> This
> > > > shouldn't interfere with other ZK broker activity.
> > > >
> > > > -David
> > > >
> > > > >
> > > >
> > > > --
> > > > -David
> > > >
> > >
> >
> >
> > --
> > -David
> >
>


Re: [VOTE] KIP-866 ZooKeeper to KRaft Migration

2022-11-29 Thread Jun Rao
Hi, David,

Thanks for the KIP. +1

Jun

On Tue, Nov 22, 2022 at 3:43 PM Jason Gustafson 
wrote:

> Thanks, +1 from me. I suspect we might be in for at least one surprise with
> the re-implemented controller RPCs, but I agree the alternative has risks
> as well.
>
> Best,
> Jason
>
> On Mon, Nov 14, 2022 at 12:00 PM Colin McCabe  wrote:
>
> > On Fri, Nov 11, 2022, at 08:59, David Arthur wrote:
> > > Thanks, Colin.
> > >
> > >> never start an upgrade without first verifying the quorum
> configuration
> > on the ZK-based brokers
> > >
> > > I agree that this is a pretty big benefit. I could imagine debugging
> > > and fixing connection problems mid-migration would be a big pain.
> > > Especially if you had some brokers correctly configured, and others
> > > not.
> > >
> > > Adding a heartbeat raises some questions about what to do if a broker
> > > goes into a bad state, or stops heartbeating, during a migration.
> > > However, I think the same is true for a registration based approach,
> > > so maybe it's not an increase in net complexity.
> > >
> >
> > Hi David,
> >
> > Yeah. I think the goal should be for the set of heartbeaters to match the
> > set of broker registrations under /brokers
> >
> > Obviously, people could add or remove brokers after the upgrade has
> begun,
> > but that's unavoidable, I think. We can at least ensure that at the time
> we
> > enter upgrade, all the brokers are ready.
> >
> > > I've replaced the ZK registration section with a new RPC and brief
> > > description. Please take a look.
> > >
> >
> > Thanks, David. With these changes it LGTM to me.
> >
> > +1 (binding)
> >
> > Colin
> >
> > > Thanks!
> > > David
> > >
> > > On Wed, Nov 9, 2022 at 5:46 PM Colin McCabe 
> wrote:
> > >>
> > >> Hi David,
> > >>
> > >> Thanks for the response. Replies inline.
> > >>
> > >> On Wed, Nov 9, 2022, at 08:17, David Arthur wrote:
> > >> > Colin
> > >> >
> > >> >>  Maybe zk.metadata.migration.enable ?
> > >> >
> > >> > Done. I went with "zookeeper.metadata.migration.enable" since our
> > >> > other ZK configs start with "zookeeper.*"
> > >> >
> > >> >> SImilarly, for MigrationRecord: can we rename this to
> > ZkMigrationStateRecord? Then change MigrationState -> ZkMigrationState.
> > >> >
> > >> > Sure
> > >> >
> > >> >> With ZkMigrationStateRecord, one thing to keep in mind here is that
> > we will eventually compact all the metadata logs into a snapshot. That
> > snapshot will then have to keep alive the memory of the old migration. So
> > it is not really a matter of replaying the old metadata logs (probably)
> but
> > a matter of checking to see what the ZkMigrationState is, which I suppose
> > could be Optional. If it's not Optional.empty, we
> already
> > migrated / are migrating.
> > >> >
> > >> > Yea, makes sense.
> > >> >
> > >> >> For the /migration ZNode, is "last_update_time_ms" necessary? I
> > thought ZK already tracked this information in the mzxid of the znode?
> > >> >
> > >> > Yes, Jun pointed this out previously, I missed this update in the
> KIP.
> > >> > Fixed now.
> > >> >
> > >> >> It is true that technically it is only needed in UMR, but I would
> > still suggest including KRaftControllerId in LeaderAndIsrRequest because
> it
> > will make debugging much easier.
> > >> >>
> > >> >> I would suggest not implementing the topic deletion state machine,
> > but just deleting topics eagerly when in migration mode. We can implement
> > this behavior change by keying off of whether KRaftControllerId is
> present
> > in LeaderAndIsrRequest. On broker startup, we'll be sent a full
> > LeaderAndIsrRequest and can delete stray partitions whose IDs are not as
> > expected (again, this behavior change would only be for migration mode)
> > >> >
> > >> > Sounds good to me. Since this is somewhat of an implementation
> detail,
> > >> > do you think we need this included in the KIP?
> > >>
> > >> Yeah, maybe we don't need to go into the delete behavior here. But I
> > think the KIP should specify that we have KRaftControllerId in both
> > LeaderAndIsrRequest. That will allow us to implement this behavior
> > conditionally on zk-based brokers when in dual write mode.
> > >>
> > >> >
> > >> >> For existing KRaft controllers, will
> > kafka.controller:type=KafkaController,name=MigrationState show up as 4
> > (MigrationFinalized)? I assume this is true, but it would be good to
> spell
> > it out. Sorry if this is answered somewhere else.
> > >> >
> > >> > We discussed using 0 (None) as the value to report for original,
> > >> > un-migrated KRaft clusters. 4 (MigrationFinalized) would be for
> > >> > clusters which underwent a migration. I have some description of
> this
> > >> > in the table under "Migration Overview"
> > >> >
> > >>
> > >> I don't feel that strongly about this, but wouldn't it be a good idea
> > for MigrationState to have a different value for ZK-based clusters and
> > KRaft-based clusters? If you have a bunch of clusters and you take an
> > aggregate of this metric, it 

Re: [DISCUSS] KIP-866 ZooKeeper to KRaft Migration

2022-11-29 Thread Jun Rao
Hi, David,

Thanks for the reply. No other comments from me.

Jun

On Tue, Nov 29, 2022 at 2:57 PM David Arthur
 wrote:

> Jun,
>
> 51. You're right, I missed that in the latest update. It's fixed now.
>
> 54. I was thinking we could update meta.properties to v1 as the brokers
> were being restarted in KRaft mode without the migration config set.
> However, as you mentioned, it is still possible to downgrade even then (as
> long as the controller has not exited dual-write mode). Upgrading the
> meta.properties after seeing the final ZkMigrationRecord sounds like a good
> idea to me. I've updated the KIP to include this detail under
> "Meta.Properties" section.
>
> 58. Yes, the metadata migration from ZK to KRaft will not migrate the
> contents of /brokers.
>
> Thanks,
> David
>
> On Tue, Nov 29, 2022 at 4:50 PM Jun Rao  wrote:
>
> > Hi, David,
> >
> > Thanks for the reply.
> >
> > 51. Is this reflected in the KIP? It seems that ZkMigrationState still
> has
> > the None value.
> >
> > 54. Supporting both v0 and v1 indefinitely in a KRaft broker could be a
> bit
> > confusing and may complicate future upgrades. Another approach is to let
> > KRaft broker write the v1 meta.properties after the KRaft controller
> exits
> > the dual write mode. We could extend ZkMigrationRecord to 3 states like
> > migration, dualWrite and KRaftOnly. Once a broker sees KRaftOnly, it will
> > write meta.properties in v1 format. At that point, downgrade could cause
> > metadata loss and require manual work. Will that work?
> >
> > 58. When copying metadata from ZK to KRaft, I guess we will ignore broker
> > registration since the KRaft controller has already generated a
> > BrokerRegistrationRecord based on BrokerRegistrationRequest?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Nov 29, 2022 at 7:14 AM David Arthur
> >  wrote:
> >
> > > Jun, Thanks for the comments. Igor, please see 54 below for some
> > additional
> > > discussion on the meta.properties
> > >
> > > 50.1 Yes, that field name sounds fine to me.
> > >
> > > 50.2 Ok, I'll add something to the KIP under the Controller section. To
> > > your other question, NoOpRecords are used as part of our liveness check
> > for
> > > the quorum. It doesn't produce any metadata really, so I don't think it
> > > causes any harm to let it happen before the migration.  KIP-835 has the
> > > details on the NoOpRecords
> > >
> > > 54. Colin and I discussed the meta.properties issue last night. How
> about
> > > we simply let the KRaft broker accept v0 or v1 meta.properties. At this
> > > point, the two versions have the same contents, but different field
> > names.
> > > By leaving the meta.properties intact, we can simplify the downgrade
> > > process. If we care to, we could rewrite meta.properties once a broker
> is
> > > restarted after the migration is finalized (migration config disabled).
> > >
> > > 57. If a ZK broker can't send a BrokerRegistrationRequest because the
> > > quorum is unavailable, it should just continue operating normally.
> Once a
> > > leader is available, the broker will send the registration and start
> > > heart-beating. Unlike KRaft mode, we won't block startup on a
> successful
> > > BrokerRegistration response. Concretely, BrokerLifecycleManager will
> keep
> > > trying to contact the quorum in its own thread until the
> > > BrokerToChannelManager gets a controller ID from KafkaRaftManager. This
> > > shouldn't interfere with other ZK broker activity.
> > >
> > > -David
> > >
> > > >
> > >
> > > --
> > > -David
> > >
> >
>
>
> --
> -David
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1385

2022-11-29 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 514987 lines...]
[2022-11-30T00:46:28.342Z] > Task :connect:api:testJar
[2022-11-30T00:46:28.342Z] > Task :connect:api:testSrcJar
[2022-11-30T00:46:28.342Z] 
[2022-11-30T00:46:28.342Z] > Task :streams:javadoc
[2022-11-30T00:46:28.342Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/processor/StreamPartitioner.java:50:
 warning - Tag @link: reference not found: 
org.apache.kafka.clients.producer.internals.DefaultPartitioner
[2022-11-30T00:46:28.342Z] 
[2022-11-30T00:46:28.342Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2022-11-30T00:46:28.342Z] > Task :connect:api:publishToMavenLocal
[2022-11-30T00:46:30.408Z] 
[2022-11-30T00:46:30.408Z] > Task :streams:javadoc
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:854:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:890:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:919:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:939:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:84:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:136:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/Produced.java:147:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:101:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:30.408Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/kstream/Repartitioned.java:167:
 warning - Tag @link: reference not found: DefaultPartitioner
[2022-11-30T00:46:31.440Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/TopologyConfig.java:62:
 warning - Tag @link: missing '#': "org.apache.kafka.streams.StreamsBuilder()"
[2022-11-30T00:46:31.440Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/TopologyConfig.java:62:
 warning - Tag @link: can't find org.apache.kafka.streams.StreamsBuilder() in 
org.apache.kafka.streams.TopologyConfig
[2022-11-30T00:46:31.440Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/TopologyDescription.java:38:
 warning - Tag @link: reference not found: ProcessorContext#forward(Object, 
Object) forwards
[2022-11-30T00:46:31.440Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/query/Position.java:44:
 warning - Tag @link: can't find query(Query,
[2022-11-30T00:46:31.440Z]  PositionBound, boolean) in 
org.apache.kafka.streams.processor.StateStore
[2022-11-30T00:46:31.440Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_trunk_2/streams/src/main/java/org/apache/kafka/streams/query/QueryResult.java:109:
 warning - Tag @link: reference not found: this#getResult()
[2022-11-30T00:46:31.440Z] 

[jira] [Created] (KAFKA-14424) Cancellation of an ongoing replica reassignment should have sanity checks

2022-11-29 Thread Lucas Wang (Jira)
Lucas Wang created KAFKA-14424:
--

 Summary: Cancellation of an ongoing replica reassignment should 
have sanity checks
 Key: KAFKA-14424
 URL: https://issues.apache.org/jira/browse/KAFKA-14424
 Project: Kafka
  Issue Type: Improvement
Reporter: Lucas Wang


When reassigning replicas, Kafka runs a sanity check to ensure all of the 
target replicas are alive before allowing the reassignment request to proceed.
However, for an AlterPartitionReassignments request that cancels an ongoing 
reassignment, there is no such check.
The result is that if the original replicas are offline, the cancellation may 
result in partitions
without any leaders. This problem has been observed in our clusters.

 

There should be a sanity check to ensure the cancellation would also land 
the partitions in valid states, e.g. by ensuring all of the original replicas 
are alive.
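For context, a cancellation is expressed through the Admin API by passing an empty 
reassignment for each affected partition. A minimal sketch (bootstrap servers, topic, 
and partition are illustrative):

import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class CancelReassignmentExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Optional.empty() cancels the ongoing reassignment for the partition.
            // The proposed sanity check would verify that the original replicas are
            // alive before allowing this cancellation to proceed.
            Map<TopicPartition, Optional<NewPartitionReassignment>> cancellation =
                    Map.of(new TopicPartition("my-topic", 0),
                           Optional.<NewPartitionReassignment>empty());
            admin.alterPartitionReassignments(cancellation).all().get();
        }
    }
}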



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-866 ZooKeeper to KRaft Migration

2022-11-29 Thread David Arthur
Jun,

51. You're right, I missed that in the latest update. It's fixed now.

54. I was thinking we could update meta.properties to v1 as the brokers
were being restarted in KRaft mode without the migration config set.
However, as you mentioned, it is still possible to downgrade even then (as
long as the controller has not exited dual-write mode). Upgrading the
meta.properties after seeing the final ZkMigrationRecord sounds like a good
idea to me. I've updated the KIP to include this detail under
"Meta.Properties" section.

58. Yes, the metadata migration from ZK to KRaft will not migrate the
contents of /brokers.

Thanks,
David

On Tue, Nov 29, 2022 at 4:50 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply.
>
> 51. Is this reflected in the KIP? It seems that ZkMigrationState still has
> the None value.
>
> 54. Supporting both v0 and v1 indefinitely in a KRaft broker could be a bit
> confusing and may complicate future upgrades. Another approach is to let
> KRaft broker write the v1 meta.properties after the KRaft controller exits
> the dual write mode. We could extend ZkMigrationRecord to 3 states like
> migration, dualWrite and KRaftOnly. Once a broker sees KRaftOnly, it will
> write meta.properties in v1 format. At that point, downgrade could cause
> metadata loss and require manual work. Will that work?
>
> 58. When copying metadata from ZK to KRaft, I guess we will ignore broker
> registration since the KRaft controller has already generated a
> BrokerRegistrationRecord based on BrokerRegistrationRequest?
>
> Thanks,
>
> Jun
>
> On Tue, Nov 29, 2022 at 7:14 AM David Arthur
>  wrote:
>
> > Jun, Thanks for the comments. Igor, please see 54 below for some
> additional
> > discussion on the meta.properties
> >
> > 50.1 Yes, that field name sounds fine to me.
> >
> > 50.2 Ok, I'll add something to the KIP under the Controller section. To
> > your other question, NoOpRecords are used as part of our liveness check
> for
> > the quorum. It doesn't produce any metadata really, so I don't think it
> > causes any harm to let it happen before the migration.  KIP-835 has the
> > details on the NoOpRecords
> >
> > 54. Colin and I discussed the meta.properties issue last night. How about
> > we simply let the KRaft broker accept v0 or v1 meta.properties. At this
> > point, the two versions have the same contents, but different field
> names.
> > By leaving the meta.properties intact, we can simplify the downgrade
> > process. If we care to, we could rewrite meta.properties once a broker is
> > restarted after the migration is finalized (migration config disabled).
> >
> > 57. If a ZK broker can't send a BrokerRegistrationRequest because the
> > quorum is unavailable, it should just continue operating normally. Once a
> > leader is available, the broker will send the registration and start
> > heart-beating. Unlike KRaft mode, we won't block startup on a successful
> > BrokerRegistration response. Concretely, BrokerLifecycleManager will keep
> > trying to contact the quorum in its own thread until the
> > BrokerToChannelManager gets a controller ID from KafkaRaftManager. This
> > shouldn't interfere with other ZK broker activity.
> >
> > -David
> >
> > >
> >
> > --
> > -David
> >
>


-- 
-David


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1384

2022-11-29 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-866 ZooKeeper to KRaft Migration

2022-11-29 Thread Jun Rao
Hi, David,

Thanks for the reply.

51. Is this reflected in the KIP? It seems that ZkMigrationState still has
the None value.

54. Supporting both v0 and v1 indefinitely in a KRaft broker could be a bit
confusing and may complicate future upgrades. Another approach is to let
KRaft broker write the v1 meta.properties after the KRaft controller exits
the dual write mode. We could extend ZkMigrationRecord to 3 states like
migration, dualWrite and KRaftOnly. Once a broker sees KRaftOnly, it will
write meta.properties in v1 format. At that point, downgrade could cause
metadata loss and require manual work. Will that work?
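A rough sketch of the three states described here (names taken from this message; the 
actual record schema and enum in the KIP may differ):

// Illustrative only: possible values for an extended ZkMigrationRecord state.
public enum ZkMigrationState {
    MIGRATION,   // metadata migration from ZooKeeper is in progress
    DUAL_WRITE,  // KRaft controller is writing metadata to both KRaft and ZooKeeper
    KRAFT_ONLY   // dual-write has been exited; brokers may now write v1 meta.properties
}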

58. When copying metadata from ZK to KRaft, I guess we will ignore broker
registration since the KRaft controller has already generated a
BrokerRegistrationRecord based on BrokerRegistrationRequest?

Thanks,

Jun

On Tue, Nov 29, 2022 at 7:14 AM David Arthur
 wrote:

> Jun, Thanks for the comments. Igor, please see 54 below for some additional
> discussion on the meta.properties
>
> 50.1 Yes, that field name sounds fine to me.
>
> 50.2 Ok, I'll add something to the KIP under the Controller section. To
> your other question, NoOpRecords are used as part of our liveness check for
> the quorum. It doesn't produce any metadata really, so I don't think it
> causes any harm to let it happen before the migration.  KIP-835 has the
> details on the NoOpRecords
>
> 54. Colin and I discussed the meta.properties issue last night. How about
> we simply let the KRaft broker accept v0 or v1 meta.properties. At this
> point, the two versions have the same contents, but different field names.
> By leaving the meta.properties intact, we can simplify the downgrade
> process. If we care to, we could rewrite meta.properties once a broker is
> restarted after the migration is finalized (migration config disabled).
>
> 57. If a ZK broker can't send a BrokerRegistrationRequest because the
> quorum is unavailable, it should just continue operating normally. Once a
> leader is available, the broker will send the registration and start
> heart-beating. Unlike KRaft mode, we won't block startup on a successful
> BrokerRegistration response. Concretely, BrokerLifecycleManager will keep
> trying to contact the quorum in its own thread until the
> BrokerToChannelManager gets a controller ID from KafkaRaftManager. This
> shouldn't interfere with other ZK broker activity.
>
> -David
>
> >
>
> --
> -David
>


Re: [DISCUSS] KIP-883: Add delete callback method to Connector API

2022-11-29 Thread Chris Egerton
Hi Hector,

Thanks for the KIP! Here are my initial thoughts:

1. I like the simplicity of an overloaded stop method, but there is some
asymmetry between stopping a connector and deleting one. If a connector is
stopped (for rebalance, to be reconfigured, etc.) and a failure occurs
then, the failure will be clearly visible in the REST API via, e.g., the
GET /connectors/{connector}/status endpoint. If a connector is deleted and
a failure occurs, with the current proposal, users won't have the same
level of visibility. How can we clearly surface failures caused during the
"destroy" phase of a connector's lifecycle to users?

2. I don't think that this new feature should be used to control (delete)
offsets for connectors. We're addressing that separately in KIP-875, and it
could be a source of headaches for users if they discover that some
connectors' offsets persist across deletion/recreation while others do not.
If anything, we should explicitly recommend against this kind of logic in
the Javadocs for the newly-introduced method.

3. Is it worth trying to give all of the connector's tasks a chance to shut
down before invoking "stop(true)" on the Connector? If so, any thoughts on
how we can accomplish that?

4. Just to make sure we're on the same page--this feature is not being
proposed so that connectors can try to delete the data that they've
produced (i.e., that sink connectors have written to the external system,
or that source connectors have written to Kafka), right?
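For reference, the change under discussion boils down to an overload on the existing 
stop callback, roughly like the sketch below (illustrative only; the exact signature 
and javadoc are defined in the KIP):

// Sketch of the proposed overload, not the actual Connector class.
public abstract class ExampleConnector {

    // Existing lifecycle callback, invoked whenever the connector is stopped.
    public abstract void stop();

    // Proposed overload: isDeleted == true signals that the connector is being stopped
    // because it was deleted, so implementations may clean up external assets.
    // The default implementation delegates to stop() for backward compatibility.
    public void stop(boolean isDeleted) {
        stop();
    }
}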

Cheers,

Chris

On Thu, Nov 17, 2022 at 5:31 PM Hector Geraldino (BLOOMBERG/ 919 3RD A) <
hgerald...@bloomberg.net> wrote:

> Hi,
>
> I've updated the KIP with the new #stop(boolean isDeleted) overloaded
> method, and have also amended the PR and JIRA tickets. I also added a
> couple entries to the "Rejected alternatives" section with the reasons why
> I pivoted from introducing new callback methods to retrofitting the existing
> one.
>
> Please let me know what your thoughts are.
>
> Cheers,
> Hector
>
> From: Hector Geraldino (BLOOMBERG/ 919 3RD A) At: 11/16/22 17:38:59
> UTC-5:00To:  dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-883: Add delete callback method to Connector API
>
> Hi Mickael,
>
> I agree that the new STOPPED state proposed in KIP-875 will improve the
> connector lifecycle. The changes proposed in this KIP aim to cover the gap
> where connectors need to actually be deleted, but because the API doesn't
> provide any hooks, external assets are left lingering where they shouldn't.
>
> I agree that this proposal is similar to KIP-419, maybe the main
> difference is their focus on Tasks whereas KIP-883 proposes changes to the
> Connector. My goal is to figure out the correct semantics for notifying
> connectors that they're being stopped because the connector has been
> deleted.
>
> Now, computing the "deleted" state in both the Standalone and Distributed
> herders is not hard, so the question is: when shall the connector be
> notified? The "easiest" option would be to do it by calling an overloaded
> Connector#stop(deleted) method, but there are other - more expressive -
> ways, like providing an 'onDelete()' or 'destroy()' method.
>
> I'm leaning towards adding an overload method (less complexity, known
> corner cases), and will amend the KIP with the reasoning behind that
> decision soon.
>
> Thanks for your feedback!
>
> From: dev@kafka.apache.org At: 11/16/22 11:13:17 UTC-5:00To:
> dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-883: Add delete callback method to Connector API
>
> Hi Hector,
>
> Thanks for the KIP.
>
> One tricky aspect is that currently there's no real way to stop a
> connector so to do so people often just delete them temporarily.
> KIP-875 proposes adding a mechanism to properly stop connectors which
> should reduce the need to deleting them and avoid doing potentially
> expensive cleanup operations repetitively.
>
> This KIP also reminds me of KIP-419:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-419%3A+Safely+notify+Kafka
> +Connect+SourceTask+is+stopped.
> Is it guaranteed the new delete callback will be the last method
> called?
>
> Thanks,
> Mickael
>
>
> On Tue, Nov 15, 2022 at 5:40 PM Sagar  wrote:
> >
> > Hey Hector,
> >
> > Thanks for the KIP. I have a minor suggestion in terms of naming. Since
> > this is a callback method, would it make sense to call it onDelete()?
> >
> > Also, the failure scenarios discussed by Greg would need handling. Among
> > other things, I like the idea of having a timeout for graceful shutdown
> or
> > else try a force shutdown. What do you think about that approach?
> >
> > Thanks!
> > Sagar.
> >
> > On Sat, Nov 12, 2022 at 1:53 AM Hector Geraldino (BLOOMBERG/ 919 3RD A) <
> > hgerald...@bloomberg.net> wrote:
> >
> > > Thanks Greg for taking your time to review not just the KIP but also
> the
> > > PR.
> > >
> > > 1. You made very valid points regarding the behavior of the destroy()
> > > callback for connectors that don't follow the happy path. 

Re: [DISCUSS] KIP-877: Mechanism for plugins and connectors to register metrics

2022-11-29 Thread Chris Egerton
Hi Mickael,

Thanks for the KIP! This seems especially useful to reduce the
implementation cost and divergence in behavior for connectors that choose
to publish their own metrics.

My initial thoughts:

1. Are you certain that the default implementation of the "metrics" method
for the various connector/task context classes will be used on older
versions of the Connect runtime? My understanding was that a
NoSuchMethodError (or some similar classloading exception) would be thrown
in that case. If that turns out to be true, WDYT about having the Connector
and Task classes implement the Monitorable interface, both for
consistency's sake, and to prevent classloading headaches?

2. Although I agree that administrators should be careful about which
plugins they run on their clients, Connect clusters, etc., I wonder if
there might still be value in wrapping the Metrics class behind a new
interface, for a few reasons:

  a. Developers and administrators may still make mistakes, and if we can
reduce the blast radius by preventing plugins from, e.g., closing the
Metrics instance we give them, it may be worth it. This could also be
accomplished by forbidding plugins from invoking these methods, and giving
them a subclass of Metrics that throws UnsupportedOperationException from
these methods.

  b. If we don't know of any reasonable use cases for closing the instance,
adding new reporters, removing metrics, etc., it can make the API cleaner
and easier for developers to grok if they don't even have the option to do
any of those things.

  c. Interoperability between plugins that implement Monitorable and their
runtime becomes complicated. For example, a connector may be built against
a version of Kafka that introduces new methods for the Metrics class, which
introduces risks of incompatibility if its developer chooses to take
advantage of these methods without realizing that they will not be
available on Connect runtimes built against an older version of Kafka. With
a wrapper interface, we at least have a chance to isolate these issues so
that the Metrics class can be expanded without adding footguns for plugins
that implement Monitorable, and to call out potential compatibility
problems in documentation more clearly if/when we do expand the wrapper
interface.

3. It'd be nice to see a list of exactly which plugins will be able to take
advantage of the new Monitorable interface.
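To make point 2 concrete, the kind of wrapper interface suggested there could look 
something like the sketch below (the name and methods are hypothetical, not part of 
the KIP):

import java.util.Map;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.metrics.Sensor;

// Hypothetical narrow facade over Metrics: plugins can create and record their own
// metrics, but cannot close the underlying Metrics instance, add reporters, or remove
// metrics they do not own.
public interface PluginMetrics {
    MetricName metricName(String name, String description, Map<String, String> tags);
    Sensor sensor(String name);
    void removeSensor(String name);
}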

Looking forward to your thoughts!

Cheers,

Chris

On Mon, Nov 7, 2022 at 11:42 AM Mickael Maison 
wrote:

> Hi,
>
> I have opened KIP-877 to make it easy for plugins and connectors to
> register their own metrics:
>
> https://eu01.z.antigena.com/l/9lWv8kbU9CKs2LajwgfKF~yMNQVM7rWRxYmYVNrHU_2nQbisTiXYZdowNfQ-NcgF1uai2lk-sv6hJASnbdr_gqVwyVae_~y-~oq5yQFgO_-IHD3UGDn3lsIyauAG2tG6giPJH-9yCYg3Hwe26sm7nep258qB6SNXRwpaVxbU3SaVTybfLQVvTn_uUlHKMhmVnpnc1dUnusK6x4j8JPPQQ1Ce~rrg-nsSLouHHMf0ewmpsFNy4BcbMaqHd4Y
>
> Let me know if you have any feedback or suggestions.
>
> Thanks,
> Mickael
>
>


Re: [DISCUSS] KIP-890 Server Side Defense

2022-11-29 Thread Justine Olshan
Hi Artem and Jeff,


Thanks for taking a look and sorry for the slow response.

You both mentioned the change to handle UNKNOWN_PRODUCER_ID errors. To be
clear — this error code will only be sent again when the client's request
version is high enough to ensure we handle it correctly.
The current (Java) client handles this by the following (somewhat long)
code snippet:

// An UNKNOWN_PRODUCER_ID means that we have lost the producer state on the broker. Depending on the log start
// offset, we may want to retry these, as described for each case below. If none of those apply, then for the
// idempotent producer, we will locally bump the epoch and reset the sequence numbers of in-flight batches from
// sequence 0, then retry the failed batch, which should now succeed. For the transactional producer, allow the
// batch to fail. When processing the failed batch, we will transition to an abortable error and set a flag
// indicating that we need to bump the epoch (if supported by the broker).
if (error == Errors.UNKNOWN_PRODUCER_ID) {
    if (response.logStartOffset == -1) {
        // We don't know the log start offset with this response. We should just retry the request until we get it.
        // The UNKNOWN_PRODUCER_ID error code was added along with the new ProduceResponse which includes the
        // logStartOffset. So the '-1' sentinel is not for backward compatibility. Instead, it is possible for
        // a broker to not know the logStartOffset at when it is returning the response because the partition
        // may have moved away from the broker from the time the error was initially raised to the time the
        // response was being constructed. In these cases, we should just retry the request: we are guaranteed
        // to eventually get a logStartOffset once things settle down.
        return true;
    }

    if (batch.sequenceHasBeenReset()) {
        // When the first inflight batch fails due to the truncation case, then the sequences of all the other
        // in flight batches would have been restarted from the beginning. However, when those responses
        // come back from the broker, they would also come with an UNKNOWN_PRODUCER_ID error. In this case, we should not
        // reset the sequence numbers to the beginning.
        return true;
    } else if (lastAckedOffset(batch.topicPartition).orElse(NO_LAST_ACKED_SEQUENCE_NUMBER) < response.logStartOffset) {
        // The head of the log has been removed, probably due to the retention time elapsing. In this case,
        // we expect to lose the producer state. For the transactional producer, reset the sequences of all
        // inflight batches to be from the beginning and retry them, so that the transaction does not need to
        // be aborted. For the idempotent producer, bump the epoch to avoid reusing (sequence, epoch) pairs
        if (isTransactional()) {
            txnPartitionMap.startSequencesAtBeginning(batch.topicPartition, this.producerIdAndEpoch);
        } else {
            requestEpochBumpForPartition(batch.topicPartition);
        }
        return true;
    }

    if (!isTransactional()) {
        // For the idempotent producer, always retry UNKNOWN_PRODUCER_ID errors. If the batch has the current
        // producer ID and epoch, request a bump of the epoch. Otherwise just retry the produce.
        requestEpochBumpForPartition(batch.topicPartition);
        return true;
    }
}


I was considering keeping this behavior — but am open to simplifying it.



We are leaving changes to older clients off the table here since such changes caused
many issues for clients in the past. Previously this was a fatal error and
we didn't have the mechanisms in place to detect when this was a legitimate
case vs some bug or gap in the protocol. Ensuring each transaction has its
own epoch should close this gap.




And to address Jeff's second point:
*does the typical produce request path append records to local log along*

*with the currentTxnFirstOffset information? I would like to understand*

*when the field is written to disk.*


Yes, the first produce request populates this field and writes the offset
as part of the record batch and also to the producer state snapshot. When
we reload the records on restart and/or reassignment, we repopulate this
field with the snapshot from disk along with the rest of the producer state.

Let me know if there are further comments and/or questions.

Thanks,
Justine

On Tue, Nov 22, 2022 at 9:00 PM Jeff Kim 
wrote:

> Hi Justine,
>
> Thanks for the KIP! I have two questions:
>
> 1) For new clients, we can once again return an error UNKNOWN_PRODUCER_ID
> for sequences
> that are non-zero when there is no producer state present on the server.
> This will indicate we missed the 0 sequence and we don't yet want to write
> to the log.
>
> I would like to understand the current behavior to handle older clients,
> and if there are any changes we are making. 

Re: [DISCUSS] KIP-864: Add End-To-End Latency Metrics to Connectors

2022-11-29 Thread Chris Egerton
Hi Jorge,

Thanks! What were your thoughts on the possible benchmarking and/or
downgrading of per-record metrics to DEBUG?

Cheers,

Chris

On Thu, Nov 24, 2022 at 8:20 AM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks Chris! I have updated the KIP with "transform" instead of "alias".
> Agree it's clearer.
>
> Cheers,
> Jorge.
>
> On Mon, 21 Nov 2022 at 21:36, Chris Egerton 
> wrote:
>
> > Hi Jorge,
> >
> > Thanks for the updates, and apologies for the delay. The new diagram
> > directly under the "Proposed Changes" section is absolutely gorgeous!
> >
> >
> > Follow-ups:
> >
> > RE 2: Good point. We can use the same level for these metrics, it's not a
> > big deal.
> >
> > RE 3: As long as all the per-record metrics are kept at DEBUG level, it
> > should be fine to leave JMH benchmarking for a follow-up. If we want to
> add
> > new per-record, INFO-level metrics, I would be more comfortable with
> > including benchmarking as part of the testing plan for the KIP. One
> > possible compromise could be to propose that these features be merged at
> > DEBUG level, and then possibly upgraded to INFO level in the future
> pending
> > benchmarks to guard against performance degradation.
> >
> > RE 4: I think for a true "end-to-end" metric, it'd be useful to include
> the
> > time taken by the task to actually deliver the record. However, with the
> > new metric names and descriptions provided in the KIP, I have no
> objections
> > with what's currently proposed, and a new "end-to-end" metric can be
> taken
> > on later in a follow-up KIP.
> >
> > RE 6: You're right, existing producer metrics should be enough for now.
> We
> > can revisit this later if/when we add delivery-centric metrics for sink
> > tasks as well.
> >
> > RE 7: The new metric names in the KIP LGTM; I don't see any need to
> expand
> > beyond those but if you'd still like to pursue others, LMK.
> >
> >
> > New thoughts:
> >
> > One small thought: instead of "alias" in "alias="{transform_alias}" for
> the
> > per-transform metrics, could we use "transform"? IMO it's clearer since
> we
> > don't use "alias" in the names of transform-related properties, and
> "alias"
> > may be confused with the classloading term where you can use, e.g.,
> > "FileStreamSource" as the name of a connector class in a connector config
> > instead of "org.apache.kafka.connect.file.FileStreamSourceConnector".
> >
> >
> > Cheers,
> >
> > Chris
> >
> > On Fri, Nov 18, 2022 at 12:06 PM Jorge Esteban Quilcate Otoya <
> > quilcate.jo...@gmail.com> wrote:
> >
> > > Thanks Mickael!
> > >
> > >
> > > On Wed, 9 Nov 2022 at 15:54, Mickael Maison 
> > > wrote:
> > >
> > > > Hi Jorge,
> > > >
> > > > Thanks for the KIP, it is a nice improvement.
> > > >
> > > > 1) The per transformation metrics still have a question mark next to
> > > > them in the KIP. Do you want to include them? If so we'll want to tag
> > > > them, we should be able to include the aliases in TransformationChain
> > > > and use them.
> > > >
> > >
> > > Yes, I have added the changes on TransformChain that will be needed to
> > add
> > > these metrics.
> > >
> > >
> > > >
> > > > 2) I see no references to predicates. If we don't want to measure
> > > > their latency, can we say it explicitly?
> > > >
> > >
> > > Good question, I haven't considered these. Though as these are
> > materialized
> > > as PredicatedTransformation, they should be covered by these changes.
> > > Adding a note about this.
> > >
> > >
> > > >
> > > > 3) Should we have sink-record-batch-latency-avg-ms? All other metrics
> > > > have both the maximum and average values.
> > > >
> > > >
> > > Good question. I will remove it and change the record latency from
> > > DEBUG->INFO as it already covers the maximum metric.
> > >
> > > Hope it's clearer now, let me know if there any additional feedback.
> > > Thanks!
> > >
> > >
> > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > > On Thu, Oct 20, 2022 at 9:58 PM Jorge Esteban Quilcate Otoya
> > > >  wrote:
> > > > >
> > > > > Thanks, Chris! Great feedback! Please, find my comments below:
> > > > >
> > > > > On Thu, 13 Oct 2022 at 18:52, Chris Egerton
>  > >
> > > > wrote:
> > > > >
> > > > > > Hi Jorge,
> > > > > >
> > > > > > Thanks for the KIP. I agree with the overall direction and think
> > this
> > > > would
> > > > > > be a nice improvement to Kafka Connect. Here are my initial
> > thoughts
> > > > on the
> > > > > > details:
> > > > > >
> > > > > > 1. The motivation section outlines the gaps in Kafka Connect's
> task
> > > > metrics
> > > > > > nicely. I think it'd be useful to include more concrete details
> on
> > > why
> > > > > > these gaps need to be filled in, and in which cases additional
> > > metrics
> > > > > > would be helpful. One goal could be to provide enhanced
> monitoring
> > of
> > > > > > production deployments that allows for cluster administrators to
> > set
> > > up
> > > > > > automatic alerts for latency spikes and, if triggered, quickly
> > > > 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1383

2022-11-29 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-866 ZooKeeper to KRaft Migration

2022-11-29 Thread David Arthur
Jun, Thanks for the comments. Igor, please see 54 below for some additional
discussion on the meta.properties

50.1 Yes, that field name sounds fine to me.

50.2 Ok, I'll add something to the KIP under the Controller section. To
your other question, NoOpRecords are used as part of our liveness check for
the quorum. It doesn't produce any metadata really, so I don't think it
causes any harm to let it happen before the migration.  KIP-835 has the
details on the NoOpRecords

54. Colin and I discussed the meta.properties issue last night. How about
we simply let the KRaft broker accept v0 or v1 meta.properties. At this
point, the two versions have the same contents, but different field names.
By leaving the meta.properties intact, we can simplify the downgrade
process. If we care to, we could rewrite meta.properties once a broker is
restarted after the migration is finalized (migration config disabled).

57. If a ZK broker can't send a BrokerRegistrationRequest because the
quorum is unavailable, it should just continue operating normally. Once a
leader is available, the broker will send the registration and start
heart-beating. Unlike KRaft mode, we won't block startup on a successful
BrokerRegistration response. Concretely, BrokerLifecycleManager will keep
trying to contact the quorum in its own thread until the
BrokerToChannelManager gets a controller ID from KafkaRaftManager. This
shouldn't interfere with other ZK broker activity.
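Purely as an illustration of that behavior (this is not the actual BrokerLifecycleManager 
code; all names below are hypothetical), the registration path on a ZK broker amounts to 
a background retry loop that never blocks startup:

import java.util.Optional;

// Illustrative only: keep retrying registration with the quorum in a background
// thread so that normal ZK-mode broker activity is never blocked.
public class RegistrationRetrySketch {
    private volatile boolean registered = false;
    private final long retryBackoffMs = 500;

    public void start() {
        Thread thread = new Thread(() -> {
            while (!registered) {
                Optional<Integer> controllerId = findActiveController();
                if (controllerId.isPresent()) {
                    sendRegistrationAndStartHeartbeats(controllerId.get());
                    registered = true;
                } else {
                    try {
                        Thread.sleep(retryBackoffMs);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, "broker-registration");
        thread.setDaemon(true);
        thread.start();
    }

    // Placeholder: in the real broker this comes from the Raft client / channel manager.
    private Optional<Integer> findActiveController() {
        return Optional.empty();
    }

    // Placeholder for sending BrokerRegistrationRequest and starting heartbeats.
    private void sendRegistrationAndStartHeartbeats(int controllerId) {
    }
}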

-David

>

-- 
-David


Re: [DISCUSS] KIP-879: Multi-level Rack Awareness

2022-11-29 Thread Viktor Somogyi-Vass
Hi All,

I'd like to bump this. I've also updated the KIP to incorporate the new
KRaft changes (ReplicaPlacer). Luckily my proposals were quite similar to
that, so mostly I've made some minor rewording, naming changes, etc.

Again, the brief summary of the KIP:
- expose replica placement strategies with a new config
- create an admin API and protocol to expose replica placement
functionality (mainly for the reassignment tool)
- create a new multi-level rack awareness strategy which improves
availability on stretch clusters

I'm happy for any feedback.

Best,
Viktor

On Fri, Oct 28, 2022 at 4:14 PM Viktor Somogyi-Vass <
viktor.somo...@cloudera.com> wrote:

> Hey all,
>
> I'd like to propose a new broker side replica assignment strategy and an
> interface that generalizes replica assignment on brokers and makes them
> pluggable.
>
> Briefly, the motivation for the new replica assignment strategy is that
> more and more of our customers would want to run their clusters in a
> stretched environment, where for instance a cluster is running over
> multiple regions (and multiple racks inside a region). Since this seems
> like a more common need, we'd like to contribute back our implementation
> and also make a generalized interface, so that new strategies that people
> may come up with could be served better.
>
> I welcome any feedback on this KIP.
>
> The link:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-879%3A+Multi-level+Rack+Awareness
>
> Best to all,
> Viktor
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.3 #127

2022-11-29 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 500809 lines...]
[2022-11-29T11:46:43.010Z] 
[2022-11-29T11:46:43.010Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldThrowUnlimitedWindows[ON_WINDOW_CLOSE_false] PASSED
[2022-11-29T11:46:43.010Z] 
[2022-11-29T11:46:43.010Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldAggregateWindowedWithNoGrace[ON_WINDOW_CLOSE_false] STARTED
[2022-11-29T11:46:44.758Z] 
[2022-11-29T11:46:44.758Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldAggregateWindowedWithNoGrace[ON_WINDOW_CLOSE_false] PASSED
[2022-11-29T11:46:44.758Z] 
[2022-11-29T11:46:44.758Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldAggregateWindowedWithGrace[ON_WINDOW_CLOSE_false] STARTED
[2022-11-29T11:46:46.505Z] 
[2022-11-29T11:46:46.505Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldAggregateWindowedWithGrace[ON_WINDOW_CLOSE_false] PASSED
[2022-11-29T11:46:46.505Z] 
[2022-11-29T11:46:46.505Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_false] STARTED
[2022-11-29T11:47:03.806Z] > Task :core:compileTestScala
[2022-11-29T11:47:35.508Z] 
[2022-11-29T11:47:35.508Z] 
org.apache.kafka.streams.integration.TimeWindowedKStreamIntegrationTest > 
shouldRestoreAfterJoinRestart[ON_WINDOW_CLOSE_false] PASSED
[2022-11-29T11:47:45.535Z] 
[2022-11-29T11:47:45.535Z] FAILURE: Build failed with an exception.
[2022-11-29T11:47:45.535Z] 
[2022-11-29T11:47:45.535Z] * What went wrong:
[2022-11-29T11:47:45.535Z] Execution failed for task 
':streams:upgrade-system-tests-10:unitTest'.
[2022-11-29T11:47:45.535Z] > Process 'Gradle Test Executor 9' finished with 
non-zero exit value 1
[2022-11-29T11:47:45.535Z]   This problem might be caused by incorrect test 
process configuration.
[2022-11-29T11:47:45.535Z]   Please refer to the test execution section in the 
User Manual at 
https://docs.gradle.org/7.4.2/userguide/java_testing.html#sec:test_execution
[2022-11-29T11:47:45.535Z] 
[2022-11-29T11:47:45.535Z] * Try:
[2022-11-29T11:47:45.535Z] > Run with --stacktrace option to get the stack 
trace.
[2022-11-29T11:47:45.535Z] > Run with --info or --debug option to get more log 
output.
[2022-11-29T11:47:45.535Z] > Run with --scan to get full insights.
[2022-11-29T11:47:45.535Z] 
[2022-11-29T11:47:45.535Z] * Get more help at https://help.gradle.org
[2022-11-29T11:47:45.535Z] 
[2022-11-29T11:47:45.535Z] BUILD FAILED in 2h 42m 40s
[2022-11-29T11:47:45.535Z] 212 actionable tasks: 115 executed, 97 up-to-date
[2022-11-29T11:47:46.459Z] 
[2022-11-29T11:47:46.459Z] See the profiling report at: 
file:///home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.3/build/reports/profile/profile-2022-11-29-09-05-10.html
[2022-11-29T11:47:46.459Z] A fine-grained performance profile is available: use 
the --scan option.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // timestamps
[Pipeline] }
[Pipeline] // timeout
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
Failed in branch JDK 11 and Scala 2.12
[2022-11-29T11:47:52.600Z] > Task :core:testClasses
[2022-11-29T11:48:06.523Z] > Task :streams:compileTestJava
[2022-11-29T11:48:06.523Z] > Task :streams:testClasses
[2022-11-29T11:48:07.448Z] > Task :streams:testJar
[2022-11-29T11:48:08.373Z] > Task :streams:testSrcJar
[2022-11-29T11:48:08.373Z] > Task 
:streams:publishMavenJavaPublicationToMavenLocal
[2022-11-29T11:48:08.373Z] > Task :streams:publishToMavenLocal
[2022-11-29T11:48:08.373Z] 
[2022-11-29T11:48:08.373Z] Deprecated Gradle features were used in this build, 
making it incompatible with Gradle 8.0.
[2022-11-29T11:48:08.373Z] 
[2022-11-29T11:48:08.373Z] You can use '--warning-mode all' to show the 
individual deprecation warnings and determine if they come from your own 
scripts or plugins.
[2022-11-29T11:48:08.373Z] 
[2022-11-29T11:48:08.373Z] See 
https://docs.gradle.org/7.4.2/userguide/command_line_interface.html#sec:command_line_warnings
[2022-11-29T11:48:08.373Z] 
[2022-11-29T11:48:08.373Z] Execution optimizations have been disabled for 2 
invalid unit(s) of work during this build to ensure correctness.
[2022-11-29T11:48:08.373Z] Please consult deprecation warnings for more details.
[2022-11-29T11:48:08.373Z] 
[2022-11-29T11:48:08.373Z] BUILD SUCCESSFUL in 3m 41s
[2022-11-29T11:48:08.373Z] 79 actionable tasks: 37 executed, 42 up-to-date
[Pipeline] sh
[2022-11-29T11:48:11.854Z] + grep ^version= gradle.properties
[2022-11-29T11:48:11.854Z] + cut -d= -f 2
[Pipeline] dir
[2022-11-29T11:48:12.533Z] Running in 
/home/jenkins/workspace/Kafka_kafka_3.3/streams/quickstart
[Pipeline] {
[Pipeline] sh
[2022-11-29T11:48:14.645Z] + mvn clean 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1382

2022-11-29 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-889 Versioned State Stores

2022-11-29 Thread Bruno Cadonna

Hi Victoria,

Regarding

> I notice that there is code for validating topic configs and collecting
> validation errors
> (https://github.com/apache/kafka/blob/be032735b39360df1a6de1a7feea8b4336e5bcc0/streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java#L318-L319)
> but this method is not called from anywhere, even though there are unit
> tests for it. I was unable to find history of this validation after a
> quick search.

Those checks were probably part of KIP-698[1] which has never been fully 
implemented.


[1] 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-698%3A+Add+Explicit+User+Initialization+of+Broker-side+State+to+Kafka+Streams


Best,
Bruno

On 28.11.22 20:44, Victoria Xia wrote:

Thanks, Sagar and Bruno, for your insights and comments!


Sagar: Can we name according to the semantics that you want to

support like `getAsOf` or something like that? I am not sure if we do that
in our codebase though. Maybe the experts can chime in.

Because it is a new method that will be added, we should be able to name it
whatever we like. I agree `getAsOf` is more clear, albeit wordier.
Introducing `getAsOf(key, timestamp)` means we could leave open `get(key,
timeFrom, timeTo)` to have an exclusive `timeTo` without introducing a
collision. (We could introduce `getBetween(key, timeFrom, timeTo)` instead
to delineate even more clearly, though this is better left for a future
KIP.)

I don't think there's any existing precedent in codebase to follow here but
I'll leave that to the experts. Curious to hear what others prefer as well.


Sagar: With delete, we would stlll keep the older versions of the key

right?

We could certainly choose this for the semantics of delete(...) -- and it
sounds like we should too, based on Bruno's confirmation below that this
feels more natural to him as well -- but as Bruno noted in his message
below I think we'll want the method signature to be `delete(key,
timestamp)` then, so that there is an explicit timestamp to associate with
the deletion. In other words, `delete(key, timestamp)` has the same effect
as `put(key, null, timestamp)`. The only difference is that the `put(...)`
method has a `void` return type, while `delete(key, timestamp)` can have
`ValueAndTimestamp` as return type in order to return the record which is
replaced (if any). In other words, `delete(key, timestamp)` is equivalent
to `put(key, null, timestamp)` followed by `get(key, timestamp)`.
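Spelled out as method shapes (illustrative only; the authoritative interface is whatever 
lands in KIP-889):

import org.apache.kafka.streams.state.ValueAndTimestamp;

// Sketch of the semantics discussed above, not the actual KIP-889 interface.
public interface VersionedKeyValueStoreSketch<K, V> {

    // Insert a new version of the record; a null value acts as a tombstone at that timestamp.
    void put(K key, V value, long timestamp);

    // Equivalent to put(key, null, timestamp) followed by get(key, timestamp):
    // returns the record version that the tombstone replaces, if any.
    ValueAndTimestamp<V> delete(K key, long timestamp);

    // Return the record version that was valid as of the given timestamp.
    ValueAndTimestamp<V> get(K key, long timestamp);
}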


Bruno: I would also not change the semantics so that it deletes all

versions of
a key. I would rather add a new method purge(key) or
deleteAllVersions(key) or similar if we want to have such a method in
this first KIP.

Makes sense; I'm convinced. Let's defer
`purge(key)`/`deleteAllVersions(key)` to a future KIP. If there's agreement
that `delete(key, timestamp)` (as described above) is valuable, we can keep
it in this first KIP even though it is syntactic sugar. If this turns into
a larger discussion, we can defer this to a future KIP as well.


Bruno: I would treat the history retention as a strict limit. [...] You

could also add historyRetentionMs() to the VersionedKeyValueStore
interface to make the concept of the history retention part of the
interface.

OK. That's the second vote for rewording the javadoc for
`VersionedKeyValueStore#get(key, timestampTo)` to remove the parenthetical
and clarify that history retention should be used to dictate this case, so
I'll go ahead and do that. I'll leave out adding `historyRetentionMs()` to
the interface for now, though, for the sake of consistency with other
stores (e.g., window stores) which don't expose similar types of
configurations from their interfaces.


Bruno: exclusive vs inclusive regarding validTo timestamp in get().

Doesn't this decision depend on the semantics of the join for which this
state store should be used?

Yes, you are correct. As a user I would expect that a stream-side record
with the same timestamp as a table-side record _would_ produce a join
result, which is consistent with the proposal for timestampTo to be
inclusive. (FWIW I tried this out with a Flink temporal join just now and
observed this result as well. Not sure where to look for other standards to
validate this expectation.)


Bruno: If Streams does not update min.compaction.lag.ms during rebalances,

users have to do it each time they change history retention in the code,
right? That seems odd to me. What is the actual reason for not updating
the config? How does Streams handle updates to windowed stores?

Yes, users will have to update min.compaction.lag.ms for the changelog
topic themselves if they update history retention in their code. This is
consistent with what happens for window stores today: e.g., if a user
updates grace period for a windowed aggregation, then they are responsible
for updating retention.ms on their windowed changelog topic as well.
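For illustration, such an out-of-band config update can be done with the Admin client; 
the topic name and value below are placeholders:

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class UpdateChangelogConfigExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConfigResource changelog =
                    new ConfigResource(ConfigResource.Type.TOPIC, "my-app-my-store-changelog");
            // Raise min.compaction.lag.ms to match the new history retention (value is illustrative).
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("min.compaction.lag.ms", "86400000"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                    Map.of(changelog, Collections.singleton(op));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}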

I'm not familiar with the historical context around why this is the case --