Jenkins build is back to normal : Kafka » kafka-2.6-jdk8 #144

2021-12-21 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13562) Mirror Maker 2 Negative Offsets

2021-12-21 Thread David Urton (Jira)
David Urton created KAFKA-13562:
---

 Summary: Mirror Maker 2 Negative Offsets
 Key: KAFKA-13562
 URL: https://issues.apache.org/jira/browse/KAFKA-13562
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect, mirrormaker
Affects Versions: 2.8.1
Reporter: David Urton


We recently started a migration from a traditional self-managed Apache Kafka 
cluster to Strimzi. We enabled group offset sync, and something I'm struggling 
to fully understand is why the mirrored group has a current-offset matching (or 
close to matching) the source offset, while the log-end-offset essentially 
starts over, resulting in a negative lag. Is this just cosmetic? The lag is 
slowly moving closer to 0, but some of our negative lags are very "large", 
several hundred million. I verified that all the data has moved over, but I'm 
struggling with this lag issue.

Any help clearing this up is much appreciated!

The source cluster is 2.4.1 and the destination is 2.8.1. Detailed info is in the GitHub URL.
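
For reference, the lag shown by consumer-group tooling is plain arithmetic over 
the two offsets mentioned above, which is how such large negative values can 
appear; a minimal sketch with made-up numbers:

    public class MirrorLagExample {
        public static void main(String[] args) {
            long currentOffset = 500_000_000L; // group offset synced from the source
            long logEndOffset = 120_000_000L;  // destination log "starts over" much lower
            // Lag as reported by consumer-group tooling:
            //   lag = log-end-offset - current-offset
            long lag = logEndOffset - currentOffset;
            System.out.println("lag = " + lag); // lag = -380000000
        }
    }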



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-811 Add separate delete.interval.ms to Kafka Streams

2021-12-21 Thread Nick Telford
Hi everyone,

Thanks for your feedback. I've made the suggested changes to the KIP (1, 2,
3 and 5).

For the new name, I've chosen repartition.purge.interval.ms, as I felt it
struck a good balance between being self-descriptive and concise. Please
let me know if you'd prefer something else.

On point 6: My PR has basic validation to ensure the value is positive, but
I don't think it's necessary to have dynamic validation to ensure it's not
less than commit.interval.ms. The reason is that it will be implicitly
limited to that value anyway, and won't break anything. But I can add it if
you'd prefer.
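
For concreteness, here is a minimal sketch (not the PR itself) of what
registering the config could look like, using the public ConfigDef API; the
default, importance, and doc string are placeholders based on this thread, and
the cross-check against commit.interval.ms is left out per the reasoning above:

    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.common.config.ConfigDef.Importance;
    import org.apache.kafka.common.config.ConfigDef.Range;
    import org.apache.kafka.common.config.ConfigDef.Type;

    public class RepartitionPurgeConfigSketch {
        public static final String REPARTITION_PURGE_INTERVAL_MS_CONFIG =
            "repartition.purge.interval.ms";

        // Registers the config on an existing ConfigDef, mirroring how
        // StreamsConfig defines its other interval configs.
        static ConfigDef define(final ConfigDef base) {
            return base.define(
                REPARTITION_PURGE_INTERVAL_MS_CONFIG,
                Type.LONG,
                30_000L,          // placeholder default of 30 seconds
                Range.atLeast(0), // the "basic validation" mentioned above
                Importance.LOW,
                "The frequency in milliseconds with which to delete fully consumed "
                    + "records from repartition topics.");
        }
    }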

On point 7: I worry that defaulting it to follow the value of
commit.interval.ms may confuse users, who will likely expect the default to
not be affected by changes to other configuration options. I can see the
appeal of retaining the existing behaviour (following the commit interval)
by default, but I believe that the majority of users who customize
commit.interval.ms do not also want a different frequency of repartition
record purging.

As for multiples of commit interval: I think the argument against that is
that an interval is more intuitive when given as a time, rather than as a
multiple of some other operation. Users configuring this should not need to
break out a calculator to work out how frequently the records are going to
be purged!

I've also updated the PR with the relevant changes.

BTW, for some reason I didn't receive Sophie's email. I'll periodically
check the thread on the archive to ensure I don't miss any more of your
messages!

Regards,

Nick

On Tue, 21 Dec 2021 at 12:34, Luke Chen  wrote:

> Thanks, Bruno.
>
> I agree my point 4 is more like PR discussion, not KIP discussion.
> @Nick , please update my point 4 in PR directly.
>
> Thank you.
> Luke
>
>
>
>
> On Tue, Dec 21, 2021 at 7:24 PM Bruno Cadonna  wrote:
>
> > Hi Nick,
> >
> > Thank you for the KIP!
> >
> > I agree on points 1, 2, and 3. I am not sure about point 4. I agree that
> > we should update the docs for commit.interval.ms, but I am not sure if
> > this needs to be mentioned in the KIP. That seems to me more of a PR
> > discussion. Also on point 2, I agree that we need to add a doc string,
> > but the content should be exemplary, not binding. What I want to say is
> > that we do not need a KIP to change docs.
> >
> > Here are my points:
> >
> > 5. Could you specify in the motivation that the KIP is about deleting
> > records from repartition topics? Maybe with a short description of why
> > and when records are deleted from the repartition topics. For us it
> > might be clear, but IMO we should try to write KIPs so that someone who
> > is relatively new to Kafka Streams can understand the KIP without
> > needing to know too much background.
> >
> > 6. Does the config need to be validated? For example, does
> > delete.interval.ms need to be greater than or equal to
> > commit.interval.ms?
> >
> > 7. Should the default value for non-EOS be 30s or the same value as
> > commit.interval.ms? I am just thinking about the case where a user
> > explicitly changes commit.interval.ms but not delete.interval.ms (or
> > whatever name you come up with for it). Once delete.interval.ms is set
> > explicitly, it is decoupled from commit.interval.ms. Similar could be
> > done for the EOS case.
> > Alternatively, we could also define delete.interval.ms to take an
> > integral number without a unit that specifies after how many commit
> > intervals the records in repartition topics should be deleted. This
> > would make sense since delete.interval.ms is tightly bound to
> > commit.interval.ms. Additionally, it would make the semantics of the
> > config simpler. The name of the config should definitely change if we go
> > down this path.
> >
> > Best,
> > Bruno
> >
> >
> >
> > On 21.12.21 11:14, Luke Chen wrote:
> > > Hi Nick,
> > >
> > > Thanks for the KIP!
> > >
> > > In addition to Sophie's comments, I have one more comment on this KIP:
> > > 3. I think you should mention the behavior change *explicitly* in the
> > > "Compatibility" section. I know you already mentioned it in the KIP, as
> > > a benefit. But in this section, we should clearly point out which
> > > behavior will change after this KIP. That is, you should make it clear
> > > that the delete record interval will change from 100ms to 30s with EOS
> > > enabled. And it should also be mentioned in the doc/upgrade.html doc.
> > > 4. Since this new config has some relationship with commit.interval.ms,
> > > I think we should also update the doc description for
> > > `commit.interval.ms` to let users know there's another config to
> > > control the delete interval, which should be greater than
> > > commit.interval.ms. Something like that. WDYT? (You should put this
> > > change in the KIP as Sophie mentioned.)
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Tue, Dec 21, 2021 at 9:27 AM Sophie Blee-Goldman
> > >  wrote:
> > >
> > >> Hey Nick,
> > >>
> > >> I think you forgot to link to th

[jira] [Created] (KAFKA-13561) Consider deprecating `StreamsBuilder#build(props)` function

2021-12-21 Thread Guozhang Wang (Jira)
Guozhang Wang created KAFKA-13561:
-

 Summary: Consider deprecating `StreamsBuilder#build(props)` 
function
 Key: KAFKA-13561
 URL: https://issues.apache.org/jira/browse/KAFKA-13561
 Project: Kafka
  Issue Type: Improvement
Reporter: Guozhang Wang


With 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
 accepted, which introduced the new `StreamsBuilder(TopologyConfig)` 
constructor, we can now consider deprecating the `StreamsBuilder#build(props)` 
function. There are still a few things we'd need to do:

1. Copy the `StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG` to TopologyConfig.
2. Make sure the overloaded `StreamsBuilder()` constructor takes in default 
values of TopologyConfig.
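
For illustration, a rough before/after sketch of the user-facing change,
assuming the KIP-591 constructor shape (a TopologyConfig wrapping a
StreamsConfig); exact class locations and signatures may differ once this
lands:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.Topology;
    import org.apache.kafka.streams.TopologyConfig;

    public class BuildPropsMigrationSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "demo-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG, StreamsConfig.OPTIMIZE);

            // Current style, the candidate for deprecation: topology-level
            // configs are only seen at build() time.
            Topology oldStyle = new StreamsBuilder().build(props);

            // KIP-591 style: topology-level configs are passed at construction,
            // so build() no longer needs a Properties argument.
            StreamsBuilder builder =
                new StreamsBuilder(new TopologyConfig(new StreamsConfig(props)));
            Topology newStyle = builder.build();
        }
    }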



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

2021-12-21 Thread Guozhang Wang
Thanks Luke, I do not have any major comments on the wiki anymore. BTW,
thanks for making the "public StreamsBuilder(final TopologyConfig
topologyConfigs)" API public now; I think it will benefit a lot of future
work!

I think with the new API, we can deprecate the `build(props)` function in
StreamsBuilder now, and will file a separate JIRA for it.

Just a few nits:

1) There seems to be a typo at the end: "ROCK_DB"
2) Sometimes people refer to "store type" as kv-store, window-store, etc.;
maybe we can differentiate them a bit by calling the new API names
`StoreImplType`, `default.dsl.store.impl.type`, and `The default store
implementation type used by DSL operators`.
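
To make the enum approach concrete, here is a minimal sketch of the shape
discussed in this thread (the names follow the suggestions above and are not
final):

    import org.apache.kafka.streams.state.KeyValueBytesStoreSupplier;
    import org.apache.kafka.streams.state.Stores;

    public class DefaultStoreTypeSketch {

        // The enum carrying the built-in implementation types; custom store
        // implementations are deliberately out of scope for this variant.
        public enum StoreImplType {
            ROCKS_DB,
            IN_MEMORY
        }

        // Internals would dispatch on the configured default type whenever a
        // Materialized does not carry an explicit store supplier.
        static KeyValueBytesStoreSupplier keyValueSupplier(StoreImplType type, String name) {
            switch (type) {
                case ROCKS_DB:
                    return Stores.persistentKeyValueStore(name);
                case IN_MEMORY:
                    return Stores.inMemoryKeyValueStore(name);
                default:
                    throw new IllegalArgumentException("Unknown store impl type: " + type);
            }
        }
    }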

On Thu, Dec 16, 2021 at 2:29 AM Luke Chen  wrote:

> Hi Guozhang,
>
> I've updated the KIP to use `enum`, instead of store implementation, and
> some content accordingly.
> Please let me know if there's other comments.
>
>
> Thank you.
> Luke
>
> On Sun, Dec 12, 2021 at 3:55 PM Luke Chen  wrote:
>
> > Hi Guozhang,
> >
> > Thanks for your comments.
> > I agree that in the KIP, there's a trade-off regarding the API
> > complexity.
> > With the store impl, we can support default custom stores, but introduce
> > more complexity for users, while with the enum types, users can configure
> > default built-in store types easily, but it can't work for custom stores.
> >
> > For me, I'm OK with narrowing down the scope and introducing the
> > default built-in enum store types first.
> > And if there's further request, we can consider a better way to support
> > default store impl.
> >
> > I'll update the KIP next week, unless there are other opinions from other
> > members.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang 
> wrote:
> >
> >> Thanks Luke for the updated KIP.
> >>
> >> One major change in the new proposal is to replace the original enum
> >> store type with a new interface. With the enum approach, our internal
> >> implementations would be something like:
> >>
> >> "
> >> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
> >> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
> >> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
> >> "
> >>
> >> And inside the impl classes like here we could directly do:
> >>
> >> "
> >> if ((supplier = materialized.storeSupplier) == null) {
> >>     supplier = Stores.windowBytesStoreSupplier(materialized.storeImplType());
> >> }
> >> "
> >>
> >> While I understand the benefit of having an interface so that
> >> user-customized stores could be used as the default store types as
> >> well, there's a trade-off I feel regarding the API complexity. With
> >> this approach, our API complexity grows on the order of "number of
> >> impl types" * "number of store types". This means that whenever we add
> >> new store types in the future, this API needs to be augmented and
> >> customized impls need to be updated to support the new store types. In
> >> addition, not all custom impl types may support all store types, but
> >> with this interface they are forced to either support all of them or
> >> explicitly throw unsupported exceptions.
> >>
> >> The way I see it, default impl types should be safe to use for any
> >> store type, and since store types are evolved by the library itself,
> >> the default impls had better be controlled by the library as well.
> >> Custom impl classes can choose to replace some of the stores
> >> explicitly, but may not be the best fit as the default impl classes.
> >> If such cases arise in the future, one option is to make the Stores
> >> class extensible along with the enum so that advanced users can add
> >> more default impls, assuming such scenarios are not very common.
> >>
> >> So I'm personally still leaning a bit towards the enum approach with a
> >> narrower scope, for its simplicity as an API and also its low
> >> maintenance cost in the future. Let me know what you think.
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen  wrote:
> >>
> >> > Hi devs,
> >> >
> >> > I'd like to propose a KIP to allow users to set default store
> >> > implementation class (built-in RocksDB/InMemory, or custom one), and
> >> > default to RocksDB state store, to keep backward compatibility.
> >> >
> >> > Detailed description can be found here:
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
> >> >
> >> > Any feedback and comments are welcome.
> >> >
> >> > Thank you.
> >> > Luke
> >> >
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>


-- 
-- Guozhang


Re: [DISCUSS] KIP-802: Validation Support for Kafka Connect SMT Options

2021-12-21 Thread Chris Egerton
Hi Gunnar,

Thanks, this looks great. I'm ready to cast a non-binding vote on the vote
thread when it comes.

One small non-blocking nit: I like that you call out that the new
validation steps will take place when a connector gets registered or
updated. IMO this is important enough to be included in the "Public
Interfaces" section as that type of preflight check is arguably more
important than the PUT /connector-plugins/{name}/config/validate endpoint,
when considering that use of the validation endpoint is strictly opt-in,
but preflight checks for new connector configs are unavoidable (without
resorting to devious hacks like publishing directly to the config topic).
But this is really minor; I'm happy to +1 the KIP as-is.
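
For readers following along, the preflight check in question boils down to
running the user-supplied options through the plugin's declared ConfigDef
before anything is written to the config topic. A hedged sketch (the helper
name is illustrative, not from the KIP):

    import java.util.Map;
    import org.apache.kafka.common.config.Config;
    import org.apache.kafka.common.config.ConfigDef;

    public final class SmtPreflightSketch {
        private SmtPreflightSketch() {}

        // ConfigDef#validate returns one ConfigValue per key, each carrying
        // any error messages; Config is the same wrapper Connector#validate
        // returns, which is what lets the REST layer surface per-option errors.
        public static Config validate(ConfigDef pluginConfigDef,
                                      Map<String, String> userConfigs) {
            return new Config(pluginConfigDef.validate(userConfigs));
        }
    }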

Cheers,

Chris

On Tue, Dec 21, 2021 at 8:43 AM Gunnar Morling
 wrote:

> Hey Chris,
>
> Thanks a lot for reviewing this KIP and your comments! Some more answers
> inline.
>
> Am Di., 7. Dez. 2021 um 23:49 Uhr schrieb Chris Egerton
> :
>
> > Hi Gunnar,
> >
> > Thanks for the KIP! The section on backwards compatibility is especially
> > impressive and was enjoyable to read.
> >
>
> Excellent, that's great to hear!
>
> > Overall I like the direction of the KIP (and in fact just ran into a
> > situation yesterday where it would be valuable). I only have one major
> > thought: could we add similar validate methods for the Converter and
> > HeaderConverter interfaces? With KIP-769 [1], it looks like we'll have a
> > new Converter::config method, so if that makes it through, it should be a
> > matter of just adding the same methods to those interfaces as well
> > (although we may want to be tolerant of null ConfigDef objects being
> > returned from HeaderConverter::config since the Connect framework has not
> > been enforcing this requirement to date).
> >
>
> Yes, I think it's a good idea to expand the scope of the KIP to cover all
> these contracts. I have updated the KIP document accordingly.
>
> >
> > That aside, a few small nits:
> >
> > 1. The "This page is meant as a template" section can be removed :)
> > 2. The "Current Status" can be updated to "Under Discussion"
> > 3. Might want to add javadocs to the newly-proposed validate method (I'm
> > assuming they'll largely mirror the ones for the existing
> > Connector::validate method, but we may also want to add a {@since} tag or
> > some other information on which versions of Connect will leverage the
> > method).
> >
>
> Done.
>
> I will try and create a PR for this work in January next year.
>
> All the best,
>
> --Gunnar
>
> [1] -
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+plugins+and+retrieve+their+configuration+definitions#KIP769:ConnectAPIstolistallpluginsandretrievetheirconfigurationdefinitions-PublicInterfaces
> > (section labeled "Converter interface")
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, Nov 24, 2021 at 11:32 AM Gunnar Morling
> >  wrote:
> >
> > > Hey all,
> > >
> > > I would like to propose a KIP for Apache Kafka Connect which adds
> > > validation support for SMT-related configuration options:
> > >
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-802%3A+Validation+Support+for+Kafka+Connect+SMT+Options
> > >
> > > This feature allows users to make sure an SMT is configured correctly
> > > before actually putting a connector with that SMT in place.
> > >
> > > Any feedback, comments, and suggestions around this proposal will
> > > be greatly appreciated.
> > >
> > > Thanks,
> > >
> > > --Gunnar
> > >
> >
>


Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.1 #42

2021-12-21 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-802: Validation Support for Kafka Connect SMT Options

2021-12-21 Thread Gunnar Morling
Hey Chris,

Thanks a lot for reviewing this KIP and your comments! Some more answers
inline.

Am Di., 7. Dez. 2021 um 23:49 Uhr schrieb Chris Egerton
:

> Hi Gunnar,
>
> Thanks for the KIP! The section on backwards compatibility is especially
> impressive and was enjoyable to read.
>

Excellent, that's great to hear!

> Overall I like the direction of the KIP (and in fact just ran into a
> situation yesterday where it would be valuable). I only have one major
> thought: could we add similar validate methods for the Converter and
> HeaderConverter interfaces? With KIP-769 [1], it looks like we'll have a
> new Converter::config method, so if that makes it through, it should be a
> matter of just adding the same methods to those interfaces as well
> (although we may want to be tolerant of null ConfigDef objects being
> returned from HeaderConverter::config since the Connect framework has not
> been enforcing this requirement to date).
>

Yes, I think it's a good idea to expand the scope of the KIP to cover all
these contracts. I have updated the KIP document accordingly.

>
> That aside, a few small nits:
>
> 1. The "This page is meant as a template" section can be removed :)
> 2. The "Current Status" can be updated to "Under Discussion"
> 3. Might want to add javadocs to the newly-proposed validate method (I'm
> assuming they'll largely mirror the ones for the existing
> Connector::validate method, but we may also want to add a {@since} tag or
> some other information on which versions of Connect will leverage the
> method).
>

Done.

I will try and create a PR for this work in January next year.

All the best,

--Gunnar

[1] -
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+plugins+and+retrieve+their+configuration+definitions#KIP769:ConnectAPIstolistallpluginsandretrievetheirconfigurationdefinitions-PublicInterfaces
> (section labeled "Converter interface")
>
> Cheers,
>
> Chris
>
> On Wed, Nov 24, 2021 at 11:32 AM Gunnar Morling
>  wrote:
>
> > Hey all,
> >
> > I would like to propose a KIP for Apache Kafka Connect which adds
> > validation support for SMT-related configuration options:
> >
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-802%3A+Validation+Support+for+Kafka+Connect+SMT+Options
> >
> > This feature allows users to make sure an SMT is configured correctly
> > before actually putting a connector with that SMT in place.
> >
> > Any feedback, comments, and suggestions around this proposal will
> > be greatly appreciated.
> >
> > Thanks,
> >
> > --Gunnar
> >
>


[jira] [Created] (KAFKA-13560) Load indexes and data in async manner in the critical path of replica fetcher threads.

2021-12-21 Thread Satish Duggana (Jira)
Satish Duggana created KAFKA-13560:
--

 Summary: Load indexes and data in async manner in the critical 
path of replica fetcher threads. 
 Key: KAFKA-13560
 URL: https://issues.apache.org/jira/browse/KAFKA-13560
 Project: Kafka
  Issue Type: Sub-task
  Components: core
Reporter: Satish Duggana
 Fix For: 3.2.0



https://github.com/apache/kafka/pull/11390#discussion_r762366976



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


Re: [DISCUSS] KIP-808: Add support for unix epoch precision in TimestampConverter SMT

2021-12-21 Thread Julien Chanaud
Hi everyone,

Bumping this KIP discussion.
It's a small change, entirely backward compatible, and I'd love your
feedback on it.
Thanks,
Julien


On Thu, Dec 9, 2021 at 9:56 PM, Julien Chanaud  wrote:
>
> Hi everyone,
>
> I would like to start a discussion for KIP-808
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-808%3A+Add+support+for+unix+epoch+precision+in+TimestampConverter+SMT
>
> This seems like a simple change but I suspect there are several things to 
> consider, most notably regarding the java.util.Date object, which is at the 
> heart of the conversions.
>
> Let me know what you think.
>
> Julien
>
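
For context on the java.util.Date concern above: Date only stores epoch
milliseconds, so every other epoch precision has to be scaled on the way in,
and sub-millisecond detail is necessarily lost. A minimal illustration with
arbitrary values:

    import java.util.Date;
    import java.util.concurrent.TimeUnit;

    public class EpochPrecisionSketch {
        public static void main(String[] args) {
            long epochSeconds = 1_640_000_000L;
            Date fromSeconds = new Date(TimeUnit.SECONDS.toMillis(epochSeconds));

            long epochMicros = 1_640_000_000_123_456L;
            // The trailing 456 microseconds are truncated: Date cannot hold them.
            Date fromMicros = new Date(TimeUnit.MICROSECONDS.toMillis(epochMicros));

            System.out.println(fromSeconds + " / " + fromMicros);
        }
    }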


Re: [DISCUSS] KIP-811 Add separate delete.interval.ms to Kafka Streams

2021-12-21 Thread Luke Chen
Thanks, Bruno.

I agree my point 4 is more like PR discussion, not KIP discussion.
@Nick , please update my point 4 in PR directly.

Thank you.
Luke




On Tue, Dec 21, 2021 at 7:24 PM Bruno Cadonna  wrote:

> Hi Nick,
>
> Thank you for the KIP!
>
> I agree on points 1, 2, and 3. I am not sure about point 4. I agree that
> we should update the docs for commit.interval.ms, but I am not sure if
> this needs to be mentioned in the KIP. That seems to me more of a PR
> discussion. Also on point 2, I agree that we need to add a doc string,
> but the content should be exemplary, not binding. What I want to say is
> that we do not need a KIP to change docs.
>
> Here are my points:
>
> 5. Could you specify in the motivation that the KIP is about deleting
> records from repartition topics? Maybe with a short description of why
> and when records are deleted from the repartition topics. For us it
> might be clear, but IMO we should try to write KIPs so that someone who
> is relatively new to Kafka Streams can understand the KIP without
> needing to know too much background.
>
> 6. Does the config need to be validated? For example, does
> delete.interval.ms need to be greater than or equal to commit.interval.ms?
>
> 7. Should the default value for non-EOS be 30s or the same value as
> commit.interval.ms? I am just thinking about the case where a user
> explicitly changes commit.interval.ms but not delete.interval.ms (or
> whatever name you come up with for it). Once delete.interval.ms is set
> explicitly, it is decoupled from commit.interval.ms. Similar could be
> done for the EOS case.
> Alternatively, we could also define delete.interval.ms to take an
> integral number without a unit that specifies after how many commit
> intervals the records in repartition topics should be deleted. This
> would make sense since delete.interval.ms is tightly bound to
> commit.interval.ms. Additionally, it would make the semantics of the
> config simpler. The name of the config should definitely change if we go
> down this path.
>
> Best,
> Bruno
>
>
>
> On 21.12.21 11:14, Luke Chen wrote:
> > Hi Nick,
> >
> > Thanks for the KIP!
> >
> > In addition to Sophie's comments, I have one more comment on this KIP:
> > 3. I think you should mention the behavior change *explicitly* in the
> > "Compatibility" section. I know you already mentioned it in the KIP, as
> > a benefit. But in this section, we should clearly point out which
> > behavior will change after this KIP. That is, you should make it clear
> > that the delete record interval will change from 100ms to 30s with EOS
> > enabled. And it should also be mentioned in the doc/upgrade.html doc.
> > 4. Since this new config has some relationship with commit.interval.ms,
> > I think we should also update the doc description for
> > `commit.interval.ms` to let users know there's another config to control
> > the delete interval, which should be greater than commit.interval.ms.
> > Something like that. WDYT? (You should put this change in the KIP as
> > Sophie mentioned.)
> >
> > Thank you.
> > Luke
> >
> > On Tue, Dec 21, 2021 at 9:27 AM Sophie Blee-Goldman
> >  wrote:
> >
> >> Hey Nick,
> >>
> >> I think you forgot to link to the KIP document, but I take it this is
> >> it: KIP-811:
> >> Add separate delete.interval.ms to Kafka Streams
> >> 
> >>
> >> The overall proposal sounds good to me, just a few minor things:
> >>
> >> 1. Please specify everything needed to define this config explicitly,
> >> i.e. all the arguments that will be passed into the StreamsConfig's
> >> ConfigDef: in addition to the default value, we need the config type
> >> (presumably a Long), the doc string, and the importance (probably
> >> "low", similar to commit.interval.ms).
> >> 2. Might be worth considering a slightly more descriptive name for
> >> this config. Most users probably don't think about, or may not even be
> >> aware of, the deletion of consumed records by Kafka Streams, so
> >> calling it something along the lines of
> >> "repartition.records.delete.interval.ms" or
> >> "consumed.records.deletion.interval.ms" or so on will help make it
> >> clear what the config refers to and whether or not they need to care.
> >> Maybe you can come up with better and/or shorter names; I just wanted
> >> to suggest some example names that I think sufficiently get the point
> >> across.
> >>
> >> Other than that I'm +1 -- thanks for the KIP!
> >>
> >> Sophie
> >>
> >>
> >>
> >> On Mon, Dec 20, 2021 at 9:15 AM Nick Telford 
> >> wrote:
> >>
> >>> This is a KIP for a proposed solution to KAFKA-13549
> >>> . The solution is very
> >>> simple, so the KIP is pretty short.
> >>>
> >>> The suggested changes are implemented by this PR
> >>> .
> >>> --
> >>> Nick Telford
> >>>
> >>
> >
>


Re: [DISCUSS] KIP-811 Add separate delete.interval.ms to Kafka Streams

2021-12-21 Thread Bruno Cadonna

Hi Nick,

Thank you for the KIP!

I agree on points 1, 2, and 3. I am not sure about point 4. I agree that
we should update the docs for commit.interval.ms, but I am not sure if
this needs to be mentioned in the KIP. That seems to me more of a PR
discussion. Also on point 2, I agree that we need to add a doc string,
but the content should be exemplary, not binding. What I want to say is
that we do not need a KIP to change docs.

Here are my points:

5. Could you specify in the motivation that the KIP is about deleting
records from repartition topics? Maybe with a short description of why
and when records are deleted from the repartition topics. For us it
might be clear, but IMO we should try to write KIPs so that someone who
is relatively new to Kafka Streams can understand the KIP without
needing to know too much background.

6. Does the config need to be validated? For example, does
delete.interval.ms need to be greater than or equal to commit.interval.ms?

7. Should the default value for non-EOS be 30s or the same value as
commit.interval.ms? I am just thinking about the case where a user
explicitly changes commit.interval.ms but not delete.interval.ms (or
whatever name you come up with for it). Once delete.interval.ms is set
explicitly, it is decoupled from commit.interval.ms. Similar could be
done for the EOS case.
Alternatively, we could also define delete.interval.ms to take an
integral number without a unit that specifies after how many commit
intervals the records in repartition topics should be deleted. This
would make sense since delete.interval.ms is tightly bound to
commit.interval.ms. Additionally, it would make the semantics of the
config simpler. The name of the config should definitely change if we go
down this path.


Best,
Bruno



On 21.12.21 11:14, Luke Chen wrote:

Hi Nick,

Thanks for the KIP!

In addition to Sophie's comments, I have one more comment on this KIP:
3. I think you should mention the behavior change *explicitly* in the
"Compatibility" section. I know you already mentioned it in the KIP, as a
benefit. But in this section, we should clearly point out which behavior
will change after this KIP. That is, you should make it clear that the
delete record interval will change from 100ms to 30s with EOS enabled. And
it should also be mentioned in the doc/upgrade.html doc.
4. Since this new config has some relationship with commit.interval.ms, I
think we should also update the doc description for `commit.interval.ms`
to let users know there's another config to control the delete interval,
which should be greater than commit.interval.ms. Something like that. WDYT?
(You should put this change in the KIP as Sophie mentioned.)

Thank you.
Luke

On Tue, Dec 21, 2021 at 9:27 AM Sophie Blee-Goldman
 wrote:


Hey Nick,

I think you forgot to link to the KIP document, but I take it this is
it: KIP-811:
Add separate delete.interval.ms to Kafka Streams


The overall proposal sounds good to me, just a few minor things:

1. Please specify everything needed to define this config explicitly,
i.e. all the arguments that will be passed into the StreamsConfig's
ConfigDef: in addition to the default value, we need the config type
(presumably a Long), the doc string, and the importance (probably "low",
similar to commit.interval.ms).
2. Might be worth considering a slightly more descriptive name for this
config. Most users probably don't think about, or may not even be aware
of, the deletion of consumed records by Kafka Streams, so calling it
something along the lines of "repartition.records.delete.interval.ms" or
"consumed.records.deletion.interval.ms" or so on will help make it clear
what the config refers to and whether or not they need to care. Maybe you
can come up with better and/or shorter names; I just wanted to suggest
some example names that I think sufficiently get the point across.

Other than that I'm +1 -- thanks for the KIP!

Sophie



On Mon, Dec 20, 2021 at 9:15 AM Nick Telford 
wrote:


This is a KIP for a proposed solution to KAFKA-13549
. The solution is very
simple, so the KIP is pretty short.

The suggested changes are implemented by this PR
.
--
Nick Telford







MDC context for Kafka Streams

2021-12-21 Thread Maks Usanin
Hi team. My current task is updating the MDC context for Kafka Streams. Is it
possible to update the MDC context per thread and reset it to the original MDC
context after the thread has finished execution? Does Kafka Streams support
this? Any ideas? Thanks
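
For what it's worth, MDC in SLF4J is already per-thread, so the usual pattern
(plain SLF4J, nothing Kafka Streams specific; a hedged sketch) is to capture
the context map, install it in the worker, and restore the previous map
afterwards:

    import java.util.Map;
    import org.slf4j.MDC;

    public class MdcHandoffSketch {
        public static Runnable withMdc(Map<String, String> captured, Runnable task) {
            return () -> {
                Map<String, String> previous = MDC.getCopyOfContextMap();
                if (captured != null) {
                    MDC.setContextMap(captured);
                }
                try {
                    task.run(); // logging inside sees the captured MDC context
                } finally {
                    // Reset to whatever this thread had before.
                    if (previous != null) {
                        MDC.setContextMap(previous);
                    } else {
                        MDC.clear();
                    }
                }
            };
        }
    }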


Re: [DISCUSS] KIP-811 Add separate delete.interval.ms to Kafka Streams

2021-12-21 Thread Luke Chen
Hi Nick,

Thanks for the KIP!

In addition to Sophie's comments, I have one more comment on this KIP:
3. I think you should mention the behavior change *explicitly* in the
"Compatibility" section. I know you already mentioned it in the KIP, as a
benefit. But in this section, we should clearly point out which behavior
will change after this KIP. That is, you should make it clear that the
delete record interval will change from 100ms to 30s with EOS enabled. And
it should also be mentioned in the doc/upgrade.html doc.
4. Since this new config has some relationship with commit.interval.ms, I
think we should also update the doc description for `commit.interval.ms`
to let users know there's another config to control the delete interval,
which should be greater than commit.interval.ms. Something like that. WDYT?
(You should put this change in the KIP as Sophie mentioned.)

Thank you.
Luke

On Tue, Dec 21, 2021 at 9:27 AM Sophie Blee-Goldman
 wrote:

> Hey Nick,
>
> I think you forgot to link to the KIP document, but I take it this is
> it: KIP-811:
> Add separate delete.interval.ms to Kafka Streams
> 
>
> The overall proposal sounds good to me, just a few minor things:
>
>    1. Please specify everything needed to define this config explicitly,
>    i.e. all the arguments that will be passed into the StreamsConfig's
>    ConfigDef: in addition to the default value, we need the config type
>    (presumably a Long), the doc string, and the importance (probably
>    "low", similar to commit.interval.ms).
>    2. Might be worth considering a slightly more descriptive name for this
>    config. Most users probably don't think about, or may not even be aware
>    of, the deletion of consumed records by Kafka Streams, so calling it
>    something along the lines of "repartition.records.delete.interval.ms"
>    or "consumed.records.deletion.interval.ms" or so on will help make it
>    clear what the config refers to and whether or not they need to care.
>    Maybe you can come up with better and/or shorter names; I just wanted
>    to suggest some example names that I think sufficiently get the point
>    across.
>
> Other than that I'm +1 -- thanks for the KIP!
>
> Sophie
>
>
>
> On Mon, Dec 20, 2021 at 9:15 AM Nick Telford 
> wrote:
>
> > This is a KIP for a proposed solution to KAFKA-13549
> > . The solution is very
> > simple, so the KIP is pretty short.
> >
> > The suggested changes are implemented by this PR
> > .
> > --
> > Nick Telford
> >
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #41

2021-12-21 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 501411 lines...]
[2021-12-21T09:54:41.892Z] > Task :raft:testClasses UP-TO-DATE
[2021-12-21T09:54:41.892Z] > Task :connect:json:testJar
[2021-12-21T09:54:42.733Z] > Task :connect:json:testSrcJar
[2021-12-21T09:54:42.733Z] > Task :metadata:compileTestJava UP-TO-DATE
[2021-12-21T09:54:42.733Z] > Task :metadata:testClasses UP-TO-DATE
[2021-12-21T09:54:42.733Z] > Task :clients:generateMetadataFileForMavenJavaPublication
[2021-12-21T09:54:42.733Z] > Task :clients:generatePomFileForMavenJavaPublication
[2021-12-21T09:54:42.733Z] 
[2021-12-21T09:54:42.733Z] > Task :streams:processMessages
[2021-12-21T09:54:42.733Z] Execution optimizations have been disabled for task 
':streams:processMessages' to ensure correctness due to the following reasons:
[2021-12-21T09:54:42.733Z]   - Gradle detected a problem with the following 
location: 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1/streams/src/generated/java/org/apache/kafka/streams/internals/generated'.
 Reason: Task ':streams:srcJar' uses this output of task 
':streams:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2021-12-21T09:54:42.733Z] MessageGenerator: processed 1 Kafka message JSON 
files(s).
[2021-12-21T09:54:42.733Z] 
[2021-12-21T09:54:42.733Z] > Task :streams:compileJava UP-TO-DATE
[2021-12-21T09:54:42.733Z] > Task :streams:classes UP-TO-DATE
[2021-12-21T09:54:42.733Z] > Task :streams:test-utils:compileJava UP-TO-DATE
[2021-12-21T09:54:43.946Z] > Task :streams:copyDependantLibs
[2021-12-21T09:54:43.946Z] > Task :streams:jar UP-TO-DATE
[2021-12-21T09:54:43.946Z] > Task :streams:generateMetadataFileForMavenJavaPublication
[2021-12-21T09:54:46.022Z] > Task :connect:api:javadoc
[2021-12-21T09:54:46.022Z] > Task :connect:api:copyDependantLibs UP-TO-DATE
[2021-12-21T09:54:46.022Z] > Task :connect:api:jar UP-TO-DATE
[2021-12-21T09:54:46.022Z] > Task :connect:api:generateMetadataFileForMavenJavaPublication
[2021-12-21T09:54:46.022Z] > Task :connect:json:copyDependantLibs UP-TO-DATE
[2021-12-21T09:54:46.022Z] > Task :connect:json:jar UP-TO-DATE
[2021-12-21T09:54:46.022Z] > Task :connect:json:generateMetadataFileForMavenJavaPublication
[2021-12-21T09:54:46.022Z] > Task :connect:api:javadocJar
[2021-12-21T09:54:46.022Z] > Task :connect:json:publishMavenJavaPublicationToMavenLocal
[2021-12-21T09:54:46.022Z] > Task :connect:json:publishToMavenLocal
[2021-12-21T09:54:46.022Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2021-12-21T09:54:46.022Z] > Task :connect:api:testClasses UP-TO-DATE
[2021-12-21T09:54:46.022Z] > Task :connect:api:testJar
[2021-12-21T09:54:46.022Z] > Task :connect:api:testSrcJar
[2021-12-21T09:54:46.022Z] > Task :connect:api:publishMavenJavaPublicationToMavenLocal
[2021-12-21T09:54:46.022Z] > Task :connect:api:publishToMavenLocal
[2021-12-21T09:54:49.568Z] > Task :streams:javadoc
[2021-12-21T09:54:50.490Z] > Task :streams:javadocJar
[2021-12-21T09:54:51.692Z] > Task :clients:javadoc
[2021-12-21T09:54:51.692Z] > Task :clients:javadocJar
[2021-12-21T09:54:52.614Z] 
[2021-12-21T09:54:52.614Z] > Task :clients:srcJar
[2021-12-21T09:54:52.614Z] Execution optimizations have been disabled for task 
':clients:srcJar' to ensure correctness due to the following reasons:
[2021-12-21T09:54:52.614Z]   - Gradle detected a problem with the following 
location: 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1/clients/src/generated/java'.
 Reason: Task ':clients:srcJar' uses this output of task 
':clients:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2021-12-21T09:54:53.536Z] 
[2021-12-21T09:54:53.536Z] > Task :clients:testJar
[2021-12-21T09:54:53.536Z] > Task :clients:testSrcJar
[2021-12-21T09:54:53.536Z] > Task :clients:publishMavenJavaPublicationToMavenLocal
[2021-12-21T09:54:53.536Z] > Task :clients:publishToMavenLocal
[2021-12-21T09:55:14.514Z] > Task :core:compileScala
[2021-12-21T09:58:38.268Z] > Task :core:classes
[2021-12-21T09:58:38.268Z] > Task :core:compileTestJava NO-SOURCE
[2021-12-21T09:59:08.824Z] > Task :core:compileTestScala
[2021-12-21T10:00:16.415Z] > Task :core:testClasses
[2021-12-21T10:00:30.676Z] > Task :streams:compileTestJava
[2021-12-21T10:00:30.676Z] > Task :streams:testClasses
[2021-12-21T10:00:30.676Z] > Task :streams:testJar
[2021-12-21T10:00:31.601Z] > Task :streams:testSrcJar
[2021-12-21T10:00:31.601Z] > Task :streams:publishMavenJavaPublic

Re: [DISCUSS] KIP-719: Add Log4J2 Appender

2021-12-21 Thread Dongjin Lee
Hi Mickael,

> In the meantime, you may want to bump the VOTE thread too.

Sure, I just reset the voting thread with a brief context.

Thanks,
Dongjin

On Tue, Dec 21, 2021 at 2:13 AM Mickael Maison 
wrote:

> Thanks Dongjin!
>
> I'll take a look soon.
> In the meantime, you may want to bump the VOTE thread too.
>
> Best,
> Mickael
>
>
> On Sat, Dec 18, 2021 at 10:00 AM Dongjin Lee  wrote:
> >
> > Hi Mickael,
> >
> > Finally, I did it! As you can see at the PR
> > , KIP-719 now uses log4j2's
> > Kafka appender, and log4j-appender is not used by the other modules
> > anymore. You can see how it will work with KIP-653 at this preview
> > , based on Apache Kafka 3.0.0. The proposal document
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-719%3A+Deprecate+Log4J+Appender>
> > is also updated accordingly, with its title.
> >
> > There is a minor issue on log4j2
> > , but it seems like it will be resolved soon.
> >
> > Best,
> > Dongjin
> >
> > On Wed, Dec 15, 2021 at 9:28 PM Dongjin Lee  wrote:
> >
> > > Hi Mickael,
> > >
> > > > Can we do step 3 without breaking any compatibility? If so then that
> > > sounds like a good idea.
> > >
> > > As far as I know, the answer is yes; I am now updating my PR, so I will
> > > notify you as soon as I complete the work.
> > >
> > > Best,
> > > Dongjin
> > >
> > > On Wed, Dec 15, 2021 at 2:00 AM Mickael Maison <
> mickael.mai...@gmail.com>
> > > wrote:
> > >
> > >> Hi Dongjin,
> > >>
> > >> Sorry for the late reply. Can we do step 3 without breaking any
> > >> compatibility? If so then that sounds like a good idea.
> > >>
> > >> Thanks,
> > >> Mickael
> > >>
> > >>
> > >>
> > >> On Tue, Nov 23, 2021 at 2:08 PM Dongjin Lee 
> wrote:
> > >> >
> > >> > Hi Mickael,
> > >> >
> > >> > I also thought over the issue thoroughly and would like to propose a
> > >> minor
> > >> > change to your proposal:
> > >> >
> > >> > 1. Deprecate log4j-appender now
> > >> > 2. Document how to migrate into logging-log4j2
> > >> > 3. (Changed) Replace the log4j-appender (in turn log4j 1.x)
> > >> > dependencies in tools, trogdor, and shell and upgrade to log4j2 in
> > >> > 3.x, removing log4j 1.x dependencies.
> > >> > 4. (Changed) Remove log4j-appender in Kafka 4.0
> > >> >
> > >> > What we need to do for the log4j2 upgrade is just removing the
> > >> > log4j dependencies, as they can cause a classpath error. And
> > >> > actually, we can do it without discontinuing publishing the
> > >> > log4j-appender artifact. So, I suggest separating the upgrade to
> > >> > log4j2 from removing the log4j-appender module.
> > >> >
> > >> > What do you think? If you agree, I will update the KIP and the PR
> > >> > accordingly ASAP.
> > >> >
> > >> > Thanks,
> > >> > Dongjin
> > >> >
> > >> > On Mon, Nov 15, 2021 at 8:06 PM Mickael Maison <
> > >> mickael.mai...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Dongjin,
> > >> > >
> > >> > > Thanks for the clarifications.
> > >> > >
> > >> > > I wonder if a simpler course of action could be:
> > >> > > - Deprecate log4j-appender now
> > >> > > - Document how to use logging-log4j2
> > >> > > - Remove log4j-appender and all the log4j dependencies in Kafka
> 4.0
> > >> > >
> > >> > > This delays KIP-653 till Kafka 4.0 but (so far) Kafka is not
> directly
> > >> > > affected by the log4j CVEs. At least this gives us a clear and
> simple
> > >> > > roadmap to follow.
> > >> > >
> > >> > > What do you think?
> > >> > >
> > >> > > On Tue, Nov 9, 2021 at 12:12 PM Dongjin Lee 
> > >> wrote:
> > >> > > >
> > >> > > > Hi Mickael,
> > >> > > >
> > >> > > > I greatly appreciate you for reading the proposal so carefully!
> I
> > >> wrote
> > >> > > it
> > >> > > > quite a while ago and rechecked it today.
> > >> > > >
> > >> > > > > Is the KIP proposing to replace the existing log4j-appender
> > >> > > > > or simply add a new one for log4j2? Reading the KIP and with
> > >> > > > > its current title, it's not entirely explicit.
> > >> > > >
> > >> > > > Oh, after re-reading it, I realized that this is not clear. Let
> > >> > > > me clarify:
> > >> > > >
> > >> > > > 1. Provide a log4j2 equivalent of the traditional log4j-appender,
> > >> > > > log4j2-appender.
> > >> > > > 2. Migrate the modules depending on log4j-appender (i.e., tools,
> > >> > > > trogdor, shell) to log4j2-appender, removing log4j-appender from
> > >> > > > their dependencies.
> > >> > > > 3. Entirely remove log4j-appender from the project dependencies,
> > >> > > > along with log4j.
> > >> > > >
> > >> > > > I think log4j-appender may be published for every new release
> > >> > > > like before, but the committee should make a decision on the
> > >> > > > policy.
> > >> > > >
> > >> > > > > Under Rejected Alternative, the KIP states: "t

[VOTE] KIP-719: Deprecate Log4J Appender

2021-12-21 Thread Dongjin Lee
Hi All,

I hope to reboot the voting for KIP-719: Deprecate Log4J Appender.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-719%3A+Deprecate+Log4J+Appender

Here is some context: the purpose of this proposal is to remove
log4j-appender from the dependencies. Initially, the proposal focused on
deprecating log4j-appender and introducing a log4j2 equivalent. However, as
the discussion proceeded, it became clear that log4j-appender can be removed
from the dependencies without introducing a new subproject.

For the reasons above, the point of the proposal was slightly changed:

- Before: Deprecate the log4j-appender and add log4j2-appender
- After: Deprecate the log4j-appender and replace the current dependency
with log4j 2.x's Kafka appender.
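
For readers who haven't used it, the migration target is log4j 2.x's built-in
Kafka appender; a minimal configuration sketch per the log4j2 manual (the
topic and bootstrap.servers values are placeholders, and note the manual's
caveat about not routing org.apache.kafka logs back into this appender, which
would recurse):

    <Configuration>
      <Appenders>
        <!-- Ships with log4j 2.x; produces each log event to a Kafka topic. -->
        <Kafka name="kafkaAppender" topic="app-logs">
          <PatternLayout pattern="%date %level %logger %message%n"/>
          <Property name="bootstrap.servers">localhost:9092</Property>
        </Kafka>
      </Appenders>
      <Loggers>
        <Root level="info">
          <AppenderRef ref="kafkaAppender"/>
        </Root>
      </Loggers>
    </Configuration>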

This is why I reset the voting thread.

Best,
Dongjin

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*



github: github.com/dongjinleekr
keybase: https://keybase.io/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
speakerdeck: speakerdeck.com/dongjin