[GitHub] [kafka-site] guozhangwang commented on pull request #402: MINOR: Unsubscribe mailing lists

2022-04-01 Thread GitBox


guozhangwang commented on pull request #402:
URL: https://github.com/apache/kafka-site/pull/402#issuecomment-1086398541


   This is a good improvement on the docs, thanks @tombentley !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka-site] guozhangwang merged pull request #402: MINOR: Unsubscribe mailing lists

2022-04-01 Thread GitBox


guozhangwang merged pull request #402:
URL: https://github.com/apache/kafka-site/pull/402


   






Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #831

2022-04-01 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-813 Shared State Stores

2022-04-01 Thread Matthias J. Sax

+1 (binding)


On 4/1/22 6:47 AM, John Roesler wrote:

Thanks for the KIP, Daan!

I’m +1 (binding)

-John

On Tue, Mar 29, 2022, at 06:01, Daan Gertis wrote:

I would like to start a vote on this one:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-813%3A+Shareable+State+Stores

Cheers,
D.


Re: [DISCUSS] KIP-0422: Add Record Footers

2022-04-01 Thread Anna McDonald
Excellent addition, can we consider building in priority for all footer
elements such that each could be designated high or low?

Cheers,
Anna

On Fri, Apr 1, 2022, 10:56 AM Chris Egerton  wrote:

> Hi Matthias,
>
> Thanks for the KIP! For context, it looks like this was discussed earlier
> at Kafka Summit (recording here: https://youtu.be/dQw4w9WgXcQ); could we
> add that to the background section?
>
> Cheers,
>
> Chris
>
> On Fri, Apr 1, 2022, 10:51 Bill Bejeck  wrote:
>
> > Hi Matthias
> >
> > This is a great idea, and I have one suggestion, can we name them record
> > footnotes?
> >
> > Thanks,
> > Bill
> >
> > On Fri, Apr 1, 2022 at 10:45 AM Matthias J. Sax 
> wrote:
> >
> > > Hi,
> > >
> > > we added record header support to Kafka via KIP-82 many years ago. I
> > > think it's time to complement this feature with record footers.
> > >
> > > Looking forward to your feedback.
> > >
> > > https://tinyurl.com/43jubbaj
> > >
> > >
> > > -Matthias
> > >
> >
>


[GitHub] [kafka-site] tombentley opened a new pull request #402: MINOR: Unsubscribe mailing lists

2022-04-01 Thread GitBox


tombentley opened a new pull request #402:
URL: https://github.com/apache/kafka-site/pull/402


   We fairly frequently get people emailing `dev@` asking how to unsubscribe. 
Although this was described on /contact, it maybe wasn't very obvious for 
people not reading the whole page. So give explicit instructions to unsubscribe 
from each list.






Re: Kafka Streams Issue

2022-04-01 Thread John Roesler
Hi Daan,

First of all, it does sound like that is a correct
implementation of QueryableStoreProvider. Kudos for taking
that on; the complexity of that API was one of my top
motivations for replacing it with IQv2!
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-796%3A+Interactive+Query+v2
)

To answer your question directly: no, "activeHost" just means
the host that currently has the "activeTask" for the desired
store.

I suspect that this is either a subtle and rare edge case in
how the metadata gets updated, or it's just a simple race
condition between the query and a rebalance in the cluster,
which is a fact of life in any distributed database.
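
One way a caller might tolerate such a race is to re-resolve the metadata and retry when the queried host turns out not to hold the active task. The following is only a minimal sketch in plain Java under stated assumptions: `StoreMigratedException`, `RetryingQuery`, and the `lookupAndQuery` supplier are hypothetical stand-ins for illustration and are not Kafka Streams classes.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a "store may have migrated" failure; not the Kafka Streams class.
class StoreMigratedException extends RuntimeException {
    StoreMigratedException(String message) {
        super(message);
    }
}

final class RetryingQuery {
    private RetryingQuery() {}

    // Runs lookupAndQuery up to maxAttempts times, pausing backoffMillis between attempts.
    // Each attempt is expected to look up the active host again before querying it.
    static <V> V queryWithRetry(Supplier<V> lookupAndQuery, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        StoreMigratedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookupAndQuery.get();
            } catch (StoreMigratedException e) {
                last = e; // metadata was stale; the next attempt resolves it again
                if (attempt < maxAttempts) {
                    sleepQuietly(backoffMillis);
                }
            }
        }
        throw last; // every attempt failed; surface the most recent failure
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

A retry like this only hides a transient race; if the mismatch persists across attempts, the logs described below are still needed to find the cause.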

If you are able to reproduce it and send us the logs, we
should be able to tell which is which.

In particular, we'd need to see three things in the logs:
1. The logs for the rebalances and assignments (which are on
by default)
2. The log of when you check the metadata and what the
result is
3. The log of when the query tries to run on the
"activeHost" and what it finds there (that the task is only
a standby)

One other possibility worth considering is whether the
queryMetadataForKey is producing the correct partition. What
it does is run the provided key through the provided
serializer and then run the serialized key through the
default partitioner. If your actual data isn't partitioned
the same way, then queryMetadataForKey might be effectively
selecting a random host, which sometimes happens to host the
active task and other times does not? Kind of a long shot,
but I just wanted to put it out there.
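
The partitioning concern above can be checked offline by comparing the partition chosen when the data was written with the one chosen at lookup time for the same serialized key. The sketch below is plain Java with no Kafka dependency. Both hash functions in `main` are hypothetical stand-ins (`Arrays.hashCode` and the key length), not Kafka's default partitioner, and `PartitionCheck` is an illustrative name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.ToIntFunction;

final class PartitionCheck {
    private PartitionCheck() {}

    // Maps a hash of the serialized key onto the range [0, numPartitions).
    static int partitionFor(byte[] keyBytes, int numPartitions, ToIntFunction<byte[]> hash) {
        return Math.floorMod(hash.applyAsInt(keyBytes), numPartitions);
    }

    // True when the scheme that wrote the data and the scheme used for the lookup agree.
    static boolean samePartition(byte[] keyBytes, int numPartitions,
                                 ToIntFunction<byte[]> writerHash,
                                 ToIntFunction<byte[]> lookupHash) {
        return partitionFor(keyBytes, numPartitions, writerHash)
                == partitionFor(keyBytes, numPartitions, lookupHash);
    }

    public static void main(String[] args) {
        byte[] key = "order-42".getBytes(StandardCharsets.UTF_8);
        ToIntFunction<byte[]> lookupHash = Arrays::hashCode;      // stand-in for the lookup side
        ToIntFunction<byte[]> customHash = bytes -> bytes.length; // stand-in for a custom writer
        System.out.println("same scheme agrees: "
                + samePartition(key, 6, lookupHash, lookupHash));
        System.out.println("custom scheme agrees: "
                + samePartition(key, 6, customHash, lookupHash));
    }
}
```

If the two schemes disagree for a meaningful share of keys, the metadata lookup would point at a host chosen by the wrong partition, which matches the "sometimes active, sometimes not" symptom.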

Thanks,
-John


On Mon, 2022-03-28 at 13:48 +, Daan Gertis wrote:
> Hi All,
> 
> We are experiencing some weird behaviour with our interactive query service 
> implementation.
> This is the flow we’ve implemented:
> 
> 
>   1.  kafkaStreams.queryMetadataForKey(store, key, serializer) returns for 
> activeHost HostInfo{host='localhost', port=8562}, and standbyHosts [] for the 
> store and partition where the key would reside. We are not interested in 
> standby hosts. Luckily, we have an active host which we can call.
>   2.  We make an HTTP call to host localhost:8562, asking for the key there.
>   3.  Inside the 8562 host, we retrieve the store by calling 
> kafkaStreams.store(parameters), using parameters with staleStores set to 
> false.
>   4.  We call kafkaStreams.state().equals(RUNNING) to make sure we’re in the 
> RUNNING state.
>   5.  Now we call store.get(key) in order to retrieve the key from the store, 
> if it has been stored there.
>   6.  The get method on our store implementation calls the 
> storeProvider.stores(storeName, storeType) method to iterate over all the 
> stores available on the host.
>   7.  The storeProvider is a WrappingStoreProvider, which calls 
> storeProvider.stores(storeQueryParameters) for each 
> StreamThreadStateStoreProvider it wraps (just one in our case).
>   8.  As the logic inside that stores method finds that the StreamThread is 
> in the RUNNING state, it retrieves the tasks based on 
> storeQueryParams.staleStoresEnabled() ? streamThread.allTasks().values() : 
> streamThread.activeTasks(), which evaluates to false since we set staleStores 
> to false in the params.
>   9.  To our surprise, the streamThread.activeTasks() method returns an empty 
> ArrayList, while the streamThread.allTasks().values() returns one StandbyTask 
> for the store we’re looking for.
>   10. As there appear to be no active tasks on this host for this store, we 
> return the fabled “The state store, " + storeName + ", may have migrated to 
> another instance.” InvalidStateStoreException.
> 
> This flow is quite tricky as the queryMetadataForKey returns an active host, 
> which turns out to only have a standby task once queried.
> I have executed the queryMetadataForKey method on the active host as well, 
> once before calling kafkaStreams.store in step 3, and another time between 
> step 4 and 5. Each time the metadata returns the same, the host we’re on at 
> that moment is the active host.
> 
> Could it be there is a difference between activeHost and activeTask?
> 
> Those of you who are also on the Confluent community Slack might recognize 
> this message, as it has been posted there by our CTO as well.
> 
> Cheers,
> D.



Re: Kafka DEV Mail List ?

2022-04-01 Thread Mickael Maison
Hi Jeremy,

You can sign up yourself, just follow the steps on
https://kafka.apache.org/contact

Thanks,
Mickael

On Fri, Apr 1, 2022 at 4:41 PM Jeremy Amer  wrote:
>
> Hi!
> I am currently reading (learning!) "Kafka The Definitive Guide", where it
> says I can sign up to a Kafka Mail List ?
>
> Please could you add my name ?
>
> Thanks
> /Jeremy Amer


Re: [HEADS-UP] Modification to KIP Template

2022-04-01 Thread Mickael Maison
Hi,

Unearthing this old thread as today I stumbled on the issue that
Ismael reported. It looks like this was never fixed!

The "Test Plan" section was only added in the KIP-template page [0]
and not in the actual KIP-template template [1] that is used when
doing `Create -> KIP-Template` or by clicking on `Create KIP` on
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

I think this new section makes sense and it's very easy to add it to
the actual template. Before doing it, I just want to ping the dev list
to see if anybody has suggestions or concerns since this was discussed
many years ago now.

0: https://cwiki.apache.org/confluence/display/KAFKA/KIP-Template
1: 
https://cwiki.apache.org/confluence/pages/templates2/viewpagetemplate.action?entityId=54329345=KAFKA

Thanks,
Mickael

On Fri, May 27, 2016 at 10:55 AM Ismael Juma  wrote:
>
> Hi Gwen,
>
> Thanks for adding the "Test Plans" section. I think it may be worth adding
> a note about performance testing plans too (whenever relevant). By the way,
> even though the following page has the new section, if I use `Create ->
> KIP-Template`, the new section doesn't appear. Do you know why?
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-Template
>
> Ismael
>
> On Fri, May 27, 2016 at 3:24 AM, Gwen Shapira  wrote:
>
> > Hi Kafka Developers,
> >
> > Just a quick heads-up that I added a new section to the KIP template: "Test
> > Plans".
> > I think it's a good habit to think about how a feature will be tested while
> > planning it. I'm talking about high-level notes on system tests, not gritty
> > details.
> >
> > This will apply to new KIPs, not ones in discussion/implementation phases
> > (although if your KIP is under discussion and you want to add test plans,
> > it will be very nice of you).
> >
> > I figured we all agree that thinking a bit about tests is a good idea, so I
> > added it first and started a discussion later. If you strongly object,
> > please respond with strong objections. Wikis are easy to edit :)
> >
> > Gwen
> >


Re: [DISCUSS] KIP-0422: Add Record Footers

2022-04-01 Thread Chris Egerton
Hi Matthias,

Thanks for the KIP! For context, it looks like this was discussed earlier
at Kafka Summit (recording here: https://youtu.be/dQw4w9WgXcQ); could we
add that to the background section?

Cheers,

Chris

On Fri, Apr 1, 2022, 10:51 Bill Bejeck  wrote:

> Hi Matthias
>
> This is a great idea, and I have one suggestion, can we name them record
> footnotes?
>
> Thanks,
> Bill
>
> On Fri, Apr 1, 2022 at 10:45 AM Matthias J. Sax  wrote:
>
> > Hi,
> >
> > we added record header support to Kafka via KIP-82 many years ago. I
> > think it's time to complement this feature with record footers.
> >
> > Looking forward to your feedback.
> >
> > https://tinyurl.com/43jubbaj
> >
> >
> > -Matthias
> >
>


Re: [DISCUSS] KIP-0422: Add Record Footers

2022-04-01 Thread Bill Bejeck
Hi Matthias

This is a great idea, and I have one suggestion, can we name them record
footnotes?

Thanks,
Bill

On Fri, Apr 1, 2022 at 10:45 AM Matthias J. Sax  wrote:

> Hi,
>
> we added record header support to Kafka via KIP-82 many years ago. I
> think it's time to complement this feature with record footers.
>
> Looking forward to your feedback.
>
> https://tinyurl.com/43jubbaj
>
>
> -Matthias
>


[DISCUSS] KIP-0422: Add Record Footers

2022-04-01 Thread Matthias J. Sax

Hi,

we added record header support to Kafka via KIP-82 many years ago. I 
think it's time to complement this feature with record footers.


Looking forward to your feedback.

https://tinyurl.com/43jubbaj


-Matthias


Fwd: Kafka DEV Mail List ?

2022-04-01 Thread Jeremy Amer
Hi!
I am currently reading (learning!) "Kafka The Definitive Guide", where it
says I can sign up to a Kafka Mail List ?

Please could you add my name ?

Thanks
/Jeremy Amer


Re: [VOTE] KIP-813 Shared State Stores

2022-04-01 Thread John Roesler
Thanks for the KIP, Daan!

I’m +1 (binding)

-John

On Tue, Mar 29, 2022, at 06:01, Daan Gertis wrote:
> I would like to start a vote on this one:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-813%3A+Shareable+State+Stores
>
> Cheers,
> D.


Re: [DISCUSS] KIP-813 Shared State Stores

2022-04-01 Thread John Roesler
Thanks for the replies, Daan,

That all sounds good to me. I think standbys will probably come naturally, but 
we should make sure the implementation includes an integration test to make 
sure. Anyway, I just wanted to make sure we were on the same page. 

Thanks again,
John

On Fri, Apr 1, 2022, at 08:16, Daan Gertis wrote:
> Hey John,
>
>
>   *   1. Am I right in thinking that there’s no way to enforce that the 
> stores are actually read-only? It seems like the StoreBuilder 
> interface is too generic for that. If that’s true, I think it’s fine, 
> but we should be sure the JavaDoc clearly states that other processors 
> must not write into these stores (except for the one that feeds it).
>
> Yeah I couldn’t really find a way to limit it easily. We might be able 
> to throw unsupported exceptions by wrapping the statestore, but that 
> seems kind of brittle to do and feels a bit like a hack.
>
> Also, the function name clearly states it should be considered readonly.
>
>
>   *   2. Are you planning for these stores to get standbys as well? I 
> would think so, otherwise the desired purpose of standbys (eliminating 
> restoration latency during failover) would not be served.
>
> Yeah I think standbys should be applicable here as well. But we get 
> that by implementing these readonly statestores as regular ones right?
>
> Cheers,
> D.
>
>
> From: John Roesler 
> Date: Friday, 1 April 2022 at 04:01
> To: dev@kafka.apache.org 
> Subject: Re: [DISCUSS] KIP-813 Shared State Stores
> Hi Daan,
>
> Thanks for the KIP!
>
> I just got caught up on the discussion. I just have some small 
> questions, and then I will be ready to vote.
>
> 1. Am I right in thinking that there’s no way to enforce that the stores 
> are actually read-only? It seems like the StoreBuilder interface 
> is too generic for that. If that’s true, I think it’s fine, but we 
> should be sure the JavaDoc clearly states that other processors must 
> not write into these stores (except for the one that feeds it).
>
>  2. Are you planning for these stores to get standbys as well? I would 
> think so, otherwise the desired purpose of standbys (eliminating 
> restoration latency during failover) would not be served.
>
> Thanks,
> John
>
> On Mon, Mar 7, 2022, at 13:13, Matthias J. Sax wrote:
>> Thanks for updating the KIP. LGTM.
>>
>> I think we can start a vote.
>>
>>
>>>  I think this might provide issues if your processor is doing a projection 
>>> of the data.
>>
>> This is correct. It's a known issue:
>> https://issues.apache.org/jira/browse/KAFKA-7663
>>
>> Global-stores/KTables are designed to put the data into the store
>> _unmodified_.
>>
>>
>> -Matthias
>>
>> On 2/28/22 5:05 AM, Daan Gertis wrote:
>>> Updated the KIP to be more aligned with global state store function names.
>>>
>>> If I remember correctly during restore the processor will not be used 
>>> right? I think this might provide issues if your processor is doing a 
>>> projection of the data. Either way, I would not add that into this KIP 
>>> since it is a specific use-case pattern.
>>>
>>> Unless there is anything more to add or change, I would propose moving to a 
>>> vote?
>>>
>>> Cheers!
>>> D.
>>>
>>> From: Matthias J. Sax 
>>> Date: Friday, 18 February 2022 at 03:29
>>> To: dev@kafka.apache.org 
>>> Subject: Re: [DISCUSS] KIP-813 Shared State Stores
>>> Thanks for updating the KIP!
>>>
>>> I am wondering if we would need two overloads of `addReadOnlyStateStore`
>>> one w/ and one w/o `TimestampExtractor` argument to effectively make it
>>> an "optional" parameter?
>>>
>>> Also wondering if we need to pass in a `String sourceName` and `String
>>> processorName` parameters (similar to `addGlobalStore()`?) instead of
>>> re-using the store name as currently proposed? -- In general I don't
>>> have a strong opinion either way, but it seems to introduce some API
>>> inconsistency if we don't follow the `addGlobalStore()` pattern?
>>>
>>>
>>>> Another thing we were confronted with was the restoring of state when the 
>>>> actual local storage is gone. For example, we host on K8s with ephemeral 
>>>> pods, so there is no persisted storage between pod restarts. However, the 
>>>> consumer group will already be at the latest offset, preventing 
>>>> previous data from being restored within the new pod’s statestore.
>>>
>>> We already have code in place in the runtime to do the right thing for
>>> this case (ie, via DSL source-table changelog optimization). We can
>>> re-use this part. It's nothing we need to discuss on the KIP, but we can
>>> discuss on the PR later.
>>>
>>>
>>> -Matthias
>>>
>>>
>>> On 2/17/22 10:09 AM, Guozhang Wang wrote:
>>>> Hi Daan,
>>>>
>>>> I think for the read-only state stores you'd need to slightly augment the
>>>> checkpointing logic so that it would still write the checkpointed offsets
>>>> while restoring from the changelogs.
>>>>
>>>>
>>>> Guozhang
>>>>
>>>> On Thu, Feb 17, 2022 at 7:02 AM Daan Gertis 
>>>> wrote:

>> Could you 

Re: [DISCUSS] KIP-813 Shared State Stores

2022-04-01 Thread Daan Gertis
Hey John,


  *   1. Am I right in thinking that there’s no way to enforce that the stores 
are actually read-only? It seems like the StoreBuilder interface is too 
generic for that. If that’s true, I think it’s fine, but we should be sure the 
JavaDoc clearly states that other processors must not write into these stores 
(except for the one that feeds it).

Yeah I couldn’t really find a way to limit it easily. We might be able to throw 
unsupported exceptions by wrapping the statestore, but that seems kind of 
brittle to do and feels a bit like a hack.

Also, the function name clearly states it should be considered readonly.
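
The wrapping idea mentioned above could look roughly like the following. This is only a sketch in plain Java: `SimpleStore`, `InMemoryStore`, and `ReadOnlyStore` are hypothetical stand-ins and not the Kafka Streams `KeyValueStore` or `StoreBuilder` interfaces, so it shows the shape of the approach rather than how the KIP would implement it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal store interface, used only for illustration.
interface SimpleStore<K, V> {
    V get(K key);

    void put(K key, V value);
}

// In-memory store that the single feeding processor would write to.
final class InMemoryStore<K, V> implements SimpleStore<K, V> {
    private final Map<K, V> data = new HashMap<>();

    @Override
    public V get(K key) {
        return data.get(key);
    }

    @Override
    public void put(K key, V value) {
        data.put(key, value);
    }
}

// Wrapper handed to every other processor: reads pass through, writes are rejected.
final class ReadOnlyStore<K, V> implements SimpleStore<K, V> {
    private final SimpleStore<K, V> delegate;

    ReadOnlyStore(SimpleStore<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public V get(K key) {
        return delegate.get(key);
    }

    @Override
    public void put(K key, V value) {
        throw new UnsupportedOperationException("store is read-only for this processor");
    }
}
```

The brittleness shows up even in this small form: every mutating method of the wrapped interface needs the same treatment, and any code that still holds the unwrapped store can write to it.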


  *   2. Are you planning for these stores to get standbys as well? I would 
think so, otherwise the desired purpose of standbys (eliminating restoration 
latency during failover) would not be served.

Yeah I think standbys should be applicable here as well. But we get that by 
implementing these readonly statestores as regular ones right?

Cheers,
D.


From: John Roesler 
Date: Friday, 1 April 2022 at 04:01
To: dev@kafka.apache.org 
Subject: Re: [DISCUSS] KIP-813 Shared State Stores
Hi Daan,

Thanks for the KIP!

I just got caught up on the discussion. I just have some small questions, and 
then I will be ready to vote.

1. Am I right in thinking that there’s no way to enforce that the stores are 
actually read-only? It seems like the StoreBuilder interface is too 
generic for that. If that’s true, I think it’s fine, but we should be sure the 
JavaDoc clearly states that other processors must not write into these stores 
(except for the one that feeds it).

 2. Are you planning for these stores to get standbys as well? I would think 
so, otherwise the desired purpose of standbys (eliminating restoration latency 
during failover) would not be served.

Thanks,
John

On Mon, Mar 7, 2022, at 13:13, Matthias J. Sax wrote:
> Thanks for updating the KIP. LGTM.
>
> I think we can start a vote.
>
>
>>  I think this might provide issues if your processor is doing a projection 
>> of the data.
>
> This is correct. It's a known issue:
> https://issues.apache.org/jira/browse/KAFKA-7663
>
> Global-stores/KTables are designed to put the data into the store
> _unmodified_.
>
>
> -Matthias
>
> On 2/28/22 5:05 AM, Daan Gertis wrote:
>> Updated the KIP to be more aligned with global state store function names.
>>
>> If I remember correctly during restore the processor will not be used right? 
>> I think this might provide issues if your processor is doing a projection of 
>> the data. Either way, I would not add that into this KIP since it is a 
>> specific use-case pattern.
>>
>> Unless there is anything more to add or change, I would propose moving to a 
>> vote?
>>
>> Cheers!
>> D.
>>
>> From: Matthias J. Sax 
>> Date: Friday, 18 February 2022 at 03:29
>> To: dev@kafka.apache.org 
>> Subject: Re: [DISCUSS] KIP-813 Shared State Stores
>> Thanks for updating the KIP!
>>
>> I am wondering if we would need two overloads of `addReadOnlyStateStore`
>> one w/ and one w/o `TimestampExtractor` argument to effectively make it
>> an "optional" parameter?
>>
>> Also wondering if we need to pass in a `String sourceName` and `String
>> processorName` parameters (similar to `addGlobalStore()`?) instead of
>> re-using the store name as currently proposed? -- In general I don't
>> have a strong opinion either way, but it seems to introduce some API
>> inconsistency if we don't follow the `addGlobalStore()` pattern?
>>
>>
>>> Another thing we were confronted with was the restoring of state when the 
>>> actual local storage is gone. For example, we host on K8s with ephemeral 
>>> pods, so there is no persisted storage between pod restarts. However, the 
>>> consumer group will already be at the latest offset, preventing 
>>> previous data from being restored within the new pod’s statestore.
>>
>> We already have code in place in the runtime to do the right thing for
>> this case (ie, via DSL source-table changelog optimization). We can
>> re-use this part. It's nothing we need to discuss on the KIP, but we can
>> discuss on the PR later.
>>
>>
>> -Matthias
>>
>>
>> On 2/17/22 10:09 AM, Guozhang Wang wrote:
>>> Hi Daan,
>>>
>>> I think for the read-only state stores you'd need to slightly augment the
>>> checkpointing logic so that it would still write the checkpointed offsets
>>> while restoring from the changelogs.
>>>
>>>
>>> Guozhang
>>>
>>> On Thu, Feb 17, 2022 at 7:02 AM Daan Gertis 
>>> wrote:
>>>
> Could you add more details about the signature of
> `addReadOnlyStateStore()` -- What parameters does it take? Are there any
> overloads taking different parameters? The KIP only contains some verbal
> description on the "Implementation Plan" section, that is hard to find
> and hard to read.
>
> The KIP mentions a `ProcessorProvider` -- do you mean
 `ProcessorSupplier`?
>
> About timestamp synchronization: why do you propose to disable timestamp
> 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #830

2022-04-01 Thread Apache Jenkins Server
See