Re: [VOTE] KIP-862: Self-join optimization for stream-stream joins

2022-09-13 Thread Jim Hughes
Hi Vicky,

I'm +1 (non-binding); thanks for the KIP (and PR)!

Cheers,

Jim

On Tue, Sep 13, 2022 at 12:05 PM Guozhang Wang  wrote:

> Thanks Vicky! I'm +1.
>
> Guozhang
>
> On Mon, Sep 12, 2022 at 7:02 PM John Roesler  wrote:
>
> > Thanks for the updates, Vicky!
> >
> > I've reviewed the KIP and your POC PR,
> > and I'm +1 (binding).
> >
> > Thanks!
> > -John
> >
> > On Mon, Sep 12, 2022, at 09:13, Vasiliki Papavasileiou wrote:
> > > Hey Guozhang,
> > >
> > > Great suggestion, I made the change.
> > >
> > > Best,
> > > Vicky
> > >
> > > On Fri, Sep 9, 2022 at 10:43 PM Guozhang Wang wrote:
> > >
> > >> Thanks Vicky, that reads much clearer now.
> > >>
> > >> Just regarding the value string name itself: "self.join" may be
> > >> confusing compared to the other values, since people might think that
> > >> before this config is enabled, self-joins are not allowed at all.
> > >> Maybe we can rename it to "single.store.self.join"?
> > >>
> > >> Guozhang
> > >>
> > >> On Fri, Sep 9, 2022 at 2:15 AM Vasiliki Papavasileiou
> > >>  wrote:
> > >>
> > >> > Hey Guozhang,
> > >> >
> > >> > Ah it seems my text was not very clear :)
> > >> > With "TOPOLOGY_OPTIMIZATION_CONFIG will be extended to accept a list
> > >> > of optimization rule configs" I meant that it will accept the new
> > >> > value strings for each optimization rule. Let me rephrase that in the
> > >> > KIP to make it clearer.
> > >> > Is it better now?
> > >> >
> > >> > Best,
> > >> > Vicky
> > >> >
> > >> > On Thu, Sep 8, 2022 at 9:07 PM Guozhang Wang wrote:
> > >> >
> > >> > > Thanks Vicky,
> > >> > >
> > >> > > I read through the KIP again and it looks good to me. Just a quick
> > >> > > question regarding the public config changes: you mentioned "No
> > >> > > public interfaces will be impacted. The config
> > >> > > TOPOLOGY_OPTIMIZATION_CONFIG will be extended to accept a list of
> > >> > > optimization rule configs in addition to the global values "all"
> > >> > > and "none"." But there are no new value strings mentioned in this
> > >> > > KIP, so does that mean we will apply this optimization only when
> > >> > > `all` is specified in the config?
> > >> > >
> > >> > >
> > >> > > Guozhang
> > >> > >
> > >> > >
> > >> > > On Thu, Sep 8, 2022 at 12:02 PM Vasiliki Papavasileiou
> > >> > >  wrote:
> > >> > >
> > >> > > > Hello everyone,
> > >> > > >
> > >> > > > I'd like to open the vote for KIP-862, which proposes to optimize
> > >> > > > stream-stream self-joins by using a single state store for the
> > >> > > > join.
> > >> > > >
> > >> > > > The proposal is here:
> > >> > > >
> > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-862%3A+Self-join+optimization+for+stream-stream+joins
> > >> > > >
> > >> > > > Thanks to all who reviewed the proposal, and thanks in advance for
> > >> > > > taking the time to vote!
> > >> > > >
> > >> > > > Thank you,
> > >> > > > Vicky
> > >> > > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > -- Guozhang
> > >> > >
> > >> >
> > >>
> > >>
> > >> --
> > >> -- Guozhang
> > >>
> >
>
>
> --
> -- Guozhang
>
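
Note on the config change discussed in the thread above: the following is a
minimal sketch, not the Kafka Streams implementation, of how a
"topology.optimization" value could be interpreted once it accepts rule names
such as "single.store.self.join" in addition to "all" and "none". The class
TopologyOptimizationSketch and the helpers parseRules and selfJoinEnabled are
hypothetical, and the comma-separated list format is an assumption (the thread
only says "a list").

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

// Illustrative sketch only; this is not the Kafka Streams implementation.
final class TopologyOptimizationSketch {
    // Config key behind StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG.
    static final String TOPOLOGY_OPTIMIZATION = "topology.optimization";
    // Global values named in the thread.
    static final String ALL = "all";
    static final String NONE = "none";
    // Rule name proposed in the thread.
    static final String SINGLE_STORE_SELF_JOIN = "single.store.self.join";

    // Hypothetical helper: split the config value into individual rule names.
    // Comma separation is an assumption made for this sketch.
    static Set<String> parseRules(String value) {
        Set<String> rules = new LinkedHashSet<>();
        for (String part : value.split(",")) {
            String rule = part.trim();
            if (!rule.isEmpty()) {
                rules.add(rule);
            }
        }
        return rules;
    }

    // Hypothetical helper: the self-join rewrite applies when the value is
    // "all" or when the rule is listed explicitly. An unset key is treated
    // as "none".
    static boolean selfJoinEnabled(Properties props) {
        Set<String> rules = parseRules(props.getProperty(TOPOLOGY_OPTIMIZATION, NONE));
        return rules.contains(ALL) || rules.contains(SINGLE_STORE_SELF_JOIN);
    }
}
```

With this sketch, setting the key to "single.store.self.join" enables only the
self-join rewrite, while "all" keeps enabling every rule, which matches the
reading Guozhang asks about.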


[jira] [Created] (KAFKA-14076) Fix issues with KafkaStreams.CloseOptions

2022-07-14 Thread Jim Hughes (Jira)
Jim Hughes created KAFKA-14076:
--

 Summary: Fix issues with KafkaStreams.CloseOptions
 Key: KAFKA-14076
 URL: https://issues.apache.org/jira/browse/KAFKA-14076
 Project: Kafka
  Issue Type: Bug
Reporter: Jim Hughes


The new `close(CloseOptions)` function has a few bugs
(https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1518-L1561).

Notably, it needs to remove CGs per StreamThread.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-834: Pause / Resume KafkaStreams Topologies

2022-06-16 Thread Jim Hughes
Hi all,

As an update, I wanted to say that I updated my PR to include this
feedback.  Let me know if I should clarify anything on the KIP itself.

Cheers,

Jim

On Thu, Jun 2, 2022 at 12:36 AM Sophie Blee-Goldman
 wrote:

> Hey Jim, thanks for the update. I'm on the same side as Guozhang here. As
> I expressed during the original discussion, I think it would be confusing
> and possibly harmful to continue *any* kind of processing or action within
> Streams while it is "paused". In fact I sort of assumed we were including
> active task restoration under the umbrella of standby tasks when we decided
> to pause those as well, but since they are technically different I can see
> why we might want to consider them separately.
>
> I would say that for now we should just keep the semantics simple and
> obvious, and if users express a desire to pause applications from active
> processing but allow them to catch up as restoring actives, lagging
> standbys, warmup tasks, or so on, then we can always add that functionality
> to the feature later on.
>
> On Wed, Jun 1, 2022 at 11:41 AM Guozhang Wang  wrote:
>
> > Hello Jim,
> >
> > I think if our primary goal is to reduce resource utilization and
> > potentially to stop the streaming pipeline for investigating possible
> > bugs etc., then we should also pause active tasks' restoration, since
> > that 1) may still use resources, and 2) may load in bad data.
> >
> > Guozhang
> >
> > On Wed, Jun 1, 2022 at 5:53 AM Jim Hughes 
> > wrote:
> >
> > > Hi all,
> > >
> > > While reviewing my PR for KIP-834, Bruno noticed a case that we may not
> > > have discussed enough.*
> > >
> > > During the discussion, we decided that standby tasks would be paused.
> > > In order to do this, there are changes to the StoreChangelogReader
> > > around where it does restorations.  Bruno noticed that the restoration
> > > of active tasks is not paused in my PR.
> > >
> > > From my point of view, I was hoping to let active tasks
> > > restore/consume/etc in order that the Kafka Streams instance could
> > > transition to RUNNING (assuming that it was started paused).  I believe
> > > Bruno's position is that if we are pausing restoration for standby
> > > tasks, then restoration should be paused for active tasks as well.
> > >
> > > Since this point hasn't been discussed like this, the KIP is unclear
> > > about this detail.
> > >
> > > What do folks think?
> > >
> > > Thanks in advance,
> > >
> > > Jim
> > >
> > > * https://github.com/apache/kafka/pull/12161#discussion_r886732983
> > >
> > > On Mon, May 16, 2022 at 11:07 AM Jim Hughes wrote:
> > >
> > > > Hi all,
> > > >
> > > >
> > > > With 5 binding votes (John, Bruno, Sophie, Matthias, Bill) and 4
> > > > non-binding votes (Guozhang, Luke, Leah, Walker), the vote for
> > > > KIP-834 passes!
> > > >
> > > >
> > > > Thanks all for the great discussion.
> > > >
> > > > I have a PR up here: https://github.com/apache/kafka/pull/12161
> > > >
> > > >
> > > > Thanks in advance for feedback on the PR!
> > > >
> > > >
> > > > Cheers,
> > > >
> > > >
> > > > Jim
> > > >
> > > > On Fri, May 13, 2022 at 12:04 PM Walker Carlson
> > > >  wrote:
> > > >
> > > >> +1 from me (non-binding)
> > > >>
> > > >> Walker
> > > >>
> > > >> On Wed, May 11, 2022 at 12:36 PM Leah Thomas wrote:
> > > >>
> > > >> > Thanks Jim, great discussion. +1 from me (non-binding)
> > > >> >
> > > >> > Cheers,
> > > >> > Leah
> > > >> >
> > > >> > On Wed, May 11, 2022 at 10:14 AM Bill Bejeck wrote:
> > > >> >
> > > >> > > Thanks for the KIP!
> > > >> > >
> > > >> > > +1 (binding)
> > > >> > >
> > > >> > > -Bill
> > > >> > >
> > > >> > > On Wed, May 11, 2022 at 9:36 AM Luke Chen 
>

[jira] [Created] (KAFKA-13998) JoinGroupRequestData 'reason' can be too large

2022-06-15 Thread Jim Hughes (Jira)
Jim Hughes created KAFKA-13998:
--

 Summary: JoinGroupRequestData 'reason' can be too large
 Key: KAFKA-13998
 URL: https://issues.apache.org/jira/browse/KAFKA-13998
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.2.0
Reporter: Jim Hughes
Assignee: Jim Hughes


We saw an exception like this: 

```
org.apache.kafka.streams.errors.StreamsException: java.lang.RuntimeException: 'reason' field is too long to be serialized
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:627)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:551)
Caused by: java.lang.RuntimeException: 'reason' field is too long to be serialized
    at org.apache.kafka.common.message.JoinGroupRequestData.addSize(JoinGroupRequestData.java:465)
    at org.apache.kafka.common.protocol.SendBuilder.buildSend(SendBuilder.java:218)
    at org.apache.kafka.common.protocol.SendBuilder.buildRequestSend(SendBuilder.java:187)
    at org.apache.kafka.common.requests.AbstractRequest.toSend(AbstractRequest.java:101)
    at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:524)
    at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:500)
    at org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:460)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.trySend(ConsumerNetworkClient.java:499)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:255)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:437)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:371)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:542)
    at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1271)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1235)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1215)
    at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:969)
    at org.apache.kafka.streams.processor.internals.StreamThread.pollPhase(StreamThread.java:917)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:736)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:589)
    ... 1 more
```

This appears to be caused by the code passing an entire stack trace in the 
`rejoinReason`.  See 
https://github.com/apache/kafka/blob/3.2.0/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java#L481
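
One possible mitigation, shown here only as a hedged sketch and not as the
fix that was applied in Kafka: build the rejoin reason from the exception's
class name and message instead of its full stack trace, and cap the length
before it is placed in the request. The class RejoinReasonSketch, its methods,
and the 255-character cap are all assumptions made for illustration.

```java
// Illustrative sketch only; names and the length cap are assumptions.
final class RejoinReasonSketch {
    // Assumed cap for illustration; any bound well below the protocol's
    // string limit would serve the same purpose.
    static final int MAX_REASON_LENGTH = 255;

    // Build a short reason from the exception type and message only,
    // never from the rendered stack trace.
    static String shortReason(String prefix, Throwable cause) {
        String message = cause.getMessage();
        String reason = prefix + ": " + cause.getClass().getSimpleName()
            + (message == null ? "" : " (" + message + ")");
        return truncate(reason);
    }

    // Cut the reason to at most MAX_REASON_LENGTH characters.
    static String truncate(String reason) {
        return reason.length() <= MAX_REASON_LENGTH
            ? reason
            : reason.substring(0, MAX_REASON_LENGTH);
    }
}
```

Truncating at the point where the reason is attached to the request, rather
than at every call site, would keep the guard in one place.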



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] Apache Kafka 3.3.0 Release

2022-06-03 Thread Jim Hughes
Hi José,

Can we add KIP-834 as well?  The vote has passed and I have a PR in progress.

Cheers,

Jim

On Fri, Jun 3, 2022 at 1:05 PM Mickael Maison 
wrote:

> Hi José,
>
> Can we also add KIP-827? The vote has passed and I opened a PR today.
>
> Thanks,
> Mickael
>
> On Fri, Jun 3, 2022 at 5:43 PM David Jacot 
> wrote:
> >
> > Hi José,
> >
> > KIP-841 has been accepted. Could we add it to the release plan?
> >
> > Thanks,
> > David
> >
> > > On Wed, May 11, 2022 at 11:04 PM Chris Egerton wrote:
> > >
> > > Hi José,
> > >
> > > > Could we add KIP-618 (
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
> > > > ) to the release plan? It's been blocked on review for almost a year
> > > > now. If we can't make time to review it before the 3.3.0 release I'll
> > > > probably close my PRs and abandon the feature.
> > >
> > > Best,
> > >
> > > Chris
> > >
> > > On Wed, May 11, 2022 at 4:44 PM José Armando García Sancio
> > >  wrote:
> > >
> > > > Great.
> > > >
> > > > I went ahead and created a release page for 3.3.0:
> > > > https://cwiki.apache.org/confluence/x/-xahD
> > > >
> > > > > The planned KIP content is based on the list of KIPs targeting 3.3.0
> > > > > in https://cwiki.apache.org/confluence/x/4QwIAw. Please take a look
> > > > > at the list and let me know if I missed your KIP.
> > > >
> > > > The 3.3.0 release plan page also enumerates the potential release
> > > > dates. Here is a summary of them:
> > > > 1. KIP Freeze June 15th, 2022
> > > > 2. Feature Freeze July 6th, 2022
> > > > 3. Code Freeze July 20th, 2022
> > > >
> > > > Thanks and let me know what you think,
> > > > -José
> > > >
>


Re: [VOTE] KIP-834: Pause / Resume KafkaStreams Topologies

2022-06-01 Thread Jim Hughes
Hi all,

While reviewing my PR for KIP-834, Bruno noticed a case that we may not
have discussed enough.*

During the discussion, we decided that standby tasks would be paused.  In
order to do this, there are changes to the StoreChangelogReader around
where it does restorations.  Bruno noticed that the restoration of active
tasks is not paused in my PR.

From my point of view, I was hoping to let active tasks restore/consume/etc
in order that the Kafka Streams instance could transition to RUNNING
(assuming that it was started paused).  I believe Bruno's position is that
if we are pausing restoration for standby tasks, then restoration should be
paused for active tasks as well.

Since this point hasn't been discussed like this, the KIP is unclear about
this detail.

What do folks think?

Thanks in advance,

Jim

* https://github.com/apache/kafka/pull/12161#discussion_r886732983
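
To make the two positions in the message above concrete, here is a small
illustrative model (not code from the KIP-834 PR). The class
PauseSemanticsSketch, the Work enum, and the two policy sets are hypothetical
names introduced only for this sketch; the real change lives in
StoreChangelogReader and related classes.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative model only; not code from the KIP-834 PR.
final class PauseSemanticsSketch {
    // The three kinds of work discussed in the message above.
    enum Work { ACTIVE_PROCESSING, ACTIVE_RESTORATION, STANDBY_UPDATE }

    // Option A: active tasks keep restoring while paused, so the instance
    // can still reach RUNNING if it was started paused.
    static final Set<Work> RESTORE_WHILE_PAUSED =
        EnumSet.of(Work.ACTIVE_PROCESSING, Work.STANDBY_UPDATE);

    // Option B: restoration of active tasks is paused along with
    // processing and standby updates.
    static final Set<Work> PAUSE_EVERYTHING = EnumSet.allOf(Work.class);

    // A piece of work may run unless the instance is paused and the chosen
    // policy stops that kind of work.
    static boolean mayRun(Work work, boolean paused, Set<Work> stoppedWhenPaused) {
        return !paused || !stoppedWhenPaused.contains(work);
    }
}
```

The only difference between the two policies is whether ACTIVE_RESTORATION is
in the stopped set, which is exactly the detail the KIP leaves open.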

On Mon, May 16, 2022 at 11:07 AM Jim Hughes  wrote:

> Hi all,
>
>
> With 5 binding votes (John, Bruno, Sophie, Matthias, Bill) and 4
> non-binding votes (Guozhang, Luke, Leah, Walker), the vote for KIP-834
> passes!
>
>
> Thanks all for the great discussion.
>
> I have a PR up here: https://github.com/apache/kafka/pull/12161
>
>
> Thanks in advance for feedback on the PR!
>
>
> Cheers,
>
>
> Jim
>
> On Fri, May 13, 2022 at 12:04 PM Walker Carlson
>  wrote:
>
>> +1 from me (non-binding)
>>
>> Walker
>>
>> On Wed, May 11, 2022 at 12:36 PM Leah Thomas wrote:
>>
>> > Thanks Jim, great discussion. +1 from me (non-binding)
>> >
>> > Cheers,
>> > Leah
>> >
>> > On Wed, May 11, 2022 at 10:14 AM Bill Bejeck  wrote:
>> >
>> > > Thanks for the KIP!
>> > >
>> > > +1 (binding)
>> > >
>> > > -Bill
>> > >
>> > > On Wed, May 11, 2022 at 9:36 AM Luke Chen  wrote:
>> > >
>> > > > Hi Jim,
>> > > >
>> > > > I'm +1. (Please add a note in the KIP that the stream resetting
>> > > > tool can't be used in the paused state.)
>> > > > Thanks for the KIP!
>> > > >
>> > > > Luke
>> > > >
>> > > > On Wed, May 11, 2022 at 9:09 AM Guozhang Wang wrote:
>> > > >
>> > > > > Thanks Jim. +1 from me.
>> > > > >
>> > > > > On Tue, May 10, 2022 at 4:51 PM Matthias J. Sax wrote:
>> > > > >
>> > > > > > I had one minor question on the discuss thread. It's mainly
>> > > > > > about clarifying and documenting the user contract. I am fine
>> > > > > > either way.
>> > > > > >
>> > > > > > +1 (binding)
>> > > > > >
>> > > > > >
>> > > > > > -Matthias
>> > > > > >
>> > > > > > On 5/10/22 12:32 PM, Sophie Blee-Goldman wrote:
>> > > > > > > Thanks for the KIP! +1 (binding)
>> > > > > > >
>> > > > > > > On Tue, May 10, 2022, 12:24 PM Bruno Cadonna <cado...@apache.org> wrote:
>> > > > > > >
>> > > > > > >> Thanks Jim,
>> > > > > > >>
>> > > > > > >> +1 (binding)
>> > > > > > >>
>> > > > > > >> Best,
>> > > > > > >> Bruno
>> > > > > > >>
>> > > > > > >> On 10.05.22 21:19, John Roesler wrote:
>> > > > > > >>> Thanks Jim,
>> > > > > > >>>
>> > > > > > >>> I’m +1 (binding)
>> > > > > > >>>
>> > > > > > >>> -John
>> > > > > > >>>
>> > > > > > >>> On Tue, May 10, 2022, at 14:05, Jim Hughes wrote:
>> > > > > > >>>> Hi all,
>> > > > > > >>>>
>> > > > > > >>>> I'm asking for a vote on KIP-834:
>> > > > > > >>>>
>> > > > > > >>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882832
>> > > > > > >>>>
>> > > > > > >>>> Thanks in advance!
>> > > > > > >>>>
>> > > > > > >>>> Jim
>> > > > > > >>
>> > > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > -- Guozhang
>> > > > >
>> > > >
>> > >
>> >
>>
>


Re: [VOTE] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-16 Thread Jim Hughes
Hi all,


With 5 binding votes (John, Bruno, Sophie, Matthias, Bill) and 4
non-binding votes (Guozhang, Luke, Leah, Walker), the vote for KIP-834
passes!


Thanks all for the great discussion.

I have a PR up here: https://github.com/apache/kafka/pull/12161


Thanks in advance for feedback on the PR!


Cheers,


Jim

On Fri, May 13, 2022 at 12:04 PM Walker Carlson
 wrote:

> +1 from me (non-binding)
>
> Walker
>
> On Wed, May 11, 2022 at 12:36 PM Leah Thomas wrote:
>
> > Thanks Jim, great discussion. +1 from me (non-binding)
> >
> > Cheers,
> > Leah
> >
> > On Wed, May 11, 2022 at 10:14 AM Bill Bejeck  wrote:
> >
> > > Thanks for the KIP!
> > >
> > > +1 (binding)
> > >
> > > -Bill
> > >
> > > On Wed, May 11, 2022 at 9:36 AM Luke Chen  wrote:
> > >
> > > > Hi Jim,
> > > >
> > > > > I'm +1. (Please add a note in the KIP that the stream resetting
> > > > > tool can't be used in the paused state.)
> > > > Thanks for the KIP!
> > > >
> > > > Luke
> > > >
> > > > > On Wed, May 11, 2022 at 9:09 AM Guozhang Wang wrote:
> > > >
> > > > > Thanks Jim. +1 from me.
> > > > >
> > > > > > On Tue, May 10, 2022 at 4:51 PM Matthias J. Sax wrote:
> > > > >
> > > > > > > I had one minor question on the discuss thread. It's mainly about
> > > > > > > clarifying and documenting the user contract. I am fine either way.
> > > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > > On 5/10/22 12:32 PM, Sophie Blee-Goldman wrote:
> > > > > > > Thanks for the KIP! +1 (binding)
> > > > > > >
> > > > > > > > On Tue, May 10, 2022, 12:24 PM Bruno Cadonna <cado...@apache.org> wrote:
> > > > > > >
> > > > > > >> Thanks Jim,
> > > > > > >>
> > > > > > >> +1 (binding)
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Bruno
> > > > > > >>
> > > > > > >> On 10.05.22 21:19, John Roesler wrote:
> > > > > > >>> Thanks Jim,
> > > > > > >>>
> > > > > > >>> I’m +1 (binding)
> > > > > > >>>
> > > > > > >>> -John
> > > > > > >>>
> > > > > > >>> On Tue, May 10, 2022, at 14:05, Jim Hughes wrote:
> > > > > > >>>> Hi all,
> > > > > > >>>>
> > > > > > >>>> I'm asking for a vote on KIP-834:
> > > > > > >>>>
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882832
> > > > > > >>>>
> > > > > > >>>> Thanks in advance!
> > > > > > >>>>
> > > > > > >>>> Jim
> > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-11 Thread Jim Hughes
Hi Luke, John,

Thanks for bringing this up and also sorting it out!

I have added a note to the KIP.

Thanks,

Jim

On Wed, May 11, 2022 at 9:34 AM Luke Chen  wrote:

> Thanks John!
> It makes sense.
> I have no other questions as long as it is documented in the KIP.
>
> Thank you.
> Luke
>
> On Wed, May 11, 2022 at 9:15 PM John Roesler  wrote:
>
> > Hi Luke,
> >
> > It’s not my KIP, but my two cents is that users should not run the reset
> > tool while the application is paused.
> >
> > The reset tool should only be run while the whole app is shut down
> > because it messes with a lot of internal state bits without
> > synchronization. Leaving the app running (even while pausing processing)
> > will result in the app being in an undefined state, as the members and
> > the tool will be simultaneously trying to set the committed offsets to
> > different values, etc.
> >
> > Jim, can you also make it a point to document this? As Luke points out,
> > it might be a natural thing to want to do.
> >
> > Thanks,
> > John
> >
> > On Wed, May 11, 2022, at 02:19, Luke Chen wrote:
> > > Hi Jim,
> > >
> > > Thanks for the KIP. Overall LGTM!
> > >
> > > One late question:
> > > Could we run the stream resetter tool (i.e.
> > > kafka-streams-application-reset.sh) during pause state?
> > > I can imagine there's a use case where, after pausing for a while, a
> > > user just wants to continue with the latest offset, skipping the
> > > intermediate records.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Wed, May 11, 2022 at 10:12 AM Jim Hughes wrote:
> > >
> > >> Hi Matthias,
> > >>
> > >> I like it.  I've updated the KIP to reflect that detail; I put the
> > >> details in the docs for pause.
> > >>
> > >> Cheers,
> > >>
> > >> Jim
> > >>
> > >> > On Tue, May 10, 2022 at 7:51 PM Matthias J. Sax wrote:
> > >>
> > >> > Thanks for the KIP. Overall LGTM.
> > >> >
> > >> > Can we clarify one question: would it be allowed to call `pause()`
> > >> > before calling `start()`? I don't see any reason why we would need
> > >> > to disallow it?
> > >> >
> > >> > It could be helpful to start a KafkaStreams client in paused state --
> > >> > otherwise there is a race between calling `start()` and calling
> > >> > `pause()`.
> > >> >
> > >> > If we allow it, we should clearly document it.
> > >> >
> > >> >
> > >> > -Matthias
> > >> >
> > >> > On 5/10/22 12:04 PM, Jim Hughes wrote:
> > >> > > Hi Bill, all,
> > >> > >
> > >> > > Thank you.  I've updated the KIP to reflect pausing standby tasks
> > >> > > as well.
> > >> > > I think all the outstanding points have been addressed and I'm
> > >> > > going to start the vote thread!
> > >> > >
> > >> > > Cheers,
> > >> > >
> > >> > > Jim
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Tue, May 10, 2022 at 2:43 PM Bill Bejeck wrote:
> > >> > >
> > >> > >> Hi Jim,
> > >> > >>
> > >> > >> After reading the comments on the KIP, I agree that it makes
> > >> > >> sense to pause all activities and any changes can be made later
> > >> > >> on.
> > >> > >>
> > >> > >> Thanks,
> > >> > >> Bill
> > >> > >>
> > >> > >> On Tue, May 10, 2022 at 4:03 AM Bruno Cadonna <cado...@apache.org> wrote:
> > >> > >>
> > >> > >>> Hi Jim,
> > >> > >>>
> > >> > >>> Thanks for the KIP!
> > >> > >>>
> > >> > >>> I am fine with the KIP in general.
> > >> > >>>
> > >> > >>> However, I am with Sophie and John to also pause the standbys
> > >> > >>> for the reasons they brought up. Is there a specific reason you
> > >> > >>> want to keep

Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-10 Thread Jim Hughes
Hi Matthias,

I like it.  I've updated the KIP to reflect that detail; I put the details
in the docs for pause.

Cheers,

Jim

On Tue, May 10, 2022 at 7:51 PM Matthias J. Sax  wrote:

> Thanks for the KIP. Overall LGTM.
>
> Can we clarify one question: would it be allowed to call `pause()`
> before calling `start()`? I don't see any reason why we would need to
> disallow it?
>
> It could be helpful to start a KafkaStreams client in paused state --
> otherwise there is a race between calling `start()` and calling `pause()`.
>
> If we allow it, we should clearly document it.
>
>
> -Matthias
>
> On 5/10/22 12:04 PM, Jim Hughes wrote:
> > Hi Bill, all,
> >
> > Thank you.  I've updated the KIP to reflect pausing standby tasks as
> > well.
> > I think all the outstanding points have been addressed and I'm going to
> > start the vote thread!
> >
> > Cheers,
> >
> > Jim
> >
> >
> >
> > On Tue, May 10, 2022 at 2:43 PM Bill Bejeck  wrote:
> >
> >> Hi Jim,
> >>
> >> After reading the comments on the KIP, I agree that it makes sense to
> >> pause all activities and any changes can be made later on.
> >>
> >> Thanks,
> >> Bill
> >>
> >> On Tue, May 10, 2022 at 4:03 AM Bruno Cadonna wrote:
> >>
> >>> Hi Jim,
> >>>
> >>> Thanks for the KIP!
> >>>
> >>> I am fine with the KIP in general.
> >>>
> >>> However, I am with Sophie and John to also pause the standbys for the
> >>> reasons they brought up. Is there a specific reason you want to keep
> >>> standbys going? It feels like premature optimization to me. We can
> >>> still add keeping standby running in a follow up if needed.
> >>>
> >>> Best,
> >>> Bruno
> >>>
> >>> On 10.05.22 05:15, Sophie Blee-Goldman wrote:
> >>>> Thanks Jim, just one note/question on the standby tasks:
> >>>>
> >>>>> At the minute, my moderately held position is that standby tasks
> >>>>> ought to continue reading and remain caught up.  If standby tasks
> >>>>> would run out of space, there are probably bigger problems.
> >>>>
> >>>>
> >>>> For a single node application, or when the #pause API is invoked on all
> >>>> instances, then there won't be any further active processing and thus
> >>>> nothing to keep up with, right? So for that case, it's just a matter of
> >>>> whether any standbys that are lagging will have the chance to catch up
> >>>> to the (paused) active task state before they stop as well, in which
> >>>> case having them continue feels fine to me. However this is a
> >>>> relatively trivial benefit and I would only consider it as a deciding
> >>>> factor when all things are equal otherwise.
> >>>>
> >>>> My concern is the more interesting case: when this feature is used to
> >>>> pause only one node, or some subset of the overall application. In this
> >>>> case, yes, the standby tasks will indeed fall out of sync. But the only
> >>>> reason I can imagine someone using the pause feature in such a way is
> >>>> because there is something going wrong, or about to go wrong, on that
> >>>> particular node. For example as mentioned above, if the user wants to
> >>>> cut down on costs without stopping everything, or if the node is about
> >>>> to run out of disk or needs to be debugged or so on. And in this case,
> >>>> continuing to process the standby tasks while other instances continue
> >>>> to run would pretty much defeat the purpose of pausing it entirely, and
> >>>> might have unpleasant consequences for the unsuspecting developer.
> >>>>
> >>>> All that said, I don't want to block this KIP so if you have strong
> >>>> feelings about the standby behavior I'm happy to back down. I'm only
> >>>> pushing back now because
> >

[VOTE] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-10 Thread Jim Hughes
Hi all,

I'm asking for a vote on KIP-834:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882832

Thanks in advance!

Jim


Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-10 Thread Jim Hughes
Hi Bill, all,

Thank you.  I've updated the KIP to reflect pausing standby tasks as well.
I think all the outstanding points have been addressed and I'm going to
start the vote thread!

Cheers,

Jim



On Tue, May 10, 2022 at 2:43 PM Bill Bejeck  wrote:

> Hi Jim,
>
> After reading the comments on the KIP, I agree that it makes sense to pause
> all activities and any changes can be made later on.
>
> Thanks,
> Bill
>
> On Tue, May 10, 2022 at 4:03 AM Bruno Cadonna  wrote:
>
> > Hi Jim,
> >
> > Thanks for the KIP!
> >
> > I am fine with the KIP in general.
> >
> > However, I am with Sophie and John to also pause the standbys for the
> > reasons they brought up. Is there a specific reason you want to keep
> > standbys going? It feels like premature optimization to me. We can still
> > add keeping standby running in a follow up if needed.
> >
> > Best,
> > Bruno
> >
> > On 10.05.22 05:15, Sophie Blee-Goldman wrote:
> > > Thanks Jim, just one note/question on the standby tasks:
> > >
> > > >> At the minute, my moderately held position is that standby tasks
> > > >> ought to continue reading and remain caught up.  If standby tasks
> > > >> would run out of space, there are probably bigger problems.
> > > >
> > > >
> > > > For a single node application, or when the #pause API is invoked on
> > > > all instances, then there won't be any further active processing and
> > > > thus nothing to keep up with, right? So for that case, it's just a
> > > > matter of whether any standbys that are lagging will have the chance
> > > > to catch up to the (paused) active task state before they stop as
> > > > well, in which case having them continue feels fine to me. However
> > > > this is a relatively trivial benefit and I would only consider it as a
> > > > deciding factor when all things are equal otherwise.
> > > >
> > > > My concern is the more interesting case: when this feature is used to
> > > > pause only one node, or some subset of the overall application. In
> > > > this case, yes, the standby tasks will indeed fall out of sync. But
> > > > the only reason I can imagine someone using the pause feature in such
> > > > a way is because there is something going wrong, or about to go wrong,
> > > > on that particular node. For example as mentioned above, if the user
> > > > wants to cut down on costs without stopping everything, or if the node
> > > > is about to run out of disk or needs to be debugged or so on. And in
> > > > this case, continuing to process the standby tasks while other
> > > > instances continue to run would pretty much defeat the purpose of
> > > > pausing it entirely, and might have unpleasant consequences for the
> > > > unsuspecting developer.
> > > >
> > > > All that said, I don't want to block this KIP so if you have strong
> > > > feelings about the standby behavior I'm happy to back down. I'm only
> > > > pushing back now because it felt like there wasn't any particular
> > > > motivation for the standbys to continue processing or not, and I
> > > > figured I'd try to fill in this gap with my thoughts on the matter :)
> > > > Either way we should just make sure that this behavior is documented
> > > > clearly, since it may be surprising if we decide to only pause active
> > > > processing (another option is to rename the method something like
> > > > #pauseProcessing or #pauseActiveProcessing so that it's hard to miss).
> > > >
> > > > Thanks! Sorry for the lengthy response, but hopefully we won't need to
> > > > debate this any further. Beyond this I'm satisfied with the latest
> > > > proposal
> > >
> > > > On Mon, May 9, 2022 at 5:16 PM John Roesler wrote:
> > >
> > >> Thanks for the updates, Jim!
> > >>
> > >> After this discussion and your updates, this KIP looks good to me.
> > >>
> > >> Thanks,
> > >> John
> > >>
> > >> On Mon, May 9, 2022, at 17:52, Jim Hughes wrote:
> > >>> Hi Sophie, all,
> > >>>
> > >>> I've updated the KIP with feedback from the discussion so far:
> > >>>
> > >>
> >
> https:/

Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-09 Thread Jim Hughes
this
> > is what you have in mind. We can leave the question of finer controlled
> > pausing behavior for later when we have named topology being exposed via
> > another KIP.
> >
> >
> > Guozhang
> >
> > On Mon, May 9, 2022 at 7:50 AM John Roesler  wrote:
> >
> > > Hi Jim,
> > >
> > > Thanks for the replies. This all sounds good to me. Just two further
> > > comments:
> > >
> > > > 3. It seems like you should aim for the simplest semantics. If the
> > > > intent is to “pause” the instance, then you’d better pause the whole
> > > > instance. If you leave punctuations and standbys running, I expect
> > > > we’d see bug reports come in that the instance isn’t really paused.
> > >
> > > > 5. Since you won the race to write a KIP, I don’t think it makes
> > > > sense to worry too much about modular topologies. When they propose
> > > > their KIP, they will have to specify a lot of state management
> > > > behavior, and pause/resume will have to be part of it. If they have
> > > > some concern about your KIP, they’ll chime in. It doesn’t make sense
> > > > for you to try and guess what that proposal will look like.
> > >
> > > To be honest, you’re proposing a KafkaStreams runtime-level
> pause/resume
> > > function, not a topology-level one anyway, so it seems pretty clear
> that
> > it
> > > would pause the whole runtime (of a single instance) regardless of any
> > > modular topologies. If the intent is to pause individual topologies in
> > the
> > > future, you’d need a different API anyway.
> > >
> > > Thanks!
> > > -John
> > >
> > > On Mon, May 9, 2022, at 08:10, Jim Hughes wrote:
> > > > Hi John,
> > > >
> > > > Long emails are great; responding inline!
> > > >
> > > > On Sat, May 7, 2022 at 4:54 PM John Roesler 
> > wrote:
> > > >
> > > >> Thanks for the KIP, Jim!
> > > >>
> > > >> This conversation seems to highlight that the KIP needs to specify
> > > >> some of its behavior as well as its APIs, where the behavior is
> > > >> observable and significant to users.
> > > >>
> > > >> For example:
> > > >>
> > > >> 1. Do you plan to have a guarantee that immediately after
> > > >> calling KafkaStreams.pause(), users should observe that the instance
> > > >> stops processing new records? Or should they expect that the threads
> > > >> will continue to process some records and pause asynchronously
> > > >> (you already answered this in the thread earlier)?
> > > >>
> > > >
> > > > I'm happy to build up to a guarantee of sorts.  My current idea is
> that
> > > > pause() does not do anything "exceptional" to get control back from a
> > > > running topology.  A currently running topology would get to complete
> > its
> > > > loop.
> > > >
> > > > Separately, I'm still piecing together how commits work.  By some
> > > > mechanism, after a pause, I do agree that the topology needs to
> commit
> > > its
> > > > work in some manner.
> > > >
> > > >
> > > >> 2. Will the threads continue to poll new records until they
> naturally
> > > fill
> > > >> up the task buffers, or will they immediately pause their Consumers
> > > >> as well?
> > > >>
> > > >
> > > > Presently, I'm suggesting that consumers would fill up their buffers.
> > > >
> > > >
> > > >> 3. Will threads continue to call (system time) punctuators, or would
> > > >> punctuations also be paused?
> > > >>
> > > >
> > > > In my first pass at thinking through this, I left the punctuators
> > > running.
> > > > To be honest, I'm not sure what they do, so my approach is either
> lucky
> > > and
> > > > correct or it could be Very Clearly Wrong.;)
> > > >
> > > >
> > > >> I realize that some of those questions simply may not have occurred
> to
> > > >> you, so this is not a criticism for leaving them off; I'm just
> > pointing
> > > out
> > > >> that although we don't tend to mention implementation details in
> KIPs,
> > > >> we als

Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-09 Thread Jim Hughes
s a need to run commits before transitioning all the way to PAUSED.)



> And that's all I have to say about that. I hope you don't find my
> long message offputting. I'm fundamentally in favor of your KIP,
> and I think with a little more explanation in the KIP, and a few
> small tweaks to the proposal, we'll be able to provide good
> ergonomics to our users.
>

Thanks!

Jim


> Thanks,
> -John
>
> On Sat, May 7, 2022, at 00:06, Guozhang Wang wrote:
> > I'm in favor of the "just pausing the instance itself" option as well. As
> > for EOS, the point is that when the processing is paused, we would not
> > trigger any `producer.send` during the time, and the transaction timeout
> is
> > sort of relying on that behavior, so my point was that it's probably
> better
> > to also commit the processing before we pause it.
> >
> >
> > Guozhang
> >
> > On Fri, May 6, 2022 at 6:12 PM Jim Hughes 
> > wrote:
> >
> >> Hi Matthias,
> >>
> >> Since the only thing which will be paused is processing the topology, I
> >> think we can let commits happen naturally.
> >>
> >> Good point about getting the paused state to new members; it seems
> >> like the "building block" approach is a good one to keep things simple
> >> at first.
> >>
> >> Cheers,
> >>
> >> Jim
> >>
> >> On Fri, May 6, 2022 at 8:31 PM Matthias J. Sax 
> wrote:
> >>
> >> > I think it's tricky to propagate a pauseAll() via the rebalance
> >> > protocol. New members joining the group would need to get paused, too?
> >> > Could there be weird race conditions with overlapping pauseAll() and
> >> > resumeAll() calls on different instances while there could be errors /
> >> > network partitions or similar?
> >> >
> >> > I would argue that similar to IQ, we provide the basic building blocks,
> >> > and leave it to the users to implement cross instance management for a
> >> > pauseAll() scenario. -- Also, if there is really demand, we can always
> >> > add pauseAll()/resumeAll() as follow up work.
> >> >
> >> > About named topologies: I agree with Jim to not include them in this KIP
> >> > as they are not a public feature yet. If we make named topologies
> >> > public, the corresponding KIP should extend the pause/resume feature
> >> > (ie, APIs) accordingly. Of course, the code can (and should) already be
> >> > set up to support it to be future proof.
> >> >
> >> > Good call out about commit and EOS -- to simplify it, I think it might
> >> > be good to commit also for the at-least-once case?
> >> >
> >> >
> >> > -Matthias
> >> >
> >> >
> >> > On 5/6/22 1:05 PM, Jim Hughes wrote:
> >> > > Hi Bill,
> >> > >
> >> > > Great questions; I'll do my best to reply inline:
> >> > >
> >> > > On Fri, May 6, 2022 at 3:21 PM Bill Bejeck 
> wrote:
> >> > >
> >> > >> Hi Jim,
> >> > >>
> >> > >> Thanks for the KIP.  I have a couple of meta-questions as well:
> >> > >>
> >> > >> 1) Regarding pausing only a subset of running instances, I'm
> thinking
> >> > there
> >> > >> may be a use case for pausing all of them.
> >> > >> Would it make sense to also allow for pausing all instances by
> >> > adding a
> >> > >> method `pauseAll()` or something similar?
> >> > >>
> >> > >
> >> > > Honestly, I'm indifferent on this point.  Presently, I think what I
> >> have
> >> > > proposed is the minimal change to get the ability to pause and
> resume
> >> > > processing.  If adding a 'pauseAll()' is required, I'd be happy to
> do
> >> > that!
> >> > >
> >> > >  From Guozhang's email, it sounds like this would require using the
> >> > > rebalance protocol to trigger the coordination.  Would there be
> enough
> >> > room
> >> > > in that approach to indicate that a named topology is to be paused
> >> across
> >> > > all nodes?
> >> > >
> >> > >
> >> > >> 2) Would pausing affect standby tasks?  For example, imagine there
> >> are 3
> >> > >> instances A, B, and C.
> >> > >>

Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-06 Thread Jim Hughes
Hi Matthias,

Since the only thing which will be paused is processing the topology, I
think we can let commits happen naturally.

Good point about getting the paused state to new members; it seems
like the "building block" approach is a good one to keep things simple at
first.

Cheers,

Jim

On Fri, May 6, 2022 at 8:31 PM Matthias J. Sax  wrote:

> I think it's tricky to propagate a pauseAll() via the rebalance
> protocol. New members joining the group would need to get paused, too?
> Could there be weird race conditions with overlapping pauseAll() and
> resumeAll() calls on different instances while there could be errors /
> network partitions or similar?
>
> I would argue that similar to IQ, we provide the basic building blocks,
> and leave it to the users to implement cross instance management for a
> pauseAll() scenario. -- Also, if there is really demand, we can always
> add pauseAll()/resumeAll() as follow up work.
>
> About named topologies: I agree with Jim to not include them in this KIP
> as they are not a public feature yet. If we make named topologies
> public, the corresponding KIP should extend the pause/resume feature
> (ie, APIs) accordingly. Of course, the code can (and should) already be
> set up to support it to be future proof.
>
> Good call out about commit and EOS -- to simplify it, I think it might
> be good to commit also for the at-least-once case?
>
>
> -Matthias
>
>
> On 5/6/22 1:05 PM, Jim Hughes wrote:
> > Hi Bill,
> >
> > Great questions; I'll do my best to reply inline:
> >
> > On Fri, May 6, 2022 at 3:21 PM Bill Bejeck  wrote:
> >
> >> Hi Jim,
> >>
> >> Thanks for the KIP.  I have a couple of meta-questions as well:
> >>
> >> 1) Regarding pausing only a subset of running instances, I'm thinking
> there
> >> may be a use case for pausing all of them.
> >> Would it make sense to also allow for pausing all instances by
> adding a
> >> method `pauseAll()` or something similar?
> >>
> >
> > Honestly, I'm indifferent on this point.  Presently, I think what I have
> > proposed is the minimal change to get the ability to pause and resume
> > processing.  If adding a 'pauseAll()' is required, I'd be happy to do
> that!
> >
> >  From Guozhang's email, it sounds like this would require using the
> > rebalance protocol to trigger the coordination.  Would there be enough
> room
> > in that approach to indicate that a named topology is to be paused across
> > all nodes?
> >
> >
> >> 2) Would pausing affect standby tasks?  For example, imagine there are 3
> >> instances A, B, and C.
> >> A user elects to pause instance C only but it hosts the standby
> tasks
> >> for A.
> >> Would the standby tasks on the paused application continue to read
> from
> >> the changelog topic?
> >>
> >
> > Yes, standby tasks would continue reading from the changelog topic.  All
> > consumers would continue reading to avoid getting dropped from their
> > consumer groups.
> >
> > Cheers,
> >
> > Jim
> >
> >
> >
> >
> >> Thanks!
> >> Bill
> >>
> >>
> >> On Fri, May 6, 2022 at 2:44 PM Jim Hughes wrote:
> >>
> >>> Hi Guozhang,
> >>>
> >>> Thanks for the feedback; responses inline below:
> >>>
> >>> On Fri, May 6, 2022 at 1:09 PM Guozhang Wang 
> wrote:
> >>>
> >>>> Hello Jim,
> >>>>
> >>>> Thanks for the proposed KIP. I have some meta questions about it:
> >>>>
> >>>> 1) Would an instance always pause/resume all of its current owned
> >>>> topologies (i.e. the named topologies), or are there any scenarios
> >> where
> >>> we
> >>>> only want to pause/resume a subset of them?
> >>>>
> >>>
> >>> An instance may wish to pause some of its named topologies.  I was
> unsure
> >>> what to say about named topologies in the KIP since they seem to be an
> >>> internal detail at the moment.
> >>>
> >>> I intend to add to KafkaStreamsNamedTopologyWrapper methods like:
> >>>  public void pauseNamedTopology(final String topologyToPause)
> >>>  public boolean isNamedTopologyPaused(final String topology)
> >>>  public void resumeNamedTopology(final String topologyToResume)
> >>>
> >>>
> >>>
> >>>> 2) From a user's perspective, do we want to al

Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-06 Thread Jim Hughes
Hi Bill,

Great questions; I'll do my best to reply inline:

On Fri, May 6, 2022 at 3:21 PM Bill Bejeck  wrote:

> Hi Jim,
>
> Thanks for the KIP.  I have a couple of meta-questions as well:
>
> 1) Regarding pausing only a subset of running instances, I'm thinking there
> may be a use case for pausing all of them.
> Would it make sense to also allow for pausing all instances by adding a
> method `pauseAll()` or something similar?
>

Honestly, I'm indifferent on this point.  Presently, I think what I have
proposed is the minimal change to get the ability to pause and resume
processing.  If adding a 'pauseAll()' is required, I'd be happy to do that!

From Guozhang's email, it sounds like this would require using the
rebalance protocol to trigger the coordination.  Would there be enough room
in that approach to indicate that a named topology is to be paused across
all nodes?


> 2) Would pausing affect standby tasks?  For example, imagine there are 3
> instances A, B, and C.
> A user elects to pause instance C only but it hosts the standby tasks
> for A.
> Would the standby tasks on the paused application continue to read from
> the changelog topic?
>

Yes, standby tasks would continue reading from the changelog topic.  All
consumers would continue reading to avoid getting dropped from their
consumer groups.

Cheers,

Jim




> Thanks!
> Bill
>
>
> On Fri, May 6, 2022 at 2:44 PM Jim Hughes 
> wrote:
>
> > Hi Guozhang,
> >
> > Thanks for the feedback; responses inline below:
> >
> > On Fri, May 6, 2022 at 1:09 PM Guozhang Wang  wrote:
> >
> > > Hello Jim,
> > >
> > > Thanks for the proposed KIP. I have some meta questions about it:
> > >
> > > 1) Would an instance always pause/resume all of its current owned
> > > topologies (i.e. the named topologies), or are there any scenarios
> where
> > we
> > > only want to pause/resume a subset of them?
> > >
> >
> > An instance may wish to pause some of its named topologies.  I was unsure
> > what to say about named topologies in the KIP since they seem to be an
> > internal detail at the moment.
> >
> > I intend to add to KafkaStreamsNamedTopologyWrapper methods like:
> > public void pauseNamedTopology(final String topologyToPause)
> > public boolean isNamedTopologyPaused(final String topology)
> > public void resumeNamedTopology(final String topologyToResume)
> >
> >
> >
> > > 2) From a user's perspective, do we want to always issue a
> `pause/resume`
> > > to all the instances or not? For example, we can define the semantics
> of
> > > the function as "you only need to call this function on any of the
> > > application's instances, and all instances would then pause (via the
> > > rebalance error codes)", or as "you would call this function for all
> the
> > > instances of an application". Which one are you referring to?
> > >
> >
> > My initial intent is that one would call this function on any instances
> of
> > the application that one wishes to pause.  This should allow more control
> > (in case one wanted to pause a portion of the instances).  On the other
> > hand, this approach would put more work on the implementer to coordinate
> > calling pause or resume across instances.
> >
> > If the other option is more suitable, happy to do that instead.
> >
> >
> > > 3) With EOS, there's a transaction timeout which would determine how
> > long a
> > > transaction can stay idle before it's force-aborted on the broker
> side. I
> > > think when a pause is issued, that means we'd need to immediately
> commit
> > > the current transaction for EOS since we do not know how long we could
> > > pause for. Is that right? If yes could you please clarify that in the
> doc
> > > as well.
> > >
> >
> > Good point.  My intent is for pause() to wait for the next iteration
> > through `runOnce()` and then only skip over the processing for paused
> tasks
> > in `taskManager.process(numIterations, time)`.
> >
> > Do commits live inside that call or do they live across/outside of it?
> In
> > the former case, I think there shouldn't be any issues with EOS.
> > Otherwise, we may need to work through some details to get EOS right.
> >
> > Once we figure that out, I can update the KIP.
> >
> > Thanks,
> >
> > Jim
> >
> >
> >
> > >
> > >
> > > Guozhang
> > >
> > >
> > >
> > > > On Wed, May 4, 2022 at 10:51 AM Jim Hughes wrote:
> > >
> > > > Hi all,
> > > >
> > > > I have written up a KIP for adding the ability to pause and resume
> the
> > > > processing of a topology in AK Streams.  The KIP is here:
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882832
> > > >
> > > > Thanks in advance for your feedback!
> > > >
> > > > Cheers,
> > > >
> > > > Jim
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


Re: [DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-06 Thread Jim Hughes
Hi Guozhang,

Thanks for the feedback; responses inline below:

On Fri, May 6, 2022 at 1:09 PM Guozhang Wang  wrote:

> Hello Jim,
>
> Thanks for the proposed KIP. I have some meta questions about it:
>
> 1) Would an instance always pause/resume all of its current owned
> topologies (i.e. the named topologies), or are there any scenarios where we
> only want to pause/resume a subset of them?
>

An instance may wish to pause some of its named topologies.  I was unsure
what to say about named topologies in the KIP since they seem to be an
internal detail at the moment.

I intend to add to KafkaStreamsNamedTopologyWrapper methods like:
public void pauseNamedTopology(final String topologyToPause)
public boolean isNamedTopologyPaused(final String topology)
public void resumeNamedTopology(final String topologyToResume)
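
As a rough sketch of the bookkeeping those three methods imply (the class
name and the set-based tracking are assumptions for illustration; the real
KafkaStreamsNamedTopologyWrapper is internal and may look quite different):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical bookkeeping only; not the actual KafkaStreamsNamedTopologyWrapper.
class NamedTopologyPauseSketch {
    // Names of topologies that are currently paused; safe for concurrent callers.
    private final Set<String> pausedTopologies = ConcurrentHashMap.newKeySet();

    public void pauseNamedTopology(final String topologyToPause) {
        pausedTopologies.add(topologyToPause);
    }

    public boolean isNamedTopologyPaused(final String topology) {
        return pausedTopologies.contains(topology);
    }

    public void resumeNamedTopology(final String topologyToResume) {
        pausedTopologies.remove(topologyToResume);
    }

    public static void main(final String[] args) {
        final NamedTopologyPauseSketch sketch = new NamedTopologyPauseSketch();
        sketch.pauseNamedTopology("orders");
        System.out.println(sketch.isNamedTopologyPaused("orders"));
        sketch.resumeNamedTopology("orders");
        System.out.println(sketch.isNamedTopologyPaused("orders"));
    }
}
```

A processing loop could then consult isNamedTopologyPaused(...) before
handing records to the tasks of a given topology.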



> 2) From a user's perspective, do we want to always issue a `pause/resume`
> to all the instances or not? For example, we can define the semantics of
> the function as "you only need to call this function on any of the
> application's instances, and all instances would then pause (via the
> rebalance error codes)", or as "you would call this function for all the
> instances of an application". Which one are you referring to?
>

My initial intent is that one would call this function on any instances of
the application that one wishes to pause.  This should allow more control
(in case one wanted to pause a portion of the instances).  On the other
hand, this approach would put more work on the implementer to coordinate
calling pause or resume across instances.
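
To illustrate the coordination work this leaves to the implementer, here is
a minimal sketch. PausableInstance and FakeInstance are hypothetical names,
not Kafka interfaces; in a real deployment each instance runs in its own
JVM, so these calls would travel over whatever admin or RPC channel the
application already has.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical coordinator; nothing here is part of the Kafka Streams API.
class PauseCoordinatorSketch {

    // Minimal view of an instance that can be paused (an assumption).
    interface PausableInstance {
        void pause();
        void resume();
        boolean isPaused();
    }

    // In-memory stand-in used only to exercise the coordinator.
    static class FakeInstance implements PausableInstance {
        private volatile boolean paused = false;
        public void pause() { paused = true; }
        public void resume() { paused = false; }
        public boolean isPaused() { return paused; }
    }

    // Pause only the instances the caller selected.
    static void pauseEach(final List<? extends PausableInstance> selected) {
        for (final PausableInstance instance : selected) {
            instance.pause();
        }
    }

    // Resume only the instances the caller selected.
    static void resumeEach(final List<? extends PausableInstance> selected) {
        for (final PausableInstance instance : selected) {
            instance.resume();
        }
    }

    public static void main(final String[] args) {
        final List<FakeInstance> all = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            all.add(new FakeInstance());
        }
        pauseEach(all.subList(0, 2)); // pause a portion of the instances
        for (final FakeInstance instance : all) {
            System.out.println(instance.isPaused());
        }
    }
}
```

The sketch ignores partial failures (for example, one instance being
unreachable), which is exactly the kind of detail the caller would own
under this approach.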

If the other option is more suitable, happy to do that instead.


> 3) With EOS, there's a transaction timeout which would determine how long a
> transaction can stay idle before it's force-aborted on the broker side. I
> think when a pause is issued, that means we'd need to immediately commit
> the current transaction for EOS since we do not know how long we could
> pause for. Is that right? If yes could you please clarify that in the doc
> as well.
>

Good point.  My intent is for pause() to wait for the next iteration
through `runOnce()` and then only skip over the processing for paused tasks
in `taskManager.process(numIterations, time)`.
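
A rough sketch of that loop shape, with hypothetical names (this is not the
actual StreamThread code, and the counters merely stand in for the consumer
poll and for `taskManager.process(...)`):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a stream thread's main loop.
class PausableLoopSketch {
    private final AtomicBoolean paused = new AtomicBoolean(false);
    private int polls = 0;      // poll calls, which continue while paused
    private int processed = 0;  // processing passes, which are skipped while paused

    void pause() { paused.set(true); }
    void resume() { paused.set(false); }
    boolean isPaused() { return paused.get(); }

    // One pass of the loop: always poll, but skip processing while paused.
    // The flag is only read here, so a pause takes effect on the next pass.
    void runOnce() {
        polls++;                // stands in for consumer.poll(...)
        if (!paused.get()) {
            processed++;        // stands in for taskManager.process(...)
        }
    }

    int polls() { return polls; }
    int processed() { return processed; }

    public static void main(final String[] args) {
        final PausableLoopSketch loop = new PausableLoopSketch();
        loop.runOnce();
        loop.pause();
        loop.runOnce();
        loop.resume();
        loop.runOnce();
        System.out.println("polls=" + loop.polls() + " processed=" + loop.processed());
    }
}
```

Where the commit step sits relative to the skipped processing call is left
out of the sketch, since that is the open question for EOS.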

Do commits live inside that call or do they live across/outside of it?  In
the former case, I think there shouldn't be any issues with EOS.
Otherwise, we may need to work through some details to get EOS right.

Once we figure that out, I can update the KIP.

Thanks,

Jim



>
>
> Guozhang
>
>
>
> On Wed, May 4, 2022 at 10:51 AM Jim Hughes 
> wrote:
>
> > Hi all,
> >
> > I have written up a KIP for adding the ability to pause and resume the
> > processing of a topology in AK Streams.  The KIP is here:
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882832
> >
> > Thanks in advance for your feedback!
> >
> > Cheers,
> >
> > Jim
> >
>
>
> --
> -- Guozhang
>


[DISCUSS] KIP-834: Pause / Resume KafkaStreams Topologies

2022-05-04 Thread Jim Hughes
Hi all,

I have written up a KIP for adding the ability to pause and resume the
processing of a topology in AK Streams.  The KIP is here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882832

Thanks in advance for your feedback!

Cheers,

Jim


[jira] [Created] (KAFKA-13873) Add ability to Pause / Resume KafkaStreams Topologies

2022-05-04 Thread Jim Hughes (Jira)
Jim Hughes created KAFKA-13873:
--

 Summary: Add ability to Pause / Resume KafkaStreams Topologies
 Key: KAFKA-13873
 URL: https://issues.apache.org/jira/browse/KAFKA-13873
 Project: Kafka
  Issue Type: New Feature
Reporter: Jim Hughes


In order to reduce resources used or modify data pipelines, users may want to 
pause processing temporarily.  Presently, this would require stopping the 
entire KafkaStreams instance (or instances).  

This work would add the ability to pause and resume topologies.  When the need 
to pause processing has passed, then users should be able to resume processing.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: Developer access

2022-04-04 Thread Jim Hughes
Hi Bill,

Thank you very much!

Cheers,

Jim

On Mon, Apr 4, 2022 at 4:58 PM Bill Bejeck  wrote:

> Hey Jim,
>
> You're all set now on Jira and the Wiki.
>
> -Bill
>
>
>
> On Mon, Apr 4, 2022 at 4:52 PM Jim Hughes 
> wrote:
>
> > Hi all,
> >
> > Could I get access to the Kafka project in JIRA and on the Wiki?  My
> > username for each is jhughes.
> >
> > Thanks in advance!
> >
> > Cheers,
> >
> > Jim
> >
>


Developer access

2022-04-04 Thread Jim Hughes
Hi all,

Could I get access to the Kafka project in JIRA and on the Wiki?  My
username for each is jhughes.

Thanks in advance!

Cheers,

Jim