Given that the conversation has lingered for a bit, I've gone ahead and
opened up a PR with the initial implementation. Let me know your thoughts!

https://github.com/apache/kafka/pull/5002

Also, voting is open - so if you like this idea please send me some binding
+1's before May 22nd so we can get this in Kafka 2.0 :)

On Tue, Apr 17, 2018 at 7:11 PM, Matt Farmer <m...@frmr.me> wrote:

> Hello all, I've updated a KIP again to add a few sentences about the
> general problem we were facing in the motivation section. Please let me
> know if there is any further feedback.
>
> On Tue, Apr 3, 2018 at 1:46 PM, Matt Farmer <m...@frmr.me> wrote:
>
>> Hey Randall,
>>
>> Devil's advocate sparring is always a fun game so I'm down. =)
>>
>> Rebalance caused by connectivity failure is the case that caused us to
>> notice the issue. But the issue
>> is really more about giving connectors the tools to facilitate rebalances
>> or a Kafka connect shutdown
>> cleanly. Perhaps that wasn't clear in the KIP.
>>
>> In our case timeouts were *not* uniformly affecting tasks. But every
>> time a timeout occurred in one task,
>> all tasks lost whatever forward progress they had made. So, yes, in the
>> specific case of timeouts a
>> backoff jitter in the connector is a solution for that particular issue.
>> However, this KIP *also* gives connectors
>> enough information to behave intelligently during any kind of rebalance
>> or shutdown because they're able
>> to discover that preCommit is being invoked for that specific reason. (As
>> opposed to being invoked
>> during normal operation.)
>>
>> On Tue, Apr 3, 2018 at 12:36 PM, Randall Hauch <rha...@gmail.com> wrote:
>>
>>> Matt,
>>>
>>> Let me play devil's advocate. Do we need this additional complexity? The
>>> motivation section talked about needing to deal with task failures due to
>>> connectivity problems. Generally speaking, isn't it possible that if one
>>> task has connectivity problems with either the broker or the external
>>> system that other tasks would as well? And in the particular case of S3,
>>> is
>>> it possible to try and prevent the task shutdown in the first place,
>>> perhaps by improving how the S3 connector retries? (We did this in the
>>> Elasticsearch connector using backoff with jitter; see
>>> https://github.com/confluentinc/kafka-connect-elasticsearch/pull/116 for
>>> details.)
>>>
>>> Best regards,
>>>
>>> Randall
>>>
>>> On Tue, Apr 3, 2018 at 8:39 AM, Matt Farmer <m...@frmr.me> wrote:
>>>
>>> > I have made the requested updates to the KIP! :)
>>> >
>>> > On Mon, Apr 2, 2018 at 11:02 AM, Matt Farmer <m...@frmr.me> wrote:
>>> >
>>> > > Ugh
>>> > >
>>> > > * I can update
>>> > >
>>> > > I need more caffeine...
>>> > >
>>> > > On Mon, Apr 2, 2018 at 11:01 AM, Matt Farmer <m...@frmr.me> wrote:
>>> > >
>>> > >> I'm can update the rejected alternatives section as you describe.
>>> > >>
>>> > >> Also, adding a paragraph to the preCommit javadoc also seems like a
>>> > >> Very Very Good Idea™ so I'll make that update to the KIP as well.
>>> > >>
>>> > >> On Mon, Apr 2, 2018 at 10:48 AM, Randall Hauch <rha...@gmail.com>
>>> > wrote:
>>> > >>
>>> > >>> Thanks for the KIP proposal, Matt.
>>> > >>>
>>> > >>> You mention in the "Rejected Alternatives" section that you
>>> considered
>>> > >>> changing the signature of the `preCommit` method but rejected it
>>> > because
>>> > >>> it
>>> > >>> would break backward compatibility. Strictly speaking, it is
>>> possible
>>> > to
>>> > >>> do
>>> > >>> this without breaking compatibility by introducing a new
>>> `preCommit`
>>> > >>> method, deprecating the old one, and having the new implementation
>>> call
>>> > >>> the
>>> > >>> old one. Such an approach would be complicated, and I'm not sure it
>>> > adds
>>> > >>> any value. In fact, one of the benefits of having a context object
>>> is
>>> > >>> that
>>> > >>> we can add methods like the one you're proposing without causing
>>> any
>>> > >>> compatibility issues. Anyway, it probably is worth updating this
>>> > rejected
>>> > >>> alternative to be a bit more precise.
>>> > >>>
>>> > >>> Otherwise, I think this is a good approach, though I'd request
>>> that we
>>> > >>> update the `preCommit` JavaDoc to add a paragraph that explains
>>> this
>>> > >>> scenario. Thoughts?
>>> > >>>
>>> > >>> Randall
>>> > >>>
>>> > >>> On Wed, Mar 28, 2018 at 9:29 PM, Ted Yu <yuzhih...@gmail.com>
>>> wrote:
>>> > >>>
>>> > >>> > I looked at WorkerSinkTask and it seems using a boolean for
>>> KIP-275
>>> > >>> should
>>> > >>> > suffice for now.
>>> > >>> >
>>> > >>> > Thanks
>>> > >>> >
>>> > >>> > On Wed, Mar 28, 2018 at 7:20 PM, Matt Farmer <m...@frmr.me>
>>> wrote:
>>> > >>> >
>>> > >>> > > Hey Ted,
>>> > >>> > >
>>> > >>> > > I have not, actually!
>>> > >>> > >
>>> > >>> > > Do you think that we're likely to add multiple states here
>>> soon?
>>> > >>> > >
>>> > >>> > > My instinct is to keep it simple until there are multiple
>>> states
>>> > >>> that we
>>> > >>> > > would want
>>> > >>> > > to consider. I really like the simplicity of just getting a
>>> boolean
>>> > >>> and
>>> > >>> > the
>>> > >>> > > implementation of WorkerSinkTask already passes around a
>>> boolean to
>>> > >>> > > indicate this is happening internally. We're really just
>>> shuttling
>>> > >>> that
>>> > >>> > > value into
>>> > >>> > > the context at the correct moments.
>>> > >>> > >
>>> > >>> > > Once we have multiple states, we could choose to provide a more
>>> > >>> > > appropriately
>>> > >>> > > named method (e.g. getState?) and reimplement isClosing by
>>> checking
>>> > >>> that
>>> > >>> > > enum
>>> > >>> > > without breaking compatibility.
>>> > >>> > >
>>> > >>> > > However, if we think multiple states here are imminent for some
>>> > >>> reason, I
>>> > >>> > > would
>>> > >>> > > be pretty easy to convince adding that would be worth the extra
>>> > >>> > complexity!
>>> > >>> > > :)
>>> > >>> > >
>>> > >>> > > Matt
>>> > >>> > >
>>> > >>> > > —
>>> > >>> > > Matt Farmer | Blog <http://farmdawgnation.com/> | Twitter
>>> > >>> > > <http://twitter.com/farmdawgnation>
>>> > >>> > > GPG: CD57 2E26 F60C 0A61 E6D8  FC72 4493 8917 D667 4D07
>>> > >>> > >
>>> > >>> > > On Wed, Mar 28, 2018 at 10:02 PM, Ted Yu <yuzhih...@gmail.com>
>>> > >>> wrote:
>>> > >>> > >
>>> > >>> > > > The enhancement gives SinkTaskContext state information.
>>> > >>> > > >
>>> > >>> > > > Have you thought of exposing the state retrieval as an enum
>>> > >>> (initially
>>> > >>> > > with
>>> > >>> > > > two values) ?
>>> > >>> > > >
>>> > >>> > > > Thanks
>>> > >>> > > >
>>> > >>> > > > On Wed, Mar 28, 2018 at 6:55 PM, Matt Farmer <m...@frmr.me>
>>> > wrote:
>>> > >>> > > >
>>> > >>> > > > > Hello all,
>>> > >>> > > > >
>>> > >>> > > > > I am proposing KIP-275 to improve Connect's
>>> SinkTaskContext so
>>> > >>> that
>>> > >>> > > Sinks
>>> > >>> > > > > can be informed
>>> > >>> > > > > in their preCommit hook if the hook is being invoked as a
>>> part
>>> > >>> of a
>>> > >>> > > > > rebalance or Connect
>>> > >>> > > > > shutdown.
>>> > >>> > > > >
>>> > >>> > > > > The KIP is here:
>>> > >>> > > > > https://cwiki.apache.org/confluence/pages/viewpage.
>>> > >>> > > > action?pageId=75977607
>>> > >>> > > > >
>>> > >>> > > > > Please let me know what feedback y'all have. Thanks!
>>> > >>> > > > >
>>> > >>> > > >
>>> > >>> > >
>>> > >>> >
>>> > >>>
>>> > >>
>>> > >>
>>> > >
>>> >
>>>
>>
>>
>

Reply via email to