Hi there, sorry for the delay in responding. Last week had a holiday and several busy work days in it so I'm just now getting around to responding.
> We would only exclude > exception Streams can handle itself (like ProducerFencedException) -- > thus, if the handler has code to react to this, it would not be bad, as > this code is just never called. [...] > Thus, I am still in favor of not calling the ProductionExceptionHandler > for fatal exception. Acknowledged, so is ProducerFencedException the only kind of exception I need to change my behavior on? Or are there other types I need to check? Is there a comprehensive list somewhere? > About the "always continue" case. Sounds good to me to remove it (not > sure why we need it in test package?) I include it in the test package because I have tests that assert that if the record collector impl encounters an Exception and receives a CONTINUE that it actually does CONTINUE. > What is there reasoning for invoking the handler only for the first > exception? I didn't want to invoke the handler in places where the CONTINUE or FAIL result would just be ignored. Presumably, after a FAIL has been returned, subsequent exceptions are likely to be repeats or noise from my understanding of the code paths. If this is incorrect we can revisit. Once I get the answers to these questions I can make another pass on the pull request! Matt On Mon, Nov 20, 2017 at 4:07 PM Matthias J. Sax <matth...@confluent.io> wrote: > Thanks for following up! > > One thought about an older reply from you: > > >>>> I strongly disagree here. The purpose of this handler isn't *just* to > >>>> make a decision for streams. There may also be desirable side effects > that > >>>> users wish to cause when production exceptions occur. There may be > >>>> side-effects that they wish to cause when AuthenticationExceptions > occur, > >>>> as well. We should still give them the hooks to preform those side > effects. > > And your follow up: > > >> - I think I would rather invoke it for all exceptions that could > occur > >> from attempting to produce - even those exceptions were returning > CONTINUE > >> may not be a good idea (e.g. Authorization exception). Until there > is a > >> different type for exceptions that are totally fatal (for example a > >> FatalStreamsException or some sort), maintaining a list of > exceptions that > >> you can intercept with this handler and exceptions you cannot would > be > >> really error-prone and hard to keep correct. > > I understand what you are saying, however, consider that Streams needs > to die for a fatal exception. Thus, if you call the handler for a fatal > exception, we would need to ignore the return value and fail -- this > seems to be rather intuitive. Furthermore, users can register an > uncaught-exception-handler and side effects for fatal errors can be > triggered there. > > Btw: an AuthorizationException is fatal -- not sure what you mean by an > "totally fatal" exception -- there is no superlative to fatal from my > understanding. > > About maintaining a list of exceptions: I don't think this is too hard, > and users also don't need to worry about it IMHO. We would only exclude > exception Streams can handle itself (like ProducerFencedException) -- > thus, if the handler has code to react to this, it would not be bad, as > this code is just never called. In case that there is an exception > Streams could actually handle but we still call the handler (ie, bug), > there should be no harm either -- the handler needs to return either > CONTINUE or FAIL and we would obey. It could only happen, that Streams > dies---as request by the user(!)---even if Streams could actually handle > the error and move on. But this seems to be not a issue from my point of > view. > > Thus, I am still in favor of not calling the ProductionExceptionHandler > for fatal exception. > > > > About the "always continue" case. Sounds good to me to remove it (not > sure why we need it in test package?) and to rename the "failing" > handler to "Default" (even if "default" is less descriptive and I would > still prefer "Fail" in the name). > > > Last question: > > >> - Continue to *only* invoke it on the first exception that we > >> encounter (before sendException is set) > > > What is there reasoning for invoking the handler only for the first > exception? > > > > > -Matthias > > On 11/20/17 10:43 AM, Matt Farmer wrote: > > Alright, here are some updates I'm planning to make after thinking on > this > > for awhile: > > > > - Given that the "always continue" handler isn't something I'd > recommend > > for production use as is, I'm going to move it into the test > namespace and > > remove it from mention in the public API. > > - I'm going to rename the "AlwaysFailProductionExceptionHandler" to > > "DefaultProductionExceptionHandler" > > - Given that the API for the exception handler involves being invoked > by > > streams to make a decision about whether or not to continue — I think > that > > we should: > > - Continue to *only* invoke it on the first exception that we > > encounter (before sendException is set) > > - Stop invoking it for the self-healing fenced exceptions. > > - I think I would rather invoke it for all exceptions that could occur > > from attempting to produce - even those exceptions were returning > CONTINUE > > may not be a good idea (e.g. Authorization exception). Until there is > a > > different type for exceptions that are totally fatal (for example a > > FatalStreamsException or some sort), maintaining a list of exceptions > that > > you can intercept with this handler and exceptions you cannot would be > > really error-prone and hard to keep correct. > > - I'm happy to file a KIP for the creation of this new Exception > type > > if there is interest. > > > > @ Matthias — What do you think about the above? > > > > On Tue, Nov 14, 2017 at 9:44 AM Matt Farmer <m...@frmr.me> wrote: > > > >> I responded before reading your code review and didn't see the bit about > >> how ProducerFencedException is self-healing. This error handling logic > is > >> *quite* confusing to reason about... I think I'm going to sit down with > >> the code a bit more today, but I'm inclined to think that if the fenced > >> exceptions are, indeed, self healing that we still invoke the handler > but > >> ignore its result for those exceptions. > >> > >> On Tue, Nov 14, 2017 at 9:37 AM Matt Farmer <m...@frmr.me> wrote: > >> > >>> Hi there, > >>> > >>> Following up here... > >>> > >>>> One tiny comment: I would prefer to remove the "Always" from the > >>> handler implementation names -- it sounds "cleaner" to me without it. > >>> > >>> Let me think on this. I generally prefer expressiveness to clean-ness, > >>> and I think that comes out in the names I chose for things. The straw > man > >>> in my head is the person that tried to substitute in the > "AlwaysContinue" > >>> variant and the "Always" is a trigger to really consider whether or not > >>> they *always* want to try to continue. > >>> > >>> To be truthful, after some thought, using the "AlwaysContinue" variant > >>> isn't something I'd recommend generally in a production system. Ideally > >>> these handlers are targeted at handling a specific series of exceptions > >>> that a user wants to ignore, and not ignoring all exceptions. More on > this > >>> in a minute. > >>> > >>>> For the first category, it seems to not make sense to call the handle > but > >>> Streams should always fail. If we follow this design, the KIP should > list > >>> all exceptions for which the handler is not called. > >>> > >>> I strongly disagree here. The purpose of this handler isn't *just* to > >>> make a decision for streams. There may also be desirable side effects > that > >>> users wish to cause when production exceptions occur. There may be > >>> side-effects that they wish to cause when AuthenticationExceptions > occur, > >>> as well. We should still give them the hooks to preform those side > effects. > >>> > >>> In light of the above, I'm thinking that the > >>> "AlwaysContinueProductionExceptionHandler" variant should probably be > >>> removed from the public API and moved into tests since that's where > it's > >>> most useful. The more I think on it, the more I feel that having that > in > >>> the public API is a landmine. If you agree, then, we can rename the > >>> "AlwaysFailProductionExceptionHandler" to > >>> "DefaultProductionExceptionHandler". > >>> > >>> Thoughts? > >>> > >>> On Fri, Nov 10, 2017 at 6:13 PM Matthias J. Sax <matth...@confluent.io > > > >>> wrote: > >>> > >>>> I just review the PR, and there is one thing we should discuss. > >>>> > >>>> There are different types of exceptions that could occur. Some apply > to > >>>> all records (like Authorization exception) while others are for single > >>>> records only (like record too large). > >>>> > >>>> For the first category, it seems to not make sense to call the handle > >>>> but Streams should always fail. If we follow this design, the KIP > should > >>>> list all exceptions for which the handler is not called. > >>>> > >>>> WDYT? > >>>> > >>>> > >>>> -Matthias > >>>> > >>>> > >>>> On 11/10/17 2:56 PM, Matthias J. Sax wrote: > >>>>> Just catching up on this KIP. > >>>>> > >>>>> One tiny comment: I would prefer to remove the "Always" from the > >>>> handler > >>>>> implementation names -- it sounds "cleaner" to me without it. > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 11/5/17 12:57 PM, Matt Farmer wrote: > >>>>>> It is agreed, then. I've updated the pull request. I'm trying to > also > >>>>>> update the KIP accordingly, but cwiki is being slow and dropping > >>>>>> connections..... I'll try again a bit later but please consider the > >>>> KIP > >>>>>> updated for all intents and purposes. heh. > >>>>>> > >>>>>> On Sun, Nov 5, 2017 at 3:45 PM Guozhang Wang <wangg...@gmail.com> > >>>> wrote: > >>>>>> > >>>>>>> That makes sense. > >>>>>>> > >>>>>>> > >>>>>>> Guozhang > >>>>>>> > >>>>>>> On Sun, Nov 5, 2017 at 12:33 PM, Matt Farmer <m...@frmr.me> wrote: > >>>>>>> > >>>>>>>> Interesting. I'm not sure I agree. I've been bitten many times by > >>>>>>>> unintentionally shipping code that fails to properly implement > >>>> logging. I > >>>>>>>> always discover this at the exact *worst* moment, too. (Normally > at > >>>> 3 AM > >>>>>>>> during an on-call shift. Hah.) However, if others feel the same > way > >>>> I > >>>>>>> could > >>>>>>>> probably be convinced to remove it. > >>>>>>>> > >>>>>>>> We could also meet halfway and say that when a customized > >>>>>>>> ProductionExceptionHandler instructs Streams to CONTINUE, we log > at > >>>> DEBUG > >>>>>>>> level instead of WARN level. Would that alternative be appealing > to > >>>> you? > >>>>>>>> > >>>>>>>> On Sun, Nov 5, 2017 at 12:32 PM Guozhang Wang <wangg...@gmail.com > > > >>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Thanks for the updates. I made a pass over the wiki again and it > >>>> looks > >>>>>>>>> good. > >>>>>>>>> > >>>>>>>>> About whether record collector should still internally log the > >>>> error in > >>>>>>>>> addition to what the customized ProductionExceptionHandler does. > I > >>>>>>>>> personally would prefer only to log if the returned value is FAIL > >>>> to > >>>>>>>>> indicate that this thread is going to shutdown and trigger the > >>>>>>> exception > >>>>>>>>> handler. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Guozhang > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Sun, Nov 5, 2017 at 6:09 AM, Matt Farmer <m...@frmr.me> > wrote: > >>>>>>>>> > >>>>>>>>>> Hello, a bit later than I'd anticipated, but I've updated this > >>>> KIP as > >>>>>>>>>> outlined above. The updated KIP is now ready for review again! > >>>>>>>>>> > >>>>>>>>>> On Sat, Nov 4, 2017 at 1:03 PM Matt Farmer <m...@frmr.me> > wrote: > >>>>>>>>>> > >>>>>>>>>>> Ah. I actually created both of those in the PR and forgot to > >>>>>>> mention > >>>>>>>>> them > >>>>>>>>>>> by name in the KIP! Thanks for pointing out the oversight. > >>>>>>>>>>> > >>>>>>>>>>> I’ll revise the KIP this afternoon accordingly. > >>>>>>>>>>> > >>>>>>>>>>> The logging is actually provided for in the record collector. > >>>>>>>> Whenever > >>>>>>>>> a > >>>>>>>>>>> handler continues it’ll log a warning to ensure that it’s > >>>>>>>> *impossible* > >>>>>>>>> to > >>>>>>>>>>> write a handler that totally suppresses production exceptions > >>>> from > >>>>>>>> the > >>>>>>>>>> log. > >>>>>>>>>>> As such, the default continue handler just returns the continue > >>>>>>>> value. > >>>>>>>>> I > >>>>>>>>>>> can add details about those semantics to the KIP as well. > >>>>>>>>>>> On Sat, Nov 4, 2017 at 12:46 PM Matthias J. Sax < > >>>>>>>> matth...@confluent.io > >>>>>>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> One more comment. > >>>>>>>>>>>> > >>>>>>>>>>>> You mention a default implementation for the handler that > >>>> fails. I > >>>>>>>>>>>> think, this should be part of the public API and thus should > >>>> have > >>>>>>> a > >>>>>>>>>>>> proper defined named that is mentioned in the KIP. > >>>>>>>>>>>> > >>>>>>>>>>>> We could also add a second implementation for the > >>>> log-and-move-on > >>>>>>>>>>>> strategy, as both are the two most common cases. This class > >>>> should > >>>>>>>>> also > >>>>>>>>>>>> be part of public API (so users can just set in the config) > >>>> with a > >>>>>>>>>>>> proper name. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Otherwise, I like the KIP a lot! Thanks. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> -Matthias > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 11/1/17 12:23 AM, Matt Farmer wrote: > >>>>>>>>>>>>> Thanks for the heads up. Yes, I think my changes are > compatible > >>>>>>>> with > >>>>>>>>>>>> that > >>>>>>>>>>>>> PR, but there will be a merge conflict that happens whenever > >>>> one > >>>>>>>> of > >>>>>>>>>> the > >>>>>>>>>>>> PRs > >>>>>>>>>>>>> is merged. Happy to reconcile the changes in my PR if 4148 > goes > >>>>>>> in > >>>>>>>>>>>> first. :) > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Tue, Oct 31, 2017 at 6:44 PM Guozhang Wang < > >>>>>>> wangg...@gmail.com > >>>>>>>>> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> That sounds reasonable, thanks Matt. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> As for the implementation, please note that there is another > >>>>>>>>> ongoing > >>>>>>>>>> PR > >>>>>>>>>>>>>> that may touch the same classes that you are working on: > >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4148 > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> So it may help if you can also take a look at that PR and > see > >>>>>>> if > >>>>>>>> it > >>>>>>>>>> is > >>>>>>>>>>>>>> compatible with your changes. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Tue, Oct 31, 2017 at 10:59 AM, Matt Farmer <m...@frmr.me > > > >>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I've opened this pull request to implement the KIP as > >>>>>>> currently > >>>>>>>>>>>> written: > >>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4165. It still needs > >>>>>>> some > >>>>>>>>>> tests > >>>>>>>>>>>>>>> added, > >>>>>>>>>>>>>>> but largely represents the shape I was going for. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> If there are more points that folks would like to discuss, > >>>>>>>> please > >>>>>>>>>> let > >>>>>>>>>>>> me > >>>>>>>>>>>>>>> know. If I don't hear anything by tomorrow afternoon I'll > >>>>>>>> probably > >>>>>>>>>>>> start > >>>>>>>>>>>>>> a > >>>>>>>>>>>>>>> [VOTE] thread. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>> Matt > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:33 PM Matt Farmer <m...@frmr.me> > >>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I can’t think of a reason that would be problematic. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Most of the time I would write a handler like this, I > either > >>>>>>>> want > >>>>>>>>>> to > >>>>>>>>>>>>>>>> ignore the error or fail and bring everything down so > that I > >>>>>>>> can > >>>>>>>>>> spin > >>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>> back up later and resume from earlier offsets. When we > start > >>>>>>> up > >>>>>>>>>> after > >>>>>>>>>>>>>>>> crashing we’ll eventually try to process the message we > >>>>>>> failed > >>>>>>>> to > >>>>>>>>>>>>>> produce > >>>>>>>>>>>>>>>> again. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I’m concerned that “putting in a queue for later” opens > you > >>>>>>> up > >>>>>>>> to > >>>>>>>>>>>>>> putting > >>>>>>>>>>>>>>>> messages into the destination topic in an unexpected > order. > >>>>>>>>> However > >>>>>>>>>>>> if > >>>>>>>>>>>>>>>> others feel differently, I’m happy to talk about it. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:10 PM Guozhang Wang < > >>>>>>>>> wangg...@gmail.com> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is > >>>>>>> that > >>>>>>>>> the > >>>>>>>>>>>>>>> record > >>>>>>>>>>>>>>>>>> metadata is always null if an exception occurred while > >>>>>>> trying > >>>>>>>>> to > >>>>>>>>>>>>>>>>> produce. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> That is right. Thanks. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> I looked at the example code, and one thing I realized > that > >>>>>>>>> since > >>>>>>>>>> we > >>>>>>>>>>>>>> are > >>>>>>>>>>>>>>>>> not passing the context in the handle function, we may > not > >>>>>>> be > >>>>>>>>>>>>>> implement > >>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>> logic to send the fail records into another queue for > >>>> future > >>>>>>>>>>>>>> processing. > >>>>>>>>>>>>>>>>> Would people think that would be a big issue? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Thu, Oct 26, 2017 at 12:14 PM, Matt Farmer < > >>>> m...@frmr.me > >>>>>>>> > >>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hello all, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I've updated the KIP based on this conversation, and > made > >>>>>>> it > >>>>>>>> so > >>>>>>>>>>>> that > >>>>>>>>>>>>>>> its > >>>>>>>>>>>>>>>>>> interface, config setting, and parameters line up more > >>>>>>>> closely > >>>>>>>>>> with > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> interface in KIP-161 (deserialization handler). > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I believe there are a few specific questions I need to > >>>>>>> reply > >>>>>>>>>>>> to..... > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> The question I had about then handle parameters are > >>>> around > >>>>>>>> the > >>>>>>>>>>>>>>> record, > >>>>>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be > >>>>>>>> generics > >>>>>>>>> of > >>>>>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or > >>>>>>>>> `ProducerRecord<? > >>>>>>>>>>>>>>>>> extends > >>>>>>>>>>>>>>>>>>> Object, ? extends Object>`? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> At this point in the code we're guaranteed that this is > a > >>>>>>>>>>>>>>>>>> ProducerRecord<byte[], byte[]>, so the generics would > just > >>>>>>>> make > >>>>>>>>>> it > >>>>>>>>>>>>>>>>> harder > >>>>>>>>>>>>>>>>>> to work with the key and value. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Also, should the handle function include the > >>>>>>>> `RecordMetadata` > >>>>>>>>> as > >>>>>>>>>>>>>>> well > >>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>> case it is not null? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is > >>>>>>> that > >>>>>>>>> the > >>>>>>>>>>>>>>> record > >>>>>>>>>>>>>>>>>> metadata is always null if an exception occurred while > >>>>>>> trying > >>>>>>>>> to > >>>>>>>>>>>>>>>>> produce. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> We may probably try to write down at least the > following > >>>>>>>>>> handling > >>>>>>>>>>>>>>>>> logic > >>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>> see if the given API is sufficient for it > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I've added some examples to the KIP. Let me know what > you > >>>>>>>>> think. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>> Matt > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 9:00 PM Matt Farmer < > m...@frmr.me > >>>>> > >>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks for this feedback. I’m at a conference right now > >>>>>>> and > >>>>>>>> am > >>>>>>>>>>>>>>>>> planning > >>>>>>>>>>>>>>>>>> on > >>>>>>>>>>>>>>>>>>> updating the KIP again with details from this > >>>> conversation > >>>>>>>>> later > >>>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>> week. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I’ll shoot you a more detailed response then! :) > >>>>>>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 8:16 PM Guozhang Wang < > >>>>>>>>>> wangg...@gmail.com > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Thanks for the KIP Matt. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Regarding the handle interface of > >>>>>>>>>> ProductionExceptionHandlerResp > >>>>>>>>>>>>>>> onse, > >>>>>>>>>>>>>>>>>>>> could > >>>>>>>>>>>>>>>>>>>> you write it on the wiki also, along with the actual > >>>>>>> added > >>>>>>>>>> config > >>>>>>>>>>>>>>>>> names > >>>>>>>>>>>>>>>>>>>> (e.g. what > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>> 161%3A+streams+ > >>>>>>>>>>>>>>>>>> deserialization+exception+handlers > >>>>>>>>>>>>>>>>>>>> described). > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> The question I had about then handle parameters are > >>>>>>> around > >>>>>>>>> the > >>>>>>>>>>>>>>>>> record, > >>>>>>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be > >>>>>>>> generics > >>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or > >>>>>>>>> `ProducerRecord<? > >>>>>>>>>>>>>>>>> extends > >>>>>>>>>>>>>>>>>>>> Object, ? extends Object>`? > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Also, should the handle function include the > >>>>>>>> `RecordMetadata` > >>>>>>>>>> as > >>>>>>>>>>>>>>>>> well in > >>>>>>>>>>>>>>>>>>>> case it is not null? > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> We may probably try to write down at least the > following > >>>>>>>>>> handling > >>>>>>>>>>>>>>>>> logic > >>>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>> see if the given API is sufficient for it: 1) throw > >>>>>>>> exception > >>>>>>>>>>>>>>>>>> immediately > >>>>>>>>>>>>>>>>>>>> to fail fast and stop the world, 2) log the error and > >>>>>>> drop > >>>>>>>>>> record > >>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>> proceed silently, 3) send such errors to a specific > >>>>>>> "error" > >>>>>>>>>> Kafka > >>>>>>>>>>>>>>>>> topic, > >>>>>>>>>>>>>>>>>>>> or > >>>>>>>>>>>>>>>>>>>> record it as an app-level metrics ( > >>>>>>>>>>>>>>>>>>>> > https://kafka.apache.org/documentation/#kafka_streams_ > >>>>>>>>>> monitoring > >>>>>>>>>>>>>> ) > >>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>> monitoring. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 5:47 PM, Matt Farmer < > >>>>>>> m...@frmr.me > >>>>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I did some more digging tonight. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> @Ted: It looks like the deserialization handler uses > >>>>>>>>>>>>>>>>>>>>> "default.deserialization.exception.handler" for the > >>>>>>>> config > >>>>>>>>>>>>>>> name. No > >>>>>>>>>>>>>>>>>>>>> ".class" on the end. I'm inclined to think this > should > >>>>>>> use > >>>>>>>>>>>>>>>>>>>>> "default.production.exception.handler". > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 8:22 PM Matt Farmer < > >>>>>>> m...@frmr.me > >>>>>>>>> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Okay, I've dug into this a little bit. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> I think getting access to the serialized record is > >>>>>>>>> possible, > >>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>> changing > >>>>>>>>>>>>>>>>>>>>>> the naming and return type is certainly doable. > >>>>>>> However, > >>>>>>>>>>>>>>> because > >>>>>>>>>>>>>>>>>> we're > >>>>>>>>>>>>>>>>>>>>>> hooking into the onCompletion callback we have no > >>>>>>>> guarantee > >>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> ProcessorContext state hasn't changed by the time > this > >>>>>>>>>>>>>>> particular > >>>>>>>>>>>>>>>>>>>> handler > >>>>>>>>>>>>>>>>>>>>>> runs. So I think the signature would change to > >>>>>>> something > >>>>>>>>>>>>>> like: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> ProductionExceptionHandlerResponse handle(final > >>>>>>>>>>>>>>>>> ProducerRecord<..> > >>>>>>>>>>>>>>>>>>>>> record, > >>>>>>>>>>>>>>>>>>>>>> final Exception exception) > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Would this be acceptable? > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:33 PM Matt Farmer < > >>>>>>>> m...@frmr.me> > >>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Ah good idea. Hmmm. I can line up the naming and > >>>>>>> return > >>>>>>>>> type > >>>>>>>>>>>>>>> but > >>>>>>>>>>>>>>>>>> I’m > >>>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>>>> sure if I can get my hands on the context and the > >>>>>>> record > >>>>>>>>>>>>>>> itself > >>>>>>>>>>>>>>>>>>>> without > >>>>>>>>>>>>>>>>>>>>>>> other changes. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Let me dig in and follow up here tomorrow. > >>>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:14 PM Matthias J. Sax < > >>>>>>>>>>>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Are you familiar with KIP-161? > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>> 161%3A+streams+ > >>>>>>>>>>>>>>>>>>>>> deserialization+exception+handlers > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> I thinks, we should align the design (parameter > >>>>>>> naming, > >>>>>>>>>>>>>>> return > >>>>>>>>>>>>>>>>>>>> types, > >>>>>>>>>>>>>>>>>>>>>>>> class names etc) of KIP-210 to KIP-161 to get a > >>>>>>> unified > >>>>>>>>>>>>>> user > >>>>>>>>>>>>>>>>>>>>> experience. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On 10/18/17 4:20 PM, Matt Farmer wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> I’ll create the JIRA ticket. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> I think that config name will work. I’ll update > the > >>>>>>>> KIP > >>>>>>>>>>>>>>>>>>>> accordingly. > >>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 6:09 PM Ted Yu < > >>>>>>>>>>>>>>> yuzhih...@gmail.com> > >>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Can you create JIRA that corresponds to the KIP > ? > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> For the new config, how about naming it > >>>>>>>>>>>>>>>>>>>>>>>>>> production.exception.processor.class > >>>>>>>>>>>>>>>>>>>>>>>>>> ? This way it is clear that class name should be > >>>>>>>>>>>>>>> specified. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Cheers > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 2:40 PM, Matt Farmer < > >>>>>>>>>>>>>>> m...@frmr.me> > >>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone, > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> This is the discussion thread for the KIP that > I > >>>>>>>> just > >>>>>>>>>>>>>>> filed > >>>>>>>>>>>>>>>>>>>> here: > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>>>>>>>>>> 210+-+Provide+for+custom+ > >>>>>>>> error+handling++when+Kafka+ > >>>>>>>>>>>>>>>>>>>>>>>>>>> Streams+fails+to+produce > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to getting some feedback from > >>>>>>> folks > >>>>>>>>>>>>>> about > >>>>>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>>>> idea > >>>>>>>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>> working toward a solution we can contribute > back. > >>>>>>> :) > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>>>>>>> Matt Farmer > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> -- Guozhang > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> -- Guozhang > >>>>>>> > >>>>>> > >>>>> > >>>> > >>>> > > > >