Hi Chris,
Thanks for the explanation. This clears my thoughts.
I can now agree that the concerns are totally different for SMTs and
predicates.
I also do agree with you that this might encourage SMTs to be poorly
designed.

Do you see this worth considering just for the filter use case?
['errors.tolerance' would make it drop from the pipeline instead of
skipping from the transformation chain]. Maybe by further limiting its
functionality to not support nested fields. I can now discard the other
use-cases I mentioned in the KIP in favor of "Improving the corresponding
SMTs".

*---*
*Thanks and Regards,*
*Kumud Kumar Srivatsava Tirupati*
*Ph : +91-8686073938*

On Fri, 3 Jun 2022 at 18:04, Chris Egerton <fearthecel...@gmail.com> wrote:

> Hi Kumud,
>
> Responses inline.
>
> > But, I still believe this logic of predicate checking shouldn't be made a
> part of each individual SMT. After all, that is what the predicates are for
> right?
>
> I don't quite agree. I think the benefit of predicates is that they can
> allow you to selectively apply a transformation based on conditions that
> don't directly relate to the behavior of the transformation. For example,
> "mask field 'ID' if the 'SSN' header is present". The use cases identified
> in the KIP don't quite align with this; instead, they're more about working
> around undesirable behavior in transformations that makes them difficult to
> use in some pipelines. If an SMT is difficult to use, it's better to just
> improve the SMT directly instead of adding a layer of indirection on top of
> it.
>
> > `HasField` predicate is something very similar but is more powerful
> because
> it allows users to skip transformations at SMT level or drop them from the
> transformation chain altogether using the current `
> org.apache.kafka.connect.transforms.Filter`. So, the use cases are not just
> limited to skipping some SMTs.
>
> This use case should be included in the KIP if it's a significant
> motivating factor for adding this predicate. But it's also possible to drop
> entire records from a pipeline today by setting the 'errors.tolerance'
> property to 'all' and allowing all records that cause exceptions to be
> thrown in the transformation chain to be skipped/written to a DLQ/logged.
>
> Ultimately, I'm worried that we're encouraging bad habits with SMT
> developers by establishing a precedent where core logic that most users
> would expect to be present is left out in favor of a predicate, which makes
> things harder to configure and can confound new users by adding a layer of
> complexity in the form of another pluggable interface.
>
> Cheers,
>
> Chris
>
> On Thu, Jun 2, 2022 at 4:24 AM Kumud Kumar Srivatsava Tirupati <
> kumudkumartirup...@gmail.com> wrote:
>
>> Hi Chris,
>> Thanks for your comments.
>>
>> I am totally aligned with your comment on nested field names which include
>> dots. I will incorporate the same based on how the KIP-821 discussion goes
>> (maybe this parser could be a utility that can be reused in other areas as
>> well).
>>
>> But, I still believe this logic of predicate checking shouldn't be made a
>> part of each individual SMT. After all, that is what the predicates are
>> for
>> right?
>> If you see, the current predicates that we have are:
>> org.apache.kafka.connect.transforms.predicates.TopicNameMatches
>> org.apache.kafka.connect.transforms.predicates.HasHeaderKey
>> org.apache.kafka.connect.transforms.predicates.RecordIsTombstone
>> They only allow you filter/transform based on the metadata of the record.
>>
>> `HasField` predicate is something very similar but is more powerful
>> because
>> it allows users to skip transformations at SMT level or drop them from the
>> transformation chain altogether using the current `
>> org.apache.kafka.connect.transforms.Filter`. So, the use cases are not
>> just
>> limited to skipping some SMTs.
>> If this makes sense, I should probably add this and give an example in the
>> KIP as well.
>>
>> *---*
>> *Thanks and Regards,*
>> *Kumud Kumar Srivatsava Tirupati*
>>
>>
>> On Wed, 1 Jun 2022 at 07:09, Chris Egerton <fearthecel...@gmail.com>
>> wrote:
>>
>> > Hi Kumud,
>> >
>> > Thanks for the KIP. I'm a little bit skeptical about the necessity for
>> this
>> > predicate but I think we may be able to satisfy your requirements with a
>> > slightly different approach. The motivation section deals largely with
>> > skipping the invocation of SMTs that expect a certain field to be
>> present
>> > in a record, and will fail if it is not present. This seems like a
>> > reasonable use case; even when your data has a fixed schema, optional
>> > fields are still possible, and preventing SMTs from being used on
>> optional
>> > fields seems unnecessarily restrictive. In fact, it seems like such a
>> > reasonable use case that I wonder if it'd be worth investing in new
>> > SMT-level properties to handle this case, instead of requiring users to
>> > configure a predicate separately alongside them? Something like an
>> > "on.field.missing" property with options of "skip", "fail", and/or
>> "warn"
>> > could do the trick.
>> >
>> > It's also worth noting that the proposed syntax for the "field.path"
>> > property in the HasField predicate would make it impossible to refer to
>> > field names that have dots in them. It hasn't been finalized yet but if
>> we
>> > do end up going this route, we should probably use the same syntax
>> that's
>> > agreed upon for KIP-821 [1].
>> >
>> > [1] -
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures
>> >
>> > Cheers,
>> >
>> > Chris
>> >
>> > On Thu, May 26, 2022 at 5:28 AM Sagar <sagarmeansoc...@gmail.com>
>> wrote:
>> >
>> > > Hi Kumud,
>> > >
>> > > Thanks for that. I don't have any other comments at this point on the
>> > KIP.
>> > > LGTM overall.
>> > >
>> > > Thanks!
>> > > Sagar.
>> > >
>> > > On Wed, May 25, 2022 at 5:14 PM Sagar <sagarmeansoc...@gmail.com>
>> wrote:
>> > >
>> > > > Thanks for the KIP Kumud.
>> > > >
>> > > > Can you please add a couple of examples on how this would behave
>> with
>> > > > different combinations. I think that way it would be easier to
>> > > understand.
>> > > >
>> > > > Thanks!
>> > > > Sagar.
>> > > >
>> > > > On Wed, May 25, 2022 at 4:59 PM Kumud Kumar Srivatsava Tirupati <
>> > > > kumudkumartirup...@gmail.com> wrote:
>> > > >
>> > > >> Hi all,
>> > > >> I have written a KIP for having a new HasField predicate for kafka
>> > > connect
>> > > >> transforms and would like to start a discussion. Please share your
>> > > >> thoughts
>> > > >> on the same.
>> > > >>
>> > > >>
>> > > >>
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-845%3A+%27HasField%27+predicate+for+kafka+connect
>> > > >>
>> > > >> *---*
>> > > >> *Thanks and Regards,*
>> > > >> *Kumud Kumar Srivatsava Tirupati*
>> > > >>
>> > > >
>> > >
>> >
>>
>

Reply via email to