Hi Chris,
Thanks for your comment. I might have misunderstood the filter SMT. Makes
sense. Dropping this KIP for now. I will look at improving the
existing SMTs separately.

*---*
*Thanks and Regards,*
*Kumud Kumar Srivatsava Tirupati*


On Sat, 4 Jun 2022 at 03:57, Chris Egerton <fearthecel...@gmail.com> wrote:

> Hi Kumud,
>
> I believe using the Filter SMT with a predicate would also cause the
> record to be dropped from the pipeline. AFAIK there is no way to skip the
> entire transformation chain besides applying a predicate to every SMT in
> the chain. If there's a reasonable use case for wanting to do that based on
> the presence of a field in a record that wouldn't be addressed by updating
> the SMTs that come with Connect to be friendlier when the field they're
> acting on does/does not exist, then we definitely could continue to pursue
> the HasField predicate.
>
> Cheers,
>
> Chris
>
> On Fri, Jun 3, 2022 at 10:52 AM Kumud Kumar Srivatsava Tirupati <
> kumudkumartirup...@gmail.com> wrote:
>
>> Hi Chris,
>> Thanks for the explanation. This clears my thoughts.
>> I can now agree that the concerns are totally different for SMTs and
>> predicates.
>> I also do agree with you that this might encourage SMTs to be poorly
>> designed.
>>
>> Do you see this worth considering just for the filter use case?
>> ['errors.tolerance' would make it drop from the pipeline instead of
>> skipping from the transformation chain]. Maybe by further limiting its
>> functionality to not support nested fields. I can now discard the other
>> use-cases I mentioned in the KIP in favor of "Improving the corresponding
>> SMTs".
>>
>> *---*
>> *Thanks and Regards,*
>> *Kumud Kumar Srivatsava Tirupati*
>> *Ph : +91-8686073938*
>>
>> On Fri, 3 Jun 2022 at 18:04, Chris Egerton <fearthecel...@gmail.com>
>> wrote:
>>
>>> Hi Kumud,
>>>
>>> Responses inline.
>>>
>>> > But, I still believe this logic of predicate checking shouldn't be
>>> made a
>>> part of each individual SMT. After all, that is what the predicates are
>>> for
>>> right?
>>>
>>> I don't quite agree. I think the benefit of predicates is that they can
>>> allow you to selectively apply a transformation based on conditions that
>>> don't directly relate to the behavior of the transformation. For example,
>>> "mask field 'ID' if the 'SSN' header is present". The use cases identified
>>> in the KIP don't quite align with this; instead, they're more about working
>>> around undesirable behavior in transformations that makes them difficult to
>>> use in some pipelines. If an SMT is difficult to use, it's better to just
>>> improve the SMT directly instead of adding a layer of indirection on top of
>>> it.
>>>
>>> > `HasField` predicate is something very similar but is more powerful
>>> because
>>> it allows users to skip transformations at SMT level or drop them from
>>> the
>>> transformation chain altogether using the current `
>>> org.apache.kafka.connect.transforms.Filter`. So, the use cases are not
>>> just
>>> limited to skipping some SMTs.
>>>
>>> This use case should be included in the KIP if it's a significant
>>> motivating factor for adding this predicate. But it's also possible to drop
>>> entire records from a pipeline today by setting the 'errors.tolerance'
>>> property to 'all' and allowing all records that cause exceptions to be
>>> thrown in the transformation chain to be skipped/written to a DLQ/logged.
>>>
>>> Ultimately, I'm worried that we're encouraging bad habits with SMT
>>> developers by establishing a precedent where core logic that most users
>>> would expect to be present is left out in favor of a predicate, which makes
>>> things harder to configure and can confound new users by adding a layer of
>>> complexity in the form of another pluggable interface.
>>>
>>> Cheers,
>>>
>>> Chris
>>>
>>> On Thu, Jun 2, 2022 at 4:24 AM Kumud Kumar Srivatsava Tirupati <
>>> kumudkumartirup...@gmail.com> wrote:
>>>
>>>> Hi Chris,
>>>> Thanks for your comments.
>>>>
>>>> I am totally aligned with your comment on nested field names which
>>>> include
>>>> dots. I will incorporate the same based on how the KIP-821 discussion
>>>> goes
>>>> (maybe this parser could be a utility that can be reused in other areas
>>>> as
>>>> well).
>>>>
>>>> But, I still believe this logic of predicate checking shouldn't be made
>>>> a
>>>> part of each individual SMT. After all, that is what the predicates are
>>>> for
>>>> right?
>>>> If you see, the current predicates that we have are:
>>>> org.apache.kafka.connect.transforms.predicates.TopicNameMatches
>>>> org.apache.kafka.connect.transforms.predicates.HasHeaderKey
>>>> org.apache.kafka.connect.transforms.predicates.RecordIsTombstone
>>>> They only allow you filter/transform based on the metadata of the
>>>> record.
>>>>
>>>> `HasField` predicate is something very similar but is more powerful
>>>> because
>>>> it allows users to skip transformations at SMT level or drop them from
>>>> the
>>>> transformation chain altogether using the current `
>>>> org.apache.kafka.connect.transforms.Filter`. So, the use cases are not
>>>> just
>>>> limited to skipping some SMTs.
>>>> If this makes sense, I should probably add this and give an example in
>>>> the
>>>> KIP as well.
>>>>
>>>> *---*
>>>> *Thanks and Regards,*
>>>> *Kumud Kumar Srivatsava Tirupati*
>>>>
>>>>
>>>> On Wed, 1 Jun 2022 at 07:09, Chris Egerton <fearthecel...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Kumud,
>>>> >
>>>> > Thanks for the KIP. I'm a little bit skeptical about the necessity
>>>> for this
>>>> > predicate but I think we may be able to satisfy your requirements
>>>> with a
>>>> > slightly different approach. The motivation section deals largely with
>>>> > skipping the invocation of SMTs that expect a certain field to be
>>>> present
>>>> > in a record, and will fail if it is not present. This seems like a
>>>> > reasonable use case; even when your data has a fixed schema, optional
>>>> > fields are still possible, and preventing SMTs from being used on
>>>> optional
>>>> > fields seems unnecessarily restrictive. In fact, it seems like such a
>>>> > reasonable use case that I wonder if it'd be worth investing in new
>>>> > SMT-level properties to handle this case, instead of requiring users
>>>> to
>>>> > configure a predicate separately alongside them? Something like an
>>>> > "on.field.missing" property with options of "skip", "fail", and/or
>>>> "warn"
>>>> > could do the trick.
>>>> >
>>>> > It's also worth noting that the proposed syntax for the "field.path"
>>>> > property in the HasField predicate would make it impossible to refer
>>>> to
>>>> > field names that have dots in them. It hasn't been finalized yet but
>>>> if we
>>>> > do end up going this route, we should probably use the same syntax
>>>> that's
>>>> > agreed upon for KIP-821 [1].
>>>> >
>>>> > [1] -
>>>> >
>>>> >
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures
>>>> >
>>>> > Cheers,
>>>> >
>>>> > Chris
>>>> >
>>>> > On Thu, May 26, 2022 at 5:28 AM Sagar <sagarmeansoc...@gmail.com>
>>>> wrote:
>>>> >
>>>> > > Hi Kumud,
>>>> > >
>>>> > > Thanks for that. I don't have any other comments at this point on
>>>> the
>>>> > KIP.
>>>> > > LGTM overall.
>>>> > >
>>>> > > Thanks!
>>>> > > Sagar.
>>>> > >
>>>> > > On Wed, May 25, 2022 at 5:14 PM Sagar <sagarmeansoc...@gmail.com>
>>>> wrote:
>>>> > >
>>>> > > > Thanks for the KIP Kumud.
>>>> > > >
>>>> > > > Can you please add a couple of examples on how this would behave
>>>> with
>>>> > > > different combinations. I think that way it would be easier to
>>>> > > understand.
>>>> > > >
>>>> > > > Thanks!
>>>> > > > Sagar.
>>>> > > >
>>>> > > > On Wed, May 25, 2022 at 4:59 PM Kumud Kumar Srivatsava Tirupati <
>>>> > > > kumudkumartirup...@gmail.com> wrote:
>>>> > > >
>>>> > > >> Hi all,
>>>> > > >> I have written a KIP for having a new HasField predicate for
>>>> kafka
>>>> > > connect
>>>> > > >> transforms and would like to start a discussion. Please share
>>>> your
>>>> > > >> thoughts
>>>> > > >> on the same.
>>>> > > >>
>>>> > > >>
>>>> > > >>
>>>> > >
>>>> >
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-845%3A+%27HasField%27+predicate+for+kafka+connect
>>>> > > >>
>>>> > > >> *---*
>>>> > > >> *Thanks and Regards,*
>>>> > > >> *Kumud Kumar Srivatsava Tirupati*
>>>> > > >>
>>>> > > >
>>>> > >
>>>> >
>>>>
>>>

Reply via email to