Hi Chris, Thanks for your comment. I might have misunderstood the filter SMT. Makes sense. Dropping this KIP for now. I will look at improving the existing SMTs separately.
*---* *Thanks and Regards,* *Kumud Kumar Srivatsava Tirupati* On Sat, 4 Jun 2022 at 03:57, Chris Egerton <fearthecel...@gmail.com> wrote: > Hi Kumud, > > I believe using the Filter SMT with a predicate would also cause the > record to be dropped from the pipeline. AFAIK there is no way to skip the > entire transformation chain besides applying a predicate to every SMT in > the chain. If there's a reasonable use case for wanting to do that based on > the presence of a field in a record that wouldn't be addressed by updating > the SMTs that come with Connect to be friendlier when the field they're > acting on does/does not exist, then we definitely could continue to pursue > the HasField predicate. > > Cheers, > > Chris > > On Fri, Jun 3, 2022 at 10:52 AM Kumud Kumar Srivatsava Tirupati < > kumudkumartirup...@gmail.com> wrote: > >> Hi Chris, >> Thanks for the explanation. This clears my thoughts. >> I can now agree that the concerns are totally different for SMTs and >> predicates. >> I also do agree with you that this might encourage SMTs to be poorly >> designed. >> >> Do you see this worth considering just for the filter use case? >> ['errors.tolerance' would make it drop from the pipeline instead of >> skipping from the transformation chain]. Maybe by further limiting its >> functionality to not support nested fields. I can now discard the other >> use-cases I mentioned in the KIP in favor of "Improving the corresponding >> SMTs". >> >> *---* >> *Thanks and Regards,* >> *Kumud Kumar Srivatsava Tirupati* >> *Ph : +91-8686073938* >> >> On Fri, 3 Jun 2022 at 18:04, Chris Egerton <fearthecel...@gmail.com> >> wrote: >> >>> Hi Kumud, >>> >>> Responses inline. >>> >>> > But, I still believe this logic of predicate checking shouldn't be >>> made a >>> part of each individual SMT. After all, that is what the predicates are >>> for >>> right? >>> >>> I don't quite agree. I think the benefit of predicates is that they can >>> allow you to selectively apply a transformation based on conditions that >>> don't directly relate to the behavior of the transformation. For example, >>> "mask field 'ID' if the 'SSN' header is present". The use cases identified >>> in the KIP don't quite align with this; instead, they're more about working >>> around undesirable behavior in transformations that makes them difficult to >>> use in some pipelines. If an SMT is difficult to use, it's better to just >>> improve the SMT directly instead of adding a layer of indirection on top of >>> it. >>> >>> > `HasField` predicate is something very similar but is more powerful >>> because >>> it allows users to skip transformations at SMT level or drop them from >>> the >>> transformation chain altogether using the current ` >>> org.apache.kafka.connect.transforms.Filter`. So, the use cases are not >>> just >>> limited to skipping some SMTs. >>> >>> This use case should be included in the KIP if it's a significant >>> motivating factor for adding this predicate. But it's also possible to drop >>> entire records from a pipeline today by setting the 'errors.tolerance' >>> property to 'all' and allowing all records that cause exceptions to be >>> thrown in the transformation chain to be skipped/written to a DLQ/logged. >>> >>> Ultimately, I'm worried that we're encouraging bad habits with SMT >>> developers by establishing a precedent where core logic that most users >>> would expect to be present is left out in favor of a predicate, which makes >>> things harder to configure and can confound new users by adding a layer of >>> complexity in the form of another pluggable interface. >>> >>> Cheers, >>> >>> Chris >>> >>> On Thu, Jun 2, 2022 at 4:24 AM Kumud Kumar Srivatsava Tirupati < >>> kumudkumartirup...@gmail.com> wrote: >>> >>>> Hi Chris, >>>> Thanks for your comments. >>>> >>>> I am totally aligned with your comment on nested field names which >>>> include >>>> dots. I will incorporate the same based on how the KIP-821 discussion >>>> goes >>>> (maybe this parser could be a utility that can be reused in other areas >>>> as >>>> well). >>>> >>>> But, I still believe this logic of predicate checking shouldn't be made >>>> a >>>> part of each individual SMT. After all, that is what the predicates are >>>> for >>>> right? >>>> If you see, the current predicates that we have are: >>>> org.apache.kafka.connect.transforms.predicates.TopicNameMatches >>>> org.apache.kafka.connect.transforms.predicates.HasHeaderKey >>>> org.apache.kafka.connect.transforms.predicates.RecordIsTombstone >>>> They only allow you filter/transform based on the metadata of the >>>> record. >>>> >>>> `HasField` predicate is something very similar but is more powerful >>>> because >>>> it allows users to skip transformations at SMT level or drop them from >>>> the >>>> transformation chain altogether using the current ` >>>> org.apache.kafka.connect.transforms.Filter`. So, the use cases are not >>>> just >>>> limited to skipping some SMTs. >>>> If this makes sense, I should probably add this and give an example in >>>> the >>>> KIP as well. >>>> >>>> *---* >>>> *Thanks and Regards,* >>>> *Kumud Kumar Srivatsava Tirupati* >>>> >>>> >>>> On Wed, 1 Jun 2022 at 07:09, Chris Egerton <fearthecel...@gmail.com> >>>> wrote: >>>> >>>> > Hi Kumud, >>>> > >>>> > Thanks for the KIP. I'm a little bit skeptical about the necessity >>>> for this >>>> > predicate but I think we may be able to satisfy your requirements >>>> with a >>>> > slightly different approach. The motivation section deals largely with >>>> > skipping the invocation of SMTs that expect a certain field to be >>>> present >>>> > in a record, and will fail if it is not present. This seems like a >>>> > reasonable use case; even when your data has a fixed schema, optional >>>> > fields are still possible, and preventing SMTs from being used on >>>> optional >>>> > fields seems unnecessarily restrictive. In fact, it seems like such a >>>> > reasonable use case that I wonder if it'd be worth investing in new >>>> > SMT-level properties to handle this case, instead of requiring users >>>> to >>>> > configure a predicate separately alongside them? Something like an >>>> > "on.field.missing" property with options of "skip", "fail", and/or >>>> "warn" >>>> > could do the trick. >>>> > >>>> > It's also worth noting that the proposed syntax for the "field.path" >>>> > property in the HasField predicate would make it impossible to refer >>>> to >>>> > field names that have dots in them. It hasn't been finalized yet but >>>> if we >>>> > do end up going this route, we should probably use the same syntax >>>> that's >>>> > agreed upon for KIP-821 [1]. >>>> > >>>> > [1] - >>>> > >>>> > >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures >>>> > >>>> > Cheers, >>>> > >>>> > Chris >>>> > >>>> > On Thu, May 26, 2022 at 5:28 AM Sagar <sagarmeansoc...@gmail.com> >>>> wrote: >>>> > >>>> > > Hi Kumud, >>>> > > >>>> > > Thanks for that. I don't have any other comments at this point on >>>> the >>>> > KIP. >>>> > > LGTM overall. >>>> > > >>>> > > Thanks! >>>> > > Sagar. >>>> > > >>>> > > On Wed, May 25, 2022 at 5:14 PM Sagar <sagarmeansoc...@gmail.com> >>>> wrote: >>>> > > >>>> > > > Thanks for the KIP Kumud. >>>> > > > >>>> > > > Can you please add a couple of examples on how this would behave >>>> with >>>> > > > different combinations. I think that way it would be easier to >>>> > > understand. >>>> > > > >>>> > > > Thanks! >>>> > > > Sagar. >>>> > > > >>>> > > > On Wed, May 25, 2022 at 4:59 PM Kumud Kumar Srivatsava Tirupati < >>>> > > > kumudkumartirup...@gmail.com> wrote: >>>> > > > >>>> > > >> Hi all, >>>> > > >> I have written a KIP for having a new HasField predicate for >>>> kafka >>>> > > connect >>>> > > >> transforms and would like to start a discussion. Please share >>>> your >>>> > > >> thoughts >>>> > > >> on the same. >>>> > > >> >>>> > > >> >>>> > > >> >>>> > > >>>> > >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-845%3A+%27HasField%27+predicate+for+kafka+connect >>>> > > >> >>>> > > >> *---* >>>> > > >> *Thanks and Regards,* >>>> > > >> *Kumud Kumar Srivatsava Tirupati* >>>> > > >> >>>> > > > >>>> > > >>>> > >>>> >>>