exceptionfactory commented on PR #10053: URL: https://github.com/apache/nifi/pull/10053#issuecomment-3079218340
Thanks for the substantive review of options and the current discussion @dariuszseweryn, that is helpful framing.

Reviewing the options, and considering similar issues in the `ConsumeKafka` Processor, the use case related to an embedded schema is a good candidate for consideration. Solving that scenario could also benefit the schema inference scenario.

Here is an additional approach to consider:

The `ConsumeKafka` Processor produces different output FlowFiles when records have different associated Schemas. However, instead of checking whether a Record Schema is compatible with the Schema from the first Record, it groups output based on Record Schema equality. It still sends invalid records to a `parse.failure` Relationship, but it provides a logical grouping based on common Record Schema references. This fits the scenario of embedded schema references, and can also work for scenarios where schema inference produces different Record Schema results for different Records.

This approach would also avoid some of the performance concerns related to evaluating a `KinesisRecord` multiple times, and it avoids the need for changes to Record Writers.

If that sounds viable, it is probably worth creating a new pull request, since it presents a more substantive direction than the initial focus here on improving error handling. That said, such an approach seems to get closer to resolving the core issues.
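As a rough illustration of grouping by Record Schema equality, here is a minimal sketch. It is not the `ConsumeKafka` implementation and does not use NiFi classes: `SchemaGrouping`, `ParsedRecord`, and `Grouped` are hypothetical names, and a schema is modeled as a plain map of field name to type name so that `Map.equals` stands in for schema equality. A `null` schema stands in for a record that could not be parsed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT NiFi classes.
public class SchemaGrouping {

    /** A parsed record. The schema maps field name to type name; null marks a parse failure. */
    public static final class ParsedRecord {
        public final Map<String, String> schema;
        public final String payload;

        public ParsedRecord(Map<String, String> schema, String payload) {
            this.schema = schema;
            this.payload = payload;
        }
    }

    /** One group per distinct schema, plus the records that failed to parse. */
    public static final class Grouped {
        // LinkedHashMap keeps groups in the order their schema was first seen.
        public final Map<Map<String, String>, List<ParsedRecord>> bySchema = new LinkedHashMap<>();
        public final List<ParsedRecord> parseFailures = new ArrayList<>();
    }

    /**
     * Visits each record once. Records with equal schemas land in the same group;
     * no compatibility check against the first record's schema is performed.
     */
    public static Grouped group(List<ParsedRecord> records) {
        Grouped result = new Grouped();
        for (ParsedRecord record : records) {
            if (record.schema == null) {
                // Analogous to routing to a parse failure relationship.
                result.parseFailures.add(record);
                continue;
            }
            result.bySchema
                    .computeIfAbsent(record.schema, schema -> new ArrayList<>())
                    .add(record);
        }
        return result;
    }
}
```

In this sketch each group would correspond to one output FlowFile, written with the schema that keys the group, so the writer never sees a record whose schema differs from the one it was opened with.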
