[ https://issues.apache.org/jira/browse/NIFI-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063201#comment-16063201 ]
ASF GitHub Bot commented on NIFI-4004:
--------------------------------------
Github user markap14 commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1877#discussion_r124031794
--- Diff: nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-standard-record-utils/src/main/java/org/apache/nifi/schema/access/SchemaAccessStrategy.java ---
@@ -38,4 +40,11 @@
      * @return the set of all Schema Fields that are supplied by the RecordSchema that is returned from {@link #getSchema(FlowFile, InputStream)}.
      */
     Set<SchemaField> getSuppliedSchemaFields();
+
+    /**
+     * @return Whether this factory needs an incoming FlowFile to resolve Record Schema.
+     */
+    default boolean isFlowFileRequired() {
--- End diff --
@ijokarumawak I very much agree with the idea behind this JIRA. However, I
think I would approach the solution in a somewhat different manner. Rather than
having a boolean indicate whether or not a FlowFile is required, I would
recommend we get rid of the FlowFile being passed in to get the schema and
instead just pass in a Map<String, String>. This way, if you have no FlowFile
you can just pass in an empty Map. It also allows for more flexibility so that
you can pass in a Map<String, String> where the schema.name may be set by the
processor, for instance, instead of an empty Map.
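To make the suggestion concrete, here is a rough sketch of what a Map-based lookup might look like. It is not the actual NiFi API: `VariableBasedSchemaAccess`, `SchemaNameLookup`, and the fallback-to-default behavior are hypothetical names and assumptions chosen only for illustration. Only the `schema.name` key and the `Map<String, String>` argument come from the comment above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a schema access strategy; NOT the real NiFi interface.
interface VariableBasedSchemaAccess {
    // Resolves a schema name from plain attribute-style variables instead of a FlowFile.
    String resolveSchemaName(Map<String, String> variables);
}

// Reads the "schema.name" variable; falling back to a default is an assumption of this sketch.
final class SchemaNameLookup implements VariableBasedSchemaAccess {
    private final String defaultName;

    SchemaNameLookup(String defaultName) {
        this.defaultName = defaultName;
    }

    @Override
    public String resolveSchemaName(Map<String, String> variables) {
        return variables.getOrDefault("schema.name", defaultName);
    }
}

final class SchemaLookupDemo {
    public static void main(String[] args) {
        VariableBasedSchemaAccess access = new SchemaNameLookup("fallback");

        // A source processor with no incoming FlowFile passes an empty Map.
        System.out.println(access.resolveSchemaName(Collections.<String, String>emptyMap()));

        // A processor can also set schema.name itself before the lookup.
        Map<String, String> variables = new HashMap<>();
        variables.put("schema.name", "orders");
        System.out.println(access.resolveSchemaName(variables));
    }
}
```

A caller that does have a FlowFile could presumably pass its attribute map, so the same method would serve both cases without a separate boolean flag.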
> Refactor RecordReaderFactory and SchemaAccessStrategy to be used without
> incoming FlowFile
> ------------------------------------------------------------------------------------------
>
> Key: NIFI-4004
> URL: https://issues.apache.org/jira/browse/NIFI-4004
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Affects Versions: 1.2.0
> Reporter: Koji Kawamura
> Assignee: Koji Kawamura
>
> The current RecordReaderFactory and SchemaAccessStrategy implementations
> assume that an incoming FlowFile is always available, and use it to resolve
> the Record Schema.
> That is fine for components that convert or update incoming FlowFiles.
> However, other components do not have any incoming FlowFiles, for example
> ConsumeKafkaRecord_0_10. Typically, components that fetch data from an
> external system do not have an incoming FlowFile, and the current API does
> not fit them well because it requires one.
> In fact, [ConsumeKafkaRecord creates a temporary
> FlowFile|https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kafka-bundle/nifi-kafka-0-10-processors/src/main/java/org/apache/nifi/processors/kafka/pubsub/ConsumerLease.java#L426]
> only to get the RecordSchema. This should be avoided, as we expect more
> components to start using the Record reader mechanism.
> This JIRA proposes refactoring the current API to allow accessing
> RecordReaders without needing an incoming FlowFile.
> Additionally, since there is a Schema Access Strategy that requires an
> incoming FlowFile containing attribute values to access the schema registry,
> it would be useful if we could tell the user, when such a RecordReader is
> specified, that it cannot be used.