[ 
https://issues.apache.org/jira/browse/NIFI-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131676#comment-16131676
 ] 

ASF GitHub Bot commented on NIFI-4004:
--------------------------------------

Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/1877
  
    @markap14 Thanks for the suggestion, I agree with it. I've made the 
following changes:
    
    - Added a default method to RecordReaderFactory so that existing 
processors don't need to change.
    - Reverted the changes to classes that can use the newly added default 
method: PutElasticsearchHttpRecord, AbstractPutHDFSRecord, PutParquetTes, 
PutDatabaseRecord, FlowFileEnumerator and FlowFileTable. This reduced the 
size of this PR a little.
    - Rebased onto the latest master and updated a few new classes to match 
the new RecordSetWriterFactory method signatures.
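
The default-method approach in the first bullet can be sketched roughly as follows. This is only an illustration under assumptions: `SimpleFlowFile`, `SimpleReader`, `SimpleReaderFactory` and the `schema.name` variable are hypothetical stand-ins, not the real NiFi interfaces or the signatures used in this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Collections;
import java.util.Map;

public class DefaultMethodSketch {

    /** Hypothetical stand-in for a FlowFile; only its attributes matter here. */
    interface SimpleFlowFile {
        Map<String, String> getAttributes();
    }

    /** Hypothetical stand-in for a record reader. */
    interface SimpleReader {
        String describe();
    }

    /** Hypothetical factory; NOT the real RecordReaderFactory signature. */
    interface SimpleReaderFactory {
        // Primary method: needs only a variable map, so sources without a FlowFile can call it.
        SimpleReader createReader(Map<String, String> variables, InputStream in);

        // Default method: existing FlowFile-based callers keep compiling unchanged.
        default SimpleReader createReader(SimpleFlowFile flowFile, InputStream in) {
            Map<String, String> variables = (flowFile == null)
                    ? Collections.<String, String>emptyMap()
                    : flowFile.getAttributes();
            return createReader(variables, in);
        }
    }

    // An implementation only has to supply the variables-based method.
    static final SimpleReaderFactory FACTORY =
            (variables, in) -> () -> "schema=" + variables.getOrDefault("schema.name", "none");

    /** Old-style call path: a FlowFile is available and its attributes are used. */
    public static String describeWithFlowFile() {
        SimpleFlowFile flowFile = () -> Collections.singletonMap("schema.name", "orders");
        return FACTORY.createReader(flowFile, emptyInput()).describe();
    }

    /** New-style call path: no FlowFile exists, e.g. when consuming from an external system. */
    public static String describeWithoutFlowFile() {
        Map<String, String> noVariables = Collections.emptyMap();
        return FACTORY.createReader(noVariables, emptyInput()).describe();
    }

    private static InputStream emptyInput() {
        return new ByteArrayInputStream(new byte[0]);
    }

    public static void main(String[] args) {
        System.out.println(describeWithFlowFile());
        System.out.println(describeWithoutFlowFile());
    }
}
```

In this sketch, callers that already hold a FlowFile go through the default method, while callers without one pass a variable map directly, so neither side needs a placeholder FlowFile.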
    
    The local contrib check passed without issue. I also tested a live flow 
with various record readers and writers: 
https://gist.github.com/ijokarumawak/a6c33eef30d0cd9786eab7eeacccb7ff
    
    I hope it's now ready to be merged, thanks!


> Refactor RecordReaderFactory and SchemaAccessStrategy to be used without 
> incoming FlowFile
> ------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4004
>                 URL: https://issues.apache.org/jira/browse/NIFI-4004
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.2.0
>            Reporter: Koji Kawamura
>            Assignee: Koji Kawamura
>
> The current RecordReaderFactory and SchemaAccessStrategy implementations 
> assume that an incoming FlowFile is always available, and use it to resolve 
> the Record Schema.
> That is fine for components that convert or update incoming FlowFiles. 
> However, other components do not have any incoming FlowFiles, for example 
> ConsumeKafkaRecord_0_10. Typically, components that fetch data from an 
> external system do not have an incoming FlowFile, and the current API 
> doesn't fit them well because it requires one.
> In fact, [ConsumeKafkaRecord creates a temporary 
> FlowFile|https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kafka-bundle/nifi-kafka-0-10-processors/src/main/java/org/apache/nifi/processors/kafka/pubsub/ConsumerLease.java#L426]
>  only to get the RecordSchema. This should be avoided, as we expect more 
> components to start using the Record reader mechanism.
> This JIRA proposes refactoring the current API to allow accessing 
> RecordReaders without needing an incoming FlowFile.
> Additionally, since there is a Schema Access Strategy that requires an 
> incoming FlowFile containing attribute values to access the schema registry, 
> it would be useful if we could tell the user, when such a RecordReader is 
> specified, that it can't be used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
