[ 
https://issues.apache.org/jira/browse/NIFI-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653556#comment-16653556
 ] 

ASF GitHub Bot commented on NIFI-5706:
--------------------------------------

Github user bbende commented on the issue:

    https://github.com/apache/nifi/pull/3079
  
    @MikeThomsen the advantage is not having to write the Parquet data out to 
local disk somewhere, and then have a disconnected flow where another part 
reads it back in. With this approach the data stays in NiFi's repositories and 
ConvertAvroToParquet can be directly connected to the next processor.
    
    I will try to give this a review in the coming days... Since this processor 
is already developed, the best approach here may be to commit it as is 
(assuming the processor works as expected) and then afterwards refactor the 
utility classes into a nifi-parquet-utils module under 
nifi-nar-bundles/nifi-extension-utils, so that the code can be shared by this 
processor and the eventual record writer.
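
    To illustrate the in-memory conversion the comment above describes (content 
stays in NiFi's repositories rather than round-tripping through local disk), 
here is a minimal standalone sketch. It is not NiFi code; it assumes the 
fastavro and pyarrow libraries and simply shows Avro bytes being re-encoded as 
Parquet bytes without touching the filesystem:

    ```python
    # Illustrative sketch (not the NiFi processor): convert an Avro container
    # file held in memory into Parquet bytes held in memory, analogous to how
    # ConvertAvroToParquet writes the result back into the flowfile content.
    # Assumes the fastavro and pyarrow libraries are installed.
    import io

    import fastavro
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Hypothetical record schema, purely for demonstration.
    SCHEMA = {
        "type": "record",
        "name": "User",
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "age", "type": "int"},
        ],
    }

    def avro_to_parquet(avro_bytes: bytes) -> bytes:
        """Decode an Avro container file and re-encode it as Parquet, in memory."""
        rows = list(fastavro.reader(io.BytesIO(avro_bytes)))
        table = pa.Table.from_pylist(rows)
        out = io.BytesIO()
        pq.write_table(table, out)
        return out.getvalue()

    # Build a small Avro payload in memory, the way flowfile content would arrive.
    buf = io.BytesIO()
    fastavro.writer(buf, SCHEMA, [{"name": "alice", "age": 30},
                                  {"name": "bob", "age": 25}])
    parquet_bytes = avro_to_parquet(buf.getvalue())

    # The Parquet bytes can now be handed to any downstream sink (local, S3, ADLS).
    round_trip = pq.read_table(io.BytesIO(parquet_bytes)).to_pylist()
    ```

    The point of the sketch is the shape of the dataflow: bytes in, bytes out, 
with no intermediate file, which is what lets the processor sit directly in 
front of PutS3Object or similar.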


> Processor ConvertAvroToParquet 
> -------------------------------
>
>                 Key: NIFI-5706
>                 URL: https://issues.apache.org/jira/browse/NIFI-5706
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>    Affects Versions: 1.7.1
>            Reporter: Mohit
>            Priority: Major
>              Labels: pull-request-available
>
> *Why*?
> PutParquet support is limited to HDFS. 
> PutParquet bypasses the _flowfile_ implementation and writes the file 
> directly to the sink. 
> We need a processor for Parquet that works like _ConvertAvroToOrc_.
> *What*?
> _ConvertAvroToParquet_ will convert the incoming Avro flowfile to a Parquet 
> flowfile. Unlike PutParquet, which writes to the HDFS file system, 
> ConvertAvroToParquet would write into the flowfile, which can then be 
> pipelined to other sinks, such as _local disk_, _S3_, or _Azure Data Lake_.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
