[
https://issues.apache.org/jira/browse/NIFI-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236481#comment-16236481
]
ASF GitHub Bot commented on NIFI-3724:
--------------------------------------
Github user nellashapiro123 commented on the issue:
https://github.com/apache/nifi/pull/1712
Has anybody been able to use the FetchParquet processor successfully? I am
getting a SchemaNotFound exception. I created the file with PutParquet, and
Spark can read the resulting Parquet file.
> Add Put/Fetch Parquet Processors
> --------------------------------
>
> Key: NIFI-3724
> URL: https://issues.apache.org/jira/browse/NIFI-3724
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Bryan Bende
> Assignee: Bryan Bende
> Priority: Minor
> Fix For: 1.2.0
>
>
> Now that we have the record reader/writer services currently in master, it
> would be nice to have readers and writers for Parquet. Since Parquet's API
> is based on the Hadoop Path object rather than InputStreams/OutputStreams,
> we can't really implement direct conversions to and from Parquet in the
> middle of a flow, but we can perform the conversion by taking any record
> format and writing it to a Path as Parquet, or reading Parquet from a Path
> and writing it out as another record format.
> We should add a PutParquet that uses a record reader and writes records to a
> Path as Parquet, and a FetchParquet that reads Parquet from a Path and
> writes out records to a flow file using a record writer.
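The Path-based constraint described above can be sketched with the parquet-avro library directly (the class and method names below are real parquet-avro/Hadoop APIs, but the schema, record values, and output path are illustrative; NiFi's actual processors wrap this behind the record reader/writer services, and this sketch requires parquet-avro and hadoop-client on the classpath):

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class ParquetPathWriteSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative Avro schema standing in for whatever the record reader produced.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
            + "{\"name\":\"name\",\"type\":\"string\"}]}");

        // Parquet's writer is constructed from a Hadoop Path, not an
        // OutputStream -- the API constraint the issue description mentions,
        // and the reason the conversion happens at a Path rather than
        // mid-flow. Path here is illustrative.
        Path path = new Path("/tmp/users.parquet");
        try (ParquetWriter<GenericRecord> writer =
                 AvroParquetWriter.<GenericRecord>builder(path)
                     .withSchema(schema)
                     .build()) {
            GenericRecord rec = new GenericData.Record(schema);
            rec.put("name", "alice");
            writer.write(rec);
        }
        // FetchParquet does the inverse: open a ParquetReader on a Path and
        // hand each record to a configured record writer.
    }
}
```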
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)