[
https://issues.apache.org/jira/browse/NIFI-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991449#comment-15991449
]
ASF subversion and git services commented on NIFI-3724:
-------------------------------------------------------
Commit 60d88b5a64286776ffc2c9088905ac78a1a56786 in nifi's branch
refs/heads/master from [~bbende]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=60d88b5 ]
NIFI-3724 - Initial commit of Parquet bundle with PutParquet and FetchParquet
- Creating nifi-records-utils to share utility code from record services
- Refactoring Parquet tests to use MockRecorderParser and MockRecordWriter
- Refactoring AbstractPutHDFSRecord to use schema access strategy
- Adding custom validate to AbstractPutHDFSRecord and adding handling of UNION
types when writing Records as Avro
- Refactoring project structure to get CS API references out of nifi-commons,
introducing nifi-extension-utils under nifi-nar-bundles
- Updating abstract put/fetch processors to obtain the WriteResult and update
flow file attributes
This closes #1712.
Signed-off-by: Andy LoPresto <[email protected]>
> Add Put/Fetch Parquet Processors
> --------------------------------
>
> Key: NIFI-3724
> URL: https://issues.apache.org/jira/browse/NIFI-3724
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Bryan Bende
> Assignee: Bryan Bende
> Priority: Minor
> Fix For: 1.2.0
>
>
> Now that we have the record reader/writer services currently in master, it
> would be nice to have reader and writers for Parquet. Since Parquet's API is
> based on the Hadoop Path object, and not InputStreams/OutputStreams, we can't
> really implement direct conversions to and from Parquet in the middle of a
> flow, but we can we can perform the conversion by taking any record format
> and writing to a Path as Parquet, or reading Parquet from a Path and writing
> it out as another record format.
> We should add a PutParquet that uses a record reader and writes records to a
> Path as Parquet, and a FetchParquet that reads Parquet from a path and writes
> out records to a flow file using a record writer.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)