[ https://issues.apache.org/jira/browse/NIFI-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730985#comment-14730985 ]
Bryan Bende commented on NIFI-919:
----------------------------------
I've been experimenting with adding a new method to DataFileWriter that works
the same way as appendAllFrom(...), except that it lets you specify the number
of blocks to append:
https://issues.apache.org/jira/browse/AVRO-1726
I think if that gets into Avro, it would work nicely for splitting files based
on the number of blocks.
Instead of always treating the "Split Size" property as the number of records
(or approximate number), we would likely make it the number of blocks when
Split Strategy = Block, and the number of records when Split Strategy = Record.
I'm thinking the first pass of this processor should support only the Record
strategy, unless we figure out an approach that doesn't require adding
functionality to Avro.
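To make the two readings of "Split Size" concrete, here is a rough sketch in plain Java. It does not use the Avro API at all: a data file is modeled as a list of blocks, each block being a list of records, and the names SplitSizeSketch, SplitStrategy, and split are hypothetical, chosen only for illustration. The eventual processor may well behave differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model only: a "file" is a list of blocks, a block is a list of records. */
final class SplitSizeSketch {

    /** Assumed names for the two strategies discussed above. */
    enum SplitStrategy { RECORD, BLOCK }

    /**
     * Splits the input into output "files" (flat lists of records).
     * splitSize counts records for RECORD and whole blocks for BLOCK.
     */
    static <T> List<List<T>> split(List<List<T>> blocks, SplitStrategy strategy, int splitSize) {
        if (splitSize < 1) {
            throw new IllegalArgumentException("splitSize must be >= 1");
        }
        List<List<T>> outputs = new ArrayList<>();
        List<T> current = new ArrayList<>();
        int units = 0; // records (RECORD) or blocks (BLOCK) gathered into 'current'

        for (List<T> block : blocks) {
            if (strategy == SplitStrategy.BLOCK) {
                // Block strategy: a block is copied whole and never divided.
                current.addAll(block);
                units++;
                if (units == splitSize) {
                    outputs.add(current);
                    current = new ArrayList<>();
                    units = 0;
                }
            } else {
                // Record strategy: block boundaries are ignored.
                for (T record : block) {
                    current.add(record);
                    units++;
                    if (units == splitSize) {
                        outputs.add(current);
                        current = new ArrayList<>();
                        units = 0;
                    }
                }
            }
        }
        if (units > 0) {
            outputs.add(current); // trailing partial split
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<List<Integer>> blocks = List.of(List.of(1, 2, 3), List.of(4, 5), List.of(6));
        System.out.println(split(blocks, SplitStrategy.RECORD, 2));
        System.out.println(split(blocks, SplitStrategy.BLOCK, 2));
    }
}
```

With the Block strategy the number of records per output depends on how the original file was blocked, so output sizes are only approximate, whereas the Record strategy gives exact counts (apart from the last split) at the cost of handling every record individually.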
> Support Splitting Avro Files
> ----------------------------
>
> Key: NIFI-919
> URL: https://issues.apache.org/jira/browse/NIFI-919
> Project: Apache NiFi
> Issue Type: New Feature
> Reporter: Bryan Bende
> Assignee: Bryan Bende
> Priority: Minor
> Fix For: 0.4.0
>
>
> Provide a processor that splits an Avro file into multiple smaller files.
> Would be nice to have a configurable batch size so a user could produce
> single record files and also multi-record files of smaller size than the
> original. Also consider making the output format configurable, data file vs
> bare record.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)