[ https://issues.apache.org/jira/browse/CRUNCH-219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White updated CRUNCH-219:
-----------------------------
Attachment: CRUNCH-219.patch
Here's a new patch that adds multi-path support to all file-based inputs.
I haven't changed MaterializableIterable, and I'm not sure it needs changing,
since only Sources can have multiple paths; Targets and SourceTargets are still
single paths. For each of MapsideJoinStrategy, BloomFilterJoinStrategy, and
Sort, the PCollection being materialized is not an input collection, so it's a
SourceTarget (I think) and hence has a single path. (I'm not sure it's even
possible to give MaterializableIterable a getPaths() method, since
FilterKeysWithBloomFilterFn calls PType.getPath() with a single path to get a
SourceTarget.) Does this sound right to you, Josh, or am I missing something?
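To make the Source versus SourceTarget distinction concrete, here is a minimal
sketch in plain Java. The class names FileSource, FileTarget and
FileSourceTarget are hypothetical stand-ins, not the Crunch classes and not the
code in the patch. They only model the idea that a source may carry several
paths while a target, and anything materialized to one, carries exactly one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins, NOT the Crunch API and not taken from the patch.
public class MultiPathSketch {

    /** Read-only input: may span several files, directories, or a mix. */
    static final class FileSource {
        private final List<String> paths;

        FileSource(List<String> paths) {
            if (paths == null || paths.isEmpty()) {
                throw new IllegalArgumentException("a source needs at least one path");
            }
            // Defensive copy so callers cannot change the path list later.
            this.paths = Collections.unmodifiableList(new ArrayList<>(paths));
        }

        List<String> getPaths() {
            return paths;
        }
    }

    /** Output location: always exactly one path. */
    static final class FileTarget {
        private final String path;

        FileTarget(String path) {
            this.path = path;
        }

        String getPath() {
            return path;
        }
    }

    /** Written first, read back later (e.g. materialized data): one path. */
    static final class FileSourceTarget {
        private final FileTarget target;

        FileSourceTarget(String path) {
            this.target = new FileTarget(path);
        }

        String getPath() {
            return target.getPath();
        }

        /** Reading it back is just a source with a single-element path list. */
        FileSource asSource() {
            return new FileSource(Collections.singletonList(target.getPath()));
        }
    }

    public static void main(String[] args) {
        // Example paths are made up for illustration.
        FileSource input = new FileSource(Arrays.asList("/data/part-0.avro", "/data/archive"));
        FileSourceTarget materialized = new FileSourceTarget("/tmp/materialized");

        System.out.println("input paths: " + input.getPaths());
        System.out.println("materialized path: " + materialized.getPath());
    }
}
```

Under this model, anything that materializes intermediate output only ever
needs getPath(), which is the reasoning behind leaving MaterializableIterable
alone.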
> Support multiple paths in Avro source
> -------------------------------------
>
> Key: CRUNCH-219
> URL: https://issues.apache.org/jira/browse/CRUNCH-219
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Reporter: Tom White
> Assignee: Josh Wills
> Attachments: CRUNCH-219.patch, CRUNCH-219.patch
>
>
> It would be useful to be able to specify multiple paths (which may be files,
> or directories, or a combination of both) to read from in a source.