[ https://issues.apache.org/jira/browse/PIG-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236730#comment-13236730 ]
Scott Carey commented on PIG-2492:
----------------------------------
Something seems way off here.
I have a custom LoadFunc for Avro (very different feature set, built before
AvroStorage).
It has worked with globs since the beginning, with only this:
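{code}
@Override
public void setLocation(String location, Job job) throws IOException {
    // Pass the raw location string through; FileInputFormat handles
    // comma-separated paths and glob patterns itself.
    FileInputFormat.setInputPaths(job, location);
}
{code}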
This is much, much simpler.
This also solves the "only works with *.avro" file issue, but it changes the
syntax you would need in the LOAD statement.
In my scripts, I might do something like:
A = LOAD '/events/2012/03/23/{views,clicks}/*.avro' using MyCustomStorageFunc();
In other words, use the glob support in the well-tested FileInputFormat to find
files; don't reimplement it in your LoadFunc.
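To illustrate why a location string like the one above can be passed through
untouched: as far as I know, FileInputFormat.setInputPaths splits its argument
on commas that fall outside curly braces, and the globs are expanded later when
the input files are listed. The following is a stdlib-only sketch of that
splitting rule, not Hadoop's actual code. The names PathSplitSketch and
splitPaths, and the second path (/events/2012/03/24/...), are hypothetical and
only there for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PathSplitSketch {

    /**
     * Splits a location string on commas that are NOT inside a {...} glob
     * group. Illustrative only; assumed to mirror how FileInputFormat treats
     * comma-separated input paths.
     */
    public static List<String> splitPaths(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        int depth = 0;  // nesting level of '{' seen so far
        int start = 0;  // index where the current path begins
        for (int i = 0; i < commaSeparatedPaths.length(); i++) {
            char ch = commaSeparatedPaths.charAt(i);
            if (ch == '{') {
                depth++;
            } else if (ch == '}' && depth > 0) {
                depth--;
            } else if (ch == ',' && depth == 0) {
                // A top-level comma ends the current path.
                paths.add(commaSeparatedPaths.substring(start, i));
                start = i + 1;
            }
        }
        // Whatever follows the last top-level comma is the final path.
        paths.add(commaSeparatedPaths.substring(start));
        return paths;
    }

    public static void main(String[] args) {
        // Hypothetical location: one brace glob plus a second, made-up path.
        String location =
            "/events/2012/03/23/{views,clicks}/*.avro,/events/2012/03/24/*.avro";
        for (String path : splitPaths(location)) {
            System.out.println(path);
        }
    }
}
```

Run as-is, this prints two paths: the comma inside {views,clicks} stays part of
the first glob, and only the top-level comma separates the inputs. That is the
behavior a LoadFunc gets for free by delegating to FileInputFormat.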
> AvroStorage should recognize globs and commas
> ---------------------------------------------
>
> Key: PIG-2492
> URL: https://issues.apache.org/jira/browse/PIG-2492
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Affects Versions: 0.9.1
> Reporter: Stan Rosenberg
> Attachments: AvroStorage.patch, AvroStorageUtils.patch
>
>
> I've patched AvroStorage and AvroStorageUtils to support the same file input
> syntax as currently supported
> by hadoop's FileInputFormat. Specifically, globs and commas are supported.
> Somebody should write some unit tests for these changes; I am currently
> pressed for time.