[ https://issues.apache.org/jira/browse/PIG-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603652#comment-13603652 ]
Michael Kramer commented on PIG-3223:
-------------------------------------

[~cheolsoo], thanks for getting back to me so quickly!

We're using variable substitution and input path generation via Oozie Coordinator. We include the hdfs://namenode:8020 at the beginning of our path templates, which I think is pretty standard (e.g. something like <uri-template>${nameNode}/data/</uri-template>).

When Oozie constructs input paths to be passed to the pig script or map reduce job, it enumerates the paths via a comma separated list, something like hdfs://namenode:8020/data/1,hdfs://namenode:8020/data/2. This is how we figured out AvroStorage was breaking in the first place.

A good coordinator/workflow example that is indicative of the types of workflows we're running can be found in the Oozie source examples:
https://github.com/apache/oozie/blob/trunk/examples/src/main/apps/aggregator/coordinator.xml

> AvroStorage does not handle comma separated input paths
> -------------------------------------------------------
>
>                 Key: PIG-3223
>                 URL: https://issues.apache.org/jira/browse/PIG-3223
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>   Affects Versions: 0.10.0, 0.11
>            Reporter: Michael Kramer
>            Assignee: Johnny Zhang
>         Attachments: AvroStorage.patch, AvroStorage.patch-2, AvroStorageUtils.patch, AvroStorageUtils.patch-2, PIG-3223.patch.txt
>
>
> In pig 0.11, a patch was issued to AvroStorage to support globs and comma separated input paths (PIG-2492). While this function works fine for glob-formatted input paths, it fails when issued a standard comma separated list of paths. fs.globStatus does not seem to be able to parse out such a list, and a java.net.URISyntaxException is thrown when toURI is called on the path.
> I have a working fix for this, but it's extremely ugly (basically checking if the string of input paths is globbed, otherwise splitting on ","). I'm sure there's a more elegant solution. I'd be happy to post the relevant methods and "fixes" if necessary.
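
For readers following the issue, the workaround described in the report (check whether the input string is globbed, otherwise split on ",") could look roughly like the sketch below. This is not the code from any of the attached patches: the class name InputPathSplitter and the methods looksLikeGlob and splitInputPaths are hypothetical, and the set of characters treated as glob metacharacters is an assumption. It uses plain strings only, so it runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the "check for glob, else split on comma"
// workaround described in the report. Not taken from the attached patches.
final class InputPathSplitter {

    // Assumed set of glob metacharacters; adjust to match the file system in use.
    private static final String GLOB_CHARS = "*?[]{}";

    /** Returns true if the location contains any assumed glob metacharacter. */
    public static boolean looksLikeGlob(String location) {
        for (int i = 0; i < location.length(); i++) {
            if (GLOB_CHARS.indexOf(location.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns the location unchanged (as a single entry) when it looks like a
     * glob, so it can still be expanded by a glob-aware call; otherwise splits
     * it on commas and drops empty entries.
     */
    public static List<String> splitInputPaths(String location) {
        List<String> paths = new ArrayList<>();
        if (looksLikeGlob(location)) {
            paths.add(location);
            return paths;
        }
        for (String part : location.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }
}
```

With the Oozie-style input from the comment, splitInputPaths("hdfs://namenode:8020/data/1,hdfs://namenode:8020/data/2") yields two separate entries, while a pattern such as hdfs://namenode:8020/data/{1,2} is left as one entry. One limitation of this sketch, which is probably part of why the reporter calls the approach ugly: a comma separated list in which some entries are themselves globs is not split at all.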