Olga Natkovich
Wed, 04 Jun 2008 14:21:28 -0700
Tom, You are correct that currently we only allow a single glob in the load statement. It would not be hard to extend it to multiple globs. I have created a JIRA for it: https://issues.apache.org/jira/browse/PIG-252; maybe somebody will be interested to contribute a patch. Olga > -----Original Message----- > From: Tom White [EMAIL PROTECTED] > Sent: Wednesday, June 04, 2008 7:56 AM > To: pig-user@incubator.apache.org > Subject: Specifying multiple input paths > > I;m having a problem loading data from multiple paths in Pig. > What I'm trying to do is to load data from a range of dates, > so I would like to specify an input of two globbed paths: > > x = LOAD '2008/05/{26,27,28,29,30,31},2008/06/{1,2}' > > Pig doesn't seem to like this though as it's trying to > interpret it as a single path. The best I can do it to use UNION: > > x1 = LOAD '2008/05/{26,27,28,29,30,31}' > x2 = LOAD '2008/06/{1,2}' > x = UNION x1, x2 > > The downside to this is that I want to parameterize my paths, > and having separate script for each number of paths in the > input is cumbersome. > > Is there a better way of doing this? Are there any plans to > support multiple paths, and/or PathFilters? > > Thanks, > > Tom >