Hmm, so

 LOAD 'folder/*KeyWord*'

So Pig is able to use wildcards?
I'll have to try that!



On Fri, Oct 3, 2014 at 5:02 PM, Andrew Oliver <[email protected]> wrote:

> If you find you need something smarter than pig's handling, you can use a
> shell script and pass in at the commandline: -param myfile=somefile and put
> $myfile in your load statement.
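
A minimal sketch of that wrapper idea, as a dry run that only prints the pig command rather than running it. The script name load.pig and the helper build_pig_cmd are hypothetical, and it assumes load.pig contains a line such as table = LOAD '$myfile' USING ...; so that Pig's parameter substitution fills in the path.

```shell
#!/bin/sh
# Dry-run sketch: prints the pig command instead of executing it.
# Assumptions (not from the thread): the script name load.pig, and that it
# contains a line such as  table = LOAD '$myfile' USING ...;

# Build the pig invocation for a given path or glob pattern.
# The pattern is wrapped in single quotes in the printed command so that a
# shell running it would pass the glob to pig unexpanded.
build_pig_cmd() {
  printf "pig -param myfile='%s' load.pig\n" "$1"
}

keyword="${1:-site}"   # keyword to match in file names, defaults to "site"
build_pig_cmd "folder/*${keyword}*"
```

Any smarter file selection (dates, exclusions, and so on) would go where the pattern is built; only the final string reaches the load statement.
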
> On Oct 3, 2014 1:58 PM, "hanif mahboobi" <[email protected]>
> wrote:
>
> >
> >
> > Sure,
> >
> > table = LOAD 'folder/*KeyWord*' USING
> > org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'YES_MULTILINE',
> > 'NOCHANGE', 'SKIP_INPUT_HEADER') AS (rec: chararray);
> >
> >
> >
> > On Friday, October 3, 2014 4:51 PM, Bob Metelsky <[email protected]> wrote:
> >
> >
> >
> > can you post what you did/used?
> >
> >
> > On Fri, Oct 3, 2014 at 4:41 PM, hanif mahboobi <
> > [email protected]> wrote:
> >
> > > Hi Praveen,
> > >
> > > Thanks for the reply.
> > > In fact, after some minor debugging, I got it working and it could read
> > > the files in using the glob pattern.
> > >
> > > Best,
> > > Hanif
> > >
> > >
> > >
> > > On Friday, October 3, 2014 3:08 AM, Praveen R <
> > > [email protected]> wrote:
> > >
> > >
> > >
> > > Looks like Pig load doesn't support glob patterns.
> > >
> > > I guess you would need to write a custom loader to achieve this.
> > >
> > >
> > > On Fri, Oct 3, 2014 at 3:25 AM, hanif mahboobi <
> > > [email protected]> wrote:
> > >
> > > > Hi There,
> > > >
> > > > Here is my problem.
> > > > I have a folder with thousands of files in it. I just want to load a
> > > > certain subset of them which have a specific string in their names
> > > > (in this example "site").
> > > >
> > > > Knowing that something like this does not work:
> > > > table = LOAD 'folder/*site*' USING ...
> > > >
> > > > Can anybody help with that?
> > > >
> > > > Thanks,
> > > > Hanif
>
