Re: Loading multiple files, each file as a record

2015-03-10 Thread Ronald Green
Thanks for your suggestion. I wouldn't want to run a MapReduce job just to get the file into a single tuple. Also, I can't be sure the lines within the group stay sorted in the same order as they are in the file. Thanks On 10 March 2015 at 06:39, Arvind S wrote: > while loading file you

Re: Loading multiple files, each file as a record

2015-03-09 Thread Arvind S
While loading the files, you can try PigStorage(',','-tagFile'), then apply the regex to each line of the file, then group by file name. https://pig.apache.org/docs/r0.14.0/api/org/apache/pig/builtin/PigStorage.html *Cheers !!* Arvind On Fri, Mar 6, 2015 at 2:26 AM, Daniel Dai wrote: > Didn't reali
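
A rough sketch of how the -tagFile suggestion might look in a script. The relation names, the field names, and the regex '([0-9]+)' are hypothetical placeholders, and the ',' delimiter from the suggestion only keeps each line in one field if the lines contain no commas:

```pig
-- '-tagFile' makes PigStorage prepend the source file name to every tuple.
-- Assumption: lines contain no ',' so each line arrives as a single field;
-- otherwise pick a delimiter that never occurs in the data.
lines = LOAD '/files' USING PigStorage(',', '-tagFile')
        AS (filename:chararray, line:chararray);

-- Apply the regex per line; '([0-9]+)' is only a placeholder pattern.
matched = FOREACH lines GENERATE filename,
          REGEX_EXTRACT(line, '([0-9]+)', 1) AS hit;

-- Collect the per-line results back into one tuple per file.
by_file  = GROUP matched BY filename;
per_file = FOREACH by_file GENERATE group AS filename, matched.hit AS hits;
```

The GROUP step runs as a MapReduce job, and the bag it produces is unordered, so the hits for a file are not guaranteed to come back in the file's original line order.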

Re: Loading multiple files, each file as a record

2015-03-05 Thread Daniel Dai
Didn't realize any, but it should be pretty easy to write a customized Loader/InputFormat for that. Daniel On 3/5/15, 6:18 AM, "Ronald Green" wrote: >Hi, > >I'm looking for a loader function that will let me read each file as a >record on its own so I'll be able to treat each as a single record

Loading multiple files, each file as a record

2015-03-05 Thread Ronald Green
Hi, I'm looking for a loader function that will let me read each file as a record on its own so I'll be able to treat each as a single record/field. For example: a = load '/files' USING TheLoader() as (file:chararray); b = foreach a GENERATE REGEX_EXTRACT(file,'...'); PigStorage and TextLoader r