Thanks for your suggestion.
I wouldn't want to run a map reduce job just to get the file in a
single tuple. Also, I can't be sure the lines within the group stay
sorted in the same order they are in the file.
Thanks
On 10 March 2015 at 06:39, Arvind S wrote:
While loading the files, you can try
PigStorage(',','-tagFile')
then apply the regex to each line of the file, then group by file name.
https://pig.apache.org/docs/r0.14.0/api/org/apache/pig/builtin/PigStorage.html
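
A rough sketch of that idea in Pig Latin follows. The aliases, the field names, and the regex pattern are assumptions made up for illustration, and the two-field schema assumes the lines contain no commas, so each line arrives as one field after the file name.

```pig
-- Sketch only: aliases, field names, and the regex pattern are illustrative assumptions.
-- '-tagFile' prepends the source file name as the first field of every record.
-- Assumes the lines contain no ',' so each line arrives as a single field.
a = LOAD '/files' USING PigStorage(',', '-tagFile')
    AS (filename:chararray, line:chararray);

-- Apply the regex per line; '(\\d+)' is a placeholder pattern, 1 is the capture group index.
b = FOREACH a GENERATE filename, REGEX_EXTRACT(line, '(\\d+)', 1) AS hit;

-- Collect the per-line results for each file. GROUP runs as a reduce step,
-- and the order of tuples inside each bag is not guaranteed to match the file.
c = GROUP b BY filename;
```

This yields one bag per file rather than the file contents as a single chararray field, so it only fits cases where the regex can be applied line by line.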
*Cheers !!*
Arvind
On Fri, Mar 6, 2015 at 2:26 AM, Daniel Dai wrote:
Didn't realize any, but it should be pretty easy to write a customized
Loader/InputFormat for that.
Daniel
On 3/5/15, 6:18 AM, "Ronald Green" wrote:
Hi,
I'm looking for a loader function that will let me read each file as a
record on its own so I'll be able to treat each as a single record/field.
For example:
a = load '/files' USING TheLoader() as (file:chararray);
b = foreach a GENERATE REGEX_EXTRACT(file,'...');
PigStorage and TextLoader r