Sorry to hijack your question, Jonathan, but while we are at it... :) Is there a way to tell Pig NOT to add "base_alias::"? Almost half my code consists of FOREACH ... GENERATE statements that just remove these prefixes.

Thanks,
Anze

On Monday 06 December 2010, Daniel Dai wrote:
> After join, cross, foreach flatten, Pig will automatically add
> "base_alias::" prefix. All other cases use "."
>
> Daniel
>
> Jonathan Coveney wrote:
> > It's very hard to search for this among the docs because it's so generic,
> > so I thought I'd ask... I'm sure the answer is painfully easy.
> >
> > Taking a look at this code that I found online, for example
> >
> > --
> > -- Read in a bag of tuples (timeseries for this example) and divide the
> > -- numeric column by its maximum.
> > --
> > %default DATABAG 'data/timeseries.tsv'
> >
> > data = LOAD '$DATABAG' AS (month:chararray, count:int);
> > accumulate = GROUP data ALL;
> > calc_max = FOREACH accumulate GENERATE FLATTEN(data),
> > MAX(data.count) AS max_count;
> > normalize = FOREACH calc_max GENERATE data::month AS month,
> > data::count AS count, (float)data::count / (float)max_count AS
> > normed_count;
> > DUMP normalize;
> >
> > What purpose does data::month serve versus data.count?
> >
> > Thanks
