Hi,

I have data that has already been processed into the following form:


(id, {bag of words})
So for example:

(foobar, {(foo),(foo),(foobar),(bar)})
(foo, {(bar),(bar)})

and so on.

Running "describe processed" gives me:
processed: {id: chararray,tokens: {tuple_of_tokens: (token: chararray)}}


What I want now is to count the number of times each word appears in each
id's bag and output it as:
foobar,foo,2
foobar,foobar,1
foobar,bar,1
foo,bar,2

and so on...
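
In case the intent is unclear, here is a rough sketch of the transformation in plain Python rather than Pig. It is only meant to pin down the expected result, not to suggest an implementation; the function name count_tokens and the list-of-tuples layout are made up for this illustration, and each bag is flattened to a simple list of strings.

```python
from collections import Counter


def count_tokens(records):
    """Return (id, token, count) rows, one per distinct token in each id's bag."""
    rows = []
    for doc_id, tokens in records:
        # Counter keeps tokens in order of first appearance (Python 3.7+).
        for token, n in Counter(tokens).items():
            rows.append((doc_id, token, n))
    return rows


# Hypothetical stand-in for the "processed" relation shown above.
processed = [
    ("foobar", ["foo", "foo", "foobar", "bar"]),
    ("foo", ["bar", "bar"]),
]

for row in count_tokens(processed):
    print(",".join(str(field) for field in row))
```

So the counts are per id, not global across the whole data set.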

How do I do this in Pig?
