I would like some feedback on the best way to implement the following:
In my document table I use documentid as the row id, and each row
stores content:author and content:text. I want to process all documents
belonging to each author in a map-reduce job, i.e. my map will take
key=author and values="all documentids for that author". But for
this I would first have to find all distinct authors and store them in
another table, then run the map-reduce job on that second table. Am I
thinking in the right direction, or is there a better way to achieve
this?
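
To make the target concrete, here is a minimal sketch of the grouping I
have in mind, written in plain Python rather than against the HBase or
Hadoop APIs. The sample rows and the names map_row and group_by_author
are made up for illustration; only the column names content:author and
content:text come from my table.

```python
from collections import defaultdict

# Hypothetical sample of the document table: documentid -> columns.
ROWS = {
    "doc1": {"content:author": "alice", "content:text": "first text"},
    "doc2": {"content:author": "bob", "content:text": "second text"},
    "doc3": {"content:author": "alice", "content:text": "third text"},
}


def map_row(document_id, columns):
    """Emit one (author, documentid) pair for a single table row."""
    yield columns["content:author"], document_id


def group_by_author(rows):
    """Collect the emitted pairs into author -> list of documentids."""
    grouped = defaultdict(list)
    for document_id, columns in rows.items():
        for author, emitted_id in map_row(document_id, columns):
            grouped[author].append(emitted_id)
    return dict(grouped)


if __name__ == "__main__":
    for author, document_ids in group_by_author(ROWS).items():
        print(author, document_ids)
```

The result I want per author is the list of documentids, which is what
group_by_author returns here for the sample rows.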
- Rohit Kelkar
