Hello all,

Ted: I'm not quite sure I understand your suggestion. Co-occurrence modeling would be limited to finding the most interesting pairs. If you have a follow-up link that elaborates on item sets extending beyond pairs (cardinality > 2), that would be helpful.
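For item sets beyond pairs, the naive approach from the original question (use MR simply to accumulate itemset counts) might look roughly like the sketch below. It simulates a single map and reduce pass in plain Python rather than on Hadoop. The function names, the toy transactions, and the `min_support` threshold are all hypothetical, and no Apriori-style candidate pruning is attempted, so the number of emitted keys grows combinatorially with the transaction length.

```python
from collections import Counter
from itertools import combinations


def map_itemsets(transaction, k):
    """Map step: emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))  # canonical order so equal sets share a key
    for itemset in combinations(items, k):
        yield itemset, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each itemset key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def frequent_itemsets(transactions, k, min_support):
    """One simulated map-reduce pass: count k-itemsets, keep the frequent ones."""
    emitted = (kv for t in transactions for kv in map_itemsets(t, k))
    totals = reduce_counts(emitted)
    return {itemset: n for itemset, n in totals.items() if n >= min_support}


# Hypothetical toy data, for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b", "c", "d"],
    ["a", "b", "d"],
    ["b", "c"],
]
triples = frequent_itemsets(transactions, k=3, min_support=2)
# triples == {("a", "b", "c"): 2, ("a", "b", "d"): 2}
```

A level-wise Apriori variant would presumably run one such pass per cardinality and only emit candidates whose (k-1)-item subsets were frequent in the previous pass.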
All:

A related question: I also don't see a clear way to translate FP-growth to MR. Passing around nodes seems painful. Any and all insights/thoughts are welcome.

Thanks.
-Sej

Ted Dunning wrote:
>
> I would think that you would do better with a simpler approach based
> simply on cooccurrence modeling.
>
> Cooccurrence counting and testing is something that is very nice in
> Map-reduce. At Veoh, we use Hadoop to analyze very large numbers of view
> events. For the actual counting of cooccurrence, it is nice to use a
> higher-level language like Pig in order to not have to write vats of very
> repetitive code.
>
> For finding interesting pairs, one simple technique is the one that I
> proposed ages ago for use in language processing:
>
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.2186
>
> On Tue, Aug 19, 2008 at 4:35 PM, sej <[EMAIL PROTECTED]> wrote:
>
>> Hello all,
>>
>> Just a general question: To what extent can the Apriori algorithm be
>> implemented in MR? The naive implementation would be to just use MR to
>> accumulate itemsets. Is there a more efficient algorithm available? (not
>> necessarily implemented, but pointers to papers would be helpful)
>>
>> Thanks.
>> -Sej
>> --
>> View this message in context:
>> http://www.nabble.com/aprior-algorithm-in-MR-tp19060674p19060674.html
>> Sent from the Mahout Developer List mailing list archive at Nabble.com.
>
> --
> ted

--
View this message in context: http://www.nabble.com/aprior-algorithm-in-MR-tp19060674p19064685.html
Sent from the Mahout Developer List mailing list archive at Nabble.com.
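The cooccurrence counting and testing described in the quoted reply could be sketched as follows. This is an in-process Python illustration, not Pig or Hadoop code. It assumes the "interesting pairs" test is a log-likelihood ratio (G^2) score over a 2x2 contingency table; that is an assumption about the linked paper, not something stated in the thread. The function names and the sample sessions are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def xlogx(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a set of counts: N ln N - sum(k ln k)."""
    return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 table of cooccurrence counts."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))  # clamp tiny negative rounding error


def cooccurrence_scores(sessions):
    """Count items and item pairs per session, then score each observed pair."""
    item_counts = Counter()
    pair_counts = Counter()
    n = 0
    for session in sessions:
        items = sorted(set(session))  # each item counted once per session
        n += 1
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    scores = {}
    for (a, b), k11 in pair_counts.items():
        k12 = item_counts[a] - k11  # sessions with a but not b
        k21 = item_counts[b] - k11  # sessions with b but not a
        k22 = n - k11 - k12 - k21   # sessions with neither
        scores[(a, b)] = llr(k11, k12, k21, k22)
    return scores


# Hypothetical view sessions, for illustration only.
sessions = [["x", "y"], ["x", "y"], ["x", "y"], ["z"], ["z"], ["x", "z"]]
scores = cooccurrence_scores(sessions)
```

In a real job the two counting steps would be separate group-and-count stages, and only the final scoring needs all four cell counts together. Note that a high score indicates departure from independence in either direction, so pairs that occur together less often than expected also score highly.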
