On Fri, May 28, 2010 at 7:39 PM, Grant Ingersoll <[email protected]> wrote:

> Robin,
>
> What I'll do here is make the code reusable so that we can use it in FPG
> directly as well.
>

Cool.

Btw, there is one more thing missing. Make sure each itemset passed to the algorithm is made up of unique tokens. I don't think it's well handled in the sequential run; the MapReduce run converts the info into a transaction tree, so the error goes away there.

If that removes information, then we need to split the record into multiple transactions. For example, if a log record has

A B A A A C D

create the following transactions from it to keep the correct co-occurrence counts:

A B, 3
A C, 3
A D, 3
B C D, 1

Robin
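
A minimal sketch of the unique-token step described above, in Python for brevity. It is not Mahout code: the function names `unique_tokens` and `token_counts` are hypothetical, and whitespace-separated tokens are an assumption. It only removes repeated tokens and reports how often each one occurred; it does not attempt the split into weighted transactions.

```python
from collections import Counter


def unique_tokens(record):
    """Return the tokens of a record with duplicates dropped.

    First-seen order is kept, because dict preserves insertion order.
    """
    return list(dict.fromkeys(record.split()))


def token_counts(record):
    """Return how many times each token occurs in the record.

    This keeps the multiplicity information that deduplication discards.
    """
    return Counter(record.split())


if __name__ == "__main__":
    record = "A B A A A C D"
    print(unique_tokens(record))  # ['A', 'B', 'C', 'D']
    print(token_counts(record)["A"])  # 4
```

Keeping the counts next to the deduplicated itemset makes it possible to decide afterwards whether a record needs to be split at all: a record whose counts are all 1 can be passed through unchanged.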
