On Fri, May 28, 2010 at 7:39 PM, Grant Ingersoll <[email protected]> wrote:
> Robin,
>
> What I'll do here is make the code reusable so that we can use it in FPG 
> directly as well.
>
Cool.

Btw, there is one more thing missing. Make sure each itemset passed to
the algorithm is made up of unique tokens. I don't think this is
handled well in the sequential run; the mapreduce run converts the data
into a transaction tree, so the error goes away there.
If deduplicating removes information, then the record needs to be split
into multiple transactions, as follows.


For example, if a log record has
A B A A A C D
create the following transactions from it to keep the correct co-occurrence counts:
A B, 3
A C, 3
A D, 3
B C D, 1
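
A minimal sketch of the deduplication step only, in Python for brevity. It assumes a record is a whitespace-separated string of tokens. The names unique_tokens and token_counts are hypothetical helpers for illustration, not part of Mahout, and the sketch does not attempt the transaction splitting shown above; it just shows what information (the per-token counts) is lost when duplicates are dropped.

```python
from collections import Counter


def unique_tokens(record):
    """Return the record's tokens with duplicates removed, keeping first-seen order."""
    # dict.fromkeys keeps insertion order, so the first occurrence of each token wins.
    return list(dict.fromkeys(record.split()))


def token_counts(record):
    """Return how many times each token appears in the record."""
    return Counter(record.split())


record = "A B A A A C D"
print(unique_tokens(record))      # ['A', 'B', 'C', 'D']
print(token_counts(record)["A"])  # 4
```

If the counts matter downstream, token_counts gives the multiplicities that a weighted split like the one above would have to account for.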

Robin
