To understand this, I would create the smallest possible data set that shows different outputs from the two implementations.
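As a rough sketch of what I mean (not Mahout code, and the function names here are made up for illustration): on a data set of only a few rows you can enumerate every itemset by brute force, use that as ground truth, and diff it against what each implementation reports. This assumes you first convert both outputs into a dict mapping a sorted tuple of items to its support count.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force reference miner: counts every subset of every transaction.

    Exponential in transaction length, so only use it on tiny test data.
    """
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates and ordering
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] = counts.get(combo, 0) + 1
    return {p: c for p, c in counts.items() if c >= min_support}


def diff_patterns(a, b):
    """Return patterns only in a, only in b, and those whose supports differ."""
    only_a = {p: a[p] for p in a.keys() - b.keys()}
    only_b = {p: b[p] for p in b.keys() - a.keys()}
    mismatched = {p: (a[p], b[p]) for p in a.keys() & b.keys() if a[p] != b[p]}
    return only_a, only_b, mismatched


# Hypothetical four-row data set, small enough to check by hand.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
reference = frequent_itemsets(transactions, min_support=2)

# Stand-in for the output of one implementation, with one pattern missing.
reported = {p: c for p, c in reference.items() if p != ("a", "c")}
only_ref, only_rep, mismatched = diff_patterns(reference, reported)
print(only_ref)
```

Shrink the input until the diff is down to one or two patterns; at that size it is usually clear which patterns one side is leaving out and why.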
On Sun, Dec 18, 2011 at 10:37 PM, gaurav singh <[email protected]> wrote:
> Hi All,
>
> I am using Mahout on Ubuntu 10.04 from the repository and running it on a
> data set of 1472 rows. I am running it in sequential mode with k=200,000 and
> s=400. I have implemented FP-growth in PHP, but when I compare the output
> of my implementation with that of Mahout's fpg, I find that Mahout outputs
> just 17,500 patterns, whereas my implementation gives around 65,000 unique
> patterns (I have verified their uniqueness!) for the same support
> threshold. I have also checked my output against the actual data set and
> found that all my patterns are correct and do exist in the data set with
> the correct support values.
>
> Can anyone please explain the reason?
>
> Thanks!!
>
> --
> regards
> Gaurav Singh

--
Lance Norskog
[email protected]
