Hello,

Could you try re-running FP-Growth with the '-2' flag, and let us know if you have more success?

This uses an alternate implementation of the FPGrowth algorithm; I have had problems similar to what you are seeing when using the default implementation.

I am skeptical of the change you suggest. That line seems to be related to maintaining a "least" pointer, and the logic seems right to me on the surface.

It is interesting to hear that this change improves the diversity of patterns you receive. The default implementation of FPGrowth will often "mine" the same pattern several times. There is also a limit on how many patterns will be returned (k). Together, these can limit the number of unique patterns found.
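To make the interaction between duplicates and k concrete, here is a toy sketch. It is not Mahout's FrequentPatternMaxHeap; the class TopKDemo, its Pattern type, and the sample supports are all made up for illustration. It keeps the k highest-support candidates in a bounded min-heap and counts how many distinct patterns survive, with and without dropping repeats first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Toy model only: hypothetical names, not Mahout's implementation.
final class TopKDemo {

    static final class Pattern implements Comparable<Pattern> {
        final String items;
        final long support;

        Pattern(String items, long support) {
            this.items = items;
            this.support = support;
        }

        // Order by support only, so the heap's head is the weakest pattern kept.
        @Override
        public int compareTo(Pattern other) {
            return Long.compare(support, other.support);
        }
    }

    // Keeps at most k patterns (highest support wins) and returns how many
    // distinct item sets are left. With dedupe=true, repeats are skipped
    // before they can occupy a slot.
    static int distinctKept(List<Pattern> mined, int k, boolean dedupe) {
        if (k <= 0) {
            return 0;
        }
        PriorityQueue<Pattern> heap = new PriorityQueue<>(); // min-heap by support
        Set<String> seen = new HashSet<>();
        for (Pattern p : mined) {
            if (dedupe && !seen.add(p.items)) {
                continue; // already emitted once
            }
            if (heap.size() < k) {
                heap.add(p);
            } else if (p.compareTo(heap.peek()) > 0) {
                heap.poll(); // evict the current weakest pattern
                heap.add(p);
            }
        }
        Set<String> distinct = new HashSet<>();
        for (Pattern p : heap) {
            distinct.add(p.items);
        }
        return distinct.size();
    }

    // Made-up miner output in which the strongest pattern is emitted three times.
    static List<Pattern> sample() {
        return Arrays.asList(
            new Pattern("a b", 9), new Pattern("a b", 9), new Pattern("a b", 9),
            new Pattern("a c", 8), new Pattern("b c", 7));
    }

    public static void main(String[] args) {
        System.out.println(distinctKept(sample(), 3, false)); // prints 1
        System.out.println(distinctKept(sample(), 3, true));  // prints 3
        System.out.println(distinctKept(sample(), 5, false)); // prints 3
    }
}
```

With k=3 the three copies of the strongest pattern fill every slot, so only one unique pattern comes back. Either removing the repeats or raising k to 5 recovers all three.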

The '-2' flag should eliminate duplicate patterns. You can also try increasing k if you want to find more patterns.

-tom

On 08/21/2012 11:22 PM, 林泽桢 wrote:
Hello, when I use FP-Growth to generate association rules, the results are
always wrong, and I am confused.

Then I read the source code, and I think I found a bug at line #102
of FrequentPatternMaxHeap.java: " least.compareTo(frequentPattern) <
0 " should be changed to " least.compareTo(frequentPattern) > 0 ". The
former filters out many of the frequent patterns that come afterwards.

After this modification the results are better, but when running on a file
of size 400m with maxHeapSize=1000 and minsupport=2, FP-Growth takes more
than 10 hours, and sometimes it spends 2 hours computing a single feature.
Is something else wrong?

Thanks for your help.

