Hi Gaurav,
The patterns are accumulated in a heap (see FrequentPatternMaxHeap),
which uses isSubPatternOf.
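For readers unfamiliar with the check, here is a minimal sketch of the idea: when two patterns have equal support and one is contained in the other, only the larger is kept. This is not Mahout's actual code. The class SubPatternSketch, the nested Pattern class, and the addPruned helper are hypothetical names invented for illustration, and a plain list stands in for the heap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these are not Mahout's classes.
class SubPatternSketch {

  /** An itemset (item ids kept sorted) together with its support count. */
  static final class Pattern {
    final int[] items;   // sorted ascending, assumed free of duplicates
    final long support;

    Pattern(long support, int... items) {
      this.items = items.clone();
      Arrays.sort(this.items);
      this.support = support;
    }

    /** True if every item of this pattern also occurs in other. */
    boolean isSubPatternOf(Pattern other) {
      if (items.length > other.items.length) {
        return false;
      }
      int i = 0;
      int j = 0;
      // Walk both sorted arrays in step, skipping extra items in other.
      while (i < items.length && j < other.items.length) {
        if (items[i] == other.items[j]) {
          i++;
          j++;
        } else if (items[i] > other.items[j]) {
          j++;
        } else {
          return false; // items[i] cannot appear later in other
        }
      }
      return i == items.length;
    }
  }

  /**
   * Adds candidate unless an equal-support super-pattern is already kept,
   * and evicts any kept equal-support sub-patterns of the candidate.
   */
  static void addPruned(List<Pattern> kept, Pattern candidate) {
    for (Pattern p : kept) {
      if (p.support == candidate.support && candidate.isSubPatternOf(p)) {
        return; // redundant: a larger pattern with the same support exists
      }
    }
    kept.removeIf(p -> p.support == candidate.support && p.isSubPatternOf(candidate));
    kept.add(candidate);
  }

  public static void main(String[] args) {
    List<Pattern> kept = new ArrayList<>();
    addPruned(kept, new Pattern(5, 1, 2));
    addPruned(kept, new Pattern(5, 1, 2, 3)); // evicts {1, 2}
    addPruned(kept, new Pattern(7, 1));       // different support, kept
    addPruned(kept, new Pattern(5, 2, 3));    // dropped, covered by {1, 2, 3}
    for (Pattern p : kept) {
      System.out.println(Arrays.toString(p.items) + " support=" + p.support);
    }
  }
}
```

With the four candidates in main, only {1, 2, 3} (support 5) and {1} (support 7) survive. The linear scan is for clarity only; a real implementation would presumably index kept patterns by support to avoid comparing against every entry.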
That said, I do think the default implementation of PFPGrowth can still
give you many redundant patterns under certain circumstances, but the "-2"
implementation should reduce (perhaps eliminate?) them.
-tom
On 02/26/2012 09:39 AM, gaurav singh wrote:
Hi Guys,
There is a function in Mahout's sequential FP-growth algorithm named
isSubPatternOf() which checks whether one pattern is a sub-pattern of
another. If both have equal support, only the larger of the two is
output. I can't find any such function being used in parallel FP-growth.
Does that mean that in parallel FP-growth we output all possible
patterns without eliminating such sub-patterns?
Thanks for help!