Hi Tom,

I don't understand why you say I will get a lot of redundant patterns. Each group-dependent shard generates patterns only with respect to the elements of that shard. As far as I know, fpg-2 is only a new sequential implementation of FP-growth, not a map/reduce implementation; please correct me if I am wrong.

My question was specifically whether sub-patterns are eliminated from the output of Mahout's parallel FP-growth (the map/reduce version). I know that the function exists in FrequentPatternMaxHeap, but that is the sequential algorithm; I am asking only about the map/reduce version.

On Sun, Feb 26, 2012 at 9:39 PM, tom <[email protected]> wrote:

> Hi Gaurav,
>
> The patterns are accumulated in a heap (see FrequentPatternMaxHeap), which
> uses isSubPatternOf.
>
> That said, I do think the default implementation of PFPGrowth will get you
> many redundant patterns under certain circumstances, but the "-2"
> implementation will reduce (perhaps eliminate?) redundant patterns.
>
> -tom
>
> On 02/26/2012 09:39 AM, gaurav singh wrote:
>
>> Hi Guys,
>>
>> There is a function in mahout sequential fp-growth algorithm named
>> isSubPatternof() which returns whether one pattern is subpattern of
>> another pattern and if both have equal support only the one larger of
>> the two is output. I can't find any such function being used in
>> parallel fp-growth. Does that mean that in parallel fp-growth we
>> display all the possible patterns without eliminating such subpatterns?
>>
>> Thanks for help!

--
regards
Gaurav Singh
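
P.S. To make sure we are talking about the same thing, here is a minimal sketch of the sub-pattern elimination I mean: when two patterns have equal support and one is contained in the other, only the larger one is output. This is my own illustration and not Mahout's code. The Pattern class, the dropRedundant helper, and the sample data are hypothetical names made up for this example, and the quadratic loop is only for clarity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SubPatternFilter {

    /** Hypothetical stand-in for a frequent pattern: item ids plus a support count. */
    static final class Pattern {
        final Set<Integer> items;
        final long support;

        Pattern(long support, Integer... items) {
            this.items = new HashSet<>(Arrays.asList(items));
            this.support = support;
        }
    }

    /** True if every item of a also occurs in b. */
    static boolean isSubPatternOf(Pattern a, Pattern b) {
        return b.items.containsAll(a.items);
    }

    /** Drops each pattern that is a strictly smaller sub-pattern of another pattern with equal support. */
    static List<Pattern> dropRedundant(List<Pattern> patterns) {
        List<Pattern> kept = new ArrayList<>();
        for (Pattern p : patterns) {
            boolean redundant = false;
            for (Pattern q : patterns) {
                if (p != q
                        && p.support == q.support
                        && p.items.size() < q.items.size()
                        && isSubPatternOf(p, q)) {
                    redundant = true;
                    break;
                }
            }
            if (!redundant) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = Arrays.asList(
                new Pattern(5, 1, 2),      // contained in {1, 2, 3} with the same support
                new Pattern(5, 1, 2, 3),
                new Pattern(7, 1));        // different support, so it is not redundant
        for (Pattern p : dropRedundant(patterns)) {
            System.out.println(p.items + " support=" + p.support);
        }
    }
}
```

With this sample input, {1, 2} is dropped because {1, 2, 3} has the same support of 5, while {1} is kept because its support of 7 differs. What I want to know is whether the map/reduce version applies an equivalent filter anywhere before writing its output.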
