Hi Tom,

I don't understand why you say I will get a lot of redundant patterns.
Each group-dependent shard generates patterns only with respect to the
items of that shard. As far as I know (correct me if I am wrong), fpg-2 is
only a new sequential implementation of fp-growth, not a map/reduce
implementation.

My question was specifically whether subpatterns are eliminated from the
output of Mahout's parallel fp-growth (the map/reduce version). I know the
function exists in FrequentPatternMaxHeap, but that is the sequential
algorithm; I am asking only about the map/reduce version.
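
To be concrete about the behaviour I mean, here is a minimal sketch in
Python. It is not Mahout's code: the function names and the sample patterns
are made up for illustration, and it only shows the rule I described (when
two patterns have equal support and one contains the other, only the larger
one is output).

```python
def is_sub_pattern_of(small, large):
    """Return True if every item of `small` also appears in `large`."""
    return set(small) <= set(large)


def drop_equal_support_subpatterns(patterns):
    """Filter a list of (items, support) pairs.

    A pattern is dropped when a strictly longer pattern with the same
    support contains all of its items. Hypothetical helper, not a Mahout API.
    """
    kept = []
    for items, support in patterns:
        redundant = any(
            other_support == support
            and len(other_items) > len(items)
            and is_sub_pattern_of(items, other_items)
            for other_items, other_support in patterns
        )
        if not redundant:
            kept.append((items, support))
    return kept


# Made-up example data: ("a",) and ("a", "b") both have support 5.
patterns = [
    (("a",), 5),
    (("a", "b"), 5),
    (("a", "b", "c"), 3),
    (("b",), 6),
]

for items, support in drop_equal_support_subpatterns(patterns):
    print(items, support)
```

Here ("a",) would be dropped because ("a", "b") has the same support, while
("b",) stays because its support differs. What I want to know is whether the
map/reduce version applies this kind of filter anywhere.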

On Sun, Feb 26, 2012 at 9:39 PM, tom <[email protected]> wrote:

> Hi Gaurav,
>
> The patterns are accumulated in a heap (see FrequentPatternMaxHeap), which
> uses isSubPatternOf.
>
> That said, I do think the default implementation of PFPGrowth will get you
> many redundant patterns under certain circumstances, but the "-2"
> implementation will reduce (perhaps eliminate?) redundant patterns.
>
> -tom
>
>
> On 02/26/2012 09:39 AM, gaurav singh wrote:
>
>> Hi Guys,
>>
>>
>> There is a function in Mahout's sequential fp-growth algorithm named
>> isSubPatternOf() that returns whether one pattern is a subpattern of
>> another. If both have equal support, only the larger of the two is
>> output. I can't find any such function being used in parallel fp-growth.
>> Does that mean that parallel fp-growth outputs all possible patterns
>> without eliminating such subpatterns?
>>
>> Thanks for the help!
>>
>>
>


-- 
regards
Gaurav Singh
