[ 
https://issues.apache.org/jira/browse/MAHOUT-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078749#comment-13078749
 ] 

Yarco Hayduk commented on MAHOUT-709:
-------------------------------------

I can't give you the source yet; I'm still in the process of performance 
testing. I found an important bug recently. In the MapReduce version of the 
program, we don't need to encode the transactions a second time, as doing so 
distorts the results. The tree gets restructured and our conditionals get 
moved almost to the root instead of staying at the bottom of the tree. Would 
anyone care to try the MapReduce version with the number of groups == 1? You 
will likely see this problem too.

> FP-Growth Redundant patterns
> ----------------------------
>
>                 Key: MAHOUT-709
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-709
>             Project: Mahout
>          Issue Type: Bug
>          Components: Frequent Itemset/Association Rule Mining
>    Affects Versions: 0.4, 0.5
>            Reporter: Yarco Hayduk
>            Assignee: Robin Anil
>              Labels: fp-growth, frequent, parallel, pattern
>             Fix For: 0.6
>
>         Attachments: SixTransactions.dat, bresult-new.txt, dumpedPatterns, 
> patterns-converted.txt
>
>
> The algorithm outputs more patterns than needed. 
> I have compared Mahout's PFP-Growth algorithm with the FP-Growth 
> implementation at http://www.borgelt.net/fpgrowth.html. That 
> implementation has an option to generate closed patterns too. 
> When I filtered the sub-patterns out of the Parallel FP-Growth output, I 
> arrived at the same result as http://www.borgelt.net/fpgrowth.html.
> Succinctly, you are not restricting the output to closed patterns.
> I am attaching the dummy DB along with the output of both algorithms.
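
The filtering step described in the report can be sketched roughly as follows. 
This is a minimal illustration, not Mahout's or Borgelt's code. It assumes that 
"sub-patterns" means patterns that have a proper superset with exactly the same 
support, which is the usual definition of a non-closed pattern. The function 
name closed_patterns and the toy data are hypothetical and are not taken from 
the attached files.

```python
def closed_patterns(patterns):
    """Keep only closed patterns.

    patterns: dict mapping frozenset of items -> support count.
    A pattern is closed if no proper superset has the same support.
    """
    closed = {}
    for items, support in patterns.items():
        # Redundant if some strictly larger pattern has an equal support count.
        redundant = any(
            items < other and support == other_support
            for other, other_support in patterns.items()
        )
        if not redundant:
            closed[items] = support
    return closed


# Hypothetical mined output (illustrative only, not the attached data).
mined = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"a", "b"}): 3,
    frozenset({"a", "c"}): 2,
    frozenset({"a", "b", "c"}): 2,
}

for items, support in sorted(closed_patterns(mined).items(),
                             key=lambda kv: -kv[1]):
    print(sorted(items), support)
```

Here {b} is dropped because {a, b} has the same support, and {a, c} is dropped 
because {a, b, c} has the same support. The pairwise comparison is quadratic in 
the number of patterns, which is fine for a small test database but would need 
indexing by support for large outputs.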

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
