[ 
https://issues.apache.org/jira/browse/MAHOUT-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762137#action_12762137
 ] 

Robin Anil commented on MAHOUT-157:
-----------------------------------

Reply to Ted's queries:

1. The timings were measured on a single machine (single-threaded, Core 2 Duo 
3.0 GHz). It takes approximately 3 minutes to mine the top 50 patterns for 
each of 320 features in a 340K-transaction list.
2. Database reading time is approximately 5 seconds for 340K transactions.
3. Initial FP-tree building time is approximately 4 seconds for 340K transactions.
4. FPGrowth is run 320 times and the total time is 3 minutes, so the average 
time to fetch the top 50 patterns containing a given feature is approximately 
0.5 seconds (180 s / 320 is about 0.56 s).

Parallel FPGrowth will split the database transactions so that each node 
processes only the transactions containing a few of the features, reducing 
the FP-tree building time on each node and running the FPGrowth step in parallel.
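
To illustrate the splitting idea, here is a minimal sketch and not the code from the attached patches. It assumes features are non-negative integer ids and assigns them to groups round-robin; the class TransactionSharder and the methods groupOf and shard are hypothetical names. Each transaction is sent to every group that owns at least one of its features, so a node handling one group only builds an FP-tree from the transactions relevant to its features. The paper linked in the issue uses a more refined scheme; this sketch only shows the routing step.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not taken from the MAHOUT-157 patches.
class TransactionSharder {

    // Assumption: features are non-negative integer ids, grouped round-robin.
    static int groupOf(int featureId, int numGroups) {
        return featureId % numGroups;
    }

    // Routes each transaction to every group that owns at least one of its features.
    static Map<Integer, List<List<Integer>>> shard(List<List<Integer>> transactions,
                                                   int numGroups) {
        Map<Integer, List<List<Integer>>> shards = new HashMap<>();
        for (List<Integer> transaction : transactions) {
            // Collect the distinct groups touched by this transaction.
            Set<Integer> groups = new LinkedHashSet<>();
            for (int feature : transaction) {
                groups.add(groupOf(feature, numGroups));
            }
            // Add the transaction once to each of those groups.
            for (int group : groups) {
                shards.computeIfAbsent(group, g -> new ArrayList<>()).add(transaction);
            }
        }
        return shards;
    }
}
```

In a MapReduce setting the group id would presumably serve as the map output key, so that each reducer receives one shard and runs the FP-tree build and FPGrowth on it independently.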



> Frequent Pattern Mining using Parallel FP-Growth
> ------------------------------------------------
>
>                 Key: MAHOUT-157
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-157
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Frequent Itemset/Association Rule Mining
>    Affects Versions: 0.2
>            Reporter: Robin Anil
>            Assignee: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-157-August-17.patch, MAHOUT-157-August-24.patch, 
> MAHOUT-157-August-31.patch, MAHOUT-157-August-6.patch, 
> MAHOUT-157-Combinations-BSD-License.patch, 
> MAHOUT-157-Combinations-BSD-License.patch, 
> MAHOUT-157-inProgress-August-5.patch, MAHOUT-157-Oct-1.patch, 
> MAHOUT-157-September-10.patch, MAHOUT-157-September-18.patch, 
> MAHOUT-157-September-5.patch
>
>
> Implement: http://infolab.stanford.edu/~echang/recsys08-69.pdf

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
