[
https://issues.apache.org/jira/browse/MAHOUT-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762137#action_12762137
]
Robin Anil commented on MAHOUT-157:
-----------------------------------
Reply to Ted's queries:
1. The timings were measured single-threaded on a Core2Duo 3.0 GHz machine.
It takes approximately 3 minutes to mine the top 50 patterns for each of 320
features in a list of 340K transactions.
2. Database reading time is approximately 5 seconds for 340K transactions.
3. Initial FP-tree building time is approximately 4 seconds for 340K transactions.
4. FPGrowth is run 320 times and the total time is 3 minutes, so the average time
to fetch the top 50 patterns containing a given feature is approximately 0.5 sec
(180 sec / 320 runs, about 0.56 sec).
Parallel FPGrowth will split the database transactions so that each node
handles only the transactions containing a small group of features. This reduces
the FP-tree building time per node and lets the FPGrowth step run in parallel
across nodes.
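To illustrate the kind of split described above, here is a minimal
single-process sketch in Python, loosely following the group-dependent
sharding idea from the paper linked in the issue. It is not the code in the
attached patches. The function name shard_transactions, the round-robin
assignment of features to groups, and the min_support parameter are all
assumptions made for the example.

```python
from collections import Counter, defaultdict


def shard_transactions(transactions, num_groups, min_support=1):
    """Split transactions into per-group shards (hypothetical helper)."""
    # Count the number of transactions in which each feature occurs.
    counts = Counter(item for t in transactions for item in set(t))
    # Rank frequent features by descending count; ties are broken by name
    # so that the result is deterministic.
    ranked = sorted(
        (i for i, c in counts.items() if c >= min_support),
        key=lambda i: (-counts[i], i),
    )
    rank = {item: r for r, item in enumerate(ranked)}
    # Assumption: features are assigned to groups round-robin by rank.
    # Any fixed feature-to-group mapping would work here.
    group_of = {item: r % num_groups for item, r in rank.items()}

    shards = defaultdict(list)
    for t in transactions:
        # Keep frequent features only, in the order used to build an FP-tree.
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        emitted = set()
        # Walk from the least frequent feature backwards. Each group gets the
        # longest prefix that ends in one of its features, and gets it once.
        for j in range(len(ordered) - 1, -1, -1):
            g = group_of[ordered[j]]
            if g not in emitted:
                emitted.add(g)
                shards[g].append(ordered[: j + 1])
    return dict(shards), group_of
```

Each shard could then be handed to a separate node, which builds a smaller
FP-tree from its own transactions and runs FPGrowth only for the features in
its group. In this sketch a transaction is sent to a group only if it contains
at least one of that group's features, which is what keeps the per-node trees
small.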
> Frequent Pattern Mining using Parallel FP-Growth
> ------------------------------------------------
>
> Key: MAHOUT-157
> URL: https://issues.apache.org/jira/browse/MAHOUT-157
> Project: Mahout
> Issue Type: New Feature
> Components: Frequent Itemset/Association Rule Mining
> Affects Versions: 0.2
> Reporter: Robin Anil
> Assignee: Robin Anil
> Fix For: 0.2
>
> Attachments: MAHOUT-157-August-17.patch, MAHOUT-157-August-24.patch,
> MAHOUT-157-August-31.patch, MAHOUT-157-August-6.patch,
> MAHOUT-157-Combinations-BSD-License.patch,
> MAHOUT-157-Combinations-BSD-License.patch,
> MAHOUT-157-inProgress-August-5.patch, MAHOUT-157-Oct-1.patch,
> MAHOUT-157-September-10.patch, MAHOUT-157-September-18.patch,
> MAHOUT-157-September-5.patch
>
>
> Implement: http://infolab.stanford.edu/~echang/recsys08-69.pdf
--
This message is automatically generated by JIRA.