[
https://issues.apache.org/jira/browse/MAHOUT-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821621#comment-13821621
]
Suneel Marthi commented on MAHOUT-1355:
---------------------------------------
[~smoens] Thanks for this patch. Some comments based on a very cursory first
pass through the code, without considering the actual algorithm and its
implementation.
a) Use Guava APIs where appropriate.
For example,
Map<Integer,MutableLong> counts = new HashMap<Integer,MutableLong>();
could be replaced by
Map<Integer,MutableLong> counts = Maps.newHashMap();
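As a rough illustration of why the factory form is shorter, here is a minimal sketch that compiles without Guava or commons-lang on the classpath. The local newHashMap() helper is only a stand-in for Guava's Maps.newHashMap(), and the nested MutableLong is a stand-in for the counter type used in the patch; the class name CountsExample and the countItems method are hypothetical and do not come from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class CountsExample {

    // Stand-in for the MutableLong counter type (assumption: the patch uses
    // a mutable counter along these lines).
    public static final class MutableLong {
        private long value;
        public void increment() { value++; }
        public long longValue() { return value; }
    }

    // Stand-in for Guava's Maps.newHashMap(): a generic static factory whose
    // type arguments are inferred from the assignment target, so they do not
    // have to be repeated on the right-hand side.
    public static <K, V> HashMap<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    // Hypothetical helper: counts how often each item id occurs.
    public static Map<Integer, MutableLong> countItems(int[] items) {
        Map<Integer, MutableLong> counts = newHashMap();
        for (int item : items) {
            MutableLong count = counts.get(item);
            if (count == null) {
                count = new MutableLong();
                counts.put(item, count);
            }
            count.increment();
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, MutableLong> counts = countItems(new int[] {1, 2, 2, 3, 2});
        System.out.println(counts.get(2).longValue()); // prints 3
    }
}
```

With Guava available, the local helper would simply be dropped in favour of com.google.common.collect.Maps.newHashMap().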
b) Classes that actually launch MR jobs should extend Mahout's AbstractJob and
leverage appropriate methods.
For example,
public class BigFIMDriver extends Configured implements Tool
can be replaced by
public class BigFIMDriver extends AbstractJob
Replace
Job job = new Job(conf, "Apriori Phase" + i);
/// and all of the code that follows this
by
Job bigFmJob = prepareJob(.....);
Could you post this patch on Reviewboard? It would be much easier to comment on
and review there.
https://reviews.apache.org
> Frequent Pattern Mining algorithms for Mahout
> ---------------------------------------------
>
> Key: MAHOUT-1355
> URL: https://issues.apache.org/jira/browse/MAHOUT-1355
> Project: Mahout
> Issue Type: New Feature
> Components: Frequent Itemset/Association Rule Mining
> Affects Versions: 0.9
> Reporter: Sandy Moens
> Priority: Minor
> Attachments: MAHOUT-1355.patch
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> We implemented frequent pattern mining algorithms for Hadoop and adapted them
> to Mahout. We used "PFP" (now deprecated) as a benchmark and these
> implementations perform better in terms of speed and memory footprint. The
> details of the implementations can be found in the paper Frequent Pattern
> Mining for BigData ( http://adrem.ua.ac.be/bigfim )
> We have been maintaining the project for a while in GitLab (
> https://gitlab.com/adrem/bigfim ). Documentation for adaptation (
> Readme-Mahout.md ) and usage in Mahout ( Mahout-wiki.md ) can be found there.
> We are open to any modification and/or improvement requests to make it more
> worthwhile for the Mahout project. We, as the research group, volunteer to
> maintain FPM algorithms as well.