Add more tunable parameters to PFPGrowth implementation
-------------------------------------------------------

                 Key: MAHOUT-293
                 URL: https://issues.apache.org/jira/browse/MAHOUT-293
             Project: Mahout
          Issue Type: Improvement
          Components: Frequent Itemset/Association Rule Mining
    Affects Versions: 0.4
            Reporter: Robin Anil
            Assignee: Robin Anil
             Fix For: 0.4


Objective is to add more tunable parameters to the PFPGrowth algorithm.

>From Neal on Mahout User list:

I often use Christian Borgelt's itemset implementations for playing
with data.  He's implemented a nice set of switches, see below.
Setting a minimum support threshold and mimimum itemset size are both
convenient and tend to make the algorithm run a bit faster.

http://www.borgelt.net/software.html

ne...@nrichter-laptop:~$ fpgrowth_fim
usage: fpgrowth_fim [options] infile outfile
find frequent item sets with the fpgrowth algorithm
version 1.13 (2008.05.02)        (c) 2004-2008   Christian Borgelt
-m#      minimal number of items per item set (default: 1)
-n#      maximal number of items per item set (default: no limit)
-s#      minimal support of an item set (default: 10%)
        (positive: percentage, negative: absolute number)
-d#      minimal binary logarithm of support quotient (default: none)
-p#      output format for the item set support (default: "%.1f")
-a       print absolute support (number of transactions)
-g       write output in scanable form (quote certain characters)
-q#      sort items w.r.t. their frequency (default: -2)
        (1: ascending, -1: descending, 0: do not sort,
         2: ascending, -2: descending w.r.t. transaction size sum)
-u       use alternative tree projection method
-z       do not prune tree projections to bonsai
-j       use quicksort to sort the transactions (default: heapsort)
-i#      ignore records starting with a character in the given string
-b/f/r#  blank characters, field and record separators
        (default: " \t\r", " \t", "\n")
infile   file to read transactions from
outfile  file to write frequent item se

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to