Could you try running your dataset through the bin/mahout process, compare 
the results with what you see here, and report back?
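
One thing that may be worth checking in the meantime (a guess from the log, not verified against the Mahout source): the output reports "Number of Nodes in the FP Tree: 0", and the code passes the same transactionStream iterator first to generateFList and then to generateTopKFrequentPatterns. A Scala Iterator can be traversed only once, so the second call may be seeing an empty stream. The sketch below shows that behaviour in plain Scala with no Mahout calls; countTransactions and the sample data are hypothetical stand-ins.

```scala
object IteratorReuse {
  // Hypothetical stand-in for the transaction data: (items, weight) pairs.
  val transactions: List[(List[String], Long)] = List(
    (List("milk", "bread"), 1L),
    (List("milk", "bread", "butter"), 5L),
    (List("bread"), 1L)
  )

  // Hypothetical stand-in for any method that walks the whole stream once.
  def countTransactions(stream: Iterator[(List[String], Long)]): Int = {
    var n = 0
    while (stream.hasNext) { stream.next(); n += 1 }
    n
  }

  // Reusing one Iterator: the first pass drains it, the second finds nothing.
  def reuseSameIterator(): (Int, Int) = {
    val stream = transactions.iterator
    (countTransactions(stream), countTransactions(stream))
  }

  // Asking the backing collection for a fresh Iterator on each pass.
  def freshIteratorPerPass(): (Int, Int) =
    (countTransactions(transactions.iterator),
     countTransactions(transactions.iterator))

  def main(args: Array[String]): Unit = {
    println("reused iterator: " + reuseSameIterator())
    println("fresh iterators: " + freshIteratorPerPass())
  }
}
```

If that is what is happening here, keeping the pairs in a List and handing a fresh .iterator to each of the two calls would be one thing to try.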

On Nov 21, 2011, at 4:49 AM, Sébastien Noir wrote:

> Hi!
> 
> I'm currently trying to understand how to use the implementation of the 
> FPGrowth algorithm (see: 
> https://cwiki.apache.org/MAHOUT/parallel-frequent-pattern-mining.html).
> 
> Currently, I'm just trying it with trivial data and Scala code. The problem 
> is that it outputs only single-item itemsets.
> I probably missed something. Could you give me a hint?
> 
> By the way, the code below is Scala (calling the Java implementation 
> directly!). If that is a problem, I can translate it to Java...
> 
> Sample output:
> 
> freqList :Buffer((bier,15), (bread,12), (milk,11), (butter,6))
> 10:47:44,688 INFO  ~ Number of unique items 4
> 10:47:44,688 INFO  ~ Number of unique pruned items 4
> 10:47:44,688 INFO  ~ Number of Nodes in the FP Tree: 0
> 10:47:44,688 INFO  ~ Mining FTree Tree for all patterns with 3
> updater : FPGrowth Algorithm for a given feature: 3
> butter:[butter] : 6
> 10:47:44,690 INFO  ~ Found 1 Patterns with Least Support 6
> 10:47:44,690 INFO  ~ Mining FTree Tree for all patterns with 2
> updater : FPGrowth Algorithm for a given feature: 2
> updater : FPGrowth Algorithm for a given feature: 3
> milk:[milk] : 11
> 10:47:44,690 INFO  ~ Found 1 Patterns with Least Support 11
> 10:47:44,690 INFO  ~ Mining FTree Tree for all patterns with 1
> updater : FPGrowth Algorithm for a given feature: 1
> updater : FPGrowth Algorithm for a given feature: 2
> updater : FPGrowth Algorithm for a given feature: 3
> bread:[bread] : 12
> 10:47:44,690 INFO  ~ Found 1 Patterns with Least Support 12
> 10:47:44,690 INFO  ~ Mining FTree Tree for all patterns with 0
> updater : FPGrowth Algorithm for a given feature: 0
> updater : FPGrowth Algorithm for a given feature: 1
> updater : FPGrowth Algorithm for a given feature: 2
> updater : FPGrowth Algorithm for a given feature: 3
> bier:[bier] : 15
> 10:47:44,691 INFO  ~ Found 1 Patterns with Least Support 15
> 
> Code:
> 
> 
>    import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth
>    import java.util.HashSet
>    import org.apache.mahout.common.iterator.StringRecordIterator
>    import org.apache.mahout.common.iterator.FileLineIterable
>    import org.apache.mahout.fpm.pfpgrowth.convertors._
>    import org.apache.mahout.fpm.pfpgrowth.convertors.integer._
>    import org.apache.mahout.fpm.pfpgrowth.convertors.string._
>    import org.apache.hadoop.io.SequenceFile.Writer
>    import org.apache.mahout.fpm.pfpgrowth.convertors.StatusUpdater
>    import org.apache.hadoop.mapred.OutputCollector
>    import scala.collection.JavaConversions._
>    import java.util.{ List => JList }
>    import org.apache.mahout.common.{ Pair => JPair }
>    import java.lang.{ Long => JLong }
>    import org.apache.hadoop.io.{ Text => JText }
> 
>    val minSupport = 1L
>    val k: Int = 50
>    val fps: FPGrowth[String] = new FPGrowth[String]()
> 
>    val milk = "milk"
>    val bread = "bread"
>    val butter = "butter"
>    val bier = "bier"
> 
>    val transactionStream: Iterator[JPair[JList[String], JLong]] = Iterator(
>      new JPair(List(milk, bread), 1L),
>      new JPair(List(butter), 1L),
>      new JPair(List(bier), 10L),
>      new JPair(List(milk, bread, butter), 5L),
>      new JPair(List(milk, bread, bier), 5L),
>      new JPair(List(bread), 1L)
>    )
> 
>    val frequencies: Collection[JPair[String, JLong]] = fps.generateFList(
>      transactionStream, minSupport.toInt)
> 
>    println("freqList :" + frequencies)
> 
>    var returnableFeatures: Collection[String] = List(
>      milk, bread, butter, bier)
> 
>    var output: OutputCollector[String, JList[JPair[JList[String], JLong]]] = (
>      new OutputCollector[String, JList[JPair[JList[String], JLong]]] {
>        def collect(x1: String,
>                    x2: JList[JPair[JList[String], JLong]]) = {
>          println(x1 + ":" +
>            x2.map(pair => "[" + pair.getFirst.mkString(",") + "] : " +
>              pair.getSecond).mkString("; "))
>        }
>      }
>    )
> 
>    val updater: StatusUpdater = new StatusUpdater {
>      def update(status: String) = println("updater : " + status)
>    }
> 
>    fps.generateTopKFrequentPatterns(
>      transactionStream,
>      frequencies,
>      minSupport,
>      k,
>      null, //returnableFeatures
>      output,
>      updater)
> 

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
