Good morning! It's morning in China now and I have just woken up. Here is something you can try: there is another, sequential implementation of FP-growth by Borgelt, available at http://www.borgelt.net/fpgrowth.html. You can download and compile it, then run it on the same dataset with the same arguments. If you get the same result, then Mahout is probably working correctly. If not, look through the log in more detail.
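For a dataset this small, the expected answer can also be worked out by brute force before comparing the two tools. The following is only a minimal sketch, not Mahout's or Borgelt's algorithm: it enumerates every subset of the items and sums the weights of the transactions that contain it. The transactions are copied from the quoted Scala code below; the names `BruteForceSupport`, `support` and `frequentItemsets` are my own and purely illustrative.

```scala
// Brute-force support counting: an independent check, not FP-growth itself.
object BruteForceSupport {
  // (items, weight) pairs copied from the quoted Scala snippet
  val transactions: Seq[(Set[String], Long)] = Seq(
    (Set("milk", "bread"), 1L),
    (Set("butter"), 1L),
    (Set("bier"), 10L),
    (Set("milk", "bread", "butter"), 5L),
    (Set("milk", "bread", "bier"), 5L),
    (Set("bread"), 1L)
  )

  // Support of an itemset = summed weight of the transactions that contain it
  def support(itemset: Set[String]): Long =
    transactions.collect { case (items, w) if itemset.subsetOf(items) => w }.sum

  // Every non-empty itemset whose support reaches minSupport
  def frequentItemsets(minSupport: Long): Map[Set[String], Long] = {
    val allItems = transactions.flatMap(_._1).toSet
    allItems.subsets()
      .filter(_.nonEmpty)
      .map(s => s -> support(s))
      .filter(_._2 >= minSupport)
      .toMap
  }

  def main(args: Array[String]): Unit =
    frequentItemsets(1L).toSeq
      .sortBy { case (s, n) => (-n, s.size) }
      .foreach { case (s, n) =>
        println(s.toSeq.sorted.mkString("[", ",", "]") + " : " + n)
      }
}
```

With these weights, [bread,milk] has support 11, so a correct run with minSupport 1 should report more than single-item patterns. That gives a concrete baseline to hold both outputs against.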
Regards,
Q

2011/11/22 Grant Ingersoll <[email protected]>

> Could you try comparing your dataset when using the bin/mahout process and
> report back here?
>
> On Nov 21, 2011, at 4:49 AM, Sébastien Noir wrote:
>
> > Hi!
> >
> > I'm currently trying to understand how to use the implementation of the
> > FPGrowth algorithm (see:
> > https://cwiki.apache.org/MAHOUT/parallel-frequent-pattern-mining.html).
> >
> > Currently, I'm just trying it with stupid data, and scala code. The
> > problem is that it outputs only single-item itemsets.
> > I probably missed something. Could you give me a hint?
> >
> > By the way, the code below is scala (calling the java implementation
> > directly!). If that is a problem, I can translate it to java...
> >
> > sample output:
> >
> > freqList :Buffer((bier,15), (bread,12), (milk,11), (butter,6))
> > 10:47:44,688 INFO ~ Number of unique items 4
> > 10:47:44,688 INFO ~ Number of unique pruned items 4
> > 10:47:44,688 INFO ~ Number of Nodes in the FP Tree: 0
> > 10:47:44,688 INFO ~ Mining FTree Tree for all patterns with 3
> > updater : FPGrowth Algorithm for a given feature: 3
> > butter:[butter] : 6
> > 10:47:44,690 INFO ~ Found 1 Patterns with Least Support 6
> > 10:47:44,690 INFO ~ Mining FTree Tree for all patterns with 2
> > updater : FPGrowth Algorithm for a given feature: 2
> > updater : FPGrowth Algorithm for a given feature: 3
> > milk:[milk] : 11
> > 10:47:44,690 INFO ~ Found 1 Patterns with Least Support 11
> > 10:47:44,690 INFO ~ Mining FTree Tree for all patterns with 1
> > updater : FPGrowth Algorithm for a given feature: 1
> > updater : FPGrowth Algorithm for a given feature: 2
> > updater : FPGrowth Algorithm for a given feature: 3
> > bread:[bread] : 12
> > 10:47:44,690 INFO ~ Found 1 Patterns with Least Support 12
> > 10:47:44,690 INFO ~ Mining FTree Tree for all patterns with 0
> > updater : FPGrowth Algorithm for a given feature: 0
> > updater : FPGrowth Algorithm for a given feature: 1
> > updater : FPGrowth Algorithm for a given feature: 2
> > updater : FPGrowth Algorithm for a given feature: 3
> > bier:[bier] : 15
> > 10:47:44,691 INFO ~ Found 1 Patterns with Least Support 15
> >
> > code:
> >
> > import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth
> > import java.util.HashSet
> > import org.apache.mahout.common.iterator.StringRecordIterator
> > import org.apache.mahout.common.iterator.FileLineIterable
> > import org.apache.mahout.fpm.pfpgrowth.convertors._
> > import org.apache.mahout.fpm.pfpgrowth.convertors.integer._
> > import org.apache.mahout.fpm.pfpgrowth.convertors.string._
> > import org.apache.hadoop.io.SequenceFile.Writer
> > import org.apache.mahout.fpm.pfpgrowth.convertors.StatusUpdater
> > import org.apache.hadoop.mapred.OutputCollector
> > import scala.collection.JavaConversions._
> > import java.util.{ List => JList }
> > import org.apache.mahout.common.{ Pair => JPair }
> > import java.lang.{ Long => JLong }
> > import org.apache.hadoop.io.{ Text => JText }
> >
> > val minSupport = 1L
> > val k: Int = 50
> > val fps: FPGrowth[String] = new FPGrowth[String]()
> >
> > val milk = "milk"
> > val bread = "bread"
> > val butter = "butter"
> > val bier = "bier"
> >
> > val transactionStream: Iterator[JPair[JList[String], JLong]] = Iterator(
> >   new JPair(List(milk, bread), 1L),
> >   new JPair(List(butter), 1L),
> >   new JPair(List(bier), 10L),
> >   new JPair(List(milk, bread, butter), 5L),
> >   new JPair(List(milk, bread, bier), 5L),
> >   new JPair(List(bread), 1L)
> > )
> >
> > val frequencies: Collection[JPair[String, JLong]] = fps.generateFList(
> >   transactionStream, minSupport.toInt)
> >
> > println("freqList :" + frequencies)
> >
> > var returnableFeatures: Collection[String] = List(
> >   milk, bread, butter, bier)
> >
> > var output: OutputCollector[String, JList[JPair[JList[String], JLong]]] = (
> >   new OutputCollector[String, JList[JPair[JList[String], JLong]]] {
> >     def collect(x1: String,
> >                 x2: JList[JPair[JList[String], JLong]]) = {
> >       println(x1 + ":" +
> >         x2.map(pair => "[" + pair.getFirst.mkString(",") + "] : " +
> >           pair.getSecond).mkString("; "))
> >     }
> >   }
> > )
> >
> > val updater: StatusUpdater = new StatusUpdater {
> >   def update(status: String) = println("updater : " + status)
> > }
> >
> > fps.generateTopKFrequentPatterns(
> >   transactionStream,
> >   frequencies,
> >   minSupport,
> >   k,
> >   null, //returnableFeatures
> >   output,
> >   updater)
> >
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
