Mahout's PFP (parallel FP-Growth) accepts a text input format, and you
can specify a splitter pattern to separate the columns. For any other
data source, the easiest approach is to export it to a text format with
the columns separated by a tab ('\t') and put the file into HDFS as the
PFP input.

On Fri, May 6, 2011 at 9:35 AM, hustnn <[email protected]> wrote:
> I saw a topic of yours about "converting data in databases (flat files,
> XML dumps, MySQL, Cassandra, different formats on HDFS, HBase) into an
> intermediate form (say, vectors)".
>
> I know that parallel FPGrowth can use Hadoop to distribute computation
> across different TaskTrackers easily in a map-reduce way, but I want to
> know how parallel FPGrowth works with other databases such as MySQL,
> Cassandra, and HBase. How does it obtain its input, and how does it
> distribute the computation so that it runs in parallel?
>
> Thanks.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-get-input-in-parallel-FPGrowth-tp2906536p2906536.html
> Sent from the Mahout Developer List mailing list archive at Nabble.com.
