The PFP job in Mahout accepts a text input format; you can specify a splitter
pattern to separate the columns. For any other data source, the easiest way is
to convert it to a text format, separate the columns with a tab ('\t'), and
put the file into HDFS as the PFP input.
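Since the question mentions MySQL, here is a minimal sketch of that conversion
step, assuming a hypothetical table "transactions" with columns (user_id,
items); the class name, output path, and connection details are made up for
the example, so adjust them to your own schema:

    import java.io.BufferedWriter;
    import java.io.OutputStreamWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MySqlToPfpInput {
      public static void main(String[] args) throws Exception {
        // Write straight to HDFS so the file can serve as PFP input.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/user/mahout/pfp-input/transactions.txt");

        // Hypothetical connection details; replace with your own.
        Connection db = DriverManager.getConnection(
            "jdbc:mysql://localhost/mydb", "user", "password");
        Statement stmt = db.createStatement();
        ResultSet rs =
            stmt.executeQuery("SELECT user_id, items FROM transactions");

        BufferedWriter writer =
            new BufferedWriter(new OutputStreamWriter(fs.create(out, true)));
        try {
          while (rs.next()) {
            // One transaction per line, columns separated by a tab.
            writer.write(rs.getString(1) + "\t" + rs.getString(2));
            writer.newLine();
          }
        } finally {
          writer.close();
          rs.close();
          stmt.close();
          db.close();
        }
      }
    }

You can then point the FPGrowth driver at that path; if I remember correctly,
its splitter-pattern option takes a regex telling it how each line is split
into items, but check the driver's --help output to confirm the exact flag.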


On Fri, May 6, 2011 at 9:35 AM, hustnn <[email protected]> wrote:

> I saw a topic of yours about "converting data in databases (flat files,
> XML dumps, MySQL, Cassandra, different formats on HDFS, HBase) into an
> intermediate form (say, vectors)".
>
> I know that parallel FPGrowth can use Hadoop to distribute computation
> across different tasktrackers easily in a map-reduce fashion, but I want to
> know how parallel FPGrowth works with other data stores such as MySQL,
> Cassandra, and HBase. How does it get its input, and how does it distribute
> the computation so that it runs in parallel?
>
> Thanks.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-get-input-in-parallel-FPGrowth-tp2906536p2906536.html
> Sent from the Mahout Developer List mailing list archive at Nabble.com.
>
