I haven't fooled around with Riffle, but in the past I have extracted some
Mahout components for use with Cascading.
I'm also interested in using Cascading 2.0 (Apache License rather than GPLv3)
with Mahout, so if you can share more details I'd be happy to take a look.
Regards,
-- Ken
On Dec 19, 2011, at 12:01pm, Neil Chaudhuri wrote:
> Does anyone have any code to share about how to use Riffle (and Cascading)
> with Mahout? I have a class wrapping a Mahout operation, but I am getting a
> NullPointerException when I add this class to my Cascade. I think the key
> line is this:
>
> 11/12/19 14:50:14 INFO flow.Flow: [mahoutVectorizer] atleast one sink does
> not exist
>
> This is despite having a method annotated as follows:
>
> @DependencyOutgoing
> public Path getOutgoing() {
>     return outputFilePath;
> }
>
> Any insight is appreciated.
>
> Thanks.
>
--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr