Hi,

I apologize if I've misunderstood the purpose of the Taste component of Mahout. 
Our goal was to take a recommendation framework and use our own recommendation 
algorithm within it. We need to process a massive amount of data, and wanted it 
to be done on our Hadoop grid. I thought that Taste was the right fit for the 
job. I'm not interested in the HTTP service. I'm interested in the 
recommendation framework, particularly from a back-end batch perspective. Does 
that help clarify? Thanks for helping me sort through this.
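
To make the batch use case concrete, here is roughly the kind of code we have in mind (a sketch against the non-distributed Taste API as I understand it from the examples; the similarity/neighborhood choices and the local ratings path are just placeholders, not something we have running):

```java
import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class BatchRecommendSketch {
  public static void main(String[] args) throws Exception {
    // Load ratings locally; for a real batch run this would be fed
    // from our grid rather than a local file.
    DataModel model = new FileDataModel(new File("/tmp/ratings.txt"));

    // Any Recommender implementation can be dropped in here -- this is
    // the point where our own algorithm would plug into the framework.
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    UserNeighborhood neighborhood =
        new NearestNUserNeighborhood(10, similarity, model);
    Recommender recommender =
        new GenericUserBasedRecommender(model, neighborhood, similarity);

    // Top-5 recommendations for one user; a batch job would loop over
    // all users and write results out instead of printing them.
    List<RecommendedItem> recs = recommender.recommend(1L, 5);
    for (RecommendedItem item : recs) {
      System.out.println(item.getItemID() + "\t" + item.getValue());
    }
  }
}
```

The point is that nothing here touches HTTP or Jetty; it's the plain Java API driven from a main(), which is why I thought it could be wrapped in a batch job on the grid.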

-Aurora


On 7/21/09 3:02 PM, "Sean Owen" <[email protected]> wrote:

Hmm, lots going on here, it's confusing.

Are you trying to run this on Hadoop intentionally? Because the web
app example is not intended to run on Hadoop. It's a component
intended to serve recommendations over HTTP in real time. It also
appears you are running an evaluation rather than a web app serving
requests. I realize you're trying to run this without Jetty, but
that's kind of like trying to run a web app without a web server.

I think you'd have to clarify what you are trying to do, and what
you are doing right now, before I can begin to assist.

On Tue, Jul 21, 2009 at 9:20 PM, Aurora
Skarra-Gallagher<[email protected]> wrote:
> Hi,
>
> I'm trying to run the Taste web example without using Jetty. Our gateways 
> aren't meant to be used as web servers. By poking around, I found that the 
> following command worked:
> hadoop --config ~/hod-clusters/test jar 
> /x/mahout-current/examples/target/mahout-examples-0.2-SNAPSHOT.job 
> org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner
>
> The output is:
> 09/07/21 19:59:21 INFO file.FileDataModel: Creating FileDataModel for file 
> /tmp/ratings.txt
> 09/07/21 19:59:21 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning 
> evaluation using 0.9 of GroupLensDataModel
> 09/07/21 19:59:22 INFO file.FileDataModel: Reading file info...
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 100000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 200000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 300000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 400000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 500000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 600000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 700000 lines
> 09/07/21 19:59:22 INFO file.FileDataModel: Processed 800000 lines
> 09/07/21 19:59:23 INFO file.FileDataModel: Processed 900000 lines
> 09/07/21 19:59:23 INFO file.FileDataModel: Processed 1000000 lines
> 09/07/21 19:59:23 INFO file.FileDataModel: Read lines: 1000209
> 09/07/21 19:59:30 INFO slopeone.MemoryDiffStorage: Building average diffs...
> 09/07/21 19:59:42 INFO eval.AbstractDifferenceRecommenderEvaluator: 
> Evaluation result: 0.7035965559003973
> 09/07/21 19:59:42 INFO grouplens.GroupLensRecommenderEvaluatorRunner: 
> 0.7035965559003973
>
> The job appears to write data to /tmp/ratings.txt and /tmp/movies.txt. I'm 
> not sure if this is the correct way to run this example. I have a few 
> questions:
>
>  1.  Is the output file /tmp/ratings.txt? If so, how do I interpret it?
>  2.  What does the Evaluation result mean?
>  3.  Is it even running on HDFS?
>  4.  Is it a map-reduce job?
>
> Any pointers on how to run this as a standalone job would be helpful.
>
> Thanks,
> Aurora
>