Mahout Samsara executes on Spark and doesn’t include a full recommender because 
it uses a search engine to perform the last calculation and serve results.

The newer code does not require the mapping. You can use the provided 
text-delimited format for input or write your own. 
SimilarityAnalysis.rowSimilarity compares all rows to all other rows (with some 
downsampling). Is this what you are looking for? To create a content-based 
recommender you index the output of the job, which you can take as Spark RDDs or 
as text files. Then, for any row, you can fetch similar rows. For a user’s 
history of preferred rows, use that history as the query and get a ranked list 
of rows similar to the ones the user preferred.
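
To make the query step concrete, here is a minimal sketch in plain Python. It is not Mahout or search-engine code: the `similar` dict is made-up data standing in for the indexed job output, `recommend` is a hypothetical helper, and a simple overlap count stands in for whatever relevance scoring your search engine applies.

```python
from collections import Counter

# Made-up stand-in for the indexed job output:
# each row id mapped to the ids of its most similar rows.
similar = {
    "row-1": ["row-2", "row-3"],
    "row-2": ["row-1", "row-3"],
    "row-3": ["row-1", "row-2", "row-4"],
    "row-4": ["row-3"],
}

def recommend(similar, history, top_n=3):
    """Rank rows whose similar-rows list overlaps the user's history.

    The overlap count is an assumed stand-in for search-engine scoring.
    """
    seen = set(history)
    scores = Counter()
    for row_id, neighbours in similar.items():
        if row_id in seen:
            continue  # do not recommend rows the user already preferred
        overlap = seen.intersection(neighbours)
        if overlap:
            scores[row_id] = len(overlap)
    # Highest score first, ties broken by row id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Similar rows for a single row are a direct lookup.
single = similar["row-1"]

# A user's history of preferred rows is used as the query.
ranked = recommend(similar, ["row-1", "row-2"])
```

A real search engine would weight the matches rather than just count them, but the shape of the query is the same.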

If you have a file or set of part-0000x files which are formatted as

row-id1<tab>cat-id1,cat-id2,cat-idxyz
row-id2<tab>cat-id3,…

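Notice the two delimiters: a tab separates the row ID from its categories, and 
commas separate the categories from each other.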
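
If you want to sanity-check your files before running the job, a rough sketch of reading that layout in plain Python follows. `parse_rows` is a hypothetical helper, not part of Mahout, and the sample lines are only illustrative.

```python
def parse_rows(lines, row_delim="\t", item_delim=","):
    """Parse 'row-id<tab>cat-id1,cat-id2' lines into {row_id: [cat_ids]}."""
    rows = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split once on the first delimiter: row id on the left, categories on the right.
        row_id, _, cats = line.partition(row_delim)
        rows[row_id.strip()] = [c.strip() for c in cats.split(item_delim) if c.strip()]
    return rows

# Illustrative input in the same shape as the format above.
sample = [
    "row-id1\tcat-id1,cat-id2,cat-idxyz\n",
    "\n",
    "row-id2\tcat-id3\n",
]
rows = parse_rows(sample)
```

The category IDs stay as strings here, which matches the point that the newer code needs no integer mapping.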

If this is anything like what you are looking for, check the spark-rowsimilarity 
description for more details: 

As Nick says, the older Mahout code does require id mapping but is also being 
deprecated as we move completely to more modern computation engines (Hadoop 
MapReduce is pretty slow in comparison).
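
If you do stay on the older code for now, the mapping itself is simple. Below is a minimal single-machine sketch in plain Python; `build_id_mapping` is a hypothetical helper and the category values are made up. For data too large for one machine you would need a distributed version of the same idea.

```python
def build_id_mapping(values):
    """Assign consecutive integer ids to distinct values, in first-seen order."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping

# Made-up categorical column.
categories = ["sports", "news", "sports", "music"]

to_int = build_id_mapping(categories)
# Keep the reverse mapping so results can be translated back to the original values.
to_str = {i: value for value, i in to_int.items()}

encoded = [to_int[c] for c in categories]
decoded = [to_str[i] for i in encoded]
```

Keep the reverse mapping around, since whatever comes back from the job will be in integer IDs.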

On Nov 18, 2014, at 10:44 PM, Lee S <[email protected]> wrote:

I need to create a mapping myself before passing the data to Mahout. That
already helps.
When the data is big enough, I need a MapReduce job to do the conversion,
right?

2014-11-19 11:16 GMT+08:00 Martin, Nick <[email protected]>:

> Hi there,
> 
> Which algorithm are you using? For instance, for recommendations you could
> create a mapping of your categorical data to integers before you pass the
> data into Mahout.
> 
> Let us know a bit more about what you're trying to accomplish/algos you're
> looking to use.
> 
> Best,
> Nick
> 
> -----Original Message-----
> From: Lee S [mailto:[email protected]]
> Sent: Tuesday, November 18, 2014 10:13 PM
> To: user
> Subject: How to deal with catogrical and date data in mahout ?
> 
> Hi all:
> Do you have any good practices for dealing with categorical data?
> Does Mahout provide a tool class that can do the conversion?
> 
