Hi Jordi,

That's because you compute recommendations on *boolean* data (-b true).
There are no weights attached to the preferences in that case: you either
know that a user likes something or you don't. As a consequence, no
meaningful weight can be assigned to a computed recommendation either.
That's where the 1.0s are coming from.

Things might be clearer if we take a look at the math:

u = a user
i = an item not yet rated by u
N = all items similar to i

Prediction(u,i) = sum(all n from N: similarity(i,n) * rating(u,n)) /
sum(all n from N: abs(similarity(i,n)))

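If all ratings have the value 1 (because we use boolean data), the
numerator equals the denominator whenever the similarities are
non-negative (as they are for Tanimoto), so the result of the
Prediction can only be 1.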
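
To make this concrete, here is a minimal sketch of the formula in plain Python. It is not Mahout's actual code; the function name predict and the similarity values are made up for illustration, and the similarities are assumed to be non-negative.

```python
def predict(similarities, ratings):
    """Weighted-average prediction for one (user, item) pair.

    similarities: similarity(i, n) for each n in N (hypothetical values)
    ratings:      rating(u, n) for each n in N, in the same order
    """
    numerator = sum(s * r for s, r in zip(similarities, ratings))
    denominator = sum(abs(s) for s in similarities)
    return numerator / denominator


# Made-up similarities between item i and three similar items.
sims = [0.8, 0.25, 0.5]

# Boolean data: every known preference counts as 1, so the similarities
# cancel out and the prediction is 1 no matter which values sims holds.
boolean_prediction = predict(sims, [1.0, 1.0, 1.0])

# Real ratings: the same similarities now give a proper weighted average.
rated_prediction = predict(sims, [5.0, 2.0, 3.0])

print(boolean_prediction, rated_prediction)
```

Swapping in different similarity values changes rated_prediction but not boolean_prediction, which is why the choice of similarity class makes no difference to the scores on boolean input.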

--sebastian



On 26.11.2010 19:26, Jordi Abad wrote:
> Hi,
> 
> I'm running a RecommenderJob (mahout-0.4 version) over hadoop like this:
> 
> hadoop-0.20 jar /mahout-distribution-0.4/mahout-core-0.4-job.jar
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob
> -Dmapred.input.dir=input -Dmapred.output.dir=output -s
> SIMILARITY_TANIMOTO_COEFFICIENT -b true
> 
> The job works fine but when I examine the result I get things like:
> 
> 12    [1:1.0,2:1.0,3:1.0,5:1.0,6:1.0,11:1.0,168:1.0,173:1.0,180:1.0,199:1.0]
> 14    [1:1.0,2:1.0,3:1.0,5:1.0,6:1.0,11:1.0,14:1.0,21:1.0,22:1.0,23:1.0]
> ...
> 
> I can't understand why each recommendation gets 1.0 of score. It doesn't
> matter which SimilarityClass I set. I always get a score of 1.0.
> 
> My input file is a "boolean file" (1391374 rows) with values like:
> 
> 1,6496241
> 1,4368916
> 1,4922226
> 1,4958662
> ...
> 
> If I run
> "org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob" job
> over the same file I get good results for items.
> 
> Any ideas?
> 
> Thanks in advance.
> 
