Hi,

I would like to build a system that can say how similar two items are, based
on a set of attributes including title, genre, ratings, year, description and
more.
So I guess I could build a feature vector for each item and then define a
similarity measure over those vectors.
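
To make this concrete, here is a rough sketch of what I have in mind (Python
only for illustration). The field names, the 0 to 10 rating scale, the 50-year
span and especially the weights are placeholders I made up; the weights are
exactly the part I would like to derive properly. The description is left out
because I do not know how to handle it yet, and the title is carried along but
not compared.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    genres: frozenset  # e.g. frozenset({"drama", "crime"})
    rating: float      # assumed to be on a 0..10 scale
    year: int


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closeness(x, y, span):
    """1.0 when x == y, falling linearly to 0.0 once they differ by `span`."""
    return max(0.0, 1.0 - abs(x - y) / span)


# Hand-picked placeholder weights (purely intuitive, not learned).
WEIGHTS = {"genres": 0.5, "rating": 0.25, "year": 0.25}


def similarity(a, b, weights=WEIGHTS):
    """Weighted average of per-attribute similarities, in [0, 1]."""
    scores = {
        "genres": jaccard(a.genres, b.genres),
        "rating": closeness(a.rating, b.rating, span=10.0),
        "year": closeness(a.year, b.year, span=50.0),
    }
    total = sum(weights.values())
    return sum(weights[k] * scores[k] for k in weights) / total
```

With this, an item compared with itself scores 1.0, and the score drops as
genres, rating and year drift apart.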

However, I have no clue which methods I could use to:
- determine the weight to put on each feature (other than by intuition)
- deal with the 'description' attribute (i.e. a more or less long
free text) and transform it into a relevant set of features
- find out which algorithms in Mahout could be adapted to build such a system

Thanks a lot in advance for any insights, links or anything related to that.
