Re: Feedback using Mahout Taste in Master Thesis:

Sean Owen Tue, 10 May 2011 04:35:34 -0700

On Tue, May 10, 2011 at 12:24 PM, Manuel Blechschmidt
<[email protected]> wrote:
> Hello guys,
> I used Mahout a lot, especially Taste, in my Master's thesis: "An architecture 
> for evaluating recommender systems in real world scenarios". I wanted to give 
> some feedback about it. If somebody is interested in the whole work (97 
> pages), drop me an email.


Great, thanks for the kudos. It would be good to post a link to your
work on the user@ list if you like.


> It was especially difficult to get the IDMigrator working. It would be quite 
> cool if there were a DataModel which automatically includes String migration.

This is how it worked originally -- it just doesn't scale nearly as
well. It's really a much better idea to use numeric IDs, so the
framework pushes you that way.
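
As a rough illustration of the numeric-ID approach, here is a minimal,
self-contained sketch of mapping String IDs to longs by hashing while
keeping a reverse lookup. The class and method names (StringIdMapper,
toLongID, toStringID) are hypothetical and are not the Mahout IDMigrator
API; the choice of MD5 and of an in-memory map is an assumption made only
for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: maps String IDs to long IDs.
public class StringIdMapper {

    // Reverse lookup so that results can be reported with the original IDs.
    private final Map<Long, String> longToString = new HashMap<>();

    // Returns a stable numeric ID for the given String ID and remembers it.
    public long toLongID(String stringID) {
        long id = hash(stringID);
        longToString.put(id, stringID);
        return id;
    }

    // Returns the original String ID, or null if it was never mapped.
    public String toStringID(long longID) {
        return longToString.get(longID);
    }

    // Packs the first 8 bytes of the MD5 digest into a long.
    static long hash(String value) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(value.getBytes(StandardCharsets.UTF_8));
            long result = 0L;
            for (int i = 0; i < 8; i++) {
                result = (result << 8) | (digest[i] & 0xFFL);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The reverse map is the part that costs memory as the number of IDs grows,
which is one reason to assign numeric IDs upstream where possible.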


> I had some problems because some interfaces do not implement the Serializable 
> interface. I already opened a ticket, MAHOUT-650.

Yes, interesting issue, though I don't believe a change is called for
in the framework. The issue notes have what I consider the "right" way
to approach this.


> Is there a benchmark engine reporting the RMSE of the different algorithms? 
> It would be cool if a Maven command were available, so that when I implement 
> a new recommender I can directly benchmark it against the other implementations.

RMSE is not a property of an algorithm alone, but of an algorithm and a
particular data set at least. As a result, I don't think a single
benchmark figure per algorithm is possible.
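
To make the dependence on the data concrete, here is a minimal sketch of
the RMSE computation itself, assuming the predicted and actual ratings
for a held-out test set are already available as two parallel arrays.
The class name RmseSketch is hypothetical and this is not Mahout code.

```java
// Hypothetical helper, not part of Mahout: RMSE over a held-out test set.
public class RmseSketch {

    // Root-mean-square error between predicted and actual ratings.
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException(
                "Arrays must be non-empty and of equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares / predicted.length);
    }
}
```

The same recommender scores differently as soon as the held-out ratings
change, so any comparison between implementations has to fix the data
set and the train/test split first.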


>  * getNumUsersWithPreferenceFor for the MySQL DataModel only works for at 
> most two item IDs and there is no warning if more are supplied

Maybe this is fixed since you looked, but it does throw an error:
    Preconditions.checkArgument(length != 0 && length <= 2,
        "Illegal number of item IDs: " + length);


>  * DataModel expects that there is always only one rating from a user for an 
> item (what about re-ratings?)

Yes, that's true. The most recent rating always counts. It might be
interesting to find a way to factor in re-ratings, but to actually
build that into the framework would cause scale problems and I don't
know of algorithms that use it. So maybe it's better to collapse multiple
ratings into one (a weighted average favoring the most recent one?)
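
One possible way to do that collapsing before the data reaches the
DataModel is sketched below. It assumes the ratings for one user/item
pair are given oldest first and uses an exponential decay factor, both
of which are choices made for the example; the class name
ReRatingCollapser and the decay parameter are hypothetical, not Mahout
API.

```java
// Hypothetical helper, not part of Mahout: collapses re-ratings into one value.
public class ReRatingCollapser {

    // Weighted average of one user's ratings for one item, oldest first.
    // The newest rating has weight 1, the one before it has weight decay,
    // the one before that decay^2, and so on. decay must be in [0, 1].
    static double collapse(double[] ratingsOldestFirst, double decay) {
        if (ratingsOldestFirst.length == 0) {
            throw new IllegalArgumentException("No ratings given");
        }
        if (decay < 0.0 || decay > 1.0) {
            throw new IllegalArgumentException("decay must be in [0, 1]");
        }
        double weightedSum = 0.0;
        double weightSum = 0.0;
        double weight = 1.0;
        // Walk from the newest rating back to the oldest.
        for (int i = ratingsOldestFirst.length - 1; i >= 0; i--) {
            weightedSum += weight * ratingsOldestFirst[i];
            weightSum += weight;
            weight *= decay;
        }
        return weightedSum / weightSum;
    }
}
```

With decay = 0 this reduces to "the most recent rating counts", and with
decay = 1 it is a plain average, so the current behavior is one end of
the range.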


> I also attached some images which should explain how Taste is doing its job 
> in my system.

(Images aren't included in mail to @apache.org mailing lists; you'd
have to post them elsewhere.)
