+1 from me, a happy Taste-at-large-scale user.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Sean Owen <[email protected]>
> To: Mahout User List <[email protected]>
> Sent: Wednesday, July 22, 2009 10:07:22 AM
> Subject: Backwards-incompatible change(s) to recommender engine
>
> I wanted to announce a change that could break some people who extend
> the recommender engine library. Actually, I am threatening to make one
> small change now, in anticipation of a couple larger changes.
>
> The change is:
>
> 1. "ID" values become Comparable instead of just Object. Example:
> User.getID() return value.
> This just makes sense, I hope. The only real requirement I have for
> keys is that they have an ordering, preferably a natural one. So this
> expresses it. Better than the way it is currently expressed:
>
> 2. Remove the generic type from classes like GenericUser, GenericItem.
> This type was bound as ">" so you could only
> use a comparable key with these classes anyway. But the associated
> interfaces, User and Item, did not carry this generic type since the
> type then spread virally throughout the code until it was a mess. Any
> component that touched a User or Item had to also express their
> generic type. Pretty soon you have DataModel<(user key type), (item
> key type)> and more, which doesn't make sense
> But it really didn't make sense to have this generic type and not put
> it in the interface declaration.
>
> This was causing messes in cloning User or Item objects meaningfully:
>
> something(Object itemID) {
>   if (itemID instanceof Long) {
>     return new GenericItem((Long) itemID), ...);
>   } else if (itemID instanceof Integer) {
>     ...
>   } ...
> }
>
>
> I think this only affects people that have extended the components or
> integrated more deeply than just calling the components as-is. That
> is, if you're already making calls like recommend("3595", 20) then
> nothing changes.
>
>
> The deeper change I've been threatening for a while is to remove the
> User and Item abstractions entirely. It's a big change, but removing
> these objects and dealing only in IDs simplifies things, and improves
> memory and speed notably. Forcing instantiation of these objects is a
> problem in several places. That change I'm still contemplating since
> it's big... but, this change will actually make it somewhat easier.
>
> Soliciting thoughts before I pull the trigger.
>
> Sean
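
For anyone trying to picture change 1, here is a minimal sketch of the idea, not the actual Taste code. The names ComparableIdSketch, SketchItem and cloneItem are hypothetical and made up for illustration. It assumes an item that only requires its ID to be Comparable, so copying an item no longer needs an instanceof chain over the possible key types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ComparableIdSketch {

    /** Hypothetical stand-in for an item whose ID is only required to be Comparable. */
    static final class SketchItem implements Comparable<SketchItem> {
        private final Comparable<?> id;

        SketchItem(Comparable<?> id) {
            if (id == null) {
                throw new IllegalArgumentException("id is null");
            }
            this.id = id;
        }

        Comparable<?> getID() {
            return id;
        }

        @SuppressWarnings("unchecked")
        @Override
        public int compareTo(SketchItem other) {
            // Assumption: all IDs in one data set share a type.
            // Mixing, say, Long and String IDs would throw ClassCastException here.
            return ((Comparable<Object>) id).compareTo(other.id);
        }
    }

    /** Copies an item without inspecting the concrete type of its ID. */
    static SketchItem cloneItem(Comparable<?> itemID) {
        return new SketchItem(itemID);
    }

    public static void main(String[] args) {
        List<SketchItem> items = new ArrayList<>();
        items.add(cloneItem(42L));
        items.add(cloneItem(7L));
        Collections.sort(items); // relies only on the natural ordering of the IDs
        System.out.println(items.get(0).getID()); // smallest ID first
    }
}
```

The trade-off in this sketch is that type safety across IDs moves from compile time to run time, which seems to match the message's point that carrying the key type through every interface was not worth the mess.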
