Is it reasonable to use 1.5GB of heap for recs? Sure -- assuming you can
allow the JVM to use, say, 2GB or more of heap in total.
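
For reference, the single-machine setup you're describing looks roughly
like this (Taste API, Mahout 0.7-ish; the file name and metric are
placeholders). The whole data model lives in one JVM's heap, so size -Xmx
accordingly:

import java.io.File;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class InMemoryRecs {
  public static void main(String[] args) throws Exception {
    // Launch with e.g. -Xmx2g: the prefs (~1.5GB in your case) are held
    // in heap, and you want headroom for similarity computation and the
    // rest of the app.
    DataModel model = new FileDataModel(new File("prefs.csv")); // userID,itemID,value
    Recommender recommender =
        new GenericItemBasedRecommender(model, new LogLikelihoodSimilarity(model));
    System.out.println(recommender.recommend(123L, 10)); // top 10 for user 123
  }
}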

There are more algorithm choices in Mahout for non-distributed recs. The
primary distributed version is an item-similarity-based approach, but you
can choose from several similarity metrics. There is also a
matrix-factorization-based approach in there.
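
Concretely, kicking off the distributed item-similarity job looks roughly
like this (Mahout 0.7-ish; the HDFS paths are placeholders, and the metric
is whichever you pick). The matrix-factorization one is
ParallelALSFactorizationJob:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;

public class DistributedRecs {
  public static void main(String[] args) throws Exception {
    // Input is userID,itemID,value lines on HDFS; output is top-N recs per user.
    ToolRunner.run(new Configuration(), new RecommenderJob(), new String[] {
        "--input", "/recs/prefs",
        "--output", "/recs/output",
        "--numRecommendations", "10",
        // Any of the predefined metrics works here: SIMILARITY_COSINE,
        // SIMILARITY_PEARSON_CORRELATION, SIMILARITY_TANIMOTO_COEFFICIENT, ...
        "--similarityClassname", "SIMILARITY_LOGLIKELIHOOD",
    });
  }
}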

You can make the distributed version do anything you want; it depends on
how much code you want to write. Yes, you'd have to replace the similarity
computation bits with your own logic.
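
To give the flavor, here's a hypothetical content-aware blend written
against the non-distributed ItemSimilarity interface, since that's the
easiest place to see the idea -- ContentScorer and the weighting are made
up. In the distributed job, the analogous hook is a custom
VectorSimilarityMeasure implementation, whose class name you pass as
--similarityClassname:

import java.util.Collection;

import org.apache.mahout.cf.taste.common.Refreshable;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

/** Hypothetical: blend collaborative similarity with a content-based score. */
public class ContentAwareSimilarity implements ItemSimilarity {

  /** Made-up hook for whatever content scoring you have (tags, text, etc.). */
  public interface ContentScorer {
    double score(long itemID1, long itemID2); // should return a value in [-1,1]
  }

  private final ItemSimilarity delegate; // e.g. LogLikelihoodSimilarity
  private final ContentScorer content;
  private final double weight;           // 0 = pure collaborative, 1 = pure content

  public ContentAwareSimilarity(ItemSimilarity delegate, ContentScorer content, double weight) {
    this.delegate = delegate;
    this.content = content;
    this.weight = weight;
  }

  @Override
  public double itemSimilarity(long itemID1, long itemID2) throws TasteException {
    return (1.0 - weight) * delegate.itemSimilarity(itemID1, itemID2)
        + weight * content.score(itemID1, itemID2);
  }

  @Override
  public double[] itemSimilarities(long itemID1, long[] itemID2s) throws TasteException {
    double[] result = new double[itemID2s.length];
    for (int i = 0; i < itemID2s.length; i++) {
      result[i] = itemSimilarity(itemID1, itemID2s[i]);
    }
    return result;
  }

  @Override
  public long[] allSimilarItemIDs(long itemID) throws TasteException {
    return delegate.allSimilarItemIDs(itemID); // candidates from the collaborative side
  }

  @Override
  public void refresh(Collection<Refreshable> alreadyRefreshed) {
    delegate.refresh(alreadyRefreshed);
  }
}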

Sean

On Fri, Aug 3, 2012 at 5:16 PM, Matt Mitchell <[email protected]> wrote:

> Hi,
>
> I have a pretty neat, non-distributed recommender running here. In
> doing some math on new user growth rate, I thought it might be wise
> for me to turn to a distributed approach now, rather than later. Just
> to make sure I'm on the right track, I calculated 1.5 GB for in-memory
> prefs. Is that pushing it for a single machine recommender?
>
> Am I right in thinking that the distributed approach is a single
> algorithm, unlike the many possible choices for non-distributed? Is it
> possible to inject content-aware logic within a distributed
> recommender? Would I inject that into the map/reduce phase, when the
> recommendations are generated? Curious to know if there are any
> examples out there?
>
> Thanks!
>
