Thanks, guys, for your quick response.

We have a couple of million items and 40 million users (including
anonymous users). Up to 50 similar items were generated per item.

I will try a minimum similarity threshold. Is there any documentation for
it, or a parameter defined in the itemsimilarity job?

What about user-based recommendation? Any ideas how we can make that happen
without loading everything in memory?

Thanks.


On Thu, Jun 21, 2012 at 3:29 PM, Sean Owen <[email protected]> wrote:

> I would suggest pruning similarities near 0, and then treating missing
> similarities as 0 later at runtime. It may take a bit of coding. But
> you should be able to throw away a lot without compromising much of
> the result.
>
> On Thu, Jun 21, 2012 at 10:16 PM, Way Cool <[email protected]> wrote:
> > Hi, guys,
> >
> > For item-based recommendation, I pre-calculated the item similarities on
> > Hadoop per algorithm, which generated 20m rows each. The problem now is I
> > can't just load them into memory via MySQLJDBCInMemoryItemSimilarity with
> > 4GB memory. I tried MySQLJDBCItemSimilarity, however it's way too slow.
> > What are the alternatives?
> >
> > For user-based recommendation, I can't load 100m lines of data model from
> > FileDataModel into memory. It ran out of memory after 20m lines.
> > JDBCDataModel has the same issue as MySQLJDBCItemSimilarity: it is way too
> > slow. Has anyone precalculated the user similarities and then used them to
> > recommend items to a user?
> >
> > Has anyone had similar issues before?
> >
> > Thanks,
> >
> > YG
>
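
For reference, the pruning idea quoted above (drop similarities near 0, then
treat a missing pair as 0 at lookup time) could look roughly like the sketch
below. It is plain Java with no Mahout dependency. The class name
PrunedItemSimilarities, its methods, and the 0.1 threshold are all made up for
illustration and are not part of any Mahout API. A sensible cutoff would have
to be tuned against the real data.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sparse item-item similarity store (not a Mahout class).
 * Pairs whose absolute similarity falls below a threshold are never stored,
 * and any pair that is not stored is reported as 0.0.
 */
final class PrunedItemSimilarities {
    // Outer key: smaller item ID; inner key: larger item ID.
    private final Map<Long, Map<Long, Double>> rows = new HashMap<>();
    private final double minAbsSimilarity;
    private int storedPairs = 0;

    PrunedItemSimilarities(double minAbsSimilarity) {
        this.minAbsSimilarity = minAbsSimilarity;
    }

    /** Stores the pair unless it is near zero; returns true if it was kept. */
    boolean add(long itemA, long itemB, double similarity) {
        if (Math.abs(similarity) < minAbsSimilarity) {
            return false; // pruned: too close to 0 to matter much
        }
        long lo = Math.min(itemA, itemB);
        long hi = Math.max(itemA, itemB);
        Double previous = rows.computeIfAbsent(lo, k -> new HashMap<>()).put(hi, similarity);
        if (previous == null) {
            storedPairs++;
        }
        return true;
    }

    /** Symmetric lookup; a pair that was pruned or never seen counts as 0.0. */
    double similarity(long itemA, long itemB) {
        long lo = Math.min(itemA, itemB);
        long hi = Math.max(itemA, itemB);
        Map<Long, Double> row = rows.get(lo);
        if (row == null) {
            return 0.0;
        }
        Double value = row.get(hi);
        return value == null ? 0.0 : value;
    }

    /** Number of pairs actually held in memory. */
    int size() {
        return storedPairs;
    }

    public static void main(String[] args) {
        PrunedItemSimilarities sims = new PrunedItemSimilarities(0.1); // assumed cutoff
        sims.add(1L, 2L, 0.8);   // kept
        sims.add(1L, 3L, 0.02);  // pruned
        System.out.println(sims.similarity(2L, 1L));
        System.out.println(sims.similarity(1L, 3L));
        System.out.println(sims.size());
    }
}
```

The boxed java.util maps are only there to keep the sketch dependency-free.
With tens of millions of rows, a primitive-keyed map would take considerably
less memory per entry.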
