I would suggest pruning similarities near 0 before you load them, and
then treating missing similarities as 0 at runtime. It may take a bit
of coding, but you should be able to throw away a lot of rows without
compromising much of the result.
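
To make the idea concrete, here is a rough sketch in plain Java. It is not
Mahout code: the class name PrunedSimilarityStore, its methods, and the 0.1
cutoff are all made up for illustration, and the right threshold for your
data is something you would have to find by experiment.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Mahout): keeps only item-item
 * similarities whose magnitude reaches a threshold, and reports 0.0
 * for every pair that was pruned or never seen.
 */
final class PrunedSimilarityStore {
    private final Map<Long, Map<Long, Double>> sims = new HashMap<>();
    private final double threshold;
    private int stored;

    PrunedSimilarityStore(double threshold) {
        this.threshold = threshold;
    }

    /** Stores the pair unless it is near 0; returns true if it was kept. */
    boolean add(long itemA, long itemB, double similarity) {
        if (Math.abs(similarity) < threshold) {
            return false; // pruned: near-zero similarity is dropped
        }
        // Similarity is symmetric, so keep each pair once as (smaller, larger).
        long lo = Math.min(itemA, itemB);
        long hi = Math.max(itemA, itemB);
        Double previous = sims.computeIfAbsent(lo, k -> new HashMap<>()).put(hi, similarity);
        if (previous == null) {
            stored++;
        }
        return true;
    }

    /** Returns the stored similarity, or 0.0 when the pair is missing. */
    double get(long itemA, long itemB) {
        long lo = Math.min(itemA, itemB);
        long hi = Math.max(itemA, itemB);
        Map<Long, Double> row = sims.get(lo);
        if (row == null) {
            return 0.0;
        }
        return row.getOrDefault(hi, 0.0);
    }

    /** Number of pairs that survived pruning. */
    int size() {
        return stored;
    }

    public static void main(String[] args) {
        PrunedSimilarityStore store = new PrunedSimilarityStore(0.1); // assumed cutoff
        store.add(1L, 2L, 0.8);   // kept
        store.add(1L, 3L, 0.02);  // pruned
        System.out.println(store.get(2L, 1L));
        System.out.println(store.get(1L, 3L));
        System.out.println(store.size());
    }
}
```

The sketch uses boxed Long and Double values to stay short. With millions of
surviving rows, a primitive-keyed map would use noticeably less memory. The
same filter could also be applied while reading rows from the database, so
the pruned rows never reach the heap at all.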

On Thu, Jun 21, 2012 at 10:16 PM, Way Cool <[email protected]> wrote:
> Hi, guys,
>
> For item-based recommendation, I pre-calculated the item similarities on
> Hadoop per algorithm, which generated 20m rows each. The problem now is I
> can't just load them into memory via MySQLJDBCInMemoryItemSimilarity with
> 4GB memory. I tried MySQLJDBCItemSimilarity, however it's way too slow.
> What are the alternatives?
>
> For user-based recommendation, I can't load a 100m-line data model from
> FileDataModel into memory. It ran out of memory after 20m lines.
> JDBCDataModel has the same problem as above: it is way too slow. Has anyone
> precalculated user similarities and then recommended items to a user?
>
> Has anyone had similar issues before?
>
> Thanks,
>
> YG
