Yes, all that is true. Precomputing is reasonable -- it's storing it all
in memory that is difficult given the size. If memory is the concern, you
could consider keeping the similarities in the database instead of loading
them into memory. There is not an implementation that reads a database
table, but we could construct one. :-)
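
For illustration, such a table-backed lookup might look roughly like this.
It is only a sketch, not an existing Taste class; the class name and the
table and column names (taste_item_similarity, item_id_a, item_id_b,
similarity) are invented, and the method could be adapted to whatever
ItemSimilarity interface your version defines:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Hypothetical lookup that reads precomputed similarities from a DB table
    // instead of holding them all in memory.
    public class JDBCSimilarityLookup {

      private static final String GET_SIMILARITY_SQL =
          "SELECT similarity FROM taste_item_similarity " +
          "WHERE item_id_a = ? AND item_id_b = ?";

      private final DataSource dataSource;

      public JDBCSimilarityLookup(DataSource dataSource) {
        this.dataSource = dataSource;
      }

      // Returns the stored similarity, or NaN if no row exists for this pair.
      public double itemSimilarity(long itemID1, long itemID2) throws SQLException {
        // Each pair is stored once with the smaller ID first, so one query suffices.
        long a = Math.min(itemID1, itemID2);
        long b = Math.max(itemID1, itemID2);
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(GET_SIMILARITY_SQL)) {
          stmt.setLong(1, a);
          stmt.setLong(2, b);
          try (ResultSet rs = stmt.executeQuery()) {
            return rs.next() ? rs.getDouble(1) : Double.NaN;
          }
        }
      }
    }
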
Sounds good. Thanks, but I will first test your other suggestions and hints.
I don't see how UserSimilarity objects come into this. You would not
use one in an item-based recommender. There is a CachingItemSimilarity
for ItemSimilarity classes.
Because I precompute the item-similarity matrix with a UserSimilarity and this DB table:

|| aItem || aItemCharacteristic || aItemValue ||

so

   * User = aItem
   * Item = aItemCharacteristic
   * Preference = aItemValue

I use this method to get the correlation:

       aCorrelation = aUserSimilarity.userSimilarity(user1, user2);

In my example this is the similarity between aItem1 and aItem2.

If I use a CachingItemSimilarity I must use an ItemSimilarity:

       aCorrelation = aItemSimilarity.itemSimilarity(item1, item2);

In my example this is, as I understand it, the similarity between aItemCharacteristic1 and aItemCharacteristic2, and that isn't interesting for me. So I have to use the UserSimilarity objects and the UserBasedRecommender, although I would prefer the ItemBasedRecommender.
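
Roughly, my setup looks like this (only a sketch; the class names
GenericDataModel, GenericUserPreferenceArray and PearsonCorrelationSimilarity
and the long-ID userSimilarity signature are taken from newer Mahout code and
may differ from the version I actually use):

    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
    import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
    import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
    import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.model.PreferenceArray;
    import org.apache.mahout.cf.taste.similarity.UserSimilarity;

    public class TransposedSimilarityExample {
      public static void main(String[] args) throws TasteException {
        // In this "transposed" model every user is really an aItem, every item
        // is an aItemCharacteristic, and every preference value is an aItemValue.
        FastByIDMap<PreferenceArray> data = new FastByIDMap<PreferenceArray>();

        PreferenceArray item1 = new GenericUserPreferenceArray(2); // aItem1
        item1.setUserID(0, 1L);
        item1.setItemID(0, 100L);  // aItemCharacteristic1
        item1.setValue(0, 3.0f);   // aItemValue
        item1.setItemID(1, 101L);  // aItemCharacteristic2
        item1.setValue(1, 5.0f);
        data.put(1L, item1);

        PreferenceArray item2 = new GenericUserPreferenceArray(2); // aItem2
        item2.setUserID(0, 2L);
        item2.setItemID(0, 100L);
        item2.setValue(0, 2.0f);
        item2.setItemID(1, 101L);
        item2.setValue(1, 4.0f);
        data.put(2L, item2);

        DataModel transposedModel = new GenericDataModel(data);
        UserSimilarity aUserSimilarity = new PearsonCorrelationSimilarity(transposedModel);

        // Because the model is transposed, this "user" similarity is really
        // the similarity between aItem1 and aItem2.
        double aCorrelation = aUserSimilarity.userSimilarity(1L, 2L);
        System.out.println(aCorrelation);
      }
    }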

... hopefully my line of reasoning is not too confusing ;-).
What you are doing now is effectively pre-computing all similarities
and caching them all in memory ahead of time. Using
CachingItemSimilarity would simply do that for you, and would probably
use a lot less memory since only pairs that are needed, and accessed
frequently, will be put into memory. It won't be quite as fast, since
it will still be re-computing similarities from time to time. But
overall you will probably use far less memory for a small decrease in
performance.
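
For example, the wiring could look roughly like this -- just a sketch, and
the particular constructors (CachingItemSimilarity(ItemSimilarity, DataModel),
GenericItemBasedRecommender(DataModel, ItemSimilarity)) and the choice of
LogLikelihoodSimilarity are assumptions that may differ by version:

    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.CachingItemSimilarity;
    import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.recommender.Recommender;
    import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

    public class CachedItemBasedExample {

      public static Recommender build(DataModel model) throws TasteException {
        // Any ItemSimilarity could go here; LogLikelihoodSimilarity is just one choice.
        ItemSimilarity raw = new LogLikelihoodSimilarity(model);
        // CachingItemSimilarity computes a pair's similarity on first access and
        // keeps it in memory, so only pairs that are actually needed get cached.
        ItemSimilarity cached = new CachingItemSimilarity(raw, model);
        return new GenericItemBasedRecommender(model, cached);
      }
    }
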
OK, I will try this first.
Beyond that I could suggest more extreme modifications to the code.
For example, if you are willing to dig into the code to experiment,
you can try something like this: instead of considering every single
item for recommendation every time, pre-compute some subset of items
that are reasonably popular, and then in the code, only consider
recommending these. It is not a great approach, since you sometimes want
to recommend obscure items, but it could help.
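
One way to sketch that idea without patching the recommender internals is a
rescorer that filters out everything except a precomputed set of popular item
IDs. This is only an illustration; IDRescorer and FastIDSet are names from
newer Mahout code, and older Taste versions use a Rescorer over Item objects
instead:

    import org.apache.mahout.cf.taste.impl.common.FastIDSet;
    import org.apache.mahout.cf.taste.recommender.IDRescorer;

    // Lets only a precomputed set of popular items through; everything else
    // is filtered out before the recommender estimates a preference for it.
    public class PopularItemsRescorer implements IDRescorer {

      private final FastIDSet popularItemIDs;

      public PopularItemsRescorer(FastIDSet popularItemIDs) {
        this.popularItemIDs = popularItemIDs;
      }

      @Override
      public double rescore(long itemID, double originalScore) {
        return originalScore; // keep scores unchanged for items that pass the filter
      }

      @Override
      public boolean isFiltered(long itemID) {
        return !popularItemIDs.contains(itemID);
      }
    }

    // Usage: recommender.recommend(userID, 10, new PopularItemsRescorer(popularIDs));
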
I will think about this, but for the moment I will try to recommend all items. If that isn't fast enough and there is no other idea, I will try this.
You should also try using the very latest code from subversion. Just
this week I have made some pretty good improvements to the JDBC code.

Also, it sounds like you are trying to do real-time recommendations,
like synchronously with a user request. This can be hard since it
imposes such a tight time limit. Consider doing recommendations
asynchronously if you can. For example, start computing
recommendations when the user logs in, and maybe on the 2nd page view
5 seconds later, you are ready to recommend something.
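
A rough sketch of that asynchronous pattern, using plain java.util.concurrent
(the recommend(long, int) signature is from newer Mahout code and may differ
in your version):

    import java.util.List;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import org.apache.mahout.cf.taste.recommender.RecommendedItem;
    import org.apache.mahout.cf.taste.recommender.Recommender;

    // Starts computing recommendations at login and hands them back on a
    // later page view, if they are ready by then.
    public class AsyncRecommendations {

      private final Recommender recommender;
      private final ExecutorService executor = Executors.newFixedThreadPool(4);
      private final ConcurrentMap<Long, Future<List<RecommendedItem>>> pending =
          new ConcurrentHashMap<Long, Future<List<RecommendedItem>>>();

      public AsyncRecommendations(Recommender recommender) {
        this.recommender = recommender;
      }

      // Call this when the user logs in.
      public void startComputing(final long userID) {
        pending.put(userID, executor.submit(() -> recommender.recommend(userID, 10)));
      }

      // Call this on a later page view; returns null if not finished yet.
      public List<RecommendedItem> getIfReady(long userID) throws Exception {
        Future<List<RecommendedItem>> future = pending.get(userID);
        if (future == null) {
          return null;
        }
        try {
          return future.get(0, TimeUnit.MILLISECONDS); // never block the request
        } catch (TimeoutException notReadyYet) {
          return null;
        }
      }
    }
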
Yes, real-time is the dream :-) but I know this will be hard to reach. I will first follow your hints, and if the worst-case recommendation no longer takes 80 seconds, I'm happy :-).

best regards
Thomas

--
___________________________________________________________
Thomas Rewig
___________________________________________________________
