Re: Recommending when working with binary data sets

Grant Ingersoll Tue, 30 Sep 2008 10:26:20 -0700

I adhere to Doug's philosophy on patents, and refuse to look at themor do searches for them. I am not a lawyer, nor a judge, nor a patentofficer, and am thus completely unqualified to even begin to venturean opinion.


Sorry,
Grant



On Sep 30, 2008, at 1:09 PM, Otis Gospodnetic wrote:

Hello,

Thanks for the pointers, Grant. Regarding that Amazon item-itemrecommendation. It looks like that's patented:


http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=2&p=1&f=G&l=50&d=PTXT&S1=amazon.ASNM.&OS=AN/(amazon)&RS=AN/amazon

Does that mean one cannot implement this in Taste (or any otherpiece of software)? Even if used in non-shopping purposes?



Thanks,
Otis


----- Original Message ----

From: Grant Ingersoll <[EMAIL PROTECTED]>
To: [email protected]
Sent: Monday, September 29, 2008 9:43:58 AM
Subject: Re: Recommending when working with binary data sets

Not sure I know the answer in terms of Taste, but did a little bit of
digging (mind you, I'm no CF expert, but I'm learning thanks to Taste
and Sean).

At any rate, came across:
Started at Wikipedia's page:
http://en.wikipedia.org/wiki/Collaborative_filtering

which lead to http://en.wikipedia.org/wiki/Slope_One, which then has
an interesting comment about Amazon's item-item approach, which, via
Google Scholar leads to:

http://dsonline.computer.org/portal/site/dsonline/menuitem.9ed3d9924aeb0dcd82ccc6716bbe36ec/index.jsp?&pName=dso_level1&path=dsonline/2003_Archives/0301/d&file=wp1lind.xml&xsl=article.xsl&;jsessionid=LghY1grHgYJpBTLpWjX5NtvQwhH1Bkv9rpfXT4VnpVtDNVpfZ8n0!-1404507079

In particular, see the "How it Works" section.  Essentially, it
describes how they build the item to item similarity matrix, which I
believe is also what you need to do.

HTH,
Grant

On Sep 26, 2008, at 1:52 PM, Otis Gospodnetic wrote:

Hi,

I've been reading the chapter on recommendations in Programming
Collective Intelligence and looking at Taste.  The examples in PCI
all assume scenarios where items to recommend have been rated by
users on some scale.  I understand how items can be recommended to
users using item-based filtering and user-item ratings and why this
is preferred over user-based filtering when the number of users is
larger than the number of items.
But what if all I've got is item-item similarity (content-based) and
there are no user-item ratings?  Say I have a situation where people
simply either consume content (e.g. read an article, watch a
video...) or not consume it (don't read an article, don't watch the

video...). In other words, I really have only yes/no or 1/0 orseen/

not seen type "rating".

I can't really use Euclidean distance or Pearson correlation
coefficient, can I?

What do people use in such scenarios?  Would it make sense to use

http://en.wikipedia.org/wiki/Jaccard_index

for such cases?
... Ah, I do see javadoc in TanimotoCoefficientSimilarity saying
exactly that, good.

But then my question is:
Doesn't the use of Jaccard/Tanimoto mean going back to the expensive
user-user similarity computation?

That is, if I need to recommend items for user U1 don't I need to:
1) have user-user similarity pre-computed (and recomputed
periodically)
2) find top N users U{2,3,4,...N} who are the most similar to U1
3) then for these top N most similar users find their "seen" items

that U1 has not seen (possibly limit this to only recently seenitems)

4) select top N items from 3) and recommend those to U1.

If so, then 1) is again expensive.
And what how would one go about selecting top N items from the list
in this case other than ordering them by user-user similarity?

Of course, something is telling me I'm demonstrating that I don't
yet have the full grasp of item-based filtering.  I hope that's the
case! :)

Thanks,
Otis

Re: Recommending when working with binary data sets

Reply via email to