On Wed, Jan 27, 2010 at 2:27 PM, Sean Owen <sro...@gmail.com> wrote:

> On Wed, Jan 27, 2010 at 2:15 AM, Jake Mannix <jake.man...@gmail.com>
> wrote:
> > There is no need (although there may be much *utility*) in ever thinking
> > about interactions between items (item-item similarity) or users.
> > Content-based recommendations can act purely as a generalized search
> > engine, where the trick is just coming up with the search terms / query
> > features to use for each user.
>
> Yes, you're right, if I understand your meaning correctly.
>
> I think that content-based recommendation of this form is not really a
> conventional recommender system. It smells much more like a search
> problem. I like attributes X Y and Z, so recommend me items with
> attributes X Y and Z: call 'attributes' 'search terms' and 'items'
> 'search results' and yup, it's search. No real ratings here.
>

But how is "presence of term X in both item1 and user1" as a boolean
preference value any different than "user1 has a preference for
attribute(X)"?  Similarly, tf-idf weightings provide a floating point
"rating" for correlations between different item types.
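
To make the point above concrete, here is a minimal sketch in Python that
treats term overlap as a boolean preference and a tf-idf weight as a
floating-point "rating".  The item data, the function names, and the smoothed
idf variant are all assumptions made for illustration; none of them come from
Mahout or Lucene.

```python
import math
from collections import Counter

def tfidf_vectors(items):
    """items: dict of item name -> list of tokens.
    Returns dict of item name -> {term: tf-idf weight}.
    Uses a smoothed idf, log((1 + n) / (1 + df)) + 1 (one common variant)."""
    n = len(items)
    df = Counter()
    for tokens in items.values():
        df.update(set(tokens))  # document frequency: count each term once per item
    vectors = {}
    for name, tokens in items.items():
        tf = Counter(tokens)
        vectors[name] = {
            t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()
        }
    return vectors

def boolean_ratings(user_terms, item_tokens):
    """'Term X is present in both user and item' as a boolean preference."""
    present = set(item_tokens)
    return {t: 1.0 for t in user_terms if t in present}

def tfidf_score(user_terms, item_vector):
    """Sum of the item's tf-idf weights over the terms the user cares about."""
    return sum(item_vector.get(t, 0.0) for t in user_terms)

# Hypothetical toy data.
items = {
    "item1": ["jazz", "piano", "live"],
    "item2": ["rock", "guitar", "live"],
}
user1 = {"jazz", "live"}
vectors = tfidf_vectors(items)
```

With this toy data, `boolean_ratings(user1, items["item1"])` has entries for
both "jazz" and "live", and `tfidf_score` ranks item1 above item2 because
"jazz" is the rarer term.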

The reason why I think this kind of recommender is not so strange is
that you can group together attributes into fields / column-families,
and while presence/absence (or tf-idf, or whatever) can act as
raw ratings, you can then add in arbitrary model weights *between*
fields which are *learned* by feedback (use logistic regression,
for example) from the user-item ratings table.  Does that make sense?


> So I suppose I am resisting implementing this as a recommender system
> since it's well in hand from search frameworks, but I'm not sure how
> valid that is.
>

It *exists* as a search setup, but at least in e.g. Lucene, it's not really
designed to do this, and there are lots of hacks you have to do (the
normalization is wrong, the dot product isn't really cosine, you have to
work to make it into Tanimoto, etc.).  And search setups aren't really
designed to do batch recommendations of this kind either.  Trust me, you can
do this with search, and sometimes it's a good idea, but it's kind of a
kludge, and it's not at all straightforward (but the goal is a totally valid
one!).
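
For reference, the three measures named above differ only in normalization.
The sketch below is plain Python over sparse {term: weight} dicts; it is not
how Lucene scores documents, and the function names are made up for this
example.

```python
import math

def dot(a, b):
    """Raw dot product of two sparse vectors stored as {term: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def cosine(a, b):
    """Dot product divided by the product of the two vector lengths."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)

def tanimoto(a, b):
    """Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    ab = dot(a, b)
    denom = dot(a, a) + dot(b, b) - ab
    return ab / denom if denom else 0.0

# Two vectors pointing the same way but with different lengths.
a = {"x": 1.0, "y": 1.0}
b = {"x": 2.0, "y": 2.0}
```

For `a` and `b` above, the raw dot product is 4, cosine is 1 (same
direction), and Tanimoto is 4 / (2 + 8 - 4) = 2/3, so the three give
different answers on the same pair.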


>
> >  * on a webpage (type W), you have a certain set of features, and users
> > come to that webpage, sometimes with no prior history, so if you want to
> > recommend (serve) ads (type A) to the user, recommending based purely on
> > some kind of content-based correlation between items of type W and A can
> > work.
>
> Alrighty so users are webpages (W) and items are ads (A) and you're
> recommending ads to webpages. And you intend to use the text of W and
> A to recommend? Yup, that's valid, but smells like search, and
> something a search framework would do well on. I would say: figure out
> which Ws 'prefer' which As based on clicks, and maybe base ad
> recommendations on textual similarity between As. That's a(n
> item-based) recommender.
>

But what you're suggesting here is one particular choice of solution - it's
presupposing that that one is the best.  Why not say:

  similarity(W,A) = alpha_0 * (W_title * A_title)
                  + alpha_1 * (W_header * A_title)
                  + alpha_2 * (W_subHeader * A_body)
                  + alpha_3 * (W_tags * A_landingURL) + ...

and then train your alpha_i to optimize clickthrough?
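
A rough sketch of what that could look like, in plain Python.  Everything here
is an assumption made for illustration: the field-pair list mirrors the
formula above, each field is a sparse {term: weight} dict, clicks are 0/1
labels, and the alphas are fit with a simple stochastic-gradient logistic
regression (which also adds a bias term that the formula does not have).

```python
import math

# (webpage field, ad field) pairs, one alpha per pair, as in the formula above.
FIELD_PAIRS = [
    ("title", "title"),
    ("header", "title"),
    ("subHeader", "body"),
    ("tags", "landingURL"),
]

def dot(a, b):
    """Dot product of two sparse {term: weight} dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def pair_features(page, ad):
    """One per-field-pair dot product for each entry in FIELD_PAIRS."""
    return [dot(page.get(wf, {}), ad.get(af, {})) for wf, af in FIELD_PAIRS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_alphas(examples, lr=0.1, epochs=200):
    """examples: list of (feature list, clicked) with clicked in {0, 1}.
    Returns (alphas, bias) fit by stochastic gradient ascent on the
    logistic log-likelihood."""
    alphas = [0.0] * len(FIELD_PAIRS)
    bias = 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(bias + sum(a * xi for a, xi in zip(alphas, x)))
            err = y - p
            bias += lr * err
            alphas = [a + lr * err * xi for a, xi in zip(alphas, x)]
    return alphas, bias

def similarity(page, ad, alphas):
    """similarity(W, A) = sum_i alpha_i * (W_field_i . A_field_i)."""
    return sum(a * f for a, f in zip(alphas, pair_features(page, ad)))

# Hypothetical click log: clicks follow the title/title match, and the
# tags/landingURL match carries no signal.
examples = [
    ([1.0, 0.0, 0.0, 0.0], 1),
    ([1.0, 0.0, 0.0, 1.0], 1),
    ([0.0, 0.0, 0.0, 1.0], 0),
    ([0.0, 0.0, 0.0, 0.0], 0),
]
alphas, bias = train_alphas(examples)
```

On this toy click log the learned weight for the title/title pair comes out
well above the one for tags/landingURL, which is the whole idea: the data
decides which field interactions matter, rather than fixing them up front.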



> Well there's no reason that a recommender framework shouldn't support
> search-like approaches. I have convinced myself that what I have on my
> hands is really a collaborative filtering framework. I think it's
> somewhere on the roadmap, therefore, to expand into these other
> techniques.
>

Why should we limit ourselves to just a CF framework?  Why not a
recommendation framework which can easily do both?

  -jake
