Also, don't make algorithm choices based on small data samples. With more data, the ranking of which algorithms work well will change.
On Mon, Dec 3, 2012 at 10:04 PM, Sean Owen <[email protected]> wrote:

> You may do better with a latent feature approach -- working in lower
> dimensional space won't have the problem of sheer sparsity preventing
> you from finding any associations.
>
> I would not use these precision and recall scores as they will be
> mostly noise. If your similarity metric is sound you should be able to
> rely on the score. Just use the usual log-likelihood.
>
> It's a slightly complex question, but yes you should be able to
> compare scores across users and yes should be able to determine a
> cutoff empirically which means the result is good enough for your
> purpose.
>
> Sean
>
> On Mon, Dec 3, 2012 at 9:22 PM, Pat Ferrel <[email protected]> wrote:
> > Great, thanks. Not sure if it's worth changing because as I said my
> > data is very very incomplete. This is an experiment and we're mining
> > a site "politely" so it will take months to accumulate a good share.
> >
> > In the meantime to temporarily get around the low rate of
> > cooccurrence we look at the strength of the recommendation. We're
> > using a small neighborhood (3). Looking through all of the
> > recommendations we get a few pretty high strengths--say 2.8-1.5.
> > While it's hard to tell by just looking these seem to be reasonably
> > good recommendations.
> >
> > The intuition for all of this being, we have a very weak recommender
> > for the average user but a good one for a lucky few. I suppose adding
> > the user's individual P and R to the eval criteria would help
> > validate this judgement? The P&R come from a slightly different
> > recommender than the actual recommendations due to using the eval
> > subset. It seems like a high strength would correspond to a higher
> > P&R since the strength is the sum of user similarities.
> >
> > We are then using these few highly ranked recommendations to get an
> > early somewhat subjective look at value. Earlier I asked if strengths
> > could be used to compare one user's recommendation to another's and
> > concluded that they could (all caveats about the actual meaning of
> > strengths kept in mind). Any obvious flaw in this reasoning?
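
For reference, the "usual log-likelihood" mentioned above is commonly computed as a log-likelihood ratio over a 2x2 table of cooccurrence counts. Below is a minimal Python sketch of that calculation, not the code of any particular library. The function names and the count names `k11`, `k12`, `k21`, `k22` are assumptions made for illustration: `k11` is the number of times both events occur together, `k12` and `k21` are the counts of one event without the other, and `k22` is the count of neither.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts:
    # N*log(N) - sum(k*log(k)), where N is the total.
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    # Hypothetical helper: LLR (G^2) for a 2x2 cooccurrence table.
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb small negative values from rounding.
    return max(0.0, 2.0 * (row_entropy + col_entropy - matrix_entropy))


if __name__ == "__main__":
    # Independent events score near zero; strongly associated events score high.
    print(log_likelihood_ratio(5, 5, 5, 5))
    print(log_likelihood_ratio(10, 0, 0, 10))
```

A score near zero means the cooccurrence is about what chance would predict, so an empirical cutoff on this score is one way to separate noise from real associations. The right cutoff depends on the data and would have to be determined by inspection.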
