Thanks for your help.
Yes, I think a time-based division of test vs. training data probably
would make sense, since that will correspond to our actual intended
practice. But before I worry about that, I seem to have a more fundamental
problem that is giving me 0 precision and 0 recall all the time...
-Mike
On 06/07/2013 02:58 PM, Sean Owen wrote:
In point 1, I don't think I'd say it that way. Test/training is not
divided by user, because then every user would be either 100% in the
training data or 100% in the test data. Instead you hold out part of the
data for each user, or at least for some subset of users. Then you can
see whether recs for those users match the held-out data.
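A per-user hold-out along these lines could be sketched as follows. This is only an illustration, not what the framework does internally: the function name, the dict-of-sets layout, and the 20% fraction are all assumptions.

```python
import random

def hold_out_per_user(prefs, fraction=0.2, seed=0):
    """Split each user's items into a training part and a held-out part.

    prefs maps a user id to the set of items that user interacted with
    (boolean preferences). The layout and the default fraction are
    assumptions made for this sketch.
    """
    rng = random.Random(seed)
    train, held_out = {}, {}
    for user, items in prefs.items():
        shuffled = sorted(items)       # sort first so the shuffle is reproducible
        rng.shuffle(shuffled)
        k = int(len(shuffled) * fraction)
        held_out[user] = set(shuffled[:k])
        train[user] = set(shuffled[k:])
    return train, held_out

# Hypothetical data: every user appears in both parts of the split.
example_prefs = {"u1": {"a", "b", "c", "d", "e"}, "u2": {"x", "y"}}
example_train, example_held_out = hold_out_per_user(example_prefs)
```

Users with very few items end up with an empty held-out set (u2 here), which is one reason to evaluate only a subset of users.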
Yes, then you see how well the held-out set matches the predictions by
computing the ratios that give you precision/recall.
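For one user, those ratios could be computed roughly like this. It is a minimal sketch using the usual precision-at-N and recall-at-N definitions; the function name and the sample data are made up for illustration.

```python
def precision_recall_at_n(recommended, held_out, n):
    """Precision and recall of the top-n recommendations for one user.

    recommended: ranked list of item ids returned by the recommender.
    held_out:    items withheld from training for this user.
    """
    top_n = recommended[:n]
    relevant = set(held_out)
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 2 of the 4 recommendations are in the held-out set of 3.
example_p, example_r = precision_recall_at_n(["a", "b", "c", "d"], ["b", "d", "e"], 4)
```

Averaging these per-user values over the evaluated users gives overall figures. If none of the recommended items are in the held-out set, both values are 0.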
The key question is really how you choose the test data. It's implicit
data; one is as good as the next. In the framework I think it just
randomly picks a subset of the data. You could also split by time;
that's a defensible way to do it: training data up to time t and test
data after time t.
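A time-based split might look something like this. It is a sketch that assumes each preference carries a timestamp; the (user, item, timestamp) tuple layout and the function name are hypothetical.

```python
def split_by_time(events, t):
    """Split (user, item, timestamp) tuples at time t.

    Events at or before t go to training; events after t go to test.
    """
    train = [e for e in events if e[2] <= t]
    test = [e for e in events if e[2] > t]
    return train, test

# Hypothetical events with integer timestamps.
example_events = [
    ("u1", "a", 10),
    ("u1", "b", 20),
    ("u2", "a", 15),
    ("u2", "c", 30),
]
example_train_events, example_test_events = split_by_time(example_events, 20)
```

Users whose activity falls entirely after t have no training data, so they would need to be skipped or handled separately in the evaluation.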
On Fri, Jun 7, 2013 at 7:51 PM, Michael Sokolov wrote:
I'm trying to evaluate a few different recommenders based on boolean
preferences. The "in Action" book suggests using a precision/recall metric,
but I'm not sure I understand what that does, and in particular how it
divides my data into test/train sets.
What I think I'd like to do is:
1. Divide the data by user: identify a set of training data with data
from 80% of the users, and test using the remaining 20% (say).
2. Build a similarity model from the training data.
3. For the test users, divide their data in half; a "training" set and an
evaluation set. Then for each test user, use their training data as input
to the recommender, and see if it recommends the data in the evaluation set
or not.
Is this what the precision/recall test is actually doing?
--
Michael Sokolov
Senior Architect
Safari Books Online