evaluating recommender with boolean prefs

2013-06-07 Thread Michael Sokolov
I'm trying to evaluate a few different recommenders based on boolean preferences. The Mahout in Action book suggests using a precision/recall metric, but I'm not sure I understand what that does, and in particular how it divides my data into test/train sets. What I think I'd like to do is: 1.

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Sean Owen
In point 1, I don't think I'd say it that way. It's not true that test/training is divided by user, because every user would either be 100% in the training or 100% in the test data. Instead you hold out part of the data for each user, or at least, for some subset of users. Then you can see whether
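
A minimal sketch of the per-user hold-out described above, in plain Python rather than Mahout. The names `split_per_user` and `precision_recall`, the 30% hold-out fraction, and the dict-of-sets data layout are all assumptions made for illustration; this is not how Mahout's evaluator is implemented.

```python
import random

def split_per_user(prefs, holdout_fraction=0.3, seed=42):
    """prefs maps user -> set of items (boolean prefs).
    Every user keeps some items in training; a fraction is held out for testing."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, items in prefs.items():
        shuffled = sorted(items)          # sort first so the shuffle is reproducible
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * holdout_fraction)
        test[user] = set(shuffled[:n_test])
        train[user] = set(shuffled[n_test:])
    return train, test

def precision_recall(recommended, held_out):
    """Precision: share of recommendations that were held out.
    Recall: share of held-out items that were recommended."""
    hits = len(set(recommended) & held_out)
    precision = hits / len(recommended) if recommended else float("nan")
    recall = hits / len(held_out) if held_out else float("nan")
    return precision, recall
```

A recommender trained on `train` is then scored by whether its top recommendations for a user recover that user's items in `test`.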

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Koobas
Since I am primarily an HPC person, this is probably a naive question from the ML perspective. What if, when computing recommendations, we don't exclude what the user already has, and then see if the items he has end up being recommended to him (compute some appropriate metric / ratio)? Wouldn't that be

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Michael Sokolov
Thanks for your help. Yes, I think a time-based division of test vs. training data would probably make sense, since that corresponds to our actual intended practice. But before I worry about that, I seem to have a more fundamental problem that is giving me 0 precision and 0 recall all the
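
The time-based division mentioned here could look roughly like the following. This is only a sketch: the `(user, item, timestamp)` tuple layout, the function name `split_by_time`, and the single global cutoff are assumptions, not something taken from the thread or from Mahout.

```python
def split_by_time(events, cutoff):
    """events: iterable of (user, item, timestamp) tuples.
    Everything before the cutoff trains the model; everything at or after it
    is the 'future' the recommender is asked to predict."""
    train = [(u, i) for u, i, t in events if t < cutoff]
    test = [(u, i) for u, i, t in events if t >= cutoff]
    return train, test
```

This mirrors production use, where the model only ever sees interactions that happened before the moment it makes recommendations.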

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Sean Owen
It depends on the algorithm I suppose. In some cases, the already-known items would always be top recommendations and the test would tell you nothing. Just like in an RMSE test -- if you already know the right answers your score is always a perfect 0. But in some cases I agree you could get some

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Koobas
On Fri, Jun 7, 2013 at 4:50 PM, Sean Owen sro...@gmail.com wrote: It depends on the algorithm I suppose. In some cases, the already-known items would always be top recommendations and the test would tell you nothing. Just like in an RMSE test -- if you already know the right answers your

Re: evaluating recommender with boolean prefs

2013-06-07 Thread simon.2.thompson
But why would she want the things she has? - Original Message - From: Koobas [mailto:koo...@gmail.com] Sent: Friday, June 07, 2013 08:06 PM To: user@mahout.apache.org user@mahout.apache.org Subject: Re: evaluating recommender with boolean prefs Since I am primarily an HPC person

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Sean Owen
Yes, it makes sense in the case of, for example, ALS. With or without this idea, the more general point is that this result is still problematic. It is somewhat useful for comparing in a relative sense; I'd rather have a recommender that stacks my input values somewhere near the top than the bottom. But

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Sean Owen
I believe the suggestion is just for purposes of evaluation. You would not return these items in practice, yes. Although there are cases where you do want to return known items. For example, maybe you are modeling user interaction with restaurant categories. This could be useful, because as soon

Re: error when evaluating recommender w/boolean prefs

2012-07-15 Thread Sean Owen
This sounds like a target leak, like your test data is actually getting copied into the training data. On Sun, Jul 15, 2012 at 1:08 AM, Matt Mitchell goodie...@gmail.com wrote: One strange thing, and I'm going to dig through the MIA book tonight, is that my user based recommendation evaluator

Re: error when evaluating recommender w/boolean prefs

2012-07-15 Thread Matt Mitchell
OK hmm, is it possible this could happen from duplicate user/pref/score values in my data? How does Mahout handle duplicate entries in data, whether in a load-once file or coming from a refresh? On Sun, Jul 15, 2012 at 4:01 AM, Sean Owen sro...@gmail.com wrote: This sounds like a target leak,

Re: error when evaluating recommender w/boolean prefs

2012-07-15 Thread Sean Owen
Duplicates are handled by over-writing. There's not a way to represent two states of a user-item association simultaneously. It could be an issue only if you made your own data splitter that didn't properly put stuff in one bucket or the other, but I don't know that this is the issue here. You
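
The overwrite behaviour described above can be modelled with a map keyed by the user-item pair. This is an illustrative model only, with a hypothetical `load_prefs` helper; it is not Mahout's loading code.

```python
def load_prefs(rows):
    """rows: iterable of (user, item, value).
    A (user, item) pair has exactly one state, so a later row for the
    same pair replaces the earlier one."""
    prefs = {}
    for user, item, value in rows:
        prefs[(user, item)] = value
    return prefs
```

Under this model duplicates alone cannot put the same association in both the training and the test bucket; a hand-written splitter that works on raw rows instead of the deduplicated map could.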

Re: error when evaluating recommender w/boolean prefs

2012-07-15 Thread Matt Mitchell
I got it! The problem was my Clojure code. I was not using the model argument passed into my builder method; instead I was referencing a local model var outside the method -- a state problem. All working great now. Very interesting results too... sure to keep me busy for a while! On Sun, Jul
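
The bug described here, a builder that ignores the model it is handed, can be sketched in Python as follows. `ToyRecommender`, `buggy_builder`, and `correct_builder` are hypothetical names and the data is made up; the original code was Clojure calling Mahout and is not shown in the thread.

```python
class ToyRecommender:
    """Stand-in for a real recommender: it only remembers the model it was built on."""
    def __init__(self, model):
        self.model = model

    def knows(self, user, item):
        return item in self.model.get(user, set())

# The full data set, defined outside the builder.
full_model = {"u1": {"a", "b", "c"}}

def buggy_builder(training_model):
    # Bug: ignores the argument and closes over the outer, complete model,
    # so held-out test data leaks into the recommender.
    return ToyRecommender(full_model)

def correct_builder(training_model):
    # Uses only the training split the evaluator passes in.
    return ToyRecommender(training_model)

# The evaluator holds out item "c" for u1 and passes in the rest.
training_model = {"u1": {"a", "b"}}
```

With `buggy_builder` the recommender already knows the held-out item `"c"`, which is the target leak suspected earlier in the thread; with `correct_builder` it does not.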

Re: error when evaluating recommender w/boolean prefs

2012-07-14 Thread Sean Owen
It still means the same thing. 220K lines may still be too sparse to get results. Also try removing your threshold and let it pick. On Sat, Jul 14, 2012 at 3:24 AM, Matt Mitchell goodie...@gmail.com wrote: Hmm, still happening. I have a 220k line file with user, item and pref value. I am still

Re: error when evaluating recommender w/boolean prefs

2012-07-14 Thread Matt Mitchell
Hey Sean, are you talking about using GenericRecommenderIRStatsEvaluator/CHOOSE_THRESHOLD? I am already using that. I have however bumped up the training size, maybe this will help. - Matt On Sat, Jul 14, 2012 at 3:13 AM, Sean Owen sro...@gmail.com wrote: It still means the same thing. 220K

Re: error when evaluating recommender w/boolean prefs

2012-07-14 Thread Sean Owen
Ah yes I see that now. Try increasing evaluation percentage to 1.0. At the moment you're only using 10% of the data. That's a quick way to make a bigger test! Also, what happens if you set the threshold to 0.5? On Sat, Jul 14, 2012 at 4:56 PM, Matt Mitchell goodie...@gmail.com wrote: Hey Sean,

Re: error when evaluating recommender w/boolean prefs

2012-07-14 Thread Matt Mitchell
Thanks Sean, your suggestions worked! I'm now getting results back from the evaluator. I also tweaked the JVM settings and things are running quicker. One strange thing, and I'm going to dig through the MIA book tonight, is that my user based recommendation evaluator returns 0.0 no matter what I

Re: error when evaluating recommender w/boolean prefs

2012-07-13 Thread Matt Mitchell
Hmm, still happening. I have a 220k line file with user, item and pref value. I am still getting the NaN error when evaluating. I'm not sure what to do. It also takes a long time for this error to pop up, around 1 hour. I hate to throw code out like this, but maybe it'll help someone... help me.

Re: error when evaluating recommender w/boolean prefs

2012-07-07 Thread Sean Owen
What it really means is that there is not enough data to make a meaningful test here. On Sat, Jul 7, 2012 at 1:28 AM, Matt Mitchell goodie...@gmail.com wrote: Hi, I have a recommender, with a boolean prefs model. I am following the instructions in the MIA book, but only get this exception:
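
One way "not enough data" can surface as a NaN precision is sketched below. It assumes, for illustration only, that users with too few preferences are skipped and that the final score is an average over the remaining users; the `2 * at` cutoff and both function names are assumptions, not a description of Mahout's internals.

```python
def usable_users(prefs, at):
    """Keep only users with enough items to hold out `at` of them
    and still leave something to train on (assumed cutoff: 2 * at)."""
    return [u for u, items in prefs.items() if len(items) >= 2 * at]

def mean_precision(per_user_precisions):
    """Average precision over the evaluated users.
    With zero evaluated users the average is 0/0, reported here as NaN."""
    if not per_user_precisions:
        return float("nan")
    return sum(per_user_precisions) / len(per_user_precisions)
```

If every user is filtered out, there is nothing to average and the result is NaN rather than a low score, which is consistent with the advice to enlarge the evaluated portion of the data.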

Re: error when evaluating recommender w/boolean prefs

2012-07-07 Thread Matt Mitchell
Thanks Sean, you're absolutely right. Things are working nicely now. - Matt On Sat, Jul 7, 2012 at 3:48 AM, Sean Owen sro...@gmail.com wrote: What it really means is that there is not enough data to make a meaningful test here. On Sat, Jul 7, 2012 at 1:28 AM, Matt Mitchell

error when evaluating recommender w/boolean prefs

2012-07-06 Thread Matt Mitchell
Hi, I have a recommender, with a boolean prefs model. I am following the instructions in the MIA book, but only get this exception: Illegal precision: NaN [Thrown class java.lang.IllegalArgumentException] Restarts: 0: [QUIT] Quit to the SLIME top level Backtrace: 0: