I'd like to compare the accuracy, precision, and recall of various vector similarity measures with regard to our data sets. Ideally, I'd like to do that for RecommenderJob, including CooccurrenceCount. However, I don't think RecommenderJob supports calculating these performance metrics.
Alternatively, I could use the evaluator logic in the non-Hadoop, item-based recommenders, but they do not seem to support CooccurrenceCount as a similarity measure, or am I wrong? Reading the archived conversations, I can see that others asked a similar question in 2011 (http://comments.gmane.org/gmane.comp.apache.mahout.user/9758), but there seems to be no clear guidance.

I am also unsure whether it is valid to split the data set into training and testing subsets in the usual way: a test user's key characteristic is the set of items they have preferred, and there is no "model" to fit them to, so to speak; if we stripped their preferences, they would become anonymous users. Am I right in thinking that I could test RecommenderJob by feeding it X random preferences of a user, hiding the remainder of their preferences, and checking whether the hidden items appear among their recommendations? However, that approach changes what a user "likes" (by hiding some of their preferences for testing purposes), and I would be concerned about the value of the resulting recommendations. Am I in a loop?

Is there a way to somehow tap into the recommendation process to get an accuracy metric out? Did anyone, perhaps, share a method or a script (R, Python, Java) for evaluating RecommenderJob results?

Many thanks,
Rafal Lukawiecki
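P.S. To make the hold-out idea concrete, here is a minimal Python sketch of what I have in mind, computing precision and recall at N against each user's hidden preferences. The function name and the dict-based inputs are my own invention, not anything from Mahout; the assumption is that the RecommenderJob output has already been parsed into a per-user list of recommended item IDs:

```python
# Hypothetical sketch: precision/recall at N for a hold-out evaluation.
# `recommended` maps user id -> ordered list of recommended item ids;
# `hidden` maps user id -> the preferences withheld from training.
# Neither structure comes from Mahout itself; parsing RecommenderJob's
# output into this shape is assumed to have happened already.

def precision_recall_at_n(recommended, hidden, n=10):
    """Average precision@n and recall@n over all users with hidden items."""
    precisions, recalls = [], []
    for user, hidden_items in hidden.items():
        if not hidden_items:
            continue  # nothing was withheld for this user
        top_n = recommended.get(user, [])[:n]
        hits = len(set(top_n) & set(hidden_items))
        precisions.append(hits / len(top_n) if top_n else 0.0)
        recalls.append(hits / len(hidden_items))
    users = len(precisions)
    if users == 0:
        return 0.0, 0.0
    return sum(precisions) / users, sum(recalls) / users

# Toy example: items 5 and 7 were hidden for user "u1", and the
# recommender returned [5, 9, 7] as its top three.
recs = {"u1": [5, 9, 7]}
hidden = {"u1": [5, 7]}
p, r = precision_recall_at_n(recs, hidden, n=3)
# p = 2/3 (two of three recommendations were hidden preferences)
# r = 1.0 (both hidden preferences were recovered)
```

Of course, this sketch only measures whether hidden preferences are recovered; it says nothing about whether the other recommendations are good, which is exactly the concern I raised above.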
