Last I checked, MLlib has two types of ALS recommenders: implicit and explicit. With the implicit one you supply any arbitrary score, like a 1 for a purchase. With the explicit one you input ratings, and it is expected to predict ratings for an individual. Both, IIRC, also have a regularization parameter that affects the scoring, so you have to experiment with it using cross-validation to see where you get the best results.
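A minimal sketch of that kind of experiment, in plain Python rather than MLlib, assuming you already have hold-out predictions for a few candidate regularization values. It scores each candidate by root-mean-square error (RMSE) against the held-out ratings, then snaps the winner's raw scores to the nearest rating on a 1 to 5 scale. All numbers and the helper names `rmse` and `closest_rating` are made up for illustration; they are not part of MLlib or the Recommendation template.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of scores."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def closest_rating(score, low=1.0, high=5.0, step=1.0):
    """Snap a raw score to the nearest valid rating between low and high."""
    clamped = min(max(score, low), high)
    return low + round((clamped - low) / step) * step


# Hypothetical held-out ratings (assumption: a 1 to 5 scale).
holdout = [5.0, 3.0, 4.0, 1.0]

# Hypothetical predictions for the same hold-out pairs, one list per
# candidate regularization value. In practice these would come from
# training the recommender once per candidate.
predictions_by_reg = {
    0.01: [7.8, 2.1, 5.9, 0.2],
    0.1: [5.4, 3.2, 4.3, 1.1],
    1.0: [3.9, 3.4, 3.6, 2.8],
}

# Keep the candidate whose predictions have the lowest hold-out RMSE.
scores = {reg: rmse(preds, holdout) for reg, preds in predictions_by_reg.items()}
best_reg = min(scores, key=scores.get)

# Raw scores can still fall outside the scale, so snap them to valid ratings.
snapped = [closest_rating(p) for p in predictions_by_reg[best_reg]]
```

The snapping step only tidies the output; it does not make an out-of-range score any more trustworthy, so the RMSE should be computed on the raw predictions.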
There is an old metric used for this type of thing called RMSE (root-mean-square error) which, when minimized, will give you scores that most closely match the actual scores in the hold-out set (see Wikipedia on cross-validation and RMSE). You may have to use explicit ALS and tweak the regularization param to get the lowest RMSE. I doubt anything will guarantee that the predictions fall exactly within the range of ratings, so you'll then need to pick the closest rating.

On Dec 18, 2017, at 10:42 AM, GMAIL <[email protected]> wrote:

That is, the predicted scores that the Recommender returns can not just be multiplied by two, but may be completely wrong? I can not, say, just divide the predictions by 2 and pretend that everything is fine?

2017-12-18 21:35 GMT+03:00 Pat Ferrel <[email protected]>:

The UR and the Recommendations Template use very different technology underneath. In general the scores you get from recommenders are meaningless on their own. Using ratings as numerical values with a "Matrix Factorization" recommender like the ones in MLlib, upon which the Recommendations Template is based, requires a regularization parameter. I don't know for sure, but maybe this is why the results don't come in the range of the input ratings. I haven't looked at the code in a long while.

If you are asking about the UR, it does not take numeric ratings and its scores cannot be compared to them.

For many reasons that I have written about before, I always warn people about using ratings, which have been discontinued as a source of input by Netflix (who have removed them from their UX) and many other top recommender users. Not the least of these reasons is that ratings are ambiguous and don't directly relate to whether a user might like an item. For instance, most video sources now use something like the length of time a user watches a video, and review sites prefer "like" and "dislike". The first is implicit and the second is quite unambiguous.
On Dec 18, 2017, at 12:32 AM, GMAIL <[email protected]> wrote:

Does it seem to me or UR strongly differs from Recommender? At least I can't find method getRatings in class DataSource, which contains all events, in particular, "rate", that I needed.

2017-12-18 11:14 GMT+03:00 Noelia Osés Fernández <[email protected]>:

I didn't solve the problem :( Now I use the universal recommender

On 18 December 2017 at 09:12, GMAIL <[email protected]> wrote:

And how did you solve this problem? Did you divide prediction score by 2?

2017-12-18 10:40 GMT+03:00 Noelia Osés Fernández <[email protected]>:

I got the same problem. I still don't know the answer to your question :(

On 17 December 2017 at 14:07, GMAIL <[email protected]> wrote:

I thought that there was a 5 point scale, but if so, why do I get predictions of 7, 8, etc.? P.S. Sorry for my English.

2017-12-17 16:05 GMT+03:00 GMAIL <[email protected]>:

Hi. I train with Recommendation Engine Template. I use data from sample_movielens_data.txt and there all score less than 5, but I get prediction with score more than 5. What it meaning?
-- <http://www.vicomtech.org/> Noelia Osés Fernández, PhD Senior Researcher | Investigadora Senior [email protected] <mailto:[email protected]> +[34] 943 30 92 30 Data Intelligence for Energy and Industrial Processes | Inteligencia de Datos para Energía y Procesos Industriales <https://www.linkedin.com/company/vicomtech> <https://www.youtube.com/user/VICOMTech> <https://twitter.com/@Vicomtech_IK4> member of: <http://www.graphicsmedia.net/> <http://www.ik4.es/> Legal Notice - Privacy policy <http://www.vicomtech.org/en/proteccion-datos>
