Github user coderh commented on the pull request:

    https://github.com/apache/spark/pull/597#issuecomment-45847264
  
    Here are the values I have tried; the seed is set to 42.
    
    "in" and "out" mean in-sample (training set) and out-of-sample (test set).
    
    # #factor = 12, lambda = 1, alpha = 1
         iter 20  => 
                     MAP_in = 0.035399855240788425
                     MAP_out = 0.007907455900941737
                     EPR_in  = 0.4902389595686534
                     EPR_out = 0.4931204751436468
    
          iter 40  => 
                      MAP_in = 0.033210624652830374
                      MAP_out = 0.007158070987320343
                      EPR_in  = 0.4907502816419743
                      EPR_out = 0.49214166351173705
    
    # #factor = 50, alpha = 1, iter = 30
          lambda = 1 => 
                      MAP_in = 0.029096938174350682
                      MAP_out = 0.006634856811818636
                      EPR_in  = 0.4928298931862564
                      EPR_out = 0.49328834081999423
    
         lambda = 0.001 => 
                      MAP_in = 0.02903970778838223
                      MAP_out = 0.006569378517284138
                      EPR_in  = 0.4929466287464198
                      EPR_out = 0.49337539845412665
    
    I have not tried other metrics; as mentioned before, RMSE is not well
    suited here. I will give AUC and ROC a try.
    
    I listed some code snippets here; the gist contains the two evaluation
    methods and the main program:
    https://gist.github.com/coderh/05a83be081c1f713e15b
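
    For reference, here is a minimal sketch of how the EPR (expected
    percentile rank, the rank-bar metric from the implicit-feedback ALS
    paper) could be computed. It is only an illustration under my own
    assumptions (a MatrixFactorizationModel and an RDD[Rating] as held-out
    data; the function name is mine), not the code from the gist, and the
    cartesian scoring step is there to show the definition, not to scale:

        import org.apache.spark.SparkContext._
        import org.apache.spark.rdd.RDD
        import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}

        // Expected percentile rank: lower is better, ~0.5 corresponds to a random ranking.
        def expectedPercentileRank(model: MatrixFactorizationModel,
                                   data: RDD[Rating]): Double = {
          // Score every evaluated user against the full product catalogue seen in
          // `data` (the cartesian product is expensive; illustration only).
          val products = data.map(_.product).distinct()
          val userProducts = data.map(_.user).distinct().cartesian(products)
          val scored = model.predict(userProducts)
                            .map(p => (p.user, (p.product, p.rating)))

          // Turn each user's ranked list into percentile ranks in [0, 1]
          // (0 = recommended first, 1 = recommended last).
          val percentiles = scored.groupByKey().flatMap { case (user, items) =>
            val ranked = items.toSeq.sortBy(-_._2).map(_._1).zipWithIndex
            val n = ranked.size
            ranked.map { case (product, idx) =>
              ((user, product), if (n > 1) idx.toDouble / (n - 1) else 0.0)
            }
          }

          // rank-bar = sum(r_ui * rank_ui) / sum(r_ui) over the observed pairs.
          val observed = data.map(r => ((r.user, r.product), r.rating))
          val (num, den) = observed.join(percentiles).values
            .map { case (r, pct) => (r * pct, r) }
            .reduce((x, y) => (x._1 + y._1, x._2 + y._2))
          num / den
        }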

