[ 
https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826078#comment-15826078
 ] 

Nick Pentreath commented on SPARK-14409:
----------------------------------------

[~danilo.ascione] [~roberto.mirizzi] thanks for the code examples. Both seem 
reasonable and I like the DataFrame-based solutions here. The ideal solution 
would likely take a few elements from each design.

One aspect that concerns me is how you are generating recommendations from ALS. 
It appears that you will be using the current output of {{ALS.transform}}, so 
you're computing a ranking metric in a scenario where you only recommend the 
subset of user-item combinations that occur in the evaluation data set. In that 
sense it is more of a "re-ranking" evaluation metric. I'd expect the ranking 
metric here to quite dramatically overestimate true performance, since in the 
real world you would generate recommendations from the complete set of 
available items.
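
To make the concern concrete, here is a minimal toy sketch in plain Python (no 
Spark involved). The scores, item ids, and helper names are all made up for 
illustration, and it assumes that items the user never interacted with count 
as non-relevant and that none of the catalog items were in this user's 
training data. It compares precision@k when ranking only the items in the 
evaluation set against ranking the whole catalog:

```python
def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant) / k


# Hypothetical predicted scores for one user over a 10-item catalog.
scores = {
    "i0": 0.95, "i1": 0.90, "i2": 0.85, "i3": 0.80, "i4": 0.75,
    "i5": 0.70, "i6": 0.65, "i7": 0.60, "i8": 0.55, "i9": 0.50,
}

# Items this user has in the held-out evaluation set, and the ones they liked.
eval_items = {"i3", "i6", "i8", "i9"}
relevant = {"i3", "i6"}


def rank(candidates):
    """Order candidate items by predicted score, best first."""
    return sorted(candidates, key=lambda item: scores[item], reverse=True)


k = 2

# "Re-ranking" setup: only score user-item pairs present in the evaluation set.
rerank_p = precision_at_k(rank(eval_items), relevant, k)

# Top-k setup: recommend from the complete catalog.
full_p = precision_at_k(rank(scores.keys()), relevant, k)

print(f"precision@{k} (evaluation items only): {rerank_p}")
print(f"precision@{k} (full catalog):          {full_p}")
```

In this contrived case the evaluation-only ranking gets a perfect score while 
the full-catalog ranking gets zero, because higher-scored items outside the 
evaluation set push the relevant ones out of the top k. Real data will be less 
extreme, but the direction of the bias is the same.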

cc [~srowen] thoughts?

> Investigate adding a RankingEvaluator to ML
> -------------------------------------------
>
>                 Key: SPARK-14409
>                 URL: https://issues.apache.org/jira/browse/SPARK-14409
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Nick Pentreath
>            Priority: Minor
>
> {{mllib.evaluation}} contains a {{RankingMetrics}} class, while there is no 
> {{RankingEvaluator}} in {{ml.evaluation}}. Such an evaluator can be useful 
> for recommendation evaluation (and can be useful in other settings 
> potentially).
> Should be thought about in conjunction with adding the "recommendAll" methods 
> in SPARK-13857, so that top-k ranking metrics can be used in cross-validators.


