[ https://issues.apache.org/jira/browse/HIVEMALL-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143306#comment-16143306 ]

Takuya Kitazawa commented on HIVEMALL-124:
------------------------------------------

[~uhyonc] Hi, how is your progress on this issue?

> NDCG - BinaryResponseMeasure "fix"
> ----------------------------------
>
>                 Key: HIVEMALL-124
>                 URL: https://issues.apache.org/jira/browse/HIVEMALL-124
>             Project: Hivemall
>          Issue Type: Improvement
>            Reporter: Uhyon Chung
>            Assignee: Takuya Kitazawa
>
> There's a small issue which makes it a bit hard to use the NDCG@x
> from BinaryResponseMeasure.java
> {code:java}
>     public static double nDCG(@Nonnull final List<?> rankedList,
>             @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
>         double dcg = 0.d;
>         double idcg = IDCG(Math.min(recommendSize, groundTruth.size()));
> ...
>     public static double IDCG(final int n) {
>         double idcg = 0.d;
>         for (int i = 0; i < n; i++) {
>             idcg += Math.log(2) / Math.log(i + 2);
>         }
>         return idcg;
>     }
> {code}
> Note that the IDCG for the binary NDCG calculation is derived from the size 
> of groundTruth (capped at recommendSize). The problem is that when we use 
> "recommendSize" (e.g. NDCG@10), we may pass the entire ground truth and not 
> just the items that appear in the first 10 results. This is a bit unexpected. 
> Of course, we could limit the ground truth ourselves using an array 
> intersection or similar, but users shouldn't have to do that. The function 
> could simply count the number of matched ground truths, which would make it 
> easier to use.
> e.g.
> {code:java}
>     public static double nDCG(@Nonnull final List<?> rankedList,
>             @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
>         double dcg = 0.d;
>         int matchedGroundTruths = 0;
>         for (int i = 0, n = recommendSize; i < n; i++) {
>             Object item_id = rankedList.get(i);
>             if (!groundTruth.contains(item_id)) {
>                 continue;
>             }
>             int rank = i + 1;
>             dcg += Math.log(2) / Math.log(rank + 1);
>             matchedGroundTruths++;
>         }
>         double idcg = IDCG(matchedGroundTruths);
>         return dcg / idcg;
>     }
> {code}
> Thanks
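
For comparison, here is a minimal, self-contained sketch of the two normalizations discussed in the description above. It is not the Hivemall implementation: the class name NdcgSketch and both method names are hypothetical, the DCG loop in ndcgGroundTruthSize is assumed to match the proposed one because the quoted original elides it, and the two guards (capping the loop at rankedList.size() and returning 0 when nothing matches) are assumptions of this sketch that do not appear in the issue.

```java
import java.util.Arrays;
import java.util.List;

final class NdcgSketch {

    /** Ideal DCG for n relevant items at ranks 1..n (same formula as the quoted IDCG). */
    static double idcg(final int n) {
        double idcg = 0.d;
        for (int i = 0; i < n; i++) {
            idcg += Math.log(2) / Math.log(i + 2);
        }
        return idcg;
    }

    /** Binary DCG over the first k entries of rankedList (assumed loop, see lead-in). */
    static double dcg(final List<?> rankedList, final List<?> groundTruth, final int k) {
        double dcg = 0.d;
        for (int i = 0; i < k; i++) {
            if (groundTruth.contains(rankedList.get(i))) {
                dcg += Math.log(2) / Math.log(i + 2); // rank = i + 1
            }
        }
        return dcg;
    }

    /** Normalization as quoted: IDCG(min(recommendSize, groundTruth.size())). */
    static double ndcgGroundTruthSize(final List<?> rankedList, final List<?> groundTruth,
            final int recommendSize) {
        // Assumed guard: never read past the end of rankedList.
        final int k = Math.min(recommendSize, rankedList.size());
        final double ideal = idcg(Math.min(recommendSize, groundTruth.size()));
        // Assumed guard: an empty ground truth would otherwise give 0/0 = NaN.
        return ideal == 0.d ? 0.d : dcg(rankedList, groundTruth, k) / ideal;
    }

    /** Normalization as proposed: IDCG(number of ground truths matched in the top k). */
    static double ndcgMatched(final List<?> rankedList, final List<?> groundTruth,
            final int recommendSize) {
        // Assumed guard: never read past the end of rankedList.
        final int k = Math.min(recommendSize, rankedList.size());
        int matched = 0;
        for (int i = 0; i < k; i++) {
            if (groundTruth.contains(rankedList.get(i))) {
                matched++;
            }
        }
        // Assumed guard: no matches would otherwise give 0/0 = NaN.
        return matched == 0 ? 0.d : dcg(rankedList, groundTruth, k) / idcg(matched);
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("a", "b", "c", "d");
        List<String> truth = Arrays.asList("a", "c", "x", "y", "z");
        System.out.println("ground-truth-size normalization: "
                + ndcgGroundTruthSize(ranked, truth, 4));
        System.out.println("matched-count normalization:     "
                + ndcgMatched(ranked, truth, 4));
    }
}
```

With five ground-truth items of which only two appear in the top 4, the matched-count variant normalizes by IDCG(2) rather than IDCG(4), so it yields the higher score. Note that under this variant a list whose only hit sits at rank 1 scores 1.0 regardless of how many ground-truth items were missed, which is worth keeping in mind when deciding which behavior is wanted.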



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
