Uhyon Chung created HIVEMALL-124:
------------------------------------

             Summary: NDCG - BinaryResponseMeasure "fix"
                 Key: HIVEMALL-124
                 URL: https://issues.apache.org/jira/browse/HIVEMALL-124
             Project: Hivemall
          Issue Type: Improvement
            Reporter: Uhyon Chung


There is a small issue that makes NDCG@x a bit hard to use.

From BinaryResponseMeasure.java:

{code:java}

    public static double nDCG(@Nonnull final List<?> rankedList,
            @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
        double dcg = 0.d;
        double idcg = IDCG(Math.min(recommendSize, groundTruth.size()));
...
    public static double IDCG(final int n) {
        double idcg = 0.d;
        for (int i = 0; i < n; i++) {
            idcg += Math.log(2) / Math.log(i + 2);
        }
        return idcg;
    }
{code}

Notice that the IDCG for the binary NDCG calculation is derived from the 
size of groundTruth (capped at recommendSize). The problem is that when we use 
"recommendSize" (e.g. NDCG@10), we may pass all of the ground truth and not just 
the items that appear in the first 10 results. This is a bit unexpected. Of 
course, callers could restrict the ground truth themselves, e.g. by intersecting 
it with the ranked list, but users shouldn't really have to do that. The 
function could simply count the number of matched ground truths itself, which 
would make it easier to use.

e.g.
{code:java}
    public static double nDCG(@Nonnull final List<?> rankedList,
            @Nonnull final List<?> groundTruth, @Nonnull final int recommendSize) {
        double dcg = 0.d;
        int matchedGroundTruths = 0;
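        // Bound by the list size so a short rankedList cannot throw IndexOutOfBoundsException.
        for (int i = 0, n = Math.min(recommendSize, rankedList.size()); i < n; i++) {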
            Object item_id = rankedList.get(i);
            if (!groundTruth.contains(item_id)) {
                continue;
            }
            int rank = i + 1;
            dcg += Math.log(2) / Math.log(rank + 1);
            matchedGroundTruths++;
        }
        if (matchedGroundTruths == 0) {
            return 0.d; // no hits: avoid 0/0, which would yield NaN
        }
        double idcg = IDCG(matchedGroundTruths);
        return dcg / idcg;
    }
{code}
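
To make the difference concrete, here is a minimal, self-contained sketch that runs both normalizations on the same input. The class name NdcgComparison, the method names ndcgCurrent and ndcgProposed, and the sample lists are made up for illustration and are not part of Hivemall. The guards for a short rankedList and for zero hits are also assumptions added for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical comparison harness; names are illustrative, not Hivemall API.
final class NdcgComparison {

    // Same formula as IDCG(n) quoted above: sum of log(2)/log(i+2) for i in [0, n).
    static double idcg(final int n) {
        double idcg = 0.d;
        for (int i = 0; i < n; i++) {
            idcg += Math.log(2) / Math.log(i + 2);
        }
        return idcg;
    }

    // Current behaviour: normalize by IDCG(min(k, |groundTruth|)).
    static double ndcgCurrent(final List<?> ranked, final List<?> truth, final int k) {
        double dcg = 0.d;
        // Assumption: bound by ranked.size() to avoid IndexOutOfBoundsException.
        for (int i = 0, n = Math.min(k, ranked.size()); i < n; i++) {
            if (truth.contains(ranked.get(i))) {
                dcg += Math.log(2) / Math.log(i + 2);
            }
        }
        final double ideal = idcg(Math.min(k, truth.size()));
        return ideal == 0.d ? 0.d : dcg / ideal;
    }

    // Proposed behaviour: normalize by IDCG(number of hits in the top k).
    static double ndcgProposed(final List<?> ranked, final List<?> truth, final int k) {
        double dcg = 0.d;
        int matched = 0;
        for (int i = 0, n = Math.min(k, ranked.size()); i < n; i++) {
            if (truth.contains(ranked.get(i))) {
                dcg += Math.log(2) / Math.log(i + 2);
                matched++;
            }
        }
        // Assumption: return 0 when nothing matched instead of dividing by zero.
        return matched == 0 ? 0.d : dcg / idcg(matched);
    }

    public static void main(String[] args) {
        // Made-up data: one hit ("a") at rank 1, full ground truth of five items.
        final List<String> ranked = Arrays.asList("a", "b", "c", "d");
        final List<String> truth = Arrays.asList("a", "v", "w", "x", "y");

        System.out.println("current  NDCG@2 = " + ndcgCurrent(ranked, truth, 2));
        System.out.println("proposed NDCG@2 = " + ndcgProposed(ranked, truth, 2));
    }
}
```

With this input the current variant divides a DCG of 1 by IDCG(2), while the proposed variant divides it by IDCG(1), so the two scores differ even though the ranked list is identical. Note that the proposed variant scores only the ordering of the hits that were retrieved, so it no longer penalizes ground-truth items missing from the top k.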

Thanks




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
