[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-09-11 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r138146891
 
 

 ##
 File path: python/mxnet/metric.py
 ##
 @@ -569,6 +569,161 @@ def update(self, labels, preds):
 self.num_inst += 1
 
 
+def true_positives(label):
+    """Given a vector of labels, returns the set of indices
+    corresponding to positive examples.
+
+    Parameters
+    ----------
+    label : vector of binary ground truth
+
+    Returns
+    -------
+    set of indices corresponding to positive examples
+    """
+    return set(numpy.ravel(numpy.argwhere(label == 1)))
+
+
+@register
+@alias('top_k_precision')
+class TopKPrecision(EvalMetric):
+    """Computes top k precision metric.
+
+    Top k precision differs from regular precision in that the score is
+    only computed for the top k predictions; "correct" or "wrong" entries
+    outside the top k are ignored.
+
+    Parameters
+    ----------
+    top_k : int
+        Number of top predictions to consider when computing the metric.
+    name : str
+        Name of this metric instance for display.
+    output_names : list of str, or None
+        Name of predictions that should be used when updating with update_dict.
+        By default include all predictions.
+    label_names : list of str, or None
+        Name of labels that should be used when updating with update_dict.
+        By default include all labels.
+
+    Examples
+    --------
+    >>> ytrue = [[1., 0., 1., 0.], [0., 1., 1., 0.]]
+    >>> ytrue = mx.nd.array(ytrue)
+    >>> yhat = [[0.4, 0.8, 0.1, 0.1], [0.4, 0.8, 0.8, 0.4]]
+    >>> yhat = mx.nd.array(yhat)
+    >>> pre = mx.metric.create('top_k_precision', top_k=2)
+    >>> pre.update(preds=[yhat], labels=[ytrue])
+    >>> print pre.get()[1]
+    0.75
+    """
+
+    def __init__(self, top_k=1, name='top_k_precision',
+                 output_names=None, label_names=None):
+        super(TopKPrecision, self).__init__(
+            name, top_k=top_k,
+            output_names=output_names, label_names=label_names)
+        self.top_k = top_k
+
+    def update(self, labels, preds):
+        """Updates the internal evaluation result with the precision at K.
+
+        Parameters
+        ----------
+        labels : list of `NDArray`
+            The labels of the data. (binary)
+        preds : list of `NDArray`
+            Predicted values. (float)
+        """
+        check_label_shapes(labels, preds)
+
+        for label, pred_label in zip(labels, preds):
+            assert len(pred_label.shape) <= 2, 'Predictions should be no more than 2 dims'
+            # rank prediction indices for each sample by descending score
+            pred_label = numpy.argsort(-pred_label.asnumpy().astype('float32'), axis=1)
+            label = label.asnumpy().astype('int32')
+            check_label_shapes(label, pred_label)
+            num_samples = pred_label.shape[0]
+            local_precision = 0.0
+            for s in range(num_samples):
+                truepos = true_positives(label[s, :])
+                predpos = set(numpy.ravel(pred_label[s, :self.top_k]))
+                # precision@k for this sample: fraction of the top k that are true positives
+                local_precision += len(truepos.intersection(predpos)) / float(self.top_k)
+            self.sum_metric += local_precision
+            self.num_inst += num_samples
+
+
+@register
+@alias('top_k_recall')
+class TopKRecall(EvalMetric):
+    """Computes top k recall metric.
+
+    Top k recall differs from regular recall in that the score is
+    only computed for the top k predictions; "correct" or "wrong" entries
+    outside the top k are ignored.
+
+    Parameters
+    ----------
+    top_k : int
+        Number of top predictions to consider when computing the metric.
+    name : str
+        Name of this metric instance for display.
+    output_names : list of str, or None
+        Name of predictions that should be used when updating with update_dict.
+        By default include all predictions.
+    label_names : list of str, or None
+        Name of labels that should be used when updating with update_dict.
+        By default include all labels.
+
+    Examples
+    --------
+    >>> ytrue = [[1., 0., 1., 0.], [0., 1., 1., 0.]]
+    >>> ytrue = mx.nd.array(ytrue)
+    >>> yhat = [[0.4, 0.8, 0.1, 0.1], [0.4, 0.8, 0.8, 0.4]]
+    >>> yhat = mx.nd.array(yhat)
+    >>> rec = mx.metric.create('top_k_recall', top_k=2)
+    >>> rec.update(preds=[yhat], labels=[ytrue])
+    >>> print rec.get()[1]
+    0.75
+    """
+
+    def __init__(self, top_k=1, name='top_k_recall',
+                 output_names=None, label_names=None):
+        super(TopKRecall, self).__init__(
+            name, top_k=top_k,
+            output_names=output_names, label_names=label_names)
+        self.top_k = top_k
+
+    def update(self, labels, preds):
+        """Updates the internal evaluation result.
+
+        Parameters
+        ----------
+        labels : list of `NDArray`
+            The labels of the data. (binary)
+        preds : list of `NDArray`
+            Predicted values. (float)
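
The 0.75 in the precision example above can be reproduced with plain NumPy. The following is a minimal sketch (not the PR's code) that follows the same definition used in `update`, i.e. |top-k predictions ∩ positives| / k, averaged over samples:

import numpy as np

ytrue = np.array([[1., 0., 1., 0.], [0., 1., 1., 0.]])
yhat = np.array([[0.4, 0.8, 0.1, 0.1], [0.4, 0.8, 0.8, 0.4]])
k = 2

# rank prediction indices by descending score, as TopKPrecision.update does
ranked = np.argsort(-yhat, axis=1)

precisions = []
for s in range(yhat.shape[0]):
    positives = set(np.ravel(np.argwhere(ytrue[s] == 1)))
    top_k = set(ranked[s, :k])
    precisions.append(len(positives & top_k) / float(k))

print(sum(precisions) / len(precisions))  # 0.75 (per-sample precisions are 0.5 and 1.0)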

[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-09-08 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r137904015
 
 

 ##
 File path: python/mxnet/metric.py
 ##
 @@ -569,6 +569,161 @@ def update(self, labels, preds):
 self.num_inst += 1
 
 
+def truePositives(label):
 
 Review comment:
   Please rename to `true_positives`.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-09-05 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r137061206
 
 

 ##
 File path: python/mxnet/metric.py
 ##
 @@ -568,6 +568,147 @@ def update(self, labels, preds):
 self.sum_metric += f1_score
 self.num_inst += 1
 
+@register
+@alias('top_k_precision')
+class TopKPrecision(EvalMetric):
+    """Computes top k precision metric.
+
+    Top k precision differs from regular precision in that the score is
+    only computed for the top k predictions; "correct" or "wrong" entries
+    outside the top k are ignored.
+
+    Parameters
+    ----------
+    top_k : int
+        Number of top predictions to consider when computing the metric.
+    name : str
+        Name of this metric instance for display.
+    output_names : list of str, or None
+        Name of predictions that should be used when updating with update_dict.
+        By default include all predictions.
+    label_names : list of str, or None
+        Name of labels that should be used when updating with update_dict.
+        By default include all labels.
+
+    Examples
+    --------
+    >>> ytrue = [[1., 0., 1., 0.], [0., 1., 1., 0.]]
+    >>> ytrue = mx.nd.array(ytrue)
+    >>> yhat = [[0.4, 0.8, 0.1, 0.1], [0.4, 0.8, 0.8, 0.4]]
+    >>> yhat = mx.nd.array(yhat)
+    >>> pre = mx.metric.create('top_k_precision', top_k=2)
+    >>> pre.update(preds=[yhat], labels=[ytrue])
+    >>> print pre.get()[1]
+    0.75
+    """
+
+    def __init__(self, top_k=1, name='top_k_precision',
+                 output_names=None, label_names=None):
+        super(TopKPrecision, self).__init__(
+            name, top_k=top_k,
+            output_names=output_names, label_names=label_names)
+        self.top_k = top_k
+
+    def update(self, labels, preds):
+        """Updates the internal evaluation result with the precision at K.
+
+        Parameters
+        ----------
+        labels : list of `NDArray`
+            The labels of the data. (binary)
+        preds : list of `NDArray`
+            Predicted values. (float)
+        """
+        check_label_shapes(labels, preds)
+
+        for label, pred_label in zip(labels, preds):
+            assert len(pred_label.shape) <= 2, 'Predictions should be no more than 2 dims'
+            pred_label = numpy.argsort(-pred_label.asnumpy().astype('float32'), axis=1)
+            label = label.asnumpy().astype('int32')
+            check_label_shapes(label, pred_label)
+            num_samples = pred_label.shape[0]
+            local_precision = 0.0
+            for s in range(num_samples):
+                truepos = set(numpy.ravel(numpy.argwhere(label[s, :] == 1)))
 
 Review comment:
   The logic to compute `truepos` and `predpos` is duplicated in the two classes. Could you refactor it into a free function that both classes can call, or let the two classes inherit from an abstract class that provides such a method?
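
One possible shape for that refactor, sketched here for illustration only (the helper names are hypothetical, not from the PR):

def true_positives(label_row):
    """Return the set of indices where the binary label row equals 1."""
    return set(numpy.ravel(numpy.argwhere(label_row == 1)))

def top_k_positions(pred_row, top_k):
    """Return the set of indices of the top_k highest-scoring predictions."""
    return set(numpy.ravel(numpy.argsort(-pred_row)[:top_k]))

Both update() methods could then compute, per sample s:

    truepos = true_positives(label[s, :])
    predpos = top_k_positions(pred[s, :], self.top_k)

and would differ essentially only in the denominator (self.top_k for precision, the number of positives for recall).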
 



[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-08-30 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r136199686
 
 

 ##
 File path: python/mxnet/metric.py
 ##
 @@ -568,6 +568,147 @@ def update(self, labels, preds):
 self.sum_metric += f1_score
 self.num_inst += 1
 
+@register
+@alias('top_k_precision')
+class TopKPrecision(EvalMetric):
+    """Computes top k precision metric.
+
+    Top k precision differs from regular precision in that the score is
+    only computed for the top k predictions; "correct" or "wrong" entries
+    outside the top k are ignored.
+
+    Parameters
+    ----------
+    top_k : int
+        Number of top predictions to consider when computing the metric.
+    name : str
+        Name of this metric instance for display.
+    output_names : list of str, or None
+        Name of predictions that should be used when updating with update_dict.
+        By default include all predictions.
+    label_names : list of str, or None
+        Name of labels that should be used when updating with update_dict.
+        By default include all labels.
+
+    Examples
+    --------
+    >>> ytrue = [[1., 0., 1., 0.], [0., 1., 1., 0.]]
+    >>> ytrue = mx.nd.array(ytrue)
+    >>> yhat = [[0.4, 0.8, 0.1, 0.1], [0.4, 0.8, 0.8, 0.4]]
+    >>> yhat = mx.nd.array(yhat)
+    >>> pre = mx.metric.create('top_k_precision', top_k=2)
+    >>> pre.update(preds=[yhat], labels=[ytrue])
+    >>> print pre.get()[1]
+    0.75
+    """
+
+    def __init__(self, top_k=1, name='top_k_precision',
+                 output_names=None, label_names=None):
+        super(TopKPrecision, self).__init__(
+            name, top_k=top_k,
+            output_names=output_names, label_names=label_names)
+        self.top_k = top_k
+
+    def update(self, labels, preds):
+        """Updates the internal evaluation result with the precision at K.
+
+        Parameters
+        ----------
+        labels : list of `NDArray`
+            The labels of the data. (binary)
+        preds : list of `NDArray`
+            Predicted values. (float)
+        """
+        check_label_shapes(labels, preds)
+
+        for label, pred_label in zip(labels, preds):
+            assert len(pred_label.shape) <= 2, 'Predictions should be no more than 2 dims'
+            pred_label = numpy.argsort(-pred_label.asnumpy().astype('float32'), axis=1)
 
 Review comment:
   Does this work when `len(pred_label.shape) == 1`?
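
For reference, `numpy.argsort(x, axis=1)` raises an axis error on a 1-D array, so a small guard along these lines would be needed if unbatched prediction vectors should be supported (illustrative sketch, not from the PR):

pred = pred_label.asnumpy().astype('float32')
# treat a single 1-D prediction vector as a batch of one sample
if pred.ndim == 1:
    pred = pred.reshape(1, -1)
pred_label = numpy.argsort(-pred, axis=1)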
 



[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-08-30 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r136159509
 
 

 ##
 File path: python/mxnet/ranking_metrics.py
 ##
 @@ -0,0 +1,46 @@
+"""
 
 Review comment:
   Could you remove this file from your PR now that it has all been moved to metric.py?
 



[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-08-11 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r132745728
 
 

 ##
 File path: python/mxnet/ranking_metrics.py
 ##
 @@ -0,0 +1,46 @@
+"""
+A function that implements ranking metrics
+(precision, recall, coverage and converted coverage)
+for a given position K
+"""
+
+def metrics_at_k(Y, Yhat, K):
+    """
+    Parameters
+    ----------
+    Y    : dictionary with key = sample index and value = list of positive
+           indices of features
+    Yhat : dict with key = sample index and value = ORDERED list of indices
+           of features, according to some score
+    K    : position at which to compute score
+
+    Returns
+    -------
+    pre          : precision at K
+    rec          : recall at K
+    convcoverage : converted coverage at K
+    coverage     : coverage at K
+
+    Examples
+    --------
+    >>> Ytrue = {1: [1, 2, 3, 4]}
+    >>> Yhat = {1: [1, 2, 3, 4, 5, 6, 7, 8, 9]}
+    >>> k = 2
+    >>> print(metrics_at_k(Ytrue, Yhat, k))
 
 Review comment:
   Could you add the result of the print here? 
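
For the precision and recall components, the expected values for this example can be worked out directly (a sketch assuming the same precision/recall-at-K definitions as the TopK metric classes above; coverage and converted coverage depend on definitions elsewhere in the PR and are not reproduced here):

Ytrue = {1: [1, 2, 3, 4]}
Yhat = {1: [1, 2, 3, 4, 5, 6, 7, 8, 9]}
k = 2

top_k = set(Yhat[1][:k])                    # {1, 2}
positives = set(Ytrue[1])                   # {1, 2, 3, 4}
hits = len(top_k & positives)               # 2

precision_at_k = hits / float(k)            # 1.0
recall_at_k = hits / float(len(positives))  # 0.5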
 



[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-08-11 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r132746129
 
 

 ##
 File path: python/mxnet/ranking_metrics.py
 ##
 @@ -0,0 +1,46 @@
+"""
+A function that implements ranking metrics
+(precision, recall, coverage and converted coverage)
+for a given position K
+"""
+
+def metrics_at_k(Y, Yhat, K):
+    """
+    Parameters
+    ----------
+    Y    : dictionary with key = sample index and value = list of positive
+           indices of features
+    Yhat : dict with key = sample index and value = ORDERED list of indices
+           of features, according to some score
 
 Review comment:
   Does the model always produce ordered results? Is that why you're adding the `ordered` restriction here?
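
For context, the ordered list would typically be produced by the caller from raw model scores, e.g. with a descending argsort (illustrative, not part of the PR):

import numpy as np

scores = np.array([0.4, 0.8, 0.8, 0.4])  # model scores for one sample
ordered = list(np.argsort(-scores))      # [1, 2, 0, 3]: highest-scoring index first
Yhat = {1: ordered}                      # the ORDERED list expected by metrics_at_k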
 



[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-08-11 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r132746805
 
 

 ##
 File path: python/mxnet/ranking_metrics.py
 ##
 @@ -0,0 +1,46 @@
+"""
 
 Review comment:
   I don't think it's a good idea to add this as a standalone file under `python/mxnet`. Would it be more reasonable to integrate this metric into the `python/mxnet/metric.py` file?
 



[GitHub] eric-haibin-lin commented on a change in pull request #7390: adding ranking metrics (precision/recall) at position K.

2017-08-08 Thread git
eric-haibin-lin commented on a change in pull request #7390: adding ranking 
metrics (precision/recall) at position K. 
URL: https://github.com/apache/incubator-mxnet/pull/7390#discussion_r132081575
 
 

 ##
 File path: python/mxnet/ranking_metrics.py
 ##
 @@ -0,0 +1,45 @@
+"""
+A function that implements ranking metrics 
+(precision, recall, coverage and converted coverage) 
+for a given position K
+"""
+
+def metrics_at_k(Y,Yhat, K):
+   """
+   Parameters
+   ----------
+   Y   : dictionary with key = sample index and value = list of positive
+         indices of features
+   Yhat: dict with key = sample index and value = ORDERED list of indices
+         of features, according to some score
+   K   : position at which to compute score
+
+   Returns
+   -------
+   pre : precision at K
+   rec : recall at K
+   convcoverage: converted coverage at K
+   coverage: coverage at K
+
+   Examples
+   --------
 
 Review comment:
   Thanks for contributing back! Could you fix pylint warnings? 
 
