yu jiang created SPARK-4736:
-------------------------------
Summary: functions returning the category with weights
Key: SPARK-4736
URL: https://issues.apache.org/jira/browse/SPARK-4736
Project: Spark
Issue Type: Improvement
Components: MLlib
Reporter: yu jiang
This change adds two functions: 1) predictByVotingWithWeight(features: Vector)
and 2) predictWithWeight(features: Vector). It also modifies the existing
function predictByVoting(features: Vector). There are at least two reasons for
this improvement: 1) In practice, we want to find the top N samples in one
category. However, in version 1.3.0 the predict function returns only the
predicted category, without a weight. 2) In our data, the numbers of positive
and negative samples are highly imbalanced: there are far fewer positive
samples than negative ones. As a result, voting predicts very few samples as
positive. If the weights are also returned, users can choose a suitable
threshold to adjust the results and improve performance.
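
To illustrate the idea, here is a minimal, self-contained sketch in plain
Scala. It is not the MLlib implementation and does not use Spark. It assumes a
binary classifier built from an ensemble of trees, models each tree as a plain
function from a feature array to a label (0.0 or 1.0), and takes the "weight"
to be the fraction of trees that voted for a category. The names Tree,
positiveWeight, predictWithThreshold and topN are hypothetical, and the
signature of predictByVotingWithWeight shown here is only a guess at the
proposed one.

```scala
// Hypothetical sketch of weighted voting; not the actual MLlib code.
object WeightedVoting {
  // Assumption: a "tree" is any function from features to a label (0.0 or 1.0).
  type Tree = Array[Double] => Double

  // Returns the majority category and the fraction of trees that voted for it.
  def predictByVotingWithWeight(trees: Seq[Tree], features: Array[Double]): (Double, Double) = {
    require(trees.nonEmpty, "ensemble must contain at least one tree")
    val votes: Map[Double, Int] =
      trees.map(t => t(features)).groupBy(identity).map { case (label, vs) => (label, vs.size) }
    val (label, count) = votes.maxBy(_._2)
    (label, count.toDouble / trees.size)
  }

  // Fraction of trees that voted for the positive category (1.0).
  def positiveWeight(trees: Seq[Tree], features: Array[Double]): Double = {
    require(trees.nonEmpty, "ensemble must contain at least one tree")
    trees.count(t => t(features) == 1.0).toDouble / trees.size
  }

  // Predict positive whenever the positive weight reaches a user-chosen threshold.
  def predictWithThreshold(trees: Seq[Tree], features: Array[Double], threshold: Double): Double =
    if (positiveWeight(trees, features) >= threshold) 1.0 else 0.0

  // Rank samples by positive weight and keep the n highest.
  def topN(trees: Seq[Tree], samples: Seq[Array[Double]], n: Int): Seq[Array[Double]] =
    samples.sortBy(s => -positiveWeight(trees, s)).take(n)

  // Toy ensemble of five threshold rules, for illustration only.
  val exampleTrees: Seq[Tree] = Seq(
    f => if (f(0) > 0.5) 1.0 else 0.0,
    f => if (f(1) > 0.5) 1.0 else 0.0,
    f => if (f(0) > 0.9) 1.0 else 0.0,
    f => if (f(1) > 0.9) 1.0 else 0.0,
    f => if (f(0) + f(1) > 1.8) 1.0 else 0.0
  )

  // Two of the five toy trees vote positive for this sample.
  val exampleSample: Array[Double] = Array(0.6, 0.7)
}
```

With the toy ensemble, plain majority voting labels exampleSample as negative
because only two of the five trees vote positive. Calling predictWithThreshold
with a threshold of 0.3 labels the same sample as positive, which is the kind
of adjustment a user could make on imbalanced data once weights are available.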