[ https://issues.apache.org/jira/browse/SPARK-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233906#comment-14233906 ]
Apache Spark commented on SPARK-4736:
-------------------------------------
User 'dikejiang' has created a pull request for this issue:
https://github.com/apache/spark/pull/3583
> functions returning the category with weights
> ---------------------------------------------
>
> Key: SPARK-4736
> URL: https://issues.apache.org/jira/browse/SPARK-4736
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: yu jiang
>
> This change adds two functions, 1) predictByVotingWithWeight(features:
> Vector) and 2) predictWithWeight(features: Vector), and modifies the
> existing function predictByVoting(features: Vector). There are at least two
> reasons for the improvement: 1) In practice, we want to find the top N
> samples from one category. However, in version 1.3.0 the predict function
> returns only the predicted category, without a weight. 2) In our data, the
> numbers of positive and negative samples are highly imbalanced: there are
> far fewer positive samples than negative ones. As a result, voting predicts
> very few samples as positive. If the weights are also returned, users can
> choose a suitable threshold to adjust the results and improve performance.
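
The thresholding and top-N ideas in the description above can be sketched in plain Python. This is only an illustration, not the code in the pull request and not the MLlib API: the helper names (predict_by_voting_with_weight, positive_share, top_n_positive), the use of the vote share as the "weight", and the sample data are all assumptions made for the example.

```python
from collections import defaultdict


def predict_by_voting_with_weight(tree_predictions, tree_weights=None):
    """Return (winning_category, share_of_total_vote_weight).

    Hypothetical helper: treats the "weight" as the fraction of the total
    vote weight that went to the winning category.
    """
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    if tree_weights is None:
        tree_weights = [1.0] * len(tree_predictions)
    votes = defaultdict(float)
    for category, weight in zip(tree_predictions, tree_weights):
        votes[category] += weight
    total = sum(votes.values())
    winner = max(votes, key=votes.get)
    return winner, votes[winner] / total


def positive_share(tree_predictions, positive=1):
    """Fraction of trees that voted for the positive category."""
    hits = sum(1 for p in tree_predictions if p == positive)
    return hits / len(tree_predictions)


def top_n_positive(samples, n, positive=1):
    """Rank sample ids by positive vote share and keep the best n."""
    ranked = sorted(
        samples,
        key=lambda sid: positive_share(samples[sid], positive),
        reverse=True,
    )
    return ranked[:n]


# Made-up per-tree votes for four samples (1 = positive, 0 = negative).
samples = {
    "a": [1, 0, 0, 0, 0],
    "b": [1, 1, 0, 0, 0],
    "c": [0, 0, 0, 0, 0],
    "d": [1, 1, 1, 0, 0],
}

# Plain majority voting labels only "d" as positive.
majority = {sid: predict_by_voting_with_weight(v)[0] for sid, v in samples.items()}

# With the vote share available, a lower threshold recovers "b" as well.
threshold = 0.4
thresholded = {
    sid: 1 if positive_share(v) >= threshold else 0 for sid, v in samples.items()
}
```

With these made-up votes, majority voting marks one sample as positive, while a threshold of 0.4 on the positive vote share marks two. top_n_positive(samples, 2) ranks samples by that share, which is the "top N samples from one category" use case mentioned in the description.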
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)