[ https://issues.apache.org/jira/browse/SPARK-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826632#comment-15826632 ]
Bryan Cutler commented on SPARK-19216:
--------------------------------------
This is a valid issue, but it largely duplicates SPARK-10931, which tracks
adding all parameters to PySpark models in general. There is also SPARK-18739,
which changes the PySpark model class hierarchy to add some missing methods,
but it does not appear to take all params into account.
> LogisticRegressionModel is missing getThreshold()
> -------------------------------------------------
>
> Key: SPARK-19216
> URL: https://issues.apache.org/jira/browse/SPARK-19216
> Project: Spark
> Issue Type: Improvement
> Components: ML, PySpark
> Affects Versions: 2.1.0
> Reporter: Nicholas Chammas
> Priority: Minor
>
> Say I just loaded a logistic regression model from storage. How do I check
> that model's threshold in PySpark? From what I can see, the only way to do
> that is to dip into the Java object:
> {code}
> model._java_obj.getThreshold()
> {code}
> It seems like PySpark's version of {{LogisticRegressionModel}} should include
> this method.
> Another issue is that it's not clear whether the threshold is for the raw
> prediction or the probability. Maybe it's obvious to machine learning
> practitioners, but I couldn't tell from reading the docs or skimming the code
> what the threshold was for exactly.
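
A minimal sketch of the two points raised in the description, in plain Python
so it runs without a Spark session. The helper names get_threshold and
predict_label are hypothetical and not part of PySpark. The first helper
prefers a Python-side getThreshold() when the model has one and otherwise
falls back to the _java_obj workaround quoted above. The second assumes, as I
understand Spark's documentation of the threshold param, that in binary
classification the threshold is compared against the estimated probability of
label 1, not against the raw prediction (margin).

```python
import math


def get_threshold(model):
    """Return a model's threshold (hypothetical helper, not a PySpark API).

    Uses a Python-side getThreshold() if the model exposes one; otherwise
    falls back to the wrapped Java object, as in the issue description.
    """
    getter = getattr(model, "getThreshold", None)
    if callable(getter):
        return getter()
    return model._java_obj.getThreshold()


def sigmoid(raw_margin):
    """Map a raw margin to P(label = 1), written to avoid overflow."""
    if raw_margin >= 0:
        return 1.0 / (1.0 + math.exp(-raw_margin))
    e = math.exp(raw_margin)
    return e / (1.0 + e)


def predict_label(raw_margin, threshold=0.5):
    """Assumed binary rule: predict 1.0 if P(label = 1) > threshold."""
    return 1.0 if sigmoid(raw_margin) > threshold else 0.0
```

Under that assumption, a threshold of 0.5 on the probability corresponds to a
raw margin of 0, and raising the threshold makes the model predict label 1
less often.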