[GitHub] [spark] huaxingao commented on a change in pull request #26142: [SPARK-29381][FOLLOWUP][PYTHON][ML] Add 'private' _XXXParams classes for classification & regression

GitBox Wed, 16 Oct 2019 09:52:33 -0700

huaxingao commented on a change in pull request #26142: 
[SPARK-29381][FOLLOWUP][PYTHON][ML] Add 'private' _XXXParams classes for 
classification & regression
URL: https://github.com/apache/spark/pull/26142#discussion_r335592011


 ##########
 File path: python/pyspark/ml/classification.py
 ##########
 @@ -271,10 +282,167 @@ def intercept(self):
         return self._call_java("intercept")
 
 
+class _LogisticRegressionParams(_JavaProbabilisticClassifierParams, HasRegParam,
+                                HasElasticNetParam, HasMaxIter, HasFitIntercept, HasTol,
+                                HasStandardization, HasWeightCol, HasAggregationDepth,
+                                HasThreshold):
+    """
+    Params for :py:class:`LogisticRegression` and :py:class:`LogisticRegressionModel`.
+
+    .. versionadded:: 3.0.0
+    """
+
+    threshold = Param(Params._dummy(), "threshold",
+                      "Threshold in binary classification prediction, in range [0, 1]." +
+                      " If threshold and thresholds are both set, they must match." +
+                      " e.g. if threshold is p, then thresholds must be equal to [1-p, p].",
+                      typeConverter=TypeConverters.toFloat)
+
+    family = Param(Params._dummy(), "family",
+                   "The name of family which is a description of the label distribution to " +
+                   "be used in the model. Supported options: auto, binomial, multinomial",
+                   typeConverter=TypeConverters.toString)
+
+    lowerBoundsOnCoefficients = Param(Params._dummy(), "lowerBoundsOnCoefficients",
+                                      "The lower bounds on coefficients if fitting under bound "
+                                      "constrained optimization. The bound matrix must be "
+                                      "compatible with the shape "
+                                      "(1, number of features) for binomial regression, or "
+                                      "(number of classes, number of features) "
+                                      "for multinomial regression.",
+                                      typeConverter=TypeConverters.toMatrix)
+
+    upperBoundsOnCoefficients = Param(Params._dummy(), "upperBoundsOnCoefficients",
+                                      "The upper bounds on coefficients if fitting under bound "
+                                      "constrained optimization. The bound matrix must be "
+                                      "compatible with the shape "
+                                      "(1, number of features) for binomial regression, or "
+                                      "(number of classes, number of features) "
+                                      "for multinomial regression.",
+                                      typeConverter=TypeConverters.toMatrix)
+
+    lowerBoundsOnIntercepts = Param(Params._dummy(), "lowerBoundsOnIntercepts",
+                                    "The lower bounds on intercepts if fitting under bound "
+                                    "constrained optimization. The bounds vector size must be "
+                                    "equal with 1 for binomial regression, or the number of "
+                                    "classes for multinomial regression.",
+                                    typeConverter=TypeConverters.toVector)
+
+    upperBoundsOnIntercepts = Param(Params._dummy(), "upperBoundsOnIntercepts",
+                                    "The upper bounds on intercepts if fitting under bound "
+                                    "constrained optimization. The bound vector size must be "
+                                    "equal with 1 for binomial regression, or the number of "
+                                    "classes for multinomial regression.",
+                                    typeConverter=TypeConverters.toVector)
+
+    @since("1.4.0")
+    def setThreshold(self, value):
+        """
+        Sets the value of :py:attr:`threshold`.
+        Clears value of :py:attr:`thresholds` if it has been set.
+        """
+        self._set(threshold=value)
+        self._clear(self.thresholds)
+        return self
+
+    @since("1.4.0")
+    def getThreshold(self):
+        """
+        Get threshold for binary classification.
+
+        If :py:attr:`thresholds` is set with length 2 (i.e., binary classification),
+        this returns the equivalent threshold:
+        :math:`\\frac{1}{1 + \\frac{thresholds(0)}{thresholds(1)}}`.
+        Otherwise, returns :py:attr:`threshold` if set or its default value if unset.
+        """
+        self._checkThresholdConsistency()
+        if self.isSet(self.thresholds):
+            ts = self.getOrDefault(self.thresholds)
+            if len(ts) != 2:
+                raise ValueError("Logistic Regression getThreshold only applies to" +
+                                 " binary classification, but thresholds has length != 2." +
+                                 "  thresholds: " + ",".join(map(str, ts)))
+            return 1.0/(1.0 + ts[0]/ts[1])
+        else:
+            return self.getOrDefault(self.threshold)
+
+    @since("1.5.0")
+    def setThresholds(self, value):
+        """
+        Sets the value of :py:attr:`thresholds`.
+        Clears value of :py:attr:`threshold` if it has been set.
+        """
+        self._set(thresholds=value)
+        self._clear(self.threshold)
+        return self
 
 Review comment:
   It's a little strange to have ```setThreshold```/```setThresholds``` in the
_XXXParams class, but Scala's ```LogisticRegressionParams``` does it this way, so
I did the same to stay consistent with the Scala side.
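
For reference, here is a minimal plain-Python sketch of how the two params are coupled in the diff above: each setter clears the other param, and `getThreshold` derives the scalar from a two-element `thresholds` list. The class name `ThresholdParamsSketch`, its attributes, and the 0.5 fallback are hypothetical stand-ins for illustration only; this is not the pyspark implementation and it does not use the `Param` machinery.

```python
import math


class ThresholdParamsSketch:
    """Hypothetical stand-in for the shared params class (not pyspark code)."""

    DEFAULT_THRESHOLD = 0.5  # assumed default, for illustration only

    def __init__(self):
        self._threshold = None   # scalar threshold, None when unset
        self._thresholds = None  # per-class thresholds, None when unset

    def setThreshold(self, value):
        # Setting the scalar clears the per-class list, mirroring the diff.
        self._threshold = float(value)
        self._thresholds = None
        return self

    def setThresholds(self, value):
        # Setting the per-class list clears the scalar, mirroring the diff.
        self._thresholds = [float(v) for v in value]
        self._threshold = None
        return self

    def getThreshold(self):
        if self._thresholds is not None:
            ts = self._thresholds
            if len(ts) != 2:
                raise ValueError(
                    "getThreshold only applies to binary classification, "
                    "but thresholds has length != 2. thresholds: "
                    + ",".join(map(str, ts)))
            # Equivalent scalar threshold: 1 / (1 + thresholds[0] / thresholds[1])
            return 1.0 / (1.0 + ts[0] / ts[1])
        if self._threshold is not None:
            return self._threshold
        return self.DEFAULT_THRESHOLD


# thresholds = [1 - p, p] maps back to the scalar threshold p.
p = 0.3
params = ThresholdParamsSketch().setThresholds([1 - p, p])
recovered = params.getThreshold()
```

Because both setters and the derived getter touch the same two fields, keeping them together in one shared class means the estimator and the model cannot drift apart on this logic, which is one possible reading of why the Scala side places them in `LogisticRegressionParams`.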

