[GitHub] spark pull request #19994: [SPARK-22810][ML][PySpark] Expose Python API for ...

2017-12-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19994


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19994: [SPARK-22810][ML][PySpark] Expose Python API for ...

2017-12-17 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19994#discussion_r157391859
  
--- Diff: python/pyspark/ml/regression.py ---
@@ -155,6 +183,14 @@ def intercept(self):
         """
         return self._call_java("intercept")
 
+    @property
+    @since("2.3.0")
+    def scale(self):
+        """
+        The value by which \|y - X'w\| is scaled down when loss is "huber".
--- End diff --

Please add to the doc: "When the loss is squared error, the value is 1.0."
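
For readers unfamiliar with what `scale` refers to, here is a minimal pure-Python sketch of a scaled Huber loss. It assumes the common formulation in which the residual is divided by a scale `sigma` before the Huber function is applied. The function names and the default `epsilon=1.35` are illustrative assumptions, not code from this pull request.

```python
def huber(z, epsilon=1.35):
    """Huber function: quadratic for |z| <= epsilon, linear beyond it.

    The default epsilon here is an assumption chosen for illustration.
    """
    a = abs(z)
    if a <= epsilon:
        return z * z
    return 2.0 * epsilon * a - epsilon * epsilon


def scaled_huber_loss(residual, scale, epsilon=1.35):
    """Per-sample loss with the residual scaled down by `scale` (sigma).

    Assumed form: sigma + H(residual / sigma) * sigma.
    """
    return scale + huber(residual / scale, epsilon) * scale


if __name__ == "__main__":
    # A small residual stays in the quadratic regime; a large one is
    # penalized only linearly, which is what makes the loss robust.
    for r in (0.5, 5.0):
        print(r, scaled_huber_loss(r, scale=1.0), scaled_huber_loss(r, scale=2.0))
```

A larger `scale` shrinks the scaled residual, so more samples fall into the quadratic regime. A smaller `epsilon` moves the switch to the linear regime closer to zero.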





[GitHub] spark pull request #19994: [SPARK-22810][ML][PySpark] Expose Python API for ...

2017-12-17 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19994#discussion_r157391801
  
--- Diff: python/pyspark/ml/tests.py ---
@@ -1725,6 +1725,27 @@ def test_offset(self):
         self.assertTrue(np.isclose(model.intercept, -1.561613, atol=1E-4))
 
 
+class LinearRegressionTest(SparkSessionTestCase):
+
+    def test_linear_regression_with_huber_loss(self):
+
+        data_path = "data/mllib/sample_linear_regression_data.txt"
+        df = self.spark.read.format("libsvm").load(data_path)
+
+        lir = LinearRegression(loss="huber")
--- End diff --

The test case should also cover `setEpsilon`.





[GitHub] spark pull request #19994: [SPARK-22810][ML][PySpark] Expose Python API for ...

2017-12-15 Thread yanboliang
GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/19994

[SPARK-22810][ML][PySpark] Expose Python API for LinearRegression with huber loss.

## What changes were proposed in this pull request?
Expose Python API for _LinearRegression_ with _huber_ loss.

## How was this patch tested?
Unit test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark spark-22810

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19994.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19994


commit 1ed46a2ea0fe28e173df4bc9bfec301beafc1acd
Author: Yanbo Liang 
Date:   2017-12-15T19:58:55Z

Expose Python API for LinearRegression with huber loss.



