[GitHub] spark pull request #23236: [SPARK-26275][PYTHON][ML] Increases timeout for S...

2018-12-05 Thread asfgit
GitHub user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23236


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23236: [SPARK-26275][PYTHON][ML] Increases timeout for S...

2018-12-05 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/23236

[SPARK-26275][PYTHON][ML] Increases timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test

## What changes were proposed in this pull request?

It looks like this test is flaky:


https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99569/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99548/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99454/console

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99609/console

```
======================================================================
FAIL: test_training_and_prediction (pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
Test that the model improves on toy data with no. of batches
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 367, in test_training_and_prediction
    self._eventually(condition)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 78, in _eventually
    % (timeout, lastValue))
AssertionError: Test failed due to timeout after 30 sec, with last condition returning: Latest errors: 0.67, 0.71, 0.78, 0.7, 0.75, 0.74, 0.73, 0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74

----------------------------------------------------------------------
Ran 13 tests in 185.051s

FAILED (failures=1, skipped=1)
```

This appears to have started after the parallelism in Jenkins was increased
to speed up the builds in https://github.com/apache/spark/pull/23111. I am able
to reproduce it manually when resource usage is heavy (with the timeout
manually decreased).
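
For context, the failure comes from a poll-until-timeout helper: the test keeps evaluating a condition and fails if it has not returned `True` by the deadline. The following is only a minimal sketch of that pattern, not the actual `_eventually` code in `test_streaming_algorithms.py`; the function name `eventually`, the `interval` parameter, and the use of `time.monotonic` are assumptions made for illustration. The error message mirrors the one in the traceback above.

```python
import time


def eventually(condition, timeout=30.0, interval=0.01):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration; not the Spark implementation.
    """
    deadline = time.monotonic() + timeout
    last_value = None
    while time.monotonic() < deadline:
        # Any return value other than True is kept for the failure message.
        last_value = condition()
        if last_value is True:
            return
        time.sleep(interval)
    raise AssertionError(
        "Test failed due to timeout after %g sec, with last condition returning: %s"
        % (timeout, last_value))
```

With a helper like this, a machine under heavy load may simply not process enough batches before the deadline, so the condition never turns `True` even though the code under test is correct. Raising `timeout` gives the stream more time without changing what is asserted.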

## How was this patch tested?

Manually tested by running:

```
cd python
./run-tests \
  --testnames 'pyspark.mllib.tests.test_streaming_algorithms StreamingLogisticRegressionWithSGDTests.test_training_and_prediction' \
  --python-executables=python
```


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-26275

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23236


commit 3c4ee75c4d0585702cd87cc4df9af74e235bb431
Author: Hyukjin Kwon 
Date:   2018-12-05T12:17:21Z

Increases timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test



