GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/23236
[SPARK-26275][PYTHON][ML] Increases timeout for
StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test
## What changes were proposed in this pull request?
This test looks flaky:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99704/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99569/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99644/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99548/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99454/console
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99609/console
```
==
FAIL: test_training_and_prediction (pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
Test that the model improves on toy data with no. of batches
--
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 367, in test_training_and_prediction
    self._eventually(condition)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 78, in _eventually
    % (timeout, lastValue))
AssertionError: Test failed due to timeout after 30 sec, with last condition returning: Latest errors: 0.67, 0.71, 0.78, 0.7, 0.75, 0.74, 0.73, 0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74
--
Ran 13 tests in 185.051s
FAILED (failures=1, skipped=1)
```
This appears to have started after the parallelism in Jenkins was increased to
speed up the build in https://github.com/apache/spark/pull/23111. I am able to
reproduce the failure manually under heavy resource usage (with the timeout
manually decreased).
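
For readers unfamiliar with this kind of failure, here is a minimal sketch of a polling helper of the sort the traceback points at. It is not the actual `_eventually` implementation in `test_streaming_algorithms.py`; the names `eventually`, `make_slow_condition`, `interval`, and `ready_after` are hypothetical and only illustrate why a fixed timeout can fail when the machine is busy.

```python
import time


def eventually(condition, timeout=30.0, interval=0.01):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical stand-in for the test suite's `_eventually` helper; the
    names and defaults here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    last_value = None
    while time.monotonic() < deadline:
        last_value = condition()
        if last_value is True:
            return
        time.sleep(interval)
    # The condition never became True within the allowed time.
    raise AssertionError(
        "Test failed due to timeout after %g sec, with last condition returning: %s"
        % (timeout, last_value)
    )


def make_slow_condition(ready_after):
    """Return a condition that only succeeds `ready_after` seconds from now.

    Simulates work that finishes later when the machine is under load.
    """
    ready_at = time.monotonic() + ready_after

    def condition():
        if time.monotonic() >= ready_at:
            return True
        return "not ready yet"

    return condition
```

With this sketch, `eventually(make_slow_condition(0.05), timeout=2.0)` returns normally, while the same condition with a timeout shorter than `ready_after` raises an `AssertionError`. That is roughly the situation when heavier parallelism slows the streaming batches down, and a larger timeout gives the condition more room to succeed.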
## How was this patch tested?
Manually tested by running:
```
cd python
./run-tests --testnames 'pyspark.mllib.tests.test_streaming_algorithms StreamingLogisticRegressionWithSGDTests.test_training_and_prediction' --python-executables=python
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-26275
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23236.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23236
commit 3c4ee75c4d0585702cd87cc4df9af74e235bb431
Author: Hyukjin Kwon
Date: 2018-12-05T12:17:21Z
Increases timeout for
StreamingLogisticRegressionWithSGDTests.test_training_and_prediction test