GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/4633
[SPARK-1600] Refactor FileInputStream tests to remove Thread.sleep() calls
and SystemClock usage (branch-1.2 backport)
(This PR backports #3801 into `branch-1.2` (1.2.2))
This patch refactors Spark Streaming's FileInputStream tests to remove uses
of Thread.sleep() and SystemClock, which should hopefully resolve some
longstanding flakiness in these tests (see SPARK-1600).
Key changes:
- Modify FileInputDStream to use the scheduler's Clock instead of
System.currentTimeMillis(); this allows it to be tested using ManualClock.
- Fix a synchronization issue in ManualClock's `currentTime` method.
- Add a StreamingTestWaiter class which allows callers to block until a
certain number of batches have finished.
- Change the FileInputStream tests so that files' modification times are
manually set based off of ManualClock; this eliminates many Thread.sleep calls.
- Update these tests to use the withStreamingContext fixture.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark spark-1600-b12-backport
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4633.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4633
----
commit e5d3dc4cfb9c890c2685f2c5ce624cb23f336f9b
Author: Josh Rosen <[email protected]>
Date: 2015-01-06T08:31:19Z
[SPARK-1600] Refactor FileInputStream tests to remove Thread.sleep() calls
and SystemClock usage
This patch refactors Spark Streaming's FileInputStream tests to remove uses
of Thread.sleep() and SystemClock, which should hopefully resolve some
longstanding flakiness in these tests (see SPARK-1600).
Key changes:
- Modify FileInputDStream to use the scheduler's Clock instead of
System.currentTimeMillis(); this allows it to be tested using ManualClock.
- Fix a synchronization issue in ManualClock's `currentTime` method.
- Add a StreamingTestWaiter class which allows callers to block until a
certain number of batches have finished.
- Change the FileInputStream tests so that files' modification times are
manually set based off of ManualClock; this eliminates many Thread.sleep calls.
- Update these tests to use the withStreamingContext fixture.
Author: Josh Rosen <[email protected]>
Closes #3801 from JoshRosen/SPARK-1600 and squashes the following commits:
e4494f4 [Josh Rosen] Address a potential race when setting file
modification times
8340bd0 [Josh Rosen] Use set comparisons for output.
0b9c252 [Josh Rosen] Fix some ManualClock usage problems.
1cc689f [Josh Rosen] ConcurrentHashMap -> SynchronizedMap
db26c3a [Josh Rosen] Use standard timeout in ScalaTest `eventually` blocks.
3939432 [Josh Rosen] Rename StreamingTestWaiter to BatchCounter
0b9c3a1 [Josh Rosen] Wait for checkpoint to complete
863d71a [Josh Rosen] Remove Thread.sleep that was used to make task run
slowly
b4442c3 [Josh Rosen] batchTimeToSelectedFiles should be thread-safe
15b48ee [Josh Rosen] Replace several TestWaiter methods w/ ScalaTest
eventually.
fffc51c [Josh Rosen] Revert "Remove last remaining sleep() call"
dbb8247 [Josh Rosen] Remove last remaining sleep() call
566a63f [Josh Rosen] Fix log message and comment typos
da32f3f [Josh Rosen] Fix log message and comment typos
3689214 [Josh Rosen] Merge remote-tracking branch 'origin/master' into
SPARK-1600
c8f06b1 [Josh Rosen] Remove Thread.sleep calls in FileInputStream
CheckpointSuite test.
d4f2d87 [Josh Rosen] Refactor file input stream tests to not rely on
SystemClock.
dda1403 [Josh Rosen] Add StreamingTestWaiter class.
3c3efc3 [Josh Rosen] Synchronize `currentTime` in ManualClock
a95ddc4 [Josh Rosen] Modify FileInputDStream to use Clock class.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]