GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/4633

    [SPARK-1600] Refactor FileInputStream tests to remove Thread.sleep() calls
    and SystemClock usage (branch-1.2 backport)

    This PR backports #3801 into `branch-1.2` (1.2.2).
    
    This patch refactors Spark Streaming's FileInputStream tests to remove uses
    of Thread.sleep() and SystemClock, which should resolve some longstanding
    flakiness in these tests (see SPARK-1600).
    
    Key changes:
    
    - Modify FileInputDStream to use the scheduler's Clock instead of
      System.currentTimeMillis(); this allows it to be tested using ManualClock.
    - Fix a synchronization issue in ManualClock's `currentTime` method.
    - Add a StreamingTestWaiter class which allows callers to block until a
      certain number of batches have finished.
    - Change the FileInputStream tests so that files' modification times are
      set manually based on ManualClock; this eliminates many Thread.sleep calls.
    - Update these tests to use the withStreamingContext fixture.
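
To illustrate the clock-injection idea behind the changes listed above, here is a minimal sketch in Java. It is not Spark's actual code: the `Clock` interface, this `ManualClock`, and the `NewFileFinder` helper are hypothetical stand-ins, and file modification times are modelled as a plain map rather than real files. It only shows why reading time through an injected clock lets a test advance time deterministically instead of sleeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical clock abstraction (illustrative only, not Spark's API). */
interface Clock {
    long currentTime();
}

/** Test clock whose time only moves when the test advances it. */
class ManualClock implements Clock {
    private long time;

    ManualClock(long startTime) {
        this.time = startTime;
    }

    // Reads and writes share one lock, so a reader on another thread
    // always observes the most recently written value.
    @Override
    public synchronized long currentTime() {
        return time;
    }

    public synchronized void advance(long millis) {
        time += millis;
    }
}

/** Hypothetical selector: picks entries modified within the last window. */
class NewFileFinder {
    private final Clock clock;
    private final long windowMillis;

    NewFileFinder(Clock clock, long windowMillis) {
        this.clock = clock;
        this.windowMillis = windowMillis;
    }

    /** Returns names whose modification time lies in (now - window, now]. */
    List<String> select(Map<String, Long> modTimes) {
        long now = clock.currentTime(); // injected clock, not System.currentTimeMillis()
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : modTimes.entrySet()) {
            long modTime = entry.getValue();
            if (modTime > now - windowMillis && modTime <= now) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```

A test can then register a modification time, call `advance(...)`, and assert on `select(...)` immediately, with no dependence on wall-clock timing.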

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark spark-1600-b12-backport

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4633
    
----
commit e5d3dc4cfb9c890c2685f2c5ce624cb23f336f9b
Author: Josh Rosen <[email protected]>
Date:   2015-01-06T08:31:19Z

    [SPARK-1600] Refactor FileInputStream tests to remove Thread.sleep() calls
    and SystemClock usage
    
    This patch refactors Spark Streaming's FileInputStream tests to remove uses
    of Thread.sleep() and SystemClock, which should resolve some longstanding
    flakiness in these tests (see SPARK-1600).
    
    Key changes:
    
    - Modify FileInputDStream to use the scheduler's Clock instead of
      System.currentTimeMillis(); this allows it to be tested using ManualClock.
    - Fix a synchronization issue in ManualClock's `currentTime` method.
    - Add a StreamingTestWaiter class which allows callers to block until a
      certain number of batches have finished.
    - Change the FileInputStream tests so that files' modification times are
      set manually based on ManualClock; this eliminates many Thread.sleep calls.
    - Update these tests to use the withStreamingContext fixture.
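
The "block until a certain number of batches have finished" helper described above could look roughly like the following Java sketch. The class and method names are hypothetical and this is not the implementation from the patch. It only shows the wait/notify pattern with a deadline that lets a test wait on an event instead of sleeping for a fixed time.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: counts completed batches and lets a test wait for N of them. */
class BatchCounter {
    private int completed = 0;

    /** Called by whatever listener observes a batch finishing. */
    public synchronized void onBatchCompleted() {
        completed++;
        notifyAll(); // wake any test thread blocked in waitUntilBatchesCompleted
    }

    public synchronized int getCompleted() {
        return completed;
    }

    /**
     * Blocks until at least `target` batches have completed or the timeout
     * elapses. Returns true if the target was reached, false on timeout or
     * if the waiting thread was interrupted.
     */
    public synchronized boolean waitUntilBatchesCompleted(int target, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (completed < target) { // loop guards against spurious wakeups
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

The timeout is tracked as a deadline, so repeated wakeups cannot extend the total wait beyond `timeoutMillis`.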
    
    Author: Josh Rosen <[email protected]>
    
    Closes #3801 from JoshRosen/SPARK-1600 and squashes the following commits:
    
    e4494f4 [Josh Rosen] Address a potential race when setting file
            modification times
    8340bd0 [Josh Rosen] Use set comparisons for output.
    0b9c252 [Josh Rosen] Fix some ManualClock usage problems.
    1cc689f [Josh Rosen] ConcurrentHashMap -> SynchronizedMap
    db26c3a [Josh Rosen] Use standard timeout in ScalaTest `eventually` blocks.
    3939432 [Josh Rosen] Rename StreamingTestWaiter to BatchCounter
    0b9c3a1 [Josh Rosen] Wait for checkpoint to complete
    863d71a [Josh Rosen] Remove Thread.sleep that was used to make task run
            slowly
    b4442c3 [Josh Rosen] batchTimeToSelectedFiles should be thread-safe
    15b48ee [Josh Rosen] Replace several TestWaiter methods w/ ScalaTest
            eventually.
    fffc51c [Josh Rosen] Revert "Remove last remaining sleep() call"
    dbb8247 [Josh Rosen] Remove last remaining sleep() call
    566a63f [Josh Rosen] Fix log message and comment typos
    da32f3f [Josh Rosen] Fix log message and comment typos
    3689214 [Josh Rosen] Merge remote-tracking branch 'origin/master' into
            SPARK-1600
    c8f06b1 [Josh Rosen] Remove Thread.sleep calls in FileInputStream
            CheckpointSuite test.
    d4f2d87 [Josh Rosen] Refactor file input stream tests to not rely on
            SystemClock.
    dda1403 [Josh Rosen] Add StreamingTestWaiter class.
    3c3efc3 [Josh Rosen] Synchronize `currentTime` in ManualClock
    a95ddc4 [Josh Rosen] Modify FileInputDStream to use Clock class.

----


