GitHub user keypointt opened a pull request:

    https://github.com/apache/spark/pull/12845

    [SPARK-14936][BUILD][TESTS] FlumePollingStreamSuite is slow

    ## What changes were proposed in this pull request?
    
    FlumePollingStreamSuite contains two tests which run for a minute each. 
This seems excessively slow and we should speed it up if possible.
    
    ## How was this patch tested?
    
    Tested on my local machine. In my observation, reducing the number of
    batches and the number of events in each batch **saves a couple of seconds**.
    
    ## Needs verification
    I dug into this test, and the slowness seems to be caused mainly by the
    multi-threaded tasks. As I understand it, when data is put into the
    channels, `executorCompletion.submit(new TxnSubmitter(channel))` starts a
    submitter task for each channel so that they submit in parallel, and a
    `CountDownLatch` waits until all of the data has been processed. Because
    that wait is synchronous, the main thread has to block until all threads
    finish.
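
To illustrate the waiting pattern described above, here is a minimal Java sketch. It is not the suite's actual code: the class name `LatchWaitSketch`, the method `submitAndWait`, and the sleep that stands in for `TxnSubmitter` putting events into a channel are all assumptions made for illustration. Only `ExecutorCompletionService`, `CountDownLatch`, and the submit-then-wait shape come from the description.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of "submit in parallel, then block on a latch".
final class LatchWaitSketch {

    // Submits one task per channel and blocks until every task has counted
    // the latch down. Returns how many tasks had finished when the wait ended.
    static int submitAndWait(int channels, long workMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(channels);
        ExecutorCompletionService<Void> executorCompletion =
                new ExecutorCompletionService<>(pool);
        CountDownLatch latch = new CountDownLatch(channels);
        AtomicInteger finished = new AtomicInteger();
        try {
            for (int i = 0; i < channels; i++) {
                // Assumed stand-in for `new TxnSubmitter(channel)`:
                // the sleep simulates writing one channel's batches.
                executorCompletion.submit(() -> {
                    Thread.sleep(workMillis);
                    finished.incrementAndGet();
                    latch.countDown();
                    return null;
                });
            }
            // The calling thread is parked here until all tasks are done.
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("finished submitters: " + submitAndWait(3, 100L));
    }
}
```

In this sketch the caller's wait is bounded by the slowest task, so giving each task less work (fewer or smaller batches) shortens the wait without removing it. Whether the real suite behaves the same way is exactly what needs verification.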
    
    I'm not sure whether this fixes the root cause. Could some senior
    developers take a look at it?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/keypointt/spark SPARK-14936

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12845.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12845
    
----
commit 506f6fabd04b43ce415beda9ed68a9c5c79f7f01
Author: Xin Ren <[email protected]>
Date:   2016-05-02T17:02:41Z

    [SPARK-14936] fix typo

commit 83dcc61aa2b998fa913c6275eecf41aedb2dcb96
Author: Xin Ren <[email protected]>
Date:   2016-05-02T18:49:24Z

    [SPARK-14936] change to smaller and less batch, to save some time

----

