GitHub user aarondav opened a pull request:

    https://github.com/apache/spark/pull/2753

    [SPARK-3453] Netty-based BlockTransferService, extracted from Spark core

    This PR encapsulates #2330, which is itself a continuation of #2240. The 
first goal of this PR is to provide an alternate, simpler implementation of the 
ConnectionManager which is based on Netty.
    
    In addition to this goal, however, we want to resolve 
[SPARK-3796](https://issues.apache.org/jira/browse/SPARK-3796), which calls for 
a standalone shuffle service that can be integrated into the YARN NodeManager 
or the Standalone Worker, or run on its own. This PR takes the first step in 
this direction by ensuring that the actual Netty service is as small as 
possible and extracted from Spark core. Given this, we should be able to build 
a standalone jar that can be included in other JVMs without incurring 
significant dependency or runtime issues. The actual work to ensure that such a 
standalone shuffle service works in Spark is left for a future PR, however.
    
    In order to minimize dependencies and allow for the service to be 
long-running (possibly much longer-running than Spark, and possibly having to 
support multiple versions of Spark simultaneously), the entire service has been 
ported to Java, where we have full control over the binary compatibility of the 
components and do not depend on the Scala runtime or version.
    
    These issues have been addressed by folding in #2330:
    
    SPARK-3453: Refactor Netty module to use BlockTransferService interface
    SPARK-3018: Release all buffers upon task completion/failure
    SPARK-3002: Create a connection pool and reuse clients across different 
threads
    SPARK-3017: Integration tests and unit tests for connection failures
    SPARK-3049: Make sure client doesn't block when server/connection has 
error(s)
    SPARK-3502: SO_RCVBUF and SO_SNDBUF should be bootstrap childOption, not 
option
    SPARK-3503: Disable thread local cache in PooledByteBufAllocator
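
As an illustration of the connection-pool item above (SPARK-3002), here is a minimal sketch of one way to reuse a single client per remote address across threads. It is not the code in this PR: `PooledClient`, `ClientPool`, and the `host:port` key are hypothetical names chosen for the example, and the client is a stand-in object rather than a real Netty channel.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical client handle; stands in for a real network client. */
final class PooledClient {
    final String address;
    private volatile boolean open = true;

    PooledClient(String address) { this.address = address; }

    boolean isOpen() { return open; }

    void close() { open = false; }
}

/** Hypothetical pool: at most one live client per remote address, shared by all threads. */
final class ClientPool {
    private final ConcurrentHashMap<String, PooledClient> clients = new ConcurrentHashMap<>();
    private final AtomicInteger created = new AtomicInteger();

    PooledClient get(String host, int port) {
        String key = host + ":" + port;
        // compute() is atomic per key, so concurrent callers asking for the
        // same address cannot end up creating two clients.
        return clients.compute(key, (k, existing) -> {
            if (existing != null && existing.isOpen()) {
                return existing;            // reuse the live client
            }
            created.incrementAndGet();      // first use, or the old client was closed
            return new PooledClient(k);
        });
    }

    /** Number of clients created so far (useful for checking reuse). */
    int createdCount() { return created.get(); }
}
```

Calling `get("host1", 7077)` twice returns the same object; a new client is created only after the previous one has been closed.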
    
    TODO before mergeable:
    [ ] Implement uploadBlock()
    [ ] Unit tests for RPC side of code
    [ ] Performance testing
    [ ] Turn OFF by default (currently on for unit testing)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aarondav/spark netty

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2753.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2753
    
----
commit 165eab1518f5184ef9609f26d374c5ccefd05472
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-09T07:29:33Z

    [SPARK-3453] Refactor Netty module to use BlockTransferService.
    
    Also includes some partial support for uploading blocks.

commit 1760d3292ecf262e4c77c9e3b28bfd2900d25840
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-09T07:42:37Z

    Use Epoll.isAvailable in BlockServer as well.

commit 2b44cf1b7547919bbe7386e954fe2f56be046790
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-09T21:36:31Z

    Added more documentation.

commit 064747b50a591acb132b2c750957e79f54dfa88f
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-10T06:38:38Z

    Reference count buffers and clean them up properly.

commit b5c8d1fca6d3cf5c2b95395310200c8149a7eb16
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-10T08:09:44Z

    Fixed ShuffleBlockFetcherIteratorSuite.

commit 108c9edaed06c5e046a21c9a8e54c50390da9a0b
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-10T08:10:04Z

    Forgot to add TestSerializer to the commit list.

commit 1be4e8ee7d932821c789cb974310e5d59df4ff84
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-10T08:11:40Z

    Shorten NioManagedBuffer and NettyManagedBuffer class names.

commit cb589ec7b6d3758498249b63b395634efb83d8ba
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-11T02:01:23Z

    Added more test cases covering cleanup when fault happens in 
ShuffleBlockFetcherIteratorSuite

commit 5cd33d7798ae742e76107bb976d8478ab9476ae7
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-11T02:55:54Z

    Fixed style violation.

commit 9e0cb8736be6d38e3f30766271d28875ceca1ae8
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-11T04:04:56Z

    Fixed BlockClientHandlerSuite

commit d23ed7bfd912770ace7eed7cd0dff2db6ac826e3
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-12T01:28:45Z

    Incorporated feedback from Norman:
    - use same pool for boss and worker
    - remove ioratio
    - disable caching of byte buf allocator
    - childoption sendbuf/receivebuf
    - fire exception through pipeline
    
    In addition:
    - fire failure handler BlockFetchingListener at least once per block.
    - enabled a bunch of ignored tests
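
    The "fire failure handler at least once per block" behaviour can be sketched as below. This is an assumption-laden illustration, not the PR's code: `FetchCallback`, `OutstandingFetches`, and `RecordingCallback` are hypothetical names, and the callback interface is not the real BlockFetchingListener signature. In this sketch each still-pending block is failed exactly once when the connection fails.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical callback interface; not the real BlockFetchingListener signature. */
interface FetchCallback {
    void onSuccess(String blockId, byte[] data);
    void onFailure(String blockId, Throwable cause);
}

/** Hypothetical tracker of in-flight block requests on one connection. */
final class OutstandingFetches {
    private final Map<String, FetchCallback> pending = new LinkedHashMap<>();

    synchronized void register(String blockId, FetchCallback callback) {
        pending.put(blockId, callback);
    }

    void succeed(String blockId, byte[] data) {
        FetchCallback callback;
        synchronized (this) {
            callback = pending.remove(blockId);
        }
        if (callback != null) {
            callback.onSuccess(blockId, data);
        }
    }

    /** On a connection-level error, notify every block that is still pending. */
    void failAll(Throwable cause) {
        Map<String, FetchCallback> snapshot;
        synchronized (this) {
            snapshot = new LinkedHashMap<>(pending);
            pending.clear();   // a later failAll() will not notify these blocks again
        }
        for (Map.Entry<String, FetchCallback> entry : snapshot.entrySet()) {
            entry.getValue().onFailure(entry.getKey(), cause);
        }
    }
}

/** Small recording callback used only to demonstrate the behaviour. */
final class RecordingCallback implements FetchCallback {
    final List<String> succeeded = new ArrayList<>();
    final List<String> failed = new ArrayList<>();

    @Override public void onSuccess(String blockId, byte[] data) { succeeded.add(blockId); }

    @Override public void onFailure(String blockId, Throwable cause) { failed.add(blockId); }
}
```

    The callbacks run outside the lock so that a slow or re-entrant listener cannot block other threads registering new fetches.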

commit b2f3281d0de540d38ea5b4c7bf576b775405d56d
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-12T05:12:08Z

    Added connection pooling.

commit 14323a55ebfa7ccc684c2ae78eac299a4426b353
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-12T05:13:02Z

    Removed BlockManager.getLocalShuffleFromDisk.

commit f0a16e9ec7d5c811dff3cd5219548e05077099c8
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-12T07:40:53Z

    Fixed test hanging.

commit 519d64dcb7768b3657438a4cfc85ee8065f56c2a
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-12T21:18:58Z

    Mark private package visibility and MimaExcludes.

commit c066309afbb0e248a8b2b808d997e6b37a2bff1e
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-13T05:42:32Z

    Implement java.io.Closeable interface.

commit 6afc435037a0448d6eb243bd18411ef25e3a2cf7
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-17T05:51:11Z

    Added logging.

commit f63fb4c1976e503238b7d7151f8f45f40ced36e9
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-29T18:13:44Z

    Add more debug message.

commit d68f3286a4a9795dfb61a8a63b8a20b3eafb4821
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-29T18:30:13Z

    Logging close() in case close() fails.

commit 1bdd7eec5d9ddb5a9eb33c9733878aea3ca26ba6
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-29T19:07:53Z

    Fixed tests.

commit bec4ea2b54659cfed6f54e527aa878dfbff829c7
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-29T19:22:01Z

    Removed OIO and added num threads settings.

commit 4b18db29edcdb87577fd033835275fd1c2957dcd
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-29T22:45:05Z

    Copy the buffer in fetchBlockSync.

commit a0518c766f0f4eba24459ffac61dce789fc14092
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-30T02:22:34Z

    Implemented block uploads.

commit 407e59afd3cb7385af9f63dc2263a40c7c21d783
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-30T02:37:28Z

    Fix style violation.

commit f6c220df8406be14fbdb7270682727e1085518a4
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-30T06:30:17Z

    Merge with latest master.

commit 5d98ce3de1deeeb7fbdc26b9303a591c46f1892b
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-30T07:56:32Z

    Flip buffer.

commit f7e7568414692989215d97abce9dda2fe172abb4
Author: Reynold Xin <r...@apache.org>
Date:   2014-09-30T19:28:21Z

    Fixed spark.shuffle.io.receiveBuffer setting.

commit c0cd242f375e939e1422e30d4b230a8a78b13b88
Author: Aaron Davidson <aa...@databricks.com>
Date:   2014-10-06T00:58:43Z

    [SPARK-3453] Netty-based BlockTransferService, extracted from Spark core
    
    This PR encapsulates #2330, which is itself a continuation of #2240. The
    first goal of this PR is to provide an alternate, simpler implementation of
    the ConnectionManager which is based on Netty.
    
    In addition to this goal, however, we want to resolve
    [SPARK-3796](https://issues.apache.org/jira/browse/SPARK-3796), which calls
    for a standalone shuffle service that can be integrated into the YARN
    NodeManager or the Standalone Worker, or run on its own. This PR takes the
    first step in this direction by ensuring that the actual Netty service is
    as small as possible and extracted from Spark core. Given this, we should
    be able to build a standalone jar that can be included in other JVMs
    without incurring significant dependency or runtime issues. The actual work
    to ensure that such a standalone shuffle service works in Spark is left for
    a future PR, however.
    
    In order to minimize dependencies and allow for the service to be
    long-running (possibly much longer-running than Spark, and possibly having
    to support multiple versions of Spark simultaneously), the entire service
    has been ported to Java, where we have full control over the binary
    compatibility of the components and do not depend on the Scala runtime or
    version.
    
    These issues have been addressed by folding in #2330:
    
    SPARK-3453: Refactor Netty module to use BlockTransferService interface
    SPARK-3018: Release all buffers upon task completion/failure
    SPARK-3002: Create a connection pool and reuse clients across different 
threads
    SPARK-3017: Integration tests and unit tests for connection failures
    SPARK-3049: Make sure client doesn't block when server/connection has 
error(s)
    SPARK-3502: SO_RCVBUF and SO_SNDBUF should be bootstrap childOption, not 
option
    SPARK-3503: Disable thread local cache in PooledByteBufAllocator
    
    TODO before mergeable:
    [ ] Implement uploadBlock()
    [ ] Unit tests for RPC side of code
    [ ] Performance testing
    [ ] Turn OFF by default (currently on for unit testing)

----

