GitHub user squito opened a pull request:
https://github.com/apache/spark/pull/4857
[wip][SPARK-1391][SPARK-3151] 2g partition limit
https://issues.apache.org/jira/browse/SPARK-1391
This is still really rough and not ready to merge; I'm looking for feedback
on the overall design. I put a design doc on JIRA. The major issues I'd like
feedback on are:
1. How to test this? I added tests, but I disabled some of them so that I
don't destroy Jenkins. Some of these tests need ~16GB of memory to run.
I have smaller test cases for some things, but I really think we need some
tests that actually transfer a block that is > 2GB.
[SPARK-4767](https://github.com/apache/spark/pull/4048) would help with this.
Also looking for suggestions on more tests.
2. I could really use some advice on how to make `NettyBlockRpcServer`
robust in the way it handles `UploadPartialBlock`. (a) Is the use of timeouts
sensible? (b) How do I come up with reasonable timeouts? (c) Are there other
failure cases I'm not thinking about?
3. How to test performance? I haven't tested performance at all so far.
Again, my only goal for now is to maintain performance on < 2GB blocks; we can
figure out how to improve the performance of > 2GB blocks later (though of
course easy fixes now are welcome). I'll try to do some performance testing
myself but could use advice.
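
To make question 2 concrete, here is a minimal sketch of one way a server
could track chunked uploads and drop those that go idle. It is not the code in
this PR: `PartialUploadTracker`, its methods, and the string block ids are all
hypothetical names, and the clock is passed in explicitly so the behavior can
be tested without sleeping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not from this PR): remembers when each in-progress
// upload last received a chunk and evicts uploads idle longer than a timeout.
final class PartialUploadTracker {
  // Immutable snapshot of one upload, replaced on every chunk so that
  // readers on other threads never see a half-updated state.
  private static final class Pending {
    final long bytesReceived;
    final long lastUpdateMs;

    Pending(long bytesReceived, long lastUpdateMs) {
      this.bytesReceived = bytesReceived;
      this.lastUpdateMs = lastUpdateMs;
    }
  }

  private final Map<String, Pending> pending = new ConcurrentHashMap<>();
  private final long idleTimeoutMs;

  PartialUploadTracker(long idleTimeoutMs) {
    this.idleTimeoutMs = idleTimeoutMs;
  }

  // Record that a chunk of chunkSize bytes arrived for blockId at nowMs.
  void onChunk(String blockId, int chunkSize, long nowMs) {
    pending.compute(blockId, (id, p) ->
        new Pending((p == null ? 0L : p.bytesReceived) + chunkSize, nowMs));
  }

  // The final chunk arrived; stop tracking the upload.
  void onComplete(String blockId) {
    pending.remove(blockId);
  }

  // Drop uploads with no chunk for more than idleTimeoutMs. The returned
  // count is approximate if other threads update the map concurrently.
  int evictIdle(long nowMs) {
    int before = pending.size();
    pending.values().removeIf(p -> nowMs - p.lastUpdateMs > idleTimeoutMs);
    return before - pending.size();
  }

  long bytesReceived(String blockId) {
    Pending p = pending.get(blockId);
    return p == null ? 0L : p.bytesReceived;
  }

  int pendingCount() {
    return pending.size();
  }
}
```

In this sketch the timeout is measured from the most recent chunk rather than
from the start of the upload, so a large block that is still making progress
is not evicted. How to pick the timeout value, question (b), is left open.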
Thanks to @sryza, @ryanlecompte and @harishreedharan for helping bounce
ideas around, and @mridulm for work on an earlier implementation that served as
inspiration. Mistakes are all mine though :)
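
For background on the limit itself: a `java.nio.ByteBuffer` is indexed by
`int`, so a single buffer cannot hold more than `Integer.MAX_VALUE` bytes
(about 2GB). The sketch below shows the general idea of addressing a larger
logical buffer with a `long` position across several smaller chunks. It is an
illustration only; `ChunkedBuffer` is a hypothetical name, and the
`LargeByteBuffer` in this PR may be designed quite differently.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration (not the PR's LargeByteBuffer): a logical buffer
// addressed by a long position and backed by fixed-size ByteBuffer chunks.
final class ChunkedBuffer {
  private final ByteBuffer[] chunks;
  private final int chunkSize;
  private final long size;

  ChunkedBuffer(long size, int chunkSize) {
    if (size < 0 || chunkSize <= 0) {
      throw new IllegalArgumentException("size must be >= 0 and chunkSize > 0");
    }
    this.size = size;
    this.chunkSize = chunkSize;
    // Round up so that a partial final chunk is included.
    int n = Math.toIntExact((size + chunkSize - 1) / chunkSize);
    this.chunks = new ByteBuffer[n];
    long remaining = size;
    for (int i = 0; i < n; i++) {
      chunks[i] = ByteBuffer.allocate((int) Math.min(chunkSize, remaining));
      remaining -= chunks[i].capacity();
    }
  }

  long size() {
    return size;
  }

  int chunkCount() {
    return chunks.length;
  }

  // Absolute read: locate the chunk, then the offset within that chunk.
  byte get(long pos) {
    checkIndex(pos);
    return chunks[(int) (pos / chunkSize)].get((int) (pos % chunkSize));
  }

  // Absolute write, using the same chunk/offset arithmetic as get.
  void put(long pos, byte value) {
    checkIndex(pos);
    chunks[(int) (pos / chunkSize)].put((int) (pos % chunkSize), value);
  }

  private void checkIndex(long pos) {
    if (pos < 0 || pos >= size) {
      throw new IndexOutOfBoundsException("position " + pos + ", size " + size);
    }
  }
}
```

Making the chunk size a parameter lets a small test, for example 10 bytes in
chunks of 4, cross chunk boundaries without needing gigabytes of memory. That
only covers the index arithmetic; it is not a substitute for a test that
actually transfers a block larger than 2GB.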
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/squito/spark SPARK-1391_2g_partition_limit
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4857.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4857
----
commit 5cdcd4246e586346a8e1ac2242dd795fdb1ae068
Author: Imran Rashid <[email protected]>
Date: 2015-02-20T22:35:12Z
add some failing tests, though these probably shouldnt actually get merged
commit 03db862833f3c4feef2d72620bc5c9a893dab2f5
Author: Imran Rashid <[email protected]>
Date: 2015-02-23T20:28:22Z
steal some code from earlier work of @mridulm
commit d6337f03a4ac2971a004ef821281723e857f9008
Author: Imran Rashid <[email protected]>
Date: 2015-02-24T18:18:38Z
wip -- changed a bunch of types to LargeByteBuffer; discovered problem on
replicate()
commit a139e97fe1aeac279b9c47119745c0f45eb7d8c5
Author: Imran Rashid <[email protected]>
Date: 2015-02-25T19:27:13Z
compiling but all sorts of bad casting etc.
commit 4965bad00574a05e133e7caeba56cd6115fe35b6
Author: Imran Rashid <[email protected]>
Date: 2015-02-25T20:28:16Z
move LargeByteBuffer to network-common, since we need it there for the
shuffles
commit 149d4fa3fa55403df90109b440a3523d3f4ab92b
Author: Imran Rashid <[email protected]>
Date: 2015-02-25T21:50:33Z
move large byte buffer to network/common ... still lots of crud
commit 01cafbf15026fdcbfd58566335802082493a491c
Author: Imran Rashid <[email protected]>
Date: 2015-02-25T21:53:22Z
tests compile too
commit ce391a0dbbba3d169d4013d2e387b7808065b3f8
Author: Imran Rashid <[email protected]>
Date: 2015-02-25T22:00:00Z
failing test case (though its crappy)
commit 29f0a8a10c685ea2742d239a748bc6c5d7798380
Author: Imran Rashid <[email protected]>
Date: 2015-02-27T19:19:12Z
fix use of LargeByteBuffer in some tests, create UploadPartialBlock
commit dcb46697d59fa77ac643e438b346eb28972d9e8f
Author: Imran Rashid <[email protected]>
Date: 2015-02-27T20:59:04Z
add real test case for uploading large blocks (failing now)
commit 660f5e362439d79d8dfd000a805be0ad5181106c
Author: Imran Rashid <[email protected]>
Date: 2015-02-28T02:57:32Z
flesh out NettyBlockTransfer#uploadBlock
commit 4c228a07173e06f8da449db17b878d220e14dea0
Author: Imran Rashid <[email protected]>
Date: 2015-02-28T03:14:13Z
minor cleanup
commit cf7c3a7067aaa61732782995984f17fa94a6cff7
Author: Imran Rashid <[email protected]>
Date: 2015-02-28T04:45:50Z
cleanup abandonded block uploads
commit fe90fd682d71ce4c156dde9e6a016e7923e65aad
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T15:46:19Z
crank up memory for tests
commit 857f3dfae649d56dbd37a022ddbe594c2a9bd0ac
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T15:47:16Z
fix LargeByteBuffer dispose()
commit b700723033d49094a4c0ee6f43a23592b29ae01f
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T16:55:19Z
maven needs you to request lots of extra memory for some reason
commit 6b102a028df0fae8d60b5cb2174fc4f380a81627
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T19:17:16Z
passing tests! assorted little changes
commit 84df2dd20664fa9333a2c01e06ed1ec1159e1b51
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T20:07:59Z
fix tests
commit 7a5d1acb4a512372800d7829aa676594c1f0eed3
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T20:12:30Z
dont kill jenkins with huge tests
commit 38d6993f7f687f6d857970229444644ddfe30db1
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T20:12:38Z
test cleanup
commit fc0d118b48d16a41f1e690b073c156fbf1a47437
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T20:25:58Z
fixup test & fix bug in WrappedLargeByteBuffer
commit 0e3700f3b69031bb91e6756d2abbbc993a3e8dca
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T21:12:55Z
cleanup
commit 6f6a8d7c512ab66ee8f03fa725d97533d0672c8e
Author: Imran Rashid <[email protected]>
Date: 2015-03-02T21:17:24Z
cleanup
----