GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/2681

    embed small object in broadcast to avoid RPC

    For most tasks the serialized data is small (for example, less than 4k or 
8k), so we can avoid the RPC entirely if the data is embedded in the Broadcast 
object itself.
    
    With this patch, the size of a task will be similar to what it was before we 
used broadcast for them: no RPC is needed, but the data is still cached, with 
only one deserialization per executor.
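
    The idea can be sketched roughly as follows. This is only an illustration 
in plain Python, not the code in the patch: the names SketchBroadcast, 
EMBED_THRESHOLD, BLOCK_STORE, EXECUTOR_CACHE and fetch_block are hypothetical, 
the 8k cutoff is an assumed value, and the "RPC" is simulated with a dictionary 
lookup.

```python
import pickle

# Assumed cutoff (bytes) below which the payload travels with the object.
EMBED_THRESHOLD = 8 * 1024

# Hypothetical stand-ins for driver-side block storage and a per-executor cache.
BLOCK_STORE = {}
EXECUTOR_CACHE = {}
STATS = {"fetches": 0}  # counts simulated RPCs


def fetch_block(bid):
    """Simulated RPC that fetches the serialized payload by broadcast id."""
    STATS["fetches"] += 1
    return BLOCK_STORE[bid]


class SketchBroadcast:
    """Toy broadcast handle that embeds small payloads in itself."""

    def __init__(self, bid, value):
        self.bid = bid
        data = pickle.dumps(value)
        BLOCK_STORE[bid] = data
        # Small payloads are carried inside the handle; large ones are not.
        self.embedded = data if len(data) <= EMBED_THRESHOLD else None

    @property
    def value(self):
        # Deserialize at most once per broadcast id, then reuse the cached value.
        if self.bid not in EXECUTOR_CACHE:
            if self.embedded is not None:
                data = self.embedded          # no RPC needed
            else:
                data = fetch_block(self.bid)  # fall back to a fetch
            EXECUTOR_CACHE[self.bid] = pickle.loads(data)
        return EXECUTOR_CACHE[self.bid]


# Shipping the handle to an "executor" is simulated by a pickle round trip.
small = pickle.loads(pickle.dumps(SketchBroadcast(1, {"a": 1})))
large = pickle.loads(pickle.dumps(SketchBroadcast(2, b"x" * 100000)))
```

    In this sketch, reading small.value never calls fetch_block, while 
large.value calls it once and later reads hit the cache.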

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark embed

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2681.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2681
    
----
commit 3fd051d4bc9a305375f6d5abeb64258beacaa3d9
Author: Davies Liu <[email protected]>
Date:   2014-10-03T00:07:27Z

    embed small object in broadcast to avoid RPC

----


