[GitHub] spark pull request: SPARK-1057 (alternative) Remove fastutil

srowen Fri, 28 Mar 2014 14:58:11 -0700

GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/266


    SPARK-1057 (alternative) Remove fastutil

    (This is for discussion at this point -- I'm not suggesting this should be 
committed.)
    
    This is what removing fastutil looks like. Much of it is straightforward, 
like using `java.io` buffered stream classes, and Guava for murmurhash3.
    
    Replacing the uses of `FastByteArrayOutputStream` was a little trickier. I 
think the change to `java.io` entails an extra array copy in only one case, 
though.
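    
    As a rough illustration of where such a copy comes from (a sketch, not code 
from the patch): `java.io.ByteArrayOutputStream.toByteArray` always returns a 
fresh copy, while fastutil's `FastByteArrayOutputStream` exposes its backing 
array directly. The `ExposedByteArrayOutputStream` class and its 
`wrapWithoutCopy` method below are hypothetical names, showing one possible 
way to avoid the copy with plain `java.io`.

```scala
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer

// Hypothetical helper for illustration only; not taken from the patch.
// ByteArrayOutputStream keeps its bytes in the protected fields `buf` and
// `count`, so a subclass can expose them without the copy made by toByteArray.
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
  def wrapWithoutCopy(): ByteBuffer = ByteBuffer.wrap(buf, 0, count)
}

object ArrayCopySketch {
  def main(args: Array[String]): Unit = {
    val out = new ExposedByteArrayOutputStream
    out.write(Array[Byte](1, 2, 3))

    // toByteArray allocates a new, exactly sized array on every call.
    val copied = out.toByteArray
    println(copied.length)

    // The wrapped buffer shares the stream's internal array instead.
    val shared = out.wrapWithoutCopy()
    println(shared.remaining())
  }
}
```

    The shared buffer is only valid until the stream is written to again, 
which is the usual trade-off for skipping the copy.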
    
    The rest of the change uses `OpenHashMap` and `OpenHashSet` in place of the 
fastutil collections. That code is now written in terms of more Scala-like 
operations.
    
    `OpenHashMap` is where I made three non-trivial changes to make it work, 
and they need review:
    
    - It is no longer private
    - The key type must have a `ClassTag`
    - Unless a lot of other code changes, the key type can no longer be 
required to be a supertype of `Null` (`K >: Null`)
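    
    To make the last two points concrete, here is a minimal sketch of what a 
`ClassTag` context bound and a `>: Null` lower bound each do in Scala. 
`KeyStore` and `NullableKeyStore` are hypothetical, simplified stand-ins, not 
Spark's `OpenHashMap`, and the sketch makes no claim about how the patch 
itself is written.

```scala
import scala.reflect.ClassTag

// Hypothetical, simplified stand-ins for illustration; not Spark's classes.

// A ClassTag context bound lets generic code allocate a properly typed
// Array[K], including primitive arrays when K is Int or Long.
class KeyStore[K: ClassTag](capacity: Int) {
  private val keys: Array[K] = new Array[K](capacity)
  def keyArrayClass: Class[_] = keys.getClass
}

// The lower bound K >: Null makes null a legal value of K, so null can act
// as an "empty slot" marker, but it rules out primitive key types.
class NullableKeyStore[K >: Null] {
  private var slot: K = null
  def isEmpty: Boolean = slot == null
  def put(k: K): Unit = { slot = k }
}

object KeyBoundsSketch {
  def main(args: Array[String]): Unit = {
    println(new KeyStore[Int](4).keyArrayClass.getSimpleName)
    println(new KeyStore[String](4).keyArrayClass.getSimpleName)

    val store = new NullableKeyStore[String]
    store.put("a")
    println(store.isEmpty)
    // new NullableKeyStore[Int] would not compile:
    // Int is not a supertype of Null.
  }
}
```

    Dropping the `>: Null` bound admits primitive keys, at the cost that the 
compiler no longer guarantees `null` is a valid key value.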
    
    It all works and tests pass, and I think there is reason to believe it's OK 
from a speed perspective.
    
    But what about those last changes? 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-1057-alternate

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/266.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #266
    
----
commit e4c8adcfb4152141ca7046fdfe08778ecbcf58c5
Author: Sean Owen <[email protected]>
Date:   2014-03-28T21:50:20Z

    Remove use of fastutil and replace with use of java.io, spark.util and 
Guava classes

----


