[GitHub] spark pull request: [SQL] support arbitrary object in UnsafeRow

davies Tue, 23 Jun 2015 12:20:12 -0700

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/6959


    [SQL] support arbitrary object in UnsafeRow

    This PR adds support for arbitrary objects in UnsafeRow (in both the grouping 
key and the aggregation buffer).
    
    Two object pools are created to hold the non-primitive objects, and the 
index of each object in its pool is stored in the UnsafeRow. So that grouping 
keys can still be compared as bytes, the objects in a key are stored in a unique 
object pool, which ensures that equal objects always get the same index (which 
is also used as the hashCode).
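
A minimal sketch of the two-pool idea described above, not the code in this PR. The class names `ObjectPoolSketch` and `UniqueObjectPoolSketch` are hypothetical, and the sketch assumes pooled objects implement `equals`/`hashCode` consistently. It shows why deduplication in the key pool lets equal keys be compared through their indices alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical append-only pool: every put() gets a fresh index. */
class ObjectPoolSketch {
    protected final List<Object> objects = new ArrayList<>();

    /** Stores obj and returns the index a row slot would hold in its place. */
    public int put(Object obj) {
        objects.add(obj);
        return objects.size() - 1;
    }

    public Object get(int index) {
        return objects.get(index);
    }

    public int size() {
        return objects.size();
    }
}

/** Hypothetical deduplicating pool: equal objects always map to one index. */
class UniqueObjectPoolSketch extends ObjectPoolSketch {
    // Assumption: pooled objects have consistent equals()/hashCode().
    private final Map<Object, Integer> indexOf = new HashMap<>();

    @Override
    public int put(Object obj) {
        Integer existing = indexOf.get(obj);
        if (existing != null) {
            return existing; // same object value, same index, same row bytes
        }
        int index = super.put(obj);
        indexOf.put(obj, index);
        return index;
    }
}
```

With a pool like this for grouping keys, two rows whose key objects are equal end up holding identical index values, so a byte-wise comparison of the rows agrees with object equality. The plain pool is enough for aggregation buffers, where values are only looked up by index.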
    
    For StringType and BinaryType, values are still stored as var-length data in 
the UnsafeRow when it is initialized, for better performance. On update, however, 
they become objects inside the object pools (which leaves some garbage behind in 
the buffer).
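
The following sketch illustrates the trade-off described above under invented assumptions. The class `StringSlotSketch`, its slot encoding (a non-negative value packs offset and length, a negative value refers to a pool entry), and the `garbageBytes` helper are all hypothetical and are not UnsafeRow's actual layout. It only shows how an update can redirect a slot to a pooled object while the originally inlined bytes stay behind unused.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single string field of a row; not UnsafeRow's real layout. */
class StringSlotSketch {
    // Var-length region written once, when the row is initialized.
    private final byte[] buffer;
    // Assumed encoding: >= 0 means (offset << 32) | length into buffer,
    // < 0 means -(poolIndex + 1), i.e. the value now lives in the pool.
    private long slot;
    private final List<Object> pool = new ArrayList<>();

    StringSlotSketch(String initial) {
        buffer = initial.getBytes(StandardCharsets.UTF_8);
        slot = (0L << 32) | buffer.length; // inlined at offset 0
    }

    /** Updates go to the pool; the inlined bytes are not reclaimed. */
    void update(String value) {
        pool.add(value);
        slot = -((long) pool.size()); // -(index + 1) for the new entry
    }

    String get() {
        if (slot < 0) {
            return (String) pool.get((int) (-slot - 1));
        }
        int offset = (int) (slot >>> 32);
        int length = (int) slot;
        return new String(buffer, offset, length, StandardCharsets.UTF_8);
    }

    /** Bytes in the buffer that no slot refers to any more. */
    int garbageBytes() {
        return slot < 0 ? buffer.length : 0;
    }
}
```

For example, a slot initialized with `"abc"` reads back from the buffer; after `update("longer value")` it reads from the pool, and the three original bytes remain in the buffer as garbage.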
    
    BTW: I will create a JIRA once issues.apache.org is available.
    
    cc @JoshRosen @rxin 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark unsafe_obj

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6959.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6959
    
----
commit 236d6de2860bc64e3ba3ab352a5abd123685d8c0
Author: Davies Liu <[email protected]>
Date:   2015-06-23T19:12:03Z

    support arbitrary object in UnsafeRow

----


