GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/12260

    [SPARK-14491][SQL] refactor object operator framework to make it easy to 
eliminate serializations

    ## What changes were proposed in this pull request?
    
    This PR separates the serialization and deserialization logic from the 
object operators, so that the optimizer can more easily eliminate unnecessary 
serializations.
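
To illustrate why dedicated serializer and deserializer operators help, here is a minimal toy sketch of the idea. It is not Spark's actual code: `Plan`, `Relation`, `Serialize`, `Deserialize`, `MapObjects`, and `EliminateSerializationSketch` are hypothetical names invented for this example, and encoders are modelled as plain type-name strings. The assumption is that once serialization is its own plan node, a deserializer sitting directly on top of a serializer of the same type can be recognised and dropped with a simple pattern match.

```scala
// Toy model only: these names are illustrative and are not Spark's classes.
sealed trait Plan
case class Relation(name: String) extends Plan                    // source of rows
case class Serialize(encoder: String, child: Plan) extends Plan   // object -> row
case class Deserialize(encoder: String, child: Plan) extends Plan // row -> object
case class MapObjects(fn: String, child: Plan) extends Plan       // object -> object

object EliminateSerializationSketch {
  // Rewrite the plan bottom-up. A Deserialize placed directly on a Serialize
  // with the same encoder is a round trip, so both nodes are removed and the
  // object is handed to the parent operator unchanged.
  def optimize(plan: Plan): Plan = plan match {
    case Deserialize(enc, child) =>
      optimize(child) match {
        case Serialize(`enc`, grandChild) => grandChild
        case other                        => Deserialize(enc, other)
      }
    case Serialize(enc, child) => Serialize(enc, optimize(child))
    case MapObjects(fn, child) => MapObjects(fn, optimize(child))
    case r: Relation           => r
  }
}
```

In this model, two chained typed maps over the same type would start as `Serialize(T, MapObjects(g, Deserialize(T, Serialize(T, MapObjects(f, Deserialize(T, source))))))`, and `optimize` removes the inner serialize/deserialize pair. A pair whose encoders differ is left untouched.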
    
    Typed-aggregate-related operators are special: they deserialize the 
input row into multiple objects, which is difficult to abstract with a single 
deserializer operator, so the deserialization logic stays mixed into them.
    
    ## How was this patch tested?
    
    Existing tests, plus new test coverage in `EliminateSerializationSuite`.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark encoder

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12260.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12260
    
----
commit 7ecebdbf4fb46d8f46b4b53d497ca88cc2fbc227
Author: Wenchen Fan <[email protected]>
Date:   2016-04-08T08:18:36Z

    use serializer and deserializer operators to simplify the object operator 
framework

commit f072fdc6a031734eea35bb33fd416b3f1252e688
Author: Wenchen Fan <[email protected]>
Date:   2016-04-08T09:42:05Z

    refactor

commit 3b134ca55e098b978ebf5b5b41ce87ed4ad91a86
Author: Wenchen Fan <[email protected]>
Date:   2016-04-08T13:30:52Z

    add more tests

commit 8ec01b77723c3c727f9e3f8933bb5f1b0847ecd2
Author: Wenchen Fan <[email protected]>
Date:   2016-04-08T14:21:29Z

    optimize AppendColumns

----

