[GitHub] spark pull request: Revert "[SPARK-8579] [SQL] support arbitrary o...

rxin Wed, 22 Jul 2015 01:25:35 -0700

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/7591


    Revert "[SPARK-8579] [SQL] support arbitrary object in UnsafeRow"

    Reverts ObjectPool. As it stands, it has a few problems:

    1. ObjectPool doesn't work with spilling at all.
    2. In the long run, I don't think an object pool is something we want to
       support: it essentially goes back to unmanaged memory, creates pressure
       on the GC, and makes the total in-memory size hard to account for.
    3. ObjectPool removed the specialized getters for strings and binary and,
       as a result, introduced branches when reading data types.

    If we do want to support arbitrary user-defined types in the future, I
    think we need to pick execution strategies that are optimized for them,
    rather than keeping a lot of unserialized JVM objects in memory during
    aggregation.

    This is probably the hardest thing I have had to revert in Spark, because
    recent patches also changed the same part of the code. It would be great
    to get a careful look.
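
To make points 2 and 3 concrete, here is a minimal sketch in Java. It is not Spark's UnsafeRow or ObjectPool code: the class names (GetterSketch, SerializedRow, PooledRow), the 8-byte slot layout, and the (offset << 32) | length encoding are all assumptions chosen for illustration. It contrasts a fully serialized row, whose typed getters read bytes directly and whose size is known exactly, with a row that keeps arbitrary objects in a side pool and therefore needs a branch on every generic read.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical classes, not Spark's actual implementation.
final class GetterSketch {

    // Every field lives in one byte buffer: an 8-byte slot per field,
    // followed by variable-length data.
    static final class SerializedRow {
        private final ByteBuffer buf;

        private SerializedRow(ByteBuffer buf) { this.buf = buf; }

        // Builds a two-field row: field 0 is a long, field 1 is a UTF-8 string.
        static SerializedRow of(long id, String name) {
            byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
            int dataStart = 16; // two 8-byte slots
            ByteBuffer buf = ByteBuffer.allocate(dataStart + utf8.length);
            buf.putLong(0, id);
            // Assumed encoding: high 32 bits = offset, low 32 bits = length.
            buf.putLong(8, ((long) dataStart << 32) | utf8.length);
            for (int i = 0; i < utf8.length; i++) {
                buf.put(dataStart + i, utf8[i]);
            }
            return new SerializedRow(buf);
        }

        // Specialized getter: a single read, no type check.
        long getLong(int ordinal) { return buf.getLong(ordinal * 8); }

        // Specialized getter: decodes offset and length, then copies the bytes.
        String getString(int ordinal) {
            long slot = buf.getLong(ordinal * 8);
            int offset = (int) (slot >>> 32);
            int length = (int) slot;
            byte[] bytes = new byte[length];
            for (int i = 0; i < length; i++) {
                bytes[i] = buf.get(offset + i);
            }
            return new String(bytes, StandardCharsets.UTF_8);
        }

        // The row's footprint is exactly the buffer size, so it can be
        // accounted for and written out as raw bytes.
        int sizeInBytes() { return buf.capacity(); }
    }

    // Fixed-width slots plus a side pool of live JVM objects.
    static final class PooledRow {
        private final long[] slots;
        private final boolean[] pooled;
        private final List<Object> pool = new ArrayList<>();

        PooledRow(int numFields) {
            slots = new long[numFields];
            pooled = new boolean[numFields];
        }

        void setLong(int ordinal, long value) {
            slots[ordinal] = value;
            pooled[ordinal] = false;
        }

        void setObject(int ordinal, Object value) {
            pool.add(value);
            slots[ordinal] = pool.size() - 1; // slot stores the pool index
            pooled[ordinal] = true;
        }

        // Generic getter: must branch on where the value lives. The pooled
        // objects stay on the JVM heap, so their size is not known here.
        Object get(int ordinal) {
            if (pooled[ordinal]) {
                return pool.get((int) slots[ordinal]);
            }
            return slots[ordinal]; // boxed to Long
        }
    }

    public static void main(String[] args) {
        SerializedRow row = SerializedRow.of(42L, "spark");
        System.out.println(row.getLong(0) + " " + row.getString(1)
            + " (" + row.sizeInBytes() + " bytes)");

        PooledRow pooledRow = new PooledRow(2);
        pooledRow.setLong(0, 42L);
        pooledRow.setObject(1, "spark");
        System.out.println(pooledRow.get(0) + " " + pooledRow.get(1));
    }
}
```

In this sketch the serialized row can report its exact size and be copied out byte for byte, which is the property spilling relies on, while the pooled row would first have to serialize every pooled object.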
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark revert-object-pool

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7591.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7591
    
----
commit fe37079fa3e40939a070db0fdc059a0b2076540d
Author: Reynold Xin <[email protected]>
Date:   2015-07-22T08:20:07Z

    Revert "[SPARK-8579] [SQL] support arbitrary object in UnsafeRow"
    
    This reverts commit ed359de595d5dd67b666660eddf092eaf89041c8.
    
    Conflicts:
        sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeFixedWidthAggregationMap.java
        sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
        sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverter.scala
        sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeFixedWidthAggregationMapSuite.scala
        sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
        sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala

----


