GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/7591
Revert "[SPARK-8579] [SQL] support arbitrary object in UnsafeRow"
Reverts ObjectPool. As it stands, it has a few problems:
1. ObjectPool doesn't work with spilling at all.
2. In the long run, I don't think an object pool is what we want to support:
it essentially goes back to unmanaged memory, creates pressure on GC, and
makes it hard to account for the total in-memory size.
3. ObjectPool removed the specialized getters for strings and binary, and
as a result introduced branches when reading those data types.
If we do want to support arbitrary user-defined types in the future, I
think we need to pick execution strategies that are optimized for those, rather
than keeping a lot of unserialized JVM objects in memory during aggregation.
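Point 3 above can be illustrated with a small sketch. This is not the UnsafeRow code: the class GetterSketch, its slot encoding (offset and length packed into one long, negative values used as pool indices), and the demo values are all hypothetical, chosen only to show why a getter that may hit an object pool needs a branch that a specialized string getter does not.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; the layout below is an assumption, not Spark's UnsafeRow.
final class GetterSketch {
    private final long[] slots;      // one fixed-width 8-byte slot per field
    private final byte[] varData;    // variable-length region holding string bytes
    private final List<Object> pool; // side pool of ordinary JVM objects

    GetterSketch(long[] slots, byte[] varData, List<Object> pool) {
        this.slots = slots;
        this.varData = varData;
        this.pool = pool;
    }

    // Specialized getter: the caller knows the field is a string, so the slot
    // is always decoded as (offset << 32) | length. No branch is needed.
    String getString(int i) {
        long slot = slots[i];
        int offset = (int) (slot >>> 32);
        int length = (int) slot;
        return new String(varData, offset, length, StandardCharsets.UTF_8);
    }

    // Generic getter: the slot may instead hold a pool reference, encoded here
    // as -(index + 1), so every read has to test which case it is in.
    Object get(int i) {
        long slot = slots[i];
        if (slot < 0) {
            return pool.get((int) (-slot - 1));
        }
        return getString(i);
    }

    // Builds a two-field row: field 0 is an inline string, field 1 is pooled.
    static GetterSketch demo() {
        byte[] bytes = "spark".getBytes(StandardCharsets.UTF_8);
        List<Object> pool = new ArrayList<>();
        pool.add(new BigDecimal("1.5"));
        long[] slots = new long[2];
        slots[0] = (0L << 32) | bytes.length; // offset 0, length of "spark"
        slots[1] = -1L;                       // pool index 0
        return new GetterSketch(slots, bytes, pool);
    }

    public static void main(String[] args) {
        GetterSketch row = demo();
        System.out.println(row.getString(0)); // branch-free read
        System.out.println(row.get(0));       // same value, via the branching path
        System.out.println(row.get(1));       // value fetched from the pool
    }
}
```

The same sketch hints at point 2: the bytes in varData can be counted exactly, while the heap footprint of whatever sits in pool cannot be read off the row itself.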
This is probably the hardest revert I have had to do in Spark, because recent
patches also changed the same part of the code. It would be great to get a
careful review.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark revert-object-pool
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7591.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7591
----
commit fe37079fa3e40939a070db0fdc059a0b2076540d
Author: Reynold Xin
Date: 2015-07-22T08:20:07Z
Revert "[SPARK-8579] [SQL] support arbitrary object in UnsafeRow"
This reverts commit ed359de595d5dd67b666660eddf092eaf89041c8.
Conflicts:
    sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeFixedWidthAggregationMap.java
    sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
    sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverter.scala
    sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeFixedWidthAggregationMapSuite.scala
    sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
    sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala
----