GitHub user yzhou2001 opened a pull request:

    https://github.com/apache/spark/pull/12598

    [SPARK-14521][SQL]StackOverflowError in Kryo when executing TPC-DS

    ## What changes were proposed in this pull request?
    
    A StackOverflowError was observed in Kryo when executing TPC-DS Query 27. The 
Spark thrift server disables Kryo reference tracking (if not specified in conf). 
When "spark.kryo.referenceTracking" is set to true explicitly in 
spark-defaults.conf, the query executes successfully. The root cause is that the 
TaskMemoryManager fields inside MemoryConsumer and LongToUnsafeRowMap were not 
transient and thus were serialized and broadcast from within 
LongHashedRelation, which could cause a circular reference inside Kryo. With 
reference tracking disabled, Kryo cannot detect such a cycle and recurses until 
the stack overflows. But the TaskMemoryManager is per task and should not be 
passed around in the first place. This fix makes it transient. 
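
To illustrate the effect of marking a per-task field transient, here is a minimal sketch using plain Java serialization rather than Kryo. The class names `TaskScopedManager` and `BroadcastMap` are hypothetical stand-ins, not Spark's actual classes, and the sketch assumes that Kryo's default field serializer skips transient fields in the same way Java serialization does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class TransientDemo {
    // Hypothetical stand-in for a per-task resource such as TaskMemoryManager.
    // It is deliberately NOT Serializable: it should never leave its task.
    static class TaskScopedManager {
        final long taskId;

        TaskScopedManager(long taskId) {
            this.taskId = taskId;
        }
    }

    // Hypothetical stand-in for a structure that gets broadcast,
    // such as LongToUnsafeRowMap.
    static class BroadcastMap implements Serializable {
        private static final long serialVersionUID = 1L;

        final long[] keys;

        // transient: skipped when the object is written, so the per-task
        // manager (and anything reachable from it) is never serialized.
        transient TaskScopedManager manager;

        BroadcastMap(long[] keys, TaskScopedManager manager) {
            this.keys = keys;
            this.manager = manager;
        }
    }

    // Serializes the map to a byte array and reads it back.
    static BroadcastMap roundTrip(BroadcastMap in)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (BroadcastMap) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        BroadcastMap original =
            new BroadcastMap(new long[] {1L, 2L, 3L}, new TaskScopedManager(42L));
        BroadcastMap copy = roundTrip(original);
        System.out.println("keys after round trip: " + Arrays.toString(copy.keys));
        System.out.println("manager after round trip: " + copy.manager);
    }
}
```

After the round trip the keys are intact and the `manager` field is null, so the receiving side would have to attach its own task-local manager before using the structure. Without the `transient` modifier, this sketch would fail with a NotSerializableException, because the manager type is not serializable.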
    
    ## How was this patch tested?
    hive/test (all 9 tests of HiveSparkSubmitSuite fail on my personal box due 
to an environment issue; they also failed before this patch), sql/test, 
catalyst/test, lint-scala, 
org.apache.spark.sql.hive.execution.HiveCompatibilitySuite, and a
    manual test of TPC-DS Query 27 with 1GB data but without the "limit 100", 
which would cause an NPE due to SPARK-14752.
    
    CC: @rajeshbalamohan 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yzhou2001/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12598.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12598
    
----
commit 6ba1a74aa4a35fa6d2d855d889afd4a3b688a27c
Author: yzhou2001 <[email protected]>
Date:   2016-04-22T01:53:21Z

    fix for SPARK-14521

----

