GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/9030
[SPARK-10914] UnsafeRow serialization breaks when two machines have
different Oops size.
An UnsafeRow holds three pieces of information to locate its data in memory:
a base object, a base offset, and a length. The base offset depends on the
object layout of the local JVM. When the row is serialized with Java or Kryo
serialization and read back on a machine with a different pointer width (oops
in the JVM), the object layout differs, so the serialized offset no longer
points at the row data.
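
One way to make such a row portable is to serialize only its payload bytes and
length, and to rebuild the base object and offset on the receiving JVM. The
sketch below illustrates that idea and is not the code in this patch:
`PortableRow`, its fields, and `roundTrip` are hypothetical names, and it uses
a plain array index where Spark would presumably use the receiver's own
`Platform.BYTE_ARRAY_OFFSET`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Hypothetical stand-in for a row that points into a backing byte array.
final class PortableRow implements Externalizable {
    private byte[] base;     // backing object
    private int offset;      // index of the first row byte within `base`
    private int sizeInBytes; // number of bytes that belong to the row

    // Externalizable requires a public no-arg constructor.
    public PortableRow() {}

    PortableRow(byte[] base, int offset, int sizeInBytes) {
        this.base = base;
        this.offset = offset;
        this.sizeInBytes = sizeInBytes;
    }

    // Copy of the row payload, independent of where it sits in `base`.
    byte[] getBytes() {
        return Arrays.copyOfRange(base, offset, offset + sizeInBytes);
    }

    int getOffset() {
        return offset;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Write the length and the payload only; the sender's offset is
        // layout-specific and is deliberately not written.
        out.writeInt(sizeInBytes);
        out.write(base, offset, sizeInBytes);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Rebuild the pointer from local values on the receiving side.
        sizeInBytes = in.readInt();
        base = new byte[sizeInBytes];
        in.readFully(base);
        offset = 0;
    }

    // Serializes and deserializes a row in memory, for demonstration.
    static PortableRow roundTrip(PortableRow row) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(row);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return (PortableRow) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `PortableRow.roundTrip` on a row that starts at index 2 of its backing
array returns a row with the same payload starting at index 0 of a fresh
array, so nothing in the result depends on the sender's memory layout.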
To reproduce, launch Spark with compressed oops disabled on the executors:
MASTER=local-cluster[2,1,1024] bin/spark-shell --conf \
  "spark.executor.extraJavaOptions=-XX:-UseCompressedOops"
Then run the following:
scala> sql("select 1 xx").collect()
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark SPARK-10914
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9030.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9030
----
commit 465fc8e18147b9e8cf34e0f5bcbc338d03ad4f95
Author: Reynold Xin <[email protected]>
Date: 2015-10-08T18:34:14Z
[SPARK-10914] UnsafeRow serialization breaks when two machines have
different Oops size.
The problem is that an UnsafeRow holds three pieces of information to locate
its data in memory: a base object, a base offset, and a length. When the row
is serialized with Java or Kryo serialization, the object layout in memory can
differ if the two machines have different pointer widths (oops in the JVM).
To reproduce, launch Spark with compressed oops disabled on the executors:
MASTER=local-cluster[2,1,1024] bin/spark-shell --conf \
  "spark.executor.extraJavaOptions=-XX:-UseCompressedOops"
Then run the following:
scala> sql("select 1 xx").collect()
(cherry picked from commit 157b2a818d3993b1321cc41fb7b30407bd13490b)
Signed-off-by: Reynold Xin <[email protected]>
----