Max Seiden created SPARK-5277:
---------------------------------
Summary: SparkSqlSerializer does not register user specified
KryoRegistrators
Key: SPARK-5277
URL: https://issues.apache.org/jira/browse/SPARK-5277
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.2.0
Reporter: Max Seiden
Although the SparkSqlSerializer class extends the KryoSerializer in core, its
overridden newKryo() does not call super.newKryo(). As a result, user-specified
KryoRegistrators are applied only when a KryoSerializer instance is used, not
when a SparkSqlSerializer instance is used, so serializer behavior is
inconsistent between the two. This may also be related to the TODO in
KryoResourcePool, which uses KryoSerializer instead of SparkSqlSerializer due
to yet-to-be-investigated test failures.
An example of the divergence in behavior: the Exchange operator creates a new
SparkSqlSerializer instance (with an empty conf, which is a separate issue)
when it is constructed, whereas the GENERIC ColumnType pulls a KryoSerializer
out of the resource pool (see above). The result is that the serialized
in-memory columns are created using the user-provided serializers /
registrators, while serialization during exchange does not use them.
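
The override problem described above can be illustrated with a small,
dependency-free model. This is only a sketch in Python, not Spark code: the
class names (FakeKryo, BaseSerializer, SqlSerializerBuggy, SqlSerializerFixed),
the "registrators" conf key, and the registered class names are all
hypothetical stand-ins. Calling the parent method in the "fixed" variant is one
possible direction, not necessarily the change that should be made in Spark.

```python
class FakeKryo:
    """Hypothetical stand-in for a Kryo instance; records registered names."""

    def __init__(self):
        self.registered = set()

    def register(self, name):
        self.registered.add(name)


class BaseSerializer:
    """Models KryoSerializer: applies user-specified registrators from conf."""

    def __init__(self, conf):
        self.conf = conf

    def new_kryo(self):
        kryo = FakeKryo()
        # Apply every user-supplied registrator found in the conf.
        for registrator in self.conf.get("registrators", []):
            registrator(kryo)
        return kryo


class SqlSerializerBuggy(BaseSerializer):
    """Models the reported behavior: the override never calls the parent."""

    def new_kryo(self):
        kryo = FakeKryo()  # built from scratch, user registrators skipped
        kryo.register("sql.internal.Row")  # hypothetical SQL-internal class
        return kryo


class SqlSerializerFixed(BaseSerializer):
    """One possible remedy (an assumption): start from the parent's instance."""

    def new_kryo(self):
        kryo = super().new_kryo()  # user registrators are applied here
        kryo.register("sql.internal.Row")
        return kryo


def user_registrator(kryo):
    # Hypothetical user class that should be known to every serializer.
    kryo.register("com.example.MyRecord")


conf = {"registrators": [user_registrator]}

base = BaseSerializer(conf).new_kryo().registered
buggy = SqlSerializerBuggy(conf).new_kryo().registered
fixed = SqlSerializerFixed(conf).new_kryo().registered

# Models the Exchange case: even a serializer that calls the parent sees no
# user registrators if it is constructed with an empty conf.
fixed_empty_conf = SqlSerializerFixed({}).new_kryo().registered
```

In this model, base and fixed both contain the user class while buggy does
not, which mirrors the divergence between the in-memory column path and the
exchange path. The fixed_empty_conf case shows why the empty conf passed by
Exchange would need to be addressed separately.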