GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/4093
[SPARK-5307] SerializationDebugger to help debug NotSerializableException
This patch adds a SerializationDebugger that attaches more information to a
NotSerializableException. When a NotSerializableException is encountered, the
debugger serializes the object one more time through a DebugStream that hooks
into the internals of ObjectOutputStream to get the serialization stack.
Because this extra pass runs only after a failure, running with
SerializationDebugger incurs no performance loss, unlike setting the
sun.io.serialization.extendedDebugInfo flag.
An example output looks like this:
```
org.apache.spark.serializer.NotSerializableClass
Serialization stack (3):
- org.apache.spark.serializer.NotSerializableClass@5e20dc10 (class
org.apache.spark.serializer.NotSerializableClass)
- org.apache.spark.serializer.SerializableClass2@521fb14e (class
org.apache.spark.serializer.SerializableClass2)
- org.apache.spark.serializer.SerializableClass1@5f54e92c (class
org.apache.spark.serializer.SerializableClass1)
Run the JVM with sun.io.serialization.extendedDebugInfo for more
information.
```
When sun.io.serialization.extendedDebugInfo is turned on, this debugger adds
no further information. Note that sun.io.serialization.extendedDebugInfo can
also show field names, which are harder to obtain in the SerializationDebugger
(technically possible with reflection, but fairly convoluted).
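To illustrate the general idea of re-serializing through an instrumented
stream, here is a minimal Java sketch. It is not the code in this patch: the
names TrailStream, visitedBeforeFailure, Outer, Middle and Inner are
hypothetical, and it relies only on the public replaceObject hook of
ObjectOutputStream rather than on the stream's internals. As a result it
records the order in which objects were visited, outermost first, which matches
the real serialization stack only for a simple chain of references like the
one below.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SerializationTrailSketch {

    // Hypothetical stand-in for a debug stream: remembers each object visited.
    static final class TrailStream extends ObjectOutputStream {
        final List<Object> visited = new ArrayList<>();
        private boolean failed = false;

        TrailStream(OutputStream out) throws IOException {
            super(out);
            enableReplaceObject(true); // makes the stream call replaceObject
        }

        @Override
        protected Object replaceObject(Object obj) {
            if (obj instanceof NotSerializableException) {
                // After a failure the stream writes the exception itself;
                // stop recording so the trail is not polluted by it.
                failed = true;
            } else if (!failed) {
                visited.add(obj);
            }
            return obj; // never actually replace anything
        }
    }

    // Example object graph: Outer -> Middle -> Inner (Inner is not Serializable).
    static final class Inner { }
    static final class Middle implements Serializable {
        final Inner inner = new Inner();
    }
    static final class Outer implements Serializable {
        final Middle middle = new Middle();
    }

    // Returns the objects visited up to and including the offending one,
    // or an empty list if the root serializes cleanly.
    static List<Object> visitedBeforeFailure(Object root) {
        try {
            TrailStream out = new TrailStream(new ByteArrayOutputStream());
            try {
                out.writeObject(root);
                return new ArrayList<>();
            } catch (NotSerializableException e) {
                return out.visited;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Object o : visitedBeforeFailure(new Outer())) {
            System.out.println("- " + o.getClass().getName());
        }
    }
}
```

The last element of the returned list is the object that could not be
serialized. A caller would invoke visitedBeforeFailure only after catching a
NotSerializableException from the normal serialization path, so the
instrumented stream is never used when serialization succeeds.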
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark serialization
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4093.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4093
----
commit f7e6320e23c12fb24a2d9b26c772d0f3231f0ec7
Author: Reynold Xin <[email protected]>
Date: 2015-01-18T07:14:56Z
[SPARK-5307] SerializationDebugger to help debug NotSerializableException.
This patch adds a SerializationDebugger that is used to add more information to
a NotSerializableException. When a NotSerializableException is encountered, the
debugger tries to serialize the object one more time through a DebugStream that
hooks into the internals of ObjectOutputStream to get the serialization stack.
An example output looks like this:
org.apache.spark.serializer.NotSerializableClass
Serialization stack (3):
- org.apache.spark.serializer.NotSerializableClass@5e20dc10 (class
org.apache.spark.serializer.NotSerializableClass)
- org.apache.spark.serializer.SerializableClass2@521fb14e (class
org.apache.spark.serializer.SerializableClass2)
- org.apache.spark.serializer.SerializableClass1@5f54e92c (class
org.apache.spark.serializer.SerializableClass1)
Run the JVM with sun.io.serialization.extendedDebugInfo for more
information.
----