GitHub user ilganeli reopened a pull request:
https://github.com/apache/spark/pull/3518
[SPARK-3694] RDD and Task serialization debugging output
Hi all - in addition to what was explicitly requested in the original JIRA,
I also added the ability to have a trace of the serialization for RDDs so that
you can see which specific dependency is unserializable. For debugging task
serialization, I added a debug log output that shows the file and jar
dependencies. However, I am unsure whether I can add more functionality there.
For the RDD, it is possible to attempt to serialize each dependency in turn,
which is why I can identify which component fails. For task debugging, I did
not see a straightforward way to do the same thing. If anyone can suggest an
approach here, I would be happy to implement it.
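
For readers who want a concrete picture of the "serialize each dependency in
turn" idea described above, here is a minimal sketch in plain Java. It is not
the code from this pull request and does not use Spark: the class
SerializationTrace, the Node type, and its payload field are hypothetical
stand-ins for an RDD and whatever state it carries. The sketch assumes each
node's state can be tested in isolation from its parents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SerializationTrace {

    /** Hypothetical stand-in for an RDD: a name, some state, and parent dependencies. */
    static final class Node {
        final String name;
        final Object payload;      // assumed: the state that must be serialized for this node
        final List<Node> parents;  // assumed: the dependencies of this node

        Node(String name, Object payload, Node... parents) {
            this.name = name;
            this.payload = payload;
            this.parents = Arrays.asList(parents);
        }
    }

    /** Returns true if Java serialization accepts the object, false otherwise. */
    static boolean isSerializable(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    /** Walks the dependency graph depth-first and reports each node's status. */
    static List<String> trace(Node root) {
        List<String> lines = new ArrayList<>();
        walk(root, 0, lines);
        return lines;
    }

    // Note: a node shared by several children is reported once per path to it.
    private static void walk(Node node, int depth, List<String> lines) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(node.name).append(": ");
        line.append(isSerializable(node.payload) ? "serializable" : "NOT serializable");
        lines.add(line.toString());
        for (Node parent : node.parents) {
            walk(parent, depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        Node source = new Node("source", "input.txt");
        // java.lang.Object does not implement Serializable, so this node fails.
        Node mapped = new Node("mapped", new Object(), source);
        Node filtered = new Node("filtered", Integer.valueOf(1), mapped);
        for (String line : trace(filtered)) {
            System.out.println(line);
        }
    }
}
```

Because each node is tested on its own, the trace points at the failing
component rather than only reporting that the whole graph failed. Whether a
similar per-component test is possible for a task is the open question raised
above.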
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ilganeli/spark SPARK-3694B
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3518.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3518
----
commit 6c997629e4d3bf9bccfbe9c3fa65aa1afa4bfca0
Author: Ilya Ganelin <[email protected]>
Date: 2014-10-30T15:02:04Z
Created class to traverse dependency graph of RDD
commit 47ccc227e5bdf14a1db20edfcf1b8f9c77b3b64a
Author: Ilya Ganelin <[email protected]>
Date: 2014-10-30T22:06:04Z
Started walker code
commit a8d5332a71fbad4cca0aa1a7ca73db8e1386e15f
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-06T18:40:38Z
RDD WAlker updates
commit a63652f8240e0c370100ab05a11c95beaf47faa5
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-06T18:42:48Z
Added debug output to task serialization. Added debug output to RDD
serialization.
commit 05f2cc0665af3ca297936c8c4c5f6128be5a1ddc
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-06T18:51:50Z
Rebase
commit cbb1d771f4576c6ba981252cd8b7490722317ddf
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-14T19:03:25Z
Style errors
commit 183100019a0866e515edd0164db9c4c7fdf3ee5f
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-29T16:21:43Z
Merge remote-tracking branch 'upstream/master'
commit 916a31c57d89bc6fb83b33fdf70dfc1b94192cc5
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-29T23:52:00Z
Manual merge of updates
commit bfb723de65e60aabb9cccc3b45ccc4638f12583d
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-29T23:55:40Z
Added helper files
commit e0a81537d5962f8bc79b8b9193a30b46827246ed
Author: Ilya Ganelin <[email protected]>
Date: 2014-11-30T00:45:52Z
Fixed whitespace errors
----