GitHub user willb commented on the pull request:
https://github.com/apache/spark/pull/143#issuecomment-37686459
Yes, my understanding of SPARK-897 is that the goal is to report
serializability errors to the user as early as possible. These commits
essentially replicate the closure-serializability check (which, as you note,
currently happens in the scheduler as part of job submission) in
`ClosureCleaner.clean`, which is called in the driver for every closure
argument to an RDD transformation method. (The test cases I added in f2ef54e
check that unserializable-closure failures happen immediately when the
transformation is invoked, not only once an action runs.)
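
To illustrate the idea outside of Spark, here is a minimal Java sketch of an eager serializability check. It is not the code from these commits: `LazyDataset`, `SerFn`, and `ensureSerializable` are hypothetical names, and `IllegalArgumentException` stands in for whatever exception the real check raises. The only real APIs used are the standard `java.io` serialization classes. The point is that a lazy `map` can still trial-serialize its closure at the moment it is called, so a closure that captures an unserializable object is rejected before any action runs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class EagerClosureCheck {

    // Hypothetical function type: a lambda targeting this interface is
    // serializable only if everything it captures is serializable too.
    interface SerFn<A, B> extends Function<A, B>, Serializable {}

    // Hypothetical stand-in for an RDD. map() does no work on data (it is
    // lazy), but it validates the closure eagerly.
    static final class LazyDataset<T> {
        <U> LazyDataset<U> map(SerFn<T, U> f) {
            ensureSerializable(f); // fail here, not when a job is submitted
            return new LazyDataset<>();
        }
    }

    // Trial-serialize the closure into a throwaway buffer. Any failure
    // (for example NotSerializableException) is rethrown immediately.
    static void ensureSerializable(Object closure) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(closure);
        } catch (IOException e) {
            throw new IllegalArgumentException("closure not serializable", e);
        }
    }

    public static void main(String[] args) {
        LazyDataset<Integer> data = new LazyDataset<>();

        int offset = 10;            // captured value is serializable
        data.map(x -> x + offset);  // accepted

        Object handle = new Object(); // java.lang.Object is not Serializable
        try {
            data.map(x -> x + handle.hashCode());
        } catch (IllegalArgumentException e) {
            // Raised by map() itself; no action was ever invoked.
            System.out.println("rejected at map(): " + e.getCause());
        }
    }
}
```

The trade-off in a design like this is that every transformation pays for one extra serialization of its closure in the driver, in exchange for the error surfacing at the call site that introduced the bad closure.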