GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/14675
[SPARK-17096][SQL][STREAMING] Improve exception string reported through the
StreamingQueryListener
## What changes were proposed in this pull request?
Currently, the stackTrace (as `Array[StackTraceElement]`) reported through
StreamingQueryListener.onQueryTerminated is useless, as it holds the stack trace
of where the StreamingQueryException was created, not the stack trace of the
underlying exception.
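The problem can be illustrated with a minimal sketch. `WrapperException` and `WrapperTraceDemo` are hypothetical names standing in for a wrapper such as StreamingQueryException; this is not Spark code, only a demonstration of how a wrapping exception records the place where it was constructed rather than the place where the underlying failure occurred.

```scala
// Hypothetical stand-in for a wrapper such as StreamingQueryException.
class WrapperException(message: String, cause: Throwable)
  extends Exception(message, cause)

object WrapperTraceDemo {
  // The underlying failure is thrown from this method.
  def failingWork(): Nothing =
    throw new IllegalStateException("underlying failure")

  // The wrapper is constructed in this method, so the wrapper's own
  // stack trace starts here and not in failingWork().
  def wrap(): WrapperException =
    try failingWork()
    catch {
      case e: IllegalStateException => new WrapperException("query terminated", e)
    }

  // Name of the method at the top of a throwable's own stack trace.
  def topMethod(t: Throwable): String = t.getStackTrace.head.getMethodName
}
```

With this sketch, the top frame of the wrapper names `wrap`, while the top frame of `getCause` names `failingWork`, which is the frame a user debugging the failure would actually want to see.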
Here is the right way to reason about what should be posted through
StreamingQueryListener.onQueryTerminated:
- The actual exception could either be a SparkException, or an arbitrary
exception.
- SparkException reports the relevant executor stack trace of a failed
task as a string in the exception message. The `Array[StackTraceElement]`
returned by `SparkException.getStackTrace()` is mostly not relevant.
- For an arbitrary exception, the `Array[StackTraceElement]` returned by
`exception.getStackTrace()` may be relevant.
- It's hard to reason about when the `Array[StackTraceElement]` is relevant. In
fact, it is not clear whether it is even useful to report the stack trace as
this array of Java objects. It may be sufficient to report the stack trace as
a string, along with the message. This is how Spark reported executor stra
- Hence, this PR simplifies the API by removing the array `stackTrace` from
`QueryTerminated`. Instead, it returns a string containing both the exception
message and the stack trace. Anyone interested in the actual stack trace
as an array can always access the exception object through
`streamingQuery.exception`.
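As a rough illustration of reporting the message and stack trace as one string, here is a minimal sketch that uses only the standard `Throwable.printStackTrace(PrintWriter)` call. `ExceptionText.render` is a hypothetical helper name, and this is not necessarily how the PR builds the string.

```scala
import java.io.{PrintWriter, StringWriter}

object ExceptionText {
  // Renders the exception's class and message, its stack trace, and any
  // chained causes into a single string.
  def render(t: Throwable): String = {
    val buffer = new StringWriter()
    val writer = new PrintWriter(buffer)
    t.printStackTrace(writer)
    writer.flush()
    buffer.toString
  }
}
```

A listener that receives such a string can log it directly, without having to decide whether the array of stack trace elements belongs to the wrapper or to the underlying exception.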
Note that this change in the public `QueryTerminated` class is okay as the
APIs are still experimental.
## How was this patch tested?
Unit tests that check whether the right information is present in the
exception message reported through the QueryTerminated object.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark SPARK-17096
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14675.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14675
----
commit dafdbb6c06dd2be49ede3fd6a4c745e0001bf272
Author: Tathagata Das <[email protected]>
Date: 2016-08-16T23:40:28Z
Removed stacktrace from QueryTerminated
commit 60eabcc4716f26e0c6d688d14b518f7321313f88
Author: Tathagata Das <[email protected]>
Date: 2016-08-17T00:05:30Z
Convert both exception message and stack trace to string
----