GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/14675

    [SPARK-17096][SQL][STREAMING] Improve exception string reported through the 
StreamingQueryListener

    ## What changes were proposed in this pull request?
    
    Currently, the stack trace (as `Array[StackTraceElements]`) reported through 
StreamingQueryListener.onQueryTerminated is useless: it is the stack trace of 
where the StreamingQueryException was created, not the stack trace of the 
underlying exception. 
    
    Here is the right way to reason about what should be posted through 
StreamingQueryListener.onQueryTerminated:
    - The actual exception could either be a SparkException, or an arbitrary 
exception.
      - SparkException reports the relevant executor stack trace of a failed 
task as a string in the exception message. The `Array[StackTraceElements]` 
returned by `SparkException.stackTrace()` is mostly not relevant.
      - For an arbitrary exception, the `Array[StackTraceElements]` returned by 
`exception.stackTrace()` may be relevant.
    - It's hard to reason about when the `Array[StackTraceElements]` is 
relevant. In fact, it is not clear whether it is even useful to report the 
stack trace as this array of Java objects. It may be sufficient to report the 
stack trace as a string, along with the message. This is how Spark reported 
executor stra
    - Hence, this PR simplifies the API by removing the array `stackTrace` from 
`QueryTerminated`. Instead, it returns a string containing both the exception 
message and the stack trace. Anyone interested in the actual stack trace as an 
array can always access the exception object through 
`streamingQuery.exception`.
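    As an illustration of the "message plus stack trace as one string" idea 
above, here is a minimal sketch using only JDK classes. It is not the code in 
this PR; the class name `ExceptionFormatting` and the method 
`messageAndStackTrace` are hypothetical names chosen for the example.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper for illustration only; not part of Spark.
public class ExceptionFormatting {

    // Renders a throwable the way printStackTrace() would: the first line is
    // the exception class and message, followed by one "\tat ..." line per
    // frame and a "Caused by: ..." section for each nested cause.
    public static String messageAndStackTrace(Throwable t) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            t.printStackTrace(out);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Throwable failure = new RuntimeException(
            "query failed", new IllegalStateException("source went away"));
        // A listener could receive this single string instead of an array
        // of stack trace elements.
        System.out.println(messageAndStackTrace(failure));
    }
}
```

A single string like this keeps the message and the trace together, while 
callers that need the structured frames can still go back to the exception 
object itself.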
    
    Note that this change in the public `QueryTerminated` class is okay as the 
APIs are still experimental.
    
    ## How was this patch tested?
    Unit tests that check whether the right information is present in the 
exception message reported through the QueryTerminated object.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-17096

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14675.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14675
    
----
commit dafdbb6c06dd2be49ede3fd6a4c745e0001bf272
Author: Tathagata Das <tathagata.das1...@gmail.com>
Date:   2016-08-16T23:40:28Z

    Removed stacktrace from QueryTerminated

commit 60eabcc4716f26e0c6d688d14b518f7321313f88
Author: Tathagata Das <tathagata.das1...@gmail.com>
Date:   2016-08-17T00:05:30Z

    Convert both exception message and stack trace to string

----

