GitHub user zsxwing opened a pull request:
https://github.com/apache/spark/pull/16085
[SPARK-18655][SS] Ignore Structured Streaming 2.0.2 logs in history server
## What changes were proposed in this pull request?
Because `queryStatus` was removed from StreamingQueryListener events in #15954,
parsing Structured Streaming logs generated by Spark 2.0.2 throws the following error:
```
[info] com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
Unrecognized field "queryStatus" (class
org.apache.spark.sql.streaming.StreamingQueryListener$QueryTerminatedEvent),
not marked as ignorable (2 known properties: "id", "exception"])
[info] at [Source:
{"Event":"org.apache.spark.sql.streaming.StreamingQueryListener$QueryTerminatedEvent","queryStatus":{"name":"query-1","id":1,"timestamp":1480491532753,"inputRate":0.0,"processingRate":0.0,"latency":null,"sourceStatuses":[{"description":"FileStreamSource[file:/Users/zsx/stream]","offsetDesc":"#0","inputRate":0.0,"processingRate":0.0,"triggerDetails":{"latency.getOffset.source":"1","triggerId":"1"}}],"sinkStatus":{"description":"FileSink[/Users/zsx/stream2]","offsetDesc":"[#0]"},"triggerDetails":{}},"exception":null};
line: 1, column: 521] (through reference chain:
org.apache.spark.sql.streaming.QueryTerminatedEvent["queryStatus"])
[info] at
com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:51)
[info] at
com.fasterxml.jackson.databind.DeserializationContext.reportUnknownProperty(DeserializationContext.java:839)
[info] at
com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:1045)
[info] at
com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1352)
[info] at
com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperties(BeanDeserializerBase.java:1306)
[info] at
com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:453)
[info] at
com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
...
```
This PR ignores such errors and adds a test to make sure 2.0.2 logs can still
be read.
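For illustration only, the idea of skipping such lines can be sketched as below. This is not the code in this patch: `TerminatedEvent` and `ReplaySketch` are hypothetical names, the event is reduced to the two known properties from the error message, the `"Event"` type field is left out of the input, and jackson-databind is assumed to be on the classpath (as it is in Spark).

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException
import scala.beans.BeanProperty

// Hypothetical stand-in for the event after `queryStatus` was removed:
// only `id` and `exception` are known properties.
class TerminatedEvent {
  @BeanProperty var id: String = _
  @BeanProperty var exception: String = _
}

object ReplaySketch {
  // A default ObjectMapper fails on unknown properties, which is what
  // produces the UnrecognizedPropertyException shown above.
  private val mapper = new ObjectMapper()

  // Parse one JSON event per line. Lines that fail only because they still
  // carry the removed `queryStatus` field are skipped; any other failure
  // is not caught here and propagates to the caller.
  def replay(lines: Seq[String]): Seq[TerminatedEvent] =
    lines.flatMap { line =>
      try {
        Some(mapper.readValue(line, classOf[TerminatedEvent]))
      } catch {
        case e: UnrecognizedPropertyException
            if e.getPropertyName == "queryStatus" =>
          None // old-format event, ignore it
      }
    }
}
```

Matching on the specific property name, rather than swallowing every parse failure, keeps genuinely corrupt lines visible.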
## How was this patch tested?
`query-event-logs-version-2.0.2.txt` contains every type of event generated by
Structured Streaming in Spark 2.0.2.
`testQuietly("ReplayListenerBus should ignore broken event jsons generated in 2.0.2")`
verifies that we can load them without any error.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zsxwing/spark SPARK-18655
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16085.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16085
----
commit 33c253f69dca337e27d2c741d3ea293f1598f53e
Author: Shixiong Zhu <[email protected]>
Date: 2016-11-30T19:06:18Z
Ignore Structured Streaming 2.0.2 logs in history server
----