Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19953#discussion_r156534448
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ReplayListenerBus.scala ---
@@ -84,16 +84,9 @@ private[spark] class ReplayListenerBus extends SparkListenerBus with Logging {
           postToAll(JsonProtocol.sparkEventFromJson(parse(currentLine)))
         } catch {
-          case e: ClassNotFoundException if KNOWN_REMOVED_CLASSES.contains(e.getMessage) =>
-            // Ignore events generated by Structured Streaming in Spark 2.0.0 and 2.0.1.
-            // It's safe since no place uses them.
-            logWarning(s"Dropped incompatible Structured Streaming log: $currentLine")
-          case e: UnrecognizedPropertyException if e.getMessage != null && e.getMessage.startsWith(
-            "Unrecognized field \"queryStatus\" " +
-              "(class org.apache.spark.sql.streaming.StreamingQueryListener$") =>
-            // Ignore events generated by Structured Streaming in Spark 2.0.2
-            // It's safe since no place uses them.
-            logWarning(s"Dropped incompatible Structured Streaming log: $currentLine")
+          case _: ClassNotFoundException | _: UnrecognizedPropertyException =>
+            // Ignore unknown events or unrecognized properties, parse through the event log file.
+            logWarning(s"Drop incompatible event log: $currentLine")
--- End diff --
Yeah, we can extend the existing whitelist mechanism and make it
configurable too. At the same time, we can make the tolerance level
configurable. I just want this behavior to be more configurable, and we
should also keep it strict during the tests.
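
A minimal sketch of what a configurable whitelist plus tolerance level might
look like. Everything here is hypothetical: `Tolerance`, `ReplayPolicy`, and
`shouldSkip` are illustrative names, not existing Spark APIs, and only
`ClassNotFoundException` is handled so the snippet stays self-contained (a
real version would presumably treat Jackson's `UnrecognizedPropertyException`
the same way).

```scala
// Hypothetical sketch only: none of these names exist in Spark.

/** How forgiving the replay loop is when an event cannot be deserialized. */
sealed trait Tolerance
object Tolerance {
  /** Never skip; every failure propagates (the mode tests would use). */
  case object Strict extends Tolerance
  /** Skip only failures for classes listed in the whitelist. */
  case object Whitelist extends Tolerance
  /** Skip any unknown event class. */
  case object Lenient extends Tolerance
}

/**
 * @param level            the configured tolerance level
 * @param ignorableClasses class names that may be dropped in Whitelist mode,
 *                         playing the role of KNOWN_REMOVED_CLASSES
 */
final case class ReplayPolicy(level: Tolerance, ignorableClasses: Set[String]) {

  /** True if the caller should log a warning and skip the event. */
  def shouldSkip(e: Throwable): Boolean = level match {
    case Tolerance.Strict => false
    case Tolerance.Lenient => e.isInstanceOf[ClassNotFoundException]
    case Tolerance.Whitelist =>
      e match {
        case c: ClassNotFoundException =>
          c.getMessage != null && ignorableClasses.contains(c.getMessage)
        case _ => false
      }
  }
}
```

With something like this, the catch block could guard a single case on
`policy.shouldSkip(e)` and rethrow otherwise, and the test suites would
construct the policy with `Tolerance.Strict`.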