Kay Ousterhout created SPARK-6066:
-------------------------------------
Summary: Metadata in event log makes it very difficult for
external libraries to parse event log
Key: SPARK-6066
URL: https://issues.apache.org/jira/browse/SPARK-6066
Project: Spark
Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Kay Ousterhout
Assignee: Andrew Or
Priority: Blocker
The fix for SPARK-2261 added a line at the beginning of the event log that
encodes metadata. This line makes it much more difficult to parse the event
logs from external libraries (like
https://github.com/kayousterhout/trace-analysis, which is used by folks at
Berkeley) because:
(1) The metadata is not written as JSON, unlike the rest of the file.
(2) More annoyingly, if the file is compressed, the metadata is not compressed.
This has two side effects: first, you can't just uncompress the file from the
command line and then look at the logs, because the file is in a
half-compressed format; and second, external tools that parse these logs now
also need to handle that format.
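To make the extra burden on external tools concrete, here is a minimal Python
sketch of the two parsing paths. It is only an illustration under stated
assumptions: the exact metadata layout is not described in this issue, so the
sketch assumes a single plain-text line followed by a compressed stream of
one JSON event per line. gzip stands in for whichever codec Spark is
configured with, and the function names and sample header text are
hypothetical.

```python
import gzip
import json


def read_half_compressed_log(raw: bytes):
    """Parse a log with an uncompressed header line and a compressed body.

    Assumed (hypothetical) layout: one plain-text, non-JSON metadata line
    ending in a newline, then a gzip stream with one JSON event per line.
    """
    # The header must be split off by hand before any decompression.
    header, _, body = raw.partition(b"\n")
    metadata = header.decode("utf-8")
    text = gzip.decompress(body).decode("utf-8")
    events = [json.loads(line) for line in text.splitlines() if line.strip()]
    return metadata, events


def read_uniform_log(raw: bytes):
    """Parse a log in which every line is JSON and the whole file is compressed."""
    text = gzip.decompress(raw).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Illustrative sample data, not taken from a real Spark log.
sample_events = (
    b'{"Event": "SparkListenerJobStart", "Job ID": 0}\n'
    b'{"Event": "SparkListenerJobEnd", "Job ID": 0}\n'
)
half_compressed = b"version=1.3.0 codec=gzip\n" + gzip.compress(sample_events)
uniform = gzip.compress(sample_events)
```

With the uniform layout, a standard decompression tool followed by a
line-by-line JSON parser is enough. With the half-compressed layout, calling
gzip.decompress on the whole file fails because the file does not start with
a valid compressed stream, so every tool needs its own header-splitting step.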
We should fix this before the 1.3 release, because otherwise we'll have to add
a bunch more backward-compatibility code to handle this weird format!