James Xu created STORM-56:
-----------------------------
Summary: Provide support for handling "bad data"
Key: STORM-56
URL: https://issues.apache.org/jira/browse/STORM-56
Project: Apache Storm (Incubating)
Issue Type: New Feature
Reporter: James Xu
https://github.com/nathanmarz/storm/issues/13
Examples:
1. A Scheme can't deserialize the tuple.
2. An object that serializes but can't be deserialized. From Sam Stokes: "I've
seen JSON libraries that incorrectly serialised strings containing multi-byte
characters, and then unsurprisingly weren't able to parse the resulting byte
soup."
This could be as simple as providing an exception type for deserialization
problems (InvalidTupleException) and a Storm config for skipping bad data.
Perhaps there could also be an implicit stream to which those bad tuples are
sent as binary data. With the implicit stream, applications could do something
with the bad data, such as recording it somewhere.
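A minimal sketch of how the pieces above might fit together, in plain Java with
no Storm dependency. Everything here is an assumption made for illustration:
InvalidTupleException is only the name suggested in this issue, and
SimpleScheme, BadDataDemo, skipBadData and badDataStream are hypothetical names,
not Storm APIs or config keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical exception type suggested in this issue; not an existing Storm class.
class InvalidTupleException extends RuntimeException {
    InvalidTupleException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical stand-in for a deserialization scheme (raw bytes -> tuple values).
interface SimpleScheme {
    List<Object> deserialize(byte[] raw);
}

class BadDataDemo {
    // Hypothetical config flag: when true, bad records are skipped instead of failing.
    final boolean skipBadData;
    // Tuples that deserialized cleanly.
    final List<List<Object>> emitted = new ArrayList<>();
    // Stand-in for the proposed implicit stream: bad records kept as raw bytes.
    final List<byte[]> badDataStream = new ArrayList<>();

    BadDataDemo(boolean skipBadData) {
        this.skipBadData = skipBadData;
    }

    void process(SimpleScheme scheme, byte[] raw) {
        try {
            emitted.add(scheme.deserialize(raw));
        } catch (InvalidTupleException e) {
            if (!skipBadData) {
                throw e; // strict mode: surface the failure to the caller
            }
            badDataStream.add(raw); // lenient mode: divert the raw bytes
        }
    }

    // Example scheme: expects a UTF-8 encoded integer.
    static List<Object> parseInt(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        try {
            return Collections.<Object>singletonList(Integer.valueOf(text.trim()));
        } catch (NumberFormatException e) {
            throw new InvalidTupleException("not an integer: " + text, e);
        }
    }

    public static void main(String[] args) {
        BadDataDemo demo = new BadDataDemo(true);
        demo.process(BadDataDemo::parseInt, "42".getBytes(StandardCharsets.UTF_8));
        demo.process(BadDataDemo::parseInt, "4x2".getBytes(StandardCharsets.UTF_8));
        // prints "emitted=[[42]] bad=1"
        System.out.println("emitted=" + demo.emitted + " bad=" + demo.badDataStream.size());
    }
}
```

With the flag off, the same bad record would propagate as an
InvalidTupleException instead of being diverted, which is roughly the choice
the proposed config would expose.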
-------------------
malur: This would be very useful. Does it make sense to have an error handler
bolt at different levels, such as spout and topology?
------------------
nathanmarz: Yes, it does. There's already a planned feature called "failure
streams" for spouts: an implicit stream to which all failed spout tuples are
sent. Bad data could be sent to another kind of failure stream.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)