James Xu created STORM-56:
-----------------------------

             Summary: Provide support for handling "bad data"
                 Key: STORM-56
                 URL: https://issues.apache.org/jira/browse/STORM-56
             Project: Apache Storm (Incubating)
          Issue Type: New Feature
            Reporter: James Xu


https://github.com/nathanmarz/storm/issues/13

Examples:

1. Scheme can't deserialize the tuple
2. An object that serializes but can't be deserialized. From Sam Stokes: "I've 
seen JSON libraries that incorrectly serialised strings containing multi-byte 
characters, and then unsurprisingly weren't able to parse the resulting byte 
soup."

This could be as simple as providing an exception type for deserialization 
problems (InvalidTupleException) and a Storm config for skipping bad data. 
Perhaps there can also be an implicit stream where those bad tuples are sent as 
binary data. With the implicit stream, applications can do something with the 
bad data like record it somewhere.
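
A minimal sketch of how the pieces proposed above could fit together, written in
plain Java without any Storm dependency. Everything here is an assumption made
for illustration: the Scheme interface is a simplified stand-in for Storm's
scheme, InvalidTupleException is the exception name suggested above, the
skipBadData flag stands in for the proposed Storm config, and the
Consumer<byte[]> stands in for the implicit stream that would receive bad
tuples as binary data. None of these is an existing Storm API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BadDataSketch {

    /** Hypothetical exception type, named after the one proposed in this issue. */
    public static class InvalidTupleException extends RuntimeException {
        public InvalidTupleException(String message) {
            super(message);
        }
    }

    /** Simplified stand-in for a scheme: turns raw bytes into tuple fields. */
    public interface Scheme {
        List<Object> deserialize(byte[] raw);
    }

    /** Wraps a scheme and applies the proposed "skip bad data" behaviour. */
    public static class SkippingDeserializer {
        private final Scheme scheme;
        private final boolean skipBadData;            // stands in for the proposed config
        private final Consumer<byte[]> badDataStream; // stands in for the implicit stream

        public SkippingDeserializer(Scheme scheme, boolean skipBadData,
                                    Consumer<byte[]> badDataStream) {
            this.scheme = scheme;
            this.skipBadData = skipBadData;
            this.badDataStream = badDataStream;
        }

        /** Returns the tuple, or null if the record was bad and was skipped. */
        public List<Object> deserialize(byte[] raw) {
            try {
                return scheme.deserialize(raw);
            } catch (InvalidTupleException e) {
                if (!skipBadData) {
                    throw e; // skipping disabled: surface the failure to the caller
                }
                badDataStream.accept(raw); // hand the raw bytes to the bad-data consumer
                return null;
            }
        }
    }

    /** Example scheme: accepts only payloads that parse as a decimal integer. */
    public static class IntScheme implements Scheme {
        @Override
        public List<Object> deserialize(byte[] raw) {
            String text = new String(raw, StandardCharsets.UTF_8);
            try {
                List<Object> tuple = new ArrayList<>();
                tuple.add(Integer.parseInt(text.trim()));
                return tuple;
            } catch (NumberFormatException e) {
                throw new InvalidTupleException("not an integer: " + text);
            }
        }
    }

    public static void main(String[] args) {
        List<byte[]> recorded = new ArrayList<>();
        SkippingDeserializer d =
                new SkippingDeserializer(new IntScheme(), true, recorded::add);
        System.out.println(d.deserialize("42".getBytes(StandardCharsets.UTF_8)));
        System.out.println(d.deserialize("oops".getBytes(StandardCharsets.UTF_8)));
        System.out.println(recorded.size() + " bad record(s) captured");
    }
}
```

With skipBadData set to false the same wrapper rethrows the exception instead of
skipping, so the skip-or-fail choice stays a matter of configuration. The
consumer could record the raw bytes somewhere, as suggested above.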

-------------------
malur: This would be very useful. Does it make sense to have an error handler 
bolt at different levels, such as spout and topology?

------------------
nathanmarz: Yes, it does. There's already a planned feature called "failure 
streams" for spouts: an implicit stream to which all failed spout tuples are 
sent. Bad data could be sent to another kind of failure stream.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
