My team uses Storm to ingest data into a few of our enterprise systems, and we sometimes run into issues where a tuple fails at one system and succeeds at the others. Some of us feel the problem is that we are parallelizing a process that should instead be serialized according to priority, so that if a tuple fails in one place, the other systems don't attempt to ingest it at all. None of the systems we are inserting data into are SQL databases, where we could just write the bolt to do a rollback in the event of a failure; they're search systems, HBase, stuff like that.
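To make the serialized idea concrete, here is a rough sketch in plain Java of what I mean. It does not use the Storm API at all, and `Sink`, `MemorySink`, and `ingest` are names made up purely for illustration. It writes to each system in priority order, stops at the first failure, and then calls a delete by message id on the systems that had already accepted the write. It assumes every target system can delete by message id, which may not hold for all of ours.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class OrderedIngest {

    /** Hypothetical wrapper around one target system (search index, HBase, ...). */
    interface Sink {
        String name();
        void write(String messageId, String payload) throws Exception;
        void delete(String messageId) throws Exception; // compensating action
    }

    /** Toy in-memory sink, only here to exercise the logic. */
    static class MemorySink implements Sink {
        final String name;
        final boolean failOnWrite;
        final List<String> stored = new ArrayList<>();

        MemorySink(String name, boolean failOnWrite) {
            this.name = name;
            this.failOnWrite = failOnWrite;
        }

        public String name() { return name; }

        public void write(String messageId, String payload) throws Exception {
            if (failOnWrite) {
                throw new Exception(name + " rejected " + messageId);
            }
            stored.add(messageId);
        }

        public void delete(String messageId) { stored.remove(messageId); }
    }

    /**
     * Writes to each sink in priority order. On the first failure, no later
     * sink is attempted and every sink already written is asked to delete
     * the message, most recent first. Returns true only if all writes succeed.
     */
    static boolean ingest(List<? extends Sink> sinksByPriority,
                          String messageId, String payload) {
        Deque<Sink> written = new ArrayDeque<>();
        for (Sink sink : sinksByPriority) {
            try {
                sink.write(messageId, payload);
                written.push(sink);
            } catch (Exception writeFailure) {
                while (!written.isEmpty()) {
                    Sink done = written.pop();
                    try {
                        done.delete(messageId);
                    } catch (Exception undoFailure) {
                        // The undo itself failed: record it for manual removal.
                        System.err.println("manual cleanup needed: "
                                + done.name() + " " + messageId);
                    }
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        MemorySink search = new MemorySink("search", false);
        MemorySink hbase = new MemorySink("hbase", true); // simulated failure
        MemorySink archive = new MemorySink("archive", false);
        boolean ok = ingest(Arrays.asList(search, hbase, archive), "msg-1", "payload");
        System.out.println("ingested everywhere: " + ok);
    }
}
```

This is a compensation scheme and not a real transaction: between the write and the delete, the data is visible in the earlier systems, and the delete can itself fail.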
I'd like to know what others here think about this use case. I've seen statements that Trident adds transactional support to Storm, which would be a big help, but I've not dug deep enough yet (holidays and all) to get a good feel for whether it's what we need. What we really need is some sort of transactional behavior where a failure in one bolt triggers a rollback across all of our data ingestion bolts, passing along the message id or something like that so we could manually remove the data.

Thanks,
Mike
