I have a question regarding serialization of member variables in spouts. We are using Storm spouts to parse files in HDFS. Because a single file usually emits many tuples, we store the file currently being parsed as a member variable in the spout.
We have noticed that when a rebalance happens, we tend to lose track of the file being parsed if it has not been completed. Can member variables be deserialized during rebalancing? Are there any other good tricks for cases where state must be saved between nextTuple calls, to protect against rebalancing? I'm using Storm version 0.9.0.1.

Thanks,
Vincent
