I just asked this question at the streaming webinar that just ended, but the speakers didn't answer it, so I'm throwing it out here:
AFAIK checkpoints are the only recommended method for running Spark Streaming without data loss. But checkpointing involves serializing the entire DStream graph, which prohibits any logic changes. How should I update or fix the logic of a running streaming app without any data loss?

Jong Wook