Hey,

It seems pretty clear that one of the strengths of Spark is being able to share code between the batch and streaming layers. However, given that Spark Streaming works with a DStream, which is a sequence of RDDs, while a batch job works on a single RDD, there may be some complexity involved.
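To make concrete what I mean by sharing code, here is a minimal sketch (SharedLogic, parse, batchJob and streamingJob are made-up names of mine, not anything from an existing codebase): the same parsing function is reused by a batch job working on an RDD and a streaming job working on a DStream.

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

object SharedLogic {
  // Plain function with no Spark dependency, callable from both layers.
  def parse(line: String): (String, Int) = {
    val fields = line.split(",")
    (fields(0), fields(1).toInt)
  }

  // Batch layer: operates on a single RDD.
  def batchJob(lines: RDD[String]): RDD[(String, Int)] =
    lines.map(parse)

  // Streaming layer: operates on a DStream, i.e. a sequence of RDDs over time.
  def streamingJob(lines: DStream[String]): DStream[(String, Int)] =
    lines.map(parse)
}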
Of course, since a DStream is just a sequence of RDDs, one can run the same code at the RDD granularity using DStream::foreachRDD. While this should work for the map side, I am not sure how it can work for the reduce phase, given that a group of keys spans multiple RDDs.

One option is to change the dataset object that a job works on: instead of passing an RDD to a class method, one passes a higher-level object (MetaRDD) that wraps either an RDD or a DStream depending on the context. The job then calls its regular maps, reduces and so on, and the MetaRDD wrapper delegates accordingly; a rough sketch of what I have in mind is at the end of this message.

I would just like to know the official best practice from the Spark community though. Thanks,
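For reference, here is roughly what I mean by the MetaRDD wrapper. This is entirely hypothetical on my part, not an existing Spark API: MetaRDD is my own name, BatchRDD and StreamRDD are made-up implementations, and only map is shown (reduceByKey and friends would be delegated the same way, which is exactly where my question about keys spanning multiple RDDs comes in).

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical wrapper: the job code is written against MetaRDD and does not
// know whether it is running on a batch RDD or on a DStream.
sealed trait MetaRDD[T] {
  def map[U: ClassTag](f: T => U): MetaRDD[U]
}

case class BatchRDD[T](rdd: RDD[T]) extends MetaRDD[T] {
  // Batch context: delegate straight to the underlying RDD.
  def map[U: ClassTag](f: T => U): MetaRDD[U] = BatchRDD(rdd.map(f))
}

case class StreamRDD[T](stream: DStream[T]) extends MetaRDD[T] {
  // Streaming context: delegate to the DStream, i.e. to each RDD it contains.
  def map[U: ClassTag](f: T => U): MetaRDD[U] = StreamRDD(stream.map(f))
}

A job would then be written once, e.g. def job(data: MetaRDD[String]) = data.map(SharedLogic.parse), and be handed either wrapper depending on the layer. My worry about the reduce phase still applies, though, since on the streaming side the delegation naturally happens per RDD/batch.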