You can do this.

// global variables to keep track of the latest batch
var latestTime: Time = _
var latestRDD: RDD[..] = _

dstream.foreachRDD((rdd: RDD[..], time: Time) => {
  latestTime = time
  latestRDD = rdd
})

Now you can asynchronously access the latest RDD. However, if you are going to run jobs on the latest RDD, you must tell the streaming subsystem to keep the necessary data around for longer; otherwise it will be deleted before the asynchronous query has completed. Use this:

streamingContext.remember(<expected max duration of your async query on latest RDD>)

On Tue, Jul 14, 2015 at 6:57 PM, Chen Song <chen.song...@gmail.com> wrote:

> I have been doing a POC of adding a REST service to a Spark Streaming job.
> Say I create a stateful DStream X by using updateStateByKey, and each time
> there is an HTTP request, I want to apply some transformations/actions on
> the latest RDD of X and collect the results immediately, not scheduled by
> the streaming batch interval.
>
> * Is that even possible?
> * The reason I think of this is that a user can get a list of RDDs via
> DStream.window.slice, but I cannot find a way to get the most recent RDD
> in the DStream.
>
> --
> Chen Song
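One subtlety of the shared-variable approach above: the streaming thread writes latestRDD while the REST handler thread reads it, so the reference should be published safely rather than through a plain var. Below is a minimal pure-Scala sketch of that publish/read pattern; Batch stands in for the RDD and its batch time, and LatestHolder, publish, and query are made-up names for illustration, not Spark API:

```scala
import java.util.concurrent.atomic.AtomicReference

// Stand-in for "an RDD plus its batch time". This is a pure-Scala sketch
// of the cross-thread publish/read pattern, not real Spark code.
case class Batch(time: Long, data: Seq[Int])

object LatestHolder {
  // AtomicReference guarantees the REST thread sees the streaming
  // thread's most recent write; a plain var gives no such visibility
  // guarantee across threads.
  private val latest = new AtomicReference[Option[Batch]](None)

  // Streaming side: call this from inside the foreachRDD body.
  def publish(b: Batch): Unit = latest.set(Some(b))

  // REST side: read whatever batch was published last, if any.
  def query(): Option[Batch] = latest.get()
}
```

In the real job, publish would be invoked from the foreachRDD callback shown above, and the REST handler would call query and run its job on the returned RDD within the window set by streamingContext.remember.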