Hi everyone,

I'm working on a Spark Streaming program where I need to asynchronously
apply a complex function across the partitions of an RDD. I'm currently
using foreachPartitionAsync to achieve this. What is the idiomatic way of
handling the FutureAction returned by the foreachPartitionAsync call?
Currently I am simply doing:

import java.util.concurrent.TimeoutException
import scala.concurrent.Await

try {
  Await.ready(future, timeout)
} catch {
  case error: TimeoutException =>
    future.cancel()
    // log the error
}

Is there a better way to handle a future that times out? I would prefer
some method of retrying, but I'm not sure how that would fit into the
Spark Streaming execution model. Processing order isn't particularly
important to me, so the ability to "come back at a later time" and retry
the contents of the batch interval would be helpful.
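Roughly, what I have in mind is something like the generic retry wrapper
below (plain scala.concurrent futures for illustration, no Spark; the names
awaitWithRetry and runAction are made up), though I don't know whether
resubmitting the action this way is safe inside a streaming batch:

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

// Hypothetical retry wrapper: re-run the async action up to maxRetries
// times whenever Await times out. runAction stands in for whatever call
// produces the future (e.g. rdd.foreachPartitionAsync(...)).
def awaitWithRetry[T](runAction: () => Future[T],
                      timeout: Duration,
                      maxRetries: Int): Try[T] = {
  def attempt(remaining: Int): Try[T] =
    Try(Await.result(runAction(), timeout)) match {
      case Failure(_: TimeoutException) if remaining > 0 =>
        attempt(remaining - 1) // come back later and retry
      case other => other
    }
  attempt(maxRetries)
}
```

With foreachPartitionAsync specifically, runAction would have to re-submit
the whole job, which is why I'm unsure how it interacts with the batch
scheduling.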

Thanks for any advice!




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Handling-futures-from-foreachPartitionAsync-in-Spark-Streaming-tp25883.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

