Hi, We have a requirement to keep RDDs in memory between Spark batch jobs that run every hour. The idea is to hold RDDs containing active user sessions in memory between two jobs, so that when one job finishes and the next runs an hour later, the RDDs with active sessions are still available for joining with the data in the current job. What do we need in order to keep the data in memory between two batch jobs? Can we use Tachyon?
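For reference, here is a rough sketch of one approach we were considering, assuming Tachyon is running and reachable as a Hadoop-compatible filesystem (the master address and paths below are hypothetical). Since rdd.persist() is scoped to a single application, the first job would write the session RDD out to a Tachyon path, and the next job would read it back and join:

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    object SessionHandoff {
      // Hypothetical Tachyon path; data written here outlives the SparkContext,
      // unlike an in-memory persist(), which dies with the application.
      val sessionPath = "tachyon://tachyon-master:19998/sessions/active"

      // End of hourly job N: write out the active-session RDD.
      def saveSessions(sessions: RDD[(String, String)]): Unit = {
        sessions.saveAsObjectFile(sessionPath)
      }

      // Start of hourly job N+1: read the previous sessions back and join.
      def loadAndJoin(sc: SparkContext, current: RDD[(String, String)]): RDD[(String, (String, String))] = {
        val previous = sc.objectFile[(String, String)](sessionPath)
        current.join(previous)
      }
    }

Is this the right direction, or is there a way to keep the RDDs truly in memory across jobs without the serialize/deserialize round trip?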
Thanks, Swetha