Hi,

I'm building a system for near real-time data analytics. My plan is to run a
periodic ETL batch job in Spark that calculates aggregations. User queries are
then parsed and answered with on-demand calculations, also in Spark. Where
should the pre-calculated results be saved? Once the aggregations finish, the
ETL job terminates, so anything cached in memory is gone. How can the
on-demand query jobs then make use of those results? More generally, could you
please suggest a good way to organize the data flow and the jobs to achieve
this?

I'm new to Spark, so sorry if this sounds like a dumb question.

Thank you.
Huy
