I did take a look at SJC earlier. It does look like it fits our use case. It
seems to be integrated with DataStax too. Apache Livy looks promising as well.
I will look into these further.
I think for a real-time app that needs sub-second latency, Spark dynamic
allocation won't work.
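(For context, a rough Scala sketch of the dynamic allocation knobs involved; the timeout values are the documented defaults and the executor bounds are just placeholders. The point is that new executors are only requested after tasks back up, and then still pay startup time, so scale-up happens on a scale of seconds rather than milliseconds.)

    import org.apache.spark.SparkConf

    // Minimal dynamic allocation setup; an external shuffle service is required.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")
      .set("spark.dynamicAllocation.minExecutors", "1")   // placeholder bounds
      .set("spark.dynamicAllocation.maxExecutors", "20")
      // Executors are requested only after tasks have been backlogged this long
      // (default 1s), and then still pay container/JVM startup time.
      .set("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")
      // Idle executors are returned to the cluster after this long (default 60s).
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s")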
Thanks!
On Wed, Feb 7,
The other way might be to launch a single SparkContext and then run jobs
inside of it.
You can take a look at these projects:
- https://github.com/spark-jobserver/spark-jobserver#persistent-context-mode---faster--required-for-related-jobs
- http://livy.incubator.apache.org
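For anyone skimming, here is a rough sketch of what the persistent-context pattern looks like with spark-jobserver's classic SparkJob API (the object name and the "input.path" config key are made up, and the exact trait names can differ between jobserver versions):

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import spark.jobserver.{SparkJob, SparkJobValid, SparkJobValidation}

    // Runs inside a pre-started, long-lived SparkContext managed by the job server.
    object WordCountJob extends SparkJob {
      // Cheap sanity check performed before the job is run on the shared context.
      override def validate(sc: SparkContext, config: Config): SparkJobValidation =
        SparkJobValid

      // The job server hands in the shared context, so there is no per-request
      // context startup cost; the job only pays scheduling time.
      override def runJob(sc: SparkContext, config: Config): Any = {
        val input = config.getString("input.path") // supplied by the caller at submit time
        sc.textFile(input)
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
          .take(10)
      }
    }

The context is created once over REST and then reused, which is what the "persistent context mode" link above refers to.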
Problems with this approach:
Currently a SparkContext and its executor pool are not shareable. Each
SparkContext gets its own executor pool for the entire lifetime of the application.
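(To illustrate what sharing inside a single application looks like: concurrent jobs submitted from different threads of one long-running context do share that one executor pool, e.g. via Spark's FAIR scheduler. A minimal sketch; the pool name and helper are only illustrative:)

    import org.apache.spark.{SparkConf, SparkContext}

    // One long-running context; every job submitted to it shares its executor pool,
    // optionally weighted through FAIR scheduler pools defined in fairscheduler.xml.
    val conf = new SparkConf()
      .setAppName("shared-context")
      .set("spark.scheduler.mode", "FAIR")
    val sc = new SparkContext(conf)

    // Illustrative helper: each caller thread tags its jobs with a pool name.
    def runInPool[T](pool: String)(body: => T): T = {
      sc.setLocalProperty("spark.scheduler.pool", pool)
      try body finally sc.setLocalProperty("spark.scheduler.pool", null)
    }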
So what are the best ways to share cluster resources across multiple
long-running Spark applications?
The only one I see is Spark dynamic allocation, but it has