Hi Santosh,

Generally speaking, there are two ways of making a process faster:

1. Do more intelligent work, e.g. by creating indexes, cubes etc., thus reducing the processing time.
2. Throw hardware and memory at it, using something like a multi-node Spark cluster on a fully managed cloud service such as Google Dataproc.

So Spark itself is the computational engine (the framework), and its physical realisation is a Spark cluster: multiple nodes/VM hosts that work in tandem and provide parallel processing.

I suggest that you look at the Spark docs: https://spark.apache.org/

HTH

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Sat, 10 Oct 2020 at 15:24, Santosh74 <sardesaisant...@gmail.com> wrote:
> Is Spark a compute engine only, or is it also a cluster which comes with a set of
> hardware/nodes? What exactly is a Spark cluster?
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>