Thanks Alex, but 482 MB was just an example size; I am looking for a generic approach to do this without broadcasting.
Any ideas?

best,
/Shahab

On Thu, Apr 30, 2015 at 4:21 PM, Alex <lxv...@gmail.com> wrote:

> 482 MB should be small enough to be distributed as a set of broadcast
> variables. Then you can use local features of Spark to process.
> ------------------------------
> From: shahab <shahab.mok...@gmail.com>
> Sent: 4/30/2015 9:42 AM
> To: user@spark.apache.org
> Subject: is there any way to enforce Spark to cache data in all worker
> nodes (almost equally)?
>
> Hi,
>
> I load data from Cassandra into Spark. The entire dataset is around
> 482 MB, and it is cached as TempTables in 7 tables. How can I enforce
> Spark to cache data in both worker nodes, not only in ONE worker (as in
> my case)?
>
> I am using Spark "2.1.1" with spark-connector "1.2.0-rc3". I have a
> small stand-alone cluster with two nodes, A and B, where node A
> accommodates Cassandra, the Spark master, and one worker, and node B
> contains the second Spark worker.
>
> best,
> /Shahab
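[Editor's note, not part of the original thread: one common approach to spreading a cached dataset across all workers, without broadcasting, is to repartition before caching. Because Cassandra runs only on node A here, the connector's locality-aware partitioning tends to place all partitions on A; a repartition forces a shuffle that redistributes them. A minimal sketch, assuming the spark-cassandra-connector's `cassandraTable` API and illustrative keyspace/table names — this requires a running Spark cluster and is not runnable standalone:]

```scala
import com.datastax.spark.connector._ // spark-cassandra-connector

// "my_keyspace" and "my_table" are placeholder names.
val rdd = sc.cassandraTable("my_keyspace", "my_table")
  // Shuffle into at least as many partitions as there are executor
  // cores, so partitions can land on every worker, not just node A.
  .repartition(sc.defaultParallelism)

rdd.cache()
rdd.count() // an action materializes the cache on the executors
```

[After the `count()`, the Storage tab of the Spark UI should show cached partitions on both workers. The trade-off is one full shuffle of the 482 MB at load time in exchange for balanced cached data afterwards.]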