482 MB should be small enough to be distributed as a set of broadcast
variables. Then you can use Spark's local features to process it.
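A minimal sketch of that suggestion, assuming a small lookup-style dataset (the paths, the `lookupTable` map, and the join key are illustrative, not from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("broadcast-sketch"))

    // Pull the small dataset to the driver, then broadcast it once per executor.
    val lookupTable: Map[String, Int] = sc.textFile("hdfs:///data/lookup.csv")
      .map(_.split(","))
      .map(a => (a(0), a(1).toInt))
      .collectAsMap()
      .toMap

    val lookupBc = sc.broadcast(lookupTable)

    // Every task reads the broadcast value from local memory; no shuffle,
    // no remote fetch from a single worker's cache.
    val enriched = sc.textFile("hdfs:///data/events.csv")
      .map(line => (line, lookupBc.value.getOrElse(line, -1)))

    enriched.take(5).foreach(println)
    sc.stop()
  }
}
```

The trade-off: a broadcast puts a full copy of the data on every executor, so it only works while the dataset comfortably fits in each worker's memory.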
-Original Message-
From: shahab shahab.mok...@gmail.com
Sent: 4/30/2015 9:42 AM
To: user@spark.apache.org user@spark.apache.org
Subject: is there
Hi,
I load data from Cassandra into Spark. The entire dataset is around 482
MB, and it is cached as temp tables (7 tables). How can I force Spark to
cache the data on both worker nodes, not only on ONE worker (as in my case)?
I am using Spark 2.1.1 with spark-connector 1.2.0-rc3. I have small
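One likely cause of the single-worker caching described above is that the Cassandra RDD arrives with too few partitions, so all cached blocks land on one executor. A hedged sketch of spreading the cache by repartitioning before registering the temp table (keyspace, table, and partition count are placeholders, and the exact connector/SQL API varies by version):

```scala
import org.apache.spark.sql.SQLContext
import com.datastax.spark.connector._  // spark-cassandra-connector

val sqlContext = new SQLContext(sc)

// Read from Cassandra; "my_keyspace" / "my_table" are illustrative names.
val rows = sc.cassandraTable("my_keyspace", "my_table")

// Repartition so partitions (and therefore cached blocks) are spread
// across all workers instead of piling up on one. Pick a count that is
// a small multiple of the total executor core count.
val spread = rows.repartition(16)

// Register and cache; cacheTable stores the data where the partitions live.
spread.map(r => (r.getString("id"), r.getString("value")))
  .toDF("id", "value")
  .registerTempTable("my_temp_table")
sqlContext.cacheTable("my_temp_table")
sqlContext.sql("SELECT COUNT(*) FROM my_temp_table").show()  // materialize
```

Note that caching distributes the partitions across workers; it does not give every worker a full copy of the table the way a broadcast does.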
Thanks Alex, but 482 MB was just an example size; I am looking for a
generic approach to doing this without broadcasting.
any idea?
best,
/Shahab
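For a generic, broadcast-free approach, one option is to persist with a replicated storage level so each cached partition has a copy on two nodes. A sketch under that assumption (the input path is illustrative):

```scala
import org.apache.spark.storage.StorageLevel

val data = sc.textFile("hdfs:///data/input")  // any RDD source

// MEMORY_ONLY_2 keeps each cached partition in memory on TWO workers,
// improving locality and fault tolerance without broadcasting.
data.persist(StorageLevel.MEMORY_ONLY_2)
data.count()  // force materialization of the cache
```

This still distributes partitions rather than replicating the whole dataset to every node; full per-node copies are exactly what broadcast variables are for, which is why there is no drop-in generic substitute.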
On Thu, Apr 30, 2015 at 4:21 PM, Alex lxv...@gmail.com wrote:
482 MB should be small enough to be distributed as a set of broadcast
variables. Then you can use Spark's local features to process it.