Hi Spark Users,
What are the restrictions on using more than one Spark context in a
single Scala application? I did not see any documented limitations, but we
did observe some bad behavior when trying to do this. The one I'm hitting
now is that if I create a local context, stop it and then
of different serialization requirements between the two. Perhaps it is
different when using Kryo.
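For context, here is a minimal sketch of the stop-and-recreate pattern under discussion. This assumes the standard SparkContext/SparkConf API; the master, app names, and the Kryo setting are illustrative, and whether sequential contexts in one JVM behave correctly is exactly the open question in this thread:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TwoContexts {
  def main(args: Array[String]): Unit = {
    // First context: plain local mode with the default Java serializer.
    val sc1 = new SparkContext(new SparkConf().setMaster("local").setAppName("first"))
    println(sc1.parallelize(1 to 10).sum())
    sc1.stop() // stop fully before creating another context in the same JVM

    // Second context in the same JVM, this time configured for Kryo,
    // to probe whether differing serialization requirements matter.
    val conf2 = new SparkConf()
      .setMaster("local")
      .setAppName("second")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    val sc2 = new SparkContext(conf2)
    println(sc2.parallelize(1 to 10).sum())
    sc2.stop()
  }
}
```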
On 05/14/2014 04:34 AM, Andras Nemeth wrote:
E.g. if I accidentally use a closure which has something
non-serializable in it, then my test will happily succeed in local mode but
go down in flames
Hi,
Spark's local mode is great to create simple unit tests for our spark
logic. The disadvantage however is that certain types of problems are never
exposed in local mode because things never need to be put on the wire.
E.g. if I accidentally use a closure which has something non-serializable
in it, then my test will happily succeed in local mode but go down in flames.
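One way to catch this in a plain unit test, without a cluster, is to round-trip the closure through Java serialization yourself, which is roughly what Spark does before shipping a task to an executor. This is a sketch under that assumption; the `Resource` and `ClosureCheck` names are hypothetical, not Spark API:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// A class that is NOT Serializable, standing in for e.g. a DB connection.
class Resource { def tag: String = "r" }

object ClosureCheck {
  // Round-trip a function value through Java serialization, roughly what
  // Spark does when it ships a task closure over the wire.
  def isSerializable(f: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(f)
      true
    } catch { case _: NotSerializableException => false }

  def main(args: Array[String]): Unit = {
    val res = new Resource
    val bad: Int => String = n => res.tag + n // captures non-serializable res
    val good: Int => String = n => "r" + n    // captures nothing problematic
    println(s"bad serializable: ${isSerializable(bad)}")   // false
    println(s"good serializable: ${isSerializable(good)}") // true
  }
}
```

Local mode happily runs the `bad` closure because nothing is ever serialized, which is why such tests pass locally and fail on a real cluster.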
Thanks Koert, very useful!
On Tue, Apr 29, 2014 at 6:41 PM, Koert Kuipers ko...@tresata.com wrote:
SparkContext.getRDDStorageInfo
On Tue, Apr 29, 2014 at 12:34 PM, Andras Nemeth
andras.nem...@lynxanalytics.com wrote:
Hi,
Is it possible to know from code about an RDD if it is cached
On 30 Apr 2014 10:35, Akhil Das ak...@sigmoidanalytics.com wrote:
Hi
The reason you saw that warning is that the native Hadoop library
$HADOOP_HOME/lib/native/libhadoop.so.1.0.0 was actually compiled for 32-bit.
Anyway, it's just a warning, and won't impact Hadoop's functionality.
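You can confirm the diagnosis yourself with `file` (path taken from this thread; adjust to your install — the output is environment-specific):

```shell
# Inspect the architecture of the bundled native library. A 32-bit build
# loaded by a 64-bit JVM triggers the "unable to load native-hadoop
# library" warning and a fallback to the builtin Java classes.
file "$HADOOP_HOME/lib/native/libhadoop.so.1.0.0"
# file typically reports "ELF 64-bit LSB shared object" for a 64-bit
# build, or "ELF 32-bit ..." for the problematic 32-bit one.
```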
Here is the
for an equivalent of spark-ec2 on GCE. Did you roll your own?
On Thu, Apr 17, 2014 at 10:24 AM, Andras Nemeth
andras.nem...@lynxanalytics.com wrote:
Hello!
On Wed, Apr 16, 2014 at 7:59 PM, Aureliano Buendia buendia...@gmail.com
wrote:
Hi,
Google has published a new connector
Hello Spark Users,
With the recent graduation of Spark to a top-level project (grats, btw!),
this is maybe a well-timed question. :)
We are at the very beginning of a large-scale big data project, and after
two months of exploration work we'd like to settle on the technologies to
use, roll up our sleeves