Context switch in spark

2014-05-26 Thread Andras Nemeth
Hi Spark Users, What are the restrictions on using more than one Spark context in a single Scala application? I did not see any documented limitations, but we did observe some bad behavior when trying to do this. The one I'm hitting now is that if I create a local context, stop it and then

Re: Spark unit testing best practices

2014-05-16 Thread Andras Nemeth
of different serialization requirements between the two. Perhaps it is different when using Kryo On 05/14/2014 04:34 AM, Andras Nemeth wrote: E.g. if I accidentally use a closure which has something non-serializable in it, then my test will happily succeed in local mode but go down in flames

Spark unit testing best practices

2014-05-15 Thread Andras Nemeth
Hi, Spark's local mode is great for creating simple unit tests for our Spark logic. The disadvantage, however, is that certain types of problems are never exposed in local mode because things never need to be put on the wire. E.g. if I accidentally use a closure which has something non-serializable

Re: Storage information about an RDD from the API

2014-05-06 Thread Andras Nemeth
Thanks Koert, very useful! On Tue, Apr 29, 2014 at 6:41 PM, Koert Kuipers ko...@tresata.com wrote: SparkContext.getRDDStorageInfo On Tue, Apr 29, 2014 at 12:34 PM, Andras Nemeth andras.nem...@lynxanalytics.com wrote: Hi, Is it possible to know from code about an RDD if it is cached

Re: the spark configuage

2014-04-30 Thread Andras Nemeth
On 30 Apr 2014 10:35, Akhil Das ak...@sigmoidanalytics.com wrote: Hi The reason you saw that warning is the native Hadoop library $HADOOP_HOME/lib/native/libhadoop.so.1.0.0 was actually compiled on 32 bit. Anyway, it's just a warning, and won't impact Hadoop's functionalities. Here is the

Re: Using google cloud storage for spark big data

2014-04-22 Thread Andras Nemeth
for an equivalent of spark-ec2 on GCE. Did you roll your own? On Thu, Apr 17, 2014 at 10:24 AM, Andras Nemeth andras.nem...@lynxanalytics.com wrote: Hello! On Wed, Apr 16, 2014 at 7:59 PM, Aureliano Buendia buendia...@gmail.com wrote: Hi, Google has published a new connector

Fwd: Spark - ready for prime time?

2014-04-10 Thread Andras Nemeth
Hello Spark Users, With the recent graduation of Spark to a top-level project (grats, btw!), maybe a well-timed question. :) We are at the very beginning of a large-scale big data project and after two months of exploration work we'd like to settle on the technologies to use, roll up our sleeves