RE: Model parallelism with RDD

2015-07-16 Thread Ulanov, Alexander
Dear Spark developers, What happens if the RDD does not fit into memory and cache does not work in the code below? Will all previous iterations be repeated on each new iteration of the iterative RDD update (as described below)? Also, could you clarify regarding DataFrame and GC overhead: does setting

S3 Read / Write makes executors deadlocked

2015-07-16 Thread Hao Ren
Given the following code, which just reads from S3 and then saves files to S3: val inputFileName: String = "s3n://input/file/path" val outputFileName: String = "s3n://output/file/path" val conf = new SparkConf().setAppName(this.getClass.getName).setMaster("local[4]") val

Re: problems with build of latest the master

2015-07-16 Thread Steve Loughran
Patching Hadoop's build will fix this long term, but not until Hadoop-2.7.2. I think just adding the OpenStack JAR to the Spark classpath should be enough to pick this up, which the --jars option can do with ease. On that topic, one thing I would like to see (knowing what it takes to get azure

Re: Foundation policy on releases and Spark nightly builds

2015-07-16 Thread Sean Owen
To move this forward, I think one of two things needs to happen: 1. Move this guidance to the wiki. Seems that people gathered here believe that resolves the issue. Done. 2. Put disclaimers on the current downloads page. This may resolve the issue, but then we bring it up on the right mailing

Re: RestSubmissionClient Basic Auth

2015-07-16 Thread Akhil Das
You could raise a JIRA ticket for the feature and start working on it; once done, you can send a pull request with the code changes. Thanks, Best Regards. On Wed, Jul 15, 2015 at 7:30 PM, Joel Zambrano jo...@microsoft.com wrote: Thanks Akhil! For the one where I change the rest client, how

KryoSerializer gives class cast exception

2015-07-16 Thread Eugene Morozov
Hi, some time ago we found that it’s better to use the Kryo serializer instead of the Java one. So we turned it on and use it everywhere. I have pretty complex objects, which I can’t change. Previously my algo was building such objects and then storing them in external storage. It was not

Re: Apache gives exception when running groupby on df temp table

2015-07-16 Thread Ted Yu
Can you provide a bit more information, such as: the release of Spark you use; a snippet of your SparkSQL query. Thanks. On Thu, Jul 16, 2015 at 5:31 AM, nipun ibnipu...@gmail.com wrote: I have a dataframe. I register it as a temp table and run a spark sql query on it to get another dataframe. Now

Re: S3 Read / Write makes executors deadlocked

2015-07-16 Thread Hao Ren
I have tested on another PC which has 8 CPU cores. But it hangs when defaultParallelismLevel is 4 or more, e.g. sparkConf.setMaster("local[*]"). local[1] ~ local[3] work well. 4 is the mysterious boundary. It seems that I am not the only one who has encountered this problem:

RE: BlockMatrix multiplication

2015-07-16 Thread Ulanov, Alexander
Hi Burak, If I change the code as you suggested then it fails with (given that blockSize is 1): “org.apache.spark.SparkException: The MatrixBlock at (3, 3) has dimensions different than rowsPerBlock: 2, and colsPerBlock: 2. Blocks on the right and bottom edges can have smaller
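
The edge-block rule quoted in the error message above can be illustrated with a minimal sketch in plain Scala. It does not call Spark; `BlockDims` and `expectedBlockDims` are hypothetical names introduced here for illustration, and the 7 x 7 matrix size is an assumed example, not a value taken from the thread. The sketch only computes the dimensions a block would be expected to have, assuming that every block is rowsPerBlock x colsPerBlock except those on the bottom and right edges, which may be smaller.

```scala
object BlockDims {
  /** Expected (rows, cols) of the block at grid position (blockRow, blockCol).
    * Hypothetical helper, not part of the Spark API. Assumes interior blocks
    * are exactly rowsPerBlock x colsPerBlock and only edge blocks are smaller.
    */
  def expectedBlockDims(
      numRows: Long,
      numCols: Long,
      rowsPerBlock: Int,
      colsPerBlock: Int,
      blockRow: Int,
      blockCol: Int): (Int, Int) = {
    // Size of one block along a single axis: the full block size,
    // or whatever remains of the matrix when the block sits on the edge.
    def extent(total: Long, perBlock: Int, index: Int): Int = {
      val start = index.toLong * perBlock
      require(index >= 0 && start < total, s"block index $index is out of range")
      math.min(perBlock.toLong, total - start).toInt
    }
    (extent(numRows, rowsPerBlock, blockRow), extent(numCols, colsPerBlock, blockCol))
  }
}
```

Under these assumptions, a 7 x 7 matrix split with rowsPerBlock = 2 and colsPerBlock = 2 has a 4 x 4 grid of blocks: `BlockDims.expectedBlockDims(7, 7, 2, 2, 0, 0)` gives `(2, 2)`, while the corner block at (3, 3) gives `(1, 1)`. A block whose actual dimensions differ from the expected ones is the kind of mismatch the message describes.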