permission problem

2014-05-01 Thread Livni, Dana
I'm working with Spark 0.9.0 on CDH5. I'm running a Spark application written in Java in yarn-client mode. Because of the OP installed on the cluster, I need to run the application as the hdfs user; otherwise I have a permission problem and get the following error:

working with MultiTableInputFormat

2014-03-29 Thread Livni, Dana
I'm trying to create an RDD from multiple scans. I tried to set the configuration this way: Configuration config = HBaseConfiguration.create(); config.setStrings(MultiTableInputFormat.SCANS, scanStrings); and I create each scan string in the scanStrings array this way: Scan scan = new Scan();

RE: major Spark performance problem

2014-03-09 Thread Livni, Dana
YARN also has this scheduling option. The problem is that all of our applications have the same flow, where the first stage is the heaviest and the rest are very small. When several requests (applications) start to run at the same time, the first stages of all of them are scheduled in parallel, and