Re: SparkR in yarn-client mode needs sparkr.zip

2015-10-25 Thread Ram Venkatesh
Felix,

Missed your reply - agreed, it looks like the same issue; I resolved mine as
a duplicate.

Thanks!
Ram

On Sun, Oct 25, 2015 at 2:47 PM, Felix Cheung 
wrote:

>
>
> This might be related to https://issues.apache.org/jira/browse/SPARK-10500


Re: SparkR in yarn-client mode needs sparkr.zip

2015-10-25 Thread Felix Cheung
This might be related to https://issues.apache.org/jira/browse/SPARK-10500



On Sun, Oct 25, 2015 at 9:57 AM -0700, "Ted Yu"  wrote:
In zipRLibraries():

// create a zip file from scratch, do not append to existing file.
val zipFile = new File(dir, name)

I guess instead of creating sparkr.zip in the same directory as R lib, the
zip file can be created under some directory writable by the user launching
the app and accessible by user 'yarn'.

Cheers



Re: SparkR in yarn-client mode needs sparkr.zip

2015-10-25 Thread Ram Venkatesh
Ted Yu,

Agree that either picking up sparkr.zip if it already exists, or creating the
zip in a local scratch directory, would work. This code is called by the
client-side job submission logic, and the resulting zip is already added to
the local resources for the YARN job, so I don't think the directory needs to
be accessible by the user yarn or from the cluster. Filed
https://issues.apache.org/jira/browse/SPARK-11304 for this issue.

As a temporary hack, I created a file called sparkr.zip in R/lib and made it
world-writable.
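In script form, the workaround above amounts to something like this (a sketch
only; a throwaway temp directory stands in for the real Spark R/lib path,
which would need admin rights to touch):

```python
# Sketch of the workaround: pre-create sparkr.zip once (as an admin) and
# make it world-writable, so any submitting user can overwrite it on launch.
# A temp directory stands in here for the real Spark R/lib directory.
import os
import pathlib
import tempfile

r_lib = pathlib.Path(tempfile.mkdtemp())       # stand-in for .../spark/R/lib
zip_path = r_lib / "sparkr.zip"
zip_path.touch()                               # create the (empty) file once
os.chmod(zip_path, 0o666)                      # rw-rw-rw-: any user may rewrite it
print(oct(os.stat(zip_path).st_mode & 0o777))  # → 0o666
```

With the file pre-created and world-writable, the submitting user no longer
needs write permission on the R/lib directory itself, only on the file.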

Thanks
Ram

On Sun, Oct 25, 2015 at 9:56 AM, Ted Yu  wrote:

> In zipRLibraries():
>
> // create a zip file from scratch, do not append to existing file.
> val zipFile = new File(dir, name)
>
> I guess instead of creating sparkr.zip in the same directory as R lib,
> the zip file can be created under some directory writable by the user
> launching the app and accessible by user 'yarn'.
>
> Cheers
>


Re: SparkR in yarn-client mode needs sparkr.zip

2015-10-25 Thread Ted Yu
In zipRLibraries():

// create a zip file from scratch, do not append to existing file.
val zipFile = new File(dir, name)

I guess instead of creating sparkr.zip in the same directory as R lib, the
zip file can be created under some directory writable by the user launching
the app and accessible by user 'yarn'.
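That idea could be sketched as follows (Python standing in for the Scala
zipRLibraries; the function name mirrors it, but the scratch-directory choice
and all other details are assumptions, not the actual Spark code):

```python
# Build sparkr.zip in a per-user scratch directory instead of next to the
# (possibly read-only) R lib. This is an illustrative stand-in, not Spark's
# actual implementation.
import os
import tempfile
import zipfile

def zip_r_libraries(lib_dir: str, name: str = "sparkr.zip") -> str:
    scratch = tempfile.mkdtemp(prefix="sparkr-")   # always writable by the submitter
    zip_path = os.path.join(scratch, name)
    # Create the zip from scratch; never append to an existing file
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(lib_dir):
            for fname in files:
                full = os.path.join(root, fname)
                zf.write(full, os.path.relpath(full, lib_dir))
    return zip_path

# Demo against a throwaway "R lib" directory
lib = tempfile.mkdtemp(prefix="rlib-")
with open(os.path.join(lib, "SparkR.R"), "w") as fh:
    fh.write("# stub\n")
print(os.path.basename(zip_r_libraries(lib)))  # → sparkr.zip
```

Because the archive lands in a fresh per-user directory, no shared install
path ever needs to be writable by the submitting user.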

Cheers



SparkR in yarn-client mode needs sparkr.zip

2015-10-25 Thread Ram Venkatesh


If you run sparkR in yarn-client mode, it fails with

Exception in thread "main" java.io.FileNotFoundException:
/usr/hdp/2.3.2.1-12/spark/R/lib/sparkr.zip (Permission denied)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at
org.apache.spark.deploy.RPackageUtils$.zipRLibraries(RPackageUtils.scala:215)
at
org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:371)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

The behavior is the same with the pre-built spark-1.5.1-bin-hadoop2.6
distribution.

Interestingly if I run as a user with write permissions to the R/lib
directory, it succeeds. However, the sparkr.zip file is recreated each time
sparkR is launched, so even if the file is present it has to be writable by
the submitting user.

A couple of questions:
1. Can sparkr.zip be packaged once and placed in that location for multiple
users?
2. If not, is this location configurable, so that each user can specify a
directory that they can write to?
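A user-configurable location per question 2 might look roughly like this
(entirely hypothetical: the SPARKR_ZIP_DIR variable is invented for
illustration, and no such setting existed in Spark 1.5):

```python
# Hypothetical: let each user point sparkr.zip at a directory they can write
# to, falling back to a fresh per-user temp dir. SPARKR_ZIP_DIR is an
# invented name, not a real Spark or SparkR setting.
import os
import tempfile

def sparkr_zip_dir() -> str:
    return os.environ.get("SPARKR_ZIP_DIR") or tempfile.mkdtemp(prefix="sparkr-")

os.environ["SPARKR_ZIP_DIR"] = "/tmp"
print(sparkr_zip_dir())  # → /tmp
```

Either behavior would avoid requiring write access to the shared install path.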

Thanks!
Ram



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-in-yarn-client-mode-needs-sparkr-zip-tp25194.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org