[jira] [Updated] (SPARK-16055) sparkR.init() can not load sparkPackages when executing an R file

Sun Rui (JIRA) Sun, 19 Jun 2016 00:50:20 -0700

     [ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sun Rui updated SPARK-16055:
----------------------------
    Description: 
This is an issue reported in the Spark user mailing list. Refer to 
http://comments.gmane.org/gmane.comp.lang.scala.spark.user/35742

The issue does not occur in an interactive SparkR session, but it does occur 
when executing an R file.

Putting the following example code into an R file reproduces the issue:
{code}
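# Make the SparkR package bundled with the Spark distribution loadable
.libPaths(c("/home/user/spark-1.6.1-bin-hadoop2.6/R/lib", .libPaths()))
Sys.setenv(SPARK_HOME = "/home/user/spark-1.6.1-bin-hadoop2.6")
library("SparkR")
# Request the spark-csv package through sparkPackages
sc <- sparkR.init(sparkPackages = "com.databricks:spark-csv_2.11:1.4.0")
sqlContext <- sparkRSQL.init(sc)
# Fails when this file is executed as a script: the "csv" data source is not found
df <- read.df(sqlContext,
              "file:///home/user/spark-1.6.1-bin-hadoop2.6/data/mllib/sample_tree_data.csv",
              "csv")
showDF(df)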
{code}

The error message is as follows:
{panel}
16/06/19 15:48:56 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) : 
  java.lang.ClassNotFoundException: Failed to find data source: csv. Please find packages at http://spark-packages.org
        at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
        at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
        at org.apache.spark.sql.api.r.SQLUtils$.loadDF(SQLUtils.scala:160)
        at org.apache.spark.sql.api.r.SQLUtils.loadDF(SQLUtils.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala
Calls: read.df -> callJStatic -> invokeJava
Execution halted
{panel}

The cause is that when an R file is executed, the R backend is launched 
before the R interpreter, so the packages specified with ‘sparkPackages’ 
are never processed.
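
A possible workaround, offered only as a sketch and not part of the original 
report: because the backend is already running when the script starts, the 
package coordinates can instead be passed to spark-submit through its 
--packages option, so that they are resolved when the backend is launched. 
The function name submit_r_with_packages, the SPARK_SUBMIT override, and the 
script name repro.R are hypothetical; SPARK_HOME and the package coordinate 
are copied from the reproduction above.

```shell
# Sketch only: pass the package coordinates to spark-submit so that they are
# resolved when the JVM backend starts, before the R script runs.
# SPARK_HOME is the path used in the report; SPARK_SUBMIT and
# submit_r_with_packages are hypothetical names introduced for this example.
SPARK_HOME="${SPARK_HOME:-/home/user/spark-1.6.1-bin-hadoop2.6}"
SPARK_SUBMIT="${SPARK_SUBMIT:-$SPARK_HOME/bin/spark-submit}"

submit_r_with_packages() {
  # $1: comma-separated Maven coordinates, $2: path to the R script
  packages="$1"
  script="$2"
  "$SPARK_SUBMIT" --packages "$packages" "$script"
}
```

It could then be called as 
`submit_r_with_packages "com.databricks:spark-csv_2.11:1.4.0" repro.R`. 
Whether this fits a given deployment is an assumption to verify; it only moves 
the package declaration from the R script to the command line.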

This JIRA issue tracks the problem. An appropriate solution is still to be 
discussed; one option may be to document the limitation.



> sparkR.init() can not load sparkPackages when executing an R file
> -----------------------------------------------------------------
>
>                 Key: SPARK-16055
>                 URL: https://issues.apache.org/jira/browse/SPARK-16055
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: SparkR
>    Affects Versions: 1.6.1
>            Reporter: Sun Rui
>            Priority: Minor


