[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-07-10 Thread Shivaram Venkataraman (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155697#comment-17155697
 ] 

Shivaram Venkataraman commented on SPARK-31918:
---

Yes – this is the reason that SparkR has been temporarily removed from CRAN. We 
need a new release to upload a new version and we have some efforts to release 
Spark 2.4.7 and Spark 3.0.1 that are ongoing AFAIK.

cc'ing the release managers [~holden] [~ruifengz] [~prashant]

> SparkR CRAN check gives a warning with R 4.0.0 on OSX
> -
>
> Key: SPARK-31918
> URL: https://issues.apache.org/jira/browse/SPARK-31918
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.6, 3.0.0
>Reporter: Shivaram Venkataraman
>Assignee: Hyukjin Kwon
>Priority: Blocker
> Fix For: 2.4.7, 3.0.1, 3.1.0
>
>
> When the SparkR package is run through a CRAN check (i.e. with something like 
> R CMD check --as-cran ~/Downloads/SparkR_2.4.6.tar.gz), we rebuild the SparkR 
> vignette as a part of the checks.
> However this seems to be failing with R 4.0.0 on OSX -- both on my local 
> machine and on CRAN 
> https://cran.r-project.org/web/checks/check_results_SparkR.html
> cc [~felixcheung]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-07-10 Thread Michael Chirico (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155512#comment-17155512
 ] 

Michael Chirico commented on SPARK-31918:
-

Hey folks, I just saw that SparkR was removed from CRAN. I assume it's related 
to this issue?

https://cran.r-project.org/web/packages/SparkR/index.html

Is a new submission in progress now that this issue has been fixed?

Please let me know if there's any way I can help as well.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-24 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143749#comment-17143749
 ] 

Apache Spark commented on SPARK-31918:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/28922




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142862#comment-17142862
 ] 

Apache Spark commented on SPARK-31918:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/28907




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-23 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142859#comment-17142859
 ] 

Apache Spark commented on SPARK-31918:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/28907




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142837#comment-17142837
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

OK, I finally got all the tests to pass. I will open a PR soon.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-23 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142657#comment-17142657
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

With SparkR built under R 4.0.1 and run as is on R 3.6.3, the tests pass except 
for one failure, which I don't think is a big deal:

{code}
Warning message:
package ‘SparkR’ was built under R version 4.0.1
Spark package found in SPARK_HOME: /.../spark
══ testthat results  ═══
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
✔ |  OK F W S | Context
✔ |  11   | binary functions [3.7 s]
✔ |   4   | functions on binary files [3.7 s]
✔ |   2   | broadcast variables [0.8 s]
✔ |   5   | functions in client.R
✔ |  46   | test functions in sparkR.R [10.1 s]
✔ |   2   | include R packages [0.5 s]
✔ |   2   | JVM API [0.3 s]
✔ |  70   | MLlib classification algorithms, except for tree-based algorithms [93.1 s]
✔ |  70   | MLlib clustering algorithms [38.8 s]
✔ |   6   | MLlib frequent pattern mining [3.0 s]
✔ |   8   | MLlib recommendation algorithms [9.9 s]
✔ | 128   | MLlib regression algorithms, except for tree-based algorithms [63.9 s]
✔ |   8   | MLlib statistics algorithms [0.5 s]
✔ |  94   | MLlib tree-based algorithms [81.2 s]
✔ |  29   | parallelize() and collect() [0.5 s]
✔ | 428   | basic RDD functions [21.1 s]
✔ |  39   | SerDe functionality [2.1 s]
✔ |  20   | partitionBy, groupByKey, reduceByKey etc. [3.3 s]
✔ |   4   | functions in sparkR.R
✔ |  16   | SparkSQL Arrow optimization [20.3 s]
✔ |   6   | test show SparkDataFrame when eager execution is enabled. [1.3 s]
✖ | 1172 1 | SparkSQL functions [156.4 s]

test_sparkSQL.R:2719: error: mutate(), transform(), rename() and names()
could not find function "deparse1"
Backtrace:
 1. base::attach(airquality) tests/fulltests/test_sparkSQL.R:2719:2
 2. base::attach(airquality)

✔ |  42   | Structured Streaming [520.2 s]
✔ |  16   | tests RDD function take() [0.9 s]
✔ |  14   | the textFile() function [2.6 s]
✔ |  46   | functions in utils.R [0.5 s]
✔ |   0 1 | Windows-specific tests

test_Windows.R:22: skip: sparkJars tag in SparkContext
Reason: This test is only for Windows, skipped


══ Results ═
Duration: 1039.0 s
{code}

The test failure seems to be due to the missing {{deparse1}} function, which was 
added in R 4.0.0. I think we can just point people to 
https://github.com/r-lib/backports if this is an issue.
The test case itself doesn't look like a big deal.

I will take a closer look at making it work in R 4.0.0.
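
As a rough illustration of what such a backport amounts to, the sketch below defines {{deparse1}} only when base R does not already provide it. The function body follows the definition in R 4.0.0; the surrounding shim and the sample call are hypothetical, and I have not checked whether this alone makes the failing test pass.

```r
# Hypothetical shim: define deparse1() only when base R lacks it (R < 4.0.0).
if (!exists("deparse1", envir = baseenv(), inherits = FALSE)) {
  deparse1 <- function(expr, collapse = " ", width.cutoff = 500L, ...) {
    # deparse() may return several strings; collapse them into one.
    paste(deparse(expr, width.cutoff, ...), collapse = collapse)
  }
}

# A made-up call that is long enough for plain deparse() to split it.
long_call <- quote(some_function(argument_one = 1, argument_two = 2,
                                 argument_three = 3, argument_four = 4))

# deparse1() always yields a single string, whichever definition is in use.
one_line <- deparse1(long_call)
```

On R 4.0.0 or newer the {{if}} branch is skipped and the base definition is used, so the same script behaves the same way on both sides of the version boundary.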




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142620#comment-17142620
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

I tested it manually with the fix I mentioned 
[here|https://issues.apache.org/jira/browse/SPARK-31918?focusedCommentId=17142127=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17142127]
 .. let me test that case too.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Shivaram Venkataraman (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142618#comment-17142618
 ] 

Shivaram Venkataraman commented on SPARK-31918:
---

That's great, [~hyukjin.kwon]! So we can get around the installation issue if 
we can build on R 4.0.0. However, I guess we will still have the 
serialization issue. BTW, does the serialization issue go away if we build in R 
4.0.0 and run with R 3.6.3? 





[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142605#comment-17142605
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

Okay, [~shivaram], the first option seems to work, although it shows the warning 
below. I built Spark 3.0.0 with R 4.0.1 and then manually downgraded to R 
3.6.3.

{code:java}
During startup - Warning message:
package ‘SparkR’ was built under R version 4.0.1
{code}

I removed unrelated comments I left above.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142591#comment-17142591
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

Oh, wait, the worker should test SparkR built with R 4.0.1. In the first case, 
I guess the R worker loaded the one from the Spark 3.0.0 download (which was 
built with R 3.6.3). Let me test it by overwriting it.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142566#comment-17142566
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

Nice, [~shivaram].

I just quickly tested, and the first option is not working.

1. Build Spark 3.0.0 with R 4.0.1 and install it from source with R 3.4.0 on 
another machine:

{code}
install.packages("SparkR_3.0.0.tar.gz", repos = NULL, type = "source")
{code}

{code}
df <- createDataFrame(lapply(seq(100), function (e) list(value=e)))
count(dapply(df, function(x) as.data.frame(x[x$value < 50,]), schema(df)))
{code}

It shows the same error as shown in 
https://cran.r-project.org/web/checks/check_results_SparkR.html


2. Build Spark 3.0.0 with R 4.0.1, then load the library directly with R 3.4.0 
on another machine:

{code}
library(SparkR, lib.loc = c(file.path("~/spark-3.0.0-bin-hadoop2.7", "R", 
"lib")))
{code}

{code}
# this error message is translated from another language. My R in Mac is in 
Korean
Error listing packages, Error in readRDS(pfile): cannot read workspace version 
3 written by R 4.0.1. R version should be 3.5+
{code}


3. Download the Spark 3.0.0 release, then load the library directly with R 3.4.0 
on another machine:

{code}
library(SparkR, lib.loc = c(file.path("~/spark-3.0.0-bin-hadoop2.7", "R", 
"lib")))
{code}

{code}
# this error message is translated from another language. My R in Mac is in 
Korean
Error listing packages, Error in readRDS(pfile): cannot read workspace version 
3 written by R 3.6.3. R version should be 3.5+
{code}
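
For context on the last two errors: the "workspace version 3" in the message is the serialization format that R writes by default since 3.6.0 and that only R 3.5.0 or newer can read. A minimal sketch of the difference, using {{saveRDS}} with an explicit {{version}}. The object and temp files are made up for illustration, and this does not change how an installed package's own files are written.

```r
# Illustration only: a throwaway object and temp files, not SparkR's package data.
obj <- list(value = 1:3)

v2_file <- tempfile(fileext = ".rds")
v3_file <- tempfile(fileext = ".rds")

# Format version 2 can be read by R releases older than 3.5.0.
saveRDS(obj, v2_file, version = 2)
# Format version 3 needs R 3.5.0 or newer to read; it is the default since R 3.6.0.
saveRDS(obj, v3_file, version = 3)

# infoRDS() (available from R 3.5.0) reports the format version of a file.
v2_info <- infoRDS(v2_file)
v3_info <- infoRDS(v3_file)

# Both files restore the same object on a recent R.
round_trip_ok <- identical(readRDS(v2_file), obj) &&
  identical(readRDS(v3_file), obj)
```

This is why R 3.4.0 fails in cases 2 and 3 regardless of whether the package was built with R 4.0.1 or R 3.6.3: both write version 3 by default.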





[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Shivaram Venkataraman (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142558#comment-17142558
 ] 

Shivaram Venkataraman commented on SPARK-31918:
---

I can confirm that with a build from source of Spark 3.0.0 using R 4.0.2, I see 
the following error while building vignettes.

{{R worker produced errors: Error in lapply(part, FUN) : attempt to bind a 
variable to R_UnboundValue}}




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Shivaram Venkataraman (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142532#comment-17142532
 ] 

Shivaram Venkataraman commented on SPARK-31918:
---

[~hyukjin.kwon] I have R 4.0.2 and will try to do a fresh build from source of 
Spark 3.0.0 




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142520#comment-17142520
 ] 

Dongjoon Hyun commented on SPARK-31918:
---

Unfortunately, no~ I downgraded to R 3.5.2 on both my MacPro and MacBook.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142495#comment-17142495
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

Ah, yeah. I read that one [in the release 
notes|https://cran.r-project.org/doc/manuals/r-devel/NEWS.html].
I was freshly building and testing the package with R 4.0.1, which is why the 
error messages were different ...

{quote}
> Packages need to be (re-)installed under this version (4.0.0) of *R*.
{quote}

I have two environments locally: one with R 4.0.1 and the other with R 3.4.0. 
Although SparkR officially says R 3.1+, we deprecated R < 3.4 in SPARK-26014.
I will test the first option and come back.




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Shivaram Venkataraman (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142466#comment-17142466
 ] 

Shivaram Venkataraman commented on SPARK-31918:
---

Thanks [~hyukjin.kwon]. It looks like there is another problem. From what I saw 
today, R 4.0.0 cannot load packages that were built with R 3.6.0. Thus, when 
SparkR workers try to start up with the pre-built SparkR package, we see a 
failure. I'm not really sure what a good way to handle this is. Options include:
- Building the SparkR package using R 4.0.0 (need to check whether that works 
with R 3.6)
- Copying the package from the driver (where it is usually built) and making 
the SparkR workers use the package installed on the driver

Any other ideas?
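
To see whether a given machine is in this situation, one can compare the R version a package was built under with the running R version. A minimal sketch, assuming the usual {{Built}} field layout ("R x.y.z; ...") in the installed package's DESCRIPTION. The helper names are made up for illustration, and {{stats}} stands in for SparkR so that the snippet runs anywhere.

```r
# Hypothetical helper: the R version recorded when the package was built.
built_under <- function(pkg) {
  built <- utils::packageDescription(pkg, fields = "Built")
  # Keep only the version number from a value such as "R 4.0.1; ; <date>; unix".
  package_version(sub("^R ([0-9.]+);.*$", "\\1", built))
}

# Hypothetical helper: TRUE when a pre-4.0.0 build is loaded by R 4.0.0 or newer.
built_before_r4 <- function(pkg) {
  built_under(pkg) < "4.0.0" && getRversion() >= "4.0.0"
}

# "stats" ships with R itself, so it always matches the running R version.
stats_built <- built_under("stats")
stats_flag <- built_before_r4("stats")
```

Running {{built_before_r4("SparkR")}} on a worker where the pre-built package is on the library path would flag the mismatch described above, under the same assumption about the {{Built}} field.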




[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142127#comment-17142127
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

Just to share what I investigated:

The problem seems to relate to {{processClosure}}, called via {{cleanClosure}}, 
in SparkR.
 From what I observed, there is a problem [when the new environment is set on a 
function|https://github.com/apache/spark/blob/master/R/pkg/R/utils.R#L601], 
especially when it includes generic S4 functions.
 So, for example, if you skip them with the fix below:
{code:java}
diff --git a/R/pkg/R/utils.R b/R/pkg/R/utils.R
index 65db9c21d9d..60cad588f5e 100644
--- a/R/pkg/R/utils.R
+++ b/R/pkg/R/utils.R
@@ -529,7 +529,9 @@ processClosure <- function(node, oldEnv, defVars, checkedFuncs, newEnv) {
 # Namespaces other than "SparkR" will not be searched.
 if (!isNamespace(func.env) ||
 (getNamespaceName(func.env) == "SparkR" &&
-   !(nodeChar %in% getNamespaceExports("SparkR")))) {
+   !(nodeChar %in% getNamespaceExports("SparkR")) &&
+  # Skip all generics under SparkR - R 4.0.0 looks having an issue.
+  !isGeneric(nodeChar, func.env))) {
{code}
{code:java}
* checking re-building of vignette outputs ... OK
{code}
The CRAN check passes with the current master branch on my local machine.

For a minimal reproducer, with this diff:
{code:java}
diff --git a/R/pkg/R/RDD.R b/R/pkg/R/RDD.R
index 7a1d157bb8a..89250c37319 100644
--- a/R/pkg/R/RDD.R
+++ b/R/pkg/R/RDD.R
@@ -487,6 +487,7 @@ setMethod("lapply",
 func <- function(partIndex, part) {
   lapply(part, FUN)
 }
+print(SparkR:::cleanClosure(func)(1, 2))
 lapplyPartitionsWithIndex(X, func)
   })
{code}
run:
{code:java}
createDataFrame(lapply(seq(100), function (e) list(value=e)))
{code}
When {{lapply}} is called on the RDD in {{createDataFrame}}, the cleaned 
closure's environment contains SparkR's {{lapply}} as an S4 method, which leads 
to an error such as {{attempt to bind a variable to R_UnboundValue}}.

Hopefully this is the cause of the issue we are seeing here, and not a problem 
with my environment. cc [~felixcheung], [~dongjoon] FYI.
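
For readers unfamiliar with closure cleaning, here is a much simplified sketch of the idea: copy only the variables a function actually uses into a fresh environment and set that as the function's environment. This is a hypothetical stand-in, not SparkR's {{cleanClosure}}; in particular it only copies variables, whereas the behaviour described above involves functions such as SparkR's {{lapply}} ending up in the new environment.

```r
# Simplified, hypothetical stand-in for closure cleaning; not SparkR's cleanClosure().
clean_closure_sketch <- function(func) {
  old_env <- environment(func)
  new_env <- new.env(parent = globalenv())
  # all.vars() lists variable names used in the body; drop the function's own arguments.
  free_vars <- setdiff(all.vars(body(func)), names(formals(func)))
  for (name in free_vars) {
    if (exists(name, envir = old_env, inherits = FALSE)) {
      assign(name, get(name, envir = old_env, inherits = FALSE), envir = new_env)
    }
  }
  # This is the step the comment above points at: setting the new environment.
  environment(func) <- new_env
  func
}

make_adder <- function(offset) {
  unused <- seq_len(1e6)  # lives in the original environment but is never used
  function(x) x + offset
}

cleaned <- clean_closure_sketch(make_adder(10))
result <- cleaned(1)
captured <- ls(environment(cleaned))
```

After cleaning, only {{offset}} travels with the function, which is the point of the technique when closures are shipped to workers.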

> SparkR CRAN check gives a warning with R 4.0.0 on OSX
> -
>
> Key: SPARK-31918
> URL: https://issues.apache.org/jira/browse/SPARK-31918
> Project: Spark
>  Issue Type: Bug
>  Components: SparkR
>Affects Versions: 2.4.6, 3.0.0
>Reporter: Shivaram Venkataraman
>Priority: Major
>
> When the SparkR package is run through a CRAN check (i.e. with something like 
> R CMD check --as-cran ~/Downloads/SparkR_2.4.6.tar.gz), we rebuild the SparkR 
> vignette as a part of the checks.
> However this seems to be failing with R 4.0.0 on OSX -- both on my local 
> machine and on CRAN 
> https://cran.r-project.org/web/checks/check_results_SparkR.html
> cc [~felixcheung]






[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX

2020-06-22 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142073#comment-17142073
 ] 

Hyukjin Kwon commented on SPARK-31918:
--

It affects Spark 3.0 too, and seems to fail with a different message on my 
local machine:

{code}
* creating vignettes ... ERROR
--- re-building ‘sparkr-vignettes.Rmd’ using rmarkdown
Warning in engine$weave(file, quiet = quiet, encoding = enc) :
  Pandoc (>= 1.12.3) and/or pandoc-citeproc not available. Falling back to R 
Markdown v1.

Attaching package: 'SparkR'

The following objects are masked from 'package:stats':

cov, filter, lag, na.omit, predict, sd, var, window

The following objects are masked from 'package:base':

as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
rank, rbind, sample, startsWith, subset, summary, transform, union

Picked up _JAVA_OPTIONS: -XX:-UsePerfData
Picked up _JAVA_OPTIONS: -XX:-UsePerfData
20/06/22 15:07:34 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
setLogLevel(newLevel).

[Stage 0:>  (0 + 1) / 1]
20/06/22 15:07:43 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: R unexpectedly exited.
{code}

Assuming the errors come from the R execution itself, the root cause might be the same.
