[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155697#comment-17155697 ]

Shivaram Venkataraman commented on SPARK-31918:

Yes, this is the reason that SparkR has been temporarily removed from CRAN. We need a new release before we can upload a new version, and AFAIK efforts to release Spark 2.4.7 and Spark 3.0.1 are ongoing. cc'ing the release managers [~holden] [~ruifengz] [~prashant]

> SparkR CRAN check gives a warning with R 4.0.0 on OSX
> -----------------------------------------------------
>
> Key: SPARK-31918
> URL: https://issues.apache.org/jira/browse/SPARK-31918
> Project: Spark
> Issue Type: Bug
> Components: SparkR
> Affects Versions: 2.4.6, 3.0.0
> Reporter: Shivaram Venkataraman
> Assignee: Hyukjin Kwon
> Priority: Blocker
> Fix For: 2.4.7, 3.0.1, 3.1.0
>
>
> When the SparkR package is run through a CRAN check (i.e. with something like
> R CMD check --as-cran ~/Downloads/SparkR_2.4.6.tar.gz), we rebuild the SparkR
> vignette as a part of the checks.
> However this seems to be failing with R 4.0.0 on OSX -- both on my local
> machine and on CRAN
> https://cran.r-project.org/web/checks/check_results_SparkR.html
> cc [~felixcheung]

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155512#comment-17155512 ]

Michael Chirico commented on SPARK-31918:

Hey folks, I just saw that SparkR was removed from CRAN. I assume it's related to this issue? https://cran.r-project.org/web/packages/SparkR/index.html

Is a new submission in progress now that this issue is fixed? Please let me know if there's any way I can help as well.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143749#comment-17143749 ]

Apache Spark commented on SPARK-31918:

User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/28922
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142862#comment-17142862 ]

Apache Spark commented on SPARK-31918:

User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/28907
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142859#comment-17142859 ]

Apache Spark commented on SPARK-31918:

User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/28907
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142837#comment-17142837 ]

Hyukjin Kwon commented on SPARK-31918:

OK, I finally got all the tests passing. I will open a PR soon.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142657#comment-17142657 ]

Hyukjin Kwon commented on SPARK-31918:

With SparkR built by R 4.0.1 and run as is on R 3.6.3, the tests pass except for one failure, which I don't think is a big deal:

{code}
Warning message:
package ‘SparkR’ was built under R version 4.0.1
Spark package found in SPARK_HOME: /.../spark
══ testthat results ═══
[ OK: 13 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 0 ]
✔ | OK F W S | Context
✔ | 11 | binary functions [3.7 s]
✔ | 4 | functions on binary files [3.7 s]
✔ | 2 | broadcast variables [0.8 s]
✔ | 5 | functions in client.R
✔ | 46 | test functions in sparkR.R [10.1 s]
✔ | 2 | include R packages [0.5 s]
✔ | 2 | JVM API [0.3 s]
✔ | 70 | MLlib classification algorithms, except for tree-based algorithms [93.1 s]
✔ | 70 | MLlib clustering algorithms [38.8 s]
✔ | 6 | MLlib frequent pattern mining [3.0 s]
✔ | 8 | MLlib recommendation algorithms [9.9 s]
✔ | 128 | MLlib regression algorithms, except for tree-based algorithms [63.9 s]
✔ | 8 | MLlib statistics algorithms [0.5 s]
✔ | 94 | MLlib tree-based algorithms [81.2 s]
✔ | 29 | parallelize() and collect() [0.5 s]
✔ | 428 | basic RDD functions [21.1 s]
✔ | 39 | SerDe functionality [2.1 s]
✔ | 20 | partitionBy, groupByKey, reduceByKey etc. [3.3 s]
✔ | 4 | functions in sparkR.R
✔ | 16 | SparkSQL Arrow optimization [20.3 s]
✔ | 6 | test show SparkDataFrame when eager execution is enabled. [1.3 s]
✖ | 1172 1 | SparkSQL functions [156.4 s]
test_sparkSQL.R:2719: error: mutate(), transform(), rename() and names()
could not find function "deparse1"
Backtrace:
 1. base::attach(airquality) tests/fulltests/test_sparkSQL.R:2719:2
 2. base::attach(airquality)
✔ | 42 | Structured Streaming [520.2 s]
✔ | 16 | tests RDD function take() [0.9 s]
✔ | 14 | the textFile() function [2.6 s]
✔ | 46 | functions in utils.R [0.5 s]
✔ | 0 1 | Windows-specific tests
test_Windows.R:22: skip: sparkJars tag in SparkContext
Reason: This test is only for Windows, skipped
══ Results ═
Duration: 1039.0 s
{code}

The test failure seems to be due to the missing {{deparse1}}, which was added in R 4.0.0. I think we can just guide people to use https://github.com/r-lib/backports if this is an issue. The test case itself doesn't look like a big deal.

I will take a closer look to get it working in R 4.0.0.
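For anyone who hits the missing {{deparse1}} on R older than 4.0.0, a small shim along the following lines may be enough. This is only a sketch under my own assumptions, not what SparkR or the backports package actually ships, and the name {{deparse1_compat}} is hypothetical.

```r
# Hypothetical shim (not part of SparkR or backports): use base::deparse1 when
# the running R provides it (R >= 4.0.0); otherwise collapse deparse() output
# into a single string.
deparse1_compat <- function(expr, collapse = " ", width.cutoff = 500L, ...) {
  if (exists("deparse1", envir = baseenv(), inherits = FALSE)) {
    base_deparse1 <- get("deparse1", envir = baseenv())
    return(base_deparse1(expr, collapse = collapse,
                         width.cutoff = width.cutoff, ...))
  }
  paste(deparse(expr, width.cutoff = width.cutoff, ...), collapse = collapse)
}

# The shim always returns exactly one string, whichever branch is taken.
short_call <- deparse1_compat(quote(x + y))
print(short_call)
print(length(short_call))
```

The fallback branch mirrors how {{deparse1}} is documented (deparse, then paste with {{collapse}}), so callers should see the same single-string result on both old and new R.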
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142620#comment-17142620 ]

Hyukjin Kwon commented on SPARK-31918:

I tested it manually with the fix I mentioned [here|https://issues.apache.org/jira/browse/SPARK-31918?focusedCommentId=17142127=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17142127]. Let me test that case too.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142618#comment-17142618 ]

Shivaram Venkataraman commented on SPARK-31918:

That's great, [~hyukjin.kwon] -- so we can get around the installation issue if we can build on R 4.0.0. However, I guess we will still have the serialization issue. BTW, does the serialization issue go away if we build in R 4.0.0 and run with R 3.6.3?
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142605#comment-17142605 ]

Hyukjin Kwon commented on SPARK-31918:

Okay, [~shivaram], the first option seems to work, although it shows a warning like the one below. I built Spark 3.0.0 with R 4.0.1 and then manually downgraded to R 3.6.3.

{code:java}
During startup - Warning message:
package ‘SparkR’ was built under R version 4.0.1
{code}

I removed the unrelated comments I left above.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142591#comment-17142591 ]

Hyukjin Kwon commented on SPARK-31918:

Oh, wait, the worker should test SparkR built with R 4.0.1. In the first case, I guess the R worker loaded the one from the Spark 3.0.0 download (which was built with R 3.6.3). Let me test it by overwriting it.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142566#comment-17142566 ]

Hyukjin Kwon commented on SPARK-31918:

Nice, [~shivaram]. I just quickly tested, and the first option is not working.

1. Build Spark 3.0.0 in R 4.0.1 and install it from source with R 3.4.0 on another machine:

{code}
install.packages("SparkR_3.0.0.tar.gz", repos = NULL, type = "source")
{code}

{code}
df <- createDataFrame(lapply(seq(100), function (e) list(value=e)))
count(dapply(df, function(x) as.data.frame(x[x$value < 50,]), schema(df)))
{code}

It shows the same error as shown in https://cran.r-project.org/web/checks/check_results_SparkR.html

2. Build Spark 3.0.0 in R 4.0.1 and load the library directly with R 3.4.0 on another machine:

{code}
library(SparkR, lib.loc = c(file.path("~/spark-3.0.0-bin-hadoop2.7", "R", "lib")))
{code}

{code}
# this error message is translated from another language. My R in Mac is in Korean
Error listing packages, Error in readRDS(pfile): cannot read workspace version 3 written by R 4.0.1. R version should be 3.5+
{code}

3. Download the Spark 3.0.0 release and load the library directly with R 3.4.0 on another machine:

{code}
library(SparkR, lib.loc = c(file.path("~/spark-3.0.0-bin-hadoop2.7", "R", "lib")))
{code}

{code}
# this error message is translated from another language. My R in Mac is in Korean
Error listing packages, Error in readRDS(pfile): cannot read workspace version 3 written by R 3.6.3. R version should be 3.5+
{code}
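The "cannot read workspace version 3" errors above come from R's serialization format: version 3 was introduced in R 3.5.0 and became the default in R 3.6.0, so R 3.4.0 cannot read it. As a rough sketch, the format version of serialized data can be inspected as below. This assumes the default XDR binary layout (a 2-byte format marker followed by a 4-byte big-endian version number) and needs R 3.5.0 or later to run; the helper name {{serialization_version}} is hypothetical.

```r
# Hypothetical helper: read the format version from the header of a raw vector
# produced by serialize(). Assumes the default XDR binary layout.
serialization_version <- function(bytes) {
  stopifnot(is.raw(bytes), length(bytes) >= 6L)
  # XDR binary serialization starts with the two bytes "X\n".
  stopifnot(identical(bytes[1:2], as.raw(c(0x58, 0x0a))))
  # The next four bytes hold the format version as a big-endian integer.
  readBin(bytes[3:6], what = "integer", n = 1L, size = 4L, endian = "big")
}

payload <- list(value = 1)
old_format <- serialize(payload, connection = NULL, version = 2)
new_format <- serialize(payload, connection = NULL, version = 3)

print(serialization_version(old_format))
print(serialization_version(new_format))
```

Note that files written by {{saveRDS}} are compressed by default, so this helper applies to the raw output of {{serialize}} rather than to an .rds file on disk.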
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142558#comment-17142558 ]

Shivaram Venkataraman commented on SPARK-31918:

I can confirm that with Spark 3.0.0 built from source and R 4.0.2, I see the following error while building vignettes:

{{R worker produced errors: Error in lapply(part, FUN) : attempt to bind a variable to R_UnboundValue}}
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142532#comment-17142532 ]

Shivaram Venkataraman commented on SPARK-31918:

[~hyukjin.kwon] I have R 4.0.2 and will try to do a fresh build from source of Spark 3.0.0
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142520#comment-17142520 ]

Dongjoon Hyun commented on SPARK-31918:

Unfortunately, no~ I downgraded to R 3.5.2 on both my MacPro and MacBook.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142495#comment-17142495 ]

Hyukjin Kwon commented on SPARK-31918:

Ah, yeah. I read that one [in the release notes|https://cran.r-project.org/doc/manuals/r-devel/NEWS.html]. I was freshly building and testing the package with R 4.0.1, so that was why the error messages were different ...

{quote}
> Packages need to be (re-)installed under this version (4.0.0) of *R*.
{quote}

I have two environments locally: one with R 4.0.1, the other with R 3.4.0. Although it officially says R 3.1+, we deprecated R < 3.4 in SPARK-26014. I will test the first option out and come back.
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142466#comment-17142466 ]

Shivaram Venkataraman commented on SPARK-31918:

Thanks [~hyukjin.kwon]. It looks like there is another problem. From what I saw today, R 4.0.0 cannot load packages that were built with R 3.6.0. Thus, when SparkR workers try to start up with the pre-built SparkR package, we see a failure. I'm not really sure what a good way to handle this is. Options include:
- Build the SparkR package using R 4.0.0 (need to check if that works with R 3.6)
- Copy the package from the driver (where it is usually built) and make the SparkR workers use the package installed on the driver

Any other ideas?
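To make the build-version mismatch described above easier to spot, one could compare the R version recorded in an installed package's {{Built}} field with the running R. This is only a sketch under my own assumptions, not something SparkR does; the helper names are hypothetical, and {{stats}} is used as the example only because it is always installed.

```r
# Hypothetical helper: the R version an installed package was built under,
# parsed from the "Built" field of its DESCRIPTION (e.g. "R 4.0.2; ; ...; unix").
built_r_version <- function(pkg, lib.loc = NULL) {
  built <- utils::packageDescription(pkg, lib.loc = lib.loc, fields = "Built")
  if (is.na(built)) stop("no 'Built' field found for package: ", pkg)
  package_version(sub("^R ([0-9]+(\\.[0-9]+)*);.*$", "\\1", built))
}

# TRUE when the running R is 4.0.0 or later but the package was built under an
# older R, which is the situation where a reinstall is needed.
built_before_r4 <- function(pkg, lib.loc = NULL) {
  getRversion() >= "4.0.0" && built_r_version(pkg, lib.loc) < "4.0.0"
}

print(built_r_version("stats"))
print(built_before_r4("stats"))
```

For SparkR itself, {{lib.loc}} would point at the library directory the workers load from, for example the R/lib directory under SPARK_HOME.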
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142127#comment-17142127 ]

Hyukjin Kwon commented on SPARK-31918:

Just to share what I investigated:

The problem seems to be related to {{processClosure}}, called via {{cleanClosure}}, in SparkR. From what I observed, there is a problem [when the new environment is set on a function|https://github.com/apache/spark/blob/master/R/pkg/R/utils.R#L601], especially when that environment includes S4 generic functions.

So, for example, if you skip them with the fix below:

{code:java}
diff --git a/R/pkg/R/utils.R b/R/pkg/R/utils.R
index 65db9c21d9d..60cad588f5e 100644
--- a/R/pkg/R/utils.R
+++ b/R/pkg/R/utils.R
@@ -529,7 +529,9 @@ processClosure <- function(node, oldEnv, defVars, checkedFuncs, newEnv) {
         # Namespaces other than "SparkR" will not be searched.
         if (!isNamespace(func.env) ||
             (getNamespaceName(func.env) == "SparkR" &&
-               !(nodeChar %in% getNamespaceExports("SparkR")))) {
+               !(nodeChar %in% getNamespaceExports("SparkR")) &&
+               # Skip all generics under SparkR - R 4.0.0 looks having an issue.
+               !isGeneric(nodeChar, func.env))) {
{code}

{code:java}
* checking re-building of vignette outputs ... OK
{code}

The CRAN check passes with the current master branch on my local machine.

For a minimal reproducer, with this diff:

{code:java}
diff --git a/R/pkg/R/RDD.R b/R/pkg/R/RDD.R
index 7a1d157bb8a..89250c37319 100644
--- a/R/pkg/R/RDD.R
+++ b/R/pkg/R/RDD.R
@@ -487,6 +487,7 @@ setMethod("lapply",
             func <- function(partIndex, part) {
               lapply(part, FUN)
             }
+            print(SparkR:::cleanClosure(func)(1, 2))
             lapplyPartitionsWithIndex(X, func)
           })
{code}

run:

{code:java}
createDataFrame(lapply(seq(100), function (e) list(value=e)))
{code}

When {{lapply}} is called against the RDD in {{createDataFrame}}, the cleaned closure's environment contains SparkR's {{lapply}} as an S4 method, which leads to an error such as {{attempt to bind a variable to R_UnboundValue}}.

Hopefully this is the cause of the issue happening here, and not an issue in my env.

cc [~felixcheung], [~dongjoon] FYI.
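To illustrate what "setting a new environment on a function" means here, below is a heavily simplified base R sketch of closure cleaning. It is an assumption-laden illustration, not SparkR's actual {{cleanClosure}}/{{processClosure}} (which walk the function body recursively and treat namespaces specially), and the names {{clean_closure_sketch}} and {{make_adder}} are hypothetical.

```r
# Hypothetical, simplified closure cleaning: copy only the variables the
# function body refers to from its defining environment into a fresh
# environment, then attach that environment to the function.
clean_closure_sketch <- function(func) {
  old_env <- environment(func)
  new_env <- new.env(parent = globalenv())
  for (name in all.names(body(func), unique = TRUE)) {
    # Only look in the defining environment itself, not in its parents.
    if (exists(name, envir = old_env, inherits = FALSE)) {
      assign(name, get(name, envir = old_env, inherits = FALSE), envir = new_env)
    }
  }
  environment(func) <- new_env
  func
}

# The returned closure needs `n` but not `unused`.
make_adder <- function(n) {
  unused <- seq_len(1000)
  function(x) x + n
}

add5 <- clean_closure_sketch(make_adder(5))
print(add5(1))
print(ls(environment(add5)))
```

In this sketch only plain values are copied. The failure discussed above arises when the object copied into the new environment is an S4 generic or method from SparkR, which is why the proposed fix skips generics.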
[jira] [Commented] (SPARK-31918) SparkR CRAN check gives a warning with R 4.0.0 on OSX
[ https://issues.apache.org/jira/browse/SPARK-31918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17142073#comment-17142073 ]

Hyukjin Kwon commented on SPARK-31918:

It affects Spark 3.0 too, and it seems to fail with a different message on my local machine:

{code}
* creating vignettes ... ERROR
--- re-building ‘sparkr-vignettes.Rmd’ using rmarkdown
Warning in engine$weave(file, quiet = quiet, encoding = enc) :
  Pandoc (>= 1.12.3) and/or pandoc-citeproc not available. Falling back to R Markdown v1.

Attaching package: 'SparkR'

The following objects are masked from 'package:stats':

    cov, filter, lag, na.omit, predict, sd, var, window

The following objects are masked from 'package:base':

    as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
    rank, rbind, sample, startsWith, subset, summary, transform, union

Picked up _JAVA_OPTIONS: -XX:-UsePerfData
Picked up _JAVA_OPTIONS: -XX:-UsePerfData
20/06/22 15:07:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
[Stage 0:> (0 + 1) / 1]
20/06/22 15:07:43 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: R unexpectedly exited.
{code}

Assuming the errors come from the R execution itself, the root cause might be the same.