Corresponding to the Spark 2.3.1 release, I submitted the SparkR build to CRAN yesterday. Unfortunately, it looks like there are a couple of issues (the full message from CRAN is forwarded below):
1. Some builds are started with Java 10 (http://home.apache.org/~shivaram/SparkR_2.3.1_check_results/Debian/00check.log), which are right now counted as test failures. I wonder if we should somehow mark them as skipped? I can ping the CRAN team about this.

2. There is another issue with Java version parsing which unfortunately affects even Java 8 builds. I've created https://issues.apache.org/jira/browse/SPARK-24535 to track this.

Thanks
Shivaram

---------- Forwarded message ---------
From: <uwe.lig...@r-project.org>
Date: Mon, Jun 11, 2018 at 11:24 AM
Subject: [CRAN-pretest-archived] CRAN submission SparkR 2.3.1
To: <shiva...@cs.berkeley.edu>
Cc: <cran-submissi...@r-project.org>

Dear maintainer,

package SparkR_2.3.1.tar.gz does not pass the incoming checks automatically, please see the following pre-tests:

Windows: <https://win-builder.r-project.org/incoming_pretest/SparkR_2.3.1_20180611_200923/Windows/00check.log>
Status: 2 ERRORs, 1 NOTE

Debian: <https://win-builder.r-project.org/incoming_pretest/SparkR_2.3.1_20180611_200923/Debian/00check.log>
Status: 1 ERROR, 1 WARNING, 1 NOTE

Last released version's CRAN status: ERROR: 1, OK: 1
See: <https://CRAN.R-project.org/web/checks/check_results_SparkR.html>

CRAN Web: <https://cran.r-project.org/package=SparkR>

Please fix all problems and resubmit a fixed version via the webform.
If you are not sure how to fix the problems shown, please ask for help on the R-package-devel mailing list:
<https://stat.ethz.ch/mailman/listinfo/r-package-devel>
If you are fairly certain the rejection is a false positive, please reply-all to this message and explain.

More details are given in the directory:
<https://win-builder.r-project.org/incoming_pretest/SparkR_2.3.1_20180611_200923/>
The files will be removed after roughly 7 days.

No strong reverse dependencies to be checked.
Best regards,
CRAN teams' auto-check service

Flavor: r-devel-linux-x86_64-debian-gcc, r-devel-windows-ix86+x86_64
Check: CRAN incoming feasibility, Result: NOTE
  Maintainer: 'Shivaram Venkataraman <shiva...@cs.berkeley.edu>'

  New submission

  Package was archived on CRAN

  Possibly mis-spelled words in DESCRIPTION:
    Frontend (4:10, 5:28)

  CRAN repository db overrides:
    X-CRAN-Comment: Archived on 2018-05-01 as check problems were not
      corrected despite reminders.

Flavor: r-devel-windows-ix86+x86_64
Check: running tests for arch 'i386', Result: ERROR
  Running 'run-all.R' [30s]
Running the tests in 'tests/run-all.R' failed.
Complete output:
  > #
  > # Licensed to the Apache Software Foundation (ASF) under one or more
  > # contributor license agreements.  See the NOTICE file distributed with
  > # this work for additional information regarding copyright ownership.
  > # The ASF licenses this file to You under the Apache License, Version 2.0
  > # (the "License"); you may not use this file except in compliance with
  > # the License.  You may obtain a copy of the License at
  > #
  > #    http://www.apache.org/licenses/LICENSE-2.0
  > #
  > # Unless required by applicable law or agreed to in writing, software
  > # distributed under the License is distributed on an "AS IS" BASIS,
  > # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  > # See the License for the specific language governing permissions and
  > # limitations under the License.
  > #
  >
  > library(testthat)
  > library(SparkR)

  Attaching package: 'SparkR'

  The following objects are masked from 'package:testthat':

      describe, not

  The following objects are masked from 'package:stats':

      cov, filter, lag, na.omit, predict, sd, var, window

  The following objects are masked from 'package:base':

      as.data.frame, colnames, colnames<-, drop, endsWith, intersect, rank,
      rbind, sample, startsWith, subset, summary, transform, union

  >
  > # Turn all warnings into errors
  > options("warn" = 2)
  >
  > if (.Platform$OS.type == "windows") {
  +   Sys.setenv(TZ = "GMT")
  + }
  >
  > # Setup global test environment
  > # Install Spark first to set SPARK_HOME
  >
  > # NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on
  > # CRAN machines. For Jenkins we should already have SPARK_HOME set.
  > install.spark(overwrite = TRUE)
  Overwrite = TRUE: download and overwrite the tar fileand Spark package directory if they exist.
  Spark not found in the cache directory. Installation will start.
  MirrorUrl not provided.
  Looking for preferred site from apache website...
  Preferred mirror site found: http://apache.mirror.digionline.de/spark
  Downloading spark-2.3.1 for Hadoop 2.7 from:
  - http://apache.mirror.digionline.de/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
  trying URL 'http://apache.mirror.digionline.de/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz'
  Content type 'application/x-gzip' length 225883783 bytes (215.4 MB)
  ==================================================
  downloaded 215.4 MB

  Installing to C:\Users\ligges\AppData\Local\Apache\Spark\Cache
  DONE.
  SPARK_HOME set to C:\Users\ligges\AppData\Local\Apache\Spark\Cache/spark-2.3.1-bin-hadoop2.7
  >
  > sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R")
  > sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db")
  > invisible(lapply(sparkRWhitelistSQLDirs,
  +                  function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)}))
  > sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE)
  >
  > sparkRTestMaster <- "local[1]"
  > sparkRTestConfig <- list()
  > if (identical(Sys.getenv("NOT_CRAN"), "true")) {
  +   sparkRTestMaster <- ""
  + } else {
  +   # Disable hsperfdata on CRAN
  +   old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
  +   Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt))
  +   tmpDir <- tempdir()
  +   tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir)
  +   sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg,
  +                            spark.executor.extraJavaOptions = tmpArg)
  + }
  >
  > test_package("SparkR")
  java version "1.8.0_144"
  Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
  Java HotSpot(TM) Client VM (build 25.144-b01, mixed mode)
  Picked up _JAVA_OPTIONS: -XX:-UsePerfData
  -- 1. Error: create DataFrame from list or data.frame (@test_basic.R#21) ------
  subscript out of bounds
  1: sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE, sparkConfig = sparkRTestConfig) at D:/temp/RtmpIJ8Cc3/RLIBS_3242c713c3181/SparkR/tests/testthat/test_basic.R:21
  2: sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, sparkExecutorEnvMap, sparkJars, sparkPackages)
  3: checkJavaVersion()
  4: strsplit(javaVersionFilter[[1]], "[\"]")
  java version "1.8.0_144"
  Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
  Java HotSpot(TM) Client VM (build 25.144-b01, mixed mode)
  Picked up _JAVA_OPTIONS: -XX:-UsePerfData
  -- 2.
  Error: spark.glm and predict (@test_basic.R#53) -------------------------
  subscript out of bounds
  1: sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE, sparkConfig = sparkRTestConfig) at D:/temp/RtmpIJ8Cc3/RLIBS_3242c713c3181/SparkR/tests/testthat/test_basic.R:53
  2: sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, sparkExecutorEnvMap, sparkJars, sparkPackages)
  3: checkJavaVersion()
  4: strsplit(javaVersionFilter[[1]], "[\"]")
  == testthat results ===========================================================
  OK: 0 SKIPPED: 0 FAILED: 2
  1. Error: create DataFrame from list or data.frame (@test_basic.R#21)
  2. Error: spark.glm and predict (@test_basic.R#53)
  Error: testthat unit tests failed
  Execution halted

Flavor: r-devel-windows-ix86+x86_64
Check: running tests for arch 'x64', Result: ERROR
  Running 'run-all.R' [30s]
Running the tests in 'tests/run-all.R' failed.
Complete output:
  > #
  > # Licensed to the Apache Software Foundation (ASF) under one or more
  > # contributor license agreements.  See the NOTICE file distributed with
  > # this work for additional information regarding copyright ownership.
  > # The ASF licenses this file to You under the Apache License, Version 2.0
  > # (the "License"); you may not use this file except in compliance with
  > # the License.  You may obtain a copy of the License at
  > #
  > #    http://www.apache.org/licenses/LICENSE-2.0
  > #
  > # Unless required by applicable law or agreed to in writing, software
  > # distributed under the License is distributed on an "AS IS" BASIS,
  > # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  > # See the License for the specific language governing permissions and
  > # limitations under the License.
  > #
  >
  > library(testthat)
  > library(SparkR)

  Attaching package: 'SparkR'

  The following objects are masked from 'package:testthat':

      describe, not

  The following objects are masked from 'package:stats':

      cov, filter, lag, na.omit, predict, sd, var, window

  The following objects are masked from 'package:base':

      as.data.frame, colnames, colnames<-, drop, endsWith, intersect, rank,
      rbind, sample, startsWith, subset, summary, transform, union

  >
  > # Turn all warnings into errors
  > options("warn" = 2)
  >
  > if (.Platform$OS.type == "windows") {
  +   Sys.setenv(TZ = "GMT")
  + }
  >
  > # Setup global test environment
  > # Install Spark first to set SPARK_HOME
  >
  > # NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on
  > # CRAN machines. For Jenkins we should already have SPARK_HOME set.
  > install.spark(overwrite = TRUE)
  Overwrite = TRUE: download and overwrite the tar fileand Spark package directory if they exist.
  Spark not found in the cache directory. Installation will start.
  MirrorUrl not provided.
  Looking for preferred site from apache website...
  Preferred mirror site found: http://ftp-stud.hs-esslingen.de/pub/Mirrors/ftp.apache.org/dist/spark
  Downloading spark-2.3.1 for Hadoop 2.7 from:
  - http://ftp-stud.hs-esslingen.de/pub/Mirrors/ftp.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
  trying URL 'http://ftp-stud.hs-esslingen.de/pub/Mirrors/ftp.apache.org/dist/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz'
  Content type 'application/x-gzip' length 225883783 bytes (215.4 MB)
  ==================================================
  downloaded 215.4 MB

  Installing to C:\Users\ligges\AppData\Local\Apache\Spark\Cache
  DONE.
  SPARK_HOME set to C:\Users\ligges\AppData\Local\Apache\Spark\Cache/spark-2.3.1-bin-hadoop2.7
  >
  > sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R")
  > sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db")
  > invisible(lapply(sparkRWhitelistSQLDirs,
  +                  function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)}))
  > sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE)
  >
  > sparkRTestMaster <- "local[1]"
  > sparkRTestConfig <- list()
  > if (identical(Sys.getenv("NOT_CRAN"), "true")) {
  +   sparkRTestMaster <- ""
  + } else {
  +   # Disable hsperfdata on CRAN
  +   old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
  +   Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt))
  +   tmpDir <- tempdir()
  +   tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir)
  +   sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg,
  +                            spark.executor.extraJavaOptions = tmpArg)
  + }
  >
  > test_package("SparkR")
  java version "1.8.0_144"
  Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
  Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
  Picked up _JAVA_OPTIONS: -XX:-UsePerfData
  -- 1. Error: create DataFrame from list or data.frame (@test_basic.R#21) ------
  subscript out of bounds
  1: sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE, sparkConfig = sparkRTestConfig) at D:/temp/RtmpIJ8Cc3/RLIBS_3242c713c3181/SparkR/tests/testthat/test_basic.R:21
  2: sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, sparkExecutorEnvMap, sparkJars, sparkPackages)
  3: checkJavaVersion()
  4: strsplit(javaVersionFilter[[1]], "[\"]")
  java version "1.8.0_144"
  Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
  Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
  Picked up _JAVA_OPTIONS: -XX:-UsePerfData
  -- 2.
  Error: spark.glm and predict (@test_basic.R#53) -------------------------
  subscript out of bounds
  1: sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE, sparkConfig = sparkRTestConfig) at D:/temp/RtmpIJ8Cc3/RLIBS_3242c713c3181/SparkR/tests/testthat/test_basic.R:53
  2: sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, sparkExecutorEnvMap, sparkJars, sparkPackages)
  3: checkJavaVersion()
  4: strsplit(javaVersionFilter[[1]], "[\"]")
  == testthat results ===========================================================
  OK: 0 SKIPPED: 0 FAILED: 2
  1. Error: create DataFrame from list or data.frame (@test_basic.R#21)
  2. Error: spark.glm and predict (@test_basic.R#53)
  Error: testthat unit tests failed
  Execution halted

Flavor: r-devel-linux-x86_64-debian-gcc
Check: tests, Result: ERROR
  Running 'run-all.R' [7s/57s]
Running the tests in 'tests/run-all.R' failed.
Complete output:
  > #
  > # Licensed to the Apache Software Foundation (ASF) under one or more
  > # contributor license agreements.  See the NOTICE file distributed with
  > # this work for additional information regarding copyright ownership.
  > # The ASF licenses this file to You under the Apache License, Version 2.0
  > # (the "License"); you may not use this file except in compliance with
  > # the License.  You may obtain a copy of the License at
  > #
  > #    http://www.apache.org/licenses/LICENSE-2.0
  > #
  > # Unless required by applicable law or agreed to in writing, software
  > # distributed under the License is distributed on an "AS IS" BASIS,
  > # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  > # See the License for the specific language governing permissions and
  > # limitations under the License.
  > #
  >
  > library(testthat)
  > library(SparkR)

  Attaching package: 'SparkR'

  The following objects are masked from 'package:testthat':

      describe, not

  The following objects are masked from 'package:stats':

      cov, filter, lag, na.omit, predict, sd, var, window

  The following objects are masked from 'package:base':

      as.data.frame, colnames, colnames<-, drop, endsWith, intersect, rank,
      rbind, sample, startsWith, subset, summary, transform, union

  >
  > # Turn all warnings into errors
  > options("warn" = 2)
  >
  > if (.Platform$OS.type == "windows") {
  +   Sys.setenv(TZ = "GMT")
  + }
  >
  > # Setup global test environment
  > # Install Spark first to set SPARK_HOME
  >
  > # NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on
  > # CRAN machines. For Jenkins we should already have SPARK_HOME set.
  > install.spark(overwrite = TRUE)
  Overwrite = TRUE: download and overwrite the tar fileand Spark package directory if they exist.
  Spark not found in the cache directory. Installation will start.
  MirrorUrl not provided.
  Looking for preferred site from apache website...
  Preferred mirror site found: http://mirror.klaus-uwe.me/apache/spark
  Downloading spark-2.3.1 for Hadoop 2.7 from:
  - http://mirror.klaus-uwe.me/apache/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz
  trying URL 'http://mirror.klaus-uwe.me/apache/spark/spark-2.3.1/spark-2.3.1-bin-hadoop2.7.tgz'
  Content type 'application/x-gzip' length 225883783 bytes (215.4 MB)
  ==================================================
  downloaded 215.4 MB

  Installing to /home/hornik/.cache/spark
  DONE.
  SPARK_HOME set to /home/hornik/.cache/spark/spark-2.3.1-bin-hadoop2.7
  >
  > sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R")
  > sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db")
  > invisible(lapply(sparkRWhitelistSQLDirs,
  +                  function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)}))
  > sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE)
  >
  > sparkRTestMaster <- "local[1]"
  > sparkRTestConfig <- list()
  > if (identical(Sys.getenv("NOT_CRAN"), "true")) {
  +   sparkRTestMaster <- ""
  + } else {
  +   # Disable hsperfdata on CRAN
  +   old_java_opt <- Sys.getenv("_JAVA_OPTIONS")
  +   Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt))
  +   tmpDir <- tempdir()
  +   tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir)
  +   sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg,
  +                            spark.executor.extraJavaOptions = tmpArg)
  + }
  >
  > test_package("SparkR")
  ── 1. Error: create DataFrame from list or data.frame (@test_basic.R#21) ──────
  Java version 8 is required for this package; found version: 10.0.1
  1: sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE, sparkConfig = sparkRTestConfig) at /srv/hornik/tmp/CRAN/SparkR.Rcheck/SparkR/tests/testthat/test_basic.R:21
  2: sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, sparkExecutorEnvMap, sparkJars, sparkPackages)
  3: checkJavaVersion()
  4: stop(paste("Java version", sparkJavaVersion, "is required for this package; found version:", javaVersionStr))
  ── 2.
  Error: spark.glm and predict (@test_basic.R#53) ─────────────────────────
  Java version 8 is required for this package; found version: 10.0.1
  1: sparkR.session(master = sparkRTestMaster, enableHiveSupport = FALSE, sparkConfig = sparkRTestConfig) at /srv/hornik/tmp/CRAN/SparkR.Rcheck/SparkR/tests/testthat/test_basic.R:53
  2: sparkR.sparkContext(master, appName, sparkHome, sparkConfigMap, sparkExecutorEnvMap, sparkJars, sparkPackages)
  3: checkJavaVersion()
  4: stop(paste("Java version", sparkJavaVersion, "is required for this package; found version:", javaVersionStr))
  ══ testthat results ═══════════════════════════════════════════════════════════
  OK: 0 SKIPPED: 0 FAILED: 2
  1. Error: create DataFrame from list or data.frame (@test_basic.R#21)
  2. Error: spark.glm and predict (@test_basic.R#53)
  Error: testthat unit tests failed
  Execution halted

Flavor: r-devel-linux-x86_64-debian-gcc
Check: re-building of vignette outputs, Result: WARNING
  Error in re-building vignettes:
    ...

  Attaching package: 'SparkR'

  The following objects are masked from 'package:stats':

      cov, filter, lag, na.omit, predict, sd, var, window

  The following objects are masked from 'package:base':

      as.data.frame, colnames, colnames<-, drop, endsWith, intersect, rank,
      rbind, sample, startsWith, subset, summary, transform, union

  Quitting from lines 65-67 (sparkr-vignettes.Rmd)
  Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
  Java version 8 is required for this package; found version: 10.0.1
  Execution halted

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
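P.S. Some context on SPARK-24535 for anyone reproducing this: the Windows tracebacks above fail inside checkJavaVersion() at `strsplit(javaVersionFilter[[1]], "[\"]")`. If the filter over the `java -version` output matches no line (extra lines such as "Picked up _JAVA_OPTIONS: ..." can change the output shape), the `[[1]]` indexing itself errors with "subscript out of bounds" before any useful message is produced. A more defensive way to do that parsing could look roughly like the sketch below. This is a hypothetical helper, not the actual SparkR code or the fix that will land in the JIRA:

```r
# Sketch: extract the Java major version from `java -version` output,
# failing with a clear message instead of "subscript out of bounds".
parse_java_major_version <- function(version_lines) {
  # Keep only lines that look like:  java version "1.8.0_144"
  #                             or:  openjdk version "10.0.1" 2018-04-17
  version_line <- grep(" version ", version_lines, value = TRUE)
  if (length(version_line) == 0) {
    stop("Could not find a 'version' line in `java -version` output:\n",
         paste(version_lines, collapse = "\n"))
  }
  # Pull out the quoted version string from the first matching line
  ver <- sub('.* version "([^"]+)".*', "\\1", version_line[1])
  parts <- strsplit(ver, "[.]")[[1]]
  # Java 8 and earlier report "1.8.0_x"; Java 9+ report "9", "10.0.1", ...
  if (parts[1] == "1") as.integer(parts[2]) else as.integer(parts[1])
}

parse_java_major_version('java version "1.8.0_144"')            # 8
parse_java_major_version(c("Picked up _JAVA_OPTIONS: -XX:-UsePerfData",
                           'openjdk version "10.0.1" 2018-04-17'))  # 10
```

The point is just that every assumption about the output shape (a matching line exists, the version is quoted, Java 9+ dropped the "1." prefix) is checked or handled explicitly, so an unexpected Java 10 build fails with the intended "Java version 8 is required" message rather than an opaque indexing error.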