[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140651399 --- Diff: R/pkg/R/column.R --- @@ -238,8 +238,10 @@ setMethod("between", signature(x = "Column"), #' @param x a Column. #' @param dataType a character object describing the target data type. #'See +# nolint start --- End diff -- I just double checked the links. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140651387 --- Diff: dev/lint-r.R --- @@ -24,10 +24,16 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { stop("You should install SparkR in a local directory with `R/install-dev.sh`.") } -# Installs lintr from Github in a local directory. -# NOTE: The CRAN's version is too old to adapt to our rules. -if ("lintr" %in% row.names(installed.packages()) == FALSE) { - devtools::install_github("jimhester/lintr@a769c0b") +# Installs lintr from Github in a local directory if lintr is not installed already or +# the existing lintr is not the specified version. +# +# Note that, the CRAN's version is too old to adapt to our rules. Therefore, we try to +# install lintr from the latest commit in Github, rather than a specific tag or release. +# The current latest is jimhester/lintr@5431140 (see SPARK-22063), the dev version, +# '1.0.1.9000'. +if ("lintr" %in% row.names(installed.packages()) == FALSE || +packageVersion("lintr") != "1.0.1.9000") { --- End diff -- I checked this - when downgrading, upgrading and not installed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140643692 --- Diff: R/pkg/R/DataFrame.R --- @@ -2650,8 +2650,9 @@ setMethod("merge", #' @param suffix a suffix for the column name #' @return list of columns #' -#' @note generateAliasesForIntersectedCols since 1.6.0 -generateAliasesForIntersectedCols <- function(x, intersectedColNames, suffix) { # nolint +#' @note genAliasesForIntersectedCols since 1.6.0 --- End diff -- nit: I'd remove this `@note` too --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140244036 --- Diff: dev/lint-r.R --- @@ -28,6 +28,7 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { # NOTE: The CRAN's version is too old to adapt to our rules. if ("lintr" %in% row.names(installed.packages()) == FALSE) { devtools::install_github("jimhester/lintr@a769c0b") + # devtools::install_github("jimhester/lintr@5431140") --- End diff -- Oh, just to be clear, I didn't change this yet but just commented this for now ... Actually, `devtools::install_github("jimhester/lintr@5431140")` is new one but I didn't change this yet just in case. Will add some comments around this when I add few more changes and clean up soon in any event. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140242203 --- Diff: dev/lint-r.R --- @@ -28,6 +28,7 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { # NOTE: The CRAN's version is too old to adapt to our rules. if ("lintr" %in% row.names(installed.packages()) == FALSE) { devtools::install_github("jimhester/lintr@a769c0b") + # devtools::install_github("jimhester/lintr@5431140") --- End diff -- It looks like the commented is a old code. Maybe a comment to explain it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140238274 --- Diff: dev/lint-r.R --- @@ -28,6 +28,7 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { # NOTE: The CRAN's version is too old to adapt to our rules. if ("lintr" %in% row.names(installed.packages()) == FALSE) { devtools::install_github("jimhester/lintr@a769c0b") + # devtools::install_github("jimhester/lintr@5431140") --- End diff -- Yes .. I think I should add few more changes here if we are going to update this package. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140151179 --- Diff: dev/lint-r.R --- @@ -28,6 +28,7 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { # NOTE: The CRAN's version is too old to adapt to our rules. if ("lintr" %in% row.names(installed.packages()) == FALSE) { devtools::install_github("jimhester/lintr@a769c0b") + # devtools::install_github("jimhester/lintr@5431140") --- End diff -- because of `if ("lintr" %in% row.names(installed.packages()) == FALSE) {` ? it might be the case the Jenkins boxes already have a version installed and this won't update it --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140150499 --- Diff: R/pkg/R/DataFrame.R --- @@ -2649,15 +2651,15 @@ setMethod("merge", #' @return list of columns #' #' @note generateAliasesForIntersectedCols since 1.6.0 -generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) { +generateAliasesForIntersectedCols <- function(x, intersectedColNames, suffix) { # nolint --- End diff -- wait... this isn't public though; it's not in https://github.com/apache/spark/blob/master/R/pkg/NAMESPACE looks like this shouldn't be in the doc (`@noRd` or `#` instead of `#'`) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140150229 --- Diff: R/pkg/R/DataFrame.R --- @@ -2594,12 +2596,12 @@ setMethod("merge", } else { # if by or both by.x and by.y have length 0, use Cartesian Product joinRes <- crossJoin(x, y) - return (joinRes) + return(joinRes) } # sets alias for making colnames unique in dataframes 'x' and 'y' -colsX <- generateAliasesForIntersectedCols(x, by, suffixes[1]) -colsY <- generateAliasesForIntersectedCols(y, by, suffixes[2]) +colsX <- generateAliasesForIntersectedCols(x, by, suffixes[1]) # nolint +colsY <- generateAliasesForIntersectedCols(y, by, suffixes[2]) # nolint --- End diff -- what's the problem here? name too long I think? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r140150600 --- Diff: R/pkg/R/context.R --- @@ -329,7 +329,7 @@ spark.addFile <- function(path, recursive = FALSE) { #' spark.getSparkFilesRootDirectory() #'} #' @note spark.getSparkFilesRootDirectory since 2.1.0 -spark.getSparkFilesRootDirectory <- function() { +spark.getSparkFilesRootDirectory <- function() { # nolint --- End diff -- this one is an API... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919715 --- Diff: R/pkg/R/mllib_tree.R --- @@ -132,10 +132,10 @@ print.summary.decisionTree <- function(x) { #' Gradient Boosted Tree model, \code{predict} to make predictions on new data, and #' \code{write.ml}/\code{read.ml} to save/load fitted models. #' For more details, see -#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-regression}{ -#' GBT Regression} and -#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted-tree-classifier}{ -#' GBT Classification} +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted- +#' tree-regression}{GBT Regression} and +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#gradient-boosted- +#' tree-classifier}{GBT Classification} --- End diff -- links were checked --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919722 --- Diff: R/pkg/R/mllib_tree.R --- @@ -567,10 +569,10 @@ setMethod("write.ml", signature(object = "RandomForestClassificationModel", path #' model, \code{predict} to make predictions on new data, and \code{write.ml}/\code{read.ml} to #' save/load fitted models. #' For more details, see -#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree-regression}{ -#' Decision Tree Regression} and -#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree-classifier}{ -#' Decision Tree Classification} +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree- +#' regression}{Decision Tree Regression} and +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#decision-tree- +#' classifier}{Decision Tree Classification} --- End diff -- links were checked --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919377 --- Diff: R/pkg/R/context.R --- @@ -329,7 +329,7 @@ spark.addFile <- function(path, recursive = FALSE) { #' spark.getSparkFilesRootDirectory() #'} #' @note spark.getSparkFilesRootDirectory since 2.1.0 -spark.getSparkFilesRootDirectory <- function() { +spark.getSparkFilesRootDirectory <- function() { # nolint --- End diff -- Ditto: 30 length identifier limit but exposed in the doc --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919190 --- Diff: R/pkg/R/column.R --- @@ -238,8 +238,8 @@ setMethod("between", signature(x = "Column"), #' @param x a Column. #' @param dataType a character object describing the target data type. #'See -#' \href{https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between-r-and-spark}{ -#'Spark Data Types} for available data types. +#' \href{https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between- +#'r-and-spark}{Spark Data Types} for available data types. --- End diff -- I checked this link is not broken for sure. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919707 --- Diff: R/pkg/R/mllib_tree.R --- @@ -352,10 +353,10 @@ setMethod("write.ml", signature(object = "GBTClassificationModel", path = "chara #' model, \code{predict} to make predictions on new data, and \code{write.ml}/\code{read.ml} to #' save/load fitted models. #' For more details, see -#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-regression}{ -#' Random Forest Regression} and -#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier}{ -#' Random Forest Classification} +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest- +#' regression}{Random Forest Regression} and +#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest- +#' classifier}{Random Forest Classification} --- End diff -- links were checked --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919889 --- Diff: dev/lint-r.R --- @@ -28,6 +28,7 @@ if (! library(SparkR, lib.loc = LOCAL_LIB_LOC, logical.return = TRUE)) { # NOTE: The CRAN's version is too old to adapt to our rules. if ("lintr" %in% row.names(installed.packages()) == FALSE) { devtools::install_github("jimhester/lintr@a769c0b") + # devtools::install_github("jimhester/lintr@5431140") --- End diff -- Here, I didn't yet change this. Up to my knowledge, this doesn't actually try to install this though.. just in case. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139919103 --- Diff: R/pkg/R/DataFrame.R --- @@ -2649,15 +2651,15 @@ setMethod("merge", #' @return list of columns #' #' @note generateAliasesForIntersectedCols since 1.6.0 -generateAliasesForIntersectedCols <- function (x, intersectedColNames, suffix) { +generateAliasesForIntersectedCols <- function(x, intersectedColNames, suffix) { # nolint --- End diff -- Looks this hits 30 length identifier limit but we can't change as exposed in the doc. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19290#discussion_r139918553 --- Diff: R/pkg/.lintr --- @@ -1,2 +1,2 @@ -linters: with_defaults(line_length_linter(100), multiple_dots_linter = NULL, camel_case_linter = NULL, open_curly_linter(allow_single_line = TRUE), closed_curly_linter(allow_single_line = TRUE)) +linters: with_defaults(line_length_linter(100), multiple_dots_linter = NULL, object_name_linter = NULL, camel_case_linter = NULL, open_curly_linter(allow_single_line = TRUE), closed_curly_linter(allow_single_line = TRUE)) --- End diff -- `object_name_linter = NULL` looks required. Otherwise, it complains about inconsistent naming styles. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/19290 [WIP][SPARK-22063][R] Upgrades lintr to latest commit sha1 ID ## What changes were proposed in this pull request? Currently, we set lintr to `jimhester/lintr@a769c0b` (see [this](https://github.com/apache/spark/commit/7d1175011c976756efcd4e4e4f70a8fd6f287026) and [SPARK-14074](https://issues.apache.org/jira/browse/SPARK-14074)). I first tested and checked lintr-1.0.1 but it looks many important fixes are missing (for example, checking 100 length). So, I decided to propose it with the sha1 to https://github.com/jimhester/lintr/commit/5431140ffea65071f1327625d4a8de9688fa7e72. It looks it has fixed many bugs and now finds many instances that I have observed and thought should be caught time to time, here I filed [the results](https://gist.github.com/HyukjinKwon/4f59ddcc7b6487a02da81800baca533c). The downside looks it now takes about 7ish mins, (it was 2ish mins before). ## How was this patch tested? Manually, `./dev/lint-r` after manually updating the lintr package. You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark upgrade-r-lint Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19290.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19290 commit 6e41eff8334e0ce87841941d34fb9b88751f1705 Author: hyukjinkwonDate: 2017-09-19T15:51:23Z Upgrade lintr to latest commit sha1 ID --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org