[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15516

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15516#discussion_r85046280

--- Diff: R/pkg/R/DataFrame.R ---
@@ -654,6 +654,33 @@ setMethod("unpersist",
               x
           })

+#' StorageLevel
+#'
+#' Get storage level of this SparkDataFrame.
+#'
+#' @param x the SparkDataFrame to get the storage level.
+#'
+#' @family SparkDataFrame functions
+#' @rdname storageLevel-methods
--- End diff --

this should be `@rdname storageLevel` instead of `@rdname storageLevel-methods`
[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/15516#discussion_r84218510

--- Diff: R/pkg/R/utils.R ---
@@ -385,6 +385,41 @@ getStorageLevel <- function(newLevel = c("DISK_ONLY",
            "OFF_HEAP" = callJStatic(storageLevelClass, "OFF_HEAP"))
 }

+storageLevelToString <- function(levelObj) {
+  useDisk <- callJMethod(levelObj, "useDisk")
+  useMemory <- callJMethod(levelObj, "useMemory")
+  useOffHeap <- callJMethod(levelObj, "useOffHeap")
+  deserialized <- callJMethod(levelObj, "deserialized")
+  replication <- callJMethod(levelObj, "replication")
+  if (!useDisk && !useMemory && !useOffHeap && !deserialized && replication == 1) {
--- End diff --

Python has its own `StorageLevel` class, and the Python-side `storageLevel` code has the same code-duplication problem... Adding an R-side `StorageLevel` class might make the code more complex, and it doesn't seem like it would solve the duplication either. What do you think?
[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15516#discussion_r84168871

--- Diff: R/pkg/R/utils.R ---
@@ -385,6 +385,41 @@ getStorageLevel <- function(newLevel = c("DISK_ONLY",
            "OFF_HEAP" = callJStatic(storageLevelClass, "OFF_HEAP"))
 }

+storageLevelToString <- function(levelObj) {
+  useDisk <- callJMethod(levelObj, "useDisk")
+  useMemory <- callJMethod(levelObj, "useMemory")
+  useOffHeap <- callJMethod(levelObj, "useOffHeap")
+  deserialized <- callJMethod(levelObj, "deserialized")
+  replication <- callJMethod(levelObj, "replication")
+  if (!useDisk && !useMemory && !useOffHeap && !deserialized && replication == 1) {
--- End diff --

Hardcoding the variations in R could be hard to maintain or could easily get out of sync. Is there any way around this? Python seems to be able to get the enum name as a string.
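One alternative to a hardcoded if/else chain, sketched here in Python: build a table of the named levels once (the attribute tuples below are my reading of Spark's standard `StorageLevel` constants, so treat them as assumptions), invert it, and look up the canonical name by the level's attribute tuple. The function name `level_name` and the table are hypothetical, not part of the PR.

```python
# Hypothetical sketch: map a storage level's attribute tuple
# (useDisk, useMemory, useOffHeap, deserialized, replication)
# to its canonical Spark name via a lookup table, instead of one
# hardcoded branch per level. Tuple values are assumed from the
# standard Spark StorageLevel constants.
NAMED_LEVELS = {
    "NONE":              (False, False, False, False, 1),
    "DISK_ONLY":         (True,  False, False, False, 1),
    "DISK_ONLY_2":       (True,  False, False, False, 2),
    "MEMORY_ONLY":       (False, True,  False, True,  1),
    "MEMORY_ONLY_2":     (False, True,  False, True,  2),
    "MEMORY_AND_DISK":   (True,  True,  False, True,  1),
    "MEMORY_AND_DISK_2": (True,  True,  False, True,  2),
}

# Invert the table once: attribute tuple -> canonical name.
_TUPLE_TO_NAME = {attrs: name for name, attrs in NAMED_LEVELS.items()}

def level_name(use_disk, use_memory, use_off_heap, deserialized, replication):
    """Return the canonical name for a level, or None for a custom level."""
    return _TUPLE_TO_NAME.get(
        (use_disk, use_memory, use_off_heap, deserialized, replication))
```

Keeping the names and tuples in one table means adding a new level is a one-line change, which addresses the out-of-sync concern; an unrecognized combination falls through to `None` rather than a wrong name.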
[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15516#discussion_r83921132

--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -795,6 +795,8 @@ test_that("cache(), persist(), and unpersist() on a DataFrame", {
   persist(df, "MEMORY_AND_DISK")
   expect_true(df@env$isCached)

+  expect_equal(storageLevel(df), "StorageLevel(disk, memory, deserialized, 1 replicas)")
--- End diff --

so the output of this doesn't say "MEMORY_AND_DISK"? Should we have that in addition to "StorageLevel(disk, memory, deserialized, 1 replicas)"? It might be confusing to set "MEMORY_AND_DISK" and get "StorageLevel(disk, memory, deserialized, 1 replicas)" back?
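For reference, the description string the test expects can be produced by joining the enabled attributes with the replica count. This is a minimal Python sketch of that formatting (the function name `describe_level` is hypothetical; the output format is taken from the test string above):

```python
# Hypothetical sketch of the "StorageLevel(disk, memory, deserialized,
# 1 replicas)" description: list each enabled attribute in order, then
# append the replica count.
def describe_level(use_disk, use_memory, use_off_heap, deserialized, replication):
    parts = []
    if use_disk:
        parts.append("disk")
    if use_memory:
        parts.append("memory")
    if use_off_heap:
        parts.append("off_heap")
    if deserialized:
        parts.append("deserialized")
    parts.append("%d replicas" % replication)
    return "StorageLevel(%s)" % ", ".join(parts)
```

This attribute-based description stays correct for custom levels that have no canonical name, which is presumably why the output does not simply echo "MEMORY_AND_DISK".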
[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15516#discussion_r83920746

--- Diff: R/pkg/R/DataFrame.R ---
@@ -654,6 +654,33 @@ setMethod("unpersist",
               x
           })

+#' StorageLevel
+#'
+#' Get storage level of this SparkDataFrame.
+#'
+#' @param x the SparkDataFrame to get the storage level.
+#'
+#' @family SparkDataFrame functions
+#' @rdname storageLevel-methods
--- End diff --

change this to `storageLevel`, to match the method name
[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...
GitHub user WeichenXu123 opened a pull request: https://github.com/apache/spark/pull/15516

[SPARK-17961][SparkR][SQL] Add storageLevel to Dataset for SparkR

## What changes were proposed in this pull request?

Add storageLevel to Dataset for SparkR. This is similar to this PR: https://github.com/apache/spark/pull/13780

## How was this patch tested?

Test added.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WeichenXu123/spark storageLevel_df_r

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15516.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15516

commit d4d5c52ba5dab9ec7ad3d41cffdcde70860e57de
Author: WeichenXu
Date: 2016-10-16T02:26:01Z

    update.

commit 4be3e5fe64a11e0ca0a68c45bc9985abe08dbf0d
Author: WeichenXu
Date: 2016-10-17T17:03:35Z

    update.