[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15516


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-25 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15516#discussion_r85046280
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -654,6 +654,33 @@ setMethod("unpersist",
 x
   })
 
+#' StorageLevel
+#'
+#' Get storage level of this SparkDataFrame.
+#'
+#' @param x the SparkDataFrame to get the storage level.
+#'
+#' @family SparkDataFrame functions
+#' @rdname storageLevel-methods
--- End diff --

this should be 
`@rdname storageLevel` instead of 
`@rdname storageLevel-methods`





[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-20 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/15516#discussion_r84218510
  
--- Diff: R/pkg/R/utils.R ---
@@ -385,6 +385,41 @@ getStorageLevel <- function(newLevel = c("DISK_ONLY",
  "OFF_HEAP" = callJStatic(storageLevelClass, "OFF_HEAP"))
 }
 
+storageLevelToString <- function(levelObj) {
+  useDisk <- callJMethod(levelObj, "useDisk")
+  useMemory <- callJMethod(levelObj, "useMemory")
+  useOffHeap <- callJMethod(levelObj, "useOffHeap")
+  deserialized <- callJMethod(levelObj, "deserialized")
+  replication <- callJMethod(levelObj, "replication")
+  if (!useDisk && !useMemory && !useOffHeap && !deserialized && replication == 1) {
--- End diff --

Python has its own `StorageLevel` class, and the Python-side code for
`storageLevel` has the same duplicated-code problem.
Adding an R-side `StorageLevel` class would make the code more complex and
does not seem to help with the duplication.
What do you think?





[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-19 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15516#discussion_r84168871
  
--- Diff: R/pkg/R/utils.R ---
@@ -385,6 +385,41 @@ getStorageLevel <- function(newLevel = c("DISK_ONLY",
  "OFF_HEAP" = callJStatic(storageLevelClass, "OFF_HEAP"))
 }
 
+storageLevelToString <- function(levelObj) {
+  useDisk <- callJMethod(levelObj, "useDisk")
+  useMemory <- callJMethod(levelObj, "useMemory")
+  useOffHeap <- callJMethod(levelObj, "useOffHeap")
+  deserialized <- callJMethod(levelObj, "deserialized")
+  replication <- callJMethod(levelObj, "replication")
+  if (!useDisk && !useMemory && !useOffHeap && !deserialized && replication == 1) {
--- End diff --

Hardcoding the variations in R could be hard to maintain and could easily get
out of sync. Is there another way to do this?
Python seems to be able to get the enum name as a string.
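
To make the maintenance concern concrete, here is a minimal sketch (in Python, for illustration only, not the code from this pull request) of the hardcoded approach: a hand-written table mapping flag combinations to level names. The names `Flags`, `LEVEL_NAMES`, and `level_name` are hypothetical, and the flag tuples are an assumed subset based on Spark's Scala `StorageLevel` constants; they should be checked against the Spark source before being relied on.

```python
from typing import NamedTuple, Optional


class Flags(NamedTuple):
    """Hypothetical container for the five values read from a storage level."""
    use_disk: bool
    use_memory: bool
    use_off_heap: bool
    deserialized: bool
    replication: int


# Assumed subset of level names; every entry here has to be kept in sync
# with the JVM side by hand, which is the concern raised in the review.
LEVEL_NAMES = {
    Flags(False, False, False, False, 1): "NONE",
    Flags(True, False, False, False, 1): "DISK_ONLY",
    Flags(False, True, False, True, 1): "MEMORY_ONLY",
    Flags(True, True, False, True, 1): "MEMORY_AND_DISK",
}


def level_name(flags: Flags) -> Optional[str]:
    """Return the known name for a flag combination, or None if unlisted."""
    return LEVEL_NAMES.get(flags)
```

Any combination missing from the table silently maps to `None`, so a level added on the JVM side would go unnamed until someone updates the table.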





[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-18 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15516#discussion_r83921132
  
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -795,6 +795,8 @@ test_that("cache(), persist(), and unpersist() on a DataFrame", {
   persist(df, "MEMORY_AND_DISK")
   expect_true(df@env$isCached)
 
+  expect_equal(storageLevel(df), "StorageLevel(disk, memory, deserialized, 1 replicas)")
--- End diff --

So the output of this doesn't say "MEMORY_AND_DISK"? Should we return that in
addition to "StorageLevel(disk, memory, deserialized, 1 replicas)"? It could be
confusing to set "MEMORY_AND_DISK" and get "StorageLevel(disk, memory,
deserialized, 1 replicas)" back.
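
For reference, the descriptive string in the test can be rebuilt from the individual flags. The sketch below (in Python, for illustration only) is inferred from the expected string in the test above and is not taken from this pull request or from Spark itself. The function name `format_storage_level` is hypothetical, and the `offheap` token is an assumption, since the test string does not show it.

```python
def format_storage_level(use_disk: bool, use_memory: bool, use_off_heap: bool,
                         deserialized: bool, replication: int) -> str:
    """Build a descriptive string in the style of the test's expected value."""
    parts = []
    if use_disk:
        parts.append("disk")
    if use_memory:
        parts.append("memory")
    if use_off_heap:
        parts.append("offheap")  # assumed token, not shown in the test string
    if deserialized:
        parts.append("deserialized")
    parts.append(f"{replication} replicas")
    return "StorageLevel(" + ", ".join(parts) + ")"


# Flags assumed to correspond to the "MEMORY_AND_DISK" level set in the test.
print(format_storage_level(True, True, False, True, 1))
```

The string describes the flags rather than the name passed to `persist()`, which is why the caller does not see "MEMORY_AND_DISK" come back.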





[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-18 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/15516#discussion_r83920746
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -654,6 +654,33 @@ setMethod("unpersist",
 x
   })
 
+#' StorageLevel
+#'
+#' Get storage level of this SparkDataFrame.
+#'
+#' @param x the SparkDataFrame to get the storage level.
+#'
+#' @family SparkDataFrame functions
+#' @rdname storageLevel-methods
--- End diff --

Change this to `storageLevel` to match the method name.





[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-17 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/15516

[SPARK-17961][SparkR][SQL] Add storageLevel to Dataset for SparkR

## What changes were proposed in this pull request?

Add storageLevel to Dataset for SparkR.
This is similar to this PR: https://github.com/apache/spark/pull/13780

## How was this patch tested?

Test added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark storageLevel_df_r

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15516.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15516


commit d4d5c52ba5dab9ec7ad3d41cffdcde70860e57de
Author: WeichenXu 
Date:   2016-10-16T02:26:01Z

update.

commit 4be3e5fe64a11e0ca0a68c45bc9985abe08dbf0d
Author: WeichenXu 
Date:   2016-10-17T17:03:35Z

update.



