[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-18 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-70450284 I've tested this PR but the result seems to be off. Parquet generated from Hive with timestamp values set by 'from_utc_timestamp('1970-01-01 08:00:00','PST

[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-21 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/3820#issuecomment-70904105 Good to hear. Here's how I create my test data: I run this in Hive, then take the data from HDFS directly, and Spark is able to read/parse the data file

[GitHub] spark pull request: [SPARK-4987] [SQL] parquet timestamp type supp...

2015-01-20 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/3820#discussion_r23256480 --- Diff: docs/sql-programming-guide.md --- @@ -581,6 +581,15 @@ Configuration of Parquet can be done using the `setConf` method on SQLContext

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-24 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/5096#issuecomment-85816756 @redbaron, @oscaroboto The same applies to memory consumption, I'm afraid. There isn't a way to constrain how much [R](https://stat.ethz.ch/R-manual/R-devel/library

[GitHub] spark pull request: [SPARK-8307] [SQL] improve timestamp from parq...

2015-06-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/6759#discussion_r32299296 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetConverter.scala --- @@ -498,69 +493,21 @@ private[parquet] object

[GitHub] spark pull request: [SPARK-6797][SPARKR] Add support for YARN clus...

2015-07-01 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/6743#discussion_r33654496 --- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala --- @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-9317] [SPARKR] Change `show` to print D...

2015-08-21 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8360#issuecomment-133523560 From what I can infer from the original JIRA, we are trying to match R data.frame behavior. I think it is handy, though it is easy to think of several

[GitHub] spark pull request: [SPARK-9317] [SPARKR] Change `show` to print D...

2015-08-21 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/8360 [SPARK-9317] [SPARKR] Change `show` to print DataFrame entries Small update to DataFrame API in SparkR @shivaram You can merge this pull request into a Git repository by running

[GitHub] spark pull request: [SPARK-9316] [SPARKR] Add support for filterin...

2015-08-24 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/8394 [SPARK-9316] [SPARKR] Add support for filtering using `[` (synonym for filter / select) Add support for ``` df[df$name == "Smith", c(1,2)] df[df$age %in% c(19, 30), 1:2

[GitHub] spark pull request: [SPARK-8742][SPARKR] Improve SparkR error mess...

2015-07-29 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/7742#discussion_r35783192 --- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala --- @@ -69,8 +69,11 @@ private[r] class RBackendHandler(server: RBackend

[GitHub] spark pull request: [SPARK-8742][SPARKR] Improve SparkR error mess...

2015-07-29 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/7742#discussion_r35783283 --- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala --- @@ -148,6 +151,9 @@ private[r] class RBackendHandler(server: RBackend

[GitHub] spark pull request: [SPARK-8742][SPARKR] Improve SparkR error mess...

2015-07-30 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/7742#issuecomment-126209438 looks good! thanks --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-8742][SPARKR] Improve SparkR error mess...

2015-07-30 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/7742#discussion_r35901689 --- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala --- @@ -148,6 +151,9 @@ private[r] class RBackendHandler(server: RBackend

[GitHub] spark pull request: [SPARK-10971][SPARKR] RRunner should allow set...

2015-10-22 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9179#issuecomment-150311495 +1 on `spark.r.driver.command` and `spark.r.command`

[GitHub] spark pull request: [SPARKR] [SPARK-11199] Improve R context manag...

2015-10-22 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9185#issuecomment-150318041 I vote for simplicity in SparkR and not having multiple sessions. In fact, I observe it is already messy to handle a DataFrame created by a different SparkContext

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-10-23 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9218#discussion_r42899259 --- Diff: R/pkg/R/DataFrame.R --- @@ -276,6 +276,57 @@ setMethod("names<-", } }) +#' @r

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2015-10-22 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9192#discussion_r42716848 --- Diff: R/pkg/R/SQLContext.R --- @@ -17,6 +17,34 @@ # SQLcontext.R: SQLContext-driven functions +#' Temporary function to reroute

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-10-22 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9218 [SPARK-9319][SPARKR] Add support for setting column names, types Add support for colnames, colnames<-, coltypes<- I will merge with PR 8984 (coltypes) once it is in, po

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-10-22 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9218#discussion_r42786136 --- Diff: R/pkg/R/DataFrame.R --- @@ -276,6 +276,57 @@ setMethod("names<-", } }) +#' @r

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-10-22 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9218#issuecomment-150311963 @sun-rui `names` and `names<-` are already there, this is to add `colnames`.

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2015-10-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9192#discussion_r42664614 --- Diff: R/pkg/R/SQLContext.R --- @@ -17,6 +17,34 @@ # SQLcontext.R: SQLContext-driven functions +#' Temporary function to reroute

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2015-10-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9192#discussion_r42662379 --- Diff: R/pkg/R/SQLContext.R --- @@ -17,6 +17,34 @@ # SQLcontext.R: SQLContext-driven functions +#' Temporary function to reroute

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2015-10-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9192#discussion_r42659643 --- Diff: R/pkg/R/SQLContext.R --- @@ -17,6 +17,34 @@ # SQLcontext.R: SQLContext-driven functions +#' Temporary function to reroute

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2015-10-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9192#discussion_r42659746 --- Diff: R/pkg/R/SQLContext.R --- @@ -17,6 +17,34 @@ # SQLcontext.R: SQLContext-driven functions +#' Temporary function to reroute

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-10-22 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9218#discussion_r42801664 --- Diff: R/pkg/R/DataFrame.R --- @@ -276,6 +276,57 @@ setMethod("names<-", } }) +#' @r

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150382796 Do you have benchmark numbers for this change?

[GitHub] spark pull request: [SPARK-8277][SPARKR] Faster createDataFrame us...

2015-10-22 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9234#issuecomment-150382185 Hi thanks for the contribution, you might want to check out the ongoing work in https://github.com/apache/spark/pull/9099 and SPARK-11086

[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...

2015-10-26 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9290 [SPARK-11340][SPARKR] Support setting driver properties when starting Spark from R programmatically or from RStudio Mapping spark.driver.memory from sparkEnvir to spark-submit commandline

[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...

2015-10-26 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9290#issuecomment-151377437 Manual testing with: ``` library(SparkR, lib.loc='/opt/spark-1.6.0-bin-hadoop2.6/R/lib') sc <- sparkR.init(master = "local[*]", spark

[GitHub] spark pull request: [SPARK-11210][SPARKR] Add window functions int...

2015-10-26 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9196#issuecomment-151381136 looks good!

[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...

2015-10-26 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9290#issuecomment-151378899 I checked, the user could also set SPARK_DRIVER_MEMORY before running `sparkR.init()` https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-10-26 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9218#issuecomment-151379530 @sun-rui That's a great point, `coltypes()` as its signature is defined, would only return a list of simple types. But how would one create a DataFrame

[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...

2015-10-29 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9290#discussion_r43430720 --- Diff: R/pkg/R/sparkR.R --- @@ -93,7 +93,7 @@ sparkR.stop <- function() { #' sc <- sparkR.init("local[2]", "

[GitHub] spark pull request: [SPARK-8019] [SPARKR] Support SparkR spawning ...

2015-10-29 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/6557#issuecomment-152292218 This is updated by #9179

[GitHub] spark pull request: [SPARK-11409][SPARKR] Enable url link in R doc...

2015-10-29 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9363 [SPARK-11409][SPARKR] Enable url link in R doc for Persist Quick one-line doc fix: the link is not clickable ![image](https://cloud.githubusercontent.com/assets/8969467/10833041/4e91dd7c

[GitHub] spark pull request: [SPARK-11294][SPARKR] Improve R doc for read.d...

2015-10-23 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9261 [SPARK-11294][SPARKR] Improve R doc for read.df, write.df, saveAsTable Add examples for read.df, write.df; fix grouping for read.df, loadDF; fix formatting and text truncation for write.df

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-27 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8984#issuecomment-151611692 @shivaram as I was discussing with @sun-rui in #9218 - I think coltypes() could probably handle complex type (JVM -> R) by mapping "map<string,int>"

[GitHub] spark pull request: [SPARK-11343] [ML] Allow float and double pred...

2015-10-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9296#discussion_r43195310 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/RegressionEvaluator.scala --- @@ -72,10 +73,13 @@ final class RegressionEvaluator @Since

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-27 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8984#issuecomment-151654117 The error looks different: possibly related, but not exactly the same cause

[GitHub] spark pull request: [SPARK-11215] [ML] Add multiple columns suppor...

2015-10-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9183#discussion_r43221520 --- Diff: R/pkg/inst/tests/test_mllib.R --- @@ -56,14 +56,3 @@ test_that("feature interaction vs native glm", { rVals <- predict(gl

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-28 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8984#issuecomment-152069034 I'm concerned with the lack of "reversibility" as in `coltypes(x) <- coltypes(x)` Could I propose expanding on your suggestion to return `na` f

[GitHub] spark pull request: [SPARK-11329] [SQL] Support star expansion for...

2015-10-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9343#discussion_r43342070 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala --- @@ -146,7 +146,11 @@ case class Alias(child

[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...

2015-10-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9290#discussion_r43346365 --- Diff: R/pkg/R/sparkR.R --- @@ -93,7 +93,7 @@ sparkR.stop <- function() { #' sc <- sparkR.init("local[2]", "

[GitHub] spark pull request: [SPARK-11210][SPARKR][WIP] Add window function...

2015-10-21 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9196#issuecomment-150021663 This is merging 2 PR/JIRA?

[GitHub] spark pull request: [SPARK-11284] [ML] ALS produces float predicti...

2015-10-23 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9252#issuecomment-150700335 Shouldn't this be fixed/casted in the `RegressionEvaluator` instead?

[GitHub] spark pull request: [SPARK-11263][SPARKR] lintr Throws Warnings on...

2015-11-10 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9463#issuecomment-155598069 @shivaram ?

[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

2015-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9489#discussion_r44482198 --- Diff: R/pkg/R/functions.R --- @@ -974,6 +1006,54 @@ setMethod("soundex", column(jc) })

[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

2015-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9489#discussion_r44482223 --- Diff: R/pkg/R/functions.R --- @@ -1168,6 +1248,54 @@ setMethod("upper", column(jc) }) +#

[GitHub] spark pull request: [SPARK-11567][PYTHON] Add Python API for corr ...

2015-11-10 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9536#issuecomment-155597890 @davis?

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() (New v...

2015-11-09 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9579#issuecomment-155258819 doc comment, looks good to me otherwise. @sun-rui @shivaram

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() (New v...

2015-11-09 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9579#discussion_r44361878 --- Diff: R/pkg/R/DataFrame.R --- @@ -2152,3 +2152,47 @@ setMethod("with", newEnv <- assignNewEnv(data) ev

[GitHub] spark pull request: Flaky SparkR test: test_sparkSQL.R: sample on ...

2015-11-08 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9549#issuecomment-154880937 @shivaram @adrian555

[GitHub] spark pull request: Flaky SparkR test: test_sparkSQL.R: sample on ...

2015-11-08 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9549 Flaky SparkR test: test_sparkSQL.R: sample on a DataFrame Make sample test less flaky by setting the seed Tested with ``` repeat { if (count(sample(df, FALSE, 0.1)) == 3

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155970380 This SparkR support has just been added, so this change breaks tests ``` 1. Failure (at test_sparkSQL.R#1010): group by, agg functions

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-11-11 Thread felixcheung
Github user felixcheung closed the pull request at: https://github.com/apache/spark/pull/9218

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-11-11 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9654 [SPARK-9319][SPARKR] Add support for setting column names, types Add support for colnames, colnames<-, coltypes<- Also added tests for names, names<- which have no test p

[GitHub] spark pull request: [SPARK-10500][SPARKR] sparkr.zip cannot be cre...

2015-11-11 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9390#issuecomment-156024874 I don't fully understand the issue, but why do we have to use `.libPaths` and not `library(... lib.loc= )`?

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156024999 @JihongMA yap that should fix them

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44734240 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,107 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44734873 --- Diff: R/pkg/inst/tests/test_sparkSQL.R --- @@ -1525,6 +1525,22 @@ test_that("Method coltypes() to get R's data types of a Data

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44734936 --- Diff: R/pkg/R/generics.R --- @@ -1050,4 +1049,7 @@ setGeneric("with") #' @rdname coltypes #' @export -setGeneric

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44735202 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,107 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11263][SPARKR] lintr Throws Warnings on...

2015-11-12 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9463#issuecomment-156289622 Thanks for catching that. I did some tests and suspect they are caused by generics.R, so I am removing those.

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-12 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9680 [SPARK-11715][SPARKR] Add R support corr for Column Aggregration Need to match existing method signature You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44734132 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,107 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44734047 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,107 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-12 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9680#issuecomment-156331258 I think 9366 is about computing corr or cov matrix whereas this is computing corr between two columns. They seem to be useful in their own ways. Also

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-13 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9680#issuecomment-156484659 So #9366 is for all columns in DataFrame x=y or different x, y DataFrames And this #9680 is for 2 columns in one DataFrame.

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867038 --- Diff: R/pkg/R/generics.R --- @@ -971,6 +986,9 @@ setGeneric("size", function(x) { standardGeneric("size") }) #' @export

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867045 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,101 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867068 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,101 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867029 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,101 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867049 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,101 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867056 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,101 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11031][SPARKR] Method str() on a DataFr...

2015-11-14 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9613#discussion_r44867054 --- Diff: R/pkg/R/DataFrame.R --- @@ -2200,4 +2200,101 @@ setMethod("coltypes", rTypes[naIndices] <- ty

[GitHub] spark pull request: [SPARK-11684] [R] [ML] [Doc] Update SparkR glm...

2015-11-16 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9727#issuecomment-157179595 looks good @shivaram

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-16 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9680#discussion_r44974143 --- Diff: R/pkg/R/functions.R --- @@ -259,6 +259,20 @@ setMethod("column", function(x) {

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-16 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9680#discussion_r44973809 --- Diff: R/pkg/R/functions.R --- @@ -259,6 +259,20 @@ setMethod("column", function(x) {

[GitHub] spark pull request: [SPARK-11756][SPARKR] Fix use of aliases - Spa...

2015-11-16 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9750 [SPARK-11756][SPARKR] Fix use of aliases - SparkR can not output help information for SparkR:::summary correctly Fix use of aliases and change uses of @rdname and @seealso `@aliases

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-16 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9680#discussion_r45019825 --- Diff: R/pkg/R/functions.R --- @@ -259,6 +259,20 @@ setMethod("column", function(x) {

[GitHub] spark pull request: [SPARK-11715][SPARKR] Add R support corr for C...

2015-11-16 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9680#issuecomment-157268953 @sun-rui I updated it. It's not as strongly typed as I'd like, but if I add `col2 = "Column"` to the signature I get this error: ```

[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

2015-11-09 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9489#issuecomment-155244366 @mengxr possibly, though there are usage differences between SparkR DataFrame/Column and R data.frame (e.g. `agg` vs [`aggregate`](https://stat.ethz.ch/R

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() (New v...

2015-11-09 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9579#discussion_r44359782 --- Diff: R/pkg/R/schema.R --- @@ -115,20 +115,7 @@ structField.jobj <- function(x) { } checkType <- function(type) { - primtiv

[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-02 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9394#issuecomment-153241163 it should be `@seealso \link{createDataFrame}`

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-11-02 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9218#discussion_r43718136 --- Diff: R/pkg/R/DataFrame.R --- @@ -276,6 +276,75 @@ setMethod("names<-", } }) +#' @r

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-11-02 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9218#discussion_r43718270 --- Diff: R/pkg/R/DataFrame.R --- @@ -276,6 +276,75 @@ setMethod("names<-", } }) +#' @r

[GitHub] spark pull request: [SPARK-11407][SPARKR] Add doc for running from...

2015-11-01 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9401#issuecomment-152870335 And generally people blog about using `.libPaths`, but this could cause all packages to be installed to the SparkR location as it becomes the default

[GitHub] spark pull request: [DOC] Missing link to R DataFrame API doc

2015-11-01 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9394#issuecomment-152870465 why don't we point to http://spark.apache.org/docs/latest/sparkr.html for now instead?

[GitHub] spark pull request: [SPARK-11407][SPARKR] Add doc for running from...

2015-11-01 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9401 [SPARK-11407][SPARKR] Add doc for running from RStudio ![image](https://cloud.githubusercontent.com/assets/8969467/10871746/612ba44a-80a4-11e5-99a0-40b9931dee52.png) (This is without css

[GitHub] spark pull request: [SPARK-9319][SPARKR] Add support for setting c...

2015-11-01 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9218#issuecomment-152862908 ``` Error : /home/jenkins/workspace/SparkPullRequestBuilder/R/pkg/man/colnames.Rd: Sections \title, and \name must exist and be unique in Rd files ERROR

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-04 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8984#issuecomment-153870143 I'm a bit confused - I thought `map`, `array`, and `struct` should return NA?

[GitHub] spark pull request: [SPARK-11260][SPARKR] with() function support

2015-11-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9443#discussion_r43940871 --- Diff: R/pkg/R/DataFrame.R --- @@ -2045,3 +2045,34 @@ setMethod("attach", } attach(newEnv, pos = pos, n

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-04 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/8984#issuecomment-153904230 looks good. thanks

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/8984#discussion_r43976192 --- Diff: R/pkg/R/DataFrame.R --- @@ -1914,3 +1914,46 @@ setMethod("attach", } attach(newEnv, pos = pos, n

[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-11-04 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-153938620 @zero323 you could add the test code to SPARK-11283 so that it could be added back then.

[GitHub] spark pull request: [SPARK-11263][SPARKR] lintr Throws Warnings on...

2015-11-04 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9463#issuecomment-153942273 @yu-iskw btw, these are two other false positives that you might want to follow up with lintr: ``` R/pkg/inst/tests/test_sparkSQL.R:907:53: style

[GitHub] spark pull request: [SPARK-11263][SPARKR] lintr Throws Warnings on...

2015-11-04 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9463#issuecomment-153938433 Done. they won't do anything because of `@noRd`

[GitHub] spark pull request: [SPARK-11468][SPARKR] add stddev/variance agg ...

2015-11-05 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9489 [SPARK-11468][SPARKR] add stddev/variance agg functions for Column Checked names, none of them should conflict with anything in base @shivaram @davies @rxin You can merge this pull

[GitHub] spark pull request: [SPARK-11567][PYTHON] Add Python API for corr ...

2015-11-06 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/9536 [SPARK-11567][PYTHON] Add Python API for corr in group like `df.agg(corr("col1", "col2"))` @davies

[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-11-03 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/8984#discussion_r43797767 --- Diff: R/pkg/R/types.R --- @@ -0,0 +1,41 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license
