date:20160811

[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14182
  
**[Test build #63663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63663/consoleFull)** for PR 14182 at commit [`edd1ce0`](https://github.com/apache/spark/commit/edd1ce05275447ceff298d4640f8da988d73184f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14392: [SPARK-16446] [SparkR] [ML] Gaussian Mixture Model wrapp...

2016-08-11 Thread shivaram

Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14392
  
Yeah, I am not sure `mvnormalmixEM` is very descriptive. @junyangq, any 
opinions on the name here?



[GitHub] spark issue #14431: [SPARK-16258][SparkR] Automatically append the grouping ...

2016-08-11 Thread NarineK

Github user NarineK commented on the issue:

https://github.com/apache/spark/pull/14431
  
Yes, @shivaram, that would be one way to do it.
Basically, it means adding a new public function to `RelationalGroupedDataset` 
that returns the column names.
If that is fine from the SQL perspective, maybe I can open a separate pull 
request for it?
cc: @liancheng 
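
As a rough sketch of that idea (not Spark's actual code): the class `GroupedDataSketch`, the small `Expr` types, and the method name `groupingColumnNames` below are all hypothetical, chosen only to show what a public accessor for the grouping column names might look like.

```scala
// Hypothetical stand-ins; these are NOT Spark's Catalyst expression classes.
sealed trait Expr
final case class Column(name: String) extends Expr
final case class Alias(child: Expr, name: String) extends Expr
final case class Call(fn: String, args: Seq[Expr]) extends Expr

// Hypothetical simplified model of a grouped dataset that keeps its
// grouping expressions private, as the comment above implies.
final class GroupedDataSketch(private val groupingExprs: Seq[Expr]) {
  // The proposed public function: expose only the names of the grouping columns.
  def groupingColumnNames: Seq[String] =
    groupingExprs.map(GroupedDataSketch.nameOf)
}

object GroupedDataSketch {
  // Derive a display name: plain columns and aliases use their own name,
  // function calls are rendered as fn(arg1, arg2, ...).
  def nameOf(e: Expr): String = e match {
    case Column(n)      => n
    case Alias(_, n)    => n
    case Call(fn, args) => args.map(nameOf).mkString(fn + "(", ", ", ")")
  }
}
```

A caller such as the SparkR wrapper could then read the names without touching the expressions themselves, for example `new GroupedDataSketch(Seq(Column("a"))).groupingColumnNames`.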



[GitHub] spark pull request #14559: [SPARK-16968]Add additional options in jdbc when ...

2016-08-11 Thread GraceH

Github user GraceH commented on a diff in the pull request:

https://github.com/apache/spark/pull/14559#discussion_r74542628
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -20,14 +20,21 @@ package org.apache.spark.sql.execution.datasources.jdbc
 /**
  * Options for the JDBC data source.
  */
-private[jdbc] class JDBCOptions(
+private[sql] class JDBCOptions(
--- End diff --

OK. I just intended to follow the original style. I will fix that. 



[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14559
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63662/
Test FAILed.



[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14559
  
**[Test build #63662 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63662/consoleFull)** for PR 14559 at commit [`4fb5e55`](https://github.com/apache/spark/commit/4fb5e55a50531abf255169c275ad2ad2cf2d71f2).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-11 Thread GraceH

Github user GraceH commented on the issue:

https://github.com/apache/spark/pull/14559
  
Thanks, all. I have added the unit test in JDBCWriterSuite. If you have any 
further comments, please feel free to let me know. 

BTW, alternatively we could point the user to JDBCOptions for further 
configuration information.



[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14559
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14558
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63654/
Test PASSed.



[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14558
  
**[Test build #63654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63654/consoleFull)** for PR 14558 at commit [`d2c1d64`](https://github.com/apache/spark/commit/d2c1d641ef05692f629ef7cefa0b2b3131ba3475).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14229: [SPARK-16447][ML][SparkR] LDA wrapper in SparkR

2016-08-11 Thread yinxusen

Github user yinxusen commented on the issue:

https://github.com/apache/spark/pull/14229
  
@felixcheung I added some aliases for the spark.lda related functions. However, 
I do not quite understand how they work. From 
[here](https://cran.r-project.org/web/packages/roxygen2/vignettes/rd.html) I 
can see that 
*When you use ?x, help("x") or example("x") R looks for an Rd file 
containing \alias{x}. It then parses the file, converts it into html and 
displays it.*
But when I use `?GroupedData-method`, sparkr-shell cannot find any related 
topics.



[GitHub] spark issue #14571: [SPARK-16983][SQL] Add `prettyName` for row_number, dens...

2016-08-11 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14571
  
Hi, @rxin .
I added test files for window functions for SQLQueryTestSuite and removed 
the old `WindowQuerySuite.scala`. Could you review this again?



[GitHub] spark issue #14613: [SPARK-16883][SparkR]:SQL decimal type is not properly c...

2016-08-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/14613
  
@shivaram Sure. I will add unit tests.



[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution in CTE by ...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14452
  
**[Test build #63658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63658/consoleFull)** for PR 14452 at commit [`bdb6e84`](https://github.com/apache/spark/commit/bdb6e843ea5e488b0004a82fbba5ec6862c983a1).



[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14617
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63656/
Test FAILed.



[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14617
  
**[Test build #63656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63656/consoleFull)** for PR 14617 at commit [`a5e9d46`](https://github.com/apache/spark/commit/a5e9d46aedf6de47f1e93de0afd8b5d913f2f36e).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class SparkListenerBlockManagerAdded(`
  * `class StorageStatus(`



[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution in CTE by ...

2016-08-11 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14452
  
@gatorsmile I've made some changes. I will update this soon.

The optimized plan for the query is:

Join Inner
:- Join Inner
:  :- CommonSubquery [a#226, b#227, a#245, b#246]
:  :  :  +- BroadcastNestedLoopJoin BuildRight, Inner, true
:  :  :     :- LocalTableScan [a#226, b#227]
:  :  :     +- BroadcastExchange IdentityBroadcastMode
:  :  :        +- LocalTableScan [a#245, b#246]
:  +- CommonSubquery [a#247, b#248, a#251, b#252]
:     :  +- BroadcastNestedLoopJoin BuildRight, Inner, true
:     :     :- LocalTableScan [a#226, b#227]
:     :     +- BroadcastExchange IdentityBroadcastMode
:     :        +- LocalTableScan [a#245, b#246]
+- CommonSubquery [a#253, b#254, a#257, b#258]
   :  +- BroadcastNestedLoopJoin BuildRight, Inner, true
   :     :- LocalTableScan [a#226, b#227]
   :     +- BroadcastExchange IdentityBroadcastMode
   :        +- LocalTableScan [a#245, b#246]





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14617
  
**[Test build #63656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63656/consoleFull)** for PR 14617 at commit [`a5e9d46`](https://github.com/apache/spark/commit/a5e9d46aedf6de47f1e93de0afd8b5d913f2f36e).



[GitHub] spark issue #14608: [SPARK-17013][SQL] Parse negative numeric literals

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14608
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63652/
Test PASSed.



[GitHub] spark issue #14608: [SPARK-17013][SQL] Parse negative numeric literals

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14608
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14608: [SPARK-17013][SQL] Parse negative numeric literals

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14608
  
**[Test build #63652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63652/consoleFull)** for PR 14608 at commit [`908253b`](https://github.com/apache/spark/commit/908253b92f87c823b7104f7b0df6f8ae6b4fd814).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #14384: [Spark-16443][SparkR] Alternating Least Squares (...

2016-08-11 Thread junyangq

Github user junyangq commented on a diff in the pull request:

https://github.com/apache/spark/pull/14384#discussion_r74536489
  
--- Diff: R/pkg/R/mllib.R ---
@@ -632,3 +642,159 @@ setMethod("predict", signature(object = "AFTSurvivalRegressionModel"),
   function(object, newData) {
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
+
+
+#' Alternating Least Squares (ALS) for Collaborative Filtering
+#'
+#' \code{spark.als} learns latent factors in collaborative filtering via alternating least
+#' squares. Users can call \code{summary} to obtain fitted latent factors, \code{predict}
+#' to make predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-collaborative-filtering.html}{MLlib:
+#' Collaborative Filtering}.
+#' Additional arguments can be passed to the methods.
+#' \describe{
+#'\item{nonnegative}{logical value indicating whether to apply nonnegativity constraints.
+#'   Default: FALSE}
+#'\item{implicitPrefs}{logical value indicating whether to use implicit preference.
+#' Default: FALSE}
+#'\item{alpha}{alpha parameter in the implicit preference formulation (>= 0). Default: 1.0}
+#'\item{seed}{integer seed for random number generation. Default: 0}
+#'\item{numUserBlocks}{number of user blocks used to parallelize computation (> 0).
+#' Default: 10}
+#'\item{numItemBlocks}{number of item blocks used to parallelize computation (> 0).
+#' Default: 10}
+#'\item{checkpointInterval}{number of checkpoint intervals (>= 1) or disable checkpoint (-1).
+#'  Default: 10}
+#'}
+#'
+#' @param data A SparkDataFrame for training
+#' @param ratingCol column name for ratings
+#' @param userCol column name for user ids. Ids must be (or can be coerced into) integers
+#' @param itemCol column name for item ids. Ids must be (or can be coerced into) integers
+#' @param rank rank of the matrix factorization (> 0)
+#' @param reg regularization parameter (>= 0)
+#' @param maxIter maximum number of iterations (>= 0)
+
+#' @return \code{spark.als} returns a fitted ALS model
+#' @rdname spark.als
+#' @aliases spark.als,SparkDataFrame
+#' @name spark.als
+#' @export
+#' @examples
+#' \dontrun{
+#' ratings <- list(list(0, 0, 4.0), list(0, 1, 2.0), list(1, 1, 3.0), list(1, 2, 4.0),
+#' list(2, 1, 1.0), list(2, 2, 5.0))
+#' df <- createDataFrame(ratings, c("user", "item", "rating"))
+#' model <- spark.als(df, "rating", "user", "item")
+#'
+#' # extract latent factors
+#' stats <- summary(model)
+#' userFactors <- stats$userFactors
+#' itemFactors <- stats$itemFactors
+#'
+#' # make predictions
+#' predicted <- predict(model, df)
+#' showDF(predicted)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#'
+#' # set other arguments
+#' modelS <- spark.als(df, "rating", "user", "item", rank = 20,
+#' reg = 0.1, nonnegative = TRUE)
+#' statsS <- summary(modelS)
+#' }
+#' @note spark.als since 2.1.0
+setMethod("spark.als", signature(data = "SparkDataFrame"),
+  function(data, ratingCol = "rating", userCol = "user", itemCol = "item",
+   rank = 10, reg = 1.0, maxIter = 10, ...) {
+
+if (!is.numeric(rank) || rank <= 0) {
+  stop("rank should be a positive number.")
+}
+if (!is.numeric(reg) || reg < 0) {
+  stop("reg should be a nonnegative number.")
+}
+if (!is.numeric(maxIter) || maxIter <= 0) {
+  stop("maxIter should be a positive number.")
+}
+
+`%||%` <- function(a, b) if (!is.null(a)) a else b
+
+args <- list(...)
+numUserBlocks <- args$numUserBlocks %||% 10
+numItemBlocks <- args$numItemBlocks %||% 10
+implicitPrefs <- args$implicitPrefs %||% FALSE
+alpha <- args$alpha %||% 1.0
+nonnegative <- args$nonnegative %||% FALSE
+checkpointInterval <- args$checkpointInterval %||% 10
+seed <- args$seed %||% 0
+
+features <- array(c(ratingCol, userCol, itemCol))
+distParams <- array(as.integer(c(numUserBlocks, numItemBlocks,
+ checkpointInterval, seed)))
+
+jobj

[GitHub] spark pull request #14102: [SPARK-16434][SQL] Avoid per-record type dispatch...

2016-08-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14102



[GitHub] spark issue #14551: [SPARK-16961][CORE] Fixed off-by-one error that biased r...

2016-08-11 Thread yanboliang

Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/14551
  
@nicklavers Please also change the seed for the ```GaussianMixture``` doctest in 
```python/pyspark/ml/clustering.py```, and check whether we need to change the seed 
for the ```KMeans``` doctest. Thanks.



[GitHub] spark pull request #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set p...

2016-08-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13146



[GitHub] spark issue #14616: [SPARK-16955][SQL] Fix analysis error when using ordinal...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14616
  
**[Test build #63655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63655/consoleFull)** for PR 14616 at commit [`4087365`](https://github.com/apache/spark/commit/40873650c7397a339210092f616c15aedbf13b17).



[GitHub] spark issue #13409: [SPARK-15667][SQL]Throw exception if columns number of o...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13409
  
Merged build finished. Test FAILed.



[GitHub] spark pull request #14616: [SPARK-16955][SQL] Fix analysis error when using ...

2016-08-11 Thread clockfly

GitHub user clockfly opened a pull request:

https://github.com/apache/spark/pull/14616

[SPARK-16955][SQL] Fix analysis error when using ordinal in ORDER BY or GROUP BY

## What changes were proposed in this pull request?

This PR adds two unresolved expressions, `GroupByOrdinal` and `OrderByOrdinal`, 
to represent ordinals in GROUP BY and ORDER BY, and fixes the rules that 
resolve ordinals.

Ordinals in GROUP BY or ORDER BY, like `1` in `order by 1` or `group by 1`, 
should be treated as unresolved expressions before analysis. In the current 
code, however, an ordinal is represented directly as a `Literal` expression, 
which is a resolved expression. This can cause analysis failures if a rule 
requires the ordinal to be resolved before it is applied.

**For example:**

Before this fix, rule `ResolveAggregateFunctions` tries to resolve the 
`Filter` before the `Filter`'s child `Aggregate` is fully resolved (the 
`Aggregate` contains an unresolved group by ordinal `2`):

```
'Filter ('a > 0)
   +- Aggregate [2], [count(1) AS count(1)#83L, a#81]
      +- SubqueryAlias tmp
         +- Project [1 AS a#81]
            +- OneRowRelation$
```

### Before this change

The ordinal is stored as a `Literal` expression.

```
scala> sc.setLogLevel("TRACE")
scala> sql("select a from t group by 1 order by 1")
...
'Sort [1 ASC], true
 +- 'Aggregate [1], ['a]
    +- 'UnresolvedRelation `t`
```

This causes an analysis error when rule ResolveAggregateFunctions is applied, 
because the group by ordinal `2` claims to be resolved when it actually is 
not.

```
scala> Seq(1).toDF("a").createOrReplaceTempView("t")
scala> sql("select count(a), a from t group by 2 having a > 0").show
org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to Group by position: '2' exceeds the size of the select list '1'. on unresolved object, tree:
Aggregate [2], [(a#9 > 0) AS havingCondition#15]
+- SubqueryAlias t
   +- Project [value#7 AS a#9]
      +- LocalRelation [value#7]
...
```

### After this change

Ordinals are stored as `GroupByOrdinal` or `OrderByOrdinal`.

```
scala> sc.setLogLevel("TRACE")
scala> sql("select a from t group by 1 order by 1")
...
'Sort [orderbyordinal(1) ASC], true
 +- 'Aggregate [groupbyordinal(1)], ['a]
    +- 'UnresolvedRelation `t`
```

And rule ResolveAggregateFunctions can be safely applied as we have 
explicitly resolved `GroupByOrdinal(2)` before applying this rule. 

```
scala> Seq(1).toDF("a").createOrReplaceTempView("t")
scala> sql("select count(a), a from t group by 2 having a > 0").show
+--------+---+
|count(a)|  a|
+--------+---+
|       1|  1|
+--------+---+
```
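
The difference between the two representations can be illustrated with a self-contained toy model. This is only a sketch under simplifying assumptions, not the Catalyst implementation: the `Expr` hierarchy, the `resolveGroupBy` helper, and the error text are made up for illustration, and only the names `Literal` and `GroupByOrdinal` echo the description above.

```scala
object OrdinalSketch {
  // Toy expression tree; `resolved` mimics the flag that analyzer rules check.
  sealed trait Expr { def resolved: Boolean }
  final case class Literal(value: Int) extends Expr { val resolved = true }
  final case class Attribute(name: String) extends Expr { val resolved = true }
  // An ordinal stays unresolved until it is replaced by the
  // select-list entry it points to.
  final case class GroupByOrdinal(position: Int) extends Expr { val resolved = false }

  // Replace every ordinal with the (1-based) entry of the select list.
  // Returns Left with a message if any position is out of range.
  def resolveGroupBy(selectList: Seq[Expr], grouping: Seq[Expr]): Either[String, Seq[Expr]] = {
    val results: Seq[Either[String, Expr]] = grouping.map {
      case GroupByOrdinal(p) if p >= 1 && p <= selectList.size =>
        Right(selectList(p - 1))
      case GroupByOrdinal(p) =>
        Left(s"position $p is outside the select list of size ${selectList.size}")
      case other =>
        Right(other)
    }
    results.collectFirst { case Left(msg) => msg } match {
      case Some(msg) => Left(msg)
      case None      => Right(results.collect { case Right(e) => e })
    }
  }
}
```

In this model a rule that waits for `resolved` to be true skips a plan containing `GroupByOrdinal(2)`, whereas `Literal(2)` reports `resolved == true` even though it has not yet been mapped to a column, which mirrors the failure described above.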

## How was this patch tested?

Unit tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/clockfly/spark spark-16955

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14616.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14616


commit 40873650c7397a339210092f616c15aedbf13b17
Author: Sean Zhong 
Date:   2016-08-08T21:40:53Z

[SPARK-16955][SQL] Fix analysis error when using ordinal in ORDER BY or GROUP BY





[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14558
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14609: [MINOR][Core] fix warnings on depreciated methods in Mes...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14609
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63650/
Test FAILed.



[GitHub] spark issue #14530: [SPARK-16868][Web Ui] Fix executor be both dead and aliv...

2016-08-11 Thread SaintBacchus

Github user SaintBacchus commented on the issue:

https://github.com/apache/spark/pull/14530
  
I will re-run this case and dig into why the executor registers twice.



[GitHub] spark issue #14607: [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up)

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14607
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14607: [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up)

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14607
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63646/
Test PASSed.



[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work

2016-08-11 Thread felixcheung

Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14433
  
Can we model this on [SparkSubmitAction](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L55), which extends `Enumeration`?
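
For reference, a minimal sketch of that pattern, assuming the goal is to validate log level names. `LogLevelSketch` and `parse` are hypothetical names and not part of Spark; only `scala.Enumeration` is standard library, and the level names listed are an assumption about what `setLogLevel` accepts.

```scala
import java.util.Locale

// Hypothetical sketch: not Spark code, only scala.Enumeration is real API.
object LogLevelSketch extends Enumeration {
  type LogLevelSketch = Value
  val ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN = Value

  // Case-insensitive lookup; returns None for unknown names instead of throwing.
  def parse(name: String): Option[Value] = {
    val upper = name.toUpperCase(Locale.ROOT)
    values.find(_.toString == upper)
  }
}
```

A caller could then reject bad input early, for example by matching on `LogLevelSketch.parse(userInput)` and reporting the valid names from `LogLevelSketch.values` when it returns `None`.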



[GitHub] spark issue #14586: [SPARK-17003] [BUILD] [BRANCH-1.6] release-build.sh is m...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14586
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-11 Thread felixcheung

Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14447
  
So there are a few competing implementations in R, and `mlp` might not be a 
very relevant name. @shivaram @mengxr any thoughts on `spark.mlp` here?



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74530136
  
--- Diff: R/pkg/R/mllib.R ---
@@ -533,6 +626,26 @@ setMethod("write.ml", signature(object = "KMeansModel", path = "character"),
 invisible(callJMethod(writer, "save", path))
   })
 
+# Saves the Multilayer Perceptron Classification Model to the input path.
+
+#' @param path The directory where the model is saved
+#' @param overwrite Overwrites or not if the output path already exists. Default is FALSE
+#'  which means throw exception if the output path exists.
+#'
+#' @rdname spark.mlp
+#' @export
+#' @seealso \link{write.ml}
+#' @note write.ml(MultilayerPerceptronClassificationModel, character) since 2.0.0
--- End diff --

since 2.1.0



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74530039
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,92 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Layers parameter
+#' @param solver Solver parameter, supported options: "gd" (minibatch gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame,formula-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver = "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
+jobj <- callJStatic("org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper",
+"fit", data@sdf, as.integer(blockSize), as.array(layers),
+solver, as.integer(maxIter), tol, stepSize, as.integer(seed))
+return(new("MultilayerPerceptronClassificationModel", jobj = jobj))
+  })
+
+# Makes predictions from a model produced by spark.mlp().
+
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted labeled in a column named
+#' "prediction"
+#' @rdname spark.mlp
+#' @export
+#' @note predict(MultilayerPerceptronClassificationModel) since 2.0.0
--- End diff --

please add @aliases for each function introduced in this PR.



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74530043
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,92 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Layers parameter
+#' @param solver Solver parameter, supported options: "gd" (minibatch gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame,formula-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver = "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
+jobj <- callJStatic("org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper",
+"fit", data@sdf, as.integer(blockSize), as.array(layers),
+solver, as.integer(maxIter), tol, stepSize, as.integer(seed))
+return(new("MultilayerPerceptronClassificationModel", jobj = jobj))
+  })
+
+# Makes predictions from a model produced by spark.mlp().
+
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted labeled in a column named
+#' "prediction"
+#' @rdname spark.mlp
+#' @export
+#' @note predict(MultilayerPerceptronClassificationModel) since 2.0.0
+setMethod("predict", signature(object = "MultilayerPerceptronClassificationModel"),
+  function(object, newData) {
+return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
+  })
+
+# Returns the summary of a Multilayer Perceptron Classification Model produced by \code{spark.mlp}
+
+#' @param object A Multilayer Perceptron Classification Model fitted by \code{spark.mlp}
+#' @return \code{summary} returns a list containing \code{layers}, the label distribution, and
+#' \code{tables}, conditional probabilities given the target label
+#' @rdname spark.mlp
+#' @export
+#' @note summary(MultilayerPerceptronClassificationModel) since 2.0.0
--- End diff --

please add @aliases



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74529955
  
--- Diff: R/pkg/R/mllib.R ---
@@ -487,7 +580,7 @@ setMethod("write.ml", signature(object = "NaiveBayesModel", path = "character"),
 #' @rdname spark.survreg
 #' @export
 #' @note write.ml(AFTSurvivalRegressionModel, character) since 2.0.0
-#' @seealso \link{read.ml}
+#' @seealso \link{write.ml}
--- End diff --

Same here: linking to `read.ml` was intentional.



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74529891
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,92 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Layers parameter
+#' @param solver Solver parameter, supported options: "gd" (minibatch gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame,formula-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver = "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
+jobj <- callJStatic("org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper",
+"fit", data@sdf, as.integer(blockSize), as.array(layers),
+solver, as.integer(maxIter), tol, stepSize, as.integer(seed))
+return(new("MultilayerPerceptronClassificationModel", jobj = jobj))
+  })
+
+# Makes predictions from a model produced by spark.mlp().
+
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted labeled in a column named
+#' "prediction"
+#' @rdname spark.mlp
+#' @export
+#' @note predict(MultilayerPerceptronClassificationModel) since 2.0.0
+setMethod("predict", signature(object = "MultilayerPerceptronClassificationModel"),
+  function(object, newData) {
+return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
+  })
+
+# Returns the summary of a Multilayer Perceptron Classification Model produced by \code{spark.mlp}
+
+#' @param object A Multilayer Perceptron Classification Model fitted by \code{spark.mlp}
+#' @return \code{summary} returns a list containing \code{layers}, the label distribution, and
+#' \code{tables}, conditional probabilities given the target label
+#' @rdname spark.mlp
+#' @export
+#' @note summary(MultilayerPerceptronClassificationModel) since 2.0.0
--- End diff --

since 2.1.0



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74529897
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,92 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Layers parameter
+#' @param solver Solver parameter, supported options: "gd" (minibatch gradient descent) or "l-bfgs"
+#' @param maxIter Maximum iteration number
+#' @param tol Convergence tolerance of iterations
+#' @param stepSize StepSize parameter
+#' @param seed Seed parameter for weights initialization
+#' @return \code{spark.mlp} returns a fitted Multilayer Perceptron Classification Model
+#' @rdname spark.mlp
+#' @aliases spark.mlp,SparkDataFrame,formula-method
+#' @name spark.mlp
+#' @seealso \link{read.ml}
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- read.df("data/mllib/sample_multiclass_classification_data.txt", source = "libsvm")
+#'
+#' # fit a Multilayer Perceptron Classification Model
+#' model <- spark.mlp(df, blockSize = 128, layers = c(4, 5, 4, 3), solver = "l-bfgs",
+#'maxIter = 100, tol = 0.5, stepSize = 1, seed = 1)
+#'
+#' # get the summary of the model
+#' summary(model)
+#'
+#' # make predictions
+#' predictions <- predict(model, df)
+#'
+#' # save and load the model
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mlp since 2.1.0
+setMethod("spark.mlp", signature(data = "SparkDataFrame"),
+  function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100,
+   tol = 0.5, stepSize = 1, seed = 1, ...) {
+jobj <- callJStatic("org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper",
+"fit", data@sdf, as.integer(blockSize), as.array(layers),
+solver, as.integer(maxIter), tol, stepSize, as.integer(seed))
+return(new("MultilayerPerceptronClassificationModel", jobj = jobj))
+  })
+
+# Makes predictions from a model produced by spark.mlp().
+
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted labeled in a column named
+#' "prediction"
+#' @rdname spark.mlp
+#' @export
+#' @note predict(MultilayerPerceptronClassificationModel) since 2.0.0
--- End diff --

since 2.1.0



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74529661
  
--- Diff: R/pkg/R/mllib.R ---
@@ -414,6 +421,92 @@ setMethod("predict", signature(object = "KMeansModel"),
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
 
+#' Multilayer Perceptron Classification Model
+#'
+#' \code{spark.mlp} fits a multi-layer perceptron neural network model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make
+#' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#' Only categorical data is supported.
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-classification-regression.html
+#' #multilayer-perceptron-classifier}{Multilayerperceptron classifier}.
+#'
+#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
+#' @param blockSize BlockSize parameter
+#' @param layers Layers parameter
--- End diff --

something like "integer vector containing the number of nodes for each 
layer"
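
To make the suggested wording concrete, here is a small standalone Scala sketch of how such a vector is usually read: the first entry is the number of input features, the last is the number of output classes, and anything in between is a hidden layer. `LayersSketch`, `describe`, and `weightCount` are hypothetical helpers for illustration, not part of Spark or SparkR, and the weight count assumes fully connected layers with one bias per output node.

```scala
// Hypothetical helpers for illustration only; nothing here is Spark API.
object LayersSketch {
  // Interpret a layers vector: first = inputs, last = outputs, middle = hidden layers.
  def describe(layers: Seq[Int]): String = {
    require(layers.size >= 2 && layers.forall(_ > 0),
      "need at least an input and an output size, all positive")
    val hidden = layers.slice(1, layers.size - 1).mkString("[", ", ", "]")
    "inputs=" + layers.head + ", hidden=" + hidden + ", outputs=" + layers.last
  }

  // Number of trainable weights, ASSUMING fully connected layers with a bias term:
  // each adjacent pair (in, out) contributes (in + 1) * out weights.
  def weightCount(layers: Seq[Int]): Int =
    layers.zip(layers.tail).map { case (in, out) => (in + 1) * out }.sum
}
```

For the `c(4, 5, 4, 3)` vector used in the roxygen example, `LayersSketch.describe(Seq(4, 5, 4, 3))` gives `inputs=4, hidden=[5, 4], outputs=3`, and under the stated assumption `LayersSketch.weightCount(Seq(4, 5, 4, 3))` is 64.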



[GitHub] spark issue #14612: [SPARK-16803] [SQL] SaveAsTable does not work when sourc...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14612
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63645/
Test PASSed.



[GitHub] spark pull request #14522: [Spark-16508][SparkR] Split docs for arrange and ...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14522#discussion_r74529416
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2121,7 +2121,7 @@ setMethod("arrange",
   })
 
 #' @rdname arrange
-#' @name orderBy
--- End diff --

hmm, so you are saying having a @name is ok?



[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14447#discussion_r74529446
  
--- Diff: R/pkg/R/mllib.R ---
@@ -53,6 +53,13 @@ setClass("AFTSurvivalRegressionModel", representation(jobj = "jobj"))
 #' @note KMeansModel since 2.0.0
 setClass("KMeansModel", representation(jobj = "jobj"))
 
+#' S4 class that represents a MultilayerPerceptronClassificationModel
+#'
+#' @param jobj a Java object reference to the backing Scala MultilayerPerceptronClassifierWrapper
+#' @export
+#' @note MultilayerPerceptronClassificationModel since 2.0.0
--- End diff --

since 2.1.0



[GitHub] spark issue #14612: [SPARK-16803] [SQL] SaveAsTable does not work when sourc...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14612
  
**[Test build #63645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63645/consoleFull)** for PR 14612 at commit [`71399f1`](https://github.com/apache/spark/commit/71399f1ca1e91af2a7d2a12c92d32bb691031c86).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #14384: [Spark-16443][SparkR] Alternating Least Squares (...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14384#discussion_r74527784
  
--- Diff: R/pkg/R/mllib.R ---
@@ -632,3 +642,159 @@ setMethod("predict", signature(object = "AFTSurvivalRegressionModel"),
   function(object, newData) {
 return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
   })
+
+
+#' Alternating Least Squares (ALS) for Collaborative Filtering
+#'
+#' \code{spark.als} learns latent factors in collaborative filtering via alternating least
+#' squares. Users can call \code{summary} to obtain fitted latent factors, \code{predict}
+#' to make predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-collaborative-filtering.html}{MLlib:
+#' Collaborative Filtering}.
+#' Additional arguments can be passed to the methods.
+#' \describe{
+#'\item{nonnegative}{logical value indicating whether to apply nonnegativity constraints.
+#'   Default: FALSE}
+#'\item{implicitPrefs}{logical value indicating whether to use implicit preference.
+#' Default: FALSE}
+#'\item{alpha}{alpha parameter in the implicit preference formulation (>= 0). Default: 1.0}
+#'\item{seed}{integer seed for random number generation. Default: 0}
+#'\item{numUserBlocks}{number of user blocks used to parallelize computation (> 0).
+#' Default: 10}
+#'\item{numItemBlocks}{number of item blocks used to parallelize computation (> 0).
+#' Default: 10}
+#'\item{checkpointInterval}{number of checkpoint intervals (>= 1) or disable checkpoint (-1).
+#'  Default: 10}
--- End diff --

Is there a reason we prefer `...` over naming these explicitly, like 
`maxIter`, in the function definition on L714? If they are well known, it is 
probably better to name them.



[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14182#discussion_r74527541
  
--- Diff: R/pkg/R/mllib.R ---
@@ -292,6 +299,85 @@ setMethod("summary", signature(object = 
"NaiveBayesModel"),
 return(list(apriori = apriori, tables = tables))
   })
 
+#' Isotonic Regression Model
+#'
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly 
to R's isoreg().
+#' Users can print, make predictions on the produced model and save the 
model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. 
Currently only a few formula
+#'operators are supported, including '~', '.', ':', '+', 
and '-'.
+#' @param isotonic Whether the output sequence should be 
isotonic/increasing (TRUE) or
+#' antitonic/decreasing (FALSE)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a 
vector column (default: `0`),
+#' no effect otherwise
+#' @param weightCol The weight column name.
+#' @return \code{spark.isoreg} returns a fitted Isotonic Regression model
+#' @rdname spark.isoreg
+#' @aliases spark.isoreg,SparkDataFrame,formula-method
+#' @name spark.isoreg
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' data <- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+#' list(5.0, 3.0), list(1.0, 4.0))
+#' df <- createDataFrame(data, c("label", "feature"))
+#' model <- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+#' # return model boundaries and prediction as lists
--- End diff --

Also, please add `spark.isoreg` to the @seealso of `write.ml` (around L63) 
and `predict`, like the other ML wrappers.



[GitHub] spark issue #14608: [SPARK-17013][SQL] Parse negative numeric literals

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14608
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63644/



[GitHub] spark issue #14608: [SPARK-17013][SQL] Parse negative numeric literals

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14608
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14608: [SPARK-17013][SQL] Parse negative numeric literals

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14608
  
**[Test build #63644 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63644/consoleFull)**
 for PR 14608 at commit 
[`154abba`](https://github.com/apache/spark/commit/154abba4cff36f352e47927d8b707b1c3fa25668).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #14384: [Spark-16443][SparkR] Alternating Least Squares (...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14384#discussion_r74527216
  
--- Diff: R/pkg/R/mllib.R ---
@@ -632,3 +642,159 @@ setMethod("predict", signature(object = 
"AFTSurvivalRegressionModel"),
   function(object, newData) {
 return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
   })
+
+
+#' Alternating Least Squares (ALS) for Collaborative Filtering
+#'
+#' \code{spark.als} learns latent factors in collaborative filtering via 
alternating least
+#' squares. Users can call \code{summary} to obtain fitted latent factors, 
\code{predict}
+#' to make predictions on new data, and \code{write.ml}/\code{read.ml} to 
save/load fitted models.
+#'
+#' For more details, see
+#' 
\href{http://spark.apache.org/docs/latest/ml-collaborative-filtering.html}{MLlib:
+#' Collaborative Filtering}.
+#' Additional arguments can be passed to the methods.
+#' \describe{
+#'\item{nonnegative}{logical value indicating whether to apply 
nonnegativity constraints.
+#'   Default: FALSE}
+#'\item{implicitPrefs}{logical value indicating whether to use 
implicit preference.
+#' Default: FALSE}
+#'\item{alpha}{alpha parameter in the implicit preference formulation 
(>= 0). Default: 1.0}
+#'\item{seed}{integer seed for random number generation. Default: 0}
+#'\item{numUserBlocks}{number of user blocks used to parallelize 
computation (> 0).
+#' Default: 10}
+#'\item{numItemBlocks}{number of item blocks used to parallelize 
computation (> 0).
+#' Default: 10}
+#'\item{checkpointInterval}{number of checkpoint intervals (>= 1) or 
disable checkpoint (-1).
+#'  Default: 10}
+#'}
+#'
+#' @param data A SparkDataFrame for training
+#' @param ratingCol column name for ratings
+#' @param userCol column name for user ids. Ids must be (or can be coerced 
into) integers
+#' @param itemCol column name for item ids. Ids must be (or can be coerced 
into) integers
+#' @param rank rank of the matrix factorization (> 0)
+#' @param reg regularization parameter (>= 0)
+#' @param maxIter maximum number of iterations (>= 0)
--- End diff --

Please add documentation for `...`, for example `@param ... additional 
named arguments such as nonnegative`.



[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14614
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14614
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63649/



[GitHub] spark issue #14609: [MINOR][Core] fix warnings on depreciated methods in Mes...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14609
  
**[Test build #63650 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63650/consoleFull)**
 for PR 14609 at commit 
[`75b6c22`](https://github.com/apache/spark/commit/75b6c2254be381abef667a0fce1d47a6f8b40cf5).



[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14116
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14116: [SPARK-16452][SQL] Support basic INFORMATION_SCHEMA

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14116
  
**[Test build #63641 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63641/consoleFull)**
 for PR 14116 at commit 
[`dc5d1dc`](https://github.com/apache/spark/commit/dc5d1dc38d3cd92c08bedd2ee5ce6f0937353ca3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #14229: [SPARK-16447][ML][SparkR] LDA wrapper in SparkR

2016-08-11 Thread yinxusen

Github user yinxusen commented on the issue:

https://github.com/apache/spark/pull/14229
  
@felixcheung Yes. Sorry I missed the email.



[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14426
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14426: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14426
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63639/



[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14182#discussion_r74526172
  
--- Diff: R/pkg/R/mllib.R ---
@@ -292,6 +299,83 @@ setMethod("summary", signature(object = 
"NaiveBayesModel"),
 return(list(apriori = apriori, tables = tables))
   })
 
+#' Isotonic Regression Model
+#'
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly 
to R's isoreg().
+#' Users can print, make predictions on the produced model and save the 
model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. 
Currently only a few formula
+#'operators are supported, including '~', '.', ':', '+', 
and '-'.
+#' @param isotonic Whether the output sequence should be 
isotonic/increasing (true) or
+#' antitonic/decreasing (false)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a 
vector column (default: `0`),
+#' no effect otherwise
+#' @param weightCol The weight column name.
+#' @return \code{spark.isoreg} returns a fitted Isotonic Regression model
+#' @rdname spark.isoreg
+#' @aliases spark.isoreg,SparkDataFrame,formula-method
+#' @name spark.isoreg
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' data <- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+#' list(5.0, 3.0), list(1.0, 4.0))
+#' df <- createDataFrame(data, c("label", "feature"))
+#' model <- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+#' # return model boundaries and prediction as lists
+#' result <- summary(model, df)
+#'
+#' # save fitted model to input path
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#'
+#' # can also read back the saved model and print
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.isoreg since 2.1.0
+setMethod("spark.isoreg", signature(data = "SparkDataFrame", formula = 
"formula"),
+  function(data, formula, isotonic = TRUE, featureIndex = 0, 
weightCol = NULL) {
+formula <- paste0(deparse(formula), collapse = "")
+
+if (is.null(weightCol)) {
+  weightCol <- ""
+}
+
+jobj <- 
callJStatic("org.apache.spark.ml.r.IsotonicRegressionWrapper", "fit",
+data@sdf, formula, as.logical(isotonic), 
as.integer(featureIndex), weightCol)
+return(new("IsotonicRegressionModel", jobj = jobj))
+  })
+
+#  Predicted values based on an isotonicRegression model
+
+#' @param object a fitted isotonicRegressionModel
+#' @param newData SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted 
values
+#' @rdname spark.isoreg
+#' @export
+#' @note predict(isotonicRegressionModel) since 2.1.0
+setMethod("predict", signature(object = "IsotonicRegressionModel"),
+  function(object, newData) {
+return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
+  })
+
+#  Get the summary of a isotonicRegressionModel model
+
+#' @return \code{summary} returns the model's boundaries and prediction as 
lists
+#' @rdname spark.isoreg
--- End diff --

Please make sure we add this - otherwise it fails CRAN test



[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-11 Thread skonto

Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/11157#discussion_r74526303
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala
 ---
@@ -358,6 +376,109 @@ private[mesos] trait MesosSchedulerUtils extends 
Logging {
   }
 
   /**
+   * Checks executor ports if they are within some range of the offered 
list of ports ranges,
+   *
+   * @param conf the Spark Config
+   * @param ports the list of ports to check
+   * @return true if ports are within range false otherwise
+   */
+  protected def checkPorts(conf: SparkConf, ports: List[(Long, Long)]): 
Boolean = {
+
+def checkIfInRange(port: Long, ps: List[(Long, Long)]): Boolean = {
+  ps.exists(r => r._1 <= port & r._2 >= port)
+}
+
+val portsToCheck = nonZeroPortValuesFromConfig(conf)
+val withinRange = portsToCheck.forall(p => checkIfInRange(p, ports))
+// make sure we have enough ports to allocate per offer
+ports.map(r => r._2 - r._1 + 1).sum >= portsToCheck.size && withinRange
+  }
+
+  /**
+   * Partitions port resources.
+   *
+   * @param requestedPorts non-zero ports to assign
+   * @param offeredResources the resources offered
+   * @return resources left, port resources to be used.
+   */
+  def partitionPortResources(requestedPorts: List[Long], offeredResources: 
List[Resource])
+: (List[Resource], List[Resource]) = {
+if (requestedPorts.isEmpty) {
+  (offeredResources, List[Resource]())
+}
+else {
+  // partition port offers
+  val (resourcesWithoutPorts, portResources) = 
filterPortResources(offeredResources)
+
+  val portsAndRoles = requestedPorts.
+map(x => (x, findPortAndGetAssignedRangeRole(x, portResources)))
+
+  val assignedPortResources = createResourcesFromPorts(portsAndRoles)
+
+  // ignore non-assigned port resources, they will be declined 
implicitly by mesos
+  // no need for splitting port resources.
+  (resourcesWithoutPorts, assignedPortResources)
+}
+  }
+
+  val managedPortNames = List("spark.executor.port", 
"spark.blockManager.port")
+
+  /**
+   * The values of the non-zero ports to be used by the executor process.
+   * @param conf the spark config to use
+   * @return the non-zero values of the ports
+   */
+  def nonZeroPortValuesFromConfig(conf: SparkConf): List[Long] = {
+managedPortNames.map(conf.getLong(_, 0)).filter( _ != 0)
+  }
+
+  /** Creates a mesos resource for a specific port number. */
+  private def createResourcesFromPorts(portsAndRoles: List[(Long, 
String)]) : List[Resource] = {
+portsAndRoles.flatMap{port => createMesosPortResource(List((port._1, 
port._1)), Some(port._2))}
+  }
+
+  /** Helper to create mesos resources for specific port ranges. */
+  private def createMesosPortResource(
+  ranges: List[(Long, Long)],
+  role: Option[String] = None): List[Resource] = {
+ranges.map { range =>
--- End diff --

ok will try it.



[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-11 Thread skonto

Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/11157#discussion_r74526278
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala
 ---
@@ -358,6 +376,109 @@ private[mesos] trait MesosSchedulerUtils extends 
Logging {
   }
 
   /**
+   * Checks executor ports if they are within some range of the offered 
list of ports ranges,
+   *
+   * @param conf the Spark Config
+   * @param ports the list of ports to check
+   * @return true if ports are within range false otherwise
+   */
+  protected def checkPorts(conf: SparkConf, ports: List[(Long, Long)]): 
Boolean = {
+
+def checkIfInRange(port: Long, ps: List[(Long, Long)]): Boolean = {
+  ps.exists(r => r._1 <= port & r._2 >= port)
+}
+
+val portsToCheck = nonZeroPortValuesFromConfig(conf)
+val withinRange = portsToCheck.forall(p => checkIfInRange(p, ports))
+// make sure we have enough ports to allocate per offer
+ports.map(r => r._2 - r._1 + 1).sum >= portsToCheck.size && withinRange
+  }
+
+  /**
+   * Partitions port resources.
+   *
+   * @param requestedPorts non-zero ports to assign
+   * @param offeredResources the resources offered
+   * @return resources left, port resources to be used.
+   */
+  def partitionPortResources(requestedPorts: List[Long], offeredResources: 
List[Resource])
+: (List[Resource], List[Resource]) = {
+if (requestedPorts.isEmpty) {
+  (offeredResources, List[Resource]())
+}
+else {
+  // partition port offers
+  val (resourcesWithoutPorts, portResources) = 
filterPortResources(offeredResources)
+
+  val portsAndRoles = requestedPorts.
+map(x => (x, findPortAndGetAssignedRangeRole(x, portResources)))
+
+  val assignedPortResources = createResourcesFromPorts(portsAndRoles)
+
+  // ignore non-assigned port resources, they will be declined 
implicitly by mesos
+  // no need for splitting port resources.
+  (resourcesWithoutPorts, assignedPortResources)
+}
+  }
+
+  val managedPortNames = List("spark.executor.port", 
"spark.blockManager.port")
+
+  /**
+   * The values of the non-zero ports to be used by the executor process.
+   * @param conf the spark config to use
+   * @return the non-zero values of the ports
+   */
+  def nonZeroPortValuesFromConfig(conf: SparkConf): List[Long] = {
+managedPortNames.map(conf.getLong(_, 0)).filter( _ != 0)
+  }
+
+  /** Creates a mesos resource for a specific port number. */
+  private def createResourcesFromPorts(portsAndRoles: List[(Long, 
String)]) : List[Resource] = {
+portsAndRoles.flatMap{port => createMesosPortResource(List((port._1, 
port._1)), Some(port._2))}
--- End diff --

ok 



[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-11 Thread skonto

Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/11157#discussion_r74526246
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala
 ---
@@ -358,6 +376,109 @@ private[mesos] trait MesosSchedulerUtils extends 
Logging {
   }
 
   /**
+   * Checks executor ports if they are within some range of the offered 
list of ports ranges,
+   *
+   * @param conf the Spark Config
+   * @param ports the list of ports to check
+   * @return true if ports are within range false otherwise
+   */
+  protected def checkPorts(conf: SparkConf, ports: List[(Long, Long)]): 
Boolean = {
+
+def checkIfInRange(port: Long, ps: List[(Long, Long)]): Boolean = {
+  ps.exists(r => r._1 <= port & r._2 >= port)
+}
+
+val portsToCheck = nonZeroPortValuesFromConfig(conf)
+val withinRange = portsToCheck.forall(p => checkIfInRange(p, ports))
+// make sure we have enough ports to allocate per offer
+ports.map(r => r._2 - r._1 + 1).sum >= portsToCheck.size && withinRange
+  }
+
+  /**
+   * Partitions port resources.
+   *
+   * @param requestedPorts non-zero ports to assign
+   * @param offeredResources the resources offered
+   * @return resources left, port resources to be used.
+   */
+  def partitionPortResources(requestedPorts: List[Long], offeredResources: 
List[Resource])
+: (List[Resource], List[Resource]) = {
+if (requestedPorts.isEmpty) {
+  (offeredResources, List[Resource]())
+}
+else {
--- End diff --

ok 



[GitHub] spark issue #14561: [SPARK-16972][CORE] Move DriverEndpoint out of CoarseGra...

2016-08-11 Thread zsxwing

Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/14561
  
Agreed with @jerryshao. @lshmouse could you submit the whole refactoring 
PR in order to show why this one is necessary? It's better not to refactor 
stable code paths unless there is a strong reason.



[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14182#discussion_r74525862
  
--- Diff: R/pkg/R/mllib.R ---
@@ -292,6 +299,85 @@ setMethod("summary", signature(object = 
"NaiveBayesModel"),
 return(list(apriori = apriori, tables = tables))
   })
 
+#' Isotonic Regression Model
+#'
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly 
to R's isoreg().
+#' Users can print, make predictions on the produced model and save the 
model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. 
Currently only a few formula
+#'operators are supported, including '~', '.', ':', '+', 
and '-'.
+#' @param isotonic Whether the output sequence should be 
isotonic/increasing (TRUE) or
+#' antitonic/decreasing (FALSE)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a 
vector column (default: `0`),
+#' no effect otherwise
+#' @param weightCol The weight column name.
+#' @return \code{spark.isoreg} returns a fitted Isotonic Regression model
+#' @rdname spark.isoreg
+#' @aliases spark.isoreg,SparkDataFrame,formula-method
+#' @name spark.isoreg
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' data <- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+#' list(5.0, 3.0), list(1.0, 4.0))
+#' df <- createDataFrame(data, c("label", "feature"))
+#' model <- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+#' # return model boundaries and prediction as lists
+#' result <- summary(model, df)
+#'
+#' # save fitted model to input path
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#'
+#' # can also read back the saved model and print
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.isoreg since 2.1.0
+setMethod("spark.isoreg", signature(data = "SparkDataFrame", formula = 
"formula"),
+  function(data, formula, isotonic = TRUE, featureIndex = 0, 
weightCol = NULL) {
+formula <- paste0(deparse(formula), collapse = "")
+
+if (is.null(weightCol)) {
+  weightCol <- ""
+}
+
+jobj <- 
callJStatic("org.apache.spark.ml.r.IsotonicRegressionWrapper", "fit",
+data@sdf, formula, as.logical(isotonic), 
as.integer(featureIndex),
+  as.character(weightCol))
+return(new("IsotonicRegressionModel", jobj = jobj))
+  })
+
+#  Predicted values based on an isotonicRegression model
+
+#' @param object a fitted isotonicRegressionModel
+#' @param newData SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted 
values
+#' @rdname spark.isoreg
+#' @export
+#' @note predict(isotonicRegressionModel) since 2.1.0
--- End diff --

Capitalize "IsotonicRegressionModel": since it's a class, the name needs to match?
Similarly on L355, 368, 372.
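
For context, S4 class names are case-sensitive, which is why the docs should spell the class the way `setClass` does. A small plain-R sketch follows; the `jobj = "character"` slot is a placeholder for illustration and not the real SparkR definition.

```r
library(methods)

# Placeholder class definition for illustration only;
# the real SparkR class holds a reference to a Java object.
setClass("IsotonicRegressionModel", representation(jobj = "character"))

# Only the exact spelling resolves to a class definition.
upper_found <- isClass("IsotonicRegressionModel")
lower_found <- isClass("isotonicRegressionModel")
```

Keeping the `@note` and `@param` text in the same case as the class avoids suggesting that a lower-case variant exists.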


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-08-11 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14182#discussion_r74525897
  
--- Diff: R/pkg/R/mllib.R ---
@@ -292,6 +299,85 @@ setMethod("summary", signature(object = 
"NaiveBayesModel"),
 return(list(apriori = apriori, tables = tables))
   })
 
+#' Isotonic Regression Model
+#'
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly 
to R's isoreg().
+#' Users can print, make predictions on the produced model and save the 
model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. 
Currently only a few formula
+#'operators are supported, including '~', '.', ':', '+', 
and '-'.
+#' @param isotonic Whether the output sequence should be 
isotonic/increasing (TRUE) or
+#' antitonic/decreasing (FALSE)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a 
vector column (default: `0`),
+#' no effect otherwise
+#' @param weightCol The weight column name.
+#' @return \code{spark.isoreg} returns a fitted Isotonic Regression model
+#' @rdname spark.isoreg
+#' @aliases spark.isoreg,SparkDataFrame,formula-method
+#' @name spark.isoreg
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' data <- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+#' list(5.0, 3.0), list(1.0, 4.0))
+#' df <- createDataFrame(data, c("label", "feature"))
+#' model <- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+#' # return model boundaries and prediction as lists
--- End diff --

could you add an example with `predict`?
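
For reference, here is a rough Python sketch of what fitting and then predicting would compute on the example data above. It assumes a pool-adjacent-violators fit and linear interpolation between fitted points, with distinct feature values. The names `fit_isotonic` and `predict` are hypothetical helpers for illustration only, not the SparkR API.

```python
from bisect import bisect_left


def fit_isotonic(points, isotonic=True):
    """Pool-adjacent-violators fit; points are (label, feature) pairs."""
    pts = sorted(points, key=lambda p: p[1])
    sign = 1.0 if isotonic else -1.0  # fit a decreasing sequence by negating labels
    blocks = []  # each block holds [sum of labels, count]
    for label, _ in pts:
        blocks.append([sign * label, 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([sign * s / c] * c)
    return [f for _, f in pts], fitted


def predict(xs, ys, x):
    """Linear interpolation between fitted points, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)  # xs[i - 1] < x <= xs[i]
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Same (label, feature) pairs as in the roxygen example, with isotonic = FALSE.
data = [(7.0, 0.0), (5.0, 1.0), (3.0, 2.0), (5.0, 3.0), (1.0, 4.0)]
xs, ys = fit_isotonic(data, isotonic=False)
print(ys)
print(predict(xs, ys, 3.5))
```

The roxygen example itself would presumably call `predict(model, df)` on the fitted model; the sketch only shows the kind of values to expect.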



[GitHub] spark issue #14613: [SPARK-16883][SparkR]:SQL decimal type is not properly c...

2016-08-11 Thread shivaram

Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14613
  
@wangmiao1981 Thanks for the PR. Could we add a couple of test cases for 
this? They will also help me understand what the expected behavior is -- one of 
them could be for `collect` with decimals and another could be for `str` on 
a SparkDataFrame which contains decimals.



[GitHub] spark issue #14613: [SPARK-16883][SparkR]:SQL decimal type is not properly c...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14613
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63648/
Test PASSed.



[GitHub] spark issue #14613: [SPARK-16883][SparkR]:SQL decimal type is not properly c...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14613
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...

2016-08-11 Thread clockfly

Github user clockfly commented on the issue:

https://github.com/apache/spark/pull/14546
  
I think a proper fix would be to mark the ordinal as unresolved; the ordinal can 
exist in a GROUP BY or ORDER BY expression.

Then we can make sure that ResolveAggregateFunctions and the other analyzer 
rules don't assume the ordinals are resolved and do premature analysis.
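
To illustrate the idea, here is a minimal Python sketch, not Catalyst code. `Ordinal`, `Aggregate`, `resolve_ordinals` and `push_filter_into_aggregate` are hypothetical stand-ins: a node that still holds an ordinal reports itself as unresolved, so a later rule leaves it alone until the ordinal has been replaced.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Ordinal:
    """Hypothetical stand-in for a GROUP BY / ORDER BY position such as 2."""
    position: int


@dataclass
class Aggregate:
    grouping: List[Union[Ordinal, str]]
    aggregates: List[str]

    @property
    def resolved(self) -> bool:
        # An ordinal that has not been replaced yet keeps the node unresolved.
        return not any(isinstance(g, Ordinal) for g in self.grouping)


def resolve_ordinals(agg: Aggregate) -> Aggregate:
    """Replace each ordinal with the select-list expression it points to."""
    grouping = [agg.aggregates[g.position - 1] if isinstance(g, Ordinal) else g
                for g in agg.grouping]
    return Aggregate(grouping, agg.aggregates)


def push_filter_into_aggregate(agg: Aggregate, condition: str) -> Aggregate:
    """Hypothetical later rule; it skips nodes that are not resolved yet."""
    if not agg.resolved:
        return agg
    return Aggregate(agg.grouping, [condition])


plan = Aggregate([Ordinal(2)], ["count(1)", "a"])
print(plan.resolved)                    # the ordinal is still pending
print(resolve_ordinals(plan).grouping)  # the ordinal replaced by its expression
```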



[GitHub] spark issue #14392: [SPARK-16446] [SparkR] [ML] Gaussian Mixture Model wrapp...

2016-08-11 Thread felixcheung

Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14392
  
btw, I think it'll be great to get some feedback on the naming of this.
As per SPARK-14831, should we go with a more Spark-specific name like 
`gaussianmixture` rather than an R one? How well known is `mvnormalmixEM`? How 
close is the Spark implementation to that?



[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...

2016-08-11 Thread clockfly

Github user clockfly commented on the issue:

https://github.com/apache/spark/pull/14546
  
I think the root cause is that the Aggregate operator is treated as 
resolved even if it has group by ordinals.

For example:
```
'Filter ('a > 0)
+- Aggregate [2], [count(1) AS count(1)#83L, a#81]
   +- SubqueryAlias tmp
      +- Project [1 AS a#81]
         +- OneRowRelation$
```
Aggregate is treated as resolved even if it has a group by ordinal "2".

Then, it tries to resolve the `Filter` by putting the `Filter` as an 
aggregation expression:

```
!'Aggregate [2], [('a > 0) AS havingCondition#84]
+- SubqueryAlias tmp
   +- Project [1 AS a#81]
      +- OneRowRelation$
```

Actually this plan is already wrong: we are asking for ordinal "2", but 
there is only one aggregation expression, `[('a > 0) AS havingCondition#84]`.






 



[GitHub] spark issue #14609: [MINOR][Core] fix warnings on depreciated methods in Mes...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14609
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63638/
Test FAILed.



[GitHub] spark pull request #14615: make toJSON not go through rdd form but operate o...

2016-08-11 Thread robert3005

GitHub user robert3005 opened a pull request:

https://github.com/apache/spark/pull/14615

make toJSON not go through rdd form but operate on dataset always

## What changes were proposed in this pull request?

Don't convert toRdd when doing toJSON

## How was this patch tested?
Existing unit tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robert3005/spark robertk/correct-tojson

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14615.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14615


commit 98086f4fdf0d7464bed0bb4f23c3694da828e222
Author: Robert Kruszewski 
Date:   2016-08-11T19:26:21Z

make toJSON not go through rdd form but operate on dataset always





[GitHub] spark issue #14609: [MINOR][Core] fix warnings on depreciated methods in Mes...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14609
  
Merged build finished. Test FAILed.



[GitHub] spark issue #14571: [SPARK-16983][SQL] Add `prettyName` for row_number, dens...

2016-08-11 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14571
  
I see. Then, I'll include only that today.



[GitHub] spark issue #12930: [SPARK-15153] [ML] [SparkR] Fix SparkR spark.naiveBayes ...

2016-08-11 Thread felixcheung

Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/12930
  
Do we still need this?




[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...

2016-08-11 Thread clockfly

Github user clockfly commented on the issue:

https://github.com/apache/spark/pull/14546
  
@dongjoon-hyun  The exception was muted by line:

https://github.com/apache/spark/pull/14546/files#diff-57b3d87be744b7d79a9beacf8e5e5eb2R1257

If you add some log message, you will find it still throws an exception like:
```
org.apache.spark.sql.AnalysisException: GROUP BY position 2 is not in select list (valid range is [1, 1]); line 1 pos 53
...
```
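
A small Python sketch of why the rewritten plan fails this range check. The function `resolve_group_by_position` is hypothetical and only mirrors the wording of the message above; it is not the analyzer's actual code.

```python
def resolve_group_by_position(position, select_list):
    """Return the select-list expression at a 1-based GROUP BY position."""
    if not 1 <= position <= len(select_list):
        raise ValueError(
            f"GROUP BY position {position} is not in select list "
            f"(valid range is [1, {len(select_list)}])")
    return select_list[position - 1]


# Original aggregate: two expressions, so position 2 points at column a.
print(resolve_group_by_position(2, ["count(1)", "a"]))

# After the Filter is pushed in, only one expression is left, so position 2 fails.
try:
    resolve_group_by_position(2, ["('a > 0) AS havingCondition"])
except ValueError as err:
    print(err)
```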





[GitHub] spark issue #14586: [SPARK-17003] [BUILD] [BRANCH-1.6] release-build.sh is m...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14586
  
**[Test build #3222 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3222/consoleFull)**
 for PR 14586 at commit 
[`a785c01`](https://github.com/apache/spark/commit/a785c0190bf093f1e6deb0c46b5dbc89bc307603).



[GitHub] spark issue #14611: [SPARK-17028][Repl]Backport SI-9734 for Scala 2.10

2016-08-11 Thread zsxwing

Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/14611
  
Closing this one. Just found another issue with the current implementation 
and will reopen in the future.



[GitHub] spark pull request #14611: [SPARK-17028][Repl]Backport SI-9734 for Scala 2.1...

2016-08-11 Thread zsxwing

Github user zsxwing closed the pull request at:

https://github.com/apache/spark/pull/14611



[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14397
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63637/
Test PASSed.



[GitHub] spark issue #14613: [SPARK-16883][SparkR]:SQL decimal type is not properly c...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14613
  
**[Test build #63648 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63648/consoleFull)**
 for PR 14613 at commit 
[`e95f557`](https://github.com/apache/spark/commit/e95f5575018d15782917b9b3d679b4f6da345ee6).



[GitHub] spark issue #14586: [SPARK-17003] [BUILD] [BRANCH-1.6] release-build.sh is m...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14586
  
Merged build finished. Test PASSed.



[GitHub] spark issue #14614: [SPARK-17027][ML] Avoid integer overflow in PolynomialEx...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14614
  
**[Test build #63649 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63649/consoleFull)**
 for PR 14614 at commit 
[`47170b8`](https://github.com/apache/spark/commit/47170b80cad68baf073fff54f5505124508267fd).



[GitHub] spark pull request #14613: [SPARK-16883][SparkR]:SQL decimal type is not pro...

2016-08-11 Thread wangmiao1981

GitHub user wangmiao1981 opened a pull request:

https://github.com/apache/spark/pull/14613

[SPARK-16883][SparkR]:SQL decimal type is not properly cast to number when 
collecting SparkDataFrame

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

registerTempTable(createDataFrame(iris), "iris")
str(collect(sql("select cast('1' as double) as x, cast('2' as decimal) as y from iris limit 5")))

'data.frame':   5 obs. of  2 variables:
 $ x: num  1 1 1 1 1
 $ y:List of 5
  ..$ : num 2
  ..$ : num 2
  ..$ : num 2
  ..$ : num 2
  ..$ : num 2

The problem is that Spark returns the column type `decimal(10, 0)` instead of 
`decimal`, so `decimal(10, 0)` is not handled correctly. It should be 
handled as "double".

As discussed in the JIRA thread, there are two potential fixes:
1). Scala-side fix: add a new case when writing the object back. However, 
I can't use spark.sql.types._ in Spark core due to dependency issues, and I haven't 
found a way to do the type case match;

2). SparkR-side fix: add a helper function that checks for special types like 
`"decimal(10, 0)"` and replaces them with `double`, which is a PRIMITIVE type. This 
helper is generic, so handling for new types can be added in the future.

I opened this PR to discuss the pros and cons of both approaches. If we want to 
do the Scala-side fix, we need to find a way to match the case of DecimalType and 
StructType in Spark Core.
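
The helper idea in option 2) could look roughly like the following Python sketch. It is only an illustration of the pattern-to-primitive mapping; `SPECIAL_TYPES` and `special_type_to_primitive` are hypothetical names, and the actual helper in this PR is written in R.

```python
import re

# Hypothetical table of special type-name patterns and the primitive type
# each one should be treated as. New entries can be appended later.
SPECIAL_TYPES = [
    (re.compile(r"^decimal(\(\s*\d+\s*,\s*\d+\s*\))?$"), "double"),
]


def special_type_to_primitive(type_name):
    """Return the primitive type for a special type string, else the name unchanged."""
    for pattern, primitive in SPECIAL_TYPES:
        if pattern.match(type_name):
            return primitive
    return type_name


print(special_type_to_primitive("decimal(10, 0)"))
print(special_type_to_primitive("string"))
```

Keeping the patterns in one table is what makes the helper generic: supporting another parameterized type only needs one more entry.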

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)

Manual test:
> str(collect(sql("select cast('1' as double) as x, cast('2' as decimal) as y from iris limit 5")))
'data.frame':   5 obs. of  2 variables:
 $ x: num  1 1 1 1 1
 $ y: num  2 2 2 2 2
R Unit tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangmiao1981/spark type

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14613.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14613


commit e95f5575018d15782917b9b3d679b4f6da345ee6
Author: wm...@hotmail.com 
Date:   2016-08-11T23:15:15Z

add a type check helper





[GitHub] spark pull request #14614: [SPARK-17027][ML] Avoid integer overflow in Polyn...

2016-08-11 Thread zero323

GitHub user zero323 opened a pull request:

https://github.com/apache/spark/pull/14614

[SPARK-17027][ML] Avoid integer overflow in PolynomialExpansion.getPolySize

## What changes were proposed in this pull request?

Replaces custom choose function with 
o.a.commons.math3.CombinatoricsUtils.binomialCoefficient
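
As a rough illustration of the overflow, the Python sketch below simulates 32-bit arithmetic. It assumes the custom choose multiplied a range of factors in Int arithmetic and that the polynomial size is choose(numFeatures + degree, degree); both are assumptions for the sake of the example. `wrap32`, `naive_choose` and `checked_poly_size` are hypothetical names, and `math.comb` stands in for an exact binomial coefficient.

```python
import math

INT_MAX = 2**31 - 1


def wrap32(x):
    """Reduce x to a signed 32-bit value, as JVM Int arithmetic would."""
    return (x + 2**31) % 2**32 - 2**31


def naive_choose(n, k):
    """Product-based choose with every multiplication wrapped to 32 bits."""
    num = 1
    for v in range(n, n - k, -1):
        num = wrap32(num * v)
    den = 1
    for v in range(k, 1, -1):
        den = wrap32(den * v)
    # Truncate toward zero, like JVM integer division.
    q = abs(num) // abs(den)
    return q if (num < 0) == (den < 0) else -q


def checked_poly_size(num_features, degree):
    """Exact binomial coefficient; fail loudly if it does not fit in an Int."""
    size = math.comb(num_features + degree, degree)
    if size > INT_MAX:
        raise OverflowError(f"polynomial size {size} exceeds {INT_MAX}")
    return size


# With 10 features and degree 10 the numerator product no longer fits in 32 bits,
# so the wrapped result silently disagrees with the exact value.
print(naive_choose(20, 10), checked_poly_size(10, 10))
```

An exact computation with an explicit range check turns a silently wrong size into an error.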

## How was this patch tested?

Spark unit tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zero323/spark SPARK-17027

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14614.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14614


commit 47170b80cad68baf073fff54f5505124508267fd
Author: zero323 
Date:   2016-08-11T17:44:32Z

Replace PolynomialExpansion.choose with 
CombinatoricsUtils.binomialCoefficient





[GitHub] spark issue #14571: [SPARK-16983][SQL] Add `prettyName` for row_number, dens...

2016-08-11 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/14571
  
Oh, I think you are working on that transition somewhere else.
BTW, what about other tests? If you have a plan to enrich 
SQLQueryTestSuite, I would prefer to do all of them in a single PR during this 
weekend. What do you think about making a single JIRA for that transition?



[GitHub] spark issue #14586: [SPARK-17003] [BUILD] [BRANCH-1.6] release-build.sh is m...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14586
  
**[Test build #63647 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63647/consoleFull)**
 for PR 14586 at commit 
[`a785c01`](https://github.com/apache/spark/commit/a785c0190bf093f1e6deb0c46b5dbc89bc307603).



[GitHub] spark issue #14586: [SPARK-17003] [BUILD] [BRANCH-1.6] release-build.sh is m...

2016-08-11 Thread yhuai

Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14586
  
Done



[GitHub] spark issue #14571: [SPARK-16983][SQL] Add `prettyName` for row_number, dens...

2016-08-11 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14571
  
Do you think you can create a test file for window functions in the new 
SQLQueryTestSuite along with this fix?




[GitHub] spark issue #14612: [SPARK-16803] [SQL] SaveAsTable does not work when sourc...

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14612
  
**[Test build #63645 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63645/consoleFull)**
 for PR 14612 at commit 
[`71399f1`](https://github.com/apache/spark/commit/71399f1ca1e91af2a7d2a12c92d32bb691031c86).



[GitHub] spark issue #14607: [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up)

2016-08-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14607
  
**[Test build #63646 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63646/consoleFull)**
 for PR 14607 at commit 
[`c442b75`](https://github.com/apache/spark/commit/c442b758e8bf0fc1affd1daa08381d458c7a71a4).



[GitHub] spark issue #14607: [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up)

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14607
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63636/
Test PASSed.



[GitHub] spark issue #14607: [SPARK-16905] SQL DDL: MSCK REPAIR TABLE (follow-up)

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14607
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13146
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...

2016-08-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13146
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63632/
Test PASSed.


