[GitHub] spark pull request: [SPARK-3349] Output partitioning of limit shou...

2014-09-03 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/2262 [SPARK-3349] Output partitioning of limit should not be inherited from child This resolves https://issues.apache.org/jira/browse/SPARK-3349 You can merge this pull request into a Git repository

[GitHub] spark pull request: [SPARK-3394] Fix crash in TakeOrdered when lim...

2014-09-03 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/2264 [SPARK-3394] Fix crash in TakeOrdered when limit is 0 This resolves https://issues.apache.org/jira/browse/SPARK-3394

[GitHub] spark pull request: [SPARK-3395] DSL sometimes incorrectly reuses ...

2014-09-03 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/2266 [SPARK-3395] DSL sometimes incorrectly reuses attribute ids, breaking queries This resolves https://issues.apache.org/jira/browse/SPARK-3395

[GitHub] spark pull request: [SPARK-3394] [SQL] Fix crash in TakeOrdered wh...

2014-09-03 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/2264#issuecomment-54406499 That's a good point - and it seems the pyspark api for example does support takeOrdered(0). Updated. --- If your project is set up for it, you can reply to this email
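
A minimal sketch of the edge case discussed above, in plain Python rather than Spark's Scala: a limit of 0 is treated as a valid request that yields an empty result instead of an error. The helper name `take_ordered` is hypothetical and chosen for illustration; this is not the Spark or PySpark implementation.

```python
import heapq


def take_ordered(items, n, key=None):
    """Return the n smallest items in ascending order (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        # A limit of 0 is valid: return an empty result without building a heap.
        return []
    return heapq.nsmallest(n, items, key=key)
```

Handling 0 explicitly keeps the empty case away from any code path that assumes at least one element will be selected.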

[GitHub] spark pull request: [SPARK-3394] [SQL] Fix crash in TakeOrdered wh...

2014-09-04 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/2264#issuecomment-54520273 addressed comments

[GitHub] spark pull request: [SPARK-3349] [SQL] Output partitioning of limi...

2014-09-04 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/2262#issuecomment-54526771 added regression test

[GitHub] spark pull request: [SPARK-3395] [SQL] DSL sometimes incorrectly r...

2014-09-04 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/2266#issuecomment-54535256 added regression test in DslQuerySuite

[GitHub] spark pull request: [SPARK-6749] [SQL] Make metastore client robus...

2015-06-23 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/6912#discussion_r33114226 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala --- @@ -136,12 +137,62 @@ private[hive] class ClientWrapper

[GitHub] spark pull request: [SPARK-6749] [SQL] Make metastore client robus...

2015-06-19 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/6912#issuecomment-113676454 @yhuai

[GitHub] spark pull request: [SPARK-6749] [SQL] Make metastore client robus...

2015-06-19 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/6912 [SPARK-6749] [SQL] Make metastore client robust to underlying socket connection loss This works around a bug in the underlying RetryingMetaStoreClient (HIVE-10384) by refreshing the metastore client
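
The general idea of refreshing a client after a lost connection can be sketched as follows, in plain Python. Everything here is an assumption made for illustration: the class name `RefreshingClient`, the `factory` callable, the `max_retries` parameter, and the use of `ConnectionError` as the failure signal. None of it is taken from the pull request or from Hive's client API.

```python
class RefreshingClient:
    """Wraps a client factory and rebuilds the client when a call loses its connection.

    Hypothetical sketch; not the ClientWrapper code from the pull request.
    """

    def __init__(self, factory, max_retries=1):
        self._factory = factory
        self._max_retries = max_retries
        self._client = factory()

    def call(self, fn):
        """Run fn(client), refreshing the client and retrying on ConnectionError."""
        attempts = 0
        while True:
            try:
                return fn(self._client)
            except ConnectionError:
                if attempts >= self._max_retries:
                    raise
                attempts += 1
                # Discard the stale client and build a fresh one before retrying.
                self._client = self._factory()
```

A caller would route each operation through `call`, for example `wrapper.call(lambda c: c.get())`, so that a dropped socket costs one extra attempt rather than a failed query.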

[GitHub] spark pull request: [SPARK-9914] [ML] define setters explicitly fo...

2015-08-12 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8143#discussion_r36924356 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -144,6 +145,12 @@ class RFormulaModel private[feature]( pipelineModel

[GitHub] spark pull request: [SPARK-9914] [ML] define setters explicitly fo...

2015-08-12 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/8143#issuecomment-130528387 Lgtm

[GitHub] spark pull request: [SPARK-9895] User Guide for RFormula Feature T...

2015-08-19 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8293#discussion_r37453786 --- Diff: docs/ml-features.md --- @@ -1386,6 +1386,102 @@ print(output.select(features, clicked).first()) {% endhighlight %} </div> </div>

[GitHub] spark pull request: [SPARK-9895] User Guide for RFormula Feature T...

2015-08-19 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8293#discussion_r37453787 --- Diff: docs/ml-features.md --- @@ -1386,6 +1386,102 @@ print(output.select(features, clicked).first()) {% endhighlight %} </div> </div>

[GitHub] spark pull request: [SPARK-9895] User Guide for RFormula Feature T...

2015-08-19 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8293#discussion_r37453762 --- Diff: docs/ml-features.md --- @@ -1386,6 +1386,102 @@ print(output.select(features, clicked).first()) {% endhighlight %} </div> </div>

[GitHub] spark pull request: [SPARK-9895] User Guide for RFormula Feature T...

2015-08-19 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/8293#issuecomment-132787474 Fixed and verified.

[GitHub] spark pull request: [SPARK-9895] User Guide for RFormula Feature T...

2015-08-18 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/8293 [SPARK-9895] User Guide for RFormula Feature Transformer @mengxr You can merge this pull request into a Git repository by running: $ git pull https://github.com/ericl/spark docs-2

[GitHub] spark pull request: [SPARK-9698] [ML] Add RInteraction transformer...

2015-08-20 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7987#issuecomment-133127935 If I understand correctly, the concern is that the category to index assignment when predicting data will be different from that used when fitting the model
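
The concern raised in this comment can be illustrated with a small plain-Python sketch. The function names `fit_index` and `apply_index` and the first-appearance ordering are assumptions made for the example; they do not describe how Spark assigns indices.

```python
def fit_index(values):
    """Assign each distinct category an index, ordered by first appearance."""
    index = {}
    for v in values:
        if v not in index:
            index[v] = len(index)
    return index


def apply_index(index, values):
    """Encode values with a previously fitted mapping (unseen categories raise KeyError)."""
    return [index[v] for v in values]


train = ["b", "a", "b", "c"]
predict = ["c", "a"]

fitted = fit_index(train)
# Reusing the fit-time mapping keeps category indices stable.
consistent = apply_index(fitted, predict)
# Re-deriving the mapping from the prediction data silently changes the indices.
refit = apply_index(fit_index(predict), predict)
```

Here `consistent` and `refit` encode the same rows differently, which is the mismatch between fitting and predicting that the comment describes.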

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620888 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala --- @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620875 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -32,8 +32,10 @@ private[r] object SparkRWrappers { alpha: Double

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620883 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala --- @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620868 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaParserSuite.scala --- @@ -32,4 +37,38 @@ class RFormulaParserSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620892 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -78,13 +78,20 @@ class RFormula(override val uid: String) extends Estimator

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620872 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala --- @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620881 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -32,8 +32,10 @@ private[r] object SparkRWrappers { alpha: Double

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620870 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala --- @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7707#discussion_r35620885 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormulaParser.scala --- @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-29 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/7771 [SPARK-9463] [ML] Expose model coefficients with names in SparkR RFormula Preview: ``` summary(m) features coefficients 1 (Intercept) 1.6765001 2

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928619 --- Diff: mllib/src/main/scala/org/apache/spark/ml/attribute/AttributeGroup.scala --- @@ -165,6 +165,11 @@ class AttributeGroup private ( /** Converts

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928599 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -41,6 +42,17 @@ class VectorAssembler(override val uid: String

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928607 --- Diff: R/pkg/R/mllib.R --- @@ -71,3 +71,27 @@ setMethod(predict, signature(object = PipelineModel), function(object, newData

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928626 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala --- @@ -56,17 +56,26 @@ class OneHotEncoder(override val uid: String) extends

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928666 --- Diff: R/pkg/R/mllib.R --- @@ -71,3 +71,27 @@ setMethod(predict, signature(object = PipelineModel), function(object, newData

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928613 --- Diff: R/pkg/R/mllib.R --- @@ -71,3 +71,27 @@ setMethod(predict, signature(object = PipelineModel), function(object, newData

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35928631 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -300,7 +303,8 @@ class LinearRegressionTrainingSummary private

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35931506 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala --- @@ -167,4 +165,10 @@ class OneHotEncoder(override val uid: String) extends

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35931502 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -300,7 +303,8 @@ class LinearRegressionTrainingSummary private

[GitHub] spark pull request: [SPARK-9463] [ML] Expose model coefficients wi...

2015-07-30 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7771#discussion_r35931515 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -17,7 +17,7 @@ package org.apache.spark.ml.feature

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-28 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7707#issuecomment-125690700 Updated to filter column types

[GitHub] spark pull request: [SPARK-9698] [ML] Add RInteraction transformer...

2015-08-06 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7987#issuecomment-128548608 @mengxr done, this PR now just has the RInteraction changes.

[GitHub] spark pull request: [SPARK-9713] [ML] Document SparkR MLlib glm() ...

2015-08-10 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/8085 [SPARK-9713] [ML] Document SparkR MLlib glm() integration in Spark 1.5 This documents the use of R model formulae in the SparkR guide. Also fixes some bugs in the R api doc.

[GitHub] spark pull request: [SPARK-9713] [ML] Document SparkR MLlib glm() ...

2015-08-11 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8085#discussion_r36811167 --- Diff: docs/sparkr.md --- @@ -210,6 +210,43 @@ head(df) {% endhighlight %} </div> +### Model Formulae + +SparkR allows the fitting

[GitHub] spark pull request: [SPARK-9713] [ML] Document SparkR MLlib glm() ...

2015-08-11 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8085#discussion_r36811160 --- Diff: docs/sparkr.md --- @@ -210,6 +210,43 @@ head(df) {% endhighlight %} </div> +### Model Formulae + +SparkR allows the fitting

[GitHub] spark pull request: [SPARK-9713] [ML] Document SparkR MLlib glm() ...

2015-08-11 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8085#discussion_r36811157 --- Diff: docs/sparkr.md --- @@ -210,6 +210,43 @@ head(df) {% endhighlight %} </div> +### Model Formulae --- End diff -- Done

[GitHub] spark pull request: [SPARK-9713] [ML] Document SparkR MLlib glm() ...

2015-08-11 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/8085#discussion_r36811172 --- Diff: docs/sparkr.md --- @@ -210,6 +210,43 @@ head(df) {% endhighlight %} </div> +### Model Formulae + --- End diff

[GitHub] spark pull request: [SPARK-9681] [ML] Support R feature interactio...

2015-08-06 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/7987 [SPARK-9681] [ML] Support R feature interactions in RFormula This adds support for the interaction (:) operator to the RFormula feature transformer. Design doc from umbrella task: https
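
As a rough illustration of what an interaction term computes, the sketch below assumes the usual model-matrix reading of `a:b`, where the interaction columns are the pairwise products of the columns that encode `a` and `b`. The function name `interact` is hypothetical, and the code is plain Python, not the transformer added by the pull request.

```python
def interact(left, right):
    """Return all pairwise products of two feature vectors, in left-major order."""
    return [a * b for a in left for b in right]


# A numeric feature crossed with a one-hot encoded category:
numeric = [2.0]
one_hot = [0.0, 1.0]
crossed = interact(numeric, one_hot)
```

For two numeric features the result is a single product; for a numeric feature and a one-hot vector, the numeric value lands in the slot of the active category.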

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-27 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35598513 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala --- @@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-27 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35598495 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -114,25 +177,29 @@ class RFormula(override val uid: String

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-27 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35598510 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala --- @@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-27 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35598489 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -62,19 +77,72 @@ class RFormula(override val uid: String) /** @group

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-27 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35598503 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala --- @@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-27 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35598479 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -62,19 +77,72 @@ class RFormula(override val uid: String) /** @group

[GitHub] spark pull request: [SPARK-9391] [ML] Support minus, dot, and inte...

2015-07-27 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/7707 [SPARK-9391] [ML] Support minus, dot, and intercept operators in SparkR RFormula Adds '.', '-', and intercept parsing to RFormula. Also splits RFormulaParser into a separate file. Umbrella
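
A minimal sketch of how the '.', '-', and intercept terms of an R-style formula might be resolved against a list of column names. This is plain Python written for illustration, not the RFormulaParser from the pull request. The function name `resolve_terms` is hypothetical, and the sketch assumes column names contain no '+', '-', or '~' characters.

```python
def resolve_terms(formula, columns):
    """Resolve 'label ~ terms' against column names.

    Returns (label, feature_columns, has_intercept).
    """
    label, rhs = (part.strip() for part in formula.split("~", 1))
    # Rewrite 'a - b' as 'a + -b' so every term carries its own sign.
    tokens = [t.strip() for t in rhs.replace("-", "+-").split("+") if t.strip()]
    features = []
    has_intercept = True
    for token in tokens:
        removing = token.startswith("-")
        name = token.lstrip("-").strip()
        if name in ("0", "1"):
            # '+ 0' and '- 1' drop the intercept; '+ 1' and '- 0' keep it.
            has_intercept = (name == "1") != removing
        elif name == ".":
            if removing:
                raise ValueError("'.' cannot be removed in this sketch")
            # '.' expands to every column except the label.
            features += [c for c in columns if c != label and c not in features]
        elif removing:
            features = [f for f in features if f != name]
        elif name not in features:
            features.append(name)
    return label, features, has_intercept
```

For example, `resolve_terms("y ~ . - b", ["y", "a", "b", "c"])` selects columns `a` and `c` and keeps the intercept, while a trailing `- 1` or `+ 0` turns the intercept off.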

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-24 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35397462 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -130,9 +173,52 @@ class RFormula(override val uid: String) Label

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-24 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35397464 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -130,9 +173,52 @@ class RFormula(override val uid: String) Label

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-24 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7574#discussion_r35397461 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -62,19 +77,60 @@ class RFormula(override val uid: String) /** @group

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-21 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/7574 [SPARK-9230] [ML] Support StringType features in RFormula This adds StringType feature support via OneHotEncoder. As part of this task it was necessary to change RFormula to an Estimator, so
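
The split between a fitting step and a transforming step for string features can be sketched in plain Python. This is one plausible reading of why a fit step is involved, offered as an assumption: the set of categories has to be learned from data before rows can be encoded. The class names `OneHotEstimator` and `OneHotModel` are hypothetical, no reference category is dropped, and none of this reproduces Spark's OneHotEncoder or RFormula.

```python
class OneHotModel:
    """Encodes strings using a category list fixed at fit time."""

    def __init__(self, categories):
        self.categories = categories

    def transform(self, values):
        width = len(self.categories)
        rows = []
        for v in values:
            row = [0.0] * width
            # Unseen categories raise ValueError in this sketch.
            row[self.categories.index(v)] = 1.0
            rows.append(row)
        return rows


class OneHotEstimator:
    """Learns the category list from data and returns a fitted model."""

    def fit(self, values):
        # Assumption for illustration: categories are ordered alphabetically.
        return OneHotModel(sorted(set(values)))


model = OneHotEstimator().fit(["b", "a", "b"])
encoded = model.transform(["a", "b"])
```

Because the fitted model carries its own category list, the same encoding is applied to any later data set.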

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-25 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7574#issuecomment-124916428 ptal

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-22 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7574#issuecomment-123961633 @mengxr to clarify, not calling `StringIndexer.fit` in `RFormula.fit` means RFormulaModel will have a reference to the original fitted dataset, correct?

[GitHub] spark pull request: [SPARK-9230] [ML] Support StringType features ...

2015-07-23 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7574#issuecomment-123988222 Hmm, I guess that is pretty harmless though. Will do.

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-16 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34830515 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-16 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7381#issuecomment-122069902 Sounds good, I'll look at the R integration next.

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632802 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632796 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632816 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632827 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -116,7 +116,7 @@ class VectorAssembler(override val uid: String

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632829 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RModelFormulaSuite.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632824 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632806 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632795 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632811 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632800 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632803 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632817 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-14 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34632833 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RModelFormulaSuite.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-15 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34742755 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RModelFormulaSuite.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-15 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34742729 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-15 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34742685 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RModelFormula.scala --- @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-15 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7381#discussion_r34742784 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RModelFormulaSuite.scala --- @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-15 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7381#issuecomment-121777321 @mengxr That makes sense, I'll do that in a followup PR. I also addressed the comments.

[GitHub] spark pull request: [SPARK-6805] [ML] Initial integration of MLlib...

2015-07-18 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r34945297 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-6805] [ML] Initial integration of MLlib...

2015-07-17 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/7483 [SPARK-6805] [ML] Initial integration of MLlib + SparkR using RFormula This exposes the SparkR:::glm() and SparkR:::predict() APIs. It was necessary to change RFormula to silently drop the label

[GitHub] spark pull request: [SPARK-8774] [ML] Add R model formula with bas...

2015-07-13 Thread ericl
GitHub user ericl opened a pull request: https://github.com/apache/spark/pull/7381 [SPARK-8774] [ML] Add R model formula with basic support as a transformer This implements minimal R formula support as a feature transformer. Both numeric and string labels are supported

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031787 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031800 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031805 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/r/MLUtils.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031781 --- Diff: R/pkg/inst/tests/test_mllib.R --- @@ -0,0 +1,34 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031795 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031811 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/r/MLUtils.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031804 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031777 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031775 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031784 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/r/MLUtils.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35031793 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,53 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35062705 --- Diff: R/pkg/inst/tests/test_mllib.R --- @@ -0,0 +1,34 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7483#issuecomment-123131497 @shivaram done

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35061955 --- Diff: R/pkg/NAMESPACE --- @@ -10,6 +10,10 @@ export(sparkR.init) export(sparkR.stop) export(print.jobj) +# MLlib integration +export

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on the pull request: https://github.com/apache/spark/pull/7483#issuecomment-123120127 @shivaram I think it already only captures if data = DataFrame.

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35062035 --- Diff: R/pkg/R/mllib.R --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request: [SPARK-9201] [ML] Initial integration of MLlib...

2015-07-20 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/7483#discussion_r35061986 --- Diff: R/pkg/inst/tests/test_mllib.R --- @@ -0,0 +1,34 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor
