Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125295030
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125295005
Merged build triggered.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125294308
test this please
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125300346
Merged build finished. Test FAILed.
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35596332
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -114,25 +177,29 @@ class RFormula(override val uid: String)
}
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35596324
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -62,19 +77,72 @@ class RFormula(override val uid: String)
/** @group
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35596320
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -62,19 +77,72 @@ class RFormula(override val uid: String)
/** @group
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125372454
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125372440
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125376053
[Test build #38602 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38602/consoleFull)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125375942
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125375931
Merged build triggered.
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35598513
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
@@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite with
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35598495
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -114,25 +177,29 @@ class RFormula(override val uid: String)
}
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35598510
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
@@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite with
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35598489
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -62,19 +77,72 @@ class RFormula(override val uid: String)
/** @group
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125371041
test this please
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35596483
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
@@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite with
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35596486
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
@@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite with
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35596488
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
@@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite with
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35598503
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
@@ -48,55 +49,59 @@ class RFormulaSuite extends SparkFunSuite with
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35598479
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -62,19 +77,72 @@ class RFormula(override val uid: String)
/** @group
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125373141
[Test build #38597 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38597/consoleFull)
for PR 7574 at commit
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125376595
LGTM pending Jenkins.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125396509
[Test build #38597 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38597/console)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125396596
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125380508
[Test build #38602 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38602/console)
for PR 7574 at commit
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125384454
Merged into master. Thanks!
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/7574
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-125380557
Merged build finished. Test PASSed.
Github user ericl commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124916428
ptal
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35397462
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -130,9 +173,52 @@ class RFormula(override val uid: String)
Label
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35397464
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -130,9 +173,52 @@ class RFormula(override val uid: String)
Label
Github user ericl commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35397461
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -62,19 +77,60 @@ class RFormula(override val uid: String)
/** @group
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124346736
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124346148
[Test build #38316 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38316/console)
for PR 7574 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124338457
[Test build #38316 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38316/consoleFull)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124769768
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124769744
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124770734
[Test build #38410 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38410/consoleFull)
for PR 7574 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124776979
[Test build #38410 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/38410/console)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124777072
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124338057
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-124338078
Merged build started.
Github user ericl commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123988222
Hmm, I guess that is pretty harmless though. Will do.
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123994687
You can construct a `Pipeline` object in `RFormula.fit`, which contains all
`StringIndexer`, `OneHotEncoder`, etc. Then call `Pipeline.fit` in
`RFormula.fit` and get the
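
The suggestion above can be illustrated with a minimal, self-contained sketch. It does not use Spark's API: `ToyStringIndexer`, `ToyPipeline`, `ToyFormula`, and the `Row`/`Dataset` aliases are hypothetical stand-ins invented for this example, and the real `RFormula` implementation in the pull request may differ. The point is only the shape of the design: the formula's `fit` assembles stages and delegates all fitting to the pipeline, so it never fits an indexer itself.

```scala
// Hypothetical toy types for illustration only; this is NOT Spark's API.
object PipelineSketch {
  type Row = Map[String, Any]
  type Dataset = Seq[Row]

  trait Transformer { def transform(data: Dataset): Dataset }
  trait Estimator { def fit(data: Dataset): Transformer }

  // Toy stand-in for a string indexer: learns a string -> index mapping at fit time.
  final class ToyStringIndexer(in: String, out: String) extends Estimator {
    def fit(data: Dataset): Transformer = {
      val index = data.map(_(in).toString).distinct.sorted.zipWithIndex.toMap
      new Transformer {
        def transform(d: Dataset): Dataset =
          d.map(r => r + (out -> index(r(in).toString)))
      }
    }
  }

  // Toy stand-in for a pipeline: fits each stage on the output of the earlier stages.
  final class ToyPipeline(stages: List[Estimator]) extends Estimator {
    def fit(data: Dataset): Transformer = {
      var current = data
      val fitted = stages.map { stage =>
        val model = stage.fit(current)
        current = model.transform(current)
        model
      }
      new Transformer {
        def transform(d: Dataset): Dataset =
          fitted.foldLeft(d)((acc, model) => model.transform(acc))
      }
    }
  }

  // Toy stand-in for the formula estimator: it only builds the stages and
  // hands them to the pipeline, which does the actual fitting.
  final class ToyFormula(stringCols: List[String]) extends Estimator {
    def fit(data: Dataset): Transformer = {
      val stages = stringCols.map(c => new ToyStringIndexer(c, c + "_idx"))
      new ToyPipeline(stages).fit(data)
    }
  }
}
```

With this shape, the fitted model holds only the fitted stages (here, the learned index maps) rather than a reference to the dataset it was fitted on.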
Github user ericl commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123961633
@mengxr to clarify, not calling `StringIndexer.fit` in `RFormula.fit` means
RFormulaModel will have a reference to the original fitted dataset, correct?
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35279252
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -62,19 +77,60 @@ class RFormula(override val uid: String)
/** @group
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35279311
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -130,9 +173,52 @@ class RFormula(override val uid: String)
Label
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123913341
@ericl I think it is simpler to construct a `pipeline` in `RFormula.fit`
without calling `StringIndexer.fit` explicitly. That leaves space for
`pipeline.fit`
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/7574#discussion_r35279570
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala
---
@@ -130,9 +173,52 @@ class RFormula(override val uid: String)
Label
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123488315
[Test build #37982 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37982/console)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123488427
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123475216
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123475106
[Test build #37977 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37977/console)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123479213
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123479155
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123467145
[Test build #37977 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37977/consoleFull)
for PR 7574 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123465703
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123465662
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7574#issuecomment-123480884
[Test build #37982 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37982/consoleFull)
for PR 7574 at commit
GitHub user ericl opened a pull request:
https://github.com/apache/spark/pull/7574
[SPARK-9230] [ML] Support StringType features in RFormula
This adds StringType feature support via OneHotEncoder. As part of this
task it was necessary to change RFormula to an Estimator, so that