GitHub user ericl opened a pull request:
https://github.com/apache/spark/pull/7483
[SPARK-6805] [ML] Initial integration of MLlib + SparkR using RFormula
This exposes the SparkR:::glm() and SparkR:::predict() APIs. To integrate
with the Pipeline API, RFormula had to be changed to silently drop the label
column when it is missing from the input dataset. This is something of a
hack, but it is necessary.
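
As a rough illustration of the label-dropping behavior described above, here is a minimal sketch in plain Python. It is not the Spark implementation: it assumes a simplified formula syntax of the form `label ~ term + term`, represents rows as dicts, and the helper names `parse_formula` and `transform_row` are hypothetical.

```python
def parse_formula(formula):
    """Split 'label ~ a + b' into ('label', ['a', 'b'])."""
    label, _, rhs = formula.partition("~")
    terms = [t.strip() for t in rhs.split("+") if t.strip()]
    return label.strip(), terms


def transform_row(row, formula):
    """Return a copy of `row` with a 'features' list, plus 'label' if available."""
    label_col, terms = parse_formula(formula)
    out = dict(row)
    out["features"] = [float(row[t]) for t in terms]
    # Assumption for this sketch: when the input lacks the label column
    # (e.g. at prediction time), it is skipped silently instead of raising.
    if label_col in row:
        out["label"] = float(row[label_col])
    return out


if __name__ == "__main__":
    formula = "y ~ a + b"
    print(transform_row({"y": 1, "a": 2, "b": 3}, formula))  # label present
    print(transform_row({"a": 2, "b": 3}, formula))          # label absent
```

With this behavior the same formula stage can be applied both when fitting (label present) and when predicting (label absent), which is what a pipeline stage needs.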
The umbrella design doc for MLlib + SparkR integration can be viewed here:
https://docs.google.com/document/d/10NZNSEurN2EdWM31uFYsgayIPfCFHiuIu3pCWrUmP_c/edit
@mengxr
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ericl/spark spark-8774
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7483.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7483
----
commit fb0826b875d8cda29dce6ec6654cdf0f66ac958f
Author: Eric Liang <[email protected]>
Date: 2015-07-14T00:32:11Z
[SPARK-8774] Add R model formula with basic support as a transformer
commit d33211b6f07eddc9cd048c20be48e93f004c4a1f
Author: Eric Liang <[email protected]>
Date: 2015-07-14T00:32:29Z
r support
commit 1f361b0e0f6a7de12a39bc1b75fd59f6a7128ab8
Author: Eric Liang <[email protected]>
Date: 2015-07-14T00:55:21Z
doc
commit 5765ec6ace737049c91a1096f3e5c4670a2b19f2
Author: Eric Liang <[email protected]>
Date: 2015-07-14T00:57:11Z
fix style checks
commit dc3c943a9e3167cd419451b3d83a720db5152b23
Author: Eric Liang <[email protected]>
Date: 2015-07-14T23:39:49Z
address comments
commit 2db68aaa26d2a963b528449a80cc6cd294c8ec06
Author: Eric Liang <[email protected]>
Date: 2015-07-15T23:36:16Z
second round of comments
commit d1959d2818b11c6b173442deb6582e73557545c2
Author: Eric Liang <[email protected]>
Date: 2015-07-15T23:47:26Z
clarify comment
commit 29a2ce7d9d7f5e659058fd879755f2c4816b820b
Author: Eric Liang <[email protected]>
Date: 2015-07-16T23:38:49Z
Merge branch 'spark-8774-1' into spark-8774
commit d417d0c1579cdf6ad1284765766778c1558d02dc
Author: Eric Liang <[email protected]>
Date: 2015-07-16T23:39:23Z
Merge remote-tracking branch 'upstream/master' into spark-8774
commit e37603f02a5d692e93d7e7a83bf6d99ba04577eb
Author: Eric Liang <[email protected]>
Date: 2015-07-17T19:15:03Z
Fri Jul 17 12:15:03 PDT 2015
commit 0299c59cdf3757aa533c836f3a5dffe8b5424e0f
Author: Eric Liang <[email protected]>
Date: 2015-07-17T20:40:32Z
Fri Jul 17 13:40:32 PDT 2015
commit ce61367eb44a4f6799a60f161ca78fca41973ca2
Author: Eric Liang <[email protected]>
Date: 2015-07-17T20:41:17Z
Fri Jul 17 13:41:17 PDT 2015
commit 3a63ae564b805ec984eb080c2333a1acefa6803d
Author: Eric Liang <[email protected]>
Date: 2015-07-17T20:41:52Z
Fri Jul 17 13:41:52 PDT 2015
commit 6b7f15f033612dc0cf1105227f070e33a424c3e0
Author: Eric Liang <[email protected]>
Date: 2015-07-17T21:20:22Z
Fri Jul 17 14:20:22 PDT 2015
commit 5afbc6730011d4aa7beccabd0b08737d2202fc89
Author: Eric Liang <[email protected]>
Date: 2015-07-17T21:26:13Z
test label columns
----