GitHub user yinxusen opened a pull request:
https://github.com/apache/spark/pull/14229
[SPARK-16447][ML][SparkR] LDA wrapper in SparkR
## What changes were proposed in this pull request?
Add an LDA wrapper to SparkR with the following interfaces:
- spark.lda(data, ...)
- spark.posterior(object, newData, ...)
- spark.perplexity(object, ...)
- summary(object)
- write.ml(object)
- read.ml(path)
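
A minimal usage sketch of how the interfaces listed above might fit together, not the definitive API of this patch. It assumes a running Spark installation, the sample file `data/mllib/sample_lda_libsvm_data.txt` from the Spark source tree, and argument names (`k`, `optimizer`, and a `path` argument to `write.ml`) as they appear in later SparkR releases. The signatures in this revision of the patch may differ.

```r
library(SparkR)

# Start (or attach to) a Spark session; assumes Spark is installed locally.
sparkR.session()

# Assumed input: the libsvm-formatted LDA sample shipped with Spark,
# which yields a SparkDataFrame with a "features" vector column.
docs <- read.df("data/mllib/sample_lda_libsvm_data.txt", source = "libsvm")

# Fit a topic model. k and optimizer are assumed argument names.
model <- spark.lda(docs, k = 10, optimizer = "em")

# Inspect the fitted model.
print(summary(model))

# Per-document topic distributions for (new) data.
posterior <- spark.posterior(model, docs)
showDF(posterior)

# Perplexity of the model on a dataset.
logPerplexity <- spark.perplexity(model, docs)

# Persist the model and load it back; the path is a throwaway temp location.
modelPath <- tempfile(pattern = "lda-model")
write.ml(model, modelPath)
restored <- read.ml(modelPath)

sparkR.session.stop()
```

The sketch reuses the training data for `spark.posterior` and `spark.perplexity` only to stay short; in practice held-out documents would be passed instead.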
## How was this patch tested?
Tested with SparkR unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/yinxusen/spark SPARK-16447
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14229.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14229
----
commit 4db86c16768cafff2a3091520a282764ce69bf84
Author: Xusen Yin <[email protected]>
Date: 2016-07-09T23:30:23Z
a runnable version
commit 7f8650dae796f85c66600c85dfb7ced26ed3e29e
Author: Xusen Yin <[email protected]>
Date: 2016-07-10T02:53:46Z
runnable version with complex args
commit 1487dcc9de85e95af3e2865d3a9068c9bc395928
Author: Xusen Yin <[email protected]>
Date: 2016-07-10T03:59:45Z
add summary without new dictionary
commit bdc38191f41e8ecb3b2c8caa46671f33db6576fc
Author: Xusen Yin <[email protected]>
Date: 2016-07-11T21:25:07Z
add test for spark.lda
commit 324871f3519465cd10ec85d670ddb3459416569e
Author: Xusen Yin <[email protected]>
Date: 2016-07-12T22:18:33Z
add new functions
commit 3be7105a4b6661b2c74e8e6f0a3936ca50c2c414
Author: Xusen Yin <[email protected]>
Date: 2016-07-12T23:36:18Z
merge with master
commit 7f3fcc63197ddc8f49ed2ee21956c2527d18a542
Author: Xusen Yin <[email protected]>
Date: 2016-07-14T20:10:49Z
add raw text input support
commit 27fa94b9c74814459a54286212e11b6426635ef1
Author: Xusen Yin <[email protected]>
Date: 2016-07-14T22:21:18Z
add vocabulary
commit 4f6aa1ecd6ae50212b456b7ae1c1ef83d10b8bbd
Author: Xusen Yin <[email protected]>
Date: 2016-07-15T22:31:34Z
add index to term dict
commit db61624ca9de39700c2fab83eedcb026a13bf8d7
Author: Xusen Yin <[email protected]>
Date: 2016-07-15T22:49:50Z
change likelihood to log one
commit 00e4e07a4fcbfe5b093511d1e5dfe09b6880a45f
Author: Xusen Yin <[email protected]>
Date: 2016-07-15T23:29:35Z
refine R docs
commit 89f0ae4b23a2d20d62c851db522718aa08c12514
Author: Xusen Yin <[email protected]>
Date: 2016-07-16T01:10:39Z
update docs and more tests
commit 02a7719f08dbc6c985b9c2768a444bbb4995ca28
Author: Xusen Yin <[email protected]>
Date: 2016-07-16T01:57:44Z
fix interface
----