GitHub user shubhamchopra opened a pull request:

    https://github.com/apache/spark/pull/17673

    [SPARK-20372] [ML] Word2Vec Continuous Bag of Words model

    ## What changes were proposed in this pull request?
    
    This adds a Continuous Bag of Words (CBOW) implementation, trained with negative sampling, to Word2Vec.
    
    ## How was this patch tested?
    The patch was tested with the unit tests contributed as part of this PR.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shubhamchopra/spark Word2VecCBOW

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17673
    
----
commit 2e777406b9dd69a47952a9650ab2a7a323e93391
Author: Shubham Chopra <[email protected]>
Date:   2017-04-11T21:18:00Z

    Word2Vec CBOW + Negative Sampling implementation.

commit 5d76210a48004b8f6d8bb32e55ba2d67892d72c7
Author: Shubham Chopra <[email protected]>
Date:   2017-04-11T23:24:31Z

    Correcting the negative samples function.

commit 5725209ab48b0d0f2c58559d7b5da0a2dbbdb5d6
Author: Shubham Chopra <[email protected]>
Date:   2017-04-12T16:41:33Z

    Correcting scala style issue.

commit 3fa2be111125a7739e0ed798c8c36df1ce826872
Author: Shubham Chopra <[email protected]>
Date:   2017-04-13T19:01:55Z

    Checking to make sure neg samples is less than the vocab size.

commit 8af4980af7522d7964cd265be8b021a8b39bad59
Author: Shubham Chopra <[email protected]>
Date:   2017-04-13T19:04:59Z

    removing unused function.

commit a8eb7ed50d357156107417c20b073b8ba5aefa81
Author: Shubham Chopra <[email protected]>
Date:   2017-04-13T19:06:15Z

    Adding test cases, similar to the ones for skip-gram based estimation.

----

