GitHub user jkbradley opened a pull request:

    https://github.com/apache/spark/pull/21090

    [SPARK-15784][ML] Add Power Iteration Clustering to spark.ml

    ## What changes were proposed in this pull request?
    
    This PR adds PowerIterationClustering as a Transformer to spark.ml. Its
    transform method calls spark.mllib's PowerIterationClustering.run() and
    returns the resulting assignments (the k-means clustering of the
    pseudo-eigenvector) as a DataFrame (id: LongType, cluster: IntegerType).
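
    For readers unfamiliar with the spark.mllib API, the conversion described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the object name, the toy similarity graph, and the parameter values are assumptions, and it calls spark.mllib directly rather than going through the new spark.ml Transformer.

```scala
import org.apache.spark.mllib.clustering.PowerIterationClustering
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical standalone sketch; not the implementation added by this PR.
object PicAssignmentsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pic-assignments-sketch")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    // Assumed toy input: (srcId, dstId, similarity) triples describing
    // two triangles joined by one weak edge.
    val similarities = spark.sparkContext.parallelize(Seq(
      (0L, 1L, 1.0), (1L, 2L, 1.0), (0L, 2L, 1.0),
      (3L, 4L, 1.0), (4L, 5L, 1.0), (3L, 5L, 1.0),
      (2L, 3L, 0.01)
    ))

    // Run the existing spark.mllib algorithm.
    val model = new PowerIterationClustering()
      .setK(2)
      .setMaxIterations(20)
      .run(similarities)

    // model.assignments is an RDD of Assignment(id: Long, cluster: Int);
    // map it to tuples and name the columns to get the DataFrame shape
    // described above (id: LongType, cluster: IntegerType).
    val assignments: DataFrame = model.assignments
      .map(a => (a.id, a.cluster))
      .toDF("id", "cluster")

    assignments.printSchema()
    // Cluster labels depend on initialization, so no expected output is shown.
    assignments.orderBy("id").show()

    spark.stop()
  }
}
```

    Running it requires Spark's spark-sql and spark-mllib artifacts on the classpath.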
    
    This PR is copied and modified from 
https://github.com/apache/spark/pull/15770. The primary author is @wangmiao1981.
    
    ## How was this patch tested?
    
    This PR has 2 types of tests:
    * Copies of tests from spark.mllib's PIC tests
    * New tests specific to the spark.ml APIs


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jkbradley/spark wangmiao1981-pic

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21090
    
----
commit e4492a64b74b0bbbbccc2da8f13353d37bb9bb0c
Author: [email protected] <wm624@...>
Date:   2016-06-13T19:47:42Z

    add pic framework (model, class etc)

commit 70862491e5b86ce4add500a0c96ae5220733b35d
Author: [email protected] <wm624@...>
Date:   2016-06-13T23:28:09Z

    change a comment

commit b73d8a78fa69f83c278996feb1b19502ef871c5b
Author: [email protected] <wm624@...>
Date:   2016-06-17T17:27:55Z

    add missing functions fit predict load save etc.

commit 022fe523f735c5519f948b175871489f79434fb5
Author: [email protected] <wm624@...>
Date:   2016-06-18T01:12:41Z

    add unit test file

commit 552cf54fb03f88af023f080e60fa50f1f39060fc
Author: [email protected] <wm624@...>
Date:   2016-06-20T17:35:05Z

    add test cases part 1

commit 0b4954d55b4d344794d3c47366220c67f07d0d43
Author: [email protected] <wm624@...>
Date:   2016-06-20T20:29:54Z

    add unit test part 2: test fit, parameters etc.

commit f22b01e06eaaf5951befcebdffc18c8e519183d2
Author: [email protected] <wm624@...>
Date:   2016-06-20T21:22:59Z

    fix a type issue

commit 305b194dae40eaff990c18837c3f2bc8d469e60c
Author: [email protected] <wm624@...>
Date:   2016-06-21T20:07:27Z

    add more unit tests

commit 4b32cbf02965c5c1a0c094fa144836dab0dfd543
Author: [email protected] <wm624@...>
Date:   2016-06-21T21:46:25Z

    delete unused import and add comments

commit f6eda88a6c0af416b988a2c37f46c8b08e5e99cf
Author: [email protected] <wm624@...>
Date:   2016-10-25T21:28:12Z

    change version to 2.1.0

commit 45c4b1cd1afa28c775c666b57ecee614ed9a41d0
Author: [email protected] <wm624@...>
Date:   2016-11-03T23:26:01Z

    change PIC as a Transformer

commit e8d7ed37138909d010a812fba7d03ef30a4f6e05
Author: [email protected] <wm624@...>
Date:   2016-11-04T17:28:26Z

    add LabelCol

commit e4e1e055a9b3ab54b83331ac7dc56d6b792dcf7b
Author: [email protected] <wm624@...>
Date:   2016-11-04T18:36:09Z

    change col implementation

commit 8384422ec0e7192cc8ce53df02ddb4ae0401fd0b
Author: [email protected] <wm624@...>
Date:   2017-02-17T22:20:00Z

    address some of the comments

commit d6a199c48ff940861d80caf275da29d99375ce33
Author: [email protected] <wm624@...>
Date:   2017-02-21T22:37:51Z

    add additional test with dataset having more data

commit b0c3aff4a76ace99c104c2b2c10c9485a028bfd6
Author: [email protected] <wm624@...>
Date:   2017-03-14T23:13:45Z

    change input data format

commit 091225dd2f1c353edc28dc4299034a018a92bc81
Author: [email protected] <wm624@...>
Date:   2017-03-15T22:49:45Z

    resolve warnings

commit 8bb99567556ce29c75d5f395157d0161dff695bc
Author: [email protected] <wm624@...>
Date:   2017-03-16T18:33:47Z

    add neighbor and weight cols

commit 8ba82e8392e6d607ab750ed8eb3caaf386e1352a
Author: wangmiao1981 <wm624@...>
Date:   2017-08-15T21:13:55Z

    address review comments 1

commit 468a94741efe6530c9acfbb1af4f46499550ee1f
Author: wangmiao1981 <wm624@...>
Date:   2017-08-15T21:23:39Z

    fix style

commit ec10f24336ff51354a1657c7ceadb9ada8cd1484
Author: wangmiao1981 <wm624@...>
Date:   2017-08-15T22:30:28Z

    remove unused comments

commit 5710cfcf2e3596c95f353ce043f7358a030d70a0
Author: wangmiao1981 <wm624@...>
Date:   2017-08-15T23:43:14Z

    add Since

commit 88654b3055ebd863e3b3c5774abdce28f3cda184
Author: wangmiao1981 <wm624@...>
Date:   2017-08-17T00:12:12Z

    fix missing >

commit 804adc6fece91e7264f315ee965faa40c5e334c5
Author: wangmiao1981 <wm624@...>
Date:   2017-08-17T17:26:40Z

    fix doc

commit 4a6dd79a9c37f71ea4378692438f19b3247b7913
Author: wangmiao1981 <wm624@...>
Date:   2017-10-25T23:16:55Z

    address review comments

commit 5cb8ed6de3865f58719b3b30888b3bc4542905d4
Author: wangmiao1981 <wm624@...>
Date:   2017-10-30T21:44:24Z

    fix unit test

commit 6abf6023868d944068a26186cde3fbadffd83a74
Author: Joseph K. Bradley <joseph@...>
Date:   2018-04-03T23:46:40Z

    cleanups to docs

commit d9270876797153d7660843fc621e707b4dff71ca
Author: Joseph K. Bradley <joseph@...>
Date:   2018-04-03T23:52:36Z

    typo

commit d2157489770a79fe443d567bfc03d61f72fbe161
Author: Joseph K. Bradley <joseph@...>
Date:   2018-04-17T20:17:15Z

    final updates for PIC PR

----


