GitHub user holdenk reopened a pull request:
https://github.com/apache/spark/pull/9524
[SPARK-10387][ML][WIP] Add code gen for gbt
This PR adds code gen for GBT in ML. It uses quasiquotes because they give
nice compile-time type messages. I'd also like feedback on how we want to
expose code gen: the current iteration gives the user a `toCodeGen` function
they can call, but we could instead use a config flag or something similar
(open to suggestions). This is based on DB's work (as mentioned in the JIRA).
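For readers unfamiliar with the idea, here is a minimal sketch of what code gen for a tree model can look like: the tree is flattened into nested `if`/`else` source text, so prediction no longer has to walk node objects. This is not the code in this PR. The `TreeCodeGenSketch` class, its `Node` type, and the `toJavaSource` helper are hypothetical names made up for illustration, and the sketch assumes a continuous split sends a row left when `feature <= threshold`. It only produces the source string; compiling that string into a callable function is left out.

```java
public class TreeCodeGenSketch {
    /** Hypothetical minimal tree node; a node with no children is a leaf. */
    static final class Node {
        final int feature;        // index of the feature tested by an internal node
        final double threshold;   // split threshold of an internal node
        final double prediction;  // value returned by a leaf
        final Node left;
        final Node right;

        private Node(int feature, double threshold, double prediction, Node left, Node right) {
            this.feature = feature;
            this.threshold = threshold;
            this.prediction = prediction;
            this.left = left;
            this.right = right;
        }

        static Node leaf(double prediction) {
            return new Node(-1, 0.0, prediction, null, null);
        }

        static Node split(int feature, double threshold, Node left, Node right) {
            return new Node(feature, threshold, 0.0, left, right);
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /** Interpreted prediction: walk the node objects one level at a time. */
    static double predict(Node node, double[] features) {
        while (!node.isLeaf()) {
            // Assumed convention: go left when the feature is <= the threshold.
            node = features[node.feature] <= node.threshold ? node.left : node.right;
        }
        return node.prediction;
    }

    /** Emit the body of a Java method that makes the same decisions as predict(). */
    static String toJavaSource(Node node, String indent) {
        if (node.isLeaf()) {
            return indent + "return " + node.prediction + ";\n";
        }
        return indent + "if (features[" + node.feature + "] <= " + node.threshold + ") {\n"
            + toJavaSource(node.left, indent + "  ")
            + indent + "} else {\n"
            + toJavaSource(node.right, indent + "  ")
            + indent + "}\n";
    }

    public static void main(String[] args) {
        Node tree = Node.split(0, 0.5,
            Node.leaf(1.0),
            Node.split(1, 2.0, Node.leaf(0.0), Node.leaf(1.0)));
        // Print the generated method body for inspection.
        System.out.print(toJavaSource(tree, ""));
    }
}
```

Because this naive approach inlines the whole tree into one method body, the generated source grows with the number of nodes, so very large trees would probably need to be split across several methods.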
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/holdenk/spark SPARK-10387-code-gen-for-gbt
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9524.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9524
----
commit de332b7cebe2e44dda46ad95214479973380f9b3
Author: Holden Karau <[email protected]>
Date: 2015-11-02T20:41:06Z
Start some work towards adding a codegen capable model
commit 344349a28b40b9e9c3b9026f2b0dbd4704567358
Author: Holden Karau <[email protected]>
Date: 2015-11-03T20:12:48Z
Merge branch 'master' into SPARK-10387-code-gen-for-gbt
commit 8be52cc0a4650164ccea2285e23473e7160d3156
Author: Holden Karau <[email protected]>
Date: 2015-11-04T20:08:47Z
Try and get quasi quotes for 2.10
commit 7ec3b4a78c667db43211b08a8ed0edf9dc5d968e
Author: Holden Karau <[email protected]>
Date: 2015-11-04T20:43:25Z
move tree bits around some
commit 28c134f29bc2b244b210f314324245f1fe25c660
Author: Holden Karau <[email protected]>
Date: 2015-11-04T21:33:24Z
Add quasiquotes lib
commit df279367a3b1772df285fd9201fd12303a9d69e5
Author: Holden Karau <[email protected]>
Date: 2015-11-04T21:36:54Z
Update based on what spark sql used to do
commit 3c3e67e1427a3b22c138e954c4c03247402a42b8
Author: Holden Karau <[email protected]>
Date: 2015-11-04T21:37:05Z
Update tree model codegen a bit
commit 3d8a6ad08654b3057d069e871802d7ad2165f185
Author: Holden Karau <[email protected]>
Date: 2015-11-04T21:54:29Z
Well, Spark SQL put it in the root pom, maybe for a good reason. Let's try
that
commit 0ec21f7fbbb1d10ad4cf38499f608b7727a0c878
Author: Holden Karau <[email protected]>
Date: 2015-11-04T23:31:30Z
It compiles
commit 5cfbf5c024034c93f8c213255e80aa3cd7e4912e
Author: Holden Karau <[email protected]>
Date: 2015-11-05T00:47:15Z
It compiles
commit d2b90066bd1f3bc1b82dff54d4d8084c93f21962
Author: Holden Karau <[email protected]>
Date: 2015-11-05T01:06:23Z
Cleanup some whitespace
commit 64a69104b1b35b2f3289a82918dc2916abd9cdb0
Author: Holden Karau <[email protected]>
Date: 2015-11-05T19:11:56Z
Add missing scala-compiler dependency so that quasi quotes can do its thing
commit 204fa7c6ecb5fb393d353978158191c7c60afda6
Author: Holden Karau <[email protected]>
Date: 2015-11-05T19:17:31Z
tests pass! #shipit
commit a1eadf0198c630cc192a13c4b475a45644e32eb6
Author: Holden Karau <[email protected]>
Date: 2015-11-05T19:40:31Z
Style fixes
commit a2de15b5411acfab0382479ce508a6146bf302f3
Author: Holden Karau <[email protected]>
Date: 2015-11-06T07:48:53Z
Merge branch 'master' into SPARK-10387-code-gen-for-gbt
commit de077419c3f9fb8de4d78d8adfaf039a992f8328
Author: Holden Karau <[email protected]>
Date: 2015-11-06T16:26:10Z
Add an explicit leaf node test
commit 037e611965bcb0ac9d5b9b2ab6936a1e6b5d3c0c
Author: Holden Karau <[email protected]>
Date: 2015-11-06T16:58:06Z
Use toCodeGen instead of just auto doing it for trees under 400, add it for
GBT classifier as well
commit 70b00dbd44022af6cbc0718560eb2ecfb7de39e6
Author: Holden Karau <[email protected]>
Date: 2015-11-06T18:11:16Z
Add explicit tests for a tree conversion to codegen
commit b27e5531dc573afa5ca99f68a61c1e69f0267b53
Author: Holden Karau <[email protected]>
Date: 2015-11-06T18:13:01Z
whoops need to be ge not geq for cond
commit aa315e202cebecf8de041358047919f0c5ac08a5
Author: Holden Karau <[email protected]>
Date: 2015-11-06T21:31:43Z
rip out quasi quotes for mllib
commit ca72c1c27a9d81d048019a3cc687cb65595ea516
Author: Holden Karau <[email protected]>
Date: 2015-11-06T21:54:00Z
Start switching MLLib codegen to java codegen
commit 9bc39bfaaa67656272df5723596c2fc5c42e79f3
Author: Holden Karau <[email protected]>
Date: 2015-11-06T22:34:48Z
Add missing profiles close tag
commit fa6764e98da0bac01400c2e1dca5b6a245fddf90
Author: Holden Karau <[email protected]>
Date: 2015-11-07T06:09:58Z
move the test
commit f12d0bd4708d4fde79c43439b701a176843e6cfa
Author: Holden Karau <[email protected]>
Date: 2015-11-07T07:38:19Z
Ok seems to run now
commit 471b934673d199f6649ae4f8dcaad68c1fb22313
Author: Holden Karau <[email protected]>
Date: 2015-11-08T03:12:51Z
Implement serializable for the function
commit c96caf18a25ddb1f43c8bfd75286aecb3653df5a
Author: Holden Karau <[email protected]>
Date: 2015-11-08T05:29:59Z
don't need scala macros anymore
commit 647b2e71a6d22cd52f67bf3dc50c31cf01a4a9d7
Author: Holden Karau <[email protected]>
Date: 2015-11-10T01:43:15Z
Merge branch 'master' into SPARK-10387-code-gen-for-gbt
commit ada176e7802e1a4e522ee1625663655ec648202a
Author: Holden Karau <[email protected]>
Date: 2015-11-19T05:50:48Z
Merge branch 'master' into SPARK-10387-code-gen-for-gbt
commit 8912ff16941ba6a5d7562eb8b83eead3f053c38c
Author: Holden Karau <[email protected]>
Date: 2015-11-19T09:03:08Z
Switch it around so we don't generate huge trees
commit a697dda5c083d2d9ccab52af5eee981c33dc0fd2
Author: Holden Karau <[email protected]>
Date: 2015-11-19T09:25:06Z
oops call
----