GitHub user MLnick opened a pull request:
https://github.com/apache/spark/pull/12574
[SPARK-13857][ML][WIP] Add "recommend all" functionality in ALS
This PR adds "recommend all" functionality to ML's `ALSModel`, similar to
what exists in the `recommendProductsForUsers/recommendUsersForProducts`
methods in MLlib's `MatrixFactorizationModel`.
This is a WIP to get some feedback around:
1. my approach to the generic `blockify` and `recommendAll` methods
in `MatrixFactorizationModel`, which handle both `Double` and `Float` types
(since `ml` uses `Float` for ratings and factors while `mllib` uses `Double`)
2. the approach to the new params related to recommending top-k in
`ALSModel`
3. semantics of `ALSModel.transform` and how it relates to pipelines,
cross-validation & evaluation (see this JIRA as well as
[SPARK-14409](http://issues.apache.org/jira/browse/SPARK-14409) and the linked
[design doc](https://docs.google.com/document/d/1YEvf5eEm2vRcALJs39yICWmUx6xFW5j8DvXFWbRbStE/edit#heading=h.a06u73tsuqc5)
and #12461).
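
For reviewers less familiar with the blocked "recommend all" idea in item 1, here is a minimal, framework-free sketch in plain Python. It is not the code in this PR or in `MatrixFactorizationModel`: the `blockify` and `recommend_all` functions below, their signatures, and the `block_size` parameter are hypothetical stand-ins, and the real implementation works on distributed blocks of factor vectors rather than in-memory lists. The sketch only illustrates the general pattern of grouping factors into blocks, scoring every user/item pair by dot product, and keeping the top-k items per user.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

# (id, latent factor vector); a hypothetical in-memory stand-in for factor rows
Factor = Tuple[int, Sequence[float]]


def blockify(factors: List[Factor], block_size: int) -> List[List[Factor]]:
    """Group (id, vector) pairs into consecutive blocks of at most block_size."""
    return [factors[i:i + block_size] for i in range(0, len(factors), block_size)]


def recommend_all(
    users: List[Factor],
    items: List[Factor],
    k: int,
    block_size: int = 2,
) -> Dict[int, List[Tuple[int, float]]]:
    """Return the top-k (item_id, score) pairs per user, best score first."""
    # One bounded min-heap of (score, item_id) per user.
    top: Dict[int, List[Tuple[float, int]]] = {uid: [] for uid, _ in users}
    for user_block in blockify(users, block_size):
        for item_block in blockify(items, block_size):
            # Score every user in this block against every item in this block.
            for uid, uvec in user_block:
                heap = top[uid]
                for iid, ivec in item_block:
                    score = sum(u * v for u, v in zip(uvec, ivec))
                    if len(heap) < k:
                        heapq.heappush(heap, (score, iid))
                    elif (score, iid) > heap[0]:
                        # Evict the current weakest candidate.
                        heapq.heapreplace(heap, (score, iid))
    return {
        uid: [(iid, score) for score, iid in sorted(heap, reverse=True)]
        for uid, heap in top.items()
    }


if __name__ == "__main__":
    users = [(0, [1.0, 0.0]), (1, [0.0, 1.0]), (2, [1.0, 2.0])]
    items = [(10, [2.0, 0.0]), (11, [0.0, 3.0]), (12, [1.0, 1.0])]
    for uid, recs in recommend_all(users, items, k=2).items():
        print(uid, recs)
```

Keeping a bounded heap per user means only `k` candidates per user are held at any time, which is the property that makes a block-by-block pass practical when the full user-by-item score matrix is too large to materialize.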
cc @mengxr @jkbradley @srowen @debasish83.
### TODO
- [ ] Decide on `transform` semantics and how this fits in with
cross-validation and evaluation.
- [ ] Performance testing vs `MatrixFactorizationModel` and vs the
alternative `transform` semantics.
- [ ] Clean up schema validation a bit and add related tests.
## How was this patch tested?
New unit tests in `ml.recommendation.ALSSuite`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MLnick/spark SPARK-13857-als-parity
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12574.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12574
----
commit fadfc74dc60faba70c1281eaf06472430cb0f606
Author: Nick Pentreath <[email protected]>
Date: 2016-04-15T12:09:16Z
initial work on adding 'recommendAll' methods to ML ALS
commit 8f54fd56bb437d2260e03ae5939807264a05c8e8
Author: Nick Pentreath <[email protected]>
Date: 2016-04-18T18:57:36Z
Update params and tests
commit 943b6c752f546db0bdbd84adff779e4220b7acf4
Author: Nick Pentreath <[email protected]>
Date: 2016-04-20T13:46:55Z
recommend all using transform
commit f9d1548452a19cb2bac9ebdf84d136e895fbaf61
Author: Nick Pentreath <[email protected]>
Date: 2016-04-21T06:54:32Z
remove space
commit f8cf8b4b73fd2bce64239f5d9617024285c13e90
Author: Nick Pentreath <[email protected]>
Date: 2016-04-21T09:13:33Z
clean up and docs
----