GitHub user jaxony opened a pull request:
https://github.com/apache/incubator-hivemall/pull/167
[HIVEMALL-220] Implement Cofactor
## What changes were proposed in this pull request?
Implemented Cofactor, a new matrix factorization algorithm for the
recommendation problem.
## What type of PR is it?
Feature
## What is the Jira issue?
https://issues.apache.org/jira/browse/HIVEMALL-220
## How was this patch tested?
Unit tests and manual testing on the MovieLens 20M (ML20M) dataset in a Hive dev environment.
## How to use this feature?
TODO
## Checklist
(Please remove this section if not needed; check `x` for YES, blank for NO)
- [ ] Did you apply the source code formatter, i.e., `./bin/format_code.sh`,
for your commit?
- [ ] Did you run system tests on Hive (or Spark)?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jaxony/incubator-hivemall feature/cofactor-feature-array
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-hivemall/pull/167.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #167
----
commit 118cdb531c1d809a889b51fc8fd8e3a471ab4352
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T05:59:09Z
feat: implementing CofactorModel as subclass of FactorizedModel
commit 58ab28cee1dd17b2ae50e5aca866d6d5a3e6bbf0
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T06:08:20Z
feat: implemented getter and setter for contextBias
commit 8430caefc79b0667dc32968877beed62a11b3ab7
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T08:17:07Z
feat: added co-occurrence matrix accumulation
commit c06378a8d22792c13f23e0a08073f5ee1f813172
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T08:57:17Z
CofactorModel: add hyperparameters c0 and c1
commit 41a3f001d819ac588c8f1e8a3034696384fc57e8
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T08:58:07Z
CofactorModel: c hyperparameters are final
commit d1eee785cd67e797d46de6d11f5961b92413e680
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T09:20:13Z
CofactorModel: Change c0 and c1 to float
commit 800d7ca80f91e1f3d10ce9c68bb4df2d80ddbde6
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T09:20:38Z
WIP: Implementing cofactor UDTF
commit ee76755ffab07ad004a9b2768cc01f3d52db5ad2
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T09:27:52Z
CofactorizationUDTF: rename scaling parameters
commit 43f15f97d91e1608baa114ce3d7aa9aa62ab0e2d
Author: Jackson Huang <huang.j@...>
Date: 2018-09-20T09:28:52Z
CofactorizationUDTF: Implement option parsing for cofactorization options
commit 4a81f0733a8bc03461e1e8db32b571d57b8aa9ef
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T02:52:35Z
make Cofactor standalone class: copied code from FactorizedModel
commit 90f89bbf1869960a59206d942b0fc73e4d678714
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T02:56:19Z
Remove user bias because cofactor paper does not use it
commit 6e1539bd225e38d36fe1b46bbd0eab24691dd59d
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T03:26:09Z
Added numItems to getOptions
commit 8e12e28240f6e967cc39ff5df1dec0b48804e3cd
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T03:26:42Z
Implementing RatingInitializer
commit e04ccc1a185bffaa4f5d5834575290ff18a8a52b
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T03:27:06Z
Added batch training class
commit 747c90c79827d65e23200bb7adb12c4e89d52b9e
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T03:27:26Z
Copied and pasted from OnlineMatrixFactorizationUDTF
commit d482f2c101628bffd2e3fc30ca1035a06b1e80c1
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T05:29:48Z
Implemented part of process()
commit 51cb2f05ed637a48167ce2db4b7b23fe18ba19a3
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T06:00:42Z
WIP: implementing process
commit e9e8a31b384a45638fee1ad0b41eefa989ea18ed
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T08:13:01Z
Removing zero features in input Feature[], Added test for nnz array creation
commit 73b68e29c9eda5c636f68f2e930ac56dd617c707
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T08:17:34Z
Assert non-zero entries in Features when updating cooccurrence matrix
commit ef28ed82404b3ce14d7f76907825de7f6515adfe
Author: Jackson Huang <huang.j@...>
Date: 2018-09-26T08:50:53Z
fix: use feature index instead of index of modified array for co-occurrence
updates
commit 34063ccebc52e8b4374530227c679f28e3c444a0
Author: Jackson Huang <jackson_huang@...>
Date: 2018-09-27T06:34:08Z
Removed SPPMI matrix as it will be supplied by the user
commit 65f9e72f3018992a2fb215bcacdb91acd8bf4296
Author: Jackson Huang <jackson_huang@...>
Date: 2018-09-27T07:13:06Z
Rename weight variable names to be the same as cofacto.py
commit 5b27101b0d41263a3f422f99cba3f2f2f97f3958
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T07:00:28Z
Change Feature#parseFeature method to public
commit 2bf64d02e8adc2005f9fa3280cca427279e170e4
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T07:01:28Z
WIP: changed input argument format to process(...)
commit b6c7261ebc7fedcec5681aaa13457ca2a0a8fe77
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T09:25:00Z
Refactor: less code duplication
commit 3c522ce1824c77bcba92e032d0f60ba91f7fd11b
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T09:29:23Z
Better implementation of minibatch data structure, updated writing data to
buffer
commit 985ea29f61c115d1b500fa2f4b6bea3a9af41f86
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T09:29:40Z
Remove RatingInitializer interface
commit 2aeeb5250dcebd942cce081eb34475fb10bc8569
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T09:35:47Z
Replace setBetaBias with flexible implementation
commit f174d2264be6d2ed479c53a5c2336faa1727fa89
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T09:35:58Z
Reformatting
commit b40f48f6e27dd8ab6c01c0545d9405f301e76393
Author: Jackson Huang <jackson_huang@...>
Date: 2018-10-11T09:37:02Z
More reformatting
----