Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/17793#discussion_r114009389
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala ---
@@ -910,26 +944,127 @@ object ALS extends DefaultParamsReadable[ALS] with Logging {
private type FactorBlock = Array[Array[Float]]
/**
- * Out-link block that stores, for each dst (item/user) block, which src (user/item) factors to
- * send. For example, outLinkBlock(0) contains the local indices (not the original src IDs) of the
- * src factors in this block to send to dst block 0.
+ * Out-link blocks that store information about which columns of the items factor matrix are
--- End diff --
Is this any clearer? "For each user in each block, a mapping of which item
blocks that user's factors must be sent to in order to compute the updated item
factors, and vice versa."
Referring to user rows or item columns seems unnecessary since you can
transpose the ratings matrix and get opposite mappings. There may be some
standard convention though.
Also, how about adding an example to the doc comment, something like:
````scala
/**
 * Say user block 0 corresponds to users 1, 42, 29575. Then a corresponding outblock of:
 *
 * {{{
 *   [[0, 15, 42],
 *    [12, 43],
 *    [314]]
 * }}}
 * means that user 1 factors must be sent to item blocks 0, 15, and 42; user 42 factors must be
 * sent to item blocks 12 and 43; user 29575 factors must be sent to item block 314.
 */
````
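To make the two readings concrete, here is a rough standalone sketch. It is not the ALS internals: `OutBlockSketch`, `userIds`, `perUser`, and `perDstBlock` are names I made up for illustration. It holds the per-user mapping from the example above and inverts it into the per-destination-block form that the removed comment describes (for each dst block, the local indices of the src factors to send).

```scala
object OutBlockSketch {
  // Hypothetical data from the example above: the user IDs held by user block 0.
  val userIds: Array[Int] = Array(1, 42, 29575)

  // Per-user view: entry i lists the item blocks that need the factors of userIds(i).
  val perUser: Array[Array[Int]] = Array(
    Array(0, 15, 42),
    Array(12, 43),
    Array(314))

  // Per-destination view: for each item block, the local indices (positions within
  // this user block, not the original user IDs) of the users whose factors go there.
  def perDstBlock(mapping: Array[Array[Int]]): Map[Int, Array[Int]] = {
    val pairs = mapping.zipWithIndex.flatMap { case (dstBlocks, localIdx) =>
      dstBlocks.iterator.map(dst => (dst, localIdx))
    }
    pairs.groupBy(_._1).map { case (dst, ps) => dst -> ps.map(_._2) }
  }

  def main(args: Array[String]): Unit = {
    // Print the per-user reading.
    perUser.zip(userIds).foreach { case (dstBlocks, id) =>
      println(s"user $id -> item blocks ${dstBlocks.mkString(", ")}")
    }
    // Print the inverted, per-destination reading.
    perDstBlock(perUser).toSeq.sortBy(_._1).foreach { case (dst, localIdxs) =>
      println(s"item block $dst <- local user indices ${localIdxs.mkString(", ")}")
    }
  }
}
```

Either form carries the same information, so the doc comment mostly needs to say which one the code actually stores.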