On Wed, Jun 24, 2015 at 12:02 PM, Nick Pentreath
<nick.pentre...@gmail.com> wrote:
> Oryx does almost the same but Oryx1 kept all user and item vectors in memory
> (though I am not sure about whether Oryx2 still stores all user and item
> vectors in memory or partitions in some way).

(Yes, this is a weakness, but it makes things fast and easy to manage. My
rule of thumb is that 1M user/item vectors need about 1GB of RAM,
comfortably, even with the necessary ancillary structures. If you can
afford N serving machines with a bunch of RAM, you can get away with
this for a long while, but that's an "if".)
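
To make that rule of thumb concrete, here is a rough back-of-the-envelope
sketch. The function names are made up for illustration, and the rank of 50
and 4-byte floats are assumptions of mine, not figures from Oryx. The gap
between the raw payload and the 1GB figure is the headroom for ancillary
structures (ID maps, object overhead, indexes).

```python
def raw_vector_bytes(num_vectors, rank=50, bytes_per_value=4):
    """Bytes needed for the bare float payload of the factor vectors.

    rank=50 and 4-byte floats are illustrative assumptions.
    """
    return num_vectors * rank * bytes_per_value


def rule_of_thumb_gb(num_vectors, vectors_per_gb=1_000_000):
    """RAM estimate using the '1M vectors ~= 1GB' rule of thumb."""
    return num_vectors / vectors_per_gb


if __name__ == "__main__":
    total = 20_000_000  # hypothetical count of user + item vectors
    print("raw payload bytes:", raw_vector_bytes(total))
    print("rule-of-thumb GB: ", rule_of_thumb_gb(total))
```

Under those assumptions the raw floats are a small fraction of the budget,
which is why the estimate stays comfortable in practice.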

Scoring in memory is just the first step if serving needs to be
real-time -- scoring probably also needs to be sub-linear in the number
of items (i.e. don't even score all items), but this is a tangent
relative to the Spark-related question.
