Yes, I think you'd have to be more specific to get anything but general answers --
The non-distributed algorithms scale best, if by "scale" you mean CPU/memory required per unit of output. But they hit a point where they can't run at all, because you'd need a single machine so large that it's impractical. Every algorithm's needs grow differently as its input grows, and even the same algorithm's needs differ depending on the nature of the input (e.g. number of users versus number of items, not just total ratings, for recommenders). So there's no single answer to how much is needed per unit of output.

The distributed versions don't have this limit, so if by "scale" you mean the upper limit on the size of input that can be processed, there isn't one. They generally require more CPU/memory per unit of output due to the overhead of distributing the computation, but can then scale infinitely.
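
To make the single-machine ceiling concrete, here's a rough back-of-envelope sketch in Java. The per-rating memory footprint and the data-set size are assumptions for illustration only, not measurements from any particular implementation:

// Back-of-envelope sketch: memory needed to hold all ratings in RAM for a
// non-distributed, in-memory recommender. The 28-bytes-per-rating figure is
// an assumed footprint (user ID + item ID + value + object overhead), not a
// measured number.
public class MemoryEstimate {
    public static void main(String[] args) {
        long numUsers = 10_000_000L;     // hypothetical number of users
        long ratingsPerUser = 100L;      // hypothetical average ratings per user
        long bytesPerRating = 28L;       // assumed per-rating footprint in memory

        long totalRatings = numUsers * ratingsPerUser;
        double gigabytes = totalRatings * bytesPerRating / 1e9;

        // Roughly 28 GB of heap for 1 billion ratings under these assumptions:
        // still possible on one big machine, but growth much beyond this quickly
        // becomes impractical on a single box.
        System.out.printf("%d ratings -> roughly %.1f GB of heap%n",
                totalRatings, gigabytes);
    }
}

The exact constants don't matter; the point is that memory grows with the input, and a single machine's capacity doesn't.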