Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/17742
Using BLAS 3 while still keeping the output size as `n x m` rather than `n x k`
results in massively more shuffle data. I don't think any solution based on
exploding the intermediate data that much can be as efficient as this one: for
`k=10` it's ~80k little objects per block vs ~33 million.
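To make the size gap concrete, here is a small back-of-the-envelope sketch.
The block dimensions `n` and `m` used below are hypothetical (they are not
stated in this comment); they are picked only because they land in the same
order of magnitude as the figures quoted above.

```python
def intermediate_sizes(n, m, k):
    """Return (top_k_entries, full_entries) for one pair of blocks.

    n: rows in the source block, m: rows in the destination block,
    k: number of recommendations kept per source row.
    """
    top_k_entries = n * k  # one (id, score) entry per kept recommendation
    full_entries = n * m   # one score per (source, destination) pair
    return top_k_entries, full_entries


# Hypothetical block dimensions, not taken from the PR.
small, full = intermediate_sizes(n=8192, m=4096, k=10)
print(small, full, full // small)
```

The ratio between the two is simply `m / k`, so it grows with the block size
and shrinks as `k` grows.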
I had a version using BLAS 3 followed by a sort per row (see
https://issues.apache.org/jira/browse/SPARK-11968 for branch link and test
details). For MLLIB it was 1.5x slower than this approach. I just re-tested
for ML and it is 56s vs 16s for this approach, so really significantly
slower.
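As a rough illustration of the two strategies being compared (this is not the
Spark code from either branch), the pure-Python sketch below computes the
top-k per row in two ways: materialising the full `n x m` score matrix and
sorting each row, and keeping at most `k` candidates per row in a bounded
heap. The function names and the toy factors are hypothetical.

```python
import heapq


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k_full_sort(src, dst, k):
    """'gemm then sort': build all n x m scores, then sort every row."""
    scores = [[dot(u, v) for v in dst] for u in src]  # n x m intermediate
    result = []
    for row in scores:
        # Stable sort by descending score; ties keep the lower index first.
        ranked = sorted(enumerate(row), key=lambda js: -js[1])
        result.append(ranked[:k])
    return result


def top_k_bounded_heap(src, dst, k):
    """Keep at most k candidates per row; no n x m intermediate."""
    result = []
    for u in src:
        heap = []  # min-heap of (score, -index); weakest candidate on top
        for j, v in enumerate(dst):
            entry = (dot(u, v), -j)
            if len(heap) < k:
                heapq.heappush(heap, entry)
            elif entry > heap[0]:
                heapq.heapreplace(heap, entry)
        best = sorted(heap, reverse=True)
        result.append([(-neg_j, s) for s, neg_j in best])
    return result


# Toy integer factors, purely for illustration.
src = [[1, 0], [0, 1], [1, 1]]
dst = [[1, 2], [3, 4], [5, 6], [0, 0]]
print(top_k_full_sort(src, dst, 2))
print(top_k_bounded_heap(src, dst, 2))
```

Both functions return the same `(index, score)` pairs; the difference is only
in how much intermediate data each one holds at a time.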
Comparatively, both approaches created the intermediate `output` objects
(but only `n x k` size). Certainly that part could perhaps be further
optimized. However, the BLAS3 approach still had around 20% GC time vs around
12% from this approach. Each gemm does indeed require a large intermediate
array and this seems to cause additional GC time (whether directly or
indirectly).
Even without that, this approach is a lot faster than `gemm` plus a sort for
the top-k by row. I'm sure the per-row top-k can be made a lot more efficient
and that is worth exploring (though frankly I am doubtful it will result in
that much more gain over this approach, relative to the code complexity it will
introduce). The small object GC can perhaps be improved with the iterator
approach and avoiding creating the `output` array (that may be good for another
5% or so perhaps?) - this applies to whatever approach is used.
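
A minimal sketch of what an iterator-style variant could look like, assuming
the goal is to emit each recommendation lazily instead of first collecting
everything into an `output` array. It is written in plain Python rather than
Spark code, and `top_k_iter` and the `(id, factors)` input layout are
hypothetical.

```python
import heapq


def top_k_iter(src, dst, k):
    """Lazily yield (src_id, dst_id, score), best first for each source row.

    src and dst are iterables of (id, factor_vector) pairs.
    """
    dst = list(dst)  # the destination block is re-scanned for every source row
    for src_id, u in src:
        # Generator of (score, dst_id); nlargest holds only k of them at once.
        scored = (
            (sum(a * b for a, b in zip(u, v)), dst_id) for dst_id, v in dst
        )
        for score, dst_id in heapq.nlargest(k, scored):
            yield src_id, dst_id, score


# Toy integer factors, purely for illustration.
src = [(10, [1, 0]), (11, [0, 1])]
dst = [(0, [1, 2]), (1, [3, 4]), (2, [5, 6])]
for rec in top_k_iter(src, dst, 2):
    print(rec)
```

Because the function is a generator, a consumer can stream the results
onward without the full `n x k` collection ever existing in one place.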