GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/18551#discussion_r126086812
--- Diff: docs/ml-guide.md ---
@@ -61,6 +61,11 @@ To configure `netlib-java` / Breeze to use system
optimised binaries, include
project and read the [netlib-java](https://github.com/fommil/netlib-java)
documentation for your
platform's additional installation instructions.
+The most popular native BLAS such as [Intel
MKL](https://software.intel.com/en-us/mkl),
[OpenBLAS](http://www.openblas.net), are based on multi-threading.
+For example, when OpenBLAS is loaded, it will create a thread pool with
`MAX_CPU_NUMBER` threads, and the threads are using spinlock by default, which
will conflict with Spark.
--- End diff --
Is it always worse? For example, can it multi-thread a single computation?
That could be advantageous if the other tasks on the machine aren't
CPU-intensive, but it is probably counterproductive if all the tasks are
CPU-intensive.
Maybe we can soften the language slightly, to say you _might_ get better
performance by setting these to 1.
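For illustration, a minimal sketch of what "setting these to 1" could look like. It assumes "these" means the native libraries' thread-count environment variables (`OPENBLAS_NUM_THREADS` for OpenBLAS, `MKL_NUM_THREADS` for Intel MKL), which are not named in the hunk quoted above, and that they would be exported from something like `conf/spark-env.sh`:

```shell
# Assumption: limit each native BLAS library to a single thread per process,
# so that Spark's task-level parallelism is what keeps the cores busy.
export OPENBLAS_NUM_THREADS=1   # thread count used by OpenBLAS
export MKL_NUM_THREADS=1        # thread count used by Intel MKL
```

Whether this helps depends on the workload, so the docs should present it as something to try rather than a guaranteed win.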
Also, I don't think users will know what a spinlock or `MAX_CPU_NUMBER` is
(it's not a Spark value, is it?).
Also, I don't think we ever explain what MKL or OpenBLAS is anywhere in the
docs. You might briefly mention that this is explained in the netlib-java docs.