Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/14937

@sethah I think the test result can be reproduced against the current patch; however, there are two issues to consider:
* Make sure an optimized/native BLAS is installed on your system and loaded correctly in the JVM via netlib-java. Otherwise, it will fall back to the pure-Java implementation.
* Make sure you load the dataset as DenseVector, which will be converted into DenseMatrix and benefit from the performance improvement:
```Scala
val df = spark.read.format("libsvm").options(Map("vectorType" -> "dense")).load(path)
```
Spark loads libsvm-format datasets into SparseVector/SparseMatrix by default, which falls into the sparse-data processing branch and causes a huge performance degradation.

Could you share some of your test details? If you have already taken the above two tips into account, please let me know as well. I'm on business travel and will resolve the merge conflicts in a few days. I'd appreciate hearing your thoughts on this issue. Thanks.
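To check the first tip, a minimal sketch (not from the PR) of how one might verify which BLAS implementation netlib-java resolved at runtime; it uses reflection so it still runs even when netlib-java is absent from the classpath:

```Scala
// Sketch: report the BLAS backend netlib-java loaded, assuming
// netlib-java (which ships with Spark) may or may not be on the classpath.
val blasImpl: String =
  try {
    // BLAS.getInstance() returns the concrete implementation in use.
    Class.forName("com.github.fommil.netlib.BLAS")
      .getMethod("getInstance")
      .invoke(null)
      .getClass.getName
  } catch {
    case _: Throwable => "netlib-java not on classpath"
  }

// If this names the pure-Java F2jBLAS class, the native library was not
// picked up and linear algebra falls back to the Java implementation.
println(blasImpl)
```

Running this in spark-shell should tell you immediately whether the native path is active before re-running the benchmark.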