zhengruifeng commented on issue #27374: [SPARK-30659][ML][PYSPARK] LogisticRegression blockify input vectors URL: https://github.com/apache/spark/pull/27374#issuecomment-579302386 @srowen The original dataset `a9a` is not big: numFeatures=123 and numInstances=32,561; after upsampling, numInstances=32,561×256=8,335,616. I ran other performance tests, and the performance seems to depend on `numFeatures` and `blockSize`. My guess is that it comes down to one question: given an array of vectors, how much faster can Level-2/3 BLAS be than the existing Java implementation or Level-1 BLAS? Thanks for reviewing!
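To make the comparison in the comment concrete, here is a minimal pure-Python sketch of the two computation shapes: one dot product per instance (Level-1 style) versus one matrix-vector product per block of stacked instances (Level-2 style). It is not the code from the PR. The names `blockify`, `gemv`, `margins_per_row`, and `margins_per_block` are hypothetical, and `gemv` is a plain-Python stand-in for where a native BLAS call would go, so it shows only that both shapes give the same margins, not the speed difference.

```python
from typing import Iterator, List

Vector = List[float]
Block = List[Vector]  # row-major matrix: one instance per row


def blockify(rows: List[Vector], block_size: int) -> Iterator[Block]:
    """Group consecutive vectors into blocks of at most block_size rows."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for start in range(0, len(rows), block_size):
        yield rows[start:start + block_size]


def dot(x: Vector, w: Vector) -> float:
    """Level-1 style operation: a single vector-vector dot product."""
    return sum(xi * wi for xi, wi in zip(x, w))


def gemv(block: Block, w: Vector) -> List[float]:
    """Level-2 style operation: block (matrix) times w (vector).

    Stand-in only: a real implementation would hand the whole block
    to one native BLAS call instead of looping in Python.
    """
    return [dot(row, w) for row in block]


def margins_per_row(rows: List[Vector], w: Vector) -> List[float]:
    """One dot product per instance."""
    return [dot(x, w) for x in rows]


def margins_per_block(rows: List[Vector], w: Vector, block_size: int) -> List[float]:
    """One matrix-vector product per block of instances."""
    out: List[float] = []
    for block in blockify(rows, block_size):
        out.extend(gemv(block, w))
    return out


if __name__ == "__main__":
    rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    w = [1.0, -1.0]
    print(margins_per_row(rows, w))
    print(margins_per_block(rows, w, block_size=2))
```

Both functions return the same margins; what changes with `block_size` is how many instances are handed to each matrix-vector call, which is consistent with the observation that performance depends on `numFeatures` and `blockSize`.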
