[GitHub] [spark] zhengruifeng commented on pull request #28974: [SPARK-31976][ML][PYSPARK] LinearSVC use MemoryUsage to control the size of block

GitBox Sun, 05 Jul 2020 19:35:19 -0700


zhengruifeng commented on pull request #28974:
URL: https://github.com/apache/spark/pull/28974#issuecomment-653984111



   @huaxingao since this change was suggested by @mengxr and @WeichenXu123, 
I prefer to add the performance tests after they confirm the current design is OK.
   
   In the current commit, if `maxBlockMemoryInMB == 0` (the default), the old code 
path `trainOnRows` is chosen. If we want to remove this code path in the 
future, there will be a small behavior change: the default 
`maxBlockMemoryInMB` will need to be set to another value (e.g. 4MB).
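
   To make the dispatch concrete, here is a minimal sketch of the selection 
logic described above. It is plain Python, not the actual Spark code: the 
function name `choose_training_path` and the label `trainOnBlocks` for the 
blockified path are assumptions used only for illustration.

```python
def choose_training_path(max_block_memory_in_mb: float) -> str:
    """Return the name of the code path selected for a given setting.

    Sketch only: `trainOnBlocks` is a hypothetical name for the
    blockified path; `trainOnRows` is the name used in the comment above.
    """
    if max_block_memory_in_mb < 0:
        raise ValueError("maxBlockMemoryInMB must be non-negative")
    if max_block_memory_in_mb == 0:
        # Default value: keep the old row-by-row path.
        return "trainOnRows"
    # Any positive value: train on blocks bounded by this memory budget.
    return "trainOnBlocks"
```

   If the row-based path were dropped later, the `== 0` branch would go away 
and the default would have to become a positive value.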
   
   Existing performance tests were run against `numRows`; I think the results 
can be converted to `maxBlockMemoryInMB`. (But I agree with adding another test 
directly on `maxBlockMemoryInMB`.)
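
   As a rough illustration of such a conversion, the sketch below assumes 
each block is a dense matrix of 8-byte doubles and ignores labels, weights, 
sparse storage and object overhead. Those assumptions, and the helper names 
`rows_to_block_mb` and `block_mb_to_rows`, are mine and are not taken from 
the PR.

```python
BYTES_PER_DOUBLE = 8
BYTES_PER_MB = 1024 * 1024


def rows_to_block_mb(num_rows: int, num_features: int) -> float:
    """Approximate size in MB of a dense block of num_rows x num_features doubles."""
    return num_rows * num_features * BYTES_PER_DOUBLE / BYTES_PER_MB


def block_mb_to_rows(block_mb: float, num_features: int) -> int:
    """Approximate number of dense rows that fit in a block of block_mb MB."""
    return int(block_mb * BYTES_PER_MB) // (num_features * BYTES_PER_DOUBLE)
```

   Under these assumptions, a `numRows` of 4096 with 128 dense features 
corresponds to about 4MB per block. Sparse datasets would need a different 
estimate.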
   
   





