[GitHub] [spark] amanomer commented on issue #26454: [SPARK-29818][MLLIB] Missing persist on RDD

GitBox Sun, 10 Nov 2019 01:14:06 -0800

amanomer commented on issue #26454: [SPARK-29818][MLLIB] Missing persist on RDD
URL: https://github.com/apache/spark/pull/26454#issuecomment-552176930
 
 
   I have not run this patch against any performance benchmark, but these 
functions are quite generic, so most applications and vendors are probably 
using them. Wouldn't it be better to optimize them the same way we already do 
in other places?
   
https://github.com/apache/spark/blob/57b954e825970f004895ac127083da67e10c09fb/mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala#L155-L158
   
https://github.com/apache/spark/blob/57b954e825970f004895ac127083da67e10c09fb/mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala#L227
   @MaxGekk, kindly correct me if I am wrong. Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
