imatiach-msft commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#issuecomment-458203376

I think I made a mistake and it should actually be:
```
val tolerance = Utils.EPSILON * (unweightedNumSamples + unweightedNumSamples)
```
or perhaps a larger threshold:
```
val tolerance = Utils.EPSILON * unweightedNumSamples * SomeLargeConstant
```
However, I will need to verify this by adding some debug output to ensure that no zero features slip through in the sample tests; otherwise the tolerance would still be too low and the factor would need to be increased. My worry is that if we use the square of the number of samples, the tolerance would become too high for a very large number of samples, and some values would then be counted as zero feature values, which we don't want.
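
To make the trade-off in the comment above concrete, here is a rough, standalone sketch comparing a linearly scaled tolerance with a squared one. It is not the PR's code: `Epsilon` is an assumed stand-in for `Utils.EPSILON` (taken here to be double-precision machine epsilon), and the `factor` argument is a hypothetical value standing in for `SomeLargeConstant`.

```scala
object ToleranceSketch {
  // Assumption: stand-in for Utils.EPSILON. math.ulp(1.0) is the
  // double-precision machine epsilon, 2^-52.
  val Epsilon: Double = math.ulp(1.0)

  // Linear scaling in the number of samples; `factor` is a hypothetical
  // placeholder for SomeLargeConstant.
  def linearTolerance(numSamples: Long, factor: Double): Double =
    Epsilon * numSamples * factor

  // Quadratic scaling, i.e. "the square of the samples".
  def squaredTolerance(numSamples: Long): Double =
    Epsilon * numSamples.toDouble * numSamples.toDouble

  def main(args: Array[String]): Unit = {
    // Print both tolerances for increasingly large sample counts.
    for (n <- Seq(1000L, 1000000L, 1000000000L)) {
      val lin = linearTolerance(n, 100.0)
      val sq = squaredTolerance(n)
      println(f"n=$n%d linear=$lin%.3e squared=$sq%.3e")
    }
  }
}
```

Under these assumptions, the squared tolerance at one billion samples is in the hundreds, so ordinary nonzero feature values would fall below it, while the linear form with a factor of 100 stays far below 1. Whether a given factor is large enough for the sample tests is exactly what still has to be checked.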
