imatiach-msft commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample 
weights to decision trees
URL: https://github.com/apache/spark/pull/21632#issuecomment-458203376
 
 
   I think I made a mistake and it should actually be:
   ```
   val tolerance = Utils.EPSILON * (unweightedNumSamples + unweightedNumSamples)
   ```
   or perhaps a larger threshold:
   ```
   val tolerance = Utils.EPSILON * unweightedNumSamples * SomeLargeConstant
   ```
   but I will need to verify this by adding some debug output to ensure that no 
zero features slip through in the sample tests. Otherwise that tolerance would 
still be too low and the factor would need to be increased. My worry is that by 
using the square of the number of samples, the tolerance would become too high 
with a very large number of samples, and some values would then be treated as 
zero feature values, which we don't want.
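
   To make the scaling concern concrete, here is a rough sketch comparing a 
tolerance that grows linearly with the sample count against one that grows with 
its square. It assumes `Utils.EPSILON` is double-precision machine epsilon and 
uses `math.ulp(1.0)` as a stand-in; `ToleranceSketch`, `linearTolerance` and 
`quadraticTolerance` are hypothetical names for illustration, not code from the 
PR.

```scala
object ToleranceSketch {
  // Stand-in for Utils.EPSILON (assumption: machine epsilon for Double, 2^-52).
  val Epsilon: Double = math.ulp(1.0)

  // Linear scaling, mirroring EPSILON * (unweightedNumSamples + unweightedNumSamples).
  def linearTolerance(unweightedNumSamples: Long): Double =
    Epsilon * (unweightedNumSamples + unweightedNumSamples)

  // Quadratic scaling, i.e. using the square of the number of samples.
  def quadraticTolerance(unweightedNumSamples: Long): Double =
    Epsilon * unweightedNumSamples * unweightedNumSamples

  def main(args: Array[String]): Unit = {
    // Print both tolerances for increasingly large sample counts.
    for (n <- Seq(1000L, 1000000L, 100000000L)) {
      val lin = linearTolerance(n)
      val quad = quadraticTolerance(n)
      println(f"n=$n%d linear=$lin%.3e quadratic=$quad%.3e")
    }
  }
}
```

   Under these assumptions, at 10^8 samples the quadratic tolerance is already 
larger than 1, while the linear one stays below 10^-7, so the quadratic form 
could end up treating genuinely nonzero values as zero.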

