GitHub user sethah commented on the issue:

    https://github.com/apache/spark/pull/15721
  
    This issue also raises a more general question: how should sample weights 
be tested? There seem to be some common rules that every algorithm 
incorporating sample weights is expected to follow (mentioned above), but in 
some cases there are also algorithm-specific details. In my opinion, we ought 
to use a mixture of the two: common sample weight tests and algorithm-specific 
tests. This is the approach followed in linear/logistic/generalized linear 
regression, which all compare results on weighted data sets to R's output. 
Other algorithms that can't be compared directly with R will make use of the 
common helper functions. Random forests and decision trees will also rely 
heavily on these helpers in their tests.
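    
    To make the idea of "common sample weight tests" concrete, here is a rough,
    library-free sketch in plain Python. It assumes that the common rules
    include two invariances: integer weights behave like duplicating rows, and
    multiplying all weights by a constant leaves the model unchanged. The
    estimator and helper names (`weighted_linear_fit`, `oversample`, `close`)
    are hypothetical and are not Spark's test utilities.

```python
import math


def weighted_linear_fit(xs, ys, ws):
    """Weighted least squares for y = a + b * x; returns (intercept, slope)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope


def oversample(xs, ys, ws):
    """Repeat each row w times (integer weights); every copy gets weight 1."""
    out_x, out_y = [], []
    for x, y, w in zip(xs, ys, ws):
        out_x.extend([x] * w)
        out_y.extend([y] * w)
    return out_x, out_y, [1] * len(out_x)


def close(p, q, tol=1e-9):
    """Element-wise comparison of two coefficient tuples."""
    return all(math.isclose(a, b, rel_tol=tol, abs_tol=tol) for a, b in zip(p, q))


# Illustrative data only (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.9, 5.2, 6.8]
ws = [1, 3, 2, 4]

weighted = weighted_linear_fit(xs, ys, ws)
# Assumed rule 1: integer weights are equivalent to oversampling.
duplicated = weighted_linear_fit(*oversample(xs, ys, ws))
# Assumed rule 2: scaling all weights by a constant does not change the fit.
rescaled = weighted_linear_fit(xs, ys, [7.5 * w for w in ws])
# Sanity check input: the weights should actually matter for this data.
unweighted = weighted_linear_fit(xs, ys, [1] * len(xs))

assert close(weighted, duplicated)
assert close(weighted, rescaled)
assert not close(weighted, unweighted)
```

    Checks like these are algorithm-agnostic, so they could be shared across
    estimators, while comparisons against R's output would remain
    algorithm-specific.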
    
    I would appreciate others' thoughts on this issue.


