Hi Dale.
Please keep all discussions on the mailing list, as not everybody
has the time to reply individually.
The default should be class_weight=1 for each class, so dropping half
of the samples in one class should be equivalent to reducing the weight
for that class to 0.5.
This only works for removing duplicate data points (dropping points will
clearly lose information otherwise).
Have a look here:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/tests/test_class_weight.py#L41
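As a rough sketch of the check described above (not the test from the linked file): build a data set in which every point of class 1 appears twice, fit once on the full data with class_weight 0.5 for that class, and fit once on the de-duplicated data with the default weights. RidgeClassifier, the synthetic data, and all variable names are assumptions chosen for illustration; it stands in for whatever estimator is being tested because its fit is deterministic.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.RandomState(0)
X0 = rng.randn(20, 2) - 1.0  # class 0: every point appears once
X1 = rng.randn(10, 2) + 1.0  # class 1: the unique points

# Full data: each class-1 point is present twice (exact duplicates).
X_full = np.vstack([X0, X1, X1])
y_full = np.array([0] * 20 + [1] * 20)

# Reduced data: the duplicated half of class 1 is dropped.
X_half = np.vstack([X0, X1])
y_half = np.array([0] * 20 + [1] * 10)

# Weight 0.5 on the duplicated class vs. default weight 1 on the reduced data.
clf_full = RidgeClassifier(class_weight={0: 1.0, 1: 0.5}).fit(X_full, y_full)
clf_half = RidgeClassifier().fit(X_half, y_half)

# Largest coefficient difference between the two fits.
coef_gap = np.abs(clf_full.coef_ - clf_half.coef_).max()
intercept_gap = np.abs(clf_full.intercept_ - clf_half.intercept_).max()
print(coef_gap, intercept_gap)
```

Both gaps should be at floating-point noise level. If the dropped points were not exact duplicates, the two fits would generally differ.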
Cheers,
Andy
On 07/24/2015 12:57 PM, Dale Smith wrote:
Andy,
I’ve thought a bit about your suggestion for testing. I’m not sure I
fully understand the mechanics or process. Forgive my lack of experience.
Suppose I have a data set with equal weights for a binary
classification. Dropping half the samples for one class would change
the weights to 0.25/0.75. Is this what you are thinking?
I suppose I could retrain the model with the 0.25/0.75 weights. I
suspect I should get the same predictions (assuming I use the same
seed for the random number generator).
Am I on the right track here?
*Dale Smith, Ph.D.*
Data Scientist
<http://nexidia.com/>
d. 404.495.7220 x 4008 f. 404.795.7221
Nexidia Corporate | 3565 Piedmont Road, Building Two, Suite 400 |
Atlanta, GA 30305
From: Andy [mailto:t3k...@gmail.com]
Sent: Thursday, July 23, 2015 8:27 AM
To: scikit-learn-general@lists.sourceforge.net
Subject: Re: [Scikit-learn-general] Added sample_weight to RFECV.fit but not sure how to test the change
I think my reply to this got swallowed by the SourceForge outage:
The main thing that you should test is whether the added behavior is
correct.
To do that, you should confirm that changing a sample's weight is
equivalent to duplicating or dropping that sample.
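A minimal sketch of that kind of equivalence test, under some assumptions: RidgeClassifier is used only as a deterministic stand-in that accepts sample_weight in fit, and the data and names are made up for the example. The same pattern (fit with integer weights, fit on physically repeated rows, compare) is what one would adapt to the estimator whose sample_weight support was added.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.RandomState(0)
X = rng.randn(40, 3)
y = (X[:, 0] + 0.5 * rng.randn(40) > 0).astype(int)

# Integer weights: sample i should count as weights[i] copies.
weights = rng.randint(1, 4, size=40)

# Fit A: pass the weights through fit(..., sample_weight=...).
clf_weighted = RidgeClassifier().fit(X, y, sample_weight=weights)

# Fit B: repeat each row weights[i] times and fit without weights.
X_rep = np.repeat(X, weights, axis=0)
y_rep = np.repeat(y, weights)
clf_repeated = RidgeClassifier().fit(X_rep, y_rep)

# The two models should agree up to floating-point error.
coef_gap = np.abs(clf_weighted.coef_ - clf_repeated.coef_).max()
same_predictions = np.array_equal(
    clf_weighted.predict(X), clf_repeated.predict(X)
)
print(coef_gap, same_predictions)
```

For estimators with randomness (for example in cross-validation splits), the comparison only makes sense with fixed seeds and splits, and an approximate comparison may be more appropriate than an exact one.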
On 07/22/2015 01:34 PM, Dale Smith wrote:
I’ve added sample_weight as an optional parameter to RFECV.fit in
order to handle highly unbalanced cases. I can build the package
locally. However, looking at the tests directory does not give me
much confidence that I can write validation and regression tests.
I looked particularly at test_metaestimators.py.
I also reviewed the Contributing section of the documentation, the
wiki, and searched the mailing list archive, but didn’t find
anything relevant. Are there any other sources I should review?
*Dale Smith, Ph.D.*
Data Scientist
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
<mailto:Scikit-learn-general@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general