Also, you should think about what your performance measure should be,
and whether it should be accuracy (usually it should not).
AUC is often a good choice, but in the end you still need to choose an
operating point.
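To make that concrete, here is a minimal sketch on synthetic data. The dataset, the choice of LogisticRegression, and the "maximize TPR minus FPR" rule for picking the threshold are all assumptions for illustration, not a recommendation for your problem.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic two-class problem, roughly 90% / 10% (illustrative assumption).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]  # estimated P(minority class)

# Always predicting the majority class already gets high accuracy,
# which is why accuracy alone says little here.
baseline_acc = accuracy_score(y_test, np.zeros_like(y_test))

# AUC summarizes the ranking quality over all thresholds.
auc = roc_auc_score(y_test, scores)

# Choosing an operating point: here, the threshold that maximizes
# TPR - FPR. This rule is just one possible choice.
fpr, tpr, thresholds = roc_curve(y_test, scores)
best = np.argmax(tpr - fpr)
threshold = thresholds[best]
y_pred = (scores >= threshold).astype(int)

print("majority-class baseline accuracy:", baseline_acc)
print("ROC AUC:", auc)
print("chosen threshold:", threshold, "TPR:", tpr[best], "FPR:", fpr[best])
```

In practice the operating point should come from the relative cost of false positives and false negatives in your application, not from a generic rule.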
On 06/23/2015 10:58 AM, Trevor Stephens wrote:
Many of the scikit-learn classifiers are equipped with a parameter
`class_weight` that can be helpful in situations such as this.
Depending on whether you are on the development branch or a public
release, the preset "balanced" (development branch) or "auto" (public
release) will re-weight samples by their inverse class frequencies.
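As a rough illustration of what the preset computes, the sketch below uses the 1000-versus-100 split from the original question. The arrays and the LogisticRegression estimator are made up for the example; the "balanced" heuristic is n_samples / (n_classes * count_per_class).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# 1000 examples of class "A" and 100 of class "B" (illustrative data).
y = np.array(["A"] * 1000 + ["B"] * 100)
classes = np.array(["A", "B"])

# "balanced" weights: n_samples / (n_classes * count_per_class)
weights = compute_class_weight("balanced", classes=classes, y=y)
print(dict(zip(classes, weights)))  # A: 1100 / (2 * 1000), B: 1100 / (2 * 100)

# Passing the preset to an estimator applies the same weights during fit.
rng = np.random.RandomState(0)
X = rng.normal(size=(len(y), 3))
X[y == "B"] += 1.0  # shift the minority class so there is something to learn
clf = LogisticRegression(class_weight="balanced").fit(X, y)
```

The minority class ends up with ten times the weight of the majority class, matching the 10:1 imbalance.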
You may also do a grid search to try to find a "better" set of class
weights, something like this perhaps:
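from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in 2015-era releases
# A and B stand for your two class labels; SomeClassifier is a placeholder
# for any estimator that accepts class_weight.
parameters = {'class_weight': [{A: i + 1., B: 10. - i} for i in range(10)]}
clf = SomeClassifier()
grid = GridSearchCV(clf, parameters)
grid.fit(X, y)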
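For a version that runs as-is, here is a sketch of the same weight grid on synthetic data. The dataset, the LogisticRegression estimator, and scoring="f1" are assumptions for illustration. The scoring argument matters because GridSearchCV otherwise falls back on the classifier's default score, which is accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic two-class problem, roughly 90% / 10% (illustrative assumption).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

# Same weight grid as above, with labels 0 (majority) and 1 (minority).
parameters = {"class_weight": [{0: i + 1.0, 1: 10.0 - i} for i in range(10)]}

# Score candidates with F1 on the minority class instead of accuracy.
grid = GridSearchCV(LogisticRegression(max_iter=1000), parameters,
                    scoring="f1", cv=5)
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```

Which scorer to use depends on the performance measure you actually care about; "roc_auc" or a custom scorer would slot in the same way.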
- Trev
On Tue, Jun 23, 2015 at 7:25 AM, Neal Becker <ndbeck...@gmail.com> wrote:
Any suggestions?
Neal Becker wrote:
> I am interested in supervised learning for classification where I have
> multiple classes, but training data is highly unequal. There may be 1000s
> of training examples for class A, but maybe 100s for class B. What are
> suggested algorithms/approaches?
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general