Re: [Scikit-learn-general] SVC and unbalanced dataset

Andy Wed, 10 Sep 2014 07:47:02 -0700

On 09/10/2014 09:07 AM, Gael Varoquaux wrote:
> How are you measuring your errors? If you are using the zero-one loss
> (accuracy score), you are taking into account only the binary decisions,
> and not a possible decision function. I have found that in the situation
> of unbalanced classes, it could be useful to threshold the decision
> function at a different value than 0, to maximize the left-out accuracy
> score.
>
> Of course, that's an extra hyper-parameter, and I don't have a great way
> to set it.
>
> G
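
For reference, the thresholding idea quoted above could be sketched roughly as follows. This is only an illustration, not code from the thread: the synthetic data, the linear kernel, the single train/validation split and the names `candidates` and `best_threshold` are all assumptions, and the imports assume a recent scikit-learn where `train_test_split` lives in `sklearn.model_selection`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic unbalanced problem (about 90% / 10%); purely illustrative data.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, stratify=y, random_state=0)

clf = SVC(kernel="linear").fit(X_train, y_train)
scores = clf.decision_function(X_val)  # signed distance to the hyperplane

# Candidate thresholds: the default 0 plus every observed score.
candidates = np.concatenate(([0.0], np.unique(scores)))
accuracies = [accuracy_score(y_val, (scores > t).astype(int))
              for t in candidates]

best = int(np.argmax(accuracies))
best_threshold = candidates[best]
default_accuracy = accuracies[0]   # accuracy when thresholding at 0
best_accuracy = accuracies[best]   # accuracy at the selected threshold

print("threshold 0: %.3f, threshold %.3f: %.3f"
      % (default_accuracy, best_threshold, best_accuracy))
```

Since the threshold is picked on the same held-out data it is scored on, the accuracy it reports is optimistic; in practice it would need its own validation split, like any other hyper-parameter.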


I usually use AUC or AP (average precision) for this setting. Both are
computed from the ranking given by the decision function, so you don't
have to set a threshold.
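
A minimal sketch of what that looks like, assuming a recent scikit-learn and made-up synthetic data (the dataset, the split and the SVC parameters are placeholders, not anything from this thread):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic unbalanced problem (about 90% / 10%); purely illustrative data.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)

# Both metrics take the continuous decision values, not the 0/1 predictions,
# so no threshold is involved.
scores = clf.decision_function(X_test)
auc = roc_auc_score(y_test, scores)
ap = average_precision_score(y_test, scores)

print("AUC: %.3f  AP: %.3f" % (auc, ap))
```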

I would try to pick a class weight that optimizes AUC. The "auto" class
weight is just a heuristic that often works but is in no way guaranteed
to give good results.
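
One way to do that search, as a rough sketch only: the grid of weights, the synthetic data and the 5-fold setup below are arbitrary assumptions, and the imports assume a recent scikit-learn, where `GridSearchCV` lives in `sklearn.model_selection` and the closest equivalent of the "auto" heuristic is spelled `class_weight="balanced"`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic unbalanced problem (about 90% / 10%); purely illustrative data.
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# Candidate class weights: no reweighting, the built-in heuristic, and a few
# hand-picked weights for the minority class (label 1). The values are
# arbitrary starting points.
param_grid = {
    "class_weight": [None, "balanced"]
                    + [{0: 1, 1: w} for w in (2, 5, 10, 20)],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      scoring="roc_auc", cv=cv)
search.fit(X, y)

print(search.best_params_, "mean CV AUC: %.3f" % search.best_score_)
```

Passing `scoring="average_precision"` instead would select on AP with the same code.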

