Hi All,

Currently, scikit-learn uses a randomly generated codebook for
error-correcting output codes (line 468 in sklearn/multiclass.py). However,
there are some more interesting codebook design strategies we could offer
in sklearn. In particular, I would like to start by trying:

1. BCH codes, as described in section 2.3.4 of
http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume2/dietterich95a.pdf

2. Discriminant ECOC, as presented in
O. Pujol, P. Radeva, and J. Vitrià. Discriminant ECOC: A heuristic
method for application dependent design of error correcting output codes
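
To make the motivation concrete, here is a rough sketch (not the actual
sklearn code) of a random binary codebook, together with a check of how
well its rows are separated. The function names, the 0.5 threshold, and
the {0, 1} encoding are my assumptions for illustration only. The minimum
pairwise Hamming distance between rows is what a designed code such as BCH
tries to make large; a random codebook gives no guarantee on it.

```python
import numpy as np


def random_codebook(n_classes, code_size, seed=0):
    """Draw a random {0, 1} codebook with one row (code word) per class."""
    rng = np.random.RandomState(seed)
    # Threshold uniform samples at 0.5 to get independent random bits.
    return (rng.random_sample((n_classes, code_size)) > 0.5).astype(int)


def min_row_hamming_distance(codebook):
    """Smallest number of differing bits between any two code words."""
    codebook = np.asarray(codebook)
    # Pairwise comparison of all rows: shape (n_classes, n_classes, code_size).
    differs = codebook[:, None, :] != codebook[None, :, :]
    distances = differs.sum(axis=2)
    # Only look at distinct pairs (strict upper triangle).
    upper = np.triu_indices(len(codebook), k=1)
    return int(distances[upper].min())


if __name__ == "__main__":
    book = random_codebook(n_classes=5, code_size=15)
    print(book)
    print("minimum row separation:", min_row_hamming_distance(book))
```

A minimum separation of d lets the decoder tolerate up to (d - 1) // 2
wrong bit predictions, so a value of 0 or 1 means the code corrects nothing.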

Also, for now we use only Euclidean distance to find the nearest class as
represented in the codebook. We could add other metrics, for example
Hamming distance.
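
As a minimal sketch of what I mean (plain NumPy, not the sklearn
implementation), decoding with either metric could look like the code
below. The function name `decode`, the `metric` argument, and the 0.5
threshold used to turn scores into bits are assumptions for illustration;
the sketch also assumes a {0, 1} codebook.

```python
import numpy as np


def decode(scores, codebook, metric="euclidean"):
    """Return the index of the nearest code word for each sample.

    scores   : (n_samples, code_size) outputs of the binary estimators
    codebook : (n_classes, code_size) array of {0, 1} code words
    """
    scores = np.asarray(scores, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    if metric == "euclidean":
        # Compare the raw scores against each code word.
        diff = scores[:, None, :] - codebook[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=2))
    elif metric == "hamming":
        # Threshold the scores to bits first (assumed cut-off: 0.5),
        # then count the positions that disagree with each code word.
        bits = (scores > 0.5).astype(float)
        dist = (bits[:, None, :] != codebook[None, :, :]).sum(axis=2)
    else:
        raise ValueError("unknown metric: %r" % metric)
    # Ties are broken in favour of the lowest class index.
    return dist.argmin(axis=1)


if __name__ == "__main__":
    codebook = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 0, 0]]
    scores = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.3, 0.1]]
    print(decode(scores, codebook, metric="euclidean"))
    print(decode(scores, codebook, metric="hamming"))
```

Note that Hamming decoding discards the confidence of each binary
estimator, so the two metrics can disagree when scores are close to the
threshold.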

What do you think about these enhancements?

Thanks,
Karol
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general