Hi Karol,

I would run the benchmark on commonly used datasets such as MNIST, USPS,
News20, Covertype, Sector, etc.

http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

Note that ECOC can potentially improve accuracy on binary classification
too, so I would do benchmarks on binary classification datasets as well.
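
A minimal sketch of such a benchmark, to make the idea concrete. It is only
an illustration under a few assumptions: it uses the small digits dataset
bundled with scikit-learn as a stand-in for the datasets above (files in
LIBSVM format can be read with sklearn.datasets.load_svmlight_file instead),
it assumes a scikit-learn version where cross_val_score lives in
sklearn.model_selection, and the choice of LinearSVC, the code sizes and
3-fold cross-validation are arbitrary.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier, OutputCodeClassifier
from sklearn.svm import LinearSVC

# Stand-in dataset (assumption): 10-class digits bundled with scikit-learn.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values range from 0 to 16

base = LinearSVC(random_state=0)

# Compare a one-vs-rest baseline against ECOC with two codebook sizes.
candidates = {
    "one-vs-rest": OneVsRestClassifier(base),
    "ecoc code_size=1.5": OutputCodeClassifier(base, code_size=1.5, random_state=0),
    "ecoc code_size=4": OutputCodeClassifier(base, code_size=4, random_state=0),
}

results = {}
for name, clf in candidates.items():
    # Mean accuracy over 3 cross-validation folds.
    results[name] = cross_val_score(clf, X, y, cv=3).mean()
    print(f"{name}: {results[name]:.3f}")
```

Any new codebook strategy could be added to the same dictionary and compared
against the random codebook on equal terms.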

Thanks for your interest in improving our ECOC classifier!

Mathieu

On Tue, Aug 13, 2013 at 1:25 AM, Karol Pysniak <kpysn...@gmail.com> wrote:

> Hi Mathieu,
>
> It looks interesting. Do you have in mind any specific real data we should
> use to benchmark the methods?
>
> Thanks,
> Karol
>
>
> 2013/8/12 Mathieu Blondel <math...@mblondel.org>
>
>> Hi Karol,
>>
>> Those would indeed be nice additions. However, we should do benchmarks on
>> real data and focus on the most effective methods.
>>
>> I found this paper / software which could serve as a reference:
>> http://jmlr.org/papers/v11/escalera10a.html
>>
>> Mathieu
>>
>> On Mon, Aug 12, 2013 at 1:27 PM, Karol Pysniak <kpysn...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> Currently, scikit-learn uses a randomly generated codebook for
>>> error-correcting output codes (line 468 in sklearn/multiclass.py). However,
>>> there are some interesting strategies we could use in sklearn. In
>>> particular, I would like to start by trying:
>>>
>>> 1. BCH Codes as mentioned in section 2.3.4
>>> http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume2/dietterich95a.pdf
>>>
>>> 2. Discriminant ECOC as presented in
>>> O. Pujol, P. Radeva, and J. Vitrià. Discriminant ECOC: A heuristic
>>> method for application dependent design of error correcting output codes
>>>
>>> Also, for now we use only Euclidean distance to find the nearest class
>>> as represented in the codebook. We could add others, for example Hamming
>>> distance.
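
A small stand-alone sketch of the two decoding rules mentioned above, showing
a case where they pick different classes. Everything in it is hypothetical
and for illustration only: the three-class codebook, the classifier scores
and the function names are made up, and none of it mirrors scikit-learn's
internals.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length code words differ."""
    return sum(x != y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def decode(outputs, codebook, distance):
    """Return the class whose code word is nearest to `outputs`."""
    return min(codebook, key=lambda label: distance(outputs, codebook[label]))


# Hypothetical codebook: one 5-bit code word per class.
codebook = {
    "a": (0, 0, 0, 0, 0),
    "b": (1, 1, 1, 0, 0),
    "c": (0, 0, 1, 1, 1),
}

# Hypothetical real-valued outputs of the five binary classifiers.
scores = (0.55, 0.55, 0.05, 0.1, 0.1)
# Hamming decoding needs hard bits, so threshold the scores at 0.5.
bits = tuple(int(s > 0.5) for s in scores)

by_euclidean = decode(scores, codebook, euclidean_distance)
by_hamming = decode(bits, codebook, hamming_distance)
print(by_euclidean, by_hamming)  # prints "a b"
```

Thresholding discards how confident each binary classifier was, which is why
the two rules can disagree on borderline outputs like these.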
>>>
>>> What do you think about these enhancements?
>>>
>>> Thanks,
>>> Karol
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Get 100% visibility into Java/.NET code with AppDynamics Lite!
>>> It's a free troubleshooting tool designed for production.
>>> Get down to code-level detail for bottlenecks, with <2% overhead.
>>> Download for free and get started troubleshooting in minutes.
>>>
>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>>
>>
>>
>