Hi All,
I added a PR implementing the DECOC method proposed in:
O. Pujol, P. Radeva, and J. Vitrià. "Discriminant ECOC: A heuristic method
for application dependent design of error correcting output codes"
In general, it seems to improve accuracy, but it is also more time-consuming.
However, it could be the first step toward extending how we build
error-correcting output codebooks, as is done in, for example,
http://jmlr.org/papers/v11/escalera10a.html
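
For readers unfamiliar with the method, here is a minimal sketch of the
tree-structured codebook that DECOC builds. It is not the code in the PR.
As I understand the paper, each split is chosen by a floating search that
maximizes mutual information; this sketch swaps in a much simpler stand-in
(an exhaustive search for the two groups whose centroids are farthest
apart), so it only works for a handful of classes. The names `best_split`
and `decoc_codebook` are hypothetical.

```python
from itertools import combinations


def centroid(points):
    """Mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_split(classes, centroids):
    """Stand-in criterion (an assumption, not the paper's): pick the
    bipartition whose two group centroids are farthest apart."""
    classes = sorted(classes)
    first, rest = classes[0], classes[1:]
    best = None
    # Keep `first` on the left so each bipartition is visited once;
    # k < len(rest) guarantees the right group is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = [first, *extra]
            right = [c for c in rest if c not in extra]
            score = sq_dist(
                centroid([centroids[c] for c in left]),
                centroid([centroids[c] for c in right]),
            )
            if best is None or score > best[0]:
                best = (score, left, right)
    return best[1], best[2]


def decoc_codebook(centroids):
    """Build a ternary codebook (+1 / -1 / 0) by recursively splitting
    the set of classes; rows are classes, columns are binary problems."""
    n_classes = len(centroids)
    columns = []
    stack = [list(range(n_classes))]
    while stack:
        group = stack.pop()
        if len(group) < 2:
            continue  # a single class needs no further split
        left, right = best_split(group, centroids)
        column = [0] * n_classes  # 0 = class ignored by this problem
        for c in left:
            column[c] = 1
        for c in right:
            column[c] = -1
        columns.append(column)
        stack.extend([left, right])
    return [list(row) for row in zip(*columns)]


# Toy class centroids: two well-separated pairs of classes.
toy_centroids = [[0, 0], [0, 1], [10, 0], [10, 1]]
for row in decoc_codebook(toy_centroids):
    print(row)
```

With N classes this produces N - 1 columns, one per internal node of the
tree, which is much shorter than a typical random code. The extra cost
comes from searching for a good split at every node.
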
Let me know what you think.
Thanks,
Karol
On Aug 12, 2013, at 11:25 PM, Karol Pysniak <kpysn...@gmail.com> wrote:
> Hi Mathieu,
>
> Thanks for the suggestions, I'll test the methods and get back with the
> results.
>
> Thanks,
> Karol
>
> On Aug 12, 2013, at 7:55 PM, Mathieu Blondel <math...@mblondel.org> wrote:
>
>> Hi Karol,
>>
>> I would do the benchmark on commonly-used datasets such as MNIST, USPS,
>> News20, Covertype, Sector, etc.
>>
>> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
>>
>> Note that ECOC can potentially improve accuracy on binary classification
>> too, so I would do benchmarks on binary classification datasets as well.
>>
>> Thanks for your interest in improving our ECOC classifier!
>>
>> Mathieu
>>
>> On Tue, Aug 13, 2013 at 1:25 AM, Karol Pysniak <kpysn...@gmail.com> wrote:
>> Hi Mathieu,
>>
>> It looks interesting. Do you have in mind any specific real data we should
>> use to benchmark the methods?
>>
>> Thanks,
>> Karol
>>
>>
>> 2013/8/12 Mathieu Blondel <math...@mblondel.org>
>> Hi Karol,
>>
>> Those would indeed be nice additions. However, we should do benchmarks on
>> real data and focus on the most effective methods.
>>
>> I found this paper / software which could serve as a reference:
>> http://jmlr.org/papers/v11/escalera10a.html
>>
>> Mathieu
>>
>> On Mon, Aug 12, 2013 at 1:27 PM, Karol Pysniak <kpysn...@gmail.com> wrote:
>> Hi All,
>>
>> Currently, scikit-learn uses a randomly generated codebook for
>> error-correcting output codes (line 468 in sklearn/multiclass.py). However,
>> there are some interesting strategies we could use in sklearn. In
>> particular, I would like to start by trying:
>>
>> 1. BCH codes, as mentioned in section 2.3.4 of
>> http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume2/dietterich95a.pdf
>>
>> 2. Discriminant ECOC, as presented in
>> O. Pujol, P. Radeva, and J. Vitrià. Discriminant ECOC: A heuristic method
>> for application dependent design of error correcting output codes
>>
>> Also, for now we use only the Euclidean distance to find the nearest class
>> in the codebook. We could add other metrics, for example the Hamming
>> distance.
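
A minimal sketch of the two decoding rules discussed above, written in
plain Python rather than against the scikit-learn code. The function names
and the toy codebook are hypothetical, and the sketch assumes codes in
{-1, +1} and real-valued scores from the binary classifiers.

```python
import random


def random_codebook(n_classes, code_size, seed=0):
    """Random {-1, +1} codebook: one row per class, one column per
    binary problem."""
    rng = random.Random(seed)
    return [
        [rng.choice((-1, 1)) for _ in range(code_size)]
        for _ in range(n_classes)
    ]


def euclidean_decode(codebook, scores):
    """Return the class whose code word is closest to the raw scores."""
    return min(
        range(len(codebook)),
        key=lambda c: sum((b - s) ** 2 for b, s in zip(codebook[c], scores)),
    )


def hamming_decode(codebook, scores):
    """Threshold the scores to signs, then count mismatched bits."""
    signs = [1 if s >= 0 else -1 for s in scores]
    return min(
        range(len(codebook)),
        key=lambda c: sum(b != s for b, s in zip(codebook[c], signs)),
    )


# Two classes, three binary problems; the third classifier is very
# confident while the first two are barely positive.
toy_codebook = [[1, 1, 1], [-1, -1, -1]]
toy_scores = [0.1, 0.1, -3.0]

print(euclidean_decode(toy_codebook, toy_scores))  # 1
print(hamming_decode(toy_codebook, toy_scores))    # 0
```

The two rules disagree here because the Euclidean distance weights each
classifier by its confidence, while the Hamming distance counts every bit
equally after thresholding.
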
>>
>> What do you think about these enhancements?
>>
>> Thanks,
>> Karol
>>
>> ------------------------------------------------------------------------------
>> Get 100% visibility into Java/.NET code with AppDynamics Lite!
>> It's a free troubleshooting tool designed for production.
>> Get down to code-level detail for bottlenecks, with <2% overhead.
>> Download for free and get started troubleshooting in minutes.
>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>