The problem I mentioned in my previous mail about classifying nominals was
solved using Lars's answer in an SE
post<http://stackoverflow.com/questions/7698713/fuzzy-c-means-categorical-data/7698936#7698936>,
that is, to use a bag-of-nominals or one-hot representation.
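
For anyone finding this thread later, here is a minimal sketch of what the
one-hot approach could look like for the stacking case. It is not the code I
actually ran: the Z and y values are made-up toy data in the shape described
in the quoted mail, the choice of LogisticRegression as the level 1 model is
an arbitrary assumption, and it assumes a reasonably recent scikit-learn in
which OneHotEncoder accepts handle_unknown="ignore".

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# Toy level 0 output (made-up values): each row holds the class labels
# predicted by m = 7 base classifiers for one training sample.
Z = np.array([
    [1, 1, 2, 1, 1, 1, 1],
    [3, 3, 3, 6, 3, 3, 3],
    [3, 9, 3, 2, 3, 3, 3],
    [6, 6, 6, 6, 1, 6, 6],
])
y = np.array([1, 3, 3, 6])

# One-hot encode every column of Z, so that each (classifier, label) pair
# becomes its own 0/1 feature and no ordering between labels is implied.
# handle_unknown="ignore" encodes labels unseen during fit as all zeros.
encoder = OneHotEncoder(handle_unknown="ignore")
Z_onehot = encoder.fit_transform(Z)

# Level 1 (stacking) classifier trained on the encoded predictions.
# LogisticRegression is an assumed choice; any classifier could be used.
meta = LogisticRegression(max_iter=1000)
meta.fit(Z_onehot, y)

# Level 0 predictions for a new test case, encoded with the same encoder.
z_new = np.array([[1, 6, 6, 6, 6, 6, 6]])
pred = meta.predict(encoder.transform(z_new))
print(pred)
```

The point of the encoding is that a linear or distance-based level 1 model no
longer treats label 9 as "three times" label 3; it only sees which classifier
voted for which category.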

On 12 November 2011 13:49, SK Sn <[email protected]> wrote:

> Hi all,
>
> I am looking into how to combine classifiers using scikit-learn.
> I think, for general purposes, it could be useful to have functions like
> stacking and voting in scikit-learn. Is there any plan to develop
> ensemble methods?
>
> For now, I am writing my own snippet for stacking. The first phase would be
> stacking simply on predictions from different models and the next would be
> stacking on probabilities.
>
> However, while dealing with the predictions, I ran into a problem with
> classifying nominals:
> In detail, at level 0, several (say m) classifiers are used, and the m
> predictions for each sample are gathered to form a Z matrix.
> With m=7, Z could look like:
> [ [1 1 2 1 1 1 1]
>   [3 3 3 6 3 3 3]
>  ....
>   [3 9 3 2 3 3 3]
> ]
> y in this case, could be:
> [1  3  ...  3 ]
>
> So, at level 1 (the stacking level), a new classifier's task is to predict
> based on the results from level 0, e.g., for a test case, level 0 generates:
> [1 6 6 6 6 6]
> we expect the level 1 classifier to give 6 as its prediction.
> Because in stacking, level 1 is a machine learning classifier rather than
> simply selecting the mode, one expects stacking to outperform voting in general.
>
> The problem is that all the numbers in Z are predictions of categories;
> these numbers are nominal, without any real quantitative meaning.
> I directly applied classification methods to (Z,y), and the results are terrible,
> except for the tree classifier.
> I also tried regression with rounding; the results are relatively better
> than classification, but not as high as level 0. Still, regression on
> nominal numbers does not seem to make much sense to me.
> I thought about normalization and scaling in preprocessing, but I am not sure if
> they are relevant here.
>
> I wonder what is the right way to classify based on nominals?
>
> Thanks a lot!
>
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
