[ 
https://issues.apache.org/jira/browse/IGNITE-11655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Dmitriev updated IGNITE-11655:
------------------------------------
    Affects Version/s: 2.7

> ML: OneHotEncoder returns more columns than expected
> ----------------------------------------------------
>
>                 Key: IGNITE-11655
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11655
>             Project: Ignite
>          Issue Type: Bug
>          Components: ml
>    Affects Versions: 2.7
>            Reporter: Anton Dmitriev
>            Priority: Major
>
> OneHotEncoder returns more columns than expected: a feature with two distinct 
> values, which could be encoded using two columns, is encoded using three. The 
> following example demonstrates the problem:
>
> Map<Integer, Object[]> training = new HashMap<>();
> training.put(0, new Object[]{42.0});
> training.put(1, new Object[]{43.0});
> training.put(2, new Object[]{42.0});
> EncoderTrainer<Integer, Object[]> trainer = new EncoderTrainer<Integer, Object[]>()
>     .withEncoderType(EncoderType.ONE_HOT_ENCODER)
>     .withEncodedFeature(0);
> IgniteBiFunction<Integer, Object[], Vector> processor = trainer.fit(training, 1, (k, v) -> v);
> Vector res = processor.apply(1, new Object[]{42.0});
> System.out.println(Arrays.toString(res.asArray()));
>
> Output: [0.0, 1.0, 0.0]
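
For comparison, the expected width can be illustrated with a minimal stand-alone sketch in plain Java. It is not Ignite's implementation and makes no claim about the cause of the bug. The class OneHotSketch and its methods fit and encode are hypothetical names introduced only for this illustration, and assigning column indices in first-seen order is an assumption. The point is only that two distinct values yield a vector of length two.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone sketch; not Ignite's EncoderTrainer. */
final class OneHotSketch {
    /** Assigns each distinct value a column index in first-seen order (an assumption). */
    static Map<Object, Integer> fit(List<?> values) {
        Map<Object, Integer> index = new LinkedHashMap<>();
        for (Object v : values)
            index.putIfAbsent(v, index.size()); // size() is the next free column
        return index;
    }

    /** Returns a vector with one column per distinct value and a single 1.0. */
    static double[] encode(Map<Object, Integer> index, Object value) {
        Integer col = index.get(value);
        if (col == null)
            throw new IllegalArgumentException("Unknown category: " + value);
        double[] res = new double[index.size()];
        res[col] = 1.0;
        return res;
    }

    public static void main(String[] args) {
        // Same feature values as in the report: 42.0, 43.0, 42.0.
        Map<Object, Integer> index = fit(Arrays.asList(42.0, 43.0, 42.0));
        System.out.println(Arrays.toString(encode(index, 42.0))); // prints [1.0, 0.0]
    }
}
```

Under these assumptions the value 42.0 maps to [1.0, 0.0] and 43.0 maps to [0.0, 1.0], which is the two-column result the report expects instead of [0.0, 1.0, 0.0].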



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
