[ML] [New Feature] Handle different labels in training data and handle unknown labels in test or updated training data correctly

Alexey Zinoviev Thu, 13 Sep 2018 08:50:54 -0700

Hi, Igniters

Welcome to the discussion about label handling during ML training.


The problem is that all binary classification algorithms expect datasets
labeled with 0/1 and predict 0/1 labels, with no special mapping to or from
other label values.

Also, the algorithms don't handle unknown labels during the update and
test phases.
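
To make the mapping problem concrete, here is a minimal sketch of what a label mapping for a binary trainer could look like. The class name BinaryLabelMapper and all of its methods are hypothetical and are not part of the Ignite ML API; the sketch also assumes labels are doubles. It maps the two distinct labels seen in training to the internal 0/1 values and reports an unknown label as an empty result instead of failing silently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical helper for illustration only; not an Ignite ML class. */
final class BinaryLabelMapper {
    /** External label -> internal 0/1 index. */
    private final Map<Double, Integer> toInternal = new HashMap<>();

    /** Internal 0/1 index -> external label. */
    private final double[] toExternal = new double[2];

    /** Registers the labels seen in training, in order of first appearance. */
    void fit(double[] labels) {
        for (double lb : labels) {
            if (toInternal.containsKey(lb))
                continue;

            if (toInternal.size() == 2)
                throw new IllegalArgumentException("More than two distinct labels: " + lb);

            int idx = toInternal.size();
            toExternal[idx] = lb;
            toInternal.put(lb, idx);
        }
    }

    /** Returns 0 or 1 for a known label, or an empty value for an unknown one. */
    OptionalInt encode(double label) {
        Integer idx = toInternal.get(label);

        return idx == null ? OptionalInt.empty() : OptionalInt.of(idx);
    }

    /** Maps an internal 0/1 prediction back to the original label. */
    double decode(int internal) {
        if (internal < 0 || internal >= toInternal.size())
            throw new IllegalArgumentException("Unknown internal label: " + internal);

        return toExternal[internal];
    }
}
```

With something like this, the caller decides what to do with an unknown label (skip the row, raise an error, or extend the mapping), which is exactly the policy question raised in this thread.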

The umbrella ticket has been created here:

https://issues.apache.org/jira/browse/IGNITE-9587

I'd also like to invite you to discuss:

1) the list of trainers to upgrade with this feature

2) how to handle unknown labels during the prediction/test phase

3) how to handle unknown labels during the model update phase (new data
arrives for the next training run, and its results should be merged with
the results of the previous training)

4) where to store metadata about labels during the training phases

I'd be glad to hear your ideas.
