[
https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Zinoviev updated IGNITE-12396:
-------------------------------------
Affects Version/s: (was: 3.0)
2.8
> [ML] Random Forest generates NaN for a part of models on small datasets
> -----------------------------------------------------------------------
>
> Key: IGNITE-12396
> URL: https://issues.apache.org/jira/browse/IGNITE-12396
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Affects Versions: 2.8
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 3.0
>
>
> @Override public Double predict(Vector features) {
>     double[] predictions = new double[models.size()];
>     for (int i = 0; i < models.size(); i++)
>         predictions[i] = models.get(i).predict(features);
>     return predictionsAggregator.apply(predictions);
> }
>
> predictionsAggregator receives predictions from many models, and some of
> them return null, which is then aggregated along with the rest. First of
> all, handle this in the Aggregator (using a threshold on the number of
> broken models before aggregation). Also, if RandomForest trees return
> Double.NaN, training should fail or report a message afterwards.
>
> I've tested with 100 and 1000 rows, where it fails; it does not fail on
> 10 000 rows.
>
> RF generates a few models that consist of a single LEAF node with an empty
> val (Double.NaN by default).
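
The threshold idea from the description could look roughly like the sketch below. It is only an illustration under stated assumptions: NanTolerantMean, aggregate and maxBrokenFraction are hypothetical names that do not exist in Ignite ML, broken models are assumed to show up as Double.NaN in the predictions array, and a plain mean stands in for whatever predictionsAggregator actually does.

```java
/** Hypothetical helper for illustration; not part of the Ignite ML API. */
final class NanTolerantMean {
    private NanTolerantMean() {
        // Static helper only.
    }

    /**
     * Averages the non-NaN predictions and skips the NaN ones.
     *
     * @param predictions per-model predictions, possibly containing NaN
     * @param maxBrokenFraction largest tolerated share of NaN predictions, in [0, 1]
     * @return mean of the non-NaN predictions
     * @throws IllegalStateException if too many (or all) predictions are NaN
     */
    static double aggregate(double[] predictions, double maxBrokenFraction) {
        if (predictions.length == 0)
            throw new IllegalArgumentException("No predictions to aggregate");

        int broken = 0;
        double sum = 0.0;

        for (double p : predictions) {
            if (Double.isNaN(p))
                broken++; // Model produced no usable value (e.g. an empty leaf).
            else
                sum += p;
        }

        double brokenFraction = (double) broken / predictions.length;

        // Fail loudly instead of letting NaN leak into the aggregated result.
        if (broken == predictions.length || brokenFraction > maxBrokenFraction)
            throw new IllegalStateException(broken + " of " + predictions.length
                + " models returned NaN");

        return sum / (predictions.length - broken);
    }
}
```

With this sketch, NanTolerantMean.aggregate(new double[] {1.0, Double.NaN, 3.0}, 0.5) averages the two valid predictions, while an array that is mostly NaN raises an exception instead of silently returning NaN. For a classification forest the mean would presumably be swapped for a vote over the valid predictions; the threshold check itself would stay the same.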