[ 
https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev updated IGNITE-12396:
-------------------------------------
    Affects Version/s:     (was: 2.8)

> [ML] Random Forest generates NaN for a part of models on small datasets
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-12396
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12396
>             Project: Ignite
>          Issue Type: Bug
>          Components: ml
>            Reporter: Alexey Zinoviev
>            Assignee: Alexey Zinoviev
>            Priority: Major
>             Fix For: 2.10
>
>
> @Override public Double predict(Vector features) {
>     double[] predictions = new double[models.size()];
>     for (int i = 0; i < models.size(); i++)
>         predictions[i] = models.get(i).predict(features);
>     return predictionsAggregator.apply(predictions);
> }
>  
> predictionsAggregator receives predictions from many models, and some of 
> them return an empty value (Double.NaN) that cannot be aggregated. First 
> of all, handle this in the Aggregator (using a threshold on the number of 
> broken models before aggregation). Also, RandomForest trees should not 
> return Double.NaN: training should fail or report a message when that 
> happens.
>  
> I've tested with 100 and 1000 rows, where it fails; it does not fail with 
> 10 000 rows.
>  
> RF generates a few models that consist of a single LEAF node with an empty 
> val (Double.NaN by default)
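
A minimal sketch of the threshold idea from the description, written as
standalone Java with no Ignite dependencies. The class name
NaNTolerantAggregator, the maxBrokenFraction parameter and the choice of a
plain mean over the valid predictions are all assumptions made for
illustration; none of them is taken from the Ignite ML API or from the
eventual fix.

```java
/** Hypothetical helper illustrating the proposal; not part of Ignite ML. */
final class NaNTolerantAggregator {
    private NaNTolerantAggregator() {
        // Static utility, no instances.
    }

    /**
     * Averages the valid predictions and skips NaN ones.
     *
     * @param predictions Per-model predictions, possibly containing NaN.
     * @param maxBrokenFraction Largest tolerated share of NaN predictions, in [0, 1].
     * @return Mean of the non-NaN predictions.
     */
    static double aggregate(double[] predictions, double maxBrokenFraction) {
        if (predictions.length == 0)
            throw new IllegalArgumentException("No predictions to aggregate");

        int broken = 0;
        double sum = 0.0;

        for (double p : predictions) {
            if (Double.isNaN(p))
                broken++; // Model produced no usable value (e.g. an empty leaf).
            else
                sum += p;
        }

        int valid = predictions.length - broken;

        // Fail loudly instead of silently propagating NaN to the caller.
        if (valid == 0 || (double)broken / predictions.length > maxBrokenFraction)
            throw new IllegalStateException(
                broken + " of " + predictions.length + " models returned NaN");

        return sum / valid;
    }
}
```

With a threshold of 0.5, the predictions {1.0, NaN, 3.0} aggregate to 2.0,
while {NaN, NaN, 1.0} is rejected with an IllegalStateException because two
of the three models are broken.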



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
