[jira] [Updated] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets

2020-12-29 Thread Maxim Muzafarov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-12396:
-
Fix Version/s: (was: 2.10)

> [ML] Random Forest generates NaN for a part of models on small datasets
> ---
>
> Key: IGNITE-12396
> URL: https://issues.apache.org/jira/browse/IGNITE-12396
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
>
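> @Override public Double predict(Vector features) {
>     // Collect one prediction per model; a single-LEAF model with an empty val yields Double.NaN.
>     double[] predictions = new double[models.size()];
>     for (int i = 0; i < models.size(); i++)
>         predictions[i] = models.get(i).predict(features);
>     // The aggregator receives these values unfiltered, NaN included.
>     return predictionsAggregator.apply(predictions);
> }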
>  
> predictionsAggregator receives predictions from many models, and some of
> them return an empty value (null/NaN) that still gets aggregated. First of
> all, handle this in the aggregator (apply a threshold on the number of
> broken models before aggregation). Also, RandomForest trees should not
> return Double.NaN: training should fail or report an error instead.
>
> I tested with 100 and 1000 rows, where it fails; it does not fail with
> 10 000 rows.
>
> RF generates a few models consisting of a single LEAF node with an empty
> val (Double.NaN by default).
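
The threshold idea from the description could look roughly like the sketch below. It is only an illustration, not the Ignite implementation: the class name NanTolerantMean and the maxBrokenShare parameter are hypothetical, and java.util.function.Function stands in for whatever aggregator interface the ML module actually uses. The aggregator averages the non-NaN predictions and fails loudly once the share of NaN-producing models exceeds the threshold.

```java
import java.util.function.Function;

/** Hypothetical helper (not an Ignite class): mean of tree outputs that skips NaN values. */
final class NanTolerantMean implements Function<double[], Double> {
    /** Largest tolerated share of NaN predictions, in [0, 1] (assumed parameter). */
    private final double maxBrokenShare;

    NanTolerantMean(double maxBrokenShare) {
        if (!(maxBrokenShare >= 0.0 && maxBrokenShare <= 1.0))
            throw new IllegalArgumentException("maxBrokenShare must be in [0, 1]");
        this.maxBrokenShare = maxBrokenShare;
    }

    @Override public Double apply(double[] predictions) {
        if (predictions.length == 0)
            throw new IllegalArgumentException("No predictions to aggregate");

        double sum = 0.0;
        int broken = 0;

        // Count NaN outputs separately instead of letting them poison the sum.
        for (double p : predictions) {
            if (Double.isNaN(p))
                broken++;
            else
                sum += p;
        }

        int valid = predictions.length - broken;

        // Fail loudly when no model is usable or too many models are degenerate.
        if (valid == 0 || broken > maxBrokenShare * predictions.length)
            throw new IllegalStateException(
                broken + " of " + predictions.length + " models returned NaN");

        return sum / valid;
    }

    public static void main(String[] args) {
        NanTolerantMean agg = new NanTolerantMean(0.5);
        // One broken model out of three is within the 50% threshold.
        System.out.println(agg.apply(new double[] {1.0, Double.NaN, 3.0})); // prints 2.0
    }
}
```

Skipping NaN values only hides the symptom; the description also asks for training itself to fail or report an error when a tree ends up as a single empty LEAF node.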



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets

2020-09-29 Thread Alexey Zinoviev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev updated IGNITE-12396:
-
Affects Version/s: (was: 2.8)

> [ML] Random Forest generates NaN for a part of models on small datasets
> ---
>
> Key: IGNITE-12396
> URL: https://issues.apache.org/jira/browse/IGNITE-12396
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 2.10





[jira] [Updated] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets

2020-09-29 Thread Alexey Zinoviev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev updated IGNITE-12396:
-
Fix Version/s: 2.10 (was: 3.0)

> [ML] Random Forest generates NaN for a part of models on small datasets
> ---
>
> Key: IGNITE-12396
> URL: https://issues.apache.org/jira/browse/IGNITE-12396
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Affects Versions: 2.8
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 2.10





[jira] [Updated] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets

2020-01-23 Thread Alexey Zinoviev (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Zinoviev updated IGNITE-12396:
-
Affects Version/s: 2.8 (was: 3.0)

> [ML] Random Forest generates NaN for a part of models on small datasets
> ---
>
> Key: IGNITE-12396
> URL: https://issues.apache.org/jira/browse/IGNITE-12396
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Affects Versions: 2.8
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 3.0


