Hi,

The model learns a correlation between the label and the features. In the Random Forest Classification example, the labeled feature represents the class that a wine belongs to, based on a given set of features.
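To make the label/feature split concrete, here is a minimal plain-Java sketch. It does not use the Ignite API; the class name LabelFirstRow, its helper methods, and the sample values are hypothetical and only illustrate the assumption that each row stores the class in the first position and the feature values after it.

```java
import java.util.Arrays;

public class LabelFirstRow {
    /** Returns the class label, assumed to be stored at index 0 of the row. */
    public static double label(double[] row) {
        return row[0];
    }

    /** Returns the feature values, i.e. everything after the label. */
    public static double[] features(double[] row) {
        return Arrays.copyOfRange(row, 1, row.length);
    }

    public static void main(String[] args) {
        // Hypothetical row: a class value followed by three made-up feature values.
        double[] row = {1.0, 14.23, 1.71, 2.43};

        System.out.println("label = " + label(row));
        System.out.println("features = " + Arrays.toString(features(row)));
    }
}
```

The model is trained to predict the first value from the remaining ones, so a prediction is correct when it equals the stored label.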
The labeled feature is defined here:

    // The label is the first coordinate of each row; the rest are features.
    Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
        .labeled(Vectorizer.LabelCoordinate.FIRST);

    ModelsComposition randomForestMdl = classifier.fit(ignite, dataCache, vectorizer);

After the model has learned the association between the features and the class label, it is tested here:

    // The first value of the row is the known class.
    double groundTruth = val.get(0);

    double prediction = randomForestMdl.predict(inputs);

    totalAmount++;
    // Count a mismatch between the known class and the predicted class as an error.
    if (!Precision.equals(groundTruth, prediction, Precision.EPSILON))
        amountOfErrors++;

If you put breakpoints on these lines, groundTruth will be one of the 3 available classes, and the model's prediction will try to match that class based on the available inputs.

See: https://apacheignite.readme.io/docs/random-forest
That document has more references on working with random forest models.

If you are new to ML, simple linear regression might be the most accessible model to learn:
https://apacheignite.readme.io/docs/ols-multiple-linear-regression

> Is there a way to parallelize the training across available cores while still limiting the operation to a single JVM process?

Apache Ignite machine learning was designed from the ground up to train a model quickly by spreading the load across all nodes of a cluster. See:
https://apacheignite.readme.io/docs/ml-partition-based-dataset

If you want to limit training to a single JVM process, create a cluster of one node.

For pointers on feature selection, take a look at the examples here:
https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml/selection
https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml/tutorial/hyperparametertuning

Thanks,
Alex