I suspect that the confusion matrix from TrainImagesClassifier is better than
the one you get using your validation polygons. The training application
samples pixels, so it will use pixels from the same polygon for both training
and validation. Pixels within one polygon tend to resemble each other, so this
can yield better performance metrics than using a completely different set of
polygons for training and for validation.
Could you share the confusion matrices with us so that we can give you a
better idea of what is happening?
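To make the difference concrete, here is a minimal Python sketch of the two ways of splitting reference samples. It is not OTB code and does not reproduce what TrainImagesClassifier does internally; the function names, the `(polygon_id, label, features)` tuple layout and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_polygon(samples, ratio=0.5, seed=0):
    """Split so that all pixels of a polygon land on the same side.

    `samples` is a list of (polygon_id, label, features) tuples
    (hypothetical layout; polygon ids are assumed unique across classes).
    """
    polygons_per_class = defaultdict(set)
    for polygon_id, label, _ in samples:
        polygons_per_class[label].add(polygon_id)

    rng = random.Random(seed)
    train_ids = set()
    for label in sorted(polygons_per_class):
        ids = sorted(polygons_per_class[label])
        rng.shuffle(ids)
        # Keep roughly `ratio` of the polygons of each class for training.
        n_train = max(1, round(len(ids) * ratio))
        train_ids.update(ids[:n_train])

    train = [s for s in samples if s[0] in train_ids]
    valid = [s for s in samples if s[0] not in train_ids]
    return train, valid


def split_by_pixel(samples, ratio=0.5, seed=0):
    """Split individual pixels at random, ignoring polygon membership."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * ratio)
    return shuffled[:cut], shuffled[cut:]


def shared_polygons(train, valid):
    """Polygon ids that contribute pixels to both subsets."""
    return {s[0] for s in train} & {s[0] for s in valid}


# Toy data: 8 polygons, 2 classes, 20 pixels per polygon.
samples = [(poly, poly % 2, (poly, px)) for poly in range(8) for px in range(20)]

poly_train, poly_valid = split_by_polygon(samples)
pix_train, pix_valid = split_by_pixel(samples)

print("polygon split, shared polygons:", shared_polygons(poly_train, poly_valid))
print("pixel split, shared polygons:  ", sorted(shared_polygons(pix_train, pix_valid)))
```

With the polygon-level split no polygon appears on both sides, whereas the pixel-level split puts pixels of the same polygon in both the training and the validation subsets, which is the situation that can inflate the confusion matrix.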
On Mon 31-Jul-2017 at 09:05:01 +0200, Poughon Victor <victor.poug...@cnes.fr> wrote:
> Hi Thierry,
> In OTB version 5.10 and higher, TrainImagesClassifier uses the sampling
> procedure from SampleSelection, another OTB application:
> You can set some parameters directly in TrainImagesClassifier, but if you
> want greater control over the way the sampling is done, that’s one option.
> There’s a detailed tutorial in the cookbook here:
> Concerning your issue, are the two confusion matrices you get really
> different qualitatively, or is it just noise? Also, which OTB version are you
> Victor Poughon
> From: email@example.com [mailto:firstname.lastname@example.org] On behalf
> of Thierry Bélouard
> Sent: Thursday, 27 July 2017 17:46
> To: otb-users
> Subject: [otb-users] Learning process: confusion matrix
> I would like some explanation of how the confusion matrix is computed
> during the learning process in OTB, because I get totally different results
> depending on how I proceed.
> On the one hand, I have split my reference data (polygons, 4 classes of
> defoliation in forest) into 2 data files: a learning data set (50% of the
> polygons) and a testing data set (the other 50%). To do so, I sampled
> polygons regularly according to their size in each defoliation class. An
> important point may be that I sampled polygons, not points or pixels. I
> compute my random forest rule on my learning data set and then I compute
> my confusion matrix from my classification map and my testing data
> On the other hand, it seems to me that we can compute the classification
> rule and the confusion matrix simultaneously with the TrainImagesClassifier
> module, using the entire reference data set (learning and testing polygons
> all together) and setting the learning/validation ratio to 0.5. Is that
> right? If so, I don’t understand why I get completely different results.
> Could the reason be a
> different sampling procedure in OTB, such as a systematic sampling of pixels for
> Thank you for your answer.
> Thierry Bélouard
> Check the OTB FAQ at
> You received this message because you are subscribed to the Google
> Groups "otb-users" group.
> To post to this group, send email to email@example.com
> To unsubscribe from this group, send email to
> For more options, visit this group at