Hello, and thank you, Victor and Jordi, for your answers.
First of all, sorry for the long silence; I was on vacation. As Jordi suggested, I am sending you the confusion matrices. But first, let me explain what I am trying to do.

I want to map defoliation of larch forests caused by a caterpillar (Zeiraphera diniana, the larch budmoth, 'tordeuse grise du mélèze' in French) in the French Alps. I use a learning process with Landsat images as input data and a relatively coarse field map as reference data. Field crews give me the map of the unhealthy stands, with three classes of defoliation (low, medium and high). The healthy forests (no defoliation) come from the IGN map (I computed a difference map). In fact, most of the stands are healthy, so my sample is not well balanced.

I try different predictors in the process: spectral bands (BS, 7), vegetation indices (8), differences of spectral bands with a reference year (7), and differences of vegetation indices with a reference year (8).

There are 2 files, one for each method:

- Method 1: 50% of the polygons of each class as the learning data set, the other 50% as the validation data set. I used the TrainImagesClassifier, ImageClassifier and ComputeConfusionMatrix applications to do that.
- Method 2: Set “Training and validation sample ratio” to 0.5 in the TrainImagesClassifier application and save the confusion matrix directly.

As you can see, the results with the second method are much better. Any comments about this result, which is surprising (to me at least), are welcome! Which method should I use?

FYI, I use OTB 6.0 on Windows.

Thierry

On Thursday 27 July 2017 at 17:46:18 UTC+2, Thierry Bélouard wrote:
> Hello,
>
> I would like some explanation of how the confusion matrix is computed
> (learning process) in OTB, because I get totally different results
> depending on the way I proceed.
>
> On one side, I have split my reference data (polygons, 4 classes of
> defoliation in forest) into 2 data files: a learning data set (50% of the
> polygons) and a testing data set (the other 50%). To do so, I sampled
> polygons regularly according to their size in each class of defoliation.
> An important point, maybe, is that I sampled polygons and not points or
> pixels. I compute my random forest rule on my learning data set and then
> I compute my confusion matrix from my classification map and my testing
> data set.
>
> On the other side, it seems to me that we can compute the classification
> rule and the confusion matrix simultaneously with the
> TrainImagesClassifier module, using the entire reference data set
> (learning and testing polygons all together) and setting the
> learning/validation ratio to 0.5. Is that right? If so, I don't
> understand why I get completely different results. Could the reason be a
> different sampling procedure in OTB, such as a systematic sampling of
> pixels?
>
> Thank you for your answer.
>
> Thierry Bélouard
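To make the question about sampling more concrete, here is a minimal Python sketch of the two ways of splitting reference data: by whole polygons (as in Method 1) and by individual pixels. It does not use OTB and makes no claim about how TrainImagesClassifier splits samples internally; that is precisely the open question. The polygon list, the sizes, and the function names are all hypothetical, chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy reference data: (polygon_id, class_label, n_pixels).
polygons = [
    (1, "healthy", 400), (2, "healthy", 250), (3, "healthy", 120), (4, "healthy", 60),
    (5, "low", 90), (6, "low", 40),
    (7, "medium", 70), (8, "medium", 30),
    (9, "high", 50), (10, "high", 20),
]

def split_by_polygon(polys):
    """Within each class, sort polygons by size and alternate them
    between the learning set and the testing set."""
    by_class = defaultdict(list)
    for poly in polys:
        by_class[poly[1]].append(poly)
    train, test = [], []
    for members in by_class.values():
        members.sort(key=lambda p: p[2], reverse=True)
        train.extend(members[0::2])
        test.extend(members[1::2])
    return train, test

def split_by_pixel(polys, ratio=0.5, seed=0):
    """Draw every pixel independently of the polygon it belongs to."""
    rng = random.Random(seed)
    pixels = [(pid, label) for pid, label, n in polys for _ in range(n)]
    rng.shuffle(pixels)
    cut = int(len(pixels) * ratio)
    return pixels[:cut], pixels[cut:]

def shared_polygons(train_ids, test_ids):
    """Polygon ids that contribute samples to both sets."""
    return sorted(set(train_ids) & set(test_ids))

poly_train, poly_test = split_by_polygon(polygons)
pix_train, pix_test = split_by_pixel(polygons)

shared_poly = shared_polygons([p[0] for p in poly_train], [p[0] for p in poly_test])
shared_pix = shared_polygons([p[0] for p in pix_train], [p[0] for p in pix_test])

print("polygons on both sides (polygon split):", shared_poly)
print("polygons on both sides (pixel split):", shared_pix)
```

With a polygon-level split, no polygon contributes to both sets. With a pixel-level split, neighbouring pixels of the same polygon end up on both sides, which would tend to make validation scores look more optimistic. Whether this explains the gap between the two attached matrices is only a hypothesis to check.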
confusion Landsat 2015 testing.xlsx
Description: MS-Excel 2007 spreadsheet
confusion Landsat 2015 constant sample.xlsx
Description: MS-Excel 2007 spreadsheet
