Hello Paulo,
Many thanks for this. I updated the module last night to include the ability 
to force regression mode, as well as some more error checking for valid 
combinations of input parameters. Classification mode also checks that the 
input labelled pixels are CELL type. I'm not yet outputting all of the 
appropriate uncertainty measures like RSQ for regression mode, but I'll add 
those in.
It is interesting that you had better performance when using regression. I 
will have to check that for my application using scikit-learn. In R, using the 
randomForest package, the results were pretty much identical, but my classes 
were already balanced, and I think class imbalance is one factor that can lead 
to significant differences between binary classification probabilities and 
regression predictions.
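
For anyone wanting to run the same check in scikit-learn, here is a minimal 
sketch on synthetic data. It is not code from r.randomforest; the dataset, the 
10% positive rate, and the parameter values are assumptions chosen only for 
illustration. It compares out-of-bag class probabilities from a classifier with 
out-of-bag predictions from a regressor fitted to the same 0/1 response.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Synthetic, deliberately imbalanced binary response (roughly 10% positives).
# These values are illustrative assumptions, not taken from the thread.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# Classification mode: out-of-bag probability of class 1 per sample.
clf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
clf.fit(X, y)
p_class = clf.oob_decision_function_[:, 1]

# Regression mode: the same 0/1 response treated as a continuous target.
reg = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
reg.fit(X, y.astype(float))
p_reg = reg.oob_prediction_

# How closely do the two sets of out-of-bag predictions agree?
print("correlation:", np.corrcoef(p_class, p_reg)[0, 1])
print("mean absolute difference:", np.mean(np.abs(p_class - p_reg)))
```

Changing the `weights` argument towards `[0.5, 0.5]` would be one way to see 
whether the two modes converge when the classes are balanced.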

Yes, I will definitely use this as a template to include other methods. I only 
recently switched my work from R to Python, but am just submitting a paper 
based on R which uses a range of classifiers like randomForest, GLM, GAM, and 
MARS, for which it was useful to evaluate the differences.
Steve


    _____________________________
From: Paulo van Breugel <[email protected]>
Sent: Sunday, March 27, 2016 3:11 AM
Subject: Re: [GRASS-dev] RandomForest classifier for imagery groups add-on
To: Vaclav Petras <[email protected]>, Steven Pawley 
<[email protected]>
Cc:  <[email protected]>


Hi Steve

Yes, your use case will not differ methodologically from species modeling 
based on presence/absence. One reason I was asking for the regression 
randomForest is that in one article (can't remember the title, will look it up) 
it was found that the regression approach yielded better results, even though 
the response variable is binary. On your help page, you write that 
r.randomforest performs random forest classification and regression, and that 
the regression mode can be used by setting the mode to the regression option. 
But I am not seeing that option?
   
Great that you are planning other methods as well. Given model uncertainties 
(quite an issue in species distribution modeling), having multiple methods is 
really a plus, especially as it allows one to build consensus models [1] and 
combine them to create uncertainty maps.
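
To make the consensus idea concrete, here is a small NumPy sketch. The three 
prediction arrays are random stand-ins for per-pixel probability maps from 
different models, and the unweighted mean is only one possible consensus rule, 
used here as an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for probability maps from three different models
# (for example random forest, GLM and GAM), each of shape (rows, cols).
rows, cols = 4, 5
predictions = np.stack(
    [rng.uniform(0.0, 1.0, size=(rows, cols)) for _ in range(3)]
)

# Consensus map: unweighted mean of the model predictions per pixel.
consensus = predictions.mean(axis=0)

# Uncertainty map: spread between models per pixel (standard deviation).
uncertainty = predictions.std(axis=0)

print(consensus.round(2))
print(uncertainty.round(2))
```

Pixels where the models disagree strongly show up with high values in 
`uncertainty`, whatever the consensus value is.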
   
 Cheers,  
   
 Paulo  
   
 [1] Marmion, M., Parviainen, M., Luoto, M., Heikkinen, R.K., & Thuiller, W. 
2009. Evaluation of consensus methods in predictive species distribution 
modelling. Diversity and Distributions 15: 59–69.
   
            
On 27-03-16 00:47, Steven Pawley wrote:
Hi Vaclav and Paulo,
Thanks for those pointers re. lazy technique and documentation. I have a 
RandomForest diagram to explain the process, as well as some examples, so I'll 
update the documentation next week.
Paulo, thanks for running a few tests. It looks like there is an error with 
the class_weight parameter; I'll look into that.
In terms of species distribution modelling, I have been using the tool for 
landslide susceptibility modelling, which I believe is methodologically similar 
to SDM in terms of having a binary response variable. I have been doing this 
for the area of Alberta, using an 8000 x 14000 pixel, 17-band stack of 
predictors. In the case of a binary response variable, the usual approach is to 
run random forest in classification mode, i.e. with fully grown trees, but use 
the class probabilities to represent the 'species' or 'landslide' index.
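
As a rough illustration of the approach described above (a sketch with 
scikit-learn and NumPy, not the code used in r.randomforest), the snippet below 
trains a classifier on labelled pixels and writes the probability of the 
positive class into an index map. The tiny random stack, the training labels, 
and the chunk size are all made-up assumptions; predicting in chunks is just 
one way to limit memory use on a large stack.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Tiny random stand-in for a 17-band predictor stack: (bands, rows, cols).
n_bands, rows, cols = 17, 40, 50
stack = rng.normal(size=(n_bands, rows, cols))

# Hypothetical labelled pixels: random locations with a synthetic 0/1 label.
n_train = 300
r = rng.integers(0, rows, n_train)
c = rng.integers(0, cols, n_train)
X_train = stack[:, r, c].T                      # shape (n_train, n_bands)
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)

# Classification mode; scikit-learn grows trees fully by default.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Column of predict_proba that corresponds to the positive class (label 1).
pos_col = list(clf.classes_).index(1)

# Flatten the stack to (n_pixels, n_bands) and predict in chunks.
flat = stack.reshape(n_bands, -1).T
index_map = np.empty(rows * cols, dtype=np.float32)
chunk = 500                                     # pixels per prediction batch
for start in range(0, flat.shape[0], chunk):
    stop = start + chunk
    index_map[start:stop] = clf.predict_proba(flat[start:stop])[:, pos_col]

# Back to raster shape: one probability per pixel.
index_map = index_map.reshape(rows, cols)
print(index_map.shape, float(index_map.min()), float(index_map.max()))
```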
I am planning to implement other methods from the scikit-learn package, which 
represents a trivial change to the module once the bugs are ironed out. I will 
probably look to create modules for SVM and logistic regression, and maybe 
nearest neighbours classification. Certainly open to any suggestions.
Steve
_____________________________
From: Vaclav Petras <[email protected]>
Sent: Saturday, March 26, 2016 11:21 AM
Subject: Re: [GRASS-dev] RandomForest classifier for imagery groups add-on
To: Steven Pawley <[email protected]>
Cc: <[email protected]>
     
     
                  
On Sat, Mar 26, 2016 at 12:40 PM, Steven Pawley <[email protected]> wrote:
I would like to draw your attention to a new GRASS add-on, r.randomforest, 
which uses the scikit-learn and pandas Python packages to classify GRASS 
rasters.

Thanks, this looks good. Please consider adding an image to the documentation 
to better promote the module [1] and also an example which would work with the 
NC SPM dataset [2]. For the addon to generate documentation on the server and 
work well on a few other special occasions, it is advantageous to employ the 
lazy import technique for the non-standard dependencies; see for example 
v.class.ml and v.class.mlpy [3].
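
A minimal sketch of what a lazy import can look like in Python is shown below. 
It is not copied from v.class.ml or v.class.mlpy; the helper name `lazy_import` 
and the error message are hypothetical. The point is only that the 
non-standard package is imported inside `main()` rather than at the top of the 
file, so the script can be loaded without it.

```python
import importlib


def lazy_import(module_name):
    """Import a module on first use, with a readable error if it is missing.

    Hypothetical helper for illustration; not part of GRASS or scikit-learn.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise RuntimeError(
            f"Python package '{module_name}' is required to run this module "
            "but could not be imported; please install it first."
        ) from err


def main():
    # Imported here rather than at module level, so that merely loading the
    # script (for example to generate its manual page) does not require
    # scikit-learn to be installed.
    ensemble = lazy_import("sklearn.ensemble")
    return ensemble.RandomForestClassifier(n_estimators=100)
```

In a real add-on, `main()` would be called from the script's usual entry point 
after the options have been parsed.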
       
Vaclav

[1] https://trac.osgeo.org/grass/wiki/Submitting/Docs#Images
[2] https://grass.osgeo.org/download/sample-data/
[3] https://trac.osgeo.org/grass/changeset/66482/
                
     
          
    


  
_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev
