Hi Steve

Great news! I gave it a quick try (on Ubuntu 14.04, GRASS 7 master). Size of the input raster layers: 1578 rows, 1436 columns.

*1st try - input full map, classes 1/0*
I had to stop it because it took too much time. Stopping it did not stop the Python processes, however; I had to kill them.

*2nd try - input random sample of 100 points, 1 (12) and 0 (88), with b flag*
r.randomforest -b igroup=predictors@SampleSize roi=test2 output=test2_output ntrees=500 mfeatures=-1 minsplit=2 randst=1 lines=100
Group <predictors> references the following raster maps:
Traceback (most recent call last):
  File "/home/paulo/.grass7/addons/scripts/r.randomforest", line 335, in <module>
    main()
  File "/home/paulo/.grass7/addons/scripts/r.randomforest", line 243, in main
    class_weight = "balanced", max_features = mfeatures, min_samples_split = minsplit, random_state = randst)
TypeError: __init__() got an unexpected keyword argument 'class_weight'
Removing raster <tmp_jNyNcqZa>
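
The TypeError above suggests that the installed scikit-learn estimator does not accept a class_weight argument, which would be consistent with an older scikit-learn release than the add-on expects (an assumption; the installed version is not stated here). A minimal sketch of a defensive check follows. The names accepts_keyword, build_estimator, OldForest and NewForest are hypothetical and used only for illustration; they are not part of r.randomforest or scikit-learn.

```python
import inspect
import warnings


def accepts_keyword(cls, name):
    """Return True if cls(...) can be called with the keyword argument `name`."""
    params = inspect.signature(cls).parameters
    if name in params:
        return True
    # A **kwargs catch-all would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldForest:
    """Hypothetical stand-in for an estimator without class_weight."""

    def __init__(self, n_estimators=10, max_features="auto"):
        self.n_estimators = n_estimators
        self.max_features = max_features


class NewForest(OldForest):
    """Hypothetical stand-in for an estimator that supports class_weight."""

    def __init__(self, n_estimators=10, max_features="auto", class_weight=None):
        super().__init__(n_estimators, max_features)
        self.class_weight = class_weight


def build_estimator(cls, balanced=False, **kwargs):
    """Pass class_weight only when the estimator class supports it."""
    if balanced:
        if accepts_keyword(cls, "class_weight"):
            kwargs["class_weight"] = "balanced"
        else:
            warnings.warn("class_weight is not supported; ignoring the balanced option")
    return cls(**kwargs)


old = build_estimator(OldForest, balanced=True, n_estimators=500)
new = build_estimator(NewForest, balanced=True, n_estimators=500)
```

With a check like this, a balanced run on an estimator that lacks the parameter would fall back to an unweighted model with a warning instead of ending in a traceback.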

*3rd try - input random sample of 100 points, 1 (#12) and 0 (#88), without b flag*
r.randomforest igroup=predictors@SampleSize roi=test2 output=test2_output ntrees=500 mfeatures=-1 minsplit=2 randst=1 lines=100
Group <predictors> references the following raster maps:
Our OOB prediction of accuracy is: 89.0%
                   Raster  Importance
0   bio1_wc30s@SampleSize    0.183670
1   bio2_wc30s@SampleSize    0.139914
2   bio3_wc30s@SampleSize    0.105035
3   bio4_wc30s@SampleSize    0.106413
4  bio13_wc30s@SampleSize    0.087399
5  bio14_wc30s@SampleSize    0.146495
6     dm_wc30s@SampleSize    0.104575
7   llds_wc30s@SampleSize    0.126499
Removing raster <tmp_RhTllKlA>

*Questions*
* I am using it for species distribution modeling (presence/absence input map), but I prefer to use the regression mode. Is there a way to force it to use the regression mode?
* Are you planning to implement other classification methods? If this works, it seems it shouldn't be too hard to replace the random forest method with any of the other methods in scikit-learn. I have for some time been thinking about using it, but my programming skills are not up to standard. Perhaps it would be easier to use your add-on as a template?

Cheers,

Paulo




On Sat, Mar 26, 2016 at 5:40 PM, Steven Pawley <[email protected]> wrote:

   Hello developers,

   I would like to draw your attention to a new GRASS add-on,
   r.randomforest, which uses the scikit-learn and pandas Python
   packages to classify GRASS rasters. Similar to existing GRASS
   classification methods, it uses an imagery group and a raster of
   labelled pixels as the inputs for the classification. It reads the
   rasters row by row, then passes bundles of rows, sized by a
   user-specified row increment, to the classifier. This keeps memory
   requirements low while still allowing efficient classification:
   the scikit-learn implementation is multithreaded by default, and
   classifying one row at a time results in too much stop-start
   behaviour. The feature importance scores and out-of-bag error are
   displayed in the command window.
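
The row bundling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the add-on's actual code: row_bundles and classify_in_bundles are hypothetical names, and read_row and predict stand in for whatever raster reader and fitted classifier are actually used.

```python
def row_bundles(n_rows, lines):
    """Yield (start, stop) row ranges that cover n_rows in bundles of `lines` rows."""
    if lines < 1:
        raise ValueError("lines must be a positive integer")
    for start in range(0, n_rows, lines):
        # The final bundle may be shorter than `lines`.
        yield start, min(start + lines, n_rows)


def classify_in_bundles(read_row, predict, n_rows, lines):
    """Read rows one at a time, but hand them to the classifier one bundle at a time."""
    results = []
    for start, stop in row_bundles(n_rows, lines):
        # Only `lines` rows are held in memory at once.
        block = [read_row(r) for r in range(start, stop)]
        results.extend(predict(block))
    return results


# Row count and increment taken from the test run in this thread.
bundles = list(row_bundles(1578, 100))
print(len(bundles), bundles[0], bundles[-1])
```

For a 1578-row raster and lines=100 this gives 16 bundles, the last covering rows 1500 to 1577. A larger increment means fewer calls to the classifier at the cost of more memory per call.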

   I would appreciate testing. You need to have scikit-learn and
   pandas installed in your Python environment, which is easy on Linux
   and OS X; instructions for Windows are provided in the tool.

   I have another add-on that I will upload soon, r.roc, which
   generates ROC and AUROC for prediction models.

   Steve



   _______________________________________________
   grass-dev mailing list
   [email protected] <mailto:[email protected]>
   http://lists.osgeo.org/mailman/listinfo/grass-dev

