Re: [R-sig-Geo] seemingly unresolved problem with predict() in package raster
In the development version of raster (>= 2.1-22, on R-Forge) there is a new 'factors' argument to predict that allows you to deal with this problem with cforest models. The help file has an example. This may change, as I am not sure this is the best approach; a perhaps more consistent alternative would be to specify that the predictor variable in the RasterStack is a factor.

Robert

On Fri, Apr 5, 2013 at 12:05 AM, Gonzalez-Mirelis Genoveva <genoveva.gonzalez-mire...@imr.no> wrote:
> Many thanks for looking into this Robert!
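As a rough illustration of the 'factors' argument described in this message: the raster documentation describes it as a list of level vectors whose names match the layer names. The sketch below only shows one way such a list might be assembled. It is not the help-file example; the data frame `v` is made-up stand-in data, treating `sedenv` as a factor is an assumption, and the final call is left as a comment because it needs the real `predictors` RasterStack and the fitted `confor.dens` model from this thread.

```r
# Hypothetical stand-in for the data frame used to fit the model;
# only two categorical predictors are shown, with invented values.
v <- data.frame(
  morphrec = factor(c(1, 2, 3, 4, 5, 6)),
  sedenv   = factor(c(1, 2, 3, 4, 5, 1))
)

# Find the factor columns and collect their levels in a named list.
# lapply() keeps the column names, so the list elements are named
# after the predictors (and hence after the raster layers).
is_fac <- vapply(v, is.factor, logical(1))
f <- lapply(v[is_fac], levels)

str(f)

# With a raster version that has the 'factors' argument, plus party and the
# objects from this thread, the call would then look something like:
# r <- predict(predictors, confor.dens, type = 'response', OOB = TRUE, factors = f)
```

Building the list from the training data, rather than typing the levels by hand, keeps the levels passed to predict in step with those the model was fitted on.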
Re: [R-sig-Geo] seemingly unresolved problem with predict() in package raster
Many thanks for looking into this, Robert!

From: Robert J. Hijmans [mailto:r.hijm...@gmail.com]
Sent: April-04-13 17:53
To: Gonzalez-Mirelis Genoveva
Cc: r-sig-geo@r-project.org
Subject: Re: [R-sig-Geo] seemingly unresolved problem with predict() in package raster

cforest with factors are currently not supported (although they may work in some cases). I will change the predict function to make it more general by adding a 'levels' argument such that you can indicate (for models with a non-standard structure such as cforest) which variables are factors and what their levels are.

Robert
[R-sig-Geo] seemingly unresolved problem with predict() in package raster
Hi all,

I have a problem with the function raster::predict very similar to the one described here [1], [2], using raster package version 2.0-41 and party package version 1.0-6, where my model is a conditional inference forest (party::cforest). I could not find a solution in either post.

The problem is the following: when I use the function predict() I get an error relating to a mismatch between the levels of factors in the data frame used to fit the model and those of the raster layers used to predict to the whole surface. The only difference that I can see is that the levels of the factors in the model are as follows (e.g.):

    morphrec.levels <- levels(v$morphrec)
    morphrec.levels
    [1] "1" "2" "3" "4" "5" "6"

Whereas if I ask for the unique values of the same raster layer (from which the values were extracted in an earlier step) that is now being used to predict a value for the response variable, I see the following:

    morphrec.values <- sort(unique(getValues(morphrec)))
    morphrec.values
    [1] 1 2 3 4 5 6

When I try running predict, the following happens:

    r <- predict(predictors, confor.dens, type='response', OOB=TRUE)
    Loading required package: tcltk
    Loading Tcl/Tk interface ... done
    Error in checkData(oldData, RET) :
      Classes of new data do not match original data

    predictors
    class       : RasterStack
    dimensions  : 2787, 2293, 6390591, 7  (nrow, ncol, ncell, nlayers)
    resolution  : 200, 200  (x, y)
    extent      : 296387.5, 754987.5, 7488413, 8045813  (xmin, xmax, ymin, ymax)
    coord. ref. : +proj=utm +zone=33 +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
    names       : bathy, slope, tri, grainsize, landscape, sedenv, morphrec
    min values  : 0.44850001, 0.02432192, 0.04429007, 1., 21., 1., 1.
    max values  : 2912.94849, 46.13418, 43.59974, 300.0, 431.0, 5.0, 6.0

    confor.dens
         Random Forest using Conditional Inference Trees
    Number of trees:  1000
    Response:  density
    Inputs:  bathy, slope, tri, grainsize, landscape, sedenv, morphrec
    Number of observations:  1020

Is there any way of telling predict() that the values of some raster layers will have to be considered characters? Or am I facing a different problem entirely?

Any help will be much appreciated. I have been working on creating a small, reproducible example, but I can't seem to get the exact same behavior.

Regards,
Geno

[1] http://r-sig-geo.2731867.n2.nabble.com/problems-with-predict-in-package-raster-td6627291.html#a6628041
[2] http://r-sig-geo.2731867.n2.nabble.com/new-function-predict-package-raster-td5796744.html#a5816769

Genoveva Gonzalez Mirelis, Scientist
Institute of Marine Research
Nordnesgaten 50
5005 Bergen, Norway
+47 5523 6376
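The kind of mismatch described in this message can be reproduced without any rasters. The following is a minimal sketch in base R with invented data: the training column is a factor, whereas values read from a raster layer come back as plain numbers, so the two can look alike while having different classes. Rebuilding the new values as a factor with the training levels is one way to line them up; whether that alone satisfies the check inside party is an assumption, not something confirmed in this thread.

```r
# Invented training data: morphrec stored as a factor, as in the fitted model.
train <- data.frame(morphrec = factor(c(1, 2, 3, 4, 5, 6)))

# Invented stand-in for raster cell values: plain numerics, not a factor.
cell_values <- c(3, 1, 6, 2, 5, 4, 3)

class(train$morphrec)        # factor
class(cell_values)           # numeric
levels(train$morphrec)       # character vector of level labels
sort(unique(cell_values))    # numeric vector of distinct values

# Rebuild the new data as a factor that shares the training levels.
newdata <- data.frame(
  morphrec = factor(cell_values, levels = levels(train$morphrec))
)

# TRUE when the new data carries exactly the training levels.
identical(levels(newdata$morphrec), levels(train$morphrec))
```

Passing `levels =` explicitly matters when a subset of the raster does not contain every class: without it, the new factor would have fewer levels than the training data.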
Re: [R-sig-Geo] seemingly unresolved problem with predict() in package raster
cforest models with factors are currently not supported (although they may work in some cases). I will change the predict function to make it more general by adding a 'levels' argument, so that you can indicate (for models with a non-standard structure such as cforest) which variables are factors and what their levels are.

Robert

On Thu, Apr 4, 2013 at 4:22 AM, Gonzalez-Mirelis Genoveva <genoveva.gonzalez-mire...@imr.no> wrote:
> [...]
> When I try running predict, the following happens:
>
>     r <- predict(predictors, confor.dens, type='response', OOB=TRUE)
>     Error in checkData(oldData, RET) :
>       Classes of new data do not match original data
> [...]
> Is there any way of telling predict() that the values of some raster
> layers will have to be considered characters? Or am I facing a
> different problem entirely?

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo