[R] Missing variable in new dataframe for prediction
Hi,

I'm using a loop to evaluate several models by taking adjacent variables from my data frame. When I try to get predictions for new values, I get an error message about a missing variable in my new data frame. Below is an example adapted from ?gam in the mgcv package:

library(mgcv)
set.seed(0)
n <- 400
sig <- 2
x0 <- runif(n, 0, 1)
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1)
x3 <- runif(n, 0, 1)
f0 <- function(x) 2 * sin(pi * x)
f1 <- function(x) exp(2 * x)
f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
f3 <- function(x) 0*x
f <- f0(x0) + f1(x1) + f2(x2)
e <- rnorm(n, 0, sig)
y <- f + e
Mydata <- data.frame(y=y,x0=x0,x1=x1,x2=x2,x3=x3)
remove(list=c("y","x0","x1","x2","x3"))

# Note below the syntax of the 3rd variable required for my loop
for (i in 4:5){
  b <- gam(y~s(x0)+ s(x1)+ ns(Mydata[,i], 3), data=Mydata)
  newd <- data.frame(x0=(0:399)/30,x1=(0:399)/30,x2=(0:399)/30,x3=(0:399)/30)
  pred <- predict.gam(b,newd)
}

Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
  invalid type (list) for variable 'Mydata'
In addition: Warning message:
not all required variables have been supplied in newdata! in: predict.gam(b, newd)

# Defining the name for the variable as in the gam function doesn't solve the problem
newd <- data.frame(x0=(0:399)/30,x1=(0:399)/30,x2=(0:399)/30,"Mydata[,i]"=(0:399)/30)

Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
  invalid type (list) for variable 'Mydata'
In addition: Warning message:
not all required variables have been supplied in newdata! in: predict.gam(b, newd)

How should I define my new dataset to be able to get my predictions?
Thanks in advance

Alain Le Tertre
Institut de Veille Sanitaire (InVS) / Département Santé Environnement
Head of the Statistical Information Systems unit
12 rue du val d'Osne, 94415 Saint Maurice cedex, FRANCE
Voice: 33 1 41 79 68 76
Fax: 33 1 41 79 67 68
email: [EMAIL PROTECTED]
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Missing variable in new dataframe for prediction
The call to library(splines) is missing. Also, try replacing the line b <- ... with

fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
b <- do.call(gam, list(fo, data = Mydata))

to dynamically recreate the formula on each iteration of the loop with the correct name, x2 or x3, inserted.
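To see why pasting the column name into the formula helps, here is a minimal base-R sketch that only inspects the variables each formula refers to. It fits nothing, so mgcv and splines are not needed. The helper build_formula and the placeholder values in Mydata are assumptions made for illustration, not part of the original post.

```r
# Toy data frame with the same column layout as the thread's Mydata
# (y, x0, x1, x2, x3); the values are placeholders.
Mydata <- data.frame(y = 0, x0 = 0, x1 = 0, x2 = 0, x3 = 0)

# Formula as originally written: the third term refers to Mydata and i,
# so prediction looks for a variable called 'Mydata' in newdata.
bad <- y ~ s(x0) + s(x1) + ns(Mydata[, i], 3)

# Hypothetical helper: rebuild the formula with the column name pasted in.
build_formula <- function(i) {
  as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
}
good <- lapply(4:5, build_formula)

# Variables each formula needs (formulas are inspected, not evaluated).
bad_vars  <- all.vars(bad)
good_vars <- lapply(good, all.vars)
print(bad_vars)
print(good_vars)
```

The rebuilt formulas mention only column names, so a newdata frame with columns x0, x1, x2 and x3 supplies everything they ask for.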
Re: [R] Missing variable in new dataframe for prediction
Actually, this simpler replacement for the b <- ... line would work just as well:

fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
b <- gam(fo, data = Mydata)
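For completeness, here is a sketch of the whole loop from the original post with that replacement dropped in. It assumes the mgcv package is installed (splines ships with R). The list preds and the variable v are names introduced here for illustration. Note that newd runs well beyond the 0 to 1 range of the simulated data, as in the original post, so most of the predictions are extrapolations.

```r
library(mgcv)     # gam(), s()
library(splines)  # ns()

# Simulated data, following the example in the original post
set.seed(0)
n <- 400
sig <- 2
x0 <- runif(n, 0, 1)
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1)
x3 <- runif(n, 0, 1)
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
y <- 2 * sin(pi * x0) + exp(2 * x1) + f2(x2) + rnorm(n, 0, sig)
Mydata <- data.frame(y = y, x0 = x0, x1 = x1, x2 = x2, x3 = x3)
rm(y, x0, x1, x2, x3)

# New data with ordinary column names, as in the original post
newd <- data.frame(x0 = (0:399) / 30, x1 = (0:399) / 30,
                   x2 = (0:399) / 30, x3 = (0:399) / 30)

preds <- list()  # hypothetical container: one prediction vector per model
for (i in 4:5) {
  v <- names(Mydata)[i]  # "x2", then "x3"
  fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", v))
  b <- gam(fo, data = Mydata)
  preds[[v]] <- predict(b, newd)
}
```

Storing the predictions in a list keyed by variable name keeps the result of each iteration, whereas the original loop overwrote pred each time.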