On Sun, 11 Jan 2009, Elsa et Stéphane BOUEE wrote:
Hi all,
I have a question about the package « survey ».
Please don't send questions to the list and to me separately. Either one is ok,
but not both.
I am having difficulty using the function calibrate(). Although it works
well with a single factor variable, I cannot use it with two, and I get the
message
Error in regcalibrate.survey.design2(design, formula, population,
aggregate.stage = aggregate.stage, : Population and sample totals are not
the same length.
Here is the format I use as a data.frame:
<snip>
My program is:
grap<-svydesign(id=~1, data=ecodiaMG)
regMG <-c(region1MG_NE =852, region1MG_NO=662, region1MG_P=636,
region1MG_SE=961, region1MG_SO=545)
sexMG <-c(sexe1MG_F =976, sexe1MG_H=2680)
ageMG <-c(age_cl1MG_40 =380, age_cl1MG_4054=2099, age_cl1MG_54=1177)
grap2<- calibrate(grap, formula= ~ age_cl1-1, c(ageMG))
grap3<- calibrate(grap2, formula= ~ sexe1-1, c(sexMG))
grap4<- calibrate(grap3, formula= ~region1-1, c(regMG))
I can calibrate the variables one by one, which is wrong, so I would like to
do it all at once:
grap2<- calibrate(grap, formula= ~ age_cl1+ sexe1+ regMG -1, c(ageMG, sexMG,
regMG ))
You need to drop the population totals for the first level of sexe1 and
region1 (I assume you mean region1, not regMG, in the formula argument).
grap2<- calibrate(grap, formula= ~ age_cl1+ sexe1+ region1 -1,
c(ageMG, sexMG[-1], regMG[-1] ))
The population totals for calibrate() are the column totals for the regression
model matrix specified by the formula. With the default settings, when you
have a single factor
~region1
the model matrix has an intercept column and then columns for each level of the
factor except the first. Using the -1 notation
~region1 -1
removes the intercept and so requires a column for each level of the factor.
When you have two or more factors and no intercept the first factor is coded
with a column for each level of the factor. The sum of these columns is a
constant column, so the model now effectively includes an intercept and all
remaining factor variables are coded with a column for all levels except the
first.
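
The coding rules above can be seen directly with base R's model.matrix(). This
is only a small sketch: the data frame `toy` and its values are made up for
illustration, and only the factor names are borrowed from the question.

```r
# Toy data (hypothetical values; only the factor names follow the thread).
toy <- data.frame(
  age_cl1 = factor(c("40", "4054", "54", "40", "4054", "54")),
  sexe1   = factor(c("F", "H", "F", "H", "F", "H")),
  region1 = factor(c("NE", "NO", "P", "SE", "SO", "NE"))
)

# Single factor, default coding: intercept plus every level except the first.
colnames(model.matrix(~ region1, data = toy))

# Single factor, no intercept: one column per level.
colnames(model.matrix(~ region1 - 1, data = toy))

# Several factors, no intercept: only the first factor keeps all its levels;
# the later factors lose their first level.
mm <- model.matrix(~ age_cl1 + sexe1 + region1 - 1, data = toy)
colnames(mm)
```

In the last call there is no column for sexe1F or region1NE, which is why the
matching population totals have to be left out.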
One way to be sure about which model matrix corresponds to the formula is to use
the formula in a regression model, e.g. with svyglm(), and see which
coefficients appear.
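
A lighter check than fitting a model, sketched here with model.matrix() rather
than svyglm(), is to compare the names of the population vector with the
columns of the model matrix. The stand-in data frame `ecodia_toy` is
hypothetical, and its level names (MG_40, MG_F, ...) are an assumption inferred
from the names of the totals in the question.

```r
# Population totals as given in the question.
ageMG <- c(age_cl1MG_40 = 380, age_cl1MG_4054 = 2099, age_cl1MG_54 = 1177)
sexMG <- c(sexe1MG_F = 976, sexe1MG_H = 2680)
regMG <- c(region1MG_NE = 852, region1MG_NO = 662, region1MG_P = 636,
           region1MG_SE = 961, region1MG_SO = 545)

# Hypothetical stand-in for ecodiaMG: one row per combination of levels.
ecodia_toy <- expand.grid(
  age_cl1 = factor(c("MG_40", "MG_4054", "MG_54")),
  sexe1   = factor(c("MG_F", "MG_H")),
  region1 = factor(c("MG_NE", "MG_NO", "MG_P", "MG_SE", "MG_SO"))
)

mm <- model.matrix(~ age_cl1 + sexe1 + region1 - 1, data = ecodia_toy)

# All ten totals: too long for the eight-column model matrix.
length(c(ageMG, sexMG, regMG)) == ncol(mm)

# Dropping the first level of the second and third factors lines things up.
pop <- c(ageMG, sexMG[-1], regMG[-1])
identical(names(pop), colnames(mm))
```

With the real data, the same comparison can be made by replacing `ecodia_toy`
with the data frame passed to svydesign().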
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
[email protected] University of Washington, Seattle