[R] tapply error svyby function survey package

2014-11-12 Thread Martin Canon
Hi.


I'm trying to calculate the weighted mean score of a quality of life
measure (ovt) in patients with irritable bowel syndrome by their
marital status (d7).

This is a summary of the structure of the dataset:

 str(sii.tesis)
'data.frame': 1063 obs. of  75 variables:
 $ id     : int  51 52 53 54 55 56 57 58 59 60 ...
 $ stratum: Factor w/ 6 levels "MEst","MAcad",..: 1 4 NA 4 4 1 6 NA 4 4 ...
 $ expfc  : num  22.8 17.1 NA 17.1 17.1 ...
 $ d6     : Factor w/ 3 levels "Estudiante","Profesor",..: 1 1 NA 1 1 1 3 NA 1 1 ...
 $ d7     : Factor w/ 6 levels "Soltero","Casado",..: 1 1 NA 1 1 1 1 NA 1 1 ...
 $ d7c    : Factor w/ 2 levels "No estable","Estable": 1 1 NA 1 1 1 1 NA 1 1 ...
 $ s1cm   : Factor w/ 2 levels "No","Si": 1 2 NA 1 1 1 2 NA 1 1 ...
 $ ovt    : num  NA 93.4 NA NA NA ...

I declared the sampling design:

 sii.design <- svydesign(
  id = ~1,
  strata = ~stratum,
  weights = ~expfc,
  data = subset(sii.tesis, !is.na(stratum)))

Then I tried to get the result:

 svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)

but I get the error:

Error in tapply(1:NROW(x), list(factor(strata)), function(index) { :
  arguments must have same length


Both variables have the same length, and every row with a non-missing ovt
also has a non-missing d7.

I tried the same thing with another grouping variable, role (d6),
and it works:

 svyby(~ovt, ~d6, sii.design, svymean, na.rm = TRUE, level = 0.95)
                           d6      ovt       se
Estudiante         Estudiante 71.01805 1.370569
Profesor             Profesor 72.30923 6.518378
Administrativo Administrativo 75.69102 3.715050

If I use the recategorized d7 variable (d7c, two levels only) it works too:

 svyby(~ovt, ~d7c, sii.design, svymean, na.rm = TRUE, level = 0.95)
                  d7c      ovt      se
No estable No estable 70.92344 1.37460
Estable       Estable 74.53719 4.16954


What could be the problem?


Regards.


Martin Canon
Colombia, South America

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tapply error svyby function survey package

2014-11-12 Thread Anthony Damico
try resetting your levels?  if that doesn't work, please dput() an example
data set that we can test with :) thanks!

sii.design <- update( sii.design , d6 = factor( d6 ) )
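
Since the failing call groups by d7, one thing that may be worth checking (an assumption, not a confirmed diagnosis of this error) is whether some d7 levels have very few rows with a non-missing ovt inside the design. A minimal base-R sketch follows. The data frame `toy` is a made-up stand-in for sii.tesis, and its values and the `sparse_levels` name are invented for illustration:

```r
# Toy stand-in for sii.tesis; all values are invented for illustration.
toy <- data.frame(
  stratum = factor(c("A", "A", "B", "B", "B", NA)),
  d7      = factor(c("Soltero", "Casado", "Soltero", "Soltero", "Viudo", "Soltero"),
                   levels = c("Soltero", "Casado", "Viudo", "Divorciado")),
  ovt     = c(80, NA, 75, 90, 60, 70)
)

# Rows that survive both the design's stratum filter and na.rm = TRUE on ovt.
used <- subset(toy, !is.na(stratum) & !is.na(ovt))

# Usable observations per d7 level.
n_by_level <- table(used$d7)
print(n_by_level)

# Levels with zero or one usable observation deserve a closer look.
sparse_levels <- names(n_by_level)[as.vector(n_by_level) <= 1]
print(sparse_levels)

# The same counts broken down by stratum.
print(table(used$stratum, used$d7))
```

On the real data, the same tables could be built from subset(sii.tesis, !is.na(stratum) & !is.na(ovt)) and compared for d6, d7 and d7c.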








Re: [R] tapply error svyby function survey package

2014-11-12 Thread Anthony Damico
hi martin, sending the first 25 rows does not help if it does not re-create
the problem.  when i run the data you have provided, i do not encounter
your problem (see below).  someone else may be able to guess the issue, but
this would be a lot easier to solve if you can create a minimal
reproducible example:

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
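
One way to approach this, sketched with made-up data (the `toy` data frame and the `keep`, `small`, `txt` and `rebuilt` names are hypothetical): keep only the columns the failing call touches, rerun the failing call on that subset to confirm the error is still there, and check that the dput() text rebuilds the same object before posting it.

```r
# Toy stand-in for the full data set; all values are invented for illustration.
toy <- data.frame(
  id      = 1:6,
  stratum = factor(c("A", "A", "B", NA, "B", "B")),
  expfc   = c(22.8, 22.8, 17.1, NA, 17.1, 17.1),
  d7      = factor(c("Soltero", "Casado", "Soltero", NA, "Viudo", "Soltero")),
  ovt     = c(80, NA, 75, NA, 60, 90),
  extra   = letters[1:6]
)

# Keep only the variables used by svydesign() and svyby(),
# and only the rows that enter the design.
keep  <- c("stratum", "expfc", "d7", "ovt")
small <- toy[!is.na(toy$stratum), keep]

# (Rerun the failing svyby() call on a design built from `small` here,
#  and only post the subset if the error still appears.)

# Round-trip through the dput() text to confirm it rebuilds the same object.
txt     <- capture.output(dput(small))
rebuilt <- eval(parse(text = txt))
same    <- isTRUE(all.equal(small, rebuilt))
print(same)

# This is the text to paste into the message.
cat(txt, sep = "\n")
```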


sii.tesis <-
structure(list(id = c(51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L,
59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L,
73L, 74L, 75L, 76L), stratum = structure(c(1L, 4L, NA, 4L, 4L,
1L, 6L, NA, 4L, 4L, 1L, 1L, 1L, 6L, 6L, 3L, 3L, 6L, NA, 1L, 1L,
6L, 4L, 3L, 6L), .Label = c("MEst", "MAcad", "MAdm", "FEst",
"FAcad", "FAdm"), class = "factor"), expfc = c(22.8195266723633,
17.0644626617432, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
5.1702127456665, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
22.8195266723633, 22.8195266723633, 5.1702127456665, 5.1702127456665,
6.24137926101685, 6.24137926101685, 5.1702127456665, NA, 22.8195266723633,
22.8195266723633, 5.1702127456665, 17.0644626617432, 6.24137926101685,
5.1702127456665), d7 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L,
NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, NA, 1L, 1L, 6L, 1L,
6L, 6L), .Label = c("Soltero", "Casado", "Separado", "Divorciado",
"Viudo", "Union libre"), class = "factor"), ovt = c(NA, 93.3823547363281,
NA, NA, NA, NA, 83.8235321044922, NA, NA, NA, NA, NA, NA, NA,
79.4117660522461, NA, NA, 19.1176471710205, NA, NA, NA, 85.2941207885742,
NA, NA, NA)), .Names = c("id", "stratum", "expfc", "d7", "ovt"
), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9",
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
"21", "22", "23", "24", "25"), class = "data.frame")

library(survey)

sii.design <- svydesign(
  id = ~1,
  strata = ~stratum,
  weights = ~expfc,
  data = subset(sii.tesis, !is.na(stratum)))

svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)


# works fine---
 svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)
                     d7      ovt       se
Soltero         Soltero 88.94329 3.333485
Casado           Casado 19.11765     0.00
Union libre Union libre 85.29412     0.00







Re: [R] tapply error svyby function survey package

2014-11-12 Thread Martin Canon
Anthony, thanks for your reply.

Resetting the levels didn't work.

These are the first 25 rows of the dataset:

structure(list(id = c(51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L,
59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L,
73L, 74L, 75L, 76L), stratum = structure(c(1L, 4L, NA, 4L, 4L,
1L, 6L, NA, 4L, 4L, 1L, 1L, 1L, 6L, 6L, 3L, 3L, 6L, NA, 1L, 1L,
6L, 4L, 3L, 6L), .Label = c("MEst", "MAcad", "MAdm", "FEst",
"FAcad", "FAdm"), class = "factor"), expfc = c(22.8195266723633,
17.0644626617432, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
5.1702127456665, NA, 17.0644626617432, 17.0644626617432, 22.8195266723633,
22.8195266723633, 22.8195266723633, 5.1702127456665, 5.1702127456665,
6.24137926101685, 6.24137926101685, 5.1702127456665, NA, 22.8195266723633,
22.8195266723633, 5.1702127456665, 17.0644626617432, 6.24137926101685,
5.1702127456665), d7 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L,
NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, NA, 1L, 1L, 6L, 1L,
6L, 6L), .Label = c("Soltero", "Casado", "Separado", "Divorciado",
"Viudo", "Union libre"), class = "factor"), ovt = c(NA, 93.3823547363281,
NA, NA, NA, NA, 83.8235321044922, NA, NA, NA, NA, NA, NA, NA,
79.4117660522461, NA, NA, 19.1176471710205, NA, NA, NA, 85.2941207885742,
NA, NA, NA)), .Names = c("id", "stratum", "expfc", "d7", "ovt"
), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9",
"10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20",
"21", "22", "23", "24", "25"), class = "data.frame")

Regards.

Martin
