[R] tapply error svyby function survey package
Hi.

I'm trying to calculate the weighted mean score of a quality of life measure (ovt) in patients with irritable bowel syndrome by their marital status (d7).

This is a summary of the structure of the dataset:

    str(sii.tesis)
    'data.frame':   1063 obs. of  75 variables:
     $ id     : int  51 52 53 54 55 56 57 58 59 60 ...
     $ stratum: Factor w/ 6 levels "MEst","MAcad",..: 1 4 NA 4 4 1 6 NA 4 4 ...
     $ expfc  : num  22.8 17.1 NA 17.1 17.1 ...
     $ d6     : Factor w/ 3 levels "Estudiante","Profesor",..: 1 1 NA 1 1 1 3 NA 1 1 ...
     $ d7     : Factor w/ 6 levels "Soltero","Casado",..: 1 1 NA 1 1 1 1 NA 1 1 ...
     $ d7c    : Factor w/ 2 levels "No estable","Estable": 1 1 NA 1 1 1 1 NA 1 1 ...
     $ s1cm   : Factor w/ 2 levels "No","Si": 1 2 NA 1 1 1 2 NA 1 1 ...
     $ ovt    : num  NA 93.4 NA NA NA ...

I declared the sampling design:

    sii.design <- svydesign(
        id = ~1,
        strata = ~stratum,
        weights = ~expfc,
        data = subset(sii.tesis, !is.na(stratum)))

Then I tried to get the result:

    svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)

but I get the error:

    Error in tapply(1:NROW(x), list(factor(strata)), function(index) { :
      arguments must have same length

The length of both variables is the same. Wherever ovt has a value, d7 has a matching value in the data frame.

I tried the same thing using another grouping variable, role (d6), and it works:

    svyby(~ovt, ~d6, sii.design, svymean, na.rm = TRUE, level = 0.95)
                               d6      ovt       se
    Estudiante         Estudiante 71.01805 1.370569
    Profesor             Profesor 72.30923 6.518378
    Administrativo Administrativo 75.69102 3.715050

If I use the recategorized d7 variable (d7c, two levels only) it works too:

    svyby(~ovt, ~d7c, sii.design, svymean, na.rm = TRUE, level = 0.95)
                      d7c      ovt      se
    No estable No estable 70.92344 1.37460
    Estable       Estable 74.53719 4.16954

What could be the problem?

Regards.
Martin Canon
Colombia, South America

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
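Since the failing call differs from the working ones only in the grouping variable, one thing worth looking at is how many usable observations each d7 level has within each stratum. The sketch below is base R only and runs on a small made-up data frame (`toy` is hypothetical, not the poster's data); it only shows how to spot empty or very sparse cells and does not establish the cause of the error.

```r
# Hypothetical toy data standing in for sii.tesis (not the real dataset).
toy <- data.frame(
  stratum = factor(c("MEst", "MEst", "FEst", "FEst", "FAdm", NA)),
  d7      = factor(c("Soltero", "Casado", "Soltero", "Soltero", "Union libre", NA),
                   levels = c("Soltero", "Casado", "Separado", "Union libre")),
  ovt     = c(90, NA, 80, 75, 85, NA)
)

# Keep the rows that would enter the design and have a non-missing outcome.
used <- subset(toy, !is.na(stratum) & !is.na(ovt))

# Observations per d7 level (rows) within each stratum (columns).
cell_counts <- table(used$d7, used$stratum)
print(cell_counts)

# Levels of d7 that have no usable observation at all.
empty_levels <- names(which(rowSums(cell_counts) == 0))
print(empty_levels)
```

On the real data the same two steps would use `sii.tesis` in place of `toy`. Levels that show up in `empty_levels`, or cells holding a single observation, would be the first places to look.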
Re: [R] tapply error svyby function survey package
try resetting your levels? if that doesn't work, please dput() an example data set that we can test with :) thanks!

    sii.design <- update( sii.design , d6 = factor( d6 ) )

On Wed, Nov 12, 2014 at 7:59 AM, Martin Canon martin.ca...@gmail.com wrote:

> Hi. I'm trying to calculate the weighted mean score of a quality of life
> measure (ovt) in patients with irritable bowel syndrome by their marital
> status (d7). [...]
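For readers unfamiliar with what "resetting the levels" does: calling factor() on an existing factor rebuilds it with only the levels that actually occur. The sketch below shows this in base R on a hypothetical vector (`d7` here is made up, with the six level names taken from the post).

```r
# Hypothetical factor with six declared levels but only three observed.
d7 <- factor(c("Soltero", "Soltero", "Casado", "Union libre"),
             levels = c("Soltero", "Casado", "Separado",
                        "Divorciado", "Viudo", "Union libre"))

# Number of declared levels, including the unused ones.
nlevels(d7)

# Re-creating the factor drops the unused levels and keeps the original order.
d7_reset <- factor(d7)
nlevels(d7_reset)
levels(d7_reset)

# droplevels() gives the same result for a single factor.
identical(levels(droplevels(d7)), levels(d7_reset))
```

The update() call in the reply applies the same idea to a variable stored inside the survey design object. It names d6, whereas the failing call groups by d7, so the equivalent call with d7 is presumably the one to try.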
Re: [R] tapply error svyby function survey package
hi martin, sending the first 25 rows does not help if it does not re-create the problem. when i run the data you have provided, i do not encounter your problem (see below). someone else may be able to guess the issue, but this would be a lot easier to solve if you can create a minimal reproducible example

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

    sii.tesis <- structure(list(
      id = c(51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L,
             64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 73L, 74L, 75L, 76L),
      stratum = structure(c(1L, 4L, NA, 4L, 4L, 1L, 6L, NA, 4L, 4L, 1L, 1L, 1L,
                            6L, 6L, 3L, 3L, 6L, NA, 1L, 1L, 6L, 4L, 3L, 6L),
        .Label = c("MEst", "MAcad", "MAdm", "FEst", "FAcad", "FAdm"),
        class = "factor"),
      expfc = c(22.8195266723633, 17.0644626617432, NA, 17.0644626617432,
                17.0644626617432, 22.8195266723633, 5.1702127456665, NA,
                17.0644626617432, 17.0644626617432, 22.8195266723633,
                22.8195266723633, 22.8195266723633, 5.1702127456665,
                5.1702127456665, 6.24137926101685, 6.24137926101685,
                5.1702127456665, NA, 22.8195266723633, 22.8195266723633,
                5.1702127456665, 17.0644626617432, 6.24137926101685,
                5.1702127456665),
      d7 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L, 1L, 1L,
                       1L, 1L, 1L, 2L, NA, 1L, 1L, 6L, 1L, 6L, 6L),
        .Label = c("Soltero", "Casado", "Separado", "Divorciado", "Viudo",
                   "Union libre"),
        class = "factor"),
      ovt = c(NA, 93.3823547363281, NA, NA, NA, NA, 83.8235321044922, NA, NA,
              NA, NA, NA, NA, NA, 79.4117660522461, NA, NA, 19.1176471710205,
              NA, NA, NA, 85.2941207885742, NA, NA, NA)),
      .Names = c("id", "stratum", "expfc", "d7", "ovt"),
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
                    "12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
                    "22", "23", "24", "25"),
      class = "data.frame")

    sii.design <- svydesign(
        id = ~1,
        strata = ~stratum,
        weights = ~expfc,
        data = subset(sii.tesis, !is.na(stratum)))

    svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)

    # works fine

    svyby(~ovt, ~d7, sii.design, svymean, na.rm = TRUE, level = 0.95)
                         d7      ovt       se
    Soltero         Soltero 88.94329 3.333485
    Casado           Casado 19.11765 0.00
    Union libre Union libre 85.29412 0.00

On Wed, Nov 12, 2014 at 5:25 PM, Martin Canon martin.ca...@gmail.com wrote:

> Anthony, thanks for your reply. Resetting the levels didn't work.
> These are the first 25 rows of the dataset: [...]
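When the first rows of a dataset do not trigger an error, one way to get to a minimal reproducible example is to keep only the columns the failing call uses and then drop rows for as long as the problem persists. The sketch below is base R on made-up data: `full` and `still_fails()` are hypothetical stand-ins, and the check simply pretends the error appears whenever a "Viudo" row is present. With the real data, `still_fails()` would instead rebuild the design and wrap the svyby() call in try().

```r
# Hypothetical stand-in for the full data frame.
full <- data.frame(
  id        = 1:8,
  stratum   = factor(c("MEst", "FEst", "FEst", "MEst",
                       "FAdm", "FAdm", "MEst", "FEst")),
  d7        = factor(c("Soltero", "Soltero", "Casado", "Soltero",
                       "Viudo", "Soltero", "Casado", "Soltero")),
  ovt       = c(90, NA, 70, 85, 60, NA, 75, 80),
  unrelated = letters[1:8]
)

# Hypothetical check: TRUE when the call of interest would fail.
# Here it is faked; in practice something like
#   inherits(try(<the real call>, silent = TRUE), "try-error")
still_fails <- function(df) any(df$d7 == "Viudo")

# Keep only the columns the failing call uses.
small <- full[, c("id", "stratum", "d7", "ovt")]

# Walk the rows from last to first, dropping each one only if the
# problem is still present without it.
for (i in rev(seq_len(nrow(small)))) {
  candidate <- small[-i, , drop = FALSE]
  if (nrow(candidate) > 0 && still_fails(candidate)) small <- candidate
}

# Print a copy-pasteable version of the reduced data for posting.
dput(droplevels(small))
```

The loop goes backwards so that removing a row never shifts the position of a row still to be examined. Posting the dput() output of the reduced data, together with the svydesign() and svyby() calls, would let others reproduce the error directly.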
Re: [R] tapply error svyby function survey package
Anthony, thanks for your reply.

Resetting the levels didn't work.

These are the first 25 rows of the dataset:

    structure(list(
      id = c(51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L,
             64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 73L, 74L, 75L, 76L),
      stratum = structure(c(1L, 4L, NA, 4L, 4L, 1L, 6L, NA, 4L, 4L, 1L, 1L, 1L,
                            6L, 6L, 3L, 3L, 6L, NA, 1L, 1L, 6L, 4L, 3L, 6L),
        .Label = c("MEst", "MAcad", "MAdm", "FEst", "FAcad", "FAdm"),
        class = "factor"),
      expfc = c(22.8195266723633, 17.0644626617432, NA, 17.0644626617432,
                17.0644626617432, 22.8195266723633, 5.1702127456665, NA,
                17.0644626617432, 17.0644626617432, 22.8195266723633,
                22.8195266723633, 22.8195266723633, 5.1702127456665,
                5.1702127456665, 6.24137926101685, 6.24137926101685,
                5.1702127456665, NA, 22.8195266723633, 22.8195266723633,
                5.1702127456665, 17.0644626617432, 6.24137926101685,
                5.1702127456665),
      d7 = structure(c(1L, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L, 1L, 1L,
                       1L, 1L, 1L, 2L, NA, 1L, 1L, 6L, 1L, 6L, 6L),
        .Label = c("Soltero", "Casado", "Separado", "Divorciado", "Viudo",
                   "Union libre"),
        class = "factor"),
      ovt = c(NA, 93.3823547363281, NA, NA, NA, NA, 83.8235321044922, NA, NA,
              NA, NA, NA, NA, NA, 79.4117660522461, NA, NA, 19.1176471710205,
              NA, NA, NA, 85.2941207885742, NA, NA, NA)),
      .Names = c("id", "stratum", "expfc", "d7", "ovt"),
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
                    "12", "13", "14", "15", "16", "17", "18", "19", "20", "21",
                    "22", "23", "24", "25"),
      class = "data.frame")

Regards.

Martin

On Wed, Nov 12, 2014 at 1:39 PM, Anthony Damico ajdam...@gmail.com wrote:

> try resetting your levels? if that doesn't work, please dput() an example
> data set that we can test with :) thanks! [...]