[R] confidence intervals of proportions from complex surveys
This is partly an R and partly a general statistics question. I'm trying to get confidence intervals of proportions (sometimes for subgroups) estimated from complex survey data. Because a function like prop.test() does not exist for the survey package I tried the following: 1) Define a survey object (PSU of clustered sample, population weights); 2) Use svyglm() of the package survey to estimate a binary logistic regression (family='binomial'): For the confidence interval of a single proportion regress the binary dependent variable on a constant (1), for confidence intervals of that variable for subgroups regress this variable on the groups (factor) variable; 3) Use predict() to obtain estimated logits and the respective standard errors (mod.dat specifiying either the constant or the subgroups): pred=predict(model,mod.dat,type='link',se.fit=T) and apply the following to obtain the proportion with its confidence intervals (for example, for conf.level=.95): lo.e = pred[1:length(pred)]-qnorm((1+conf.level)/2)*SE(pred) hi.e = pred[1:length(pred)]+qnorm((1+conf.level)/2)*SE(pred) prop = 1/(1+exp(-pred[1:length(pred)])) lo = 1/(1+exp(-lo.e)) hi = 1/(1+exp(-hi.e)) I think that in that way I get CI's based on asymptotic normality - either for a single proportion or split up into subgroups. Question: Is this a correct or a defensible procedure? Or should I use a different approach? Note that this approach should also allow to estimate CI's for proportions of subgroups taking into account the complex survey design. TIA, Dirk R version 2.5.1 Patched (2007-08-10 r42469) i386-pc-mingw32 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to reply to a thread if receiving R-help mails in digest form
I receive R-help messages in digest form that makes it difficult to answer a post. I noticed that my answer is not added to the thread (instead, a new thread is started) although I use the same subject line (starting with Re: ) as the original post. Is there a solution (I prefer the digest to separate messages for several reasons and don't want to change my email reader)? The way I answer post up to now is: 1) I press the reply button of my email program (Mozilla / Thunderbird, Windows) 2) I delete all contents of the digest except for the post (including name and mail address of the posting person) I want to answer so that the original question will be included (cited) in my answer. 3) I add the email address to the individual sender to cc: to the automatically generated address of the R-help list. 4) I replace the automatically generated subject line (for example Re: R-help Digest, Vol 42, Issue 13 by Re: followed by a copy of the original subject line of the post. 5) I write my answer and send the mail to the mailing list. It's not that this is tedious - the problem is that the thread is broken. Is there a better way even if I want to keep receiving messages in digest form? The posting guide is silent about this. Dirk * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-(0)40-42838.7498 (office) +49-(0)40-42838.4591 (Billon) fax: +49-(0)40-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Geometrical Interpretation of Eigen value and Eigen vector
Arun, have a look at: http://149.170.199.144/multivar/eigen.htm HTH, Dirk Arun Kumar Saha [EMAIL PROTECTED] wrote: It is not a R related problem rather than statistical/mathematical. However I am posting this query hoping that anyone can help me on this matter. My problem is to get the Geometrical Interpretation of Eigen value and Eigen vector of any square matrix. Can anyone give me a light on it? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NLME: Problem with plotting ranef vs a factor
Greg, be careful using attach() and detach(). From the syntax snippets you showed it seems that you did create an object pcat (factor variable), but you did not change the respective variable in your data frame. Try to remove pcat and see what happens do the results of lme()! Dirk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] standardized residuals (random effects) using nlme and ranef
www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Dirk Enzmann wrote: Using ranef() (package nlme, version 3.1-75) with an 'lme' object I can obtain random effects for intercept and slope of a certain level (say: 1) - this corresponds to (say level 1) residuals in MLWin. Maybe I'm mistaken here, but the results are identical. However, if I try to get the standardized random effects adding the paramter standard=T to the specification of ranef(), the results differ considerably from the results of MLWin (although MLWin defines standardized in the same way as divided by its estimated (diagnostic) standard error). Why do the results differ although the estimates (random effects and thus their variances) are almost identical? I noticed that lme() does not compute the standard errors of the variances of the random effects - for several reasons, but if this is true, how does ranef() calculate the standardized random effects (the help says: 'standardized (i.e. divided by the corresponding estimated standard error)'). Is there a way to obtain similar results as in MLWin (or: should I prefer the results of ranef() for certain reasons)? Dirk - R version: 2.3.1 Patched (2006-06-21 r38367) -- * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-(0)40-42838.7498 (office) +49-(0)40-42838.4591 (Billon) fax: +49-(0)40-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] conflict of package Matrix and summary of lme object
After loading the package Matrix (version 0.995-12), using the summary function with an lme (package nlme version 3.1-75) object results in an error message saying Fehler in dim(x) : kein Slot des Namens Dim für dieses Objekt der Klasse correlation (translated: 'Error in dim(x) : no slot of the name Dim for this object of class correlation)'. Without loading Matrix this error message does not occur. Any ideas? -- R version: 2.3.1 Patched (2006-06-21 r38367) Operating system: Windows XP (5.1 (Build 2600)) CPU: Pentium Model 2 Stepping 9 * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-(0)40-42838.7498 (office) +49-(0)40-42838.4591 (Billon) fax: +49-(0)40-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] standardized random effects with ranef.lme()
Using ranef() (package nlme, version 3.1-75) with an 'lme' object I can obtain random effects for intercept and slope of a certain level (say: 1) - this corresponds to (say level 1) residuals in MLWin. Maybe I'm mistaken here, but the results are identical. However, if I try to get the standardized random effects adding the paramter standard=T to the specification of ranef(), the results differ considerably from the results of MLWin (although MLWin defines standardized in the same way as divided by its estimated (diagnostic) standard error). Why do the results differ although the estimates (random effects and thus their variances) are almost identical? I noticed that lme() does not compute the standard errors of the variances of the random effects - for several reasons, but if this is true, how does ranef() calculate the standardized random effects (the help says: 'standardized (i.e. divided by the corresponding estimated standard error)'). Is there a way to obtain similar results as in MLWin (or: should I prefer the results of ranef() for certain reasons)? Dirk - R version: 2.3.1 Patched (2006-06-21 r38367) * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-(0)40-42838.7498 (office) +49-(0)40-42838.4591 (Billon) fax: +49-(0)40-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with read.spss() and as.data.frame(), or: alternative to subset()?
The selection problem can be solved by dr2000=read.spss('myfile') d=lapply(dr2000,subset,dr2000$RBINZ99 0) however, there is still the problem that R crashes when using d = as.data.frame(dr2000) or dr2000=read.spss('myfile',to.data.frame=T) Any suggestions why? I checked whether all components of dr2000 are of the same length and the sort of object of each component. This is not the problem: Each component has the same length (9232) and there are 66 components of the class 'character', 981 of the class 'factor', and 479 of the class 'numeric'. Trying to select a subset of cases (rows of data) I encountered several problems: Firstly, because I did not read the help to read.spss() thoroughly enough, I treated the data read as a data frame. For example, dr2000 - read.spss('myfile.sav') d - subset(dr2000,RBINZ99 0) and thus received an error message (Object RBINZ99 not found), because dr2000 is not a data.frame but a list (shown by class(dr2000)). d - subset(dr2000,dr2000$RBINZ99 0) didn' help either, because now d is empty (dim = NULL). Thus, I tried to use the option to.data.frame=T of read.spss(): dr2000 - read.spss('myfile.sav',to.data.frame=T) However, now R crashes ('R for Windows GUI front-end has found an error and must be closed') (the error message is in German). Finally, I tried again using read.spss() without the option 'to.data.frame=T' (as before) and tried to convert dr2000 to a data frame by using d - as.data.frame(dr2000) However, R crashes again (with the same error message). Of course, I could use SPSS first and save only the cases with RBINZ99 0, but this is not always possible (all users of the data must have SPSS available and we have to use different selection criteria). Is there another possibility to solve the problem by using R? I want to select certain rows (cases) based on the values of one variable of dr2000, but keep all columns (variables) - although dr2000 is not a data frame? And: R should not crash but rather give a warning. R version 2.1.1 Patched (2005-07-15) Package Foreign Version 0.8-10 Operating system: Windows XP Professional (5.1 (Build 2600)) CPU: Pentium Model 2 Stepping 9 RAM: 512 MB * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problem with read.spss() and as.data.frame(), or: alternative to subset()?
Trying to select a subset of cases (rows of data) I encountered several problems: Firstly, because I did not read the help to read.spss() thoroughly enough, I treated the data read as a data frame. For example, dr2000 - read.spss('myfile.sav') d - subset(dr2000,RBINZ99 0) and thus received an error message (Object RBINZ99 not found), because dr2000 is not a data.frame but a list (shown by class(dr2000)). d - subset(dr2000,dr2000$RBINZ99) didn' help either, because now d is empty (dim = NULL). Thus, I tried to use the option to.data.frame=T of read.spss(): dr2000 - read.spss('myfile.sav',to.data.frame=T) However, now R crashes ('R for Windows GUI front-end has found an error and must be closed') (the error message is in German). Finally, I tried again using read.spss() without the option 'to.data.frame=T' (as before) and tried to convert dr2000 to a data frame by using d - as.data.frame(dr2000) However, R crashes again (with the same error message). Of course, I could use SPSS first and save only the cases with RBINZ99 0, but this is not always possible (all users of the data must have SPSS available and we have to use different selection criteria). Is there another possibility to solve the problem by using R? I want to select certain rows (cases) based on the values of one variable of dr2000, but keep all columns (variables) - although dr2000 is not a data frame? And: R should not crash but rather give a warning. R version 2.1.1 Patched (2005-07-15) Package Foreign Version 0.8-10 Operating system: Windows XP Professional (5.1 (Build 2600)) CPU: Pentium Model 2 Stepping 9 RAM: 512 MB * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problem with read.spss() and as.data.frame(), or: alternative to subset()? (2)
In my previous mail there is a typo. Instead of d - subset(dr2000,dr2000$RBINZ99) I used d - subset(dr2000,dr2000$RBINZ99 0) * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] using argument names (of indeterminate number) within a function
Although I tried to find an answer in the manuals and archives, I cannot solve this (please excuse that my English and/or R programming skills are not good enough to state my problem more clearly): I want to write a function with an indeterminate (not pre-defined) number of arguments and think that I should use the ... construct and the match.call() function. The goal is to write a function that (among other things) uses cbind() to combine a not pre-defined number of vectors specified in the function call. For example, if my vectors are x1, x2, x3, ... xn, within the function I want to use cbind(x1, x2) or cbind(x1, x3, x5) or ... depending on the vector names I use in the funcion call. Additionally, the function has other arguments. In the archives I found the following thread (followed by Marc Schwartz) http://finzi.psych.upenn.edu/R/Rhelp02a/archive/15186.html [R] returning argument names from Peter Dalgaard BSA on 2003-04-10 (stdin) that seems to contain the solution to my problem, but I am stuck because sapply(match.call()[-1], deparse) gives me a vector of strings and I don't know how to use the names in this vector in the cbind() function. Up to now my (clearly deficit) function looks like: test - function(..., mvalid=1) { args = sapply(match.call()[-1], deparse) # and here, I don't know how the vector names in args # can be used in the cbind() function to follow: # # temp - cbind( ??? if (mvalid 1) { # here it goes on } } Ultimately, I want that the function can be called like test(x1,x2,mvalid=1) or test(x1,x3,x5,mavlid=2) and that within the function cbind(x1,x2) or cbind(x1,x3,x5) will be used. Can someone give and explain an example / a solution on how to proceed? * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] skewness and kurtosis in e1071 correct?
To me your answer is opaque (but that seems to be rather a problem of language ;-) ). Perhaps my question has not been expressed clearly enough. Let me state it differently: In the R package e1071 the formulas (implicit) used are (3) and (4) (see below), the standard deviation used in these formulas, however is based on (2) (see below). This seems to be inconsistent and my question is, whether there is a commonly used third definition of skewness and kurtosis in which the formulas for the biased skewness and kurtosis _but_ with the unbiased standard deviation are employed. The standard deviation can be defined as the _sample_ statistic: sd = 1/n * sum( (x - mean(x))^2 ) # (1) and as the estimated population parameter: sd = 1/(n-1) * sum( (x-mean(x))^2 ) # (2). In R the function sd() calculates the latter. In the same way, expressed via z-values skewness and kurtosis can be defined as the _sample_ statistic (also called biased estimator , see: http://www.mathdaily.com/lessons/Skewness ): skewness = mean(z^3) # (3) kurtosis = mean(z^4)-3 # (4) with z = (x - mean(x))/sd(x) with sd = 1/n * sum( (x - mean(x)^2 ) (thus: here sd is the _sample_ statistic, see (1) above!) but they can also be defined as the estimated population parameters (also called unbiased, see: http://www.mathdaily.com/lessons/Kurtosis#Sample_kurtosis ): skewness = n/((n-1)*(n-2)) * sum(z^3) # (5) kurtosis = n*(n+1)/((n-1)*(n-2)*(n-3)) * sum(z^4) - 3*(n-1)^2/((n-2)*(n-3)) # (6) with z = (x - mean(x))/sd(x) with sd = 1/(n-1) * sum( (x - mean(x)^2 ) (thus: here sd is the estimated population parameter, see (2) above!. BTW: The R function scale() calculates the z-values based on this definition, as well.) Campbell wrote: This is probably an issue over definitions rather than the correct answer. To me skewness and kurtosis are functions of the distribution rather than the population, they are equivalent to expectation rather than mean. For the normal distribution it makes no sense to estimate them as the distribution is uniquely defined by its first two moments. However there are two defnitions of kurotsis as it is often standardized such that the expectation is 0. * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] skewness and kurtosis in e1071 correct? (correction)
I'm sorry, but my previous message, as often happens, some brackets were wrong: Here are the correct formulas: sd = 1/n * sum((x-mean(x))^2) # (1) sd = 1/(n-1) * sum((x-mean(x))^2) # (2) This also occured in the last paragraph. Dirk * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] skewness and kurtosis in e1071 correct?
My last question to the R list is another example of asking too fast before reading (sorry). But by the way and because it might be interesting for others, an answer can be found in: Joanes, D. N. Gill, C. A. (1998) Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society (Series D): The Statistician 47 (1), 183189. The measures discussed in this article are: g1 : skewness as defined in formula (3) of my mail (and in STATA) G1 : skewness as defined in formula (5) of my mail (and in SAS, SPSS, and EXCEL) b1 : skewness as defined in the package e1071 (and in MINITAB and BMDP) g2 : kurtosis as defined in formula (4) of my mail (and in STATA) G2 : kurtosis as defined in formula (6) of my mail (and in SAS, SPSS, and EXCEL) b2 : kurtosis as defined in the package e1071 (and in MINITAB and BMDP) Off topic: I don't know why the thread of these mails are not included in the correct thread. Dirk -- * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] skewness and kurtosis in e1071 correct?
I wonder whether the functions for skewness and kurtosis in the e1071 package are based on correct formulas. The functions in the package e1071 are: # skewness - function (x, na.rm = FALSE) { if (na.rm) x - x[!is.na(x)] sum((x - mean(x))^3)/(length(x) * sd(x)^3) } # and # kurtosis - function (x, na.rm = FALSE) { if (na.rm) x - x[!is.na(x)] sum((x - mean(x))^4)/(length(x) * var(x)^2) - 3 } # However, sd() and var() are the estimated population parameters. To calculate the sample statistics of skewness and kurtosis, shouldn't one use the sample statistics of the standard deviation (and variance), as well? For example: # # Function to calculate the sample statistic of skewness: skew_s=function(x) { x = x[!is.na(x)] n = length(x) if (n 3) { cat('valid cases = ',n,'\nskewness is not defined for less than 3 valid cases!\n') } else { z = sqrt(n/(n-1))*scale(x) mean(z^3) } } # and # # Function to calculate the sample statistic of kurtosis: kurt_s=function(x) { x = x[!is.na(x)] n = length(x) if (n 4) { cat('valid cases = ',n,'\nkurtosis is not defined for less than 4 valid cases!\n') } else { z = sqrt(n/(n-1))*scale(x) mean(z^4)-3 } } # Whereas, to calculate the (unbiased) estimated population parameter of skewness and kurtosis, the correction should also include the number of cases in the following way: # # Function to calculate the unbiased populataion estimate of skewness: skew=function(x) { x = x[!is.na(x)] n = length(x) if (n 3) { cat('valid cases = ',n,'\nskewness is not defined for less than 3 valid cases!\n') } else { z = scale(x) sum(z^3)*n/((n-1)*(n-2)) } } # and # # Function to calculate the unbiased population estimate of kurtosis: kurt=function(x) { x = x[!is.na(x)] n = length(x) if (n 4) { cat('valid cases = ',n,'\nkurtosis is not defined for less than 4 valid cases!\n') } else { z = scale(x) sum(z^4)*n*(n+1)/((n-1)*(n-2)*(n-3))-3*(n-1)^2/((n-2)*(n-3)) } } # Thus, it seems that the formulas used in the e1071 package are neither formulas for the sample statistics nor for the (unbiased) estimates of the population parameters. Is there another definition of kurtosis and skewness that I am not aware of? Dirk -- * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Source R code via menu File
If I use the first menu item Source R code of the menu File to call source() by searching the *.r files in my directory interactively, a window opens showing the active directory, but no files to choose (even if there are files with the extension .r). If I start entering the the name of a file that I know to be in the active directory, by entering a firt letter of the file name a list of files beginning with this letter shows up - but not a list of all files with the extension .r. This happens if I use R version 2.1.0 patched (2005-04-18) running under Windows XP (German), the menu item is in German Lese R Code ein. In previous versions of R (e.g. version 2.0.1 patched (2004-11-17)) this never happend, i.e. in older versions I could choose a file from a list of all files with the extension .r in the current directory. What has changed? Is this specific to my configuration (of R or of Windows) or is it a bug? R: version 2.1.0 patched (2005-04-18) Operating system: Windows XP (German) * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Edmund-Siemers-Allee 1 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] par(new=TRUE) vs par(new=FALSE)
Just out of curiosity (and I hope that my question does not confuse those who intuitively understood the par(new=TRUE) statement right in the beginning): I would like to know whether beginners (learning R) had the same difficulty as I had: Intuitively I thought that par(new=TRUE) would draw a new plot (and not into an already existing plot) and that par(new=FALSE) would not draw a new plot but draw into an already existing plot - opposed to what the par(new=TRUE) statement actually does. (It would be interesting to receive an answer also from those who meanwhile reached an expert status in working with the R syntax). Perhaps my confusion is due to the fact that I am a trained right-handed that really is a left-handed (those people tend to mistake contrasts). If there are a lot of people having this difficulty, I would like to know why the decision was made to define the mode of action of the par(new=TRUE) statement as it is now. Perhaps that helps me to understand better the way programmers tend to think. * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Schlueterstr. 28 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Confidence intervals for rates (dependent events)
I need advice or opinions for the following problem: In a sample a part of the respondents has experienced victimizing events. Only a part of these events have been reported to the police. The rate of events reported to the police is number of experienced events / number of reported events. Because the events cannot be assumed to be independent (some victims have a high rate of vicitimization because they are more prone to become a victim) the construction of a confidence interval for the rate of events reported becomes difficult (to me). Question 1: If nevertheless a use a binomial test to construct a confidence interval (say: 0 of 13 events reported, binom.test(0,13,0/13) CI = 0 to 24.7 %), is it correct that the width of this interval is a lower bound and thus a conservative estimate? Question 2: (a) My intuition tells me that multilevel modeling could be a solution to obtain a correct confidence interval by treating the events and the events reported as the first level and the victims as the second. Is this correct and how should I specify this model? (b) Alternatively, could I treat the events (and events reported?) as coming from a negative binomial distribution and use this for constructing a confidence interval? How can this be done technically, for example by using nb.glm? Thanks in advance, Dirk * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Schlueterstr. 28 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Percentages in contingency tables *warning trivial question*
Being still unsatisfied with the CrossTable() function I modified the code so that the function will create an output similar to the SPSS procedure CROSSTABS. Most probably the code will not meet most R programmers' standards, perhaps someone else is willing to optimize it. Unfortunately, as an R beginner I am not able to write a documentation file (perhaps someone is willing to put some effort in it, too)- the parameters that can be used can be found next to function. Including the function code here would cause nasty line breaks, you can find it at http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Software/crosstabs.r Dirk At Mon, 13 Dec 2004 05:47:17 -0500 Chuck Cleland [EMAIL PROTECTED] wrote: (snip) You might want to look at CrossTable() in the gmodels package of the gregmisc bundle. (snip) -- * Dr. Dirk Enzmann Institute of Criminal Sciences Dept. of Criminology Schlueterstr. 28 D-20146 Hamburg Germany phone: +49-040-42838.7498 (office) +49-040-42838.4591 (Billon) fax: +49-040-42838.2344 email: [EMAIL PROTECTED] www: http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Zero inflated count models
Can someone show me how to specify zero inflated poisson and zero inflated negative poisson models in R? I would like to replicate the example given in Long (1997: Regression Models for Categorical and Limited Dependent Variables) in Chapter 8.5 (pp. 242-247). TIA Dirk * Dr. Dirk Enzmann Criminological Research Institute of Lower Saxony Luetzerodestr. 9 D-30161 Hannover Germany phone: +49-511-348.36.32 fax: +49-511-348.36.10 email: [EMAIL PROTECTED] http://www.kfn.de __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Zero inflated count models (2)
In my previous mail, meant to say negative binomial regression (not negative poisson), of course. Dirk * Dr. Dirk Enzmann Criminological Research Institute of Lower Saxony Luetzerodestr. 9 D-30161 Hannover Germany phone: +49-511-348.36.32 fax: +49-511-348.36.10 email: [EMAIL PROTECTED] http://www.kfn.de __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Job Announcement
The Criminological Research Institute of Lower Saxony (KFN) in Germany (Hannover) seeks a quantitative methodologist (Psychologist/Sociologist) to work in a research project on the developmental consequences of incarceration of juvenile delinquents. For more detailed information see: http://www.kfn.de/KFN_180903.pdf Application deadline is October 12, 2003, but applications will be accepted until the position is filled. More information about the Institute is available at our home page http://www.kfn.de * Dr. Dirk Enzmann Criminological Research Institute of Lower Saxony Luetzerodestr. 9 D-30161 Hannover Germany phone: +49-511-348.36.32 fax: +49-511-348.36.10 email: [EMAIL PROTECTED] http://www.kfn.de __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] Update VR_7.1-6
The update of VR by downloading VR_7.1-6.zip and using install.packages (from local zip files) fails with the following error message: Error in file(file, r) : unable to open connection In addition: Warning message: cannot open file `VR/DESCRIPTION' Other packages can be installed without problems, except of dse_2003.4-1 with a similar error message. Why? Operating System: Windows NT (4.0) R Version: R 1.7.0 * Dr. Dirk Enzmann Criminological Research Institute of Lower Saxony Luetzerodestr. 9 D-30161 Hannover Germany phone: +49-511-348.36.32 fax: +49-511-348.36.10 email: [EMAIL PROTECTED] http://www.kfn.de __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help