Re: [R] memory problem
Zoltan Kmetty wrote: Hi! I had some memory problem with R - hope somebody could tell me a solution. I work with very large datasets, but R cannot allocate enough memoty to handle these datasets. I want work a matrix with row= 100 000 000 and column=10 A know this is 1 milliard cases, but i thought R could handle it (other commercial software like spss could do), but R wrote out everytime: not enough memory.. any good idea? Buy a machine that has at least 8Gb (better 16Gb) of RAM and proceed ... Uwe Ligges Thanks, Zoltan [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using VGAM's vglm function for ordinal logistic regression
On Sat, 6 Jan 2007, Inman, Brant A. M.D. wrote: R-Experts: I am using the vglm function of the VGAM library to perform proportional odds ordinal logistic regression. The issue that I would like help with concerns the format in which the response variable must be provided for this function to work correctly. pneumo2$severity [1] normal normal normal normal normal normal normal normal mild mild [11] mild mild mild mild mild mild severe severe severe severe [21] severe severe severe severe Levels: mild normal severe is different from the ordering in the first example. The difference between vglm and polr is that the latter uses the conventional sign for the coefficients: see MASS4 p.204. I would never use as.ordered on a character vector, as this leaves it to R to choose the ordering. (Even if you think you intended alphabetical order, that depends on the locale: see the warnings on the help page.) Consider the following example: -- library(VGAM) library(MASS) attach(pneumo) pneumo# Inspect the format of the original dataset # Restructure the pneumo dataset into a different format pneumo2 - data.frame(matrix(ncol=3, nrow=24)) colnames(pneumo2) - c('exposure.time', 'severity', 'freq') pneumo2[,1] - rep(pneumo[,1],3) pneumo2[,2] - as.ordered(c(rep('normal',8),rep('mild',8),rep('severe',8))) pneumo2[1:8,3] - pneumo[,2] pneumo2[9:16,3] - pneumo[,3] pneumo2[17:24,3] - pneumo[,4] pneumo2 # Inspect the format of the new modified dataset -- The problem occurs when I try to analyze these two datasets, which are identical in content, with the proportional odds model using vglm: -- # Analyze the original dataset with vglm, get one set of results vglm(vglm(cbind(normal, mild, severe) ~ log(exposure.time), data=pneumo, + family=cumulative(parallel=T)) Coefficients: (Intercept):1 (Intercept):2 log(exposure.time) 9.676093 10.581725 -2.596807 Degrees of Freedom: 16 Total; 13 Residual Residual Deviance: 5.026826 Log-likelihood: -204.2742 # Analyzing the modified dataset with vglm gives another set of results vglm(severity ~ log(exposure.time), weights=freq, data=pneumo2, + family=cumulative(parallel=T)) Coefficients: (Intercept):1 (Intercept):2 log(exposure.time) -1.6306499 2.5630170 -0.1937956 Degrees of Freedom: 48 Total; 45 Residual Residual Deviance: 503.7251 Log-likelihood: -251.8626 Warning messages: 1: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 2: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 3: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 4: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 5: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) # Analyzing the modified dataset with polr reproduces these second results polr(severity ~ log(exposure.time), weights=freq, data=pneumo2) Coefficients: log(exposure.time) 0.1938154 Intercepts: mild|normal normal|severe -1.630612 2.563055 Residual Deviance: 503.7251 AIC: 509.7251 -- Can someone explain what I am doing wrong when using vglm and polr with the modified dataset? I do not understand why these two formulations should give different results. Brant Inman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] export many plots to one file
Another thing to think about is about each individual scatter plot. Aren't you plotting too much duplicate (x,y) values? You could try to plot only unique values, or even try to filter out points that are very close. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] as.Date() results depend on order of data within vector?
Dear all, The as.Date() function appears to give different results depending on the order of the vector passed into it. d1 = c(1900-01-01, 2007-01-01,,2001-05-03) d2 = c(, 1900-01-01, 2007-01-01,2001-05-03) as.Date(d1) # gives correct results as.Date(d2) # fails with error (* see below) This problem does not arise if the dates are NA rather than an empty string, but my data is coming via RODBC and I still don't have NAs passed across properly. I might add that I initially noticed this behaviour when using RODBC's sqlQuery() function call, and I initially had difficulty explaining why one column of dates was passed correctly, but another failed. The failing column was a date of death column where it was NA () for most patients. I've come up with two workarounds that work. The first is to sort the data at the SQL level, ensuring the initial record is not null. The second is to use sqlQuery() with as.is=T option, and then do the sorting and conversion afterwards. Is the behaviour of as.Date() shown above as expected/designed? Many thanks, Mark (*) Error in fromchar(x) : character string is not in a standard unambiguous format sessionInfo(): R version 2.4.0 (2006-10-03) powerpc-apple-darwin8.7.0 locale: C/en_GB.UTF-8/C/C/C/C attached base packages: [1] methods stats graphics grDevices utils datasets base other attached packages: rcompletion RODBC 0.0-12 1.1-7 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] as.Date() results depend on order of data within vector?
On Sun, 2007-01-07 at 12:01 +, Mark Wardle wrote: Dear all, The as.Date() function appears to give different results depending on the order of the vector passed into it. d1 = c(1900-01-01, 2007-01-01,,2001-05-03) d2 = c(, 1900-01-01, 2007-01-01,2001-05-03) as.Date(d1) # gives correct results as.Date(d2) # fails with error (* see below) This problem does not arise if the dates are NA rather than an empty string, but my data is coming via RODBC and I still don't have NAs passed across properly. I might add that I initially noticed this behaviour when using RODBC's sqlQuery() function call, and I initially had difficulty explaining why one column of dates was passed correctly, but another failed. The failing column was a date of death column where it was NA () for most patients. I've come up with two workarounds that work. The first is to sort the data at the SQL level, ensuring the initial record is not null. The second is to use sqlQuery() with as.is=T option, and then do the sorting and conversion afterwards. Why not just tell R what the format the dates are in, using the format argument to as.Date? d1 = c(1900-01-01, 2007-01-01,,2001-05-03) d2 = c(, 1900-01-01, 2007-01-01,2001-05-03) as.Date(d1, %Y-%m-%d) [1] 1900-01-01 2007-01-01 NA 2001-05-03 as.Date(d2, %Y-%m-%d) [1] NA 1900-01-01 2007-01-01 2001-05-03 Is the behaviour of as.Date() shown above as expected/designed? I don't know about expected/designed, but I would have thought explicitly stating the date format would be the most fool-proof way of making sure R did what you wanted, and the easiest way to work around your problem. HTH G Many thanks, Mark (*) Error in fromchar(x) : character string is not in a standard unambiguous format sessionInfo(): R version 2.4.0 (2006-10-03) powerpc-apple-darwin8.7.0 locale: C/en_GB.UTF-8/C/C/C/C attached base packages: [1] methods stats graphics grDevices utils datasets base other attached packages: rcompletion RODBC 0.0-12 1.1-7 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Gavin Simpson [t] +44 (0)20 7679 0522 ECRC [f] +44 (0)20 7679 0565 UCL Department of Geography Pearson Building [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street London, UK[w] http://www.ucl.ac.uk/~ucfagls/ WC1E 6BT [w] http://www.freshwaters.org.uk/ %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] as.Date() results depend on order of data within vector?
On Sun, 7 Jan 2007, Mark Wardle wrote: Dear all, The as.Date() function appears to give different results depending on the order of the vector passed into it. d1 = c(1900-01-01, 2007-01-01,,2001-05-03) d2 = c(, 1900-01-01, 2007-01-01,2001-05-03) as.Date(d1) # gives correct results as.Date(d2) # fails with error (* see below) This problem does not arise if the dates are NA rather than an empty string, but my data is coming via RODBC and I still don't have NAs passed across properly. I might add that I initially noticed this behaviour when using RODBC's sqlQuery() function call, and I initially had difficulty explaining why one column of dates was passed correctly, but another failed. The failing column was a date of death column where it was NA () for most patients. I've come up with two workarounds that work. The first is to sort the data at the SQL level, ensuring the initial record is not null. The second is to use sqlQuery() with as.is=T option, and then do the sorting and conversion afterwards. Is the behaviour of as.Date() shown above as expected/designed? Yes. It uses the first non-NA string to choose the format *if you do not specify it*. The correct work-around is to get non-valid strings returned as NA, not . That is argument 'na.strings' in RODBC (and elsewhere: read.table behaves in the same way). -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] as.Date() results depend on order of data within vector?
Prof Brian Ripley wrote: The correct work-around is to get non-valid strings returned as NA, not . That is argument 'na.strings' in RODBC (and elsewhere: read.table behaves in the same way). Thanks for these replies. As I have mentioned before, my peculiar combination of PostgreSQL, Actual's ODBC driver on Mac OS X, and RODBC means that for numbers and dates, NULL values are passed to R as empty strings rather than NAs (2). This does not occur with PostgreSQL's text column type. For the benefit of others who in the future may use this combination(1), my workaround for numbers/integers/boolean values is to essentially have temporary intermediate tables with columns of type text whatever the format, and let R/RODBC parse the strings into the correct resulting format (which it then does faultlessly). This does not work for dates however, and so I must use one of the two workarounds I mentioned in my post. Best wishes, Mark (1) unlikely as it may be (2) I still cannot fathom why integers and dates are not handled correctly. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] different points and lines on the same plot
Dear all, I have following data called paitent daypatient1patient4patient5patient6 0 -0.27842688 -0.04080808 -0.41948398 -0.04508318 56 -0.22275425 -0.01767067 -0.30977249 -0.03168185 112 -0.08217659 -0.26209243 -0.29141451 -0.09876170 252 0.08044537 -0.26701769 0.05727087 -0.09663701 where each patient have response values at four time points. I want to plot each patient's values over time on the same plot where the value points are connected by line. That is, the graph will have four lines for the four patients. I tried the program below but couldn't make it work correctly. I'm new beginner and haven't yet learned how functions line and points work together. Hope you can help me out. Thanks for your help, Antonia par(mfrow=c(1,1)) plot(patient[,1],patient[,2], pch=1, type=l,col=1,cex=1,lwd=2, xlab=Days, ylab=Patient response,cex.main =1,font.main= 1, main=NULL) points(patient[,1],patient[,3],col=2,pch=1,cex=1) lines(patient[,1],patient[,3],col=2,lty=1,cex=1) points(patient[,1],patient[,4],col=3,pch=1,cex=1) lines(patient[,1],patient[,4],col=2,lty=2,cex=1) points(patient[,1],patient[,5],col=4,pch=1,cex=1) lines(patient[,1],patient[,5],col=2,lty=1,cex=1) points(patient[,1],patient[,6],col=5,pch=1,cex=1) lines(patient[,1],patient[,6],col=2,lty=1,cex=1) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] negative binomial family glm R and STATA
Dear Patrick below are some comments. For ML estimation of negative binomial glim, there is also the function negbin in the package aod (CRAN). This function uses optim(stats). Based on your data, we have just detected a small bug in negbin, when the Hessian matrix (that we use for computing the variances of the ML estimates) is singular, which seems to be the case in the model you proposed. We will soon fix this bug and update the package. At the end of my message, I've provided a corrected (and simplified) version of the function, negbin0, that you can source for reproducing the code below. Note that we don't estimate theta but phi = 1 / theta (with E[y] = mu and Var[y] = mu + phi * mu^2). #=== FIT OF YOUR MODEL # The data you provided mydata - zonesdb4 # Remove the unused level 0 of axe_routier mydata$axe_routier - factor(as.character(mydata$axe_routier), levels = c(1, 2)) # Your model negbin0( formula = nbcas ~ pop + Area + V_100kHab + gares + ports + axe_routier + lacs, random = ~ 1, control = list(maxit = 2000), data = mydata, ) $param (Intercept) pop AreaV_100kHab1gares1 ports1 axe_routier2 lacs1 6.008098e+00 1.015842e-05 -3.019320e-06 1.556476e+00 1.267495e+00 -4.549933e+00 -3.156201e+00 4.677113e+00 8.287353e+00 $H.singularity [1] TRUE $varparam [1] NA $logL [1] -418.5078 $logL.max [1] -167.6718 $dev [1] 501.672 $code [1] 0 #=== END Here phi = 8.287353 (i.e. theta = 0.1206658), log-likelihood = -418.5078 and deviance = 501.672. Few remarks: - our results seem not to be similar to the STATA results you provided. If I well understood, with STATA, log-likelihood = -597.1477759 (which is lower than ours) and theta = 1. With R, I considered all the covariables as factors, except pop and Area (continuous). Did you the same with STATA? - In the optimization process used in the example, the Hessian matrix is singular. That often occurs when the model is overparametrized (and therefore very instable) compared to the number of data you have (I think your model is). - I am not sure that the type of model you proposed is the most adapted. Why not a model such as log(nbcas / pop) = X b, which is commonly used (see Agresti, 1990. Categorical data analysis. Wiley) for analysing rates of occurence of events, for example in epidemiology? With negbin, this model is (with only considering axe_routier): negbin( + formula = nbcas ~ axe_routier + offset(log(pop)), + random = ~ 1, + data = mydata + ) Negative-binomial model --- negbin(formula = nbcas ~ axe_routier + offset(log(pop)), random = ~1, data = mydata) Convergence was obtained after 82 iterations. Fixed-effect coefficients: Estimate Std. Error z value Pr( |z|) (Intercept) -6.5072 0.4888 -13.3114 1e-4 axe_routier2 1.0234 0.6839 1.49640.1346 Overdispersion coefficients: Estimate Std. Error z value Pr( z) phi.(Intercept) 10.7611 1.7936 5.9997 1e-4 Log-likelihood statistics Log-liknbpar df res. Deviance AIC AICc -411.1923 89 487.040 828.384 828.656 - Finally, the response variable nbcas has a lot of values 0. A zero-inflated model could perhaps much better fit the data. Best wishes Matthieu #== # FUNCTION negbin0 (TO SOURCE) #== negbin0 - function(formula, random, data, phi.ini = NULL, fixpar = list(), hessian = TRUE,...){ link - log f - formula mf - model.frame(formula = f, data = data) y - model.response(mf) fam - poisson(link = log) fm - glm(formula = f, family = fam, data = data) offset - model.offset(mf) # Model matrices modmatrix.b - model.matrix(object = f, data = data) if(random != ~ 1) random - update.formula(random, ~ . - 1) modmatrix.phi - model.matrix(object = random, data = data) nb.b - ncol(modmatrix.b) ; nb.phi - ncol(modmatrix.phi) ; nbpar - nb.b + nb.phi # Initial values if(is.null(phi.ini)) phi.ini - rep(0.1, nb.phi) b - fm$coefficients param.ini - c(b, phi.ini) if(is.null(unlist(fixpar)) == FALSE) param.ini[fixpar[[1]]] - fixpar[[2]] # minuslogL minuslogL - function(param){ if(!is.null(unlist(fixpar))) param[fixpar[[1]]] - fixpar[[2]] b - param[1:nb.b] eta - as.vector(modmatrix.b %*% b) mu - if(is.null(offset)) exp(eta) else exp(offset + eta) phi - as.vector(modmatrix.phi %*% param[(nb.b + 1):(nb.b + nb.phi)]) k - 1 / phi cnd - phi == 0 f1 - dpois(x = y[cnd], lambda = mu[cnd], log = TRUE) y2 - y[!cnd]; k2 - k[!cnd]; mu2 - mu[!cnd] f2 - lgamma(y2 + k2) - lfactorial(y2) - lgamma(k2) + k2 * log(k2 / (k2 + mu2)) + y2 * log(mu2 / (k2 + mu2)) fn - sum(c(f1, f2)) if(!is.finite(fn)) fn - -1e20 -fn } # Fit res -
Re: [R] different points and lines on the same plot
Try this: matplot(patient[,1], patient[,-1], type = o) On 1/7/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Dear all, I have following data called paitent daypatient1patient4patient5patient6 0 -0.27842688 -0.04080808 -0.41948398 -0.04508318 56 -0.22275425 -0.01767067 -0.30977249 -0.03168185 112 -0.08217659 -0.26209243 -0.29141451 -0.09876170 252 0.08044537 -0.26701769 0.05727087 -0.09663701 where each patient have response values at four time points. I want to plot each patient's values over time on the same plot where the value points are connected by line. That is, the graph will have four lines for the four patients. I tried the program below but couldn't make it work correctly. I'm new beginner and haven't yet learned how functions line and points work together. Hope you can help me out. Thanks for your help, Antonia par(mfrow=c(1,1)) plot(patient[,1],patient[,2], pch=1, type=l,col=1,cex=1,lwd=2, xlab=Days, ylab=Patient response,cex.main =1,font.main= 1, main=NULL) points(patient[,1],patient[,3],col=2,pch=1,cex=1) lines(patient[,1],patient[,3],col=2,lty=1,cex=1) points(patient[,1],patient[,4],col=3,pch=1,cex=1) lines(patient[,1],patient[,4],col=2,lty=2,cex=1) points(patient[,1],patient[,5],col=4,pch=1,cex=1) lines(patient[,1],patient[,5],col=2,lty=1,cex=1) points(patient[,1],patient[,6],col=5,pch=1,cex=1) lines(patient[,1],patient[,6],col=2,lty=1,cex=1) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using VGAM's vglm function for ordinal logistic regression
Thank you for the help. Indeed, the differences in the results that I noted were due to the incorrect ordering of the response variable that resulted from my unattentive use of as.ordered on a character vector. Brant -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Sunday, January 07, 2007 2:52 AM To: Inman, Brant A. M.D. Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED] Subject: Re: [R] Using VGAM's vglm function for ordinal logistic regression On Sat, 6 Jan 2007, Inman, Brant A. M.D. wrote: R-Experts: I am using the vglm function of the VGAM library to perform proportional odds ordinal logistic regression. The issue that I would like help with concerns the format in which the response variable must be provided for this function to work correctly. pneumo2$severity [1] normal normal normal normal normal normal normal normal mild mild [11] mild mild mild mild mild mild severe severe severe severe [21] severe severe severe severe Levels: mild normal severe is different from the ordering in the first example. The difference between vglm and polr is that the latter uses the conventional sign for the coefficients: see MASS4 p.204. I would never use as.ordered on a character vector, as this leaves it to R to choose the ordering. (Even if you think you intended alphabetical order, that depends on the locale: see the warnings on the help page.) Consider the following example: -- library(VGAM) library(MASS) attach(pneumo) pneumo# Inspect the format of the original dataset # Restructure the pneumo dataset into a different format pneumo2 - data.frame(matrix(ncol=3, nrow=24)) colnames(pneumo2) - c('exposure.time', 'severity', 'freq') pneumo2[,1] - rep(pneumo[,1],3) pneumo2[,2] - as.ordered(c(rep('normal',8),rep('mild',8),rep('severe',8))) pneumo2[1:8,3] - pneumo[,2] pneumo2[9:16,3] - pneumo[,3] pneumo2[17:24,3] - pneumo[,4] pneumo2 # Inspect the format of the new modified dataset -- The problem occurs when I try to analyze these two datasets, which are identical in content, with the proportional odds model using vglm: -- # Analyze the original dataset with vglm, get one set of results vglm(vglm(cbind(normal, mild, severe) ~ log(exposure.time), data=pneumo, + family=cumulative(parallel=T)) Coefficients: (Intercept):1 (Intercept):2 log(exposure.time) 9.676093 10.581725 -2.596807 Degrees of Freedom: 16 Total; 13 Residual Residual Deviance: 5.026826 Log-likelihood: -204.2742 # Analyzing the modified dataset with vglm gives another set of results vglm(severity ~ log(exposure.time), weights=freq, data=pneumo2, + family=cumulative(parallel=T)) Coefficients: (Intercept):1 (Intercept):2 log(exposure.time) -1.6306499 2.5630170 -0.1937956 Degrees of Freedom: 48 Total; 45 Residual Residual Deviance: 503.7251 Log-likelihood: -251.8626 Warning messages: 1: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 2: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 3: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 4: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) 5: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, wzeps = control$wzepsilon) # Analyzing the modified dataset with polr reproduces these second results polr(severity ~ log(exposure.time), weights=freq, data=pneumo2) Coefficients: log(exposure.time) 0.1938154 Intercepts: mild|normal normal|severe -1.630612 2.563055 Residual Deviance: 503.7251 AIC: 509.7251 -- Can someone explain what I am doing wrong when using vglm and polr with the modified dataset? I do not understand why these two formulations should give different results. Brant Inman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Partial proportional odds logistic regression
R-experts: I would like to explore the partial proportional odds models of Peterson and Harrell (Applied Statistics 1990, 39(2): 205-217) for a dataset that I am analyzing. I have not been able to locate a R package that implements these models. Is anyone aware of existing R functions, packages, etc... that might be used to implement the partial proportional odds models? Brant Inman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multiple plots via sapply or lapply?
Hi all, I've got the following problem. I have a vector containing file names. I want to read these files as csv and calculate the density-function for each file (has just one column with data). Then, I'd like to plot all density functions into one window. I did the following to calculate the density data: s - sapply(filelist, function(x) { if(file.exists(x)) { file - read.csv(x, sep=\t, header=F) return( list(density(file$V1)$x, density(file$V1)$y)) } }) Now I would like to plot these x,y data in a similar way but my result s is a matrix containing lists... File1.csv File2.csv File3.csv [1,] Numeric,512Numeric,512 Numeric,512 [2,] Numeric,512Numeric,512 Numeric,512 Now I don't know how to handle the x,y values for each plot into an sapply (or lapply, I don't know) Any idea? Maybe, I should somehow change the return type? Antje __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] odfWeave and figures in MS Word Format
Dear R-List, I try to use the package odfWeave but I missed something. I use the command odfWeave(examples.odt,out.odt) to generate the file that I can open with OpenOffice : it works fine. Then, I used the Save As to export the document to HTML format : it works fine and creates the .png files for every figure from the out.odt document. But when I export the document to MS Word format (OS : XP) there is no figure in the word document .doc. What did I do wrong ? Thanks Laurent ## R version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 3.1 year 2006 month 06 day01 svn rev38247 language R version.string Version 2.3.1 (2006-06-01) ## OpenOffice Version 2.1 ## __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple plots via sapply or lapply?
try apply() : par(new=F); apply(s,2,function(x){plot(x[[1]],x[[2]],type=o);par(new=T)}) On 1/8/07, Antje [EMAIL PROTECTED] wrote: Hi all, I've got the following problem. I have a vector containing file names. I want to read these files as csv and calculate the density-function for each file (has just one column with data). Then, I'd like to plot all density functions into one window. I did the following to calculate the density data: s - sapply(filelist, function(x) { if(file.exists(x)) { file - read.csv(x, sep=\t, header=F) return( list(density(file$V1)$x, density(file$V1)$y)) } }) Now I would like to plot these x,y data in a similar way but my result s is a matrix containing lists... File1.csv File2.csv File3.csv [1,] Numeric,512 Numeric,512 Numeric,512 [2,] Numeric,512 Numeric,512 Numeric,512 Now I don't know how to handle the x,y values for each plot into an sapply (or lapply, I don't know) Any idea? Maybe, I should somehow change the return type? Antje __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory problem
Doing so will also require using the Linux version of R as there is no open source 64bit compiler to create a windows version (so I am told). Anyway, memory is getting cheap now. I got a 64bit machine from Dell with 16Gb od DDR2 for around $4000. Good luck. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges Sent: Sunday, January 07, 2007 3:42 AM To: Zoltan Kmetty Cc: r-help@stat.math.ethz.ch Subject: Re: [R] memory problem Zoltan Kmetty wrote: Hi! I had some memory problem with R - hope somebody could tell me a solution. I work with very large datasets, but R cannot allocate enough memoty to handle these datasets. I want work a matrix with row= 100 000 000 and column=10 A know this is 1 milliard cases, but i thought R could handle it (other commercial software like spss could do), but R wrote out everytime: not enough memory.. any good idea? Buy a machine that has at least 8Gb (better 16Gb) of RAM and proceed ... Uwe Ligges Thanks, Zoltan [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ** * This message is for the named person's use only. It may contain confidential, proprietary or legally privileged information. No right to confidential or privileged treatment of this message is waived or lost by any error in transmission. If you have received this message in error, please immediately notify the sender by e-mail, delete the message and all copies from your system and destroy any hard copies. You must not, directly or indirectly, use, disclose, distribute, print or copy any part of this message if you are not the intended recipient. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] as.Date() results depend on order of data within vector?
On Sun, 07-Jan-2007 at 12:01PM +, Mark Wardle wrote: | Dear all, | | The as.Date() function appears to give different results depending on | the order of the vector passed into it. | | d1 = c(1900-01-01, 2007-01-01,,2001-05-03) | d2 = c(, 1900-01-01, 2007-01-01,2001-05-03) | as.Date(d1) # gives correct results | as.Date(d2) # fails with error (* see below) | | This problem does not arise if the dates are NA rather than an empty | string, but my data is coming via RODBC and I still don't have NAs | passed across properly. | | I might add that I initially noticed this behaviour when using RODBC's | sqlQuery() function call, and I initially had difficulty explaining why | one column of dates was passed correctly, but another failed. The | failing column was a date of death column where it was NA () for | most patients. | | I've come up with two workarounds that work. The first is to sort the | data at the SQL level, ensuring the initial record is not null. The | second is to use sqlQuery() with as.is=T option, and then do the sorting | and conversion afterwards. Simpler, I think, is to add one line d2[d2 == ] - NA I've not tested the idea extensively, so there might be occasions where it falls down. If you're working with a dataframe, you can use one of the apply functions to effect all columns. HTH -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_Middle minds discuss events (:_~*~_:)Small minds discuss people (_)-(_) . Anon ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] creating a list of lists
Hello, I'm trying to create a series of randomForest objects, basically in a loop like this: forests - list(); for (level in 1:10) { # do some other things here # create a random forest forest - randomForest( x = x.level, y = z.level, ntree = trees ); forests - c(forests, forest); } But instead of creating a list of 10 forests, this creates a list of 180 elements, 18 for each forest. Is there a way to create a list of randomForest objects for use later on in code like: for (forest in forests) { values - predict(forest, data); # do things with these predicted values } Sincerely, Paul __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
Hi Martin, On 1/6/07, Martin Morgan [EMAIL PROTECTED] wrote: Hi Ramon, It seems like a naming convention (f.xxx) and eval(parse(...)) are standing in for objects (of class 'GeneSelector', say, representing a function with a particular form and doing a particular operation) and dispatch (a function 'geneConverter' might handle a converter of class 'GeneSelector' one way, user supplied ad-hoc functions more carefully; inside geneConverter the only real concern is that the converter argument is in fact a callable function). eval(parse(...)) brings scoping rules to the fore as an explicit programming concern; here scope is implicit, but that's probably better -- R will get its own rules right. Martin Here's an S4 sketch: setClass(GeneSelector, contains=function, representation=representation(description=character), validity=function(object) { msg - NULL argNames - names(formals(object)) if (argNames[1]!=x) msg - c(msg, \n GeneSelector requires a first argument named 'x') if (!... %in% argNames) msg - c(msg, \n GeneSelector requires '...' in its signature) if (0==length([EMAIL PROTECTED])) msg - c(msg, \n Please describe your GeneSelector) if (is.null(msg)) TRUE else msg }) setGeneric(geneConverter, function(converter, x, ...) standardGeneric(geneConverter), signature=c(converter)) setMethod(geneConverter, signature(converter=GeneSelector), function(converter, x, ...) { ## important stuff here converter(x, ...) }) setMethod(geneConverter, signature(converter=function), function(converter, x, ...) { message(ad-hoc converter; hope it works!) converter(x, ...) }) and then... c1 - new(GeneSelector, + function(x, ...) prod(x, ...), + description=Product of x) c2 - new(GeneSelector, + function(x, ...) sum(x, ...), + description=Sum of x) geneConverter(c1, 1:4) [1] 24 geneConverter(c2, 1:4) [1] 10 geneConverter(mean, 1:4) ad-hoc converter; hope it works! [1] 2.5 cvterr - new(GeneSelector, function(y) {}) Error in validObject(.Object) : invalid class GeneSelector object: 1: GeneSelector requires a first argument named 'x' invalid class GeneSelector object: 2: GeneSelector requires '...' in its signature invalid class GeneSelector object: 3: Please describe your GeneSelector xxx - 10 geneConverter(xxx, 1:4) Error in function (classes, fdef, mtable) : unable to find an inherited method for function geneConverter, for signature numeric Thanks!! That is actually a rather interesting alternative approach and I can see it also adds a lot of structure to the problem. I have to confess, though, that I am not a fan of OOP (nor of S4 classes); in this case, in particular, it seems there is a lot of scaffolding in the code above (the counterpoint to the structure?) and, regarding scoping rules, I prefer to think about them explicitly (I find it much simpler than inheritance). Best, R. Ramon Diaz-Uriarte [EMAIL PROTECTED] writes: Dear Greg, On 1/5/07, Greg Snow [EMAIL PROTECTED] wrote: Ramon, I prefer to use the list method for this type of thing, here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applys to some of the rest of us). Those suggestions do apply to me of course (no claim to being organized nor beyond idiocy here). And actually the suggestions on this thread are being very useful. I think, though, that I was not very clear on the context and my examples were too dumbed down. So I'll try to give more detail (nothing here is secret, I am just trying not to bore people). The code is part of a web-based application, so there is no interactive user. The R code is passed the arguments (and optional user functions) from the CGI. There is one core function (call it cvFunct) that, among other things, does cross-validation. So this is one way to do things: cvFunct - function(whatever, genefiltertype, whateverelse) { internalGeneSelect - eval(parse(text = paste(geneSelect, genefiltertype, sep = .))) ## do things calling internalGeneSelect, } and now define all possible functions as geneSelect.Fratio - function(x, y, z) {##something} geneSelect.Wilcoxon - function(x, y, z) {## something else} If I want more geneSelect functions, adding them is simple. And I can even allow the user to pass her/his own functions, with the only restriction that it takes three args, x, y, z, and that the
Re: [R] eval(parse(text vs. get when accessing a function
Hi Ramon, It seems like a naming convention (f.xxx) and eval(parse(...)) are standing in for objects (of class 'GeneSelector', say, representing a function with a particular form and doing a particular operation) and dispatch (a function 'geneConverter' might handle a converter of class 'GeneSelector' one way, user supplied ad-hoc functions more carefully; inside geneConverter the only real concern is that the converter argument is in fact a callable function). eval(parse(...)) brings scoping rules to the fore as an explicit programming concern; here scope is implicit, but that's probably better -- R will get its own rules right. Martin Here's an S4 sketch: setClass(GeneSelector, contains=function, representation=representation(description=character), validity=function(object) { msg - NULL argNames - names(formals(object)) if (argNames[1]!=x) msg - c(msg, \n GeneSelector requires a first argument named 'x') if (!... %in% argNames) msg - c(msg, \n GeneSelector requires '...' in its signature) if (0==length([EMAIL PROTECTED])) msg - c(msg, \n Please describe your GeneSelector) if (is.null(msg)) TRUE else msg }) setGeneric(geneConverter, function(converter, x, ...) standardGeneric(geneConverter), signature=c(converter)) setMethod(geneConverter, signature(converter=GeneSelector), function(converter, x, ...) { ## important stuff here converter(x, ...) }) setMethod(geneConverter, signature(converter=function), function(converter, x, ...) { message(ad-hoc converter; hope it works!) converter(x, ...) }) and then... c1 - new(GeneSelector, + function(x, ...) prod(x, ...), + description=Product of x) c2 - new(GeneSelector, + function(x, ...) sum(x, ...), + description=Sum of x) geneConverter(c1, 1:4) [1] 24 geneConverter(c2, 1:4) [1] 10 geneConverter(mean, 1:4) ad-hoc converter; hope it works! [1] 2.5 cvterr - new(GeneSelector, function(y) {}) Error in validObject(.Object) : invalid class GeneSelector object: 1: GeneSelector requires a first argument named 'x' invalid class GeneSelector object: 2: GeneSelector requires '...' in its signature invalid class GeneSelector object: 3: Please describe your GeneSelector xxx - 10 geneConverter(xxx, 1:4) Error in function (classes, fdef, mtable) : unable to find an inherited method for function geneConverter, for signature numeric Ramon Diaz-Uriarte [EMAIL PROTECTED] writes: Dear Greg, On 1/5/07, Greg Snow [EMAIL PROTECTED] wrote: Ramon, I prefer to use the list method for this type of thing, here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applys to some of the rest of us). Those suggestions do apply to me of course (no claim to being organized nor beyond idiocy here). And actually the suggestions on this thread are being very useful. I think, though, that I was not very clear on the context and my examples were too dumbed down. So I'll try to give more detail (nothing here is secret, I am just trying not to bore people). The code is part of a web-based application, so there is no interactive user. The R code is passed the arguments (and optional user functions) from the CGI. There is one core function (call it cvFunct) that, among other things, does cross-validation. So this is one way to do things: cvFunct - function(whatever, genefiltertype, whateverelse) { internalGeneSelect - eval(parse(text = paste(geneSelect, genefiltertype, sep = .))) ## do things calling internalGeneSelect, } and now define all possible functions as geneSelect.Fratio - function(x, y, z) {##something} geneSelect.Wilcoxon - function(x, y, z) {## something else} If I want more geneSelect functions, adding them is simple. And I can even allow the user to pass her/his own functions, with the only restriction that it takes three args, x, y, z, and that the function is to be called: geneSelect. and a user choosen string. (Yes, I need to make sure no calls to system, etc, are in the user code, etc, etc, but that is another issue). The general idea is not new of course. For instance, in package e1071, a somewhat similar thing is done in function tune, and David Meyer there uses do.call. However, tune is a lot more general than what I had in mind. For instance, tune deals with arbitrary functions, with arbitrary numbers and names of parameters, whereas my functions above all take only three arguments (x: a matrix, y: a vector; z: an integer), so the neat functionality
Re: [R] creating a list of lists
change forests - c(forests, forest); into forests[[level]] - forest On 1/6/07, Paul Boutros [EMAIL PROTECTED] wrote: Hello, I'm trying to create a series of randomForest objects, basically in a loop like this: forests - list(); for (level in 1:10) { # do some other things here # create a random forest forest - randomForest( x = x.level, y = z.level, ntree = trees ); forests - c(forests, forest); } But instead of creating a list of 10 forests, this creates a list of 180 elements, 18 for each forest. Is there a way to create a list of randomForest objects for use later on in code like: for (forest in forests) { values - predict(forest, data); # do things with these predicted values } Sincerely, Paul __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Weiwei Shi, Ph.D Research Scientist GeneGO, Inc. Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] creating a list of lists
Howabout using 'lapply'? forest - lapply(1:10, function(level) { # do some other things here # create and return a random forest randomForest(x = x.level, y = z.level, ntree = trees) }) # give meaningful names names(forest) - paste(level, 1:10, sep = _) Paul Boutros wrote: Hello, I'm trying to create a series of randomForest objects, basically in a loop like this: forests - list(); for (level in 1:10) { # do some other things here # create a random forest forest - randomForest( x = x.level, y = z.level, ntree = trees ); forests - c(forests, forest); } But instead of creating a list of 10 forests, this creates a list of 180 elements, 18 for each forest. Is there a way to create a list of randomForest objects for use later on in code like: for (forest in forests) { values - predict(forest, data); # do things with these predicted values } Sincerely, Paul __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Eric Archer, Ph.D. NOAA-SWFSC 8604 La Jolla Shores Dr. La Jolla, CA 92037 858-546-7121,7003(FAX) [EMAIL PROTECTED] Lighthouses are more helpful than churches. - Benjamin Franklin ...but I'll take a GPS over either one. - Craig George __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] MDS in 3D
Hi, I have tried to develop multidimensional scaling for 3D space using PCA without success, yet;-) Is there some application ready in R? Cheers, Atte __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MDS in 3D
Hi, I have tried to develop multidimensional scaling for 3D space using PCA without success, yet;-) Is there some application ready in R? Cheers, Atte I found xgobi, but when I try to run example I get some command not found -errors. Atte data(morsecodes) ## from the XGobi/XGvis data, see ?morsecodes mc.row - paste(morsecodes.row[,1],morsecodes.row[,2]) xgvis(dmat = morsecodes.dist, + pos = morsecodes.pos, + rowlab = mc.row, + colors = morsecodes.colors, + glyphs = morsecodes.glyphs, + lines = morsecodes.lines, + linecolors = morsecodes.linecolors) xgvis /tmp/RtmpDaR3cT/xgvis-6058ed8 ## 2) Show lines by hitting l with the mouse over the plot. ## 3) Examine morsecode labels by hitting i and mousing around on the plot. ## 3b) Press r (on the plot) to switch 3D rotation in xgobi. ## 4) Run MDS in 3D by clicking Run MDS (in xgvis). ## 5) Speed up the optimization by increasing the Stepsize with the slider. ## The Stress function value may go as low as 0.1925 (MM). ## 6) When the optimization calms down, click Run MDS to toggle MDS off. ## 7) Rotate the MDS configuration in 3D {by r with mouse over plot}. ## 8) Increase the rotation speed with the slider in the top left and ## control the rotation direction by dragging the mouse on the plot. ## 9) You can check out the initial configuration by ## In order to have no color warning : Mcolors - unique(morsecodes.colors) /bin/sh: line 1: xgvis: command not found (Mcolors - paste(*brushColor, 0:(length(Mcolors)-1),: , Mcolors, sep=)) [1] *brushColor0: SkyBlue *brushColor1: Green [3] *brushColor2: Yellow *brushColor3: HotPink [5] *brushColor4: Red xgobi(morsecodes.pos, collab = morsecodes.col, rowlab = mc.row, + colors = morsecodes.colors, + glyphs = morsecodes.glyphs, + lines = morsecodes.lines, + linecolors = morsecodes.linecolors, + resources= c(*showLines: True, Mcolors)) xgobi -title 'morsecodes.pos' -std mmx /tmp/RtmpDaR3cT/xgobi-mrscd56e509fe /bin/sh: line 1: xgobi: command not found ## This XGobi window will be linked with ## the XGvis window for glyph-color brushing and labeling. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MDS in 3D
...and my system is OSX 10.4. Atte Hi, I have tried to develop multidimensional scaling for 3D space using PCA without success, yet;-) Is there some application ready in R? Cheers, Atte I found xgobi, but when I try to run example I get some command not found -errors. Atte data(morsecodes) ## from the XGobi/XGvis data, see ?morsecodes mc.row - paste(morsecodes.row[,1],morsecodes.row[,2]) xgvis(dmat = morsecodes.dist, + pos = morsecodes.pos, + rowlab = mc.row, + colors = morsecodes.colors, + glyphs = morsecodes.glyphs, + lines = morsecodes.lines, + linecolors = morsecodes.linecolors) xgvis /tmp/RtmpDaR3cT/xgvis-6058ed8 ## 2) Show lines by hitting l with the mouse over the plot. ## 3) Examine morsecode labels by hitting i and mousing around on the plot. ## 3b) Press r (on the plot) to switch 3D rotation in xgobi. ## 4) Run MDS in 3D by clicking Run MDS (in xgvis). ## 5) Speed up the optimization by increasing the Stepsize with the slider. ## The Stress function value may go as low as 0.1925 (MM). ## 6) When the optimization calms down, click Run MDS to toggle MDS off. ## 7) Rotate the MDS configuration in 3D {by r with mouse over plot}. ## 8) Increase the rotation speed with the slider in the top left and ## control the rotation direction by dragging the mouse on the plot. ## 9) You can check out the initial configuration by ## In order to have no color warning : Mcolors - unique(morsecodes.colors) /bin/sh: line 1: xgvis: command not found (Mcolors - paste(*brushColor, 0:(length(Mcolors)-1),: , Mcolors, sep=)) [1] *brushColor0: SkyBlue *brushColor1: Green [3] *brushColor2: Yellow *brushColor3: HotPink [5] *brushColor4: Red xgobi(morsecodes.pos, collab = morsecodes.col, rowlab = mc.row, + colors = morsecodes.colors, + glyphs = morsecodes.glyphs, + lines = morsecodes.lines, + linecolors = morsecodes.linecolors, + resources= c(*showLines: True, Mcolors)) xgobi -title 'morsecodes.pos' -std mmx /tmp/RtmpDaR3cT/xgobi- mrscd56e509fe /bin/sh: line 1: xgobi: command not found ## This XGobi window will be linked with ## the XGvis window for glyph-color brushing and labeling. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] negative binomial family glm R and STATA
Dear Patrick Try the package gamlss which allow to fit the negative binomial distrbution. For example with your data I am getting #--- ga1-gamlss(nbcas~.,data=zonesdb4,family=NBI) GAMLSS-RS iteration 1: Global Deviance = 817.9027 GAMLSS-RS iteration 2: Global Deviance = 817.9025 ga1 Family: c(NBI, Negative Binomial type I) Fitting method: RS() Call: gamlss(formula = nbcas ~ ., family = NBI, data = zonesdb4) Mu Coefficients: (Intercept) pop AreaV_100kHab1gares1 3.204e+00 1.114e-05 1.354e-05 9.144e-01 7.946e-01 ports1 axe_routier1 axe_routier2 lacs1 -1.730e+00 1.989e-01NA 3.042e+00 Sigma Coefficients: (Intercept) 2.313 Degrees of Freedom for the fit: 9 Residual Deg. of Freedom 83 Global Deviance: 817.902 AIC: 835.902 SBC: 858.599 #-- Note that the AIC: 835.902 is similar to your fitted model using glm.nb which is AIC: 836.2. The coefficients are not identical but this is not suprissing when you are using x-variables with extreme values as pop and Area. The profile function for sigma can be found using prof.dev(ga1,sigma, min=7, max=16, step=0.1, type=l) Your discrepancy with STATA come from the fact that in STATA you are fitting the model with sigma fixed to 1. You can see that by fitting the same model in GAMLSS. ga2-gamlss(nbcas~.,data=zonesdb4,family=NBI, sigma.fix=T, sigma.start=1) GAMLSS-RS iteration 1: Global Deviance = 1194.299 GAMLSS-RS iteration 2: Global Deviance = 1194.298 This is similar to the log likelihod you are getting in STATA. i.e. -2*-597.14778= 1194.296. You can also use the stepGAIC() function to simplify your model. For example ga2-stepGAIC(ga1) will result to a model with only pop and lacs in the mdel. Note also the the Negative binomial type II fits better to you data. ga3-gamlss(nbcas~.,data=zonesdb4,family=NBII) GAMLSS-RS iteration 1: Global Deviance = 804.5682 ... GAMLSS-RS iteration 10: Global Deviance = 804.4995 Thanks Mikis __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] eval(parse(text vs. get when accessing a function
The S4 is not essential. You could do it in S3 too: f.a - function(x) x+1 f.b - function(x) x+2 f - function(x) UseMethod(f) f(structure(10, class = a)) [1] 11 attr(,class) [1] a On 1/6/07, Ramon Diaz-Uriarte [EMAIL PROTECTED] wrote: Hi Martin, On 1/6/07, Martin Morgan [EMAIL PROTECTED] wrote: Hi Ramon, It seems like a naming convention (f.xxx) and eval(parse(...)) are standing in for objects (of class 'GeneSelector', say, representing a function with a particular form and doing a particular operation) and dispatch (a function 'geneConverter' might handle a converter of class 'GeneSelector' one way, user supplied ad-hoc functions more carefully; inside geneConverter the only real concern is that the converter argument is in fact a callable function). eval(parse(...)) brings scoping rules to the fore as an explicit programming concern; here scope is implicit, but that's probably better -- R will get its own rules right. Martin Here's an S4 sketch: setClass(GeneSelector, contains=function, representation=representation(description=character), validity=function(object) { msg - NULL argNames - names(formals(object)) if (argNames[1]!=x) msg - c(msg, \n GeneSelector requires a first argument named 'x') if (!... %in% argNames) msg - c(msg, \n GeneSelector requires '...' in its signature) if (0==length([EMAIL PROTECTED])) msg - c(msg, \n Please describe your GeneSelector) if (is.null(msg)) TRUE else msg }) setGeneric(geneConverter, function(converter, x, ...) standardGeneric(geneConverter), signature=c(converter)) setMethod(geneConverter, signature(converter=GeneSelector), function(converter, x, ...) { ## important stuff here converter(x, ...) }) setMethod(geneConverter, signature(converter=function), function(converter, x, ...) { message(ad-hoc converter; hope it works!) converter(x, ...) }) and then... c1 - new(GeneSelector, + function(x, ...) prod(x, ...), + description=Product of x) c2 - new(GeneSelector, + function(x, ...) sum(x, ...), + description=Sum of x) geneConverter(c1, 1:4) [1] 24 geneConverter(c2, 1:4) [1] 10 geneConverter(mean, 1:4) ad-hoc converter; hope it works! [1] 2.5 cvterr - new(GeneSelector, function(y) {}) Error in validObject(.Object) : invalid class GeneSelector object: 1: GeneSelector requires a first argument named 'x' invalid class GeneSelector object: 2: GeneSelector requires '...' in its signature invalid class GeneSelector object: 3: Please describe your GeneSelector xxx - 10 geneConverter(xxx, 1:4) Error in function (classes, fdef, mtable) : unable to find an inherited method for function geneConverter, for signature numeric Thanks!! That is actually a rather interesting alternative approach and I can see it also adds a lot of structure to the problem. I have to confess, though, that I am not a fan of OOP (nor of S4 classes); in this case, in particular, it seems there is a lot of scaffolding in the code above (the counterpoint to the structure?) and, regarding scoping rules, I prefer to think about them explicitly (I find it much simpler than inheritance). Best, R. Ramon Diaz-Uriarte [EMAIL PROTECTED] writes: Dear Greg, On 1/5/07, Greg Snow [EMAIL PROTECTED] wrote: Ramon, I prefer to use the list method for this type of thing, here are a couple of reasons why (maybe you are more organized than me and would never do some of the stupid things that I have, so these don't apply to you, but you can see that the general suggestion applys to some of the rest of us). Those suggestions do apply to me of course (no claim to being organized nor beyond idiocy here). And actually the suggestions on this thread are being very useful. I think, though, that I was not very clear on the context and my examples were too dumbed down. So I'll try to give more detail (nothing here is secret, I am just trying not to bore people). The code is part of a web-based application, so there is no interactive user. The R code is passed the arguments (and optional user functions) from the CGI. There is one core function (call it cvFunct) that, among other things, does cross-validation. So this is one way to do things: cvFunct - function(whatever, genefiltertype, whateverelse) { internalGeneSelect - eval(parse(text = paste(geneSelect, genefiltertype, sep = .))) ## do things calling
[R] Using JRI Calling R function from Java
Hey guys. I have Installed and running JRI from Java and it works. Using : java -cp jri.jar; MainApp.java The problem is How do I calling R function that need R library I have tried these with my Java program : x = re.eval(glm( y ~ x1 + x2, family = poisson)); String resultString = x.asString(); but it did'n't work, resultString variable did't give any result it's null questions : Is there any other way to call this R function from Java using JRI? and I have heard omegahat package can be use to calling R function are there any links, examples , tutorials for using omegahat? Thanks -- View this message in context: http://www.nabble.com/Using-JRI-Calling--R-function-from-Java-tf2937179.html#a8212021 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R scripts to plot Taylor Diagram
Dear All, Are there any existing scripts to plot Taylor Diagram (definition see http://www.ipsl.jussieu.fr/~jmesce/Taylor_diagram/taylor_diagram_definition.html ) ? Thanks a lot! Linda [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MDS in 3D
On Mon, 8 Jan 2007, Atte Tenkanen wrote: I have tried to develop multidimensional scaling for 3D space using PCA without success, yet;-) Is there some application ready in R? Yes. stats has cmdscale. MASS has sammon and isoMDS. All three have a 'k' argument to determine the output dimension. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MDS in 3D
I found xgobi, but when I try to run example I get some command not found -errors. Try the more moden ggobi and ggvis, http://www.ggobi.org. Installing on the mac is still a bit of a trial, but we hope to have an installer soon. Hadley __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.