Re: [R] Suggestion for ?split
On Thu, 22 Jun 2006, Simon Blomberg wrote: Hi all, I noticed an undocumented feature for split. It sorts the resulting list according to the grouping factor. An example: test - data.frame(x=rnorm(48), f=letters[sample(1:8)]) split(test, test$f) I wasn't expecting this behaviour, although I was pleasantly surprised. I suggest that the help page for split be amended to include this feature. I know it's a small thing, but someone else may also find it useful to know. It is not really true. The help page says The value returned from 'split' is a list of vectors containing the values for the groups. The components of the list are named by the _used_ factor levels given by 'f'. They are in the same order as the _used_ factor levels (as the statement implies), but those are in no sense sorted. Indeed, the factor may be created by as.factor or interaction, and working out the order of the factor levels can be tricky, which is why they are named. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] NLME: using the layout option with the plot command
Hi Greg, Since you haven't yet had a response, you could distill this. It uses the pixel dataset from nlme() as an example. ## To get separate files, do this postscript(c:\MyGraph%03.ps, onefile=F) plot(Pixel, display = Dog, inner = ~Side, layout=c(4,1)) dev.off() ## To get your layout into one file, as separate pages, do this postscript(c:\MyGraph.ps, onefile=T) plot(Pixel, display = Dog, inner = ~Side, layout=c(4,1)) dev.off() If you prefer pdf, then use : pdf(filename, onefile=F), c. At the R prompt do : ?postscript (and ?pdf), and go on reading! Also have a look at setps() in package Hmisc. Regards, Mark. Mark DiffordPh.D. candidate, Botany Department, Nelson Mandela Metropolitan University, Port Elizabeth, SA. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Basic NA handling problem
Hi All, I need your help in NA handling. I've following data series. x-c(1,4,5,8,NA,4,NA,5,5,1,2,7,8,9,NA,NA,NA,15,6,8) Now i want to interpolate where NA value persists. Like, between 9 and 5 there are three NA's. So, that should be interpolated like, 1st NA- (15-9)/4 2nd NA- 1st NA value + (15-9)/4 Can i get help on this using a 'for' loop. Actually i have huge daily time series with lots of NA values where i need to make these values. Please help. Thanks, Sumanta Basak. - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Basic NA handling problem
Try this: help.search(interpolate) Then find this: library(zoo) na.approx(x) [1] 1.0 4.0 5.0 8.0 6.0 4.0 4.5 5.0 5.0 1.0 2.0 7.0 8.0 9.0 10.5 [16] 12.0 13.5 15.0 6.0 8.0 Hth, JeeBee. On Thu, 22 Jun 2006 06:54:53 +0100, Sumanta Basak wrote: Hi All, I need your help in NA handling. I've following data series. x-c(1,4,5,8,NA,4,NA,5,5,1,2,7,8,9,NA,NA,NA,15,6,8) Now i want to interpolate where NA value persists. Like, between 9 and 5 there are three NA's. So, that should be interpolated like, 1st NA- (15-9)/4 2nd NA- 1st NA value + (15-9)/4 Can i get help on this using a 'for' loop. Actually i have huge daily time series with lots of NA values where i need to make these values. Please help. Thanks, Sumanta Basak. - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to plot a image with restrictions?
On Wed, 21 Jun 2006, Cleber N.Borges wrote: Hello All! How to plot a image ( image function ) with restrictions, like a polygon?? The suggestions given in: http://finzi.psych.upenn.edu/R/Rhelp02a.new/archive/15030.html by Denis White let you overplot the image with a masking polygon - that may be the most suitable for you. Alternatively, you can set the image values to be omitted to NA, but how you would do this depends rather on whether the polygon is irregular or not. If the polygon is regular, unrolling the image into a vector and setting conditions on expand.grid() of the row and column coordinates will work with is.na(). If irregular, you'll need to do point-in-polygon checking of the image grid coordinates. If your data are geographical (not assumed here), you may like to follow this up on the R-sig-geo list. Roger Thanks in advance! Cleber N. Borges # problem example x - y - seq(-4*pi, 4*pi, len=27) r - sqrt(outer(x^2, y^2, +)) image( z ) contour(z, add = TRUE, drawlabels = FALSE) m=scan() 0.2 0.2 0.8 0.2 0.5 0.8 m = matrix(m,nr=3,byrow=T) polygon( m, lwd=5 ) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rosner's test
BertG == Berton Gunter [EMAIL PROTECTED] on Tue, 13 Jun 2006 14:34:48 -0700 writes: BertG RSiteSearch('Rosner') ?RSiteSearch or search directly BertG from CRAN. BertG Incidentally, I'll repeat what I've said BertG before. Don't do outlier tests. They're BertG dangerous. Use robust methods instead. Yes, yes, yes!!! Note that rlm() or cov.rob() from recommended package MASS will most probably be sufficient for your needs. For slightly newer methodology, look at package 'robustbase', or also 'rrcov'. Martin Maechler, ETH Zurich BertG -- Bert Gunter Genentech Non-Clinical Statistics BertG South San Francisco, CA BertG The business of the statistician is to catalyze the BertG scientific learning process. - George E. P. Box __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] multivariate splits
Thanks to everyone for the replies to my question. There are several good algorithms to make decision trees with multivariate splits (sometimes called oblique trees), some are even freeware (QUEST, CRUISE...), but most are not open-sourced. Unfortunately there seems to be no good choice for oblique trees in R now. Mvpart seems to be really fancy, although it cannot handle splits on the linear combination of the covariates. Rweka might be useful, but it still takes some time for me to get used to its language. I'll see where I get. Thanks again and bye: Bálint -- Czúcz Bálint PhD hallgató BCE KTK Talajtan és Vízgazdálkodás Tanszék 1118 Budapest, Villányi út 29-43. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sort matrix by sum of columns
Albert Is this what you want?: a[,order(colSums(a))] John S --- -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Albert Vilella Sent: 21 June 2006 11:38 To: r help Subject: [R] sort matrix by sum of columns Hi all, I would like to know how can I sort the cols of a matrix by the sum of their elements. a - matrix(as.integer(rnorm(25,4,2)),10,5) colnames(a) = c(alfa,bravo,charlie,delta,echo) I guess I should use colSums, and then rearrange the matrix somehow according to the result. My idea is to display a sorted barplot: barplot(a, horiz=TRUE, legend.text=T) Thanks in advance, Albert. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Two sets of axis in a lattice plot?
Dear list, Is it possible to have lattice provide two different ordinate axis, one on the right side and one of the left side of the plot? I would like to make a xyplot with x both in a linear scale and in a log-approximate scale. Thank you in advance. /Fredrik __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] problem with integrate() - correction
Sorry should read: ip - function(x) 1240 * x^(-0.88) Rainer M Krug wrote: Hi System: Linux, SuSE 10 R 2.3.0 I try to do the following ip - function(x) 1240*dip(x, 0.88) iip - function(x) integrate(ip, x - 0.045, x + 0.045)$value f - integrate(iip, 10, 100) Error in integrate(iip, 10, 100) : evaluation of function gave a result of wrong length How can I solve this? Thanks Rainer -- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Department of Conservation Ecology and Entomology University of Stellenbosch Matieland 7602 South Africa Tel:+27 - (0)72 808 2975 (w) Fax:+27 - (0)21 808 3304 Cell: +27 - (0)83 9479 042 email: [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problem with integrate() - correction
Well why not do it symbolically, e.g. int(a*x^b,x) = \frac{a}{b+1}x^{b+1} and integrating this again gives \frac{a}{(b+1)(b+2)}x^{b+2}, of course for resonable values of a and b such that things are well defined. But if you follow the link https://stat.ethz.ch/pipermail/r-help/2005-January/064026.html you may come to this solution: ip - function(x) 1240 * x^(-0.88) iip - function(x){ res - rep(NA,length(x)) for (i in seq(along=x)) res[i] - integrate(ip, x[i] - 0.045, x[i] + 0.045)$value return(res) } f - integrate(iip, 10, 100) which may not be an effective way to do it. Med venlig hilsen Frede Aakmann Tøgersen -Oprindelig meddelelse- Fra: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] På vegne af Rainer M Krug Sendt: 22. juni 2006 14:03 Til: [EMAIL PROTECTED] Cc: R help list Emne: [R] problem with integrate() - correction Sorry should read: ip - function(x) 1240 * x^(-0.88) Rainer M Krug wrote: Hi System: Linux, SuSE 10 R 2.3.0 I try to do the following ip - function(x) 1240*dip(x, 0.88) iip - function(x) integrate(ip, x - 0.045, x + 0.045)$value f - integrate(iip, 10, 100) Error in integrate(iip, 10, 100) : evaluation of function gave a result of wrong length How can I solve this? Thanks Rainer -- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Department of Conservation Ecology and Entomology University of Stellenbosch Matieland 7602 South Africa Tel: +27 - (0)72 808 2975 (w) Fax: +27 - (0)21 808 3304 Cell: +27 - (0)83 9479 042 email:[EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problem with integrate() - correction
Thanks a lot - the symbolic solution will work and I understand what is wrong with my function Rainer Frede Aakmann Tøgersen wrote: Well why not do it symbolically, e.g. int(a*x^b,x) = \frac{a}{b+1}x^{b+1} and integrating this again gives \frac{a}{(b+1)(b+2)}x^{b+2}, of course for resonable values of a and b such that things are well defined. But if you follow the link https://stat.ethz.ch/pipermail/r-help/2005-January/064026.html you may come to this solution: ip - function(x) 1240 * x^(-0.88) iip - function(x){ res - rep(NA,length(x)) for (i in seq(along=x)) res[i] - integrate(ip, x[i] - 0.045, x[i] + 0.045)$value return(res) } f - integrate(iip, 10, 100) which may not be an effective way to do it. Med venlig hilsen Frede Aakmann Tøgersen -Oprindelig meddelelse- Fra: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] På vegne af Rainer M Krug Sendt: 22. juni 2006 14:03 Til: [EMAIL PROTECTED] Cc: R help list Emne: [R] problem with integrate() - correction Sorry should read: ip - function(x) 1240 * x^(-0.88) Rainer M Krug wrote: Hi System: Linux, SuSE 10 R 2.3.0 I try to do the following ip - function(x) 1240*dip(x, 0.88) iip - function(x) integrate(ip, x - 0.045, x + 0.045)$value f - integrate(iip, 10, 100) Error in integrate(iip, 10, 100) : evaluation of function gave a result of wrong length How can I solve this? Thanks Rainer -- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Department of Conservation Ecology and Entomology University of Stellenbosch Matieland 7602 South Africa Tel: +27 - (0)72 808 2975 (w) Fax: +27 - (0)21 808 3304 Cell:+27 - (0)83 9479 042 email: [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Department of Conservation Ecology and Entomology University of Stellenbosch Matieland 7602 South Africa Tel:+27 - (0)72 808 2975 (w) Fax:+27 - (0)21 808 3304 Cell: +27 - (0)83 9479 042 email: [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Two sets of axis in a lattice plot?
On 6/22/06, Fredrik Karlsson [EMAIL PROTECTED] wrote: Dear list, Is it possible to have lattice provide two different ordinate axis, one on the right side and one of the left side of the plot? Not easily. I hope to add some support for this sort of thing in the future, but I'm not sure when. Currently, you can add a second axis as part of the panel function (using e.g. panel.axis), but you will have to adjust the space for it manually. You will also need to turn off clipping. -Deepayan I would like to make a xyplot with x both in a linear scale and in a log-approximate scale. Thank you in advance. /Fredrik __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- http://www.stat.wisc.edu/~deepayan/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] As.Factor with Logistic Regression
I am modeling the probability of player succeeding in the NFL with a binomial logistic regression with 1 signifying success and 0 signifying no success. I performed the regression of the binomial variable against overall draft position using the college conference for which each player played as a factor using the as.factor(Conference) command. My question is: How do I plot specific factors against the curve for the entire set. There are only a few factors that have significant coefficients and I would like to plot those against the logistic curve for the entire set. Any help would be greatly appreciated. -- Justin Rapp 409 S. 22nd St. Apt. 1 Philadelphia, PA 19146 Cell:(267)252.0297 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] glm beta hypothesis testing
Thanks for the insight on the debugger. It has taken me a day or so to get use to it, but it is very helpful. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ben Bolker Sent: Tuesday, June 20, 2006 12:56 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] glm beta hypothesis testing Davis, Jacob B. JBDavis at txfb-ins.com writes: In summary.glm I'm trying to get a better feel for the z output. The following lines can be found in the function [snip] digging through the function is good: debugging your way through the function is sometimes even better. examples(glm,local=TRUE) ## run glm examples and get ## results left in local workspace) ls() debug(summary.glm) summary(glm.D93) shows ... p is object rank (~ number of parameters) Qr$pivot gives the order in which the parameters have been rearranged to solve the model, so Qr$pivot[p1] gives the rearranged order of the coefficients. We need this rearranged order because we're going to extract the unscaled covariance matrix by solving the inverse QR matrix, which is in the pivoted (rearranged) order. The null hypothesis for any particular contrast in glm is that the parameter is 0, so the estimates of the coefficients (object$coefficients) *are* the distance from the null hypothesis. Ben Bolker __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] As.Factor with Logistic Regression
Justin Rapp wrote: I am modeling the probability of player succeeding in the NFL with a binomial logistic regression with 1 signifying success and 0 signifying no success. I performed the regression of the binomial variable against overall draft position using the college conference for which each player played as a factor using the as.factor(Conference) command. My question is: How do I plot specific factors against the curve for the entire set. There are only a few factors that have significant coefficients and I would like to plot those against the logistic curve for the entire set. Any help would be greatly appreciated. It will bias the analysis to omit insignificant factors. To get the plots you want: library(Hmisc); library(Design) dd - datadist(mydata); options(datadist='dd') f - lrm(y ~ x1+x2+) plot(f, x1=NA, fun=plogis) plot(f, x2=NA, fun=plogis) plot(f, fun=plogis) # plot partial effects of all predictors, probability scale -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] GLM plotting estimated responses
I have recently fit a binomial link = logit model and I want to visually see the estimated response across a specific covariate. I can accomplish this by creating a function that extracts the betas from the model then forms a linear combination with the classes I'm interested in next pass the value through the inverse logit function, do this for each class in a covariate, then plot the results. My question is: Is there a built in function that will do all of this for me, where all I have to do is supply the classes for each covariate I'm interested in and it plots for me? On a similar note is there a quick way to plot the study size across a covariate from model results, without creating my own functions? Thanks in advance, ~~ Jacob Davis Actuarial Analyst [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] programming advice
Dear R users I want to compute Kendall's Tau between two vectors x and y. But x and y may have zeros in the same position(s) and I wrote the following function to be sure to drop out those double zeros cor.kendall - function(x,y) { nox - c() noy - c() # for (i in 1:length(x)) if (x[i]!= 0 | y[i] != 0) nox[length(nox)+1]- x[i] for (i in 1:length(y)) if (x[i]!= 0 | y[i] != 0) noy[length(noy)+1]- y[i] # res.kendall - cor.test(nox,noy,method = kendall,exact=F) return(list(x=nox,y=noy,res.kendall,length(nox))) } Do you know a more elegant way to achieve the same goal ? (I'm sure you do : it's a newbie's program actually) -- Fred __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Basic package structure question
At the encouragement of many at UseR, I'm trying to build my first real package. I have no C/Fortran code, just plain old R code, so it should be rocket science. On a Linux box, I used package.skeleton() to create a basic package containing just one hello world type of function. I edited the DESCRIPTION file, changin the package name appropriately. I edited the hello.Rd file. Upon running R CMD check hello, the only warning had to do with the fact that src/ was empty (obviously I had no source in such a simple package). I doubt this is a problem. I was able to install and use the package successfully on the Linux system from the .tar.gz file, so far so good! Next, on to Windows, where the problem arose: I created a zip file from inside the package directory: zip -r ../hello.zip ./* When I moved this to my Windows machine and tried to install the package, I received the following error: utils:::menuInstallLocal() Error in unpackPkg(pkgs[i], pkgnames[i], lib, installWithVers) : malformed bundle DESCRIPTION file, no Contains field I only found one mention of this in my Google search, with no reply to the thread. The Contains field appears to be used for bundles, but I'm trying to create a package, not a bundle. This leads me to believe that a simple zipping of the package directory structure is not the correct format for Windows. Needless to say, there appears to be wide agreement that making packages requires precision, but fundamentally a package should (as described in the documentation) just be a collection of files and folders organized a certain way. If someone could point me to documentation I may have missed that explains this, I would be grateful. Regards, Jay -- John W. Emerson (Jay) Assistant Professor of Statistics Yale University http://www.stat.yale.edu/~jay [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Why different results with different initial values for MLE (optim)!
Hi, All: I used optim() to minimise likelihood function for fitting the data to a partiuclar distribution. The function is converged and the value of log-likelihood is different when I change the intial value. Whether it means the program does not work well? Thanks! Xin [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Bayesian Networks with deal
Since I haven't seen a reply to this, I will offer a couple of comments. I've never used deal, but it sounded interesting, so I thought I'd look at it. Have you looked at Susanne G. Bøttcher and Claus Dethlefsen. deal: A Package for Learning Bayesian Networks. Journal of Statistical Software, 8(20), 2003, and the deal reference manual downloadable under documentation from www.math.aau.dk/~dethlef/novo/deal? If yes and you still would like more help from this listserve, please submit another post including a simple, self-contained example explaining something you've tried and why it doesn't seem to answer your question? (This is suggested in the posting guide! 'www.R-project.org/posting-guide.html'.) This documentation might answer your questions. Even though I've not read them, I will guess potential answers to your two questions, hoping some other reader may disabuse us both of our ignorance: From what I saw in the examples, I would guess that deal supports two types of distributions: normal and finite (discrete). If so, it does NOT support a Poisson. If it were my problem and I still held that view after reviewing this documentation, I might write to the maintainer [listed with help(package=deal)] and ask him for suggestions. Then if it were sufficiently important, I might think about how I would modify the code to allow for a Poisson. Regarding simulations, have you looked at rnetwork, which provides simulation of data sets with a given dependency structure? Hope this helps, Spencer Graves Carsten Steinhoff wrote: Hello, I want to use R to model a bayesian belief network of frequencies for system failiures in various departments of a company. For the nodes I want to use a poisson-distribution parameterized with expert knowledge (e.g. with a gamma prior). Then I want to fill in learning-data to improve the initial estimates and get some information about possible connections. Later I want to simulate dependend random variables from the network I tryed to use the package deal for that task, which is as far as I know the best (and only?) R-package for that task. But a few questions rose that I could not solve with the documentation: (1) Is it possible to parameterize the prior distribution (for example (dpois(x,lambda=60) directly and non gaussian ? (2) If I've chosen a structure, can I simulate dependend states that are non gausian distributed? Thank you for any idea! Regards, Carsten [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Reading in a table with ISO-latin1 encoding in MacOS-X (Intel)
Dear colleagues, With the help of a colleague of mine here in Helsinki (Seppo Nyrkkö) who looked at innards of the R source code for Mac it turned out that this was indeed an issue concerning the Mac locale and its settings and not R. Though we had tried this earlier by changing the LANG variable to 'fi_FI', we hadn't looked hard enough in the available encodings (with locale -a) to select the exactly correct value, being: LANG=fi_FI.IS08859-1; export LANG; With this configuration R was able to happily read in my original table with the Scandinavian characters in the header, without no fuss. Thanks for your advice, and wishing all a good Midsummer, -Antti Arppe On Thu, 8 Jun 2006, Peter Dalgaard wrote: so one can only guess that you have a local or Mac-specific setup issue. On Thu, 8 Jun 2006, Prof Brian Ripley wrote: If so, you need to investigate the locale in use, as which letters are valid depends on the locale: on Linux UTF-8 locales all letters in all languages are valid in R names, but that is not necessarily the MacOS interpretation. (Invalid characters in names will be converted to ., and if the locale is wrong so may be the interpretation of bytes as characters.) You might find more informed help on the r-sig-mac list. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Package structure question (progress!)
Thanks to Jeff Laake for giving me a starting point. It appears my zipping should have been: zip -r hello.zip hello essentially creating an outer folder having the package name inside the zip file. It still didn't quite work, with an error as follows: utils:::menuInstallLocal() updating HTML package descriptions Warning message: no package 'file678418be' was found in: packageDescription(i, lib.loc = lib, field = Title, encoding = UTF-8) The warning doesn't mean anything to me, but I suspect that because I zipped the linux files, it is possible that the unix end-of-line difference might pose a problem. I'll try doing the obvious conversion and will report back. As I wrote to Jeff, I'll be happy to document this experience once I understand it. For those of us interested in spreading R source with no C/Fortran (e.g. packages in their simplest form), this packaging should be fairly simple from first principles without having to rely on special tools. Regards, Jay __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Why different results with different initial values for MLE(optim)!
It is difficult to say much since you haven't described your likelihood function. However, there are some general things that you can do: (1) make sure that you are maximizing, by defining your objective function as the negative of likelihood, (2) use log-likelihood rather than the likelihood function, (3) look at the value of the objective function at different converged points to see whether you truly have multiple local maxima or if the likelihood is very flat near the optimum, (4) look at convergence criterion from the output to see whether you have true convergence, and (5) try different optimization techniques in optim or try fitdistr function in MASS library. Ravi. -- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -- -Original Message- From: [EMAIL PROTECTED] [mailto:r-help- [EMAIL PROTECTED] On Behalf Of Xin Sent: Thursday, June 22, 2006 12:10 PM To: r-help@stat.math.ethz.ch Subject: [R] Why different results with different initial values for MLE(optim)! Hi, All: I used optim() to minimise likelihood function for fitting the data to a partiuclar distribution. The function is converged and the value of log- likelihood is different when I change the intial value. Whether it means the program does not work well? Thanks! Xin [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] useR! Thanks
Dear All, I would like to totally echo Frank Harrell's appreciation of both UseR! and Vienna. I had a wonderful time despite being jet-lagged for the whole time! I was also wondering whether the minutes of the final session on getting credit for contributions to computational statistics software could be made available to the participants. Best, Ravi. -- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -- -Original Message- From: [EMAIL PROTECTED] [mailto:r-help- [EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr Sent: Monday, June 19, 2006 6:14 AM To: RHELP Subject: [R] useR! Thanks After attending my first useR! conference I want to thank the organizers for doing a wonderful job and the presenters for their high quality presentations and stimulating ideas. The conference venue was excellent and of course Vienna is one of the greatest cities in the world to visit. useR! is one of the most fun conferences I've attended. Thanks again! -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting- guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] QUERY: correlation between two time series
I am writing on the behalf of a colleague. She needs to evaluate the correlation between two time series. How can this be performed through R? (how many types of correlations can be performed and which are the commands to do it?) thank you for your help and attention Stefano __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] High breakdown/efficiency statistics -- was RE: Rosner's test [Broadcast]
What would be nice is to have something like a robust task view... Andy From: Berton Gunter Many thanks for this Martin. There now are several packages with what appear to be overlapping functions (or at least algorithms). Besides those you mentioned, robust and roblm are at least two others. Any recommendations about how or whether to choose among these for us enthusiastic but non-expert users? Cheers, Bert -Original Message- From: Martin Maechler [mailto:[EMAIL PROTECTED] Sent: Thursday, June 22, 2006 2:04 AM To: Berton Gunter Cc: 'Robert Powell'; r-help@stat.math.ethz.ch Subject: Re: [R] Rosner's test BertG == Berton Gunter [EMAIL PROTECTED] on Tue, 13 Jun 2006 14:34:48 -0700 writes: BertG RSiteSearch('Rosner') ?RSiteSearch or search directly BertG from CRAN. BertG Incidentally, I'll repeat what I've said BertG before. Don't do outlier tests. They're BertG dangerous. Use robust methods instead. Yes, yes, yes!!! Note that rlm() or cov.rob() from recommended package MASS will most probably be sufficient for your needs. For slightly newer methodology, look at package 'robustbase', or also 'rrcov'. Martin Maechler, ETH Zurich BertG -- Bert Gunter Genentech Non-Clinical Statistics BertG South San Francisco, CA BertG The business of the statistician is to catalyze the BertG scientific learning process. - George E. P. Box __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] useR! Thanks
Achim Zeileis wrote: [...] The plan is to provide the original presentation slides of the panelists and a video of the whole panel disc, and probably some minutes and/or further information. Did you happen to video Ivan Mizera's talk as well? I'd really like to see that again! -- Jeffrey Horner Computer Systems Analyst School of Medicine 615-322-8606 Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] useR! Thanks
On Thu, 2006-06-22 at 15:40 -0500, Jeffrey Horner wrote: Achim Zeileis wrote: [...] The plan is to provide the original presentation slides of the panelists and a video of the whole panel disc, and probably some minutes and/or further information. Did you happen to video Ivan Mizera's talk as well? I'd really like to see that again! Hear! Hear! A most enjoyable presentation! I now suffer from one newly engendered fear: Becoming an abuseR! ;-) Thanks to the Organizers, Hosts and Presenters for another wonderful meeting. It is great to see the continued healthy growth and diversity in this community. I look forward to the next one. Best regards, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] QUERY: correlation between two time series
Stefano Sofia stefano.sofia at regione.marche.it writes: I am writing on the behalf of a colleague. She needs to evaluate the correlation between two time series. How can this be performed through R? (how many types of correlations can be performed and which are the commands to do it?) This question is too broad, and the first answer is the usual please read the posting guide/documentation/etc. -- as a start, try ?cor help.search(correlation) help.search(time series) RSiteSearch( etc. ) taking a wild guess at your friend's statistical issue, she might needs to estimate cross-correlations between the series in the presence of autocorrelation in each time series, which is typically done using a multiple ARMA model: http://finzi.psych.upenn.edu/R/Rhelp02a.new/archive/24612.html suggests that the dse bundle has such capabilities. This list is *not* a general-purpose statistics query list, but if your friend (or you) submitted a much more careful description of the problem, and showed more evidence of (1) having a good idea of the general statistical background of the problem (e.g. I think model X would be appropriate based on reading the literature in my area but I can't figure out whether R can do it or everyone in my field uses model X but it's not appropriate in this case because ...) and (2) having tried to use the existing R documentation (RSiteSearch, etc.) to find the answers to your questions --- then you'd be much more likely to get a useful response, even if it was not strictly an R question. Ben Bolker __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] xyplot, type=b
For those who are interested, here is a solution using grid. This preserves some of the original spirit of do_plot_xy, but is more satisfactory when x and y are on different scales: my.panel - function(x, y, cex=1, ...){ require(grid) panel.xyplot(x, y, cex = cex, ...) a - 0.5 * cex * convertWidth(unit(1, char), native, valueOnly=TRUE) b - 0.5 * cex * convertHeight(unit(1, char), native,valueOnly=TRUE) for(i in 1:length(x)){ dx - x[i] - x[i - 1] dy - y[i] - y[i - 1] f - 1 / sqrt((dx/a)^2 + (dy/b)^2) if((i 1) identical(f 0.5, TRUE)){ panel.lines(x = c(x[i - 1] + f * dx, x[i] - f * dx), y = c(y[i - 1] + f * dy, y[i] - f * dy)) } } } # example n - 50 dat - data.frame(x = runif(n), y = runif(n)) dat - dat[order(dat$x), ] myplot - xyplot(y~x, data = dat, pch=., panel = my.panel) print(myplot) (It's a little slow due to looping panel.lines.) Ben Benjamin Tyner wrote: Unfortunately this is not quite right; to do it correctly it seems one has to address two problems: 1. how to get the size of the default 'cex' to use for 'd' (do_plot_xy uses 'GConvertYUnits' to accomplish this) 2. figure out how to achieve the same effect as what 'GConvert(xx, yy, USER, INCHES, dd)' does in do_plot_xy. Otherwise, the gap sizes are not constant. (1) sounds easy but I don't know the answer offhand. (2) seems more subtle. Any suggestions would be greatly appreciated. Ben __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Effect size in mixed models
I just learned that my earlier suggestion was wrong. It's better to compute the variance of the predicted or fitted values and compare those with the estimated variance components. To see how to do this, consider the following minor modification of an example in the lme documentation: fm1. - lme(distance ~ age, data = Orthodont, random=~1) fm2. - lme(distance ~ age + Sex, data = Orthodont, random = ~ 1) # str(fm1.) suggested the following: var(fm2.$fitted[, fixed]-fm1.$fitted[, fixed]) [1] 1.312756 VarCorr(fm1.)[, 1] (Intercept)Residual 4.472056 2.049456 VarCorr(fm2.)[, 1] (Intercept)Residual 3.266784 2.049456 In this example, the subject variance without considering Sex was 4.47 but with Sex in the model, it dropped to 3.27, while the Residual variance remained unchanged at 2.05. The difference between fm2.$fitted[, fixed] and fm1.$fitted[, fixed] is the change in the predictions generated by the addition of Sex to the model. The variance of that difference was 1.31. Note that 3.27 + 1.31 = 4.58, which is moderately close to 4.47. In sum, I think we can get a reasonable estimate of the size of an effect from the variance of the differences in the fixed portion of the fitted model. Comments? Hope this helps. Spencer Graves Spencer Graves wrote: You have asked a great question: It would indeed be useful to compare the relative magnitude of fixed and random effects, e.g. to prioritize efforts to better understand and possibly manage processes being studied. I will offer some thoughts on this, and I hope if there are errors in my logic or if someone else has a better idea, we will both benefit from their comments. The ideal might be an estimate of something like a mean square for a particular effect to compare with an estimated variance component. Such mean squares were a mandatory component of any analysis of variance table prior to the (a) popularization of generalized linear models and (b) availability of software that made it feasible to compute maximum likelihood estimates routinely for unbalanced, mixed-effects models. However, anova(lme(...)) such mean squares are for most purposes unnecessary cluster in a modern anova table. To estimate a mean square for a fixed effect, consider the following log(likelihood) for a mixed-effects model: lglk = (-0.5)*(n*log(2*pi*var.e)-log(det(W)) + t(y-X%*%b)%*%W%*%(y-X%*%b)/var.e), where n = the number of observations, b = the fixed-effect parameter variance, and the covariance matrix of the residuals, after integrating out the random effects is var.e*solve(W). In this formulation, the matrix W is a function of the variance components. Since they are not needed to compute the desired mean squares, they are suppressed in the notation here. Then, the maximum likelihood estimate of var.e = SSR/n, where SSR = t(y-X%*%b)%*%W%*%(y-X%*%b). Then mle.lglk = (-0.5)*(n*(log(2*pi*SSR/n)-1)-log(det(W))). Now let SSR0 = this generalized sum of squares of residuals (SSR) without effect 1, and SSR1 = this generalized SSR with this effect 1. If I've done my math correctly, then D = deviance = 2*log(likelihood ratio) = (n*log(SSR0/SSR1)+log(det(W1)/det(W0))) For roughly half a century, a major part of the analysis of variance was the Pythagorean idea that the sum of squares under H0 was the sum of squares under H1 plus the sum of squares for effect 1: SSR0 = SS1 + SSR1. Whence, exp((D/n)-log(det(W1)/det(W0))) = 1+SS1/SSR1. Thus, SS1 = SSR1*(exp((D/n)-log(det(W1)/det(W0)))-1). If the difference between deg(W1) and det(W0) can be ignored, we get: SS1 = SSR1*(exp((D/n)-1). Now compute MS1 = SS1/df1, and compare with the variance components. If there is a flaw in this logic, I hope someone will disabuse me of it. If this seems too terse or convoluted to follow, please provide a simple, self-contained example, as suggested in the posting guide! www.R-project.org/posting-guide.html. You asked a theoretical question, you got a theoretical answer. If you want a concrete answer, it might help to pose a more concrete question. Hope this helps. Spencer Graves Bruno L. Giordano wrote: Hello, Is there a way to compare the relative relevance of fixed and random effects in mixed models? I have in mind measures of effect size in ANOVAs, and would like to obtain similar information with mixed models. Are there information criteria that allow to compare the relevance of each of the effects in a mixed model to the overall fit? Thank you, Bruno __ R-help@stat.math.ethz.ch mailing list
Re: [R] rank(x,y)?
Gabor Grothendieck wrote: On 6/21/06, Duncan Murdoch [EMAIL PROTECTED] wrote: Peter Dalgaard wrote: Duncan Murdoch [EMAIL PROTECTED] writes: Suppose I have two columns, x,y. I can use order(x,y) to calculate a permutation that puts them into increasing order of x, with ties broken by y. I'd like instead to calculate the rank of each pair under the same ordering, but the rank() function doesn't take multiple values as input. Is there a simple way to get what I want? E.g. x - c(1,2,3,4,1,2,3,4) y - c(1,2,3,1,2,3,1,2) rank(x+y/10) [1] 1 3 6 7 2 4 5 8 gives me the answer I want, but only because I know the range of y and the size of gaps in the x values. What do I do in general? Still not quite general, but in the absence of ties: z[order(x,y)]-1:8 z [1] 1 3 6 7 2 4 5 8 Thanks to all who have replied. Unfortunately for me, ties do exist, and I'd like them to get identical ranks. John Fox's suggestion would handle ties properly, but I'm worried about rounding error giving spurious ties. Try this variant of my prior solution: (order(order(x,y)) + rev(order(order(rev(x), rev(y)/2 Note that no arithmetic is done on the original data, only on the output of order, so there should not be any worry about rounding -- in fact its sufficiently general that the data do not have to be numeric, e.g. x - c(a, a, b, a, c, d) y - c(b, a, b, b, a, a) (order(order(x,y)) + rev(order(order(rev(x), rev(y)/2 [1] 2.5 1.0 4.0 2.5 5.0 6.0 This is a very nice solution, thanks! So now we have equivalents to ties=average and first; ties=random would be easy. I wonder if it's worth working out ties=max and ties=min and putting in a new function? Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] hurdle and zip model
This is more of a stats question, but since I'm using R. Can someone tell me if the standard errors produced by hurdle(), zicounts(), poisson, and the negative binomial formulations of three are directly comparable? Why or why not? Thank you, Jeff [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Basic package structure question
On 6/22/06, Duncan Murdoch [EMAIL PROTECTED] wrote: Jay Emerson wrote: At the encouragement of many at UseR, I'm trying to build my first real package. I have no C/Fortran code, just plain old R code, so it should be rocket science. On a Linux box, I used package.skeleton() to create a basic package containing just one hello world type of function. I edited the DESCRIPTION file, changin the package name appropriately. I edited the hello.Rd file. Upon running R CMD check hello, the only warning had to do with the fact that src/ was empty (obviously I had no source in such a simple package). I doubt this is a problem. I was able to install and use the package successfully on the Linux system from the .tar.gz file, so far so good! Next, on to Windows, where the problem arose: I created a zip file from inside the package directory: zip -r ../hello.zip ./* Which package directory, the source or the installed copy? I think this might work in the installed copy, but would not work on the source. It's not the documented way to build a binary zip, though. When I moved this to my Windows machine and tried to install the package, I received the following error: utils:::menuInstallLocal() Error in unpackPkg(pkgs[i], pkgnames[i], lib, installWithVers) : malformed bundle DESCRIPTION file, no Contains field I only found one mention of this in my Google search, with no reply to the thread. The Contains field appears to be used for bundles, but I'm trying to create a package, not a bundle. This leads me to believe that a simple zipping of the package directory structure is not the correct format for Windows. Needless to say, there appears to be wide agreement that making packages requires precision, but fundamentally a package should (as described in the documentation) just be a collection of files and folders organized a certain way. If someone could point me to documentation I may have missed that explains this, I would be grateful. I think the organized in a certain way part is actually important. Using R CMD install --build is the documented way to achieve this. It's not trivial to do this on Windows, because you need to set up a build environment first, but it's not horribly difficult. Duncan Murdoch Regards, Jay One idea that occurred to me in reading this would be to have a server that one can send a package to and get back a Windows build to eliminate having to set up a development environment. Not sure if this is feasible, particularly security aspects, but if it were it would open up package building on Windows to a larger audience. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] eacf
RSiteSearch(EACF) led me to http://finzi.psych.upenn.edu/R/Rhelp02a.new/archive/43927.html;. In this note from Oct. 2004, Brian Ripley said the EACF did not exist but 'acf' and 'pacf' did. Moreover, he said that model selection using acf and pacf is a rather old-fashioned idea. Just fit all the target models and compare their performance. What are you trying to do? If model time series, I suggest you follow Prof. Ripley's advice. If you are conducting research and you want to compare the EACF with something else, I doubt if it would be too hard to program in R (though I could neither recall nor find the formulae involved in the few minutes I spent just now looking). If you try to write an 'eacf' function and get stuck, please submit another post describing what you've tried and what does not seem to work (as suggested in the posting guide! www.R-project.org/posting-guide.html). Hope this helps. Spencer Graves p.s. Your English is just fine. Michel Helcias wrote: I am Brazilian and I don't know how to speak English, for that I apologize for my writing. I'd like of informations about the extended (sample) autocorrelation function (*EACF*) for the time series. where to acquire some related command. thak you. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] xyplot, type=b
Hi Benjamin Tyner wrote: For those who are interested, here is a solution using grid. This preserves some of the original spirit of do_plot_xy, but is more satisfactory when x and y are on different scales: my.panel - function(x, y, cex=1, ...){ require(grid) panel.xyplot(x, y, cex = cex, ...) a - 0.5 * cex * convertWidth(unit(1, char), native, valueOnly=TRUE) b - 0.5 * cex * convertHeight(unit(1, char), native,valueOnly=TRUE) for(i in 1:length(x)){ dx - x[i] - x[i - 1] dy - y[i] - y[i - 1] f - 1 / sqrt((dx/a)^2 + (dy/b)^2) if((i 1) identical(f 0.5, TRUE)){ panel.lines(x = c(x[i - 1] + f * dx, x[i] - f * dx), y = c(y[i - 1] + f * dy, y[i] - f * dy)) } } } # example n - 50 dat - data.frame(x = runif(n), y = runif(n)) dat - dat[order(dat$x), ] myplot - xyplot(y~x, data = dat, pch=., panel = my.panel) print(myplot) (It's a little slow due to looping panel.lines.) Here's one way to vectorize the lines ... my.panel2 - function(x, y, cex=1, ...){ panel.xyplot(x, y, cex = cex, ...) a - 0.5 * cex * convertWidth(unit(1, char), native, valueOnly=TRUE) b - 0.5 * cex * convertHeight(unit(1, char), native,valueOnly=TRUE) dx - diff(x) dy - diff(y) f - 1 / sqrt((dx/a)^2 + (dy/b)^2) n - length(x) x1 - x[1:(n-1)] + f*dx x2 - x[2:n] - f*dx y1 - y[1:(n-1)] + f*dy y2 - y[2:n] - f*dy i - f 0.5 panel.segments(x1[i], y1[i], x2[i], y2[i]) } Paul Benjamin Tyner wrote: Unfortunately this is not quite right; to do it correctly it seems one has to address two problems: 1. how to get the size of the default 'cex' to use for 'd' (do_plot_xy uses 'GConvertYUnits' to accomplish this) 2. figure out how to achieve the same effect as what 'GConvert(xx, yy, USER, INCHES, dd)' does in do_plot_xy. Otherwise, the gap sizes are not constant. (1) sounds easy but I don't know the answer offhand. (2) seems more subtle. Any suggestions would be greatly appreciated. Ben __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] xyplot, type=b
Most clever! I'd completely forgotten about panel.segments. Paul Murrell wrote: Here's one way to vectorize the lines ... my.panel2 - function(x, y, cex=1, ...){ panel.xyplot(x, y, cex = cex, ...) a - 0.5 * cex * convertWidth(unit(1, char), native, valueOnly=TRUE) b - 0.5 * cex * convertHeight(unit(1, char), native,valueOnly=TRUE) dx - diff(x) dy - diff(y) f - 1 / sqrt((dx/a)^2 + (dy/b)^2) n - length(x) x1 - x[1:(n-1)] + f*dx x2 - x[2:n] - f*dx y1 - y[1:(n-1)] + f*dy y2 - y[2:n] - f*dy i - f 0.5 panel.segments(x1[i], y1[i], x2[i], y2[i]) } Paul __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] positioning of separate y-axis labels in xyplot
Hi Benjamin Tyner wrote: I like the functionality provided by outer=TRUE, but when it comes time to place separate xlabs or ylabs, I always end up 'eyeballing' it on a case-by-case basis. For example, ##begin example require(lattice) cars.lo - loess(dist ~ speed, cars) print(xyplot(cars.lo$residuals+cars.lo$fitted~cars.lo$x, strip=FALSE, outer=TRUE, layout=c(1,2), ylab=, scales=list(y=list(relation=free,rot=0)), panel=function(x,y,panel.number,...){ if(panel.number==1){ panel.xyplot(x,y) panel.abline(h=0) }else{ panel.xyplot(x,y=cars.lo$y) panel.xyplot(x,y,type=l) } })) require(grid) trellis.focus(panel, 1, 1, clip.off=TRUE, highlight=FALSE) grid.text(residuals, x=unit(0, npc) + unit(-2, lines),rot=90) trellis.focus(panel, 1, 2, clip.off=TRUE, highlight=FALSE) grid.text(fitted, x=unit(0, npc) + unit(-2, lines),rot=90) ## end example In this case, a distance of -2 lines happens to be enough, but one has to make the plot to know this. I'm interested in learning how one can place the ylabs without fear of overlapping the tick labels; i.e., how to use the exact space allocated by ylab=. I'm thinking it must involve viewports? There is a 'ylab' viewport set up by lattice, although it is just one big one for all panels. Here's what you could do for your example (with knowledge of how many rows of panels there are) ... downViewport(trellis.vpname(ylab)) # If you want to see where the viewport is ... # grid.rect(gp=gpar(col=red), name=temp) grid.text(residuals, y=0.25, rot=90) grid.text(fitted, y=0.75, rot=90) # If you want to remove the rect again ... # grid.remove(temp) upViewport(0) Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] frechet distance
RSiteSearch(Frechet distance) returned only one hit for me just now, and that was for a Frechet distribution, as you mentioned. Google found www.cs.concordia.ca/cccg/papers/39.pdf, which suggests that computing it may not be easy. If you'd like more help from this listserve, I can offer two suggestions: 1. If you absolutely need the Frechet distance and you can describe an algorithm for computing it but get stuck writing a function for it, please submit another question outlining what you've tried and that obstacle you've found. 2. Alternatively, you might describe the more general problem you are trying to solve, why you thought the Frechet distance might help and invite alternative suggestions. Hope this helps. Spencer Graves Rajarshi Guha wrote: Hi, is there any package (or source code snippet) that will evaluate the Frechet distance for curves represented as sets of points? Searching around only threw up references to a Frechet distribution. Thanks, --- Rajarshi Guha [EMAIL PROTECTED] http://jijo.cjb.net GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE --- If you believe in telekinesis, raise my hand. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Nested variance-covariance matrix in Multilevel model
What's your error message? I see a syntax error in random=list(Probe=pdBlocked(list(~1,pdCompSymm(~1 ),End=~1,ProbeNo=pdCompSymm(~1)) I count 5 open parens and 3 close parens. I know that might have been part of what you copied into this email and not what you gave R. However, it sounds like you have several problems. Have you seen Pinheiro and Bates (2000) Mixed-effects models in S and S-PLUS (Springer)? This is the premier (almost mandatory) reference for 'lme'. Moreover, the ~\library\nlme\scripts subdirectory of the R installation directory (at least under Windows) contains files ch01.R, ch02.R, ..., ch06.R, ch08.R containing essentially all the S commands used in the book, organized by chapter. This can save endless hassles with spelling problems (and minor syntax differences between S-Plus and R). Sections 1.5 and 1.6, pp. 40-72, discuss analyses if two different nested experiments. If just want a standard nested analysis, this should be adequate. If you have other special things happening that I missed in skimming your email, please explain why the analyses described in Pinheiro and Bates do you apply. In doing so, please include a simple, self-contained example showing what you tried and why you think it's not adequate. Please don't send us a megabyte of data. Instead, please either (a) modify an example from the book or some other R documentation or (b) generate phony data with a very few lines of R code. Hope this helps. Spencer Graves Tobias Guennel wrote: I completely forgot to supply the R code I tried: vov1i2-read.table(VOV1_INHIBITED6-16-2006-13h35min33sec.txt,header=TRUE) test.lme-lme(fixed=log2(Intensity)~End+logpgc,random=list(Probe=pdBlocked(list(~1,pdCompSymm(~1 ),End=~1,ProbeNo=pdCompSymm(~1)),data=vov1i2) It doesn't look right to me and it produces an error too. It should be something around those lines though. Tobias Guennel Tobias Guennel wrote: Dear R community, I have trouble implementing a nested variance-covariance matrix in the lme function. The model has two fixed effects called End and logpgc, the response variable is the logarithm to base 2 of Intensity ( log2(Intensity) ) and the random effects are called Probe and ProbeNo. The model has the following nesting structure: A Pixel is nested within the ProbeNo,the ProbeNo is within the ProbeEnd ( there are two ends for every probe), and the ProbeEnd is within the Probe. Now the problem I have is that the variance-covariance structure of the model is quite complex and I can not find the right syntax for fitting it in the lme function. The variance-covariance structure is a block diagonal matrix of the form, V1 0 0 V=0 V20 0 0V3 where V1...V3 are of the structure: v11 v12 V1= and so on. v21 v22 V1...V3 are assumed to have a compound symmetric variance-covariance structure and therefore the submatrices are of the form: Lambda Delta1 Delta1 ... Delta1 Delta1 LambdaDelta1 ... Delta1 v11=v22=... Delta1 . Lambda Delta2 Delta2 Delta2 ... Delta2 Delta2 Delta2 Delta2 ... Delta2 v12=v21=... Delta2 . Delta2 The elements of these submatrices depend only upon the three covariance parameters: the compound symmetry parameter delta; the variance of random effect sigma^2g; and the residual variance sigma^2. I have formulas for the submatrices Lambda,Delta1 and Delta2 which I can't really paste in here. The SAS code dealing with this model is the following: proc mixed data=rnadeg.pnau; title 'CV structure for PNAU'; class probepos probeno end probe pixelid newprobeid; model logPM=end logpgc / ddfm=satterth; random probeno newprobeid / subject=probe type=cs; lsmeans end / diff cl; run; Any ideas are appreciated a lot since I am kind of stuck at this point. Thank you Tobias Guennel __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Time series labeling with Zoo
Hi, I'm using zoo because it can automatically label the months of a time series composed of daily observations. This works well for certain time series lengths, but not for others, e.g.: While: library(zoo) plot(zoo(runif(10), as.Date(2005-06-01) + 0:50)) Shows up the months and day of month, plot(zoo(runif(10), as.Date(2005-06-01) + 0:380)) results in a single label 2006 in the middle of the x axis, with no months labelled, although there's plenty of room for labels. How do I get zoo to label the months too, without having to manually work out which day was the first day of each month? I'm using zoo 1.0-3 with R 2.2.1. Thanks, Gad -- Gad Abraham Department of Mathematics and Statistics University of Melbourne Parkville 3010, Victoria, Australia email: [EMAIL PROTECTED] web: http://www.ms.unimelb.edu.au/~gabraham __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] frechet distance
On Thu, 2006-06-22 at 19:52 -0700, Spencer Graves wrote: RSiteSearch(Frechet distance) returned only one hit for me just now, and that was for a Frechet distribution, as you mentioned. Google found www.cs.concordia.ca/cccg/papers/39.pdf, which suggests that computing it may not be easy. In addition, from what I have read it is supposed to be NP-hard. However my problems are usually small. 1. If you absolutely need the Frechet distance and you can describe an algorithm for computing it but get stuck writing a function for it, please submit another question outlining what you've tried and that obstacle you've found. 2. Alternatively, you might describe the more general problem you are trying to solve, why you thought the Frechet distance might help and invite alternative suggestions. I have some curves (in the form of points) which are sigmoidal. I also have some curves which look like 2 sigmoid curves joined head to tail. Something like - / / -- / / / -- Now these curves can vary: in some cases the initial lower tail might be truncated. For the 'stepwise' sigmoidal curves, the middle step might not be horizontal but inclined to some degree and so on. My goal is to be given a set of points representing a curve and try and identify whether it is of the standard sigmoid form or of the 'stepwise' sigmoid form. My plan was to generate a 'canonical sigmoid curve' via the logistic equation and then perform a curve matching operation. Thus for a supplied curve that is sigmoid in nature it will match the 'canonical curve' to a better extent than would a curve that is stepwise in nature. (The matching is performed after applying a Procrustes transformation) Initially I tried using the Hausdorff distance, but this does not take into account the ordering of the points in the curve and did not always give a conclusive answer. A number of references (including the one above) indicate that the Frechet distance is better suited for curve matching problems. As you noted, evaluating the Frechet distance is non-trivial and the only code that I could find was some code that is dependent on the CGAL (http://www.cgal.org/) library. As far as I could see, CGAL does not have any R bindings. An alternative that I had considered was to to evaluate the distance matrix of the points making up the curve and then evaluating the root mean square error of the matrix elements for the canonical curve and the supplied curve. My initial experiments indicated that this generally works but I observed some cases where a stepwise curve matched the canonical sigmoid better (ie lower RMSE) than an actual sigmoid curve. Another alternative is look at a graph of the first derivative of the curve. A standard sigmoidal curve will result in a graph with a single peak, a stepwise curve like above will result in a graph with 2 peaks. Thus this could be reduced to a peak picking problem. The problem is the curves I'll get are not smooth and can have small kinks - this leads to (usually) quite small peaks in the graph of the first derivative - but most of the code that has been described on this list for peak picking also picks them up, thus making identification of the curve ambiguous. To be honest I do not fully understand the algorithm used to evaluate the Frechet distance hence my request for code. However, I'm not fixated on the Frechet distance :) If there are simpler approaches I'm open to them. Thanks, --- Rajarshi Guha [EMAIL PROTECTED] http://jijo.cjb.net GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE --- Q: What do you get when you cross a mosquito with a mountain climber? A: Nothing. You can't cross a vector with a scaler. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] frechet distance
see inline Rajarshi Guha wrote: On Thu, 2006-06-22 at 19:52 -0700, Spencer Graves wrote: RSiteSearch(Frechet distance) returned only one hit for me just now, and that was for a Frechet distribution, as you mentioned. Google found www.cs.concordia.ca/cccg/papers/39.pdf, which suggests that computing it may not be easy. In addition, from what I have read it is supposed to be NP-hard. However my problems are usually small. 1. If you absolutely need the Frechet distance and you can describe an algorithm for computing it but get stuck writing a function for it, please submit another question outlining what you've tried and that obstacle you've found. 2. Alternatively, you might describe the more general problem you are trying to solve, why you thought the Frechet distance might help and invite alternative suggestions. I have some curves (in the form of points) which are sigmoidal. I also have some curves which look like 2 sigmoid curves joined head to tail. Something like - / / -- / / / -- If all curves are of this form, it suggests they are continuous and monotonically increasing. Is that correct? If yes, isn't the Frechet distance then just the max of the distances between the two curves at the end points? I don't see a complete proof yet, but it would seem that the difficulties in computing Frechet distances would come from functions that are not monotonic and / or discontinuous. This brings me back to my second question: What's the larger problem you want to solve, for which you think the Frechet distance might help? Hope this helps. Spencer Graves suppose we want Now these curves can vary: in some cases the initial lower tail might be truncated. For the 'stepwise' sigmoidal curves, the middle step might not be horizontal but inclined to some degree and so on. My goal is to be given a set of points representing a curve and try and identify whether it is of the standard sigmoid form or of the 'stepwise' sigmoid form. My plan was to generate a 'canonical sigmoid curve' via the logistic equation and then perform a curve matching operation. Thus for a supplied curve that is sigmoid in nature it will match the 'canonical curve' to a better extent than would a curve that is stepwise in nature. (The matching is performed after applying a Procrustes transformation) Initially I tried using the Hausdorff distance, but this does not take into account the ordering of the points in the curve and did not always give a conclusive answer. A number of references (including the one above) indicate that the Frechet distance is better suited for curve matching problems. As you noted, evaluating the Frechet distance is non-trivial and the only code that I could find was some code that is dependent on the CGAL (http://www.cgal.org/) library. As far as I could see, CGAL does not have any R bindings. An alternative that I had considered was to to evaluate the distance matrix of the points making up the curve and then evaluating the root mean square error of the matrix elements for the canonical curve and the supplied curve. My initial experiments indicated that this generally works but I observed some cases where a stepwise curve matched the canonical sigmoid better (ie lower RMSE) than an actual sigmoid curve. Another alternative is look at a graph of the first derivative of the curve. A standard sigmoidal curve will result in a graph with a single peak, a stepwise curve like above will result in a graph with 2 peaks. Thus this could be reduced to a peak picking problem. The problem is the curves I'll get are not smooth and can have small kinks - this leads to (usually) quite small peaks in the graph of the first derivative - but most of the code that has been described on this list for peak picking also picks them up, thus making identification of the curve ambiguous. To be honest I do not fully understand the algorithm used to evaluate the Frechet distance hence my request for code. However, I'm not fixated on the Frechet distance :) If there are simpler approaches I'm open to them. Thanks, --- Rajarshi Guha [EMAIL PROTECTED] http://jijo.cjb.net GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE --- Q: What do you get when you cross a mosquito with a mountain climber? A: Nothing. You can't cross a vector with a scaler. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
Re: [R] High breakdown/efficiency statistics -- was RE: Rosner's test [Broadcast]
AndyL == Liaw, Andy [EMAIL PROTECTED] on Thu, 22 Jun 2006 14:21:17 -0400 writes: AndyL What would be nice is to have something like a AndyL robust task view... Andy Indeed. I have volunteered to do this a while ago. First wanted to get robustbase to a reasonable state. A 'robust' CRAN task view is planned to be done by the end of the summer, hopefully even earlier. Martin AndyL From: Berton Gunter Many thanks for this Martin. There now are several packages with what appear to be overlapping functions (or at least algorithms). Besides those you mentioned, robust and roblm are at least two others. Any recommendations about how or whether to choose among these for us enthusiastic but non-expert users? Cheers, Bert -Original Message- From: Martin Maechler [mailto:[EMAIL PROTECTED] Sent: Thursday, June 22, 2006 2:04 AM To: Berton Gunter Cc: 'Robert Powell'; r-help@stat.math.ethz.ch Subject: Re: [R] Rosner's test BertG == Berton Gunter [EMAIL PROTECTED] on Tue, 13 Jun 2006 14:34:48 -0700 writes: BertG RSiteSearch('Rosner') ?RSiteSearch or search directly BertG from CRAN. BertG Incidentally, I'll repeat what I've said BertG before. Don't do outlier tests. They're BertG dangerous. Use robust methods instead. Yes, yes, yes!!! Note that rlm() or cov.rob() from recommended package MASS will most probably be sufficient for your needs. For slightly newer methodology, look at package 'robustbase', or also 'rrcov'. Martin Maechler, ETH Zurich BertG -- Bert Gunter Genentech Non-Clinical Statistics BertG South San Francisco, CA BertG The business of the statistician is to catalyze the BertG scientific learning process. - George E. P. Box __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html AndyL -- AndyL Notice: This e-mail message, together with any AndyL attachments, contains information of Merck Co., AndyL Inc. (One Merck Drive, Whitehouse Station, New AndyL Jersey, USA 08889), and/or its affiliates (which may AndyL be known outside the United States as Merck Frosst, AndyL Merck Sharp Dohme or MSD and in Japan, as Banyu) AndyL that may be confidential, proprietary copyrighted AndyL and/or legally privileged. It is intended solely for AndyL the use of the individual or entity named on this AndyL message. If you are not the intended recipient, and AndyL have received this message in error, please notify us AndyL immediately by reply e-mail and then delete it from AndyL your system. AndyL -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Time series labeling with Zoo
When the axis labelling does not work well you will have to do it yourself like this. The plot statement is instructed not to plot the axis and then we extract into tt all the dates which are day of the month 1. Then we manually draw the axis using those. library(zoo) set.seed(1) z - zoo(runif(10), as.Date(2005-06-01) + 0:380) plot(z, xaxt = n) tt - time(z)[as.POSIXlt(time(z))$mday == 1] axis(1, tt, format(tt, %d%b%y)) On 6/22/06, Gad Abraham [EMAIL PROTECTED] wrote: Hi, I'm using zoo because it can automatically label the months of a time series composed of daily observations. This works well for certain time series lengths, but not for others, e.g.: While: library(zoo) plot(zoo(runif(10), as.Date(2005-06-01) + 0:50)) Shows up the months and day of month, plot(zoo(runif(10), as.Date(2005-06-01) + 0:380)) results in a single label 2006 in the middle of the x axis, with no months labelled, although there's plenty of room for labels. How do I get zoo to label the months too, without having to manually work out which day was the first day of each month? I'm using zoo 1.0-3 with R 2.2.1. Thanks, Gad -- Gad Abraham Department of Mathematics and Statistics University of Melbourne Parkville 3010, Victoria, Australia email: [EMAIL PROTECTED] web: http://www.ms.unimelb.edu.au/~gabraham __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html