Re: [R] The hidden costs of GPL software? - None
Martin, what about setting up a new mailing list, R-hcgs (acronym for "R - The hidden costs of GPL software?")? It seems worthwhile given the number of messages in this thread(s). ;-)

Uwe

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] T-test syntax question
Hi,

You did not specify whether your data are paired or not; if they are paired, you should use the option paired=TRUE in t.test(). To use the pooled-variance t-test, the variances of the two samples must not be significantly different (see ?var.test); if they are, you should specify var.equal=FALSE. From ?t.test:

var.equal: a logical variable indicating whether to treat the two variances as being equal. If 'TRUE' then the pooled variance is used to estimate the variance, otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.

paired: a logical indicating whether you want a paired t-test. If 'paired' is 'TRUE' then both 'x' and 'y' must be specified and they must be the same length. Missing values are removed (in pairs if 'paired' is 'TRUE'). If 'var.equal' is 'TRUE' then the pooled estimate of the variance is used. By default, if 'var.equal' is 'FALSE' then the variance is estimated separately for both groups and the Welch modification to the degrees of freedom is used.

From the output you are sending, I understand that the variances of the two samples are significantly different (Welch Two Sample t-test) and that the Delta values are also significantly different from 0. See ?t.test.

Regards, Vito

You wrote: Hi. I'd like to do a t-test to compare the Delta values of items with Crit=1 with Delta values of items with Crit=0. What is the t.test syntax? It should produce a result like this below (I can't get in touch with the person who originally did this for me)

Welch Two Sample t-test
data: t1$Delta by Crit
t = -3.4105, df = 8.674, p-value = 0.008173
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -0.04506155 -0.00899827
sample estimates:
mean in group FALSE  mean in group TRUE
         0.03331391          0.06034382

Thanks.

= Diventare costruttori di soluzioni (Become builders of solutions). "The business of the statistician is to catalyze the scientific learning process." George E. P. Box. Visit the portal http://www.modugno.it/ and in particular the section on Palese http://www.modugno.it/archivio/palese/
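A minimal sketch of the kind of call that likely produced the quoted output. The poster's t1 data frame is not available here, so the one below (its values and sizes) is made up purely for illustration:

```r
# Hypothetical stand-in for the poster's t1 data frame (Delta, Crit)
set.seed(1)
t1 <- data.frame(Delta = c(rnorm(8, 0.03, 0.01), rnorm(8, 0.06, 0.02)),
                 Crit  = rep(c(FALSE, TRUE), each = 8))

# The formula interface splits Delta by the two levels of Crit;
# var.equal = FALSE is the default, so this is the Welch test
fit <- t.test(Delta ~ Crit, data = t1)
fit$method
```

With var.equal = TRUE the same call would give the pooled-variance two-sample test instead.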
[R] Ks.test (was (no subject))
Hi Angela,

I believe you should pass only the degrees of freedom as a parameter; the t distribution is centred at 0 by default. See this example:

x <- rt(100, 10)
ks.test(x, pt, 10)

One-sample Kolmogorov-Smirnov test
data: x
D = 0.1414, p-value = 0.03671
alternative hypothesis: two.sided

Ciao, Vito

you wrote: Good morning, I have to apply the KS test with the t distribution. I know I have to write ks.test(data_name, distribution_name, parameters...) but I don't know what the name for the t distribution is and which parameters to introduce; maybe mean=0 and degrees of freedom in my case? Thank you for helping me. Angela Re
Re: [R] (no subject)
[EMAIL PROTECTED] wrote: Good morning, I have to apply the KS test with the t distribution. I know I have to write ks.test(data_name, distribution_name, parameters...) but I don't know what the name for the t distribution is and which parameters to introduce; maybe mean=0 and degrees of freedom in my case?

For example:

ks.test(x, pt, df = 4)

See ?ks.test and ?pt

Uwe Ligges

Thank you for helping me. Angela Re
[R] Is there a package in R that lets me fit a robust linear mixed model?
Dear R people, Happy Thanksgiving! I just wonder if there is an R package that supplies some kind of robust way to fit a linear mixed model; I mean assigning small weights to those observations with large residuals, as in the iteratively-reweighted-least-squares approach. Many thanks, Frank
Re: [R] The hidden costs of GPL software? - None
On 24-Nov-04 John wrote: Off hand, the costs of GPL'd software are not hidden at all. R, for instance, demands that a would-be user sit down and learn the language. This in turn pushes a user into learning more about statistics than the simple overview that Stat 1 presents a student.

I'd see this as less a cost than a benefit!

In contrast, any program that simplifies use also tends to encourage a simplified understanding.

Agreed!

So, I believe it can be legitimately argued that the real hidden costs lurk in easy-to-use software, especially commercial software with GUI interfaces.

Well put; though it's not obvious whom these costs fall on. The people who actually use the easy-to-use software, or the organisations that employ them, can all too often get away with sloppy or invalid analysis. It may often be the consumer of their results, or of products based on them, who ultimately loses.

Best wishes to all, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 24-Nov-04 Time: 09:21:35 -- XFMail --
[R] Posting to 2 mailing lists {was How to extract data?}
You posted the referenced message to both R-help and R-sig-finance.

Please do *not* post to more than one R-list! Please do *not* post to more than one R-list! Please do *not* post to more than one R-list!

Please decide if something is specific for an R-SIG-foo list or if it belongs to R-help (or R-devel), and post to one and only one mailing list! We had another mess recently with this, caused by a posting to both R-help and R-sig-gui (and I think another one where even 3 mailing lists were affected). The whole thing is a particular impoliteness to all those people -- often the nice helpers! -- who are subscribed to more than one of the implied lists. Choosing one list, the discussion thread will be archived/seen/read consistently, both in the server archives and in people's mail/news boxes.

- If the thread should be *diverted* to another list, there could be *one* overlap message (posted to both), where the move should be announced.
- If you deem it relevant, you can still alert the readers of one list to a hot topic on another list, e.g., by posting a URL to the starting message in an (online) archive.

Martin Maechler, ETH Zurich

PS: Of course, I've been tempted for a moment to post this to all R mailing lists ;-) :-)
[R] problem with anova and glmmPQL
Hello: I am getting an error message when applying anova() to two models estimated using glmmPQL. I did look through the archives but didn't find anything relevant to my problem. The R code and results follow. Hope someone can help. ANDREW

fm1 <- glmmPQL(choice ~ day + stereotypy, random = ~ 1 | bear, data = learning, family = binomial)
iteration 1
iteration 2
iteration 3
iteration 4

fm2 <- glmmPQL(choice ~ day + envir + stereotypy, random = ~ 1 | bear, data = learning, family = binomial)
iteration 1
iteration 2
iteration 3
iteration 4

anova(fm1)
            numDF denDF   F-value p-value
(Intercept)     1  2032   7.95709  0.0048
day             1  2032 213.98391  <.0001
stereotypy      1  2032   0.42810  0.5130

anova(fm2)
            numDF denDF   F-value p-value
(Intercept)     1  2031   5.70343  0.0170
day             1  2031 213.21673  <.0001
envir           1  2031  12.50388  0.0004
stereotypy      1  2031   0.27256  0.6017

anova(fm1, fm2)
Error in anova.lme(fm1, fm2) : Objects must inherit from classes "gls", "gnls", "lm", "lmList", "lme", "nlme", "nlsList", or "nls"

version: platform i586-mandrake-linux-gnu, arch i586, os linux-gnu, system i586, linux-gnu, status, major 2, minor 0.0, year 2004, month 10, day 04, language R

-- Andrew R. Criswell, Ph.D. Graduate School, Bangkok University mailto:[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
[R] LDA with previous PCA for dimensionality reduction
Dear all, not really an R question, but: if I want to check the classification accuracy of an LDA with a previous PCA for dimensionality reduction by means of the LOOCV method, is it OK to do the PCA on the WHOLE dataset ONCE and then run the LDA with the CV option set to TRUE (which runs LOOCV) -- OR -- do I need
- to compute a PCA for each 'training bag' (the n-1 retained observations) (-> my.princomp.1),
- then run the LDA on the training-bag scores (-> my.lda.1),
- then compute the scores of the left-out observation using my.princomp.1 (-> my.scores.2),
- and only then use predict.lda(my.lda.1, my.scores.2) on the scores of the left-out observation?

I read some articles where they chose the first procedure, but I am not sure if this is really correct.

Many thanks for a hint, Christoph
[R] Grumble ...
Hi Folks,

A Grumble ... The message I just sent to R-help about "The hidden costs of GPL ..." has evoked a "Challenge" response:

  Hi, You've just sent a message to [EMAIL PROTECTED]. In order to confirm the sent message, please click here. This confirmation is necessary because [EMAIL PROTECTED] uses Antispam UOL, a service that avoids unwanted messages like advertising, pornography, viruses, and spam. Other messages sent to [EMAIL PROTECTED] won't need to be confirmed*. *If you receive another confirmation request, please ask [EMAIL PROTECTED] to include you in his/her authorized e-mail list.

I won't be responding to this. Let the recipient simply not receive the mail. Of no great importance in this case, but a disadvantage to the recipient in the long run.

I disapprove strongly of this mechanism, and want to oppose it. There must be a few thousand subscribers to R-help. If the Challenge mechanism became widespread, then I would receive thousands of such messages. Rather than respond to all these, I would quit the list (and of course probably many others). The Challenge mechanism would destroy the mailing-list community if it became widely adopted.

One reason I am posting this grumble to R-help is in the hope that I get a challenge to this one too. In that case, once and for all, I shall respond, so that the recipient will see this message and (I hope) do something about it, to eliminate the Challenge responder (I can't find the true recipient's email address from the Challenge). The recipient may be able to recognise themselves from the fact that they receive this message but not the message which triggered the response, which began:

=== On 24-Nov-04 John wrote: Off hand, the costs of GPL'd software are not hidden at all. R for instance demands that a would-be user sit down and learn the language. This in turn pushes a user into learning more about statistics than the simple overview that Stat 1 presents a student. I'd see this as less a cost than a benefit! ===

My apologies for bothering you with this if you didn't want to know about it.

Best wishes to all, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 24-Nov-04 Time: 10:36:35 -- XFMail --
Re: [R] T-test syntax question
As Vito Ricci has already pointed out, the Welch test is for two-group unpaired data under an unequal-variance assumption. If you have the original data, say x and y, then you can simply do t.test(x, y, paired=FALSE, var.equal=FALSE). If you do not have the original data, you can still calculate the relevant statistics and p-value as long as you know the group means, lengths and variances. stats:::t.test.default shows you the code behind the t-test. I think the relevant bits are as follows:

mx <- 0; my <- 2; mu <- 0   # You will need to fill these with your observed values
vx <- var(x)
vy <- var(y)
nx <- length(x)
ny <- length(y)
stderrx <- sqrt(vx/nx)
stderry <- sqrt(vy/ny)
stderr <- sqrt(stderrx^2 + stderry^2)
df <- stderr^4/(stderrx^4/(nx - 1) + stderry^4/(ny - 1))
tstat <- (mx - my - mu)/stderr
# for a two-sided alternative
pval <- 2 * pt(-abs(tstat), df)
conf.level <- 0.95          # e.g.
alpha <- 1 - conf.level
cint <- qt(1 - alpha/2, df)
cint <- tstat + c(-cint, cint)
cint <- mu + cint * stderr

On Wed, 2004-11-24 at 04:28, Steve Freeman wrote: Hi. I'd like to do a t-test to compare the Delta values of items with Crit=1 with Delta values of items with Crit=0. What is the t.test syntax? [rest of question and Welch output quoted in full above]

Steven F. Freeman * Center for Organizational Dynamics * University of Pennsylvania * (215) 898-6967 * Fax: (215) 898-8934 * Cell: (215) 802-4680 * [EMAIL PROTECTED] * http://center.grad.upenn.edu/faculty/freeman.html

-- Adaikalavan Ramasamy [EMAIL PROTECTED] Centre for Statistics in Medicine http://www.ihs.ox.ac.uk/csm/ Cancer Research UK Tel: 01865 226 677 Old Road Campus, Headington, Oxford Fax: 01865 226 962
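As a sanity check on the snippet above, the hand-computed Welch statistic and degrees of freedom can be compared against t.test() itself. This is a sketch on simulated data; none of the poster's actual values are used:

```r
set.seed(2)
x <- rnorm(10)
y <- rnorm(12, mean = 1)

# Hand computation, following the excerpt from stats:::t.test.default
nx <- length(x); ny <- length(y)
stderr <- sqrt(var(x)/nx + var(y)/ny)
tstat  <- (mean(x) - mean(y))/stderr                              # mu = 0
df     <- stderr^4 / ((var(x)/nx)^2/(nx - 1) + (var(y)/ny)^2/(ny - 1))

# Compare with the built-in Welch test
fit <- t.test(x, y, var.equal = FALSE)
c(manual = tstat, builtin = unname(fit$statistic))
```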
[R] Automatic file reading
Hi, I want to do automatic reading of a number of tables (files) stored in ascii format, without having to specify the variable name in R each time. Below is an example of how I would like to use it (I assume files pair1,...,pair8 exist in the specified directory):

for (i in 1:8){
  name <- paste("pair", i, sep="")
  ??? <- read.table(paste("/home/andersm/tmp/", name, sep=""))
}

after which I want to have pair1,...,pair8 as tables. But I cannot get it right. Anybody having a smart solution?

Best regards, Anders Malmberg
[R] Re: T-test syntax question
Hi, in case of paired data, if you have only the differences and not the original data, you can get the t test based on the differences. Say d is the vector of differences, and suppose you wish to test whether the mean difference is equal to zero:

md <- mean(d)                        ## sample mean of differences
sdd <- sd(d)                         ## sample sd of differences
n <- length(d)                       ## sample size
t.value <- md/(sdd/sqrt(n))          ## sample t-value with n-1 df
pt(t.value, n-1, lower.tail=FALSE)   ## one-sided p-value of the test

For example:

set.seed(13)
d <- rnorm(50)
md <- mean(d)
sdd <- sd(d)
n <- length(d)
t.value <- md/(sdd/sqrt(n))
pt(t.value, n-1, lower.tail=FALSE)
[1] 0.5755711

Best regards, Vito

Steven F. Freeman wrote: I'd like to do a t-test to compare the Delta values of items with Crit=1 with Delta values of items with Crit=0. What is the t.test syntax? [rest of question and Welch output quoted in full above]
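The same arithmetic can be cross-checked against the built-in one-sample test; here the two-sided p-value is used (a sketch with simulated differences, not the poster's data):

```r
set.seed(13)
d <- rnorm(50)                                  # simulated differences

# By hand: t statistic and two-sided p-value on n-1 df
t.value <- mean(d)/(sd(d)/sqrt(length(d)))
p.two.sided <- 2 * pt(-abs(t.value), length(d) - 1)

# Built-in equivalent: one-sample t-test of the differences against 0
fit <- t.test(d, mu = 0)
c(by.hand = p.two.sided, t.test = fit$p.value)
```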
[R] Re: CRLF-terminated Fortran files
I added a check for CRLF termination of Fortran and C++ source files to R CMD check and found potential problems in these packages: BsMD, MCMCpack (C++), asypow, aws, bayesSurv (C++), eha, fBasics/fOptions/fSeries, gam, mclust, ncomplete, noverlap, pan, rrcov, subselect (C++), survrec. I'd be interested to know if C++ gives you problems too. (Sun cc used to object to CRLF, but does not in Forte 7.) I don't think I've ever tried one of the above before on Solaris, but had no CRLF problems with Forte 7, just plenty of other problems in MCMCpack and bayesSurv. Could you please try installing subselect.

On Wed, 24 Nov 2004, Prof Brian Ripley wrote: What did this have to do with GLMM? I've changed the subject line.

On Wed, 24 Nov 2004, Richard A. O'Keefe wrote: I was trying to install some more packages and ran into a problem I hadn't seen before.

We've seen it for C, and test it for C in R CMD check. I think we should check C++ and Fortran files too.

Version: platform sparc-sun-solaris2.9, arch sparc, os solaris2.9, system sparc, solaris2.9, status, major 2, minor 0.1, year 2004, month 11, day 15, language R.

Fortran compilers available to me: f77: Sun WorkShop 6 update 2 FORTRAN 77 5.3 2001/05/15; f90: Sun WorkShop 6 update 2 Fortran 95 6.2 2001/05/15; f95: Sun WorkShop 6 update 2 Fortran 95 6.2 2001/05/15.

Package: gam. In fact I didn't ask for this one specifically; I had dependencies=TRUE in a call to install.packages(). Problem: Following the installation instructions for R, I had selected f95 as my Fortran compiler. The f95 compiler complained about nearly every line of gam/src/bsplvd.f. From the error messages as displayed on the screen, I could see no reason for complaint. However, looking at the file with a text editor immediately revealed the problem. The files bsplvd.f, bvalue.f, bvalus.f, loessf.f, qsbart.f, sgram.f, sinerp.f, sslvrg.f, stxwx.f all use CR-LF line termination. The files linear.f, lo.f, splsm.f all use the LF line termination expected on UNIX.

It turns out that the g77 and f77 compilers don't mind CR at the end of a line, but f90 and f95 hate them like poison. Removing the CRs makes f90 and f95 happy again.

BTW, in that version of Sun Workshop, f90 and f95 are the same compiler, and in later versions so is f77. (I think these compilers are on version 9 now.) Even in version 7, there is no problem with line endings. I did get a warning:

call dchdc(a,p,p,work,jpvt,job,info)
linear.f, Line = 408, Column = 38: WARNING: Procedure DCHDC is defined at line 194 (linear.f). Illegal association of array actual argument with scalar dummy argument INFO.

which seems genuine (make it info(1) in the call).

Second-order problem: I know how to fix the immediate problem. What I don't know is how to intervene in the installation process. What I need to do is:
- get and unpack files (steps normally done by install.packages)
- make changes (remove CR, edit configuration, whatever)
- resume whatever install.packages normally does

- Use install.packages(destdir=) to retain the tarballs which are downloaded.
- Unpack the package tarball, make changes.
- Run R CMD INSTALL on the changed sources.

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), 1 South Parks Road, Oxford OX1 3TG, UK. Fax: +44 1865 272595
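A small shell sketch of the "remove the CRs" step described above. The file name and contents here are invented for the demo; the real offenders were gam/src/bsplvd.f and friends:

```shell
# Create a demo Fortran file with CRLF line endings
printf 'subroutine foo\r\nend\r\n' > example.f

# Strip the carriage returns so f90/f95 will accept the file
tr -d '\r' < example.f > example.f.lf && mv example.f.lf example.f
```

Tools such as dos2unix do the same job where available.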
Re: [R] Grumble ...
On Wed, 24 Nov 2004 10:36:35 - (GMT), (Ted Harding) [EMAIL PROTECTED] wrote:

> Hi Folks, A Grumble ... The message I just sent to R-help about "The hidden costs of GPL ..." has evoked a "Challenge" response:
> Hi, You've just sent a message to [EMAIL PROTECTED]. In order to confirm the sent message, please click here

Here's a strategy that I hope subverts this irritating mechanism: Every now and then I get a challenge about a message that I didn't send, because someone (or some virus) forged me into the From: address. Those are the only ones I confirm.

Duncan Murdoch
Re: [R] Automatic file reading
for(i in 1:10){ assign(paste("data", i, sep=""), i) }
data1
[1] 1
data5
[1] 5
data8 + data5
[1] 13

See help(assign) for more details and examples.

On Wed, 2004-11-24 at 11:10, Anders Malmberg wrote: Hi, I want to do automatic reading of a number of tables (files) stored in ascii format, without having to specify the variable name in R each time. [example and rest of question quoted in full above]
Re: [R] Automatic file reading
Anders Malmberg wrote: Hi, I want to do automatic reading of a number of tables (files) stored in ascii format, without having to specify the variable name in R each time. [example and rest of question quoted in full above]

pairlist <- vector(mode = "list", length = 8)
for (i in 1:8){
  name <- paste("pair", i, sep="")
  pairlist[[i]] <- read.table(paste("/home/andersm/tmp/", name, sep=""))
}

or use assign(), but you don't want to do that, really.

Uwe Ligges
Re: [R] Automatic file reading
Hi Anders, what about:

pair <- list()
for (i in 1:8){
  name <- paste("pair", i, sep="")
  pair[[i]] <- read.table(paste("/home/andersm/tmp/", name, sep=""))
}

Arne

On Wednesday 24 November 2004 12:10, Anders Malmberg wrote: Hi, I want to do automatic reading of a number of tables (files) stored in ascii format, without having to specify the variable name in R each time. [example and rest of question quoted in full above]

-- Arne Henningsen, Department of Agricultural Economics, University of Kiel, Olshausenstr. 40, D-24098 Kiel (Germany), Tel: +49-431-880 4445, Fax: +49-431-880 1397, [EMAIL PROTECTED], http://www.uni-kiel.de/agrarpol/ahenningsen/
Re: [R] Automatic file reading
If you simply want to read all files in a given directory, you can do something like:

fullpath <- "/home/andersm/tmp"
filenames <- dir(fullpath)
pair <- sapply(filenames, function(x) read.table(paste(fullpath, "/", x, sep="")), simplify = FALSE)

Sorry, untested. But the point is that you can use dir() to get all of the filenames from a directory specified by fullpath, optionally filtered with its 'pattern' argument.

Sean

On Nov 24, 2004, at 7:31 AM, Arne Henningsen wrote: Hi Anders, what about: [loop over pair[[i]] quoted in full above, with the original question]
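Pulling the replies together: reading the files into a named list avoids assign() entirely. A sketch only; the pairN files below are generated on the fly in a temporary directory rather than taken from the poster's /home/andersm/tmp:

```r
# Create a few demo files pair1..pair3 in a temporary directory
tmpdir <- tempfile("pairs")
dir.create(tmpdir)
for (i in 1:3)
  write.table(data.frame(v = i), file.path(tmpdir, paste("pair", i, sep = "")))

# Read every matching file into one named list
files <- list.files(tmpdir, pattern = "^pair", full.names = TRUE)
pair  <- lapply(files, read.table)
names(pair) <- basename(files)
```

Individual tables are then available as pair[["pair1"]], pair[["pair2"]], and so on.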
Re: [R] data.frame into vector
Hi, as others have already pointed out, as.matrix is what you need. Just one comment: as.matrix(x[1,]) should be much faster for larger data frames compared to as.matrix(x)[1,].

Best, Jan

On Tue, 23 Nov 2004, Tiago R Magalhaes wrote: Hi, I want to extract a row from a data.frame, but I want that object to be a vector. After trying some different ways I end up always with a data.frame or with the wrong vector. Any pointers?

x <- data.frame(a = factor(c('a', 2, 'b')), b = c(4, 5, 6))

I want to get: a 4. I tried:

as.vector(x[1,])
  a b
1 a 4

(resulting in a data.frame, even after in my mind having coerced it into a vector!)

as.vector(x[1,], mode='character')
[1] 2 4

(almost what I want, except that 2 appears instead of a; I guess this has to do with levels and factors)

Thanks for any help

R.Version(): platform powerpc-apple-darwin6.8; R 2.0.1, 2004-11-15

-- Jan Goebel, j g o e b e l @ d i w . d e, DIW Berlin, German Socio-Economic Panel Study (GSOEP), Königin-Luise-Str. 5, D-14195 Berlin, Germany, phone: 49 30 89789-377
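A short sketch of the advice above, using the poster's own example. It also shows why "2" appeared instead of "a": the factor's levels sort as "2", "a", "b", so "a" carries level code 2, and coercions that drop the factor class expose that code:

```r
x <- data.frame(a = factor(c("a", "2", "b")), b = c(4, 5, 6))

# Mixed column types, so the whole row is coerced to character
row.chr <- as.matrix(x[1, ])   # 1 x 2 character matrix: "a" "4"

# unlist() gives a named atomic vector, but the factor entry
# becomes its integer level code (here 2, since levels are "2","a","b")
row.num <- unlist(x[1, ])
```

If the factor's labels are wanted rather than its codes, converting the column with as.character() first (or building the data frame with stringsAsFactors = FALSE, where supported) avoids the surprise.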
Re: [R] timeDate
Gabor Grothendieck ggrothendieck at myway.com writes:

: Yasser El-Zein abu3ammar at gmail.com writes:
: : I am looking for up to the millisecond resolution. Is there a package
: : that has that?
: :
: : On Mon, 22 Nov 2004 21:48:20 +0000 (UTC), Gabor Grothendieck ggrothendieck at myway.com wrote:
: : Yasser El-Zein abu3ammar at gmail.com writes:
: : : From the document it is apparent to me that I need as.POSIXct (I have
: : : a double representing the number of millis since 1/1/1970 and I need
: : : to construct a datetime object). I see it showing how to construct the
: : : time object from a string representing the time, but not from a double
: : : of millis. Does anyone know how to do that?
: :
: : If by millis you mean milliseconds (i.e. one-thousandths of a second)
: : then POSIXct does not support that resolution, but if rounding to
: : seconds is OK then
: :
: :   structure(round(x/1000), class = c("POSIXt", "POSIXct"))
: :
: : should give it to you, assuming x is the number of milliseconds.
:
: There is no package/class that represents times and dates
: internally as milliseconds since Jan 1, 1970. You can
: rework your data into chron's internal representation, viz.
: day number plus fraction of day, like this:
:
: # x is a vector of milliseconds since Jan 1/70
: # x.chron is the corresponding chron date/time
: # untested
: library(chron)
: ms.in.day <- 1000*24*60*60
: day <- floor(x/ms.in.day)
: frac <- (x - 1000*day)/ms.in.day
: x.chron <- chron(day + frac)

Not sure why I made the above so complicated; it can be written just as:

library(chron)
ms.in.day <- 1000*24*60*60
x.chron <- chron(x/ms.in.day)

: If you need to take leap seconds into account (which the above
: does not) then note that R comes with a builtin vector called
: leap.seconds.
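A base-R sketch of the rounding route quoted in the thread. The millisecond value here is invented for illustration, and the sub-second part is deliberately rounded away:

```r
x  <- 1101297600123   # hypothetical milliseconds since 1970-01-01 00:00:00 UTC
tm <- structure(round(x/1000), class = c("POSIXct", "POSIXt"))
format(tm, tz = "GMT")
```

The underlying value is just the number of seconds since the epoch, so arithmetic on tm behaves like ordinary POSIXct arithmetic.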
Re: [R] Grumble ...
Ted == Ted Harding [EMAIL PROTECTED] on Wed, 24 Nov 2004 10:36:35 - (GMT) writes: Ted Hi Folks, A Grumble ... Ted The message I just sent to R-help about The hidden Ted costs of GPL ... has evoked a Challenge response: Ted Hi, You´ve just sent a message to Ted [EMAIL PROTECTED] In order to confirm the sent Ted message, please click here Ted This confirmation is necessary because Ted [EMAIL PROTECTED] uses Antispam UOL, a service Ted that avoids unwanted messages like advertising, Ted pornography, viruses, and spams. Ted Other messages sent to [EMAIL PROTECTED] Ted won't need to be confirmed*. *If you receive another Ted confirmation request, please ask Ted [EMAIL PROTECTED] to include you in his/her Ted authorized e-mail list. Ted I won't be responding to this. Let the recipient simply Ted not receive the mail. Of no great importance in this Ted case, but a disadvantage to the recipient in the long Ted run. Ted I disapprove strongly of this mechanism, and want to Ted oppose it. There must be a few thousand subscribers to Ted R-help. If the Challenge mechanism became widespread, Ted then I would receive thousands of such messages. Rather Ted than respond to all these, I would quit the list (and Ted of course probably many others). The Challenge Ted mechanism would destroy the mailing-list community if Ted it became widely adopted. Exactly. I've received such a message myself from the same machine and -- as mailing list manager -- tried to find out more. The problem is that [EMAIL PROTECTED] is not subscribed to R-help. One other person is and I have written e-mail to that address withOUT getting such a message back.. Again, I completely agree that it is absolutely inacceptable to subscribe from such a spam-blocking address. Ted One reason I am posting this grumble to R-help is in Ted the hope that I get a challenge to this one too. 
Ted> In that case, once and for all, I shall respond, so that the
Ted> recipient will see this message and (I hope) do something about it,
Ted> to eliminate the Challenge responder (I can't find the true
Ted> recipient's email address from the Challenge).

please let me (or R-help too) know what you find out.

Martin Maechler, ETH Zurich (R-help mailing list maintainer)

Ted> The recipient may be able to recognise themselves from the fact that
Ted> they receive this message but not the message which triggered the
Ted> response, which began:
Ted> ===
Ted> On 24-Nov-04 John wrote:
Ted>    Off hand, the costs of GPL'd software are not hidden at all. R,
Ted>    for instance, demands that a would-be user sit down and learn the
Ted>    language. This in turn pushes a user into learning more about
Ted>    statistics than the simple overview that Stat 1 presents a
Ted>    student.
Ted> I'd see this as less a cost than a benefit!
Ted> ===
Ted> My apologies for bothering you with this if you didn't want to know
Ted> about it. Best wishes to all, Ted.
Re: [R] Grumble ...
Duncan Murdoch [EMAIL PROTECTED] writes:

: On Wed, 24 Nov 2004 10:36:35 -0000 (GMT), (Ted Harding)
: [EMAIL PROTECTED] wrote:
: : Hi Folks, A Grumble ... The message I just sent to R-help about "The
: : hidden costs of GPL ..." has evoked a "Challenge" response:
: :    Hi, You've just sent a message to [EMAIL PROTECTED] In order to
: :    confirm the sent message, please click here
:
: Here's a strategy that I hope subverts this irritating mechanism: Every
: now and then I get a challenge about a message that I didn't send,
: because someone (or some virus) forged me into the From: address. Those
: are the only ones I confirm.

Hehe... But don't you risk getting listed as an active spammer or something that way? Personally, I just send them to the bogus folder for later update to the spam filter. Imagine if everyone had this challenge stuff installed and we had to confirm every message ~1e3 times (however many subscribers we are these days). The vacation messages are annoying enough.

I wonder how this guy got on the list in the first place. I suspect that he couldn't actually have completed the subscription process, unless the mechanism was installed after the subscription.

-- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
Re: [R] LDA with previous PCA for dimensionality reduction
Dear Christoph, I guess you want to assess the error rate of an LDA that has been fitted to a set of currently existing training data, and that in the future you will get some new observation(s) for which you want to make a prediction. Then, I'd say that you want to use the second approach. You might find that the first step turns out to be crucial and, after all, your whole subsequent LDA is contingent on the PC scores you obtain in the previous step. Somewhat similar issues have been discussed in the microarray literature. Two references are:

@ARTICLE{ambroise-02,
  author  = {Ambroise, C. and McLachlan, G. J.},
  title   = {Selection bias in gene extraction on the basis of microarray gene-expression data},
  journal = {Proc Natl Acad Sci USA},
  year    = {2002},
  volume  = {99},
  pages   = {6562--6566},
  number  = {10},
}

@ARTICLE{simon-03,
  author  = {Simon, R. and Radmacher, M. D. and Dobbin, K. and McShane, L. M.},
  title   = {Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification},
  journal = {Journal of the National Cancer Institute},
  year    = {2003},
  volume  = {95},
  pages   = {14--18},
  number  = {1},
}

I am not sure, though, why you use PCA followed by LDA. But that's another story. Best, R.

On Wednesday 24 November 2004 11:16, Christoph Lehmann wrote:

Dear all, not really an R question, but: If I want to check the classification accuracy of an LDA with a previous PCA for dimensionality reduction by means of the LOOCV method: Is it ok to do the PCA on the WHOLE dataset ONCE and then run the LDA with the CV option set to TRUE (runs LOOCV)
-- OR --
do I need
- to compute for each 'test-bag' (the n-1 observations) a PCA (my.princomp.1),
- then run the LDA on the test-bag scores (-> my.lda.1),
- then compute the scores of the left-out observation using my.princomp.1 (-> my.scores.2),
- and only then use predict.lda(my.lda.1, my.scores.2) on the scores of the left-out observation?
I read some articles, where they choose procedure 1, but I am not sure if this is really correct? many thanks for a hint Christoph

-- Ramón Díaz-Uriarte Bioinformatics Unit Centro Nacional de Investigaciones Oncológicas (CNIO) (Spanish National Cancer Center) Melchor Fernández Almagro, 3 28029 Madrid (Spain) Fax: +-34-91-224-6972 Phone: +-34-91-224-6900 http://ligarto.org/rdiaz PGP KeyID: 0xE89B3462 (http://ligarto.org/rdiaz/0xE89B3462.asc)
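For what it is worth, the second procedure described above can be written as a short loop. The sketch below is illustrative and untested, not a vetted implementation: `X` (predictor matrix), `cl` (class factor) and `k` (number of components to keep) are placeholders for the reader's own data, and it uses princomp/lda as in Christoph's description. Note that princomp() requires more rows than columns in each fold.

```r
library(MASS)   # for lda()

# Hedged sketch of LOOCV with the PCA re-estimated inside each fold.
loocv.pca.lda <- function(X, cl, k = 5) {
  n <- nrow(X)
  pred <- factor(rep(NA, n), levels = levels(cl))
  for (i in seq_len(n)) {
    pc  <- princomp(X[-i, , drop = FALSE])          # PCA on the n-1 'test-bag'
    fit <- lda(pc$scores[, 1:k, drop = FALSE], cl[-i])
    sc  <- predict(pc, X[i, , drop = FALSE])[, 1:k, drop = FALSE]
    pred[i] <- predict(fit, sc)$class               # classify the left-out case
  }
  mean(pred != cl)                                  # LOOCV error estimate
}
```

The point of the loop is exactly the one made in the thread: the left-out observation is projected with loadings estimated without it.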
Re: [R] Two factor ANOVA in lme
nat writes:

: I want to specify a two-factor model in lme, which should be easy?
: Here's what I have: factor 1 - treatment, FIXED (two levels); factor 2 -
: genotype, RANDOM (160 genotypes in total). I need a model that tells me
: whether the treatment, genotype and interaction terms are significant. I
: have been reading 'Mixed-Effects Models in S' but in all examples the
: random factor is not in the main model - it is a nesting factor etc. to
: specify the error structure. Here I need the random factor in the model.
: I have tried this:
:
:   height.aov <- lme(height ~ trt*genotype, data.reps,
:                     random = ~1|genotype, na.action = na.exclude)
:
: but the output is nothing like that from Minitab (my only previous
: experience of stats software). The results for the interaction term are
: the same, but the F values for the factors alone are very different
: between Minitab and R. This is a very simple model but I can't figure
: out how to specify it. Help would be much appreciated. As background:
: The data are from a QTL mapping population, which is why I must test to
: see if genotype is significant and also why genotype is a random
: factor. Thanks

It seems your message didn't get any replies (at least none posted to r-help). I recently fitted such a model (two effects, one fixed, another random, with interaction effects) using lme. I used the following command:

z1 <- lme(reacao ~ posicao, data = memoria, random = ~1|subject/posicao)

where my model is reacao = mu + posicao (fixed) + posicao*subject (random) + subject (random). Beware, though, that Minitab uses different estimation methods (in lme itself you may use maximum likelihood or restricted ML), so the results need not be the same.

-- Fernando Henrique Ferraz P. da Rosa http://www.ime.usp.br/~feferraz
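Translated to nat's variables, Fernando's formulation would look something like the following untested sketch (assuming the data frame `data.reps` has columns height, trt and genotype; treatment is the only fixed effect, while genotype and the genotype-by-treatment interaction enter as random effects via the nested grouping formula):

```r
library(nlme)

# height = mu + trt (fixed) + genotype (random) + trt:genotype (random)
fit <- lme(height ~ trt, data = data.reps,
           random = ~ 1 | genotype/trt,
           na.action = na.exclude)
summary(fit)   # variance components for genotype and trt %in% genotype
```

As Fernando warns, the REML/ML estimates from lme need not reproduce Minitab's classical F-table for the random terms.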
Re: [R] LDA with previous PCA for dimensionality reduction
On Wed, 24 Nov 2004, Ramon Diaz-Uriarte wrote:

: Dear Christoph, I guess you want to assess the error rate of an LDA
: that has been fitted to a set of currently existing training data, and
: that in the future you will get some new observation(s) for which you
: want to make a prediction. Then, I'd say that you want to use the
: second approach. You might find that the first step turns out to be
: crucial and, after all, your whole subsequent LDA is contingent on the
: PC scores you obtain in the previous step.

Ramon, as long as one does not use the information in the response (the class variable, in this case) I don't think that one ends up with an optimistically biased estimate of the error (although leave-one-out is a suboptimal choice). Of course, when one starts to tune the method used for dimension reduction, a selection of the procedure with minimal error will produce a bias. Or am I missing something important? Btw, `ipred::slda' implements something not completely unlike the procedure Christoph is interested in. Best, Torsten
RE: [R] an R function to search on Prof. Baron's site
Using this function with 2.0.0 on XP and Firefox 1.0 (I've rediscovered the internet) produces a curious result.

myString <- RSiteSearch(string = 'Ripley')
myString
[1] "http://finzi.psych.upenn.edu/cgi-bin/htsearch?config=htdigrun1;restrict=Rhelp00/archive|Rhelp01/archive|Rhelp02a/archive;format=builtin-long;sort=score;words=Ripley;matchesperpage=10"

version
_
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.0
year     2004
month    10
day      04
language R

If no browser is open, then this is the URL that is browsed in Firefox:

http://finzi.psych.upenn.edu/cgi-bin/htsearch?config=htdigrun1;restrict=Rhelp00/archive

Oddly, these two other windows are opened too:

http://finzi.psych.upenn.edu/R/Rhelp01/archive/1000.html

and:

http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg17461.html

This happens regardless of what the search string is. If a browser window is open then everything works as planned. The sticky bit, obviously, is parsing browseURL, which has the same behavior if I try:

browseURL(myString)

However, the searches:

RSiteSearch(string = 'browseURL Firefox')
RSiteSearch(string = 'browseURL Mozilla')

don't turn up much help! If I change browseURL to use IE then browseURL behaves as expected:

browseURL(myString, browser="C:/Program Files/Internet Explorer/iexplore.exe")

Specifying Firefox explicitly in browseURL doesn't help - it still opens three windows as above (if no browser is open):

browseURL(myString, browser="C:/Program Files/Mozilla Firefox/firefox.exe")

So, under Windows the 'NULL' argument in 'browser', which determines the browser via file association, isn't the problem. Anybody know how I can make Firefox work a little more smoothly? Thanks, Andy

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gabor Grothendieck Sent: Tuesday, November 23, 2004 11:56 PM To: [EMAIL PROTECTED] Subject: Re: [R] an R function to search on Prof.
Baron's site

Liaw, Andy andy_liaw at merck.com writes:

: Inspired by the functions that Barry Rawlingson and Dave Forrest posted
: for searching Rwiki and the R-help archive, I've made up a function that
: does the search on Prof. Baron's site (thanks to Prof. Baron's help on
: setting up the query string!):

It would be nice if this and the other search functions recently posted were collected into a package, or even integrated into R itself. In the case of the Windows Rgui, it would be nice if they appeared on a menu with the other search and help functions.
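One session-level workaround sketched from the discussion above (untested against Firefox 1.0): browseURL() consults options("browser") when its browser argument is NULL, so pointing that option at a specific executable bypasses the file-association route once per session. The path below is only an example and will differ per machine.

```r
# Make browseURL() (and hence search helpers built on it) use an
# explicit browser; the path is an assumed example.
options(browser = "C:/Program Files/Internet Explorer/iexplore.exe")
RSiteSearch("browseURL")   # results open in the browser set above
```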
Re: [R] LDA with previous PCA for dimensionality reduction
Thank you, Torsten; that's what I thought: as long as one does not use the 'class label' as a constraint in the dimension reduction, the procedure is ok. Of course it is computationally more demanding, since for each new (unknown with respect to the class label) observation one has to compute a new PCA as well. Cheers Christoph

Torsten Hothorn wrote:

: Ramon, as long as one does not use the information in the response (the
: class variable, in this case) I don't think that one ends up with an
: optimistically biased estimate of the error (although leave-one-out is
: a suboptimal choice). Of course, when one starts to tune the method
: used for dimension reduction, a selection of the procedure with minimal
: error will produce a bias. Or am I missing something important? Btw,
: `ipred::slda' implements something not completely unlike the procedure
: Christoph is interested in. Best, Torsten
[R] Automatic reply, CorreoDirect.
### This email is generated automatically ### You have written to a generic CorreoDirect mail account. If you would like to read our privacy policy, click here: http://www.correodirect.com/public/nosotros/privacidad.php If you are registered with our service, you can easily unsubscribe or modify your profile so that the offers we send better match your interests. To modify your profile, click here: http://www.correodirect.com/usuarios/area/modif.php To unsubscribe from our service, click here: http://www.correodirect.com/usuarios/area/baja.php If you believe your data is in our database by mistake, write an email to [EMAIL PROTECTED] If you have questions about how our service works, see http://www.correodirect.com/public/ayuda/faqusuario.php Sincerely, User Support, CorreoDirect www.correodirect.com CorreoDirect, the leader in permission email marketing.
[R] scatterplot of 100000 points and pdf file format
Hi, I want to draw a scatter plot with 1M and more points and save it as pdf. This makes the pdf file large. So I tried to save the file first as png and then convert it to pdf. This looks OK when printed, but if viewed e.g. with Acrobat as a document figure the quality is bad. Does anyone know a way to reduce the size but keep the quality? /E

-- Dipl. bio-chem. Witold Eryk Wolski MPI Molecular Genetics Ihnestrasse 63-73 14195 Berlin tel: 0049-30-83875219 http://www.molgen.mpg.de/~wolski http://r4proteomics.sourceforge.net mail: [EMAIL PROTECTED] [EMAIL PROTECTED]
[R] 2GB dataset
Hi, does anyone have experience with loading a dataset that is larger than 2GB into R? My organization is a SAS-oriented shop and I'm in the process of switching it to R. One of the complaints about R has always been its inability to handle large (GB) datasets efficiently. I would like some comments from someone with experience of working on a 2GB dataset in R. Thanks. Apollo
[R] tsdiag for ar?
Is there a way to have the ar function work with tsdiag for on-the-fly visualization of ar fits? I have to fit a great many models of varying order and would like to save the diagnostic graphs. For instance:

tsdiag(ar(lh))
tsdiag(arima(lh, order = c(3,0,0)))

Thanks...
Re: [R] 2GB dataset
Absolutely no problem on 64-bit OSes with enough memory. Many 32-bit OSes have problems with 2GB files. Please do read the posting guide and tell us basic facts like which OS you are running on, so we don't have to speculate to answer your question. Also, what do you want to do with the dataset? This matters crucially.

On Wed, 24 Nov 2004, apollo wong wrote:

: Hi, do any one have experience with loading dataset that is larger
: than 2GB into R. My organization is a SAS oriented shop and I'm in the
: process of switching it to R. One of the complain about R has always
: been it's inability to handle large dataset (GB) efficiently. I would
: like some comments from someone with experience of working on 2GB
: dataset in R.

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] tsdiag for ar?
On Wed, 24 Nov 2004, Dr Carbon wrote:

: Is there a way to have the ar function work with tsdiag for on-the-fly
: visualization of ar fits? I have to fit a great many models of varying
: order and would like to save the diagnostic graphs.

First you have to produce them, surely?

: For instance, tsdiag(ar(lh))

That gives an error. All you have to do is to write an ar method for tsdiag -- a good exercise in R programming for you.

: tsdiag(arima(lh, order = c(3,0,0)))

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
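Since tsdiag() is an S3 generic, the exercise amounts to mirroring what the Arima method does, using the residuals stored in an "ar" fit. An untested sketch along those lines (the lag count and plot details are choices of this sketch, not part of stats):

```r
# Sketch of a tsdiag() method for "ar" objects: standardized residuals,
# their ACF, and Ljung-Box p-values over the first gof.lag lags.
tsdiag.ar <- function(object, gof.lag = 10, ...) {
  res <- na.omit(object$resid)                    # drop leading NAs
  oldpar <- par(mfrow = c(3, 1)); on.exit(par(oldpar))
  plot(res / sd(res), type = "h", ylab = "",
       main = "Standardized Residuals")
  abline(h = 0)
  acf(res, main = "ACF of Residuals")
  pval <- sapply(seq_len(gof.lag), function(l)
    Box.test(res, lag = l, type = "Ljung-Box")$p.value)
  plot(seq_len(gof.lag), pval, xlab = "lag", ylab = "p value",
       ylim = c(0, 1), main = "p values for Ljung-Box statistic")
  abline(h = 0.05, lty = 2, col = "blue")
}

tsdiag(ar(lh))   # now dispatches to the method above
```

Saving the graphs for many fits is then just a matter of wrapping the call in pdf()/dev.off() inside a loop.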
Re: [R] scatterplot of 100000 points and pdf file format
On Wed, 2004-11-24 at 16:34 +0100, Witold Eryk Wolski wrote:

: Hi, I want to draw a scatter plot with 1M and more points and save it
: as pdf. This makes the pdf file large. So I tried to save the file
: first as png and then convert it to pdf. This looks OK if printed but
: if viewed e.g. with Acrobat as a document figure the quality is bad.
: Anyone know a way to reduce the size but keep the quality?

Hi Eryk! Part of the problem is that in a pdf file, the vector-based instructions will need to be defined for each of your 10^6 points in order to draw them. When trying to create a simple example:

pdf()
plot(rnorm(10^6), rnorm(10^6))
dev.off()

the pdf file is 55 Mb in size. One immediate thought was to try a ps file and, using the above plot, the ps file was only 23 Mb in size. So note that ps can be more efficient. Going to a bitmap might result in a much smaller file, but as you note, the quality does degrade as compared to a vector-based image. I tried the above to a png, then converted to a pdf (using 'convert') and, as expected, the image both viewed and printed was pixelated, since the pdf instructions are presumably drawing pixels and not vector-based objects. Depending upon what you plan to do with the image, you may have to choose among several options, resulting in tradeoffs between image quality and file size. If you can create the bitmap file explicitly in the size that you require for printing or incorporating in a document, that is one way to go and will preserve, to an extent, the overall fixed-size image quality, while keeping file size small. Another option to consider for the pdf approach, if it does not compromise the integrity of your plot, is to remove any duplicate data points, if any exist. Thus, you will not need what are in effect redundant instructions in the pdf file. This may not be possible depending upon the nature of your data (i.e. doubles) without considering some tolerance level for equivalence. Perhaps others will have additional ideas.
HTH, Marc Schwartz
RE: [R] scatterplot of 100000 points and pdf file format
On 24-Nov-04 Witold Eryk Wolski wrote:

: Hi, I want to draw a scatter plot with 1M and more points and save it
: as pdf. This makes the pdf file large. So I tried to save the file
: first as png and then convert it to pdf. This looks OK if printed but
: if viewed e.g. with Acrobat as a document figure the quality is bad.
: Anyone know a way to reduce the size but keep the quality?

If you want the PDF file to preserve the info about all the 1M points then the problem has no solution. The png file will already have suppressed most of this (which is one reason for poor quality). I think you should give thought to reducing what you need to plot. Think about it: suppose you plot points of size 1/200 inch (about the limit at which the eye begins to see rough edges). Then you have 40,000 points per square inch. If your 1M points are separate but as closely packed as possible, this requires 25 square inches, or a 5x5 inch (= 12.7x12.7 cm) square. And this would be solid black! Presumably in your plot there is a very large number of points which are effectively indistinguishable from other points, so these could be eliminated without spoiling the plot. I don't have an obviously best strategy for reducing what you actually plot, but perhaps one line to think along might be the following:

1. Multiply the data by some factor and then round the results to an integer (to avoid problems in step 2). Factor chosen so that the result of (4) below is satisfactory.
2. Eliminate duplicates in the result of (1).
3. Divide by the factor you used in (1).
4. Plot the result; save plot to PDF.

As to how to do it in R: the critical step is (2), which with so many points could be very heavy unless done by a well-chosen procedure. I'm not expert enough to advise about that, but no doubt others are. Good luck! Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 24-Nov-04 Time: 16:16:28 -- XFMail --
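Steps 1-3 above amount to rounding to a grid and dropping duplicates, which duplicated() handles directly on a two-column matrix. A small untested sketch (the function and argument names, thin and fac, are made up; fac plays the role of the factor in step 1):

```r
# Keep only one point per grid cell of side 1/fac; the larger fac,
# the more points survive (tune it until the plot looks right).
thin <- function(x, y, fac = 200) {
  keep <- !duplicated(cbind(round(x * fac), round(y * fac)))
  list(x = x[keep], y = y[keep])
}

# xy <- thin(X, Y)
# pdf("scatter.pdf"); plot(xy$x, xy$y, pch = "."); dev.off()
```

Note there is no need for step 3's division, because the original (unrounded) coordinates of the retained points are kept and plotted.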
[R] OOT: frailty-multinivel
Hola! I started to search for information about multilevel survival models, and found "frailty" in R. This seems to be something of the same; is it the same? And then: why the name frailty (weakness?) -- Kjetil Halvorsen. Peace is the most effective weapon of mass construction. -- Mahdi Elmandjra
RE: [R] scatterplot of 100000 points and pdf file format
Marc/Eryk, I have no experience with it, but I believe the hexbin package in BioC was there for this purpose: avoiding heavy over-plotting of lots of points. You might want to look into that, if you have not done so yet. Best, Andy
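For completeness, a minimal untested sketch of that suggestion (variable names are made up): hexbin() reduces the million points to a few hundred hexagon counts, so the resulting pdf stays small regardless of n.

```r
library(hexbin)   # Bioconductor package

x <- rnorm(1e6)
y <- x + rnorm(1e6)
bin <- hexbin(x, y, xbins = 60)   # bin points into ~60 hexagons per row
plot(bin)                         # density scatterplot; compact as pdf
```

The tradeoff is the one Eryk notes below: binning shows density well but hides the exact positions of individual extreme points.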
Re: [R] scatterplot of 100000 points and pdf file format
Hi, I tried the ps idea. But I am using pdflatex. You get an even larger size reduction if you convert the ps into a pdf using ps2pdf, but unfortunately there is a quality loss. I have found almost a working solution:

a) Save the scatterplot without axes and with par(mar=c(0,0,0,0)) as png.
b) Convert it using any program to pnm.
c) Read the pnm file using pixmap.
d) Add axis labels and lines afterwards with par(new=TRUE).

And this looks the way I would like it to look. But unfortunately acroread and gv on Windows crash when I try to print the file.

library(pixmap)   # for read.pnm()

png(file="pepslop.png", width=500, height=500)
par(mar=c(0,0,0,0))
X2 <- rnorm(10)
Y2 <- X2*10 + rnorm(10)
plot(X2, Y2, pch=".", xlab="", ylab="", main="", axes=FALSE)
dev.off()

pdf(file="pepslop.pdf", width=7, height=7)
par(mar=c(3.2,3.2,1,1))
x <- read.pnm("pepslop.pnm")
plot(x)
par(new=TRUE)
par(mar=c(3.2,3.2,1,1))
plot(X2, Y2, pch=".", xlab="", ylab="", main="", type="n")
mtext(expression(m[nominal]), side=1, line=2)
mtext(expression(mod(m[monoisotopic], 1)), side=2, line=2)
legend(1000, 4, expression(paste(lambda[DB], " = ", 0.000495)), col=2, lty=1, lwd=1)
abline(test, col=2, lwd=2)   # 'test' is a previously fitted model (not shown)
dev.off()

-- Dipl. bio-chem. Witold Eryk Wolski MPI Molecular Genetics Ihnestrasse 63-73 14195 Berlin tel: 0049-30-83875219 http://www.molgen.mpg.de/~wolski http://r4proteomics.sourceforge.net mail: [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [R] scatterplot of 100000 points and pdf file format
On Wed, 24 Nov 2004, Witold Eryk Wolski wrote: Hi, I want to draw a scatter plot with 1M and more points and save it as pdf. Try the hexbin Bioconductor package, which gives hexagonally-binned density scatterplots. Even for tens of thousands of points this is often much better than a scatterplot. -thomas
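Thomas's hexbin suggestion can be sketched as follows; the bin count xbins = 50, the simulated data, and the output file name are illustrative choices, not from the original post:

```r
# Sketch of hexagonal binning with the hexbin package (distributed via
# Bioconductor at the time of this thread).
library(hexbin)
set.seed(1)
x <- rnorm(1e6)
y <- 2 * x + rnorm(1e6)
hb <- hexbin(x, y, xbins = 50)  # bin one million points into hexagon cells
pdf("hexbin-scatter.pdf")
plot(hb)                        # cell counts drawn as shaded hexagons
dev.off()
```

Because only the hexagon cells are written to the device, the resulting pdf stays small regardless of how many input points there are.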
Re: [R] scatterplot of 100000 points and pdf file format
Hi, Yes, indeed the hexbin package generates very cool pix. They look great. I was using it already. But this time I am interested in visualizing exactly the _scatter_ of some extreme points. Eryk
Liaw, Andy wrote: Marc/Eryk, I have no experience with it, but I believe the hexbin package in BioC was put there for this purpose: to avoid heavy over-plotting of lots of points. You might want to look into that, if you have not done so yet. Best, Andy
From: Marc Schwartz [snip]
RE: [R] scatterplot of 100000 points and pdf file format
On Wed, 24 Nov 2004 [EMAIL PROTECTED] wrote: On 24-Nov-04 Witold Eryk Wolski wrote: Hi, I want to draw a scatter plot with 1M and more points and save it as pdf. [snip] Anyone know a way to reduce the size but keep the quality?
If you want the PDF file to preserve the info about all the 1M points then the problem has no solution. The png file will already have suppressed most of this (which is one reason for poor quality). I think you should give thought to reducing what you need to plot. Think about it: suppose you plot with a resolution of 200 points per inch (about the limit at which the eye begins to see rough edges). Then you have 40,000 points per square inch. If your 1M points are separate but as closely packed as possible, this requires 25 square inches, or a 5x5 inch (= 12.7x12.7 cm) square. And this would be solid black! Presumably in your plot there is a very large number of points which are effectively indistinguishable from other points, so these could be eliminated without spoiling the plot. I don't have an obviously best strategy for reducing what you actually plot, but perhaps one line to think along might be the following:
1. Multiply the data by some factor and then round the results to an integer (to avoid problems in step 2). The factor is chosen so that the result of (4) below is satisfactory.
2. Eliminate duplicates in the result of (1).
3. Divide by the factor you used in (1).
4. Plot the result; save the plot to PDF.
As to how to do it in R: the critical step is (2), which with so many points could be very heavy unless done by a well-chosen procedure. I'm not expert enough to advise about that, but no doubt others are.

unique will eat that for breakfast:

> x <- runif(1e6)
> system.time(xx <- unique(round(x, 4)))
[1] 0.55 0.09 0.64 0.00 0.00
> length(xx)
[1] 10001

-- Brian D.
Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA)
1 South Parks Road, Oxford OX1 3TG, UK; Fax: +44 1865 272595
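The round-then-deduplicate recipe above can be sketched in a few lines of base R; rounding to 4 decimal places (a scaling factor of 10^4), the simulated uniform data, and the file name are all illustrative choices:

```r
# Sketch of the thinning strategy: scale/round, drop duplicates, plot.
set.seed(42)
x <- runif(1e6)
y <- runif(1e6)
# Steps 1-3 in one go: round both coordinates, then drop duplicate rows
# (unique() on a matrix operates row-wise).
xy <- unique(round(cbind(x, y), 4))
pdf("thinned.pdf")
plot(xy[, 1], xy[, 2], pch = ".")  # step 4: plot the thinned cloud
dev.off()
```

For two-dimensional data the reduction is smaller than in the one-dimensional timing shown above, since two points must collide in both coordinates to be dropped, but any exact overlaps are removed before the device ever sees them.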
RE: [R] scatterplot of 100000 points and pdf file format
Witold, I have found that plotting more than a few thousand data points at a time quickly becomes a losing proposition. That is, the dense overlap of data points tends to obscure the patterns of interest, with only outliers distinctly visible. I typically deal with this in two ways. The most straightforward is to select a much smaller subset of data points to plot, say on the order of 100-1000, depending on the nature of the data and the features you want to illustrate. How you sample depends on the structure of your data set. E.g., you may want to sample fixed proportions within subgroups. You can add loess lines or confidence ellipses estimated from the complete data. Another approach is to estimate the two-dimensional density using kde2d() (MASS package) and represent the result with a contour or image plot. See ?kde2d for an example. Both of these will result in much more manageable (and likely more informative) figures. Regards, Matt
Matthew R. Nelson, Ph.D., Director, Biostatistics, Sequenom, Inc.
-Original Message- From: Witold Eryk Wolski, Sent: Wednesday, November 24, 2004 7:35 AM, Subject: [R] scatterplot of 100000 points and pdf file format [snip]
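Matt's second approach can be sketched with kde2d() from the MASS package (which ships with R); the grid size n = 100 and the simulated data are illustrative choices:

```r
# Sketch: replace the raw scatter with a 2-D kernel density estimate.
library(MASS)
set.seed(1)
x <- rnorm(1e5)
y <- 2 * x + rnorm(1e5)
dens <- kde2d(x, y, n = 100)   # density estimated on a 100 x 100 grid
image(dens, xlab = "x", ylab = "y")
contour(dens, add = TRUE)      # overlay contour lines on the image plot
```

However many raw points go in, the device only ever receives the fixed-size grid, so the file stays small.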
[R] coplot =? gannt chart + bargraph
I would like to display some results from simulations in the form of a Gantt chart (progress) with a barchart (production) of another variable below (something very similar to coplot charts). I'm not sure if I should attempt to build this from scratch (using grid or some of the basic graphics features) or if there's a similar feature in one of the existing packages. I need to take the following (truncated) results,

unit,week,machine,volume,pdxratio
0,14,1,925.402525,1.00
0,15,1,925.402525,1.00
0,16,1,925.402525,1.00
0,17,1,702.792425,0.759445
1,46,1,1007.664896,1.00
1,47,1,1007.664896,1.00
1,48,1,1007.664896,1.00
1,49,1,563.005311,0.558723
2,33,1,1019.781108,1.00
2,34,1,1019.781108,1.00
2,35,1,1019.781108,1.00
2,36,1,697.656677,0.684124
3,41,2,1043.451341,1.00
3,42,2,1043.451341,1.00
3,43,2,1043.451341,1.00
3,44,2,741.645977,0.710762
4,7,2,1048.494508,1.00
4,8,2,1048.494508,1.00

and generate charts over unit and week (both as factors?). I think I should be using aggregate, but wanted to find out if there's a better method. Thanks, Jeff.
-- Jeff D. Hamann, Forest Informatics, Inc., PO Box 1421, Corvallis, Oregon 97339-1421, phone 541-754-1428, fax 541-752-0288, [EMAIL PROTECTED], http://www.forestinformatics.com
Re: [R] OOT: frailty-multinivel
On Wed, 24 Nov 2004, Kjetil Brinchmann Halvorsen wrote: Hola! I started to search for information about multilevel survival models, and found frailty in R. This seems to be something of the same; is it the same?
More or less. Shared frailty models are the same as hierarchical/mixed survival models. R uses a fitting method that is equivalent to maximum likelihood only when exp(random effects) has a Gamma distribution. The survival package can fit random intercept models; the new kinship package fits much more general mixed models. [I will put in my usual objection to the term "multilevel model" being used to refer solely to models that include unmeasured variables]
Then: why the name frailty (weakness?)
The idea is that 'weaker' individuals fail earlier than 'stronger' individuals for the same values of covariates. The concept has been used both for modelling correlation between survival times and to motivate parametric models that give an initially decreasing hazard. I have a vague impression that the term originated in Scandinavia somewhere. -thomas
[R] what does order() stand for in an lme formula?
I'm a beginner in R, trying to fit linear models with different intercepts per group, of the type y ~ A*x1 + B, where x1 is a numerical variable. I cannot understand whether I should use
y1 ~ x1 + 1
or
y1 ~ order(x1) + 1
Although in the toy example included it makes a small difference, in models with many groups the models without order() converge slower, if at all! Please help.

--- R script : START ---
## what does order() do in an lme formula?

# prep data
y1 <- c(rnorm(25, 35, sd=5), rnorm(17, 55, sd=4)) # a line parallel to the x-axis (slope=0) with noise
x1 <- c(sample(1:25, 25, replace=F), sample(1:17, 17, replace=F)) # scramble the x, so that they do not appear in order
f1 <- c(rep("A", 25), rep("B", 17))
dat1 <- data.frame(y1, x1, f1)
x1 <- NULL
y1 <- NULL
f1 <- NULL

# load libraries
require(nlme)
require(gmodels) # for ci()

# fit model with and w/o order()
dat1.lm.1 <- lm(y1 ~ x1 + f1 - 1, data=dat1)
dat1.lm.2 <- lm(y1 ~ order(x1) + f1 - 1, data=dat1)

# using lme, and assigning f1 to the random effects; this is different than in lm(),
# but in my larger models f1 is a repeated experiment vs a fixed factor
dat1.lme.1 <- lme(y1 ~ x1 + 1, random= ~ 1 | f1, data=dat1, method="ML")
dat1.lme.2 <- update(dat1.lme.1, fixed=y1 ~ order(x1) + 1, random= ~ 1 | f1)

# compare
summary(dat1.lm.1)
summary(dat1.lm.2)
ci(dat1.lm.1)
ci(dat1.lm.2)
summary(dat1.lme.1)
summary(dat1.lme.2)
ci(dat1.lme.1)
ci(dat1.lme.2)
--- R script : END ---

--- R session: START ---
> summary(dat1.lm.1)
Call: lm(formula = y1 ~ x1 + f1 - 1, data = dat1)
Residuals:
    Min      1Q  Median      3Q     Max
-7.1774 -2.9020 -0.1616  2.3576 10.0103
Coefficients:
    Estimate Std. Error t value Pr(>|t|)
x1  -0.06173    0.09318  -0.663    0.512
f1A 35.25704    1.43540  24.563   <2e-16 ***
f1B 54.98193    1.25519  43.804   <2e-16 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.'
0.1 ` ' 1
Residual standard error: 3.851 on 39 degrees of freedom
Multiple R-Squared: 0.9928, Adjusted R-squared: 0.9923
F-statistic: 1799 on 3 and 39 DF, p-value: < 2.2e-16

> summary(dat1.lm.2)
Call: lm(formula = y1 ~ order(x1) + f1 - 1, data = dat1)
Residuals:
    Min      1Q  Median      3Q     Max
-7.0089 -3.0955  0.1829  2.3387 10.0083
Coefficients:
           Estimate Std. Error t value Pr(>|t|)
order(x1) -0.002098   0.049830  -0.042    0.967
f1A       34.502668   1.381577  24.973   <2e-16 ***
f1B       54.466926   1.346118  40.462   <2e-16 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
Residual standard error: 3.872 on 39 degrees of freedom
Multiple R-Squared: 0.9927, Adjusted R-squared: 0.9922
F-statistic: 1779 on 3 and 39 DF, p-value: < 2.2e-16

> ci(dat1.lm.1)
       Estimate   CI lower   CI upper Std. Error      p-value
x1  -0.06173363 -0.2502001  0.1267329 0.09317612 5.115170e-01
f1A 35.25703803 32.3536752 38.1604009 1.43539619 2.461184e-25
f1B 54.98192838 52.4430771 57.5207797 1.25518501 8.793394e-35

> ci(dat1.lm.2)
              Estimate   CI lower    CI upper Std. Error      p-value
order(x1) -0.002097868 -0.1028889  0.09869315 0.04983016 9.666335e-01
f1A       34.502667919 31.7081648 37.29717105 1.38157694 1.338416e-25
f1B       54.466925648 51.7441455 57.18970575 1.34611772 1.819408e-33

> summary(dat1.lme.1)
Linear mixed-effects model fit by maximum likelihood
Data: dat1
     AIC      BIC    logLik
249.2458 256.1965 -120.6229
Random effects:
Formula: ~1 | f1
        (Intercept) Residual
StdDev:    9.819254 3.802406
Fixed effects: y1 ~ x1 + 1
               Value Std.Error DF   t-value p-value
(Intercept) 45.14347  7.215924 39  6.256090  0.0000
x1          -0.06517  0.094245 39 -0.691489  0.4934
Correlation:
   (Intr)
x1 -0.144
Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-1.90575027 -0.76278244 -0.02205757  0.60302788  2.61718120
Number of Observations: 42
Number of Groups: 2

> summary(dat1.lme.2)
Linear mixed-effects model fit by maximum likelihood
Data: dat1
    AIC      BIC    logLik
249.741 256.6917 -120.8705
Random effects:
Formula: ~1 | f1
        (Intercept) Residual
StdDev:    9.944285 3.823607
Fixed effects: y1 ~ order(x1)
               Value Std.Error DF   t-value p-value
(Intercept) 44.48952  7.309834 39  6.086256  0.0000
order(x1)   -0.00297  0.050415 39 -0.058968  0.9533
Correlation:
          (Intr)
order(x1) -0.146
Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-1.85021774 -0.81922106  0.03922119  0.61748886  2.60194590
Number of Observations: 42
Number of Groups: 2

> ci(dat1.lme.1)
               Estimate   CI lower   CI upper Std. Error DF      p-value
(Intercept) 45.14346803 30.5478836 59.7390525  7.2159242 39 2.283608e-07
x1          -0.06516927 -0.2557976  0.1254590  0.0942449 39
Re: [R] scatterplot of 100000 points and pdf file format
On Wednesday 24 November 2004 07:34, Witold Eryk Wolski wrote: Hi, I want to draw a scatter plot with 1M and more points and save it as pdf. This makes the pdf file large. So I tried to save the file first as png and then convert it to pdf. This looks OK if printed, but if viewed e.g. with Acrobat as a document figure the quality is bad. Does anyone know a way to reduce the size but keep the quality?
I would strongly suggest a different method to present the data, such as a contour plot or 3D bar plot. An XY plot with a million points is unlikely to be readable unless it is produced as a large-format print. At 200 DPI printed, 1,000,000 discrete points require a minimum of a 5 inch (12.7 cm) by 5 inch area. Besides, other than being visually overwhelming, what information would such a plot offer a viewer? John
Re: [R] scatterplot of 100000 points and pdf file format
Do you have a measure of scatter, or can you pick outliers, that could allow you to produce a mixed plot using either density or hexbinned data, with only the outliers placed after-the-fact using points()? Sean
-Original Message- From: Witold Eryk Wolski, Sent: Wednesday, November 24, 2004 7:35 AM, Subject: [R] scatterplot of 100000 points and pdf file format [snip]
Re: [R] scatterplot of 100000 points and pdf file format
Have you tried

plot(..., pch='.')

This will use the period as the plotting character instead of the 'circle' which is drawn by default. This should reduce the size of the PDF file. I have done scatter plots with 2M points and they are typically meaningless with that many points overlaid. Check out 'hexbin' on Bioconductor (you can download the package from the RGUI window). This is a much better way of showing some information, since it will plot the number of points that fall within each hexagon. I have found this to be a better way of looking at some data.
__ James Holtman - "What is the problem you are trying to solve?" - Executive Technical Consultant, Office of Technology, Convergys, [EMAIL PROTECTED], +1 (513) 723-2929
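The pch="." point can be checked directly; the file names and simulated data below are illustrative, and exact sizes vary by platform and R version, so none are quoted:

```r
# Compare pdf sizes: default open-circle glyphs vs single-pixel dots.
set.seed(1)
x <- rnorm(1e5)
y <- 2 * x + rnorm(1e5)
pdf("circles.pdf"); plot(x, y);            dev.off()  # default pch
pdf("dots.pdf");    plot(x, y, pch = "."); dev.off()  # period glyph
file.info(c("circles.pdf", "dots.pdf"))$size  # dots.pdf is typically smaller
```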
RE: [R] Re: T-test syntax question
Actually, you can still use t.test with one vector of data. Say the differences are in d (a vector of numbers); you can use t.test(d), and then by default it tests whether mu=0. You can also specify the confidence level by adding conf.level=0.95, etc. You can also type ?t.test at the R prompt to get more information. Hope this helps! S Fan
From: Vito Ricci. Hi, in case of paired data, if you have only the differences and not the original data, you can get this t test based on the differences. Say d is the vector of differences, and suppose you wish to test if the mean difference is equal to zero:

md <- mean(d)       ## sample mean of differences
sdd <- sd(d)        ## sample sd of differences
n <- length(d)      ## sample size
t.value <- md/(sdd/sqrt(n))          ## sample t-value with n-1 df
pt(t.value, n-1, lower.tail=FALSE)   ## p-value of test

> set.seed(13)
> d <- rnorm(50)
> md <- mean(d)
> sdd <- sd(d)
> n <- length(d)
> t.value <- md/(sdd/sqrt(n))
> pt(t.value, n-1, lower.tail=FALSE)
[1] 0.5755711

Best regards, Vito
Steven F. Freeman wrote: I'd like to do a t-test to compare the Delta values of items with Crit=1 with Delta values of items with Crit=0. What is the t.test syntax? It should produce a result like this below (I can't get in touch with the person who originally did this for me):

Welch Two Sample t-test
data: t1$Delta by Crit
t = -3.4105, df = 8.674, p-value = 0.008173
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.04506155 -0.00899827
sample estimates:
mean in group FALSE  mean in group TRUE
         0.03331391          0.06034382

Thanks.
= Becoming builders of solutions. "The business of the statistician is to catalyze the scientific learning process." George E. P.
Box
Visit the portal http://www.modugno.it/ and in particular the section on Palese http://www.modugno.it/archivio/palese/
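For comparison with the hand computation in this thread, the same simulated differences can be fed straight to t.test(); the seed matches Vito's example, and the interpretation comments are mine:

```r
# One-sample t test on paired differences: t.test(d) tests mu = 0.
set.seed(13)
d <- rnorm(50)                        # stand-in for paired differences
t.test(d, mu = 0, conf.level = 0.95)  # two-sided by default
# alternative = "greater" gives the one-tailed upper-tail p-value,
# matching pt(t.value, n-1, lower.tail=FALSE) computed by hand above.
t.test(d, alternative = "greater")$p.value
```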
Re: [R] scatterplot of 100000 points and pdf file format
I would strongly suggest a different method to present the data such as a contour plot or 3D bar plot. An XY plot with a million points is unlikely to be readable unless it is produced as a large format print. At 200 DPI printed, 1,000,000 discrete points requires a minimum of a 5 inch (12.7 cm) by 5 inch area. Besides, other than being visually overwhelming, what information would such a plot offer a viewer?
I recall some of our extreme value statistics people printing things like this. Several million points on a plot. Most of which were in a big, thick block of toner, and then a few hundred at the extremes, which was where they were interested in looking. Of course these things took an hour to print on a PostScript printer at the time. I think I suggested only plotting points for which X > someThreshold. Saved on toner and time. Got a bit tricky in the bivariate case though, where you really needed to plot points outside some ellipse that you knew would otherwise be a big black blob, and then you filled that in with a black ellipse. Contours or aggregation weren't any use, since they were interested in the point patterns of the extreme value data. Baz
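Baz's plot-only-the-extremes idea might be sketched as follows; squared distance from the origin stands in for his ellipse criterion, and the 0.999 quantile is an illustrative cutoff:

```r
# Plot only the most extreme ~0.1% of a large bivariate cloud.
set.seed(1)
x <- rnorm(1e6)
y <- 2 * x + rnorm(1e6)
r2 <- x^2 + y^2                 # crude "extremeness" measure
thr <- quantile(r2, 0.999)      # keep roughly the top 1000 points
keep <- r2 > thr
plot(x[keep], y[keep], pch = ".",
     xlim = range(x), ylim = range(y))  # axes still span the full cloud
```

Only about a thousand points reach the device, so the pdf stays small while the extremes keep their exact positions.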
[R] Searching for antilog function
Dear R-users, I have a basic question about how to determine the antilog of a variable. Say I have some number, x, which is a factor of 2 such that x = 2^y. I want to figure out what y is, i.e. I am looking for the antilog base 2 of x. I have found log2 in the Reference Manual, but I am struggling with how to get the antilog of that. Any help will be appreciated!

> version
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    9.1
year     2004
month    06
day      21
language R

...heather
Re: [R] what does order() stand for in an lme formula?
Harry Athanassiou [EMAIL PROTECTED] writes: I'm a beginner in R, and trying to fit linear models with different intercepts per group, of the type y ~ A*x1 + B, where x1 is a numerical variable. I cannot understand whether I should use y1 ~ x1 + 1 or y1 ~ order(x1) + 1. [snip]
Er? What gave you the idea of using order in the first place? To the best of my knowledge, order(x) is also in this context just a function, which for the nth observation returns the position of the nth smallest observation in x. This is not likely to make sense as a predictor in a model.
-- O__ Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Blegdamsvej 3, 2200 Cph. N, Denmark, Ph: (+45) 35327918, FAX: (+45) 35327907
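A tiny base-R illustration of the point that order() returns an index permutation rather than a transformed x (the vector here is made up):

```r
# order() gives the positions that would sort x, not new x values.
x <- c(10, 3, 7)
order(x)     # 2 3 1: the smallest value sits at position 2, then 3, then 1
x[order(x)]  # 3 7 10, i.e. sort(x)
# Using order(x1) as a model term just feeds this permutation to lm()/lme(),
# which is why it is unlikely to make sense as a predictor.
```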
RE: [R] SAS or R software
neela v writes: Hi all. Can someone clarify this issue for me: feature-wise, which is better, R or SAS, leaving aside the commercial aspect associated with it? I suppose there are a few people who have worked on both R and SAS, and I wish they would be able to help me in deciding. Thank you for the help.
I very much doubt you can make an informed decision if you leave the commercial aspect (license) aside. A single Base SAS installation (server) can cost tens of thousands of [[your currency here; may need to multiply by 10 or 100 or more]] in the first year, then a percentage of that in the following years. (SAS software is not purchased, but licensed on a yearly basis.) Want more than Base SAS? Prepare your wallet: thousands upon thousands (per year) for regression, anova, clustering (SAS/Stat), graphics (SAS/Graph), time series (SAS/ETS), optimization (SAS/OR), etc. Then, if you want decision trees and neural networks (Enterprise Miner), I warmly recommend you quickly find a chair and sit down before you hear the price tag. Will you always work for an organization that licenses SAS software? Will the organization license all the modules you'll need? Will those modules do everything you want? As others have said, R is a lot more flexible, and the GPL ensures that whatever you can do today will continue to be expanded and improved (much faster than SAS Institute would want or be able to expand/improve SAS). All in all, if you're primarily interested in data analysis (and don't want, for example, to get a job as a SAS programmer) and still choose SAS, you will regret it one day. The benefits are few (such as robust manipulation of massive data sets - I mean in excess of hundreds of millions of rows) and the risks are high (whatever you do is dependent on proprietary, very expensive software). With R, almost the opposite is true: lots of benefits and no risks (nothing can take R away from you). HTH, b.
[R] SMVs
Hi Everyone, I am struggling to get going with support vector machines in R - svm() and predict() etc. Does anyone know of a good tutorial covering R and these things? Stephen
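A minimal sketch of the two functions named above, using the e1071 package (which provides svm() and a predict method) and the built-in iris data; the data set and the default radial kernel are illustrative choices, and the package's own documentation covers the details more fully:

```r
# Fit an SVM classifier and predict with it, via the e1071 package.
library(e1071)
data(iris)
fit <- svm(Species ~ ., data = iris)  # C-classification with default kernel
pred <- predict(fit, iris[, -5])      # predict on the feature columns
table(pred, iris$Species)             # confusion matrix on the training data
```

Tuning parameters such as cost and gamma can then be explored with e1071's tune() helper.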
RE: [R] Searching for antilog function
What's wrong with log2()?

> log2(16)
[1] 4

Isn't that exactly what you asked for? Andy
From: Heather J. Branton [snip]
Re: [R] Searching for antilog function
Consider:

> exp(log(1:11))
 [1]  1  2  3  4  5  6  7  8  9 10 11
> 2^log(1:11, 2)
 [1]  1  2  3  4  5  6  7  8  9 10 11
> 2^logb(1:11, 2)
 [1]  1  2  3  4  5  6  7  8  9 10 11
> 10^log10(1:11)
 [1]  1  2  3  4  5  6  7  8  9 10 11
> 2^log2(1:11)
 [1]  1  2  3  4  5  6  7  8  9 10 11

Does this answer the question? Hope this helps. Spencer Graves
Heather J. Branton wrote: [snip]
-- Spencer Graves, PhD, Senior Development Engineer, O: (408)938-4420; mobile: (408)655-4567
Re: [R] Searching for antilog function
On Wed, 24 Nov 2004 12:26:46 -0500, Heather J. Branton [EMAIL PROTECTED] wrote: [snip]
You seem to be confusing log with antilog, but log2(x) and 2^y are inverses of each other, i.e. log2(2^y) equals y and 2^log2(x) equals x (up to rounding error, of course). Duncan Murdoch
Re: [R] Searching for antilog function
Yes! Somehow I must have made an entry error when I tried that before, as I was getting something completely different! Thank you. ...heather
Liaw, Andy wrote: What's wrong with log2()? log2(16) gives 4. Isn't that exactly what you asked for? Andy [snip]
-- Heather J. Branton, Data Specialist, Public Data Queries, 310 Depot Street, Ste C, Ann Arbor, MI 48104, 734.213.4964 x312, "U.S. Census Microdata At Your Fingertips", http://www.pdq.com
[R] seriesMerge
Is there a function in R that is equivalent to S-PLUS's seriesMerge(x1, x2, pos=union) where x1, and x2 are of class timeSeries seriesMerge is in S-PLUS's finmetrics. I looked into R's mergeSeries (in fSeries part of Rmetrics) but I could not make it behave quite the same. In R it expected a timeSeries object and a matrix of the same row count. In S-PLUS when using the union option both objects can be of different lengths. __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] scatterplot of 100000 points and pdf file format
How about the following to plot only the 1,000 or so most extreme points (the outliers):

x <- rnorm(1e6)
y <- 2*x + rnorm(1e6)
plot(x, y, pch='.')
tmp <- chull(x, y)
while( length(tmp) < 1000 ){
    tmp <- c(tmp, seq(along=x)[-tmp][ chull(x[-tmp], y[-tmp]) ])
}
points(x[tmp], y[tmp], col='red')

Now just replace the initial plot with a hexbin or contour plot and you should have something that takes a lot less room but still shows the locations of the outer points.

Greg Snow, Ph.D. Statistical Data Center [EMAIL PROTECTED] (801) 408-8111
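For the base layer, a density display avoids drawing each point individually. A minimal sketch of the hexbin route, assuming the CRAN hexbin package is installed (the data here are the same simulated x and y as above, not anything from the original post):

```r
# Hexagonal binning: the million points collapse into a few thousand
# hexagon counts, so the saved plot stays small.
library(hexbin)

x <- rnorm(1e6)
y <- 2*x + rnorm(1e6)
plot(hexbin(x, y))   # binned density in place of 1e6 plotted points
```

The convex-hull outliers from the loop above can then be shown separately; note that plot() on a hexbin object draws via grid/lattice, so overlaying base-graphics points() on it is not straightforward.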
Re: [R] Searching for antilog function
Thank you so much for each of your responses. But to make sure I am clear (in my own mind), is this correct? If x = 2^y Then y = log2(x) Thanks again. I know this is basic. ...heather Duncan Murdoch wrote: On Wed, 24 Nov 2004 12:26:46 -0500, Heather J. Branton [EMAIL PROTECTED] wrote : Dear R-users, I have a basic question about how to determine the antilog of a variable. Say I have some number, x, which is a factor of 2 such that x = 2^y. I want to figure out what y is, i.e. I am looking for the antilog base 2 of x. I have found log2 in the Reference Manual. But I am struggling how to get the antilog of that. You seem to be confusing log with antilog, but log2(x) and 2^y are inverses of each other, i.e. log2(2^y) equals y and 2^log2(x) equals x (up to rounding error, of course). Duncan Murdoch -- ___ Heather J. Branton Public Data Queries Data Specialist 310 Depot Street, Ste C 734.213.4964 x312 Ann Arbor, MI 48104 U.S. Census Microdata At Your Fingertips http://www.pdq.com __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
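That is right, and R itself makes the check easy; a quick sanity test at the prompt:

```r
x <- 16
y <- log2(x)   # y is 4, since 16 = 2^4
2^y            # 2^log2(x) recovers x: 16
log2(2^10)     # log2(2^y) recovers y: 10
```

So log2() is the "log" direction and 2^ is the "antilog" direction; each undoes the other (up to rounding error).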
Re: [R] reshaping of data for barplot2
On Wed, 2004-11-24 at 19:24 +0100, Jean-Louis Abitbol wrote: Dear All, I have the following data coming out from

s <- with(final, summarize(norm, llist(gtt, fdiab),
    function(norm) {
        n <- sum(!is.na(norm))
        s <- sum(norm, na.rm=TRUE)
        binconf(s, n)
    }, type='matrix'))

i.e.

    gtt fdiab   norm.norm  norm.norm2  norm.norm3
18   PL    No  3.70370370  0.18997516 18.28346593
19   PL   Yes  3.57142857  0.18319034 17.71219774
13  TT1    No  9.09090909  3.59221932 21.15923917
14  TT1   Yes  1.81818182  0.09326054  9.60577606
...
10 HIGH    No 26.53061224 16.21128213 40.26228897
11 HIGH   Yes 10.00000000  4.66428345 20.14946472

I would like to reshape the data so that I can barplot2 treatments (gtt) with 2 beside bars for fdiab yes/no and add CIs. Various attempts have been unsuccessful, as I have not understood both the logic of 'beside' and the nature of the structures to be passed to barplot2. Not enough know-how with reshape and transpose either. Needless to say, Dotplot works great with this kind of data, but some Authority requests side-by-side bars with CIs. Thanks for any help.

Jean-Louis, For an easy example, see the help in barplot2, which uses the VADeaths dataset. The dataset looks like:

VADeaths
      Rural Male Rural Female Urban Male Urban Female
50-54       11.7          8.7       15.4          8.4
55-59       18.1         11.7       24.3         13.6
60-64       26.9         20.3       37.0         19.3
65-69       41.0         30.9       54.6         35.1
70-74       66.0         54.3       71.1         50.0

Now use:

barplot2(VADeaths)

This will yield a stacked bar plot, where there are 4 bars (one for each column in the matrix). Each bar then consists of 5 stacked sections, with each section representing the row values in each column. Now try:

barplot(VADeaths, beside = TRUE)

This now yields 4 groups of bars, with one group for each column. Each group then consists of 5 bars, one bar for each row value. Hopefully, that gives you some insight into how the matrix structure interacts with the 'beside' argument. In the case of your data above, I read the few rows into a data frame called 'df'.
So 'df' looks like:

df
   gtt fdiab norm.norm  norm.norm2 norm.norm3
1   PL    No  3.703704  0.18997516  18.283466
2   PL   Yes  3.571429  0.18319034  17.712198
3  TT1    No  9.090909  3.59221932  21.159239
4  TT1   Yes  1.818182  0.09326054   9.605776
5 HIGH    No 26.530612 16.21128213  40.262289
6 HIGH   Yes 10.000000  4.66428345  20.149465

To follow the VADeaths example above, you need to re-shape the required columns, each as three-column matrices, as follows:

height <- matrix(df$norm.norm, ncol = 3)
ci.l <- matrix(df$norm.norm2, ncol = 3)
ci.u <- matrix(df$norm.norm3, ncol = 3)
bars <- matrix(df$fdiab, ncol = 3)

Now, 'height' looks like:

height
         [,1]     [,2]     [,3]
[1,] 3.703704 9.090909 26.53061
[2,] 3.571429 1.818182 10.00000

'ci.l', 'ci.u' and 'bars' will of course look similar. So, now you could use barplot2 as follows:

mp <- barplot2(height, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u,
               beside = TRUE, names.arg = bars)

Note that I save the bar midpoints in 'mp'. Now, you can go back and put in the bar group labels as follows. First break out the unique values of 'gtt', keeping the order intact by using matrix():

labels <- matrix(df$gtt, ncol = 2, byrow = TRUE)
mtext(side = 1, at = colMeans(mp), text = labels[, 1], line = 3)

Note that I use 'byrow = TRUE' in the call to matrix() so that the order of the matrix is set properly. Thus, each column contains the group labels and looks like:

labels
     [,1]   [,2]
[1,] "PL"   "PL"
[2,] "TT1"  "TT1"
[3,] "HIGH" "HIGH"

So we just use the first column above in the call to mtext(). So that should do it, and it can be extended to your full dataset if the format is consistent with what you have above. One final (and important) note. There is another approach here that can be used, which is to keep your data in its initial state and specify the 'space' argument explicitly in the call to barplot2. This is actually less work than what we did above. In this case, we use the 'space' argument to group the bars explicitly, which is, in effect, what the 'beside' argument does internally.
We use each column from 'df' directly and set the 'space' argument to a repeating sequence of c(1, 0) for each of the 3 groups. Note that here we need to explicitly define the colors to use, since barplot2 uses 'grey' by default when 'height' is a vector (as does barplot). We also need to convert df$fdiab to a vector, otherwise the numeric factor codes will be used. The sequence then goes like this:

mp <- barplot2(df$norm.norm, plot.ci = TRUE, ci.l = df$norm.norm2,
               ci.u = df$norm.norm3, space = rep(c(1, 0), 3),
               col = rep(c("red", "yellow"), 3),
Re: [R] seriesMerge
On Wed, Nov 24, 2004 at 03:29:53PM -0500, Yasser El-Zein wrote: Is there a function in R that is equivalent to S-PLUS's seriesMerge(x1, x2, pos=union), where x1 and x2 are of class timeSeries? seriesMerge is in S-PLUS's finmetrics. I looked into R's mergeSeries (in the fSeries part of Rmetrics) but I could not make it behave quite the same. In R it expected a timeSeries object and a matrix of the same row count. In S-PLUS, when using the union option, both objects can be of different lengths.

The its (short for irregular time series) package has union() and intersect(). The zoo package also has some functions for this, I think. Topics like this get discussed a bit on the r-sig-finance list; you may want to glance at the archives or do some googling. Hth, Dirk -- If your hair is standing up, then you are in extreme danger. -- http://www.usafa.af.mil/dfp/cockpit-phys/fp1ex3.htm
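To make the zoo route concrete: merge.zoo() aligns two irregular series on the union of their time indexes by default, and on the intersection with all = FALSE. A minimal sketch, assuming the CRAN zoo package is installed (the series and dates are made up for illustration):

```r
library(zoo)

# Two irregular series with partly overlapping dates
z1 <- zoo(1:3, as.Date("2004-11-01") + c(0, 1, 3))
z2 <- zoo(4:5, as.Date("2004-11-01") + c(1, 5))

merge(z1, z2)              # union of the indexes; NA where a series has no value
merge(z1, z2, all = FALSE) # intersection of the indexes
```

Unlike the fSeries function mentioned above, the two zoo objects may have completely different lengths and indexes.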
[R] confidence interval of a average...
I have a sample of lung capacities from a population, measured against height. I need to know the 95% CI of the lung capacity of a person of average height. I have fitted a regression line. How do I get the minimum and maximum values of the 95% CI? My thinking was that this has something to do with covariance, but how? My other thinking was that I could derive the 0.975 (sqrt 0.95) CI for the height. Then I could take the lower height 0.975 CI value and calculate from that the lower 0.975 value for the lung capacity. And then do the same for the taller people. That is bound to be wrong though. Dunc
[R] Installing gregmisc under windows 2000
I went through the following steps using RGUI menus to install gregmisc from CRAN. It appears to install, but at the end R does not seem to be able to find it. Any idea what I'm doing wrong? Thanks, Rob

local({a <- CRAN.packages()
+ install.packages(select.list(a[,1],,TRUE), .libPaths()[1], available=a, dependencies=TRUE)})
trying URL `http://cran.r-project.org/bin/windows/contrib/2.0/PACKAGES'
Content type `text/plain; charset=iso-8859-1' length 23113 bytes
opened URL
downloaded 22Kb
trying URL `http://cran.r-project.org/bin/windows/contrib/2.0/gregmisc_2.0.0.zip'
Content type `application/zip' length 687958 bytes
opened URL
downloaded 671Kb
bundle 'gregmisc' successfully unpacked and MD5 sums checked
Delete downloaded files (y/N)? y
updating HTML package descriptions

library(gregmisc)
Error in library(gregmisc) : There is no package called 'gregmisc'

version
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R
[R] Modeling censored binomial / Poisson data
I have some count data (0 or 1 at each time point for each test subject) that I would like to model. Since the 1's are rather sparse, the Poisson distribution comes to mind, but I would also consider the binomial. The data are censored: they come from a clinical trial in which subjects were able to leave the study, and some were therefore lost to follow-up. I am aware of the capabilities of lme() and nlme() through the excellent book by Pinheiro and Bates, but am at a loss as to what to do with these count data. Ideally, I would like to compare the placebo and treatment groups in a meaningful way. Any input would be greatly appreciated. Thanks, Greg
[R] how to remove time series trend in R?
I have a set of data with a seasonal trend in the form of sin x, cos x, and I don't have any idea how to deal with it. Can you give me a starting point? Thanks, Terry
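One common starting point is harmonic regression: fit sine and cosine terms at the seasonal frequency with lm() and take the residuals as the deseasonalized series. A minimal sketch on simulated data (monthly data with period 12 is an assumption; substitute your own period):

```r
# Simulated monthly series with a sinusoidal seasonal component
t <- 1:120
y <- 5 + 2*sin(2*pi*t/12) + 3*cos(2*pi*t/12) + rnorm(120, sd = 0.5)

# Regress on the harmonic terms; the fitted values are the seasonal cycle
fit <- lm(y ~ sin(2*pi*t/12) + cos(2*pi*t/12))
deseasonalized <- residuals(fit)   # the series with the cycle removed
```

See also ?decompose and ?stl for classical seasonal decomposition of ts objects.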
Re: [R] Installing gregmisc under windows 2000
gregmisc is a bundle, not a package. Its description on CRAN is

gregmisc   Bundle of gtools, gdata, gmodels, gplots

so try one of those packages.

On Wed, 24 Nov 2004, Robert W. Baer, Ph.D. wrote:

 I went through the following steps using RGUI menus to install gregmisc
 from CRAN. It appears to install, but at the end R does not seem to be
 able to find it. Any idea what I'm doing wrong? Thanks, Rob

 local({a <- CRAN.packages()
 + install.packages(select.list(a[,1],,TRUE), .libPaths()[1], available=a, dependencies=TRUE)})
 trying URL `http://cran.r-project.org/bin/windows/contrib/2.0/PACKAGES'
 Content type `text/plain; charset=iso-8859-1' length 23113 bytes
 opened URL
 downloaded 22Kb
 trying URL `http://cran.r-project.org/bin/windows/contrib/2.0/gregmisc_2.0.0.zip'
 Content type `application/zip' length 687958 bytes
 opened URL
 downloaded 671Kb
 bundle 'gregmisc' successfully unpacked and MD5 sums checked
 ^^^^^^
 Delete downloaded files (y/N)? y
 updating HTML package descriptions

 library(gregmisc)
 Error in library(gregmisc) : There is no package called 'gregmisc'
 ^^^^^^^

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Installing gregmisc under windows 2000
Robert W. Baer, Ph.D. [EMAIL PROTECTED] writes: I went through the following steps using RGUI menus to install gregmisc from CRAN. It appears to install but at the end R does not seem to be able to find it. Any idea what I'm doing wrong?

It's a bundle nowadays, so you need to load one of its constituent packages.

-- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
[R] RE: RODBC and Table views
channel2 <- odbcConnectAccess("C:\\Documents and Settings\\Fælles\\Journal\\DATASUPERMARKED\\DANBIONOVEMBER2004.mdb", uid="")
sqlQuery(channel2, "select * from Afdelinger_output_tabel1B order by antal desc")

Does take views and tables!

Niels Steen Krogh Konsulent ZiteLab Mail: -- [EMAIL PROTECTED] Telefon: --- +45 38 88 86 13 Mobil: - +45 22 67 37 38 Adresse: --- ZiteLab Solsortvej 44 2000 F. ZiteLab - Let's Empower Your Data with Webservices
Re: [R] Installing gregmisc under windows 2000
Thanks for the clarification. Pursuant to the recent discussion of GUIs promoting ignorance among users, I plead guilty for CRAN installs, but they have generally saved so much time <g>. This does raise the question as to whether gregmisc and other bundles should appear on the install-packages-from-CRAN pop-up in RGUI. It also leaves me wondering what exactly was the REAL result of the apparently successful gregmisc install. I did help.search("bundle") and came away with nada. I am not sure where I should head to de-dumb myself. I found a little in Writing R Extensions, but this did not clarify the interaction with the RGUI install procedure for me. Thanks again. Rob

----- Original Message ----- From: Peter Dalgaard [EMAIL PROTECTED] To: Robert W. Baer, Ph.D. [EMAIL PROTECTED] Cc: R-Help [EMAIL PROTECTED] Sent: Wednesday, November 24, 2004 4:13 PM Subject: Re: [R] Installing gregmisc under windows 2000

 Robert W. Baer, Ph.D. [EMAIL PROTECTED] writes: I went through the following steps using RGUI menus to install gregmisc from CRAN. It appears to install but at the end R does not seem to be able to find it. Any idea what I'm doing wrong?

 It's a bundle nowadays, so you need to load one of its constituent packages.

 -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
Re: [R] confidence interval of a average...
It depends on whether you want 95% confidence intervals on the prediction or on the mean vital capacity. Try the following and see if it gets you started:

#Simulate data
height=48:72
vc=height*10+20*rnorm(72-48+1)
# Do regression
lm.vc=lm(vc~height)
# Confidence interval on mean vc
predict.lm(lm.vc,interval="confidence")
# Confidence interval on predicted vc
predict.lm(lm.vc,interval="prediction")
# Plot everything
plot(vc~height)
matlines(height,predict.lm(lm.vc,interval="confidence"),
  lty=c(1,2,2),col='blue')
matlines(height,predict.lm(lm.vc,interval="prediction"),
  lty=c(1,3,3),col=c('black','red','red'))

Rob

-- From: Duncan Harris [EMAIL PROTECTED] I have a sample of lung capacities from a population measured against height. I need to know the 95% CI of the lung capacity of a person of average height. I have fitted a regression line. How do I get the minimum and maximum values of the 95% CI? My thinking was that this has something to do with covariance, but how? My other thinking was that I could derive the 0.975 (sqrt 0.95) CI for the height. Then I could take the lower height 0.975 CI value and calculate from that the lower 0.975 value for the lung capacity. And then do the same for the taller people. That is bound to be wrong though. Dunc
Re: [R] confidence interval of a average...
Sorry. The last code line got destroyed by my emailer and should read:

matlines(height,predict.lm(lm.vc,interval="p"),
+   lty=c(1,3,3),col=c('black','red','red'))

----- Original Message ----- From: Robert W. Baer, Ph.D. [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, November 24, 2004 4:56 PM Subject: Re: [R] confidence interval of a average...

 It depends on whether you want 95% confidence intervals on the prediction or on the mean vital capacity. Try the following and see if it gets you started:

 #Simulate data
 height=48:72
 vc=height*10+20*rnorm(72-48+1)
 # Do regression
 lm.vc=lm(vc~height)
 # Confidence interval on mean vc
 predict.lm(lm.vc,interval="confidence")
 # Confidence interval on predicted vc
 predict.lm(lm.vc,interval="prediction")
 # Plot everything
 plot(vc~height)
 matlines(height,predict.lm(lm.vc,interval="confidence"),
   lty=c(1,2,2),col='blue')
 matlines(height,predict.lm(lm.vc,interval="prediction"),
   lty=c(1,3,3),col=c('black','red','red'))

 Rob

 -- From: Duncan Harris [EMAIL PROTECTED] I have a sample of lung capacities from a population measured against height. I need to know the 95% CI of the lung capacity of a person of average height. I have fitted a regression line. How do I get the minimum and maximum values of the 95% CI? My thinking was that this has something to do with covariance, but how? My other thinking was that I could derive the 0.975 (sqrt 0.95) CI for the height. Then I could take the lower height 0.975 CI value and calculate from that the lower 0.975 value for the lung capacity. And then do the same for the taller people. That is bound to be wrong though. Dunc
Re: [R] Installing gregmisc under windows 2000
That seems not to be the case under Linux, in terms of installation. You can install the bundle in the same way as installing an individual package, e.g.

R CMD INSTALL CRAN/contrib/main/gregmisc_2.0.0.tar.gz

to get all constituent packages installed.

On Wed, 24 Nov 2004 22:14:14 +0000 (GMT) Prof Brian Ripley [EMAIL PROTECTED] wrote: gregmisc is a bundle, not a package. Its description on CRAN is gregmisc Bundle of gtools, gdata, gmodels, gplots so try one of those packages.

-- Yuandan Zhang, PhD Animal Genetics and Breeding Unit The University of New England Armidale, NSW, Australia, 2351 E-mail: [EMAIL PROTECTED] Phone: (61) 02 6773 3786 Fax: (61) 02 6773 3266 http://agbu.une.edu.au AGBU is a joint venture of NSW Primary Industries and The University of New England to undertake genetic R&D for Australia's Livestock Industries
Re: [R] scatterplot of 100000 points and pdf file format
On Wed, 24-Nov-2004 at 10:22AM -0600, Marc Schwartz wrote:
| On Wed, 2004-11-24 at 16:34 +0100, Witold Eryk Wolski wrote:
| Hi,
| I want to draw a scatter plot with 1M and more points and save it as pdf.
| This makes the pdf file large.
| So I tried to save the file first as png and then convert it to pdf.
| This looks OK if printed, but if viewed e.g. with Acrobat as a document
| figure the quality is bad.
| Anyone know a way to reduce the size but keep the quality?
|
| Hi Eryk!
|
| Part of the problem is that in a pdf file, the vector based instructions
| will need to be defined for each of your 10^6 points in order to draw them.
|
| When trying to create a simple example:
|
| pdf()
| plot(rnorm(1e6), rnorm(1e6))
| dev.off()
|
| The pdf file is 55 Mb in size.
|
| One immediate thought was to try a ps file and, using the above plot,
| the ps file was only 23 Mb in size. So note that ps can be more efficient.
|
| Going to a bitmap might result in a much smaller file, but as you note,
| the quality does degrade as compared to a vector based image.
|
| I tried the above to a png, then converted to a pdf (using 'convert')
| and as expected, the image both viewed and printed was pixelated,
| since the pdf instructions are presumably drawing pixels and not vector
| based objects.

Using bitmap( ... , res = 300), I get a bitmap file of 56 Kb. It's rather slow, most of the time being taken up by gs, which is converting the vector image, I suspect. Time would be much shorter if, say, a circle of diameter 4 were left unplotted in the middle, and others have mentioned other ways of reducing redundant points. A pdf file slightly larger than the png file can be made directly from OpenOffice with the png imported into it. For a plot of 160mm square, this pdf printed unpixelated. Depending on what size (dimensions) you need to finish up with, you might find you could get away with a lower resolution than 300 dpi, but I usually find 200 too ragged.
HTH -- Patrick Connolly HortResearch Mt Albert Auckland New Zealand Ph: +64-9 815 4200 x 7188 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~ I have the world's largest collection of seashells. I keep it on all the beaches of the world ... Perhaps you've seen it. ---Steven Wright ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
RE: [R] Installing gregmisc under windows 2000
If I'm not mistaken, bundle is really only useful as a concept for distribution and installation. You distribute and install a bundle, but load the individual packages when you want to use them. Once you install the bundle, you won't see the name of the bundle in the list of installed packages, but you see the constituent packages, and those are what you load when you want to use them. [This is the same on all platforms, BTW.] Andy From: Robert W. Baer, Ph.D. Sent: Wednesday, November 24, 2004 5:45 PM To: Peter Dalgaard Cc: R-Help Subject: Re: [R] Installing gregmisc under windows 2000 Thanks for the clarification. Pursuant to the recent dicussion of GUI promoting ignornce among users, I plead guilty for CRAN installs but they have generally saved so much time.g. This does raise the question as to whether gregmisc and other bundles should appear on the install packages from CRAN pop-up in RGUI. It also leaves me wondering what exactly was the REAL result of the apparently successful gregmisc install . I did help.search(bundle) and coming away with nada. I am not sure where I should head to de-dumb myself. I found a little in writing R extensions, but this did not clarify the interaction with the RGUI install procedure for me. Thanks again. Rob - - Original Message - From: Peter Dalgaard [EMAIL PROTECTED] To: Robert W. Baer, Ph.D. [EMAIL PROTECTED] Cc: R-Help [EMAIL PROTECTED] Sent: Wednesday, November 24, 2004 4:13 PM Subject: Re: [R] Installing gregmisc under windows 2000 Robert W. Baer, Ph.D. [EMAIL PROTECTED] writes: I went through the following steps using RGUI menus to install gregmisc from CRAN. It appears to install but at the end R does not seem to be able to find it. Any idea what I'm doing wrong? It's a bundle nowadays, so you need to load one of it's constituent packages. -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. 
N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] scatterplot of 100000 points and pdf file format
On 24-Nov-04 Prof Brian Ripley wrote: On Wed, 24 Nov 2004 [EMAIL PROTECTED] wrote: 1. Multiply the data by some factor and then round the results to an integer (to avoid problems in step 2). Factor chosen so that the result of (4) below is satisfactory. 2. Eliminate duplicates in the result of (1). 3. Divide by the factor you used in (1). 4. Plot the result; save plot to PDF. As to how to do it in R: the critical step is (2), which with so many points could be very heavy unless done by a well-chosen procedure. I'm not expert enough to advise about that, but no doubt others are.

unique will eat that for breakfast:

x <- runif(1e6)
system.time(xx <- unique(round(x, 4)))
[1] 0.55 0.09 0.64 0.00 0.00
length(xx)
[1] 10001

'unique' will eat x for breakfast, indeed, but will have some trouble chewing (x,y). I still can't think of a neat way of doing that. Best wishes, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 25-Nov-04 Time: 00:37:15 -- XFMail --
RE: [R] scatterplot of 100000 points and pdf file format
-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of [EMAIL PROTECTED] Sent: Wednesday, November 24, 2004 16:37 PM To: R Help Mailing List Subject: RE: [R] scatterplot of 100000 points and pdf file format

 On 24-Nov-04 Prof Brian Ripley wrote: On Wed, 24 Nov 2004 [EMAIL PROTECTED] wrote: 1. Multiply the data by some factor and then round the results to an integer (to avoid problems in step 2). Factor chosen so that the result of (4) below is satisfactory. 2. Eliminate duplicates in the result of (1). 3. Divide by the factor you used in (1). 4. Plot the result; save plot to PDF. As to how to do it in R: the critical step is (2), which with so many points could be very heavy unless done by a well-chosen procedure. I'm not expert enough to advise about that, but no doubt others are.

 unique will eat that for breakfast:

 x <- runif(1e6)
 system.time(xx <- unique(round(x, 4)))
 [1] 0.55 0.09 0.64 0.00 0.00
 length(xx)
 [1] 10001

 'unique' will eat x for breakfast, indeed, but will have some trouble chewing (x,y).

xx <- data.frame(x=round(runif(1e6),4), y=round(runif(1e6),4))
system.time(xx2 <- unique(xx))
[1] 14.23 0.06 14.34 NA NA

The time does not seem too bad, depending on how many times it has to be performed. --Matt

Matt Austin Statistician Amgen One Amgen Center Drive M/S 24-2-C Thousand Oaks CA 93021 (805) 447 - 7431

 I still can't think of a neat way of doing that. Best wishes, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 25-Nov-04 Time: 00:37:15 -- XFMail --
RE: [R] scatterplot of 100000 points and pdf file format
On 25-Nov-04 Ted Harding wrote: 'unique' will eat x for breakfast, indeed, but will have some trouble chewing (x,y). I still can't think of a neat way of doing that. Best wishes, Ted.

Sorry, I don't want to be misunderstood. I didn't mean that 'unique' won't work for arrays. What I meant was:

X <- round(rnorm(1e6),3); Y <- round(rnorm(1e6),3)
system.time(unique(X))
[1] 0.74 0.07 0.81 0.00 0.00
system.time(unique(cbind(X,Y)))
[1] 350.81 4.56 356.54 0.00 0.00

However, still rounding to 3 d.p., we can try packing:

Z <- 1*X + 1000*Y
system.time(W <- unique(Z))
[1] 0.83 0.05 0.88 0.00 0.00
length(W)
[1] 961523

Though the runtime is small, we don't get much reduction, and W still has to be unpacked. With rounding to 2 d.p.:

X <- round(rnorm(1e6),2); Y <- round(rnorm(1e6),2)
Z <- 1*X + 1000*Y
system.time(W <- unique(Z))
[1] 1.31 0.01 1.32 0.00 0.00
length(W)
[1] 209882

so now it's about 1/5, but visible discretisation must be getting close. With 1 d.p.:

X <- round(rnorm(1e6),1); Y <- round(rnorm(1e6),1)
Z <- 1*X + 1000*Y
system.time(W <- unique(Z))
[1] 0.92 0.01 0.93 0.00 0.00
length(W)
[1] 4953

there's a good reduction (about 1/200), but the discretisation would definitely now be visible. However, as I suggested before, there's an issue of choice of constant (i.e. of the resolution of the discretisation), so that there's a useful reduction and also the plot is acceptable. I'd still like to learn of a method which avoids the above method of packing, which strikes me as clumsy (but maybe it's the best way after all). Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 25-Nov-04 Time: 01:45:48 -- XFMail --
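One way to sidestep the numeric packing (and its unpacking step) is to key each (x, y) pair as a string and use duplicated(); a sketch:

```r
X <- round(rnorm(1e6), 2)
Y <- round(rnorm(1e6), 2)

keep <- !duplicated(paste(X, Y))   # TRUE at the first occurrence of each pair
Xu <- X[keep]
Yu <- Y[keep]                      # unique (x, y) pairs, nothing to unpack
```

paste() has some overhead of its own, but this avoids both the choice of a packing constant and any risk of two distinct pairs colliding in the packed value.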
Re: [R] confidence interval of a average...
Sorry if this was not clear. This is more of a theoretical question than an R-coding question. I need to calculate the predicted response and 95% prediction interval for a man of average height. So I need to predict the average response, which is easily done by taking the mean height and using the regression formula. However, average height has to be calculated from the sample, and thus I have confidence in that. Let's say the mean is 163cm. I think that I can't take the 163cm value and calculate the CI from just the sd of the lung capacity, because that would be too narrow; I think covariance must come into it somehow. Or can I just do a 97.5% CI on the height, take those extreme values, and do a 97.5% CI on them?

Then, you want the prediction interval on the mean VC, which is the tighter of the two confidence intervals and does not include the extra variability of VC about its mean. As always with confidence intervals, you are free to look at either the 95% CI or the 97.5% CI, depending on what kind of statement you'd like to make about your confidence. I do not understand your comment about covariance at all. Let me try again with data in your units. Note that the CI varies with height and is smallest at the mean height, whether you are talking about the CI on the mean VC or the CI on the predicted VC. For comparison, the red lines are the 95% CI on the mean regression-fit VC, and the blue lines are the 95% CI on the predicted VC. The simulated data are set to have a mean height that varies around 163 cm.
# Make simulated data with mean height near 163
# vc approximately in liter values with scatter
height=sort(rnorm(50,mean=163,sd=35))
vc=0.03*height+.5*rnorm(50)
# Plot the simulated data
plot(vc~height,ylab='vital capacity (l)',xlab='Height (cm)')
# Set up data frame with values of height you wish a CI on.
# Column heading must be same as for the lm() fit x variable;
# in this case, the data frame contains only the mean height.
mean.height.fit.ci=data.frame(height=mean(height))
# Print out the mean height
mean.height.fit.ci
# Fit the regression model
vc.lm=lm(vc~height)
# Draw 95% confidence intervals on mean vc at various heights (red) (min at mean(height))
matlines(height,predict.lm(vc.lm,interval="confidence"),lty=c(1,2,2),
  col=c('black','red','red'))
# Draw 95% confidence intervals on new vc at various heights (blue) (min again at mean(height))
matlines(height,predict.lm(vc.lm,interval="prediction"),lty=c(1,3,3),
  col=c('black','blue','blue'))
# Determine 95% CI on mean vc at mean height
predict.lm(vc.lm,mean.height.fit.ci,interval="confidence")
# Determine 97.5% CI on mean vc at mean height
predict.lm(vc.lm,mean.height.fit.ci,interval="confidence",level=0.975)

You might wish to read a little more about regression CIs in a good statistics book. HTH, Rob
[R] Re: Hi!
Thank you for your interest in APHA 2004. This is an automated reply confirming that we have received your email inquiry. General questions, requests, modifications and/or new registrations received by email, fax or mail will be processed within 7 business days. At that time, you will receive either a letter of confirmation reflecting your requested modifications or a response to your inquiry via email. Confirmation letters will be sent to the email address you provided on your registration form, or by fax if no email address was provided. Meanwhile, the most up-to-date meeting and program information is online! Visit www.APHA.org and register at the same time! Registration does not get any more convenient than with One-Stop-Registration. Should you have any questions, please do not hesitate to contact us. Sincerely, APHA Registrar. Phone: (514) 228-3009 Fax: (514) 228-3148 Email: [EMAIL PROTECTED] APHA c/o Laser Registration 1200 G Street, NW Suite 800 Washington, DC 20005-3967 On Nov 24, 2004, at 9:03 PM, [EMAIL PROTECTED] wrote:
RE: [R] scatterplot of 100000 points and pdf file format
On 25-Nov-04 Austin, Matt wrote:

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, November 24, 2004 16:37 PM
To: R Help Mailing List
Subject: RE: [R] scatterplot of 100000 points and pdf file format

On 24-Nov-04 Prof Brian Ripley wrote:
On Wed, 24 Nov 2004 [EMAIL PROTECTED] wrote:

1. Multiply the data by some factor and then round the results to an integer (to avoid problems in step 2). The factor is chosen so that the result of (4) below is satisfactory.
2. Eliminate duplicates in the result of (1).
3. Divide by the factor you used in (1).
4. Plot the result; save the plot to PDF.

As to how to do it in R: the critical step is (2), which with so many points could be very heavy unless done by a well-chosen procedure. I'm not expert enough to advise about that, but no doubt others are.

unique will eat that for breakfast:

x <- runif(1e6)
system.time(xx <- unique(round(x, 4)))
[1] 0.55 0.09 0.64 0.00 0.00
length(xx)
[1] 10001

'unique' will eat x for breakfast, indeed, but will have some trouble chewing (x,y).

xx <- data.frame(x=round(runif(100000),4), y=round(runif(100000),4))
system.time(xx2 <- unique(xx))
[1] 14.23 0.06 14.34 NA NA

The time does not seem too bad, depending on how many times it has to be performed. --Matt

Interesting! Let's see. Starting again:

X <- round(rnorm(1e6),3); Y <- round(rnorm(1e6),3)
XY <- cbind(X,Y)
system.time(unique(XY))
[1] 288.22 3.00 291.38 0.00 0.00
XY <- data.frame(x=X, y=Y)
system.time(unique(XY))
[1] 72.38 0.84 74.44 0.00 0.00

Data Frames Are Fast Food!!! Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 25-Nov-04 Time: 02:12:20 -- XFMail --
[R] Error in anova(): objects must inherit from classes
Hello: Let me rephrase my question to attract interest in the problem I'm having. When I apply anova() to two equations estimated using glmmPQL, I get a complaint:

anova(fm1, fm2)
Error in anova.lme(fm1, fm2) :
  Objects must inherit from classes "gls", "gnls", "lm", "lmList", "lme", "nlme", "nlsList", or "nls"

The two equations I estimated are these:

fm1 <- glmmPQL(choice ~ day + stereotypy, random = ~ 1 | bear,
               data = learning, family = binomial)
fm2 <- glmmPQL(choice ~ day + envir + stereotypy, random = ~ 1 | bear,
               data = learning, family = binomial)

Individually, I get results from anova():

anova(fm1)
            numDF denDF   F-value p-value
(Intercept)     1  2032   7.95709  0.0048
day             1  2032 213.98391  <.0001
stereotypy      1  2032   0.42810  0.5130

anova(fm2)
            numDF denDF   F-value p-value
(Intercept)     1  2031   5.70343  0.0170
day             1  2031 213.21673  <.0001
envir           1  2031  12.50388  0.0004
stereotypy      1  2031   0.27256  0.6017

I did look through the archives but didn't find anything relevant to my problem. Hope someone can help. ANDREW

platform i586-mandrake-linux-gnu
arch     i586
os       linux-gnu
system   i586, linux-gnu
status
major    2
minor    0.0
year     2004
month    10
day      04
language R

-- Andrew R. Criswell, Ph.D. Graduate School, Bangkok University mailto:[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]
RE: [R] scatterplot of 100000 points and pdf file format
From: [EMAIL PROTECTED] On 25-Nov-04 Ted Harding wrote: 'unique' will eat x for breakfast, indeed, but will have some trouble chewing (x,y). I still can't think of a neat way of doing that. Best wishes, Ted.

Sorry, I don't want to be misunderstood. I didn't mean that 'unique' won't work for arrays. What I meant was:

X <- round(rnorm(1e6),3); Y <- round(rnorm(1e6),3)
system.time(unique(X))
[1] 0.74 0.07 0.81 0.00 0.00
system.time(unique(cbind(X,Y)))
[1] 350.81 4.56 356.54 0.00 0.00

Do you know if the majority of that time is spent in unique() itself? If so, in which method? What I see is:

X <- round(rnorm(1e6),3); Y <- round(rnorm(1e6),3)
system.time(unique(X), gcFirst=TRUE)
[1] 0.25 0.01 0.26 NA NA
system.time(unique(cbind(X,Y)), gcFirst=TRUE)
[1] 101.80 0.34 104.61 NA NA
system.time(dat <- data.frame(x=X, y=Y), gcFirst=TRUE)
[1] 10.17 0.00 10.24 NA NA
system.time(unique(dat), gcFirst=TRUE)
[1] 23.94 0.11 24.15 NA NA

Andy

However, still rounding to 3 d.p., we can try packing:

Z <- 1*X + 1000*Y
system.time(W <- unique(Z))
[1] 0.83 0.05 0.88 0.00 0.00
length(W)
[1] 961523

Though the runtime is small, we don't get much reduction, and W still has to be unpacked. With rounding to 2 d.p.:

X <- round(rnorm(1e6),2); Y <- round(rnorm(1e6),2)
Z <- 1*X + 1000*Y
system.time(W <- unique(Z))
[1] 1.31 0.01 1.32 0.00 0.00
length(W)
[1] 209882

so now it's about 1/5, but visible discretisation must be getting close. With 1 d.p.:

X <- round(rnorm(1e6),1); Y <- round(rnorm(1e6),1)
Z <- 1*X + 1000*Y
system.time(W <- unique(Z))
[1] 0.92 0.01 0.93 0.00 0.00
length(W)
[1] 4953

there's a good reduction (about 1/200), but the discretisation would definitely now be visible. However, as I suggested before, there's an issue of choice of constant (i.e. of the resolution of the discretisation), so that there's a useful reduction and also the plot is acceptable. I'd still like to learn of a method which avoids the above method of packing, which strikes me as clumsy (but maybe it's the best way after all). Ted.
E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 25-Nov-04 Time: 01:45:48 -- XFMail --
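Ted asks for a way of de-duplicating (x,y) pairs that avoids choosing a packing constant. The sketch below is not from the thread; it uses duplicated() on a character key (or, equivalently, unique() on a complex packing of the pair), with made-up data sizes:

```r
# Sketch (not from the thread): deduplicate rounded (x, y) pairs without
# hand-picking a packing constant.
set.seed(1)
x <- round(rnorm(1e5), 2)
y <- round(rnorm(1e5), 2)

# One character key per point; duplicated() marks repeats in a single pass
key <- paste(x, y, sep = ",")
keep <- !duplicated(key)
xu <- x[keep]
yu <- y[keep]
length(xu)   # far fewer points to hand to pdf()

# An equivalent trick packs each pair into one complex number:
# distinct pairs map to distinct complex values, so unique() suffices
zu <- unique(complex(real = x, imaginary = y))
stopifnot(length(zu) == length(xu))
```

Either form avoids the resolution-dependent choice of multiplier, at the cost of building an intermediate vector the length of the data.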
Re: [R] scatterplot of 100000 points and pdf file format
Another possibility might be to use a 2d kernel density estimate (e.g. kde2d from library(MASS)). Then for the high-density areas plot the density contours; for the low-density areas plot the individual points. Hadley
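Hadley's suggestion might be sketched as follows; the grid size, density cutoff, and simulated data are arbitrary choices of mine, not from the post:

```r
# Sketch of the hybrid density/points plot: contours where dense,
# individual points where sparse. Grid size and the 10% cutoff are
# arbitrary illustration choices.
library(MASS)

set.seed(42)
x <- rnorm(1e5)
y <- x + rnorm(1e5)

dens <- kde2d(x, y, n = 100)

# Look up the estimated density at each observed point on the grid
ix <- findInterval(x, dens$x)
iy <- findInterval(y, dens$y)
point.dens <- dens$z[cbind(ix, iy)]

# Draw only the sparsest 10% of points singly; contours carry the rest
cutoff <- quantile(point.dens, 0.10)
low <- point.dens < cutoff

plot(x[low], y[low], pch = ".", xlab = "x", ylab = "y")
contour(dens, add = TRUE)
```

This keeps the PDF small because only the low-density fringe is stored as individual points.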
[R] logistic regression and 3PL model
Hello colleagues, I am a novice with R and am stuck with an analysis I am trying to conduct. Any suggestions or feedback would be very much appreciated. I am analyzing a data set of psi (ESP) ganzfeld trials. The response variable is binary (correct/incorrect), with a 25% base rate. I've looked around the documentation and other online resources and cannot find how I can correct for that base rate when I conduct a logistic regression. I understand that the correction would be equivalent to the three-parameter logistic (3PL) model in IRT, but am unsure how best to fit it from a logistic regression in R. Thanks much, Mike Lau __ Michael Y. Lau, M.A. 118 Haggar Hall Department of Psychology University of Notre Dame Notre Dame, IN 46556
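One possible approach (my sketch, not an established recipe from the list): with the guessing parameter fixed at the 25% chance rate, the 3PL reduces to P(correct) = 0.25 + 0.75 * plogis(b0 + b1*z), which can be fitted by direct maximum likelihood with optim(). The data and the predictor name z below are hypothetical.

```r
# Sketch (hypothetical data): logistic regression with a known 25%
# guessing floor, fitted by direct maximum likelihood. This is the 3PL
# form with the guessing parameter fixed at the chance rate.
set.seed(7)
n <- 500
z <- rnorm(n)                                 # hypothetical predictor
p.true <- 0.25 + 0.75 * plogis(-1 + 0.8 * z)  # true response curve
correct <- rbinom(n, 1, p.true)               # simulated binary outcomes

# Negative log-likelihood; plogis() keeps p in (0.25, 1)
negll <- function(b) {
  p <- 0.25 + 0.75 * plogis(b[1] + b[2] * z)
  -sum(dbinom(correct, 1, p, log = TRUE))
}

fit <- optim(c(0, 0), negll, hessian = TRUE)
fit$par                          # estimates of (b0, b1)
sqrt(diag(solve(fit$hessian)))   # approximate standard errors
```

An ordinary glm(family = binomial) cannot impose the 0.25 floor, which is why the likelihood is written out explicitly here.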
RE: [R] Error in anova(): objects must inherit from classes
The lme method for anova() checks the inheritance of the object when a single object is supplied, which is why there is no error when you use one object at a time. When two objects are supplied, the method uses the class of the object by invoking the data.class function (which does not list the glmmPQL class). If you replace the check of the class with a check of inheritance, it should work. Following is a check using the example listed in MASS (Venables and Ripley):

library(MASS)
library(nlme)
x1 <- glmmPQL(y ~ I(week > 2), random = ~ 1 | ID,
              family = binomial, data = bacteria)
iteration 1 iteration 2 iteration 3 iteration 4 iteration 5 iteration 6
x2 <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
              family = binomial, data = bacteria)
iteration 1 iteration 2 iteration 3 iteration 4 iteration 5 iteration 6

anova(x1)
            numDF denDF F-value p-value
(Intercept)     1   169      35  <.0001
I(week > 2)     1   169      21  <.0001

anova(x2)
            numDF denDF F-value p-value
(Intercept)     1   169      35  <.0001
trt             2    47       2    0.22
I(week > 2)     1   169      20  <.0001

anova(x1, x2)
Error in anova.lme(x1, x2) :
  Objects must inherit from classes "gls", "gnls", "lm", "lmList", "lme", "nlme", "nlsList", or "nls"

After replacement:

anovaLME(x1, x2)
   Model df  AIC  BIC logLik   Test L.Ratio p-value
x1     1  4 1107 1121   -550
x2     2  6 1114 1134   -551 1 vs 2     2.6    0.28

Matt Austin Statistician Amgen One Amgen Center Drive M/S 24-2-C Thousand Oaks CA 93021 (805) 447 - 7431

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Andrew Criswell
Sent: Wednesday, November 24, 2004 18:47 PM
To: R-help
Subject: [R] Error in anova(): objects must inherit from classes

Hello: Let me rephrase my question to attract interest in the problem I'm having.
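A commonly used alternative to patching anova.lme (my sketch, assuming the inheritance fix above is not applied): strip the "glmmPQL" class so the two fits dispatch as plain "lme" objects. Shown with the MASS bacteria example rather than the poster's data; since PQL does not maximise a true likelihood, treat the resulting likelihood-ratio comparison as a rough guide only.

```r
# Sketch of a workaround: glmmPQL fits have class c("glmmPQL", "lme"),
# and data.class() reports "glmmPQL", which anova.lme rejects. Dropping
# the first class lets anova.lme accept the pair. The PQL "likelihood"
# is that of the working linear model, so the test is informal.
library(MASS)
library(nlme)

x1 <- glmmPQL(y ~ I(week > 2), random = ~ 1 | ID,
              family = binomial, data = bacteria, verbose = FALSE)
x2 <- glmmPQL(y ~ trt + I(week > 2), random = ~ 1 | ID,
              family = binomial, data = bacteria, verbose = FALSE)

class(x1) <- class(x2) <- "lme"   # strip the "glmmPQL" class
anova(x1, x2)
```

The same caveat applies to anovaLME() above: the AIC/BIC/logLik columns are based on the PQL working model, not a genuine marginal likelihood.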
Re: [R] seriesMerge
Yasser El-Zein abu3ammar at gmail.com writes:
: Is there a function in R that is equivalent to S-PLUS's
: seriesMerge(x1, x2, pos=union)
: where x1 and x2 are of class timeSeries?
: seriesMerge is in S-PLUS's FinMetrics. I looked into R's mergeSeries
: (in fSeries, part of Rmetrics) but I could not make it behave quite the
: same. In R it expected a timeSeries object and a matrix of the same
: row count. In S-PLUS, when using the union option, both objects can be
: of different lengths.

merge.zoo in package zoo handles union, intersection, and left and right joins of unequal-length time series according to the setting of the all= argument. zoo can also work with chron dates and times, which would allow you to work with your millisecond data, and it can merge more than two series at a time. (The its package (see ?itsJoin) and, for regular time series, cbind.ts also support merging unequal-length series, but neither of these supports chron, which I gather is a requirement for you.)

e.g. a zoo example. In the following, x has length 8 and y has length 6, and they overlap for chron(5:8); chron(1:4) belongs only to x and chron(9:10) belongs only to y.

library(chron)
library(zoo)
x <- zoo(1:8, chron(1:8))
y <- zoo(5:10, chron(5:10))
merge(x, y)                        # union
merge(x, y, all = FALSE)           # intersection
merge(x, y, all = c(FALSE, TRUE))  # right join
merge(x, y, all = c(TRUE, FALSE))  # left join
Re: [R] Searching for antilog function
Heather J. Branton hjb at pdq.com writes:
: Thank you so much for each of your responses. But to make sure I am
: clear (in my own mind), is this correct?
: If x = 2^y
: Then y = log2(x)
: Thanks again. I know this is basic.

Although it's not a proof, you can still use R to help you verify such hypotheses. Just use actual vectors of numbers and check that your hypothesis, in this case that y equals log2(x), holds. For example:

# try it out with the vector 1,2,3,...,10
y <- 1:10
y
[1]  1  2  3  4  5  6  7  8  9 10
# now calculate x
x <- log2(y)
# let's see what 2^x looks like:
2^x
[1]  1  2  3  4  5  6  7  8  9 10
# it gave back y!
Re: [R] Installing gregmisc under windows 2000
On Thu, 25 Nov 2004, Yuandan Zhang wrote:

That seems not to be the case under Linux in terms of installation. You can install this bundle in the same way as installing an individual package, e.g.

R CMD INSTALL CRAN/contrib/main/gregmisc_2.0.0.tar.gz

to get all constituent packages installed.

And the same under Windows. Please read the rest of the message you silently excised. As gregmisc is not one of the constituent packages, library(gregmisc) does not work under Linux. What exactly were you trying to `correct'?

On Wed, 24 Nov 2004 22:14:14 +0000 (GMT) Prof Brian Ripley [EMAIL PROTECTED] wrote:

gregmisc is a bundle, not a package. Its description on CRAN is

gregmisc  Bundle of gtools, gdata, gmodels, gplots

so try one of those packages. [Important details removed here.]

-- Yuandan Zhang, PhD Animal Genetics and Breeding Unit The University of New England Armidale, NSW, Australia, 2351 E-mail: [EMAIL PROTECTED] Phone: (61) 02 6773 3786 Fax: (61) 02 6773 3266 http://agbu.une.edu.au AGBU is a joint venture of NSW Primary Industries and The University of New England to undertake genetic R&D for Australia's livestock industries

-- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595