[R] How does the r-distribution function work
I am trying to understand what the rbinom function does.

Here is some sample code. Are both invocations of bfunc effectively doing the same thing, or am I missing the point?

Thanks,
Pieter

  bfunc <- function(n1, p1, sims) {
    c <- rbinom(sims, n1, p1)
    c
  }
  a = c()
  b = c()
  p1 = .5
  for (i in 1:1) {
    a[i] = bfunc(30, p1, 1)
  }
  b = bfunc(30, p1, 1)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] How does the r-distribution function work
pieter claassen wrote:
> I am trying to understand what rbinom function does. Here is some
> sample code. Are both the invocations of bfunc effectively doing the
> same or I am missing the point?

There are some newbie issues with your code (you are extending a on every iteration, and your bfunc is just rbinom with the parameters in a different order), but basically, yes: they are conceptually the same. Both give 1 independent binomial samples. In fact, if you reset the random number generator in between, they also give the same results (this is an implementation issue and not obviously guaranteed for any distribution).

Here's an example with smaller values than 1 and 30.

  set.seed(123)
  rbinom(10,1,.5)
   [1] 0 1 0 1 1 0 1 1 1 0

  set.seed(123)
  for (i in 1:10) print(rbinom(1,1,.5))
  [1] 0
  [1] 1
  [1] 0
  [1] 1
  [1] 1
  [1] 0
  [1] 1
  [1] 1
  [1] 1
  [1] 0

  set.seed(123)
  replicate(10, rbinom(1,1,.5))
   [1] 0 1 0 1 1 0 1 1 1 0
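The remark about extending a vector on every iteration can be illustrated with a small sketch. It assumes 10 draws of a binomial with 30 trials and p = 0.5 (the number of draws is an illustrative choice, not taken from the thread) and compares a vectorised call, a loop that fills a preallocated vector, and replicate(), each started from the same seed.

```r
# Sketch: draw binomial(30, 0.5) values three ways from the same seed.
n_draws <- 10  # illustrative choice, not from the thread

set.seed(123)
vectorised <- rbinom(n_draws, size = 30, prob = 0.5)

set.seed(123)
looped <- numeric(n_draws)              # preallocate instead of growing the vector
for (i in seq_len(n_draws)) {
  looped[i] <- rbinom(1, size = 30, prob = 0.5)
}

set.seed(123)
replicated <- replicate(n_draws, rbinom(1, size = 30, prob = 0.5))

# Compare the three results element by element
identical(vectorised, as.integer(looped))
identical(vectorised, replicated)
```

As the reply points out, agreement between the three versions reflects how the generator happens to be implemented, so it is best treated as a convenience rather than a guarantee.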
Re: [R] How does the r-distribution function work
Hi,

I have a problem: how can I solve the following without using t.test? For example:

  x <- c(1, 2, 3, 4, 5, 6)
  y <- c(7, 8, 9)
  t.test(x, y, alternative = "less", paired = FALSE, var.equal = TRUE, conf.level = 0.95)

Sorry for my English :)
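One way to read the question is as a request to compute the pooled-variance two-sample t statistic by hand. The following is a minimal sketch under that assumption; the names nx, ny, sp2, t_stat, df and p_value are illustrative and not from the post.

```r
# Pooled-variance (var.equal = TRUE) two-sample t-test computed step by step.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(7, 8, 9)

nx <- length(x)
ny <- length(y)
df <- nx + ny - 2                                   # degrees of freedom

# Pooled estimate of the common variance
sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / df

# Test statistic for H0: mean(x) - mean(y) = 0
t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))

# alternative = "less" uses the lower tail of the t distribution
p_value <- pt(t_stat, df)

c(t = t_stat, df = df, p.value = p_value)
```

The result can be checked against t.test(x, y, alternative = "less", var.equal = TRUE), which should report the same statistic and p-value.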
[R] distribution graph
The following gives two functions for producing distribution graphs: distribution.graph produces a single graph, and multiple.distribution.graph produces a number of graphs side by side.

Regards,
Tore Wentzel-Larsen
statistician
Centre for Clinical research
Armauer Hansen house
Haukeland University Hospital
N-5021 Bergen
tlf +47 55 97 55 39 (a)
faks +47 55 97 60 88 (a)
email [EMAIL PROTECTED]

Documentation:

distribution.graph

Description

  distribution.graph produces a distribution graph of the data values.

Usage

  distribution.graph(xx, grouping=FALSE, ngroups=10, xplace=c(0,1,.5),
                     halfband=.25, xlab='', ylab='', pch=16, lines=FALSE,
                     lty='solid')

Arguments

  xx          numeric, a vector of values for which to produce the
              distribution graph. Missing values are allowed, and are
              disregarded.
  grouping    logical, if FALSE (the default) the actual values are
              graphed, if TRUE the values are grouped before being plotted.
  ngroups     the number of groups (default 10) if grouping=TRUE.
  xplace      vector with three components. The first two components
              define the horizontal plotting range. The last component
              defines the horizontal placement of the centre of the
              distribution graph.
  halfband    half-length of the maximal horizontal band in the
              distribution graph, from the centre outwards. The bands
              should be within the horizontal plotting range.
  xlab, ylab  x and y axis labels, as in plot.default.
  pch         plotting symbol, default 16 (solid circle).
  lines       logical, if FALSE (the default) only points are plotted,
              if TRUE the points are connected by lines.
  lty         line type, as in plot.default.

Value

  A frequency table for the values actually plotted.
Examples

  # a simple distribution graph with no grouping:
  distribution.graph(floor(runif(100, 200, 310)))
  # a similar graph with vertical bars only:
  distribution.graph(floor(runif(100, 200, 310)), lines=TRUE, pch='')
  # a distribution graph with grouping (points or line bars):
  distribution.graph(runif(1000, 0, 3), grouping=TRUE)
  distribution.graph(runif(1000, 0, 3), grouping=TRUE, lines=TRUE, pch='')
  # a distribution graph with grouping, 5 groups:
  distribution.graph(runif(1000, 0, 10), grouping=TRUE, ngroups=5)
  distribution.graph(rbinom(1000, 20, .7), grouping=TRUE, ngroups=5)

- - - - - - - - - - - - - - -

multiple.distribution.graph

Description

  multiple.distribution.graph produces a number of distribution graphs of the data values, side by side.

Usage

  multiple.distribution.graph(xx, grouping=FALSE, ngroups=10, xleft=0,
                              xright=1, xmiddle=.5, xband=.5,
                              xlab=c(1:length(xx)), ylab='', pch=16,
                              lines=FALSE, lty='solid')

Arguments

  xx          list of numeric vectors, each a vector of values for which
              to produce a distribution graph. Missing values are
              allowed, and are disregarded.
  grouping    logical, if FALSE (the default) the actual values are
              graphed, if TRUE the values are grouped before being plotted.
  ngroups     the number of groups (default 10) if grouping=TRUE.
  xleft, xright, xmiddle
              xleft and xright define the horizontal plotting range
              within each distribution graph. xmiddle defines the
              horizontal placement of the centre of each distribution
              graph.
  xband       the part actually used for plotting, of the horizontal
              range allocated to each individual graph.
  xlab, ylab  x and y axis labels, as in plot.default.
  pch         plotting symbol, default 16 (solid circle).
  lines       logical, if FALSE (the default) only points are plotted,
              if TRUE the points are connected by lines.
  lty         line type, as in plot.default.

Value

  A list of frequency tables for the values actually plotted.
Examples

  par(ask=TRUE)
  multiple.distribution.graph(as.list(data.frame(matrix(runif(72),ncol=9))))
  multiple.distribution.graph(as.list(data.frame(matrix(runif(72),ncol=9))),
    grouping=TRUE)
  multiple.distribution.graph(as.list(data.frame(matrix(runif(72),ncol=9))),
    grouping=TRUE,ngroups=3)
  multiple.distribution.graph(as.list(data.frame(matrix(runif(72),ncol=9))),
    grouping=TRUE,ngroups=3,lines=TRUE)
  multiple.distribution.graph(as.list(data.frame(matrix(runif(72),ncol=9))),
    grouping=TRUE,ngroups=3,lines=TRUE,pch='')
  multiple.distribution.graph(as.list(data.frame(matrix(runif(72),ncol=9))),
    grouping=TRUE,ngroups=5,lines=TRUE,pch='')
  par(ask=FALSE)

  # a more complicated list of numeric vectors:
  xx <- as.list(as.list(data.frame(matrix(runif(72,10,45),ncol=9))))
  xx[[1]][c(1,3,4,8)] <- NA
  xx[[2]][c(2,4)] <- NA
  xx[[4]][c(3)] <- NA
  xx[[6]][c(2,5,8)] <- NA
  xx[[8]][c(1,2,8)] <- NA
  # stripmiss is a helper that is not shown in this message
  xx <- lapply(xx,stripmiss)
  xx[[1]][c(3)] <- NA
  xx[[3]][c(1,3,4,5)] <- NA
  xx[[4]][c(2,3)] <- NA
  xx[[8]][c(3,4)] <- NA
[R] distribution of peaks in random data results
Dear all,

I have the positions of N points spread through some sequence of length L (LN), and I would like to know how I can do the following:

1- Permute the positions of the N points along the whole sequence. Assuming a uniform distribution I did:

  position1 <- runif(N, 1, L)

2- Apply a kernel convolution method to the resulting permuted points profile. For this I applied the function:

  d <- density(position1, bw = "SJ")

3- Record the heights of all peaks. For this I used the estimated density values from the output of the density function above:

  heights1 <- d$y

4- Repeat steps 1 and 2 to obtain a distribution of the peaks from the random data results. I don't know how to perform this step!!!

5- Compute the threshold by determining the alpha-level in the empirical CDF of the null distribution. Assuming 'heightsALL' is the output of step 4 I would do this:

  plot(ecdf(heightsALL))

But I don't know how to compute the threshold.

6- Apply this threshold to the peaks estimate of the real peaks data, resulting in a series of significant peaks. This step can be done by finding the peaks in the real data that are above the threshold and classifying these as significant at the alpha-level.

The steps mentioned above are better illustrated with a picture that can be fetched here: http://www.yousendit.com/transfer.php?action=downloadufid=0E3724F26CA53367

Best regards and thanks in advance,

João Fadista
Ph.d. student
UNIVERSITY OF AARHUS
Faculty of Agricultural Sciences
Dept. of Genetics and Biotechnology
Blichers Allé 20, P.O. BOX 50
DK-8830 Tjele
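Steps 1 to 6 above can be sketched roughly as follows. This is only one possible reading, with several assumptions that are not in the post: the values of L, N, n_perm and alpha are illustrative, the "real" positions are simulated, a peak is taken to be an interior local maximum of the density curve (not every value in d$y), and the bandwidth is chosen once from the observed data with bw.SJ and reused for every permutation so that heights are comparable.

```r
set.seed(42)
L      <- 10000   # sequence length (illustrative)
N      <- 50      # number of points (illustrative)
n_perm <- 200     # number of random permutations (illustrative)
alpha  <- 0.05

# Hypothetical "real" positions: uniform background plus one cluster
observed <- c(runif(30, 1, L), rnorm(20, mean = 5000, sd = 100))

# Assumption: one bandwidth, chosen from the observed data, is used throughout
bw_obs <- bw.SJ(observed)

# Heights of the interior local maxima of the kernel density estimate
peak_heights <- function(pos, bw) {
  d <- density(pos, bw = bw, from = 1, to = L)
  y <- d$y
  is_peak <- diff(sign(diff(y))) == -2   # slope changes from rising to falling
  y[which(is_peak) + 1]
}

# Step 4: repeat steps 1-3 and pool the peak heights (null distribution)
heights_all <- unlist(lapply(seq_len(n_perm), function(i) {
  peak_heights(runif(N, 1, L), bw_obs)
}))

# Step 5: threshold = (1 - alpha) quantile of the null peak heights
threshold <- unname(quantile(heights_all, probs = 1 - alpha))

# Step 6: observed peaks above the threshold
obs_peaks   <- peak_heights(observed, bw_obs)
significant <- obs_peaks[obs_peaks > threshold]
```

Reading the threshold off quantile() is equivalent to inverting the empirical CDF drawn by plot(ecdf(heights_all)). Flat-topped peaks are ignored by this simple local-maximum rule, which may or may not matter for real data.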
[R] distribution graph
I am looking for a way to produce a distribution graph as in the example:

(http://cecsweb.dartmouth.edu/release1.1/datatools/dgraph.php?year=2003geotype=STD_HRRevent=A01_DISeventtype=UTIL)

Anybody who can help?

Christian von Plessen
Department of Pulmonary Medicine
Haukeland university hospital
Bergen
Norway
Re: [R] distribution graph
?violinplot

(You need to install the UsingR package first.)

On Mar 23, 2007, at 4:06 AM, Plessen, Christian von wrote:
> I am looking for a way to produce a distribution graph as in the
> example:
> (http://cecsweb.dartmouth.edu/release1.1/datatools/dgraph.php?year=2003geotype=STD_HRRevent=A01_DISeventtype=UTIL
>
> Anybody who can help?
Re: [R] distribution graph
On 23-Mar-07 11:06:49, Plessen, Christian von wrote:
> I am looking for a way to produce a distribution graph as in the
> example:
> (http://cecsweb.dartmouth.edu/release1.1/datatools/dgraph.php?year=2003; geotype=STD_HRRevent=A01_DISeventtype=UTIL
>
> Anybody who can help?
>
> Christian von Plessen
> Department of Pulmonary Medicine
> Haukeland university hospital
> Bergen
> Norway

The following (which anyway needs refinement, and can very probably be done better) provides a basis (illustrated using a sample from a log-normal distribution):

  X <- exp(rnorm(200,sd=0.25)+2)/5
  H <- hist(X,breaks=20)
  C <- H$counts
  Y <- H$mids
  C1 <- C/2
  C0 <- (-C1[1]-1/2):(C[1]-1/2); n0 <- length(C0)
  plot(C0,rep(Y[1],n0),xlim=c(-max(C)/2,max(C)/2),ylim=c(min(Y),max(Y)))
  for(i in (2:length(Y))){
    C0 <- (-C1[i]-1/2):(C1[i]-1/2); n0 <- length(C0)
    points(C0,rep(Y[i],n0))
  }

Hoping this helps!
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 23-Mar-07 Time: 13:04:51
-- XFMail --
Re: [R] distribution graph
On Fri, 2007-03-23 at 14:22 +, [EMAIL PROTECTED] wrote:
> [Apologies -- there were errors in the code I posted previously.
> A corrected version is below]
>
> On 23-Mar-07 11:06:49, Plessen, Christian von wrote:
>> I am looking for a way to produce a distribution graph as in the
>> example:
>> (http://cecsweb.dartmouth.edu/release1.1/datatools/dgraph.php?year=2003; geotype=STD_HRRevent=A01_DISeventtype=UTIL
>>
>> Anybody who can help?
>
> The following (which anyway needs refinement, and can very probably
> be done better) provides a basis (illustrated using a sample from a
> log-normal distribution):
>
>   X <- exp(rnorm(200,sd=0.25)+2)/5
>   H <- hist(X,breaks=20)
>   C <- H$counts
>   Y <- H$mids
>   C1 <- C/2
>   C0 <- (-(C1[1]-1/2)):(C1[1]-1/2); n0 <- length(C0)
>   plot(C0,rep(Y[1],n0),xlim=c(-max(C)/2,max(C)/2),ylim=c(min(Y),max(Y)))
>   for(i in (2:length(Y))){
>     if(C[i]==0) next
>     C0 <- (-(C1[i] - 1/2)):(C1[i] - 1/2); n0 <- length(C0)
>     points(C0,rep(Y[i],n0))
>   }
>
> Hoping this helps!
> Ted.

How about something like this:

  DistPlot <- function(x, digits = 1, ...)
  {
    x <- round(x, digits)
    Tab <- table(x)
    Vals <- sapply(Tab, function(x) seq(x) - mean(seq(x)))
    X.Vals <- unlist(Vals, use.names = FALSE)
    tmp <- sapply(Vals, length)
    Y.Vals <- rep(names(tmp), tmp)
    plot(X.Vals, Y.Vals, ...)
  }

  Vec <- exp(rnorm(200, sd = 0.25) + 2) / 5
  DistPlot(Vec, pch = 19)

HTH,
Marc Schwartz
Re: [R] distribution graph
On 23-Mar-07 16:55:40, Marc Schwartz wrote:
> [...]
> How about something like this:
>
>   DistPlot <- function(x, digits = 1, ...)
>   {
>     x <- round(x, digits)
>     Tab <- table(x)
>     Vals <- sapply(Tab, function(x) seq(x) - mean(seq(x)))
>     X.Vals <- unlist(Vals, use.names = FALSE)
>     tmp <- sapply(Vals, length)
>     Y.Vals <- rep(names(tmp), tmp)
>     plot(X.Vals, Y.Vals, ...)
>   }
>
>   Vec <- exp(rnorm(200, sd = 0.25) + 2) / 5
>   DistPlot(Vec, pch = 19)

Very pretty, Marc -- and magic code!!
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 23-Mar-07 Time: 19:20:04
-- XFMail --
[R] distribution overlap - how to quantify?
Dear R-Users,

my objective is to measure the overlap/divergence of two probability density functions, p1(x) and p2(x). One could apply the chi-square test or determine the potential mixture components and then compare the respective means and sigmas. But I was rather looking for a simple measure of similarity.

Therefore, I used the concept of 'intrinsic discrepancy', which is defined as:

  \delta\{p_{1},p_{2}\} = \min \left\{ \int_{\chi} p_{1}(x) \log \frac{p_{1}(x)}{p_{2}(x)} \, dx, \; \int_{\chi} p_{2}(x) \log \frac{p_{2}(x)}{p_{1}(x)} \, dx \right\}

The smaller the delta, the more similar the distributions are (0 when identical). I implemented this in 'R' using an adaptation of the Kullback-Leibler divergence. The function works, and I get the expected results.

The question is how to interpret the results. Obviously a delta of 0.5 reflects more similarity than a delta of 2.5. But how much more? Is there some kind of statistical test for such an index (other than a simulation-based evaluation)?

Thanks in advance,
Daniel

Daniel Doktor
PhD Student
Imperial College
Royal School of Mines Building, DEST, RRAG
Prince Consort Road
London SW7 2BP, UK
tel: 0044-(0)20-7589-5111-59276(ext)
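The poster's own implementation is not shown, so the following is only a sketch of how the intrinsic discrepancy defined above might be computed numerically. It assumes both densities are available as R functions and are strictly positive on a finite integration range; kl_div and intrinsic_discrepancy are hypothetical helper names.

```r
# Kullback-Leibler divergence of q from p, by numerical integration.
# Assumes p(x) and q(x) are positive on [lower, upper].
kl_div <- function(p, q, lower, upper) {
  integrand <- function(x) {
    px <- p(x)
    qx <- q(x)
    ifelse(px > 0, px * log(px / qx), 0)
  }
  integrate(integrand, lower, upper)$value
}

# Intrinsic discrepancy: the smaller of the two directed divergences
intrinsic_discrepancy <- function(p1, p2, lower, upper) {
  min(kl_div(p1, p2, lower, upper), kl_div(p2, p1, lower, upper))
}

# Example: two unit-variance normal densities whose means differ by 1
p1 <- function(x) dnorm(x, mean = 0, sd = 1)
p2 <- function(x) dnorm(x, mean = 1, sd = 1)

delta <- intrinsic_discrepancy(p1, p2, lower = -10, upper = 11)
delta   # close to 0.5, the closed-form value (mu1 - mu2)^2 / (2 * sigma^2)
```

A finite range is used here because both densities underflow to zero far in the tails, which would make the log ratio undefined. For densities estimated from data, the same idea applies to values evaluated on a common grid, with the integral replaced by a sum.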
Re: [R] distribution of daily rainfall values in binned categories
"FJZ" == Francisco J Zagmutt [EMAIL PROTECTED]
    on Wed, 28 Jun 2006 03:51:31 + writes:

FJZ> Hi Etienne,
FJZ> Somebody asked a somehow related question recently.
FJZ> http://tolstoy.newcastle.edu.au/R/help/06/06/29485.html
FJZ> Take a look at cut? table? and barplot?
FJZ> i.e.

  # Creates fake data from uniform(0,30)
  set.seed(1) ## <- added by MM
  x=runif(50, 0,30)
  # Creates categories
  rain=cut(x,breaks=c( 0, 1,2.5,5, 10, 20, Inf))
  # Creates contingency table of categories
  tab=table(rain)
  # Plots frequencies of rainfall
  barplot(tab)

No, no, no! Do not confuse histograms with bar plots!

- barplot() is {one possibility} for visualizing discrete (categorical, factor) data,
- hist() is for visualizing *continuous* data (*)

As Jim Porzak replied, do use hist(): the example really is a matter of visualization of a continuous distribution which should *not* be done by a barplot. Instead, e.g.,

  hist(x, breaks = c(0, 1,2.5,5, 10,20, max(pretty(max(x)))), freq = TRUE, col = "gray")

will give a graphic similar to the above --- BUT also warns you about the hidden deception (aka silliness) of *both* graphics: Namely, the above hist() call warns you with

  Warning message:
  the AREAS in the plot are wrong -- rather use freq=FALSE in: 

and finally,

  hist(x, breaks = c(0, 1,2.5,5, 10,20, max(pretty(max(x)))), col = "gray")

gives you a more honest graphic --- which -- for the runif() example -- may finally lead you to realize that using unequal breaks may really not be such a good idea.

Note however that for the OP rainfall data, that may well be different, and if I look at rainfall data, I find I would rather view

  hist(log10( rainfall ))

or then

  plot(density( log10( rainfall ) ))

Martin Maechler, ETH Zurich

(*) From a statistical point of view, histograms are just density estimators, and -- as known for a while -- have quite some drawbacks. Hence they should nowadays often be replaced by plot(density(.), ..)
> From: etienne [EMAIL PROTECTED]
> To: r-help@stat.math.ethz.ch
> Subject: [R] distribution of daily rainfall values in binned categories
> Date: Tue, 27 Jun 2006 11:28:59 -0700 (PDT)
>
> Hi, I'm a newbie in using R and I would like to have a few clues as to
> how I could compute and plot a distribution of daily rainfall intensity
> in different categories. I have daily values (mm/day) for several years
> and I need to show the frequency of 0-1, 1-2.5, 2.5-5, 5-10, 10-20,
> 20+ mm/day. Can this be done easily?
>
> Thanks,
> Etienne
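The difference between counts and densities with unequal bins, which is what the hist() warning in the reply is about, can be inspected without drawing anything. The sketch below reuses the simulated data from the thread; the last break is fixed at 30 (an assumption that simply covers the runif(50, 0, 30) range) and the variable names are illustrative.

```r
set.seed(1)
x   <- runif(50, 0, 30)              # same fake data as in the thread
brk <- c(0, 1, 2.5, 5, 10, 20, 30)   # unequal bin widths

h      <- hist(x, breaks = brk, plot = FALSE)
widths <- diff(h$breaks)

# Counts grow with bin width; densities are counts / (n * width)
summary_tab <- data.frame(
  lower   = head(h$breaks, -1),
  upper   = h$breaks[-1],
  width   = widths,
  count   = h$counts,
  density = h$density
)
summary_tab

# With freq = FALSE the bar AREAS (density * width) sum to 1
sum(h$density * widths)
```

For uniform data the density column is roughly flat while the count column increases with bin width, which is the distortion a frequency plot with unequal bins would show.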
Re: [R] distribution of daily rainfall values in binned categories
Hi Martin

I agree with all your previous concerns. I was just answering her question about visualizing frequencies for a continuous variable that is artificially categorized. However, she did mention the word *distribution* (a part that I obviously ignored when I posted my answer) so your comments are more than appropriate. I am surprised nobody else jumped with the usual discussion about violin plots and his friends ;-)

Cheers

Francisco

Dr. Francisco J. Zagmutt
College of Veterinary Medicine and Biomedical Sciences
Colorado State University

From: Martin Maechler [EMAIL PROTECTED]
Reply-To: Martin Maechler [EMAIL PROTECTED]
To: Francisco J. Zagmutt [EMAIL PROTECTED]
CC: [EMAIL PROTECTED], r-help@stat.math.ethz.ch
Subject: Re: [R] distribution of daily rainfall values in binned categories
Date: Wed, 28 Jun 2006 10:39:58 +0200

[...]
Re: [R] questions on local customized R distribution CD
From: Duncan Murdoch
> On 6/26/2006 3:14 PM, Dongseok Choi wrote:
>> Hello all!
>>
>> I hope this is the right place to post this question.
>> The Oregon Chapter of ASA is working with local high school teachers
>> as one of its outreaching program. We hope to use and test R as
>> teaching tools. So, we think that a menu system (like R commander)
>> with a few packages and a bit simplified installation instruction
>> need to be developed.
>> The main question is:
>>
>> 1) Is it OK to develop a customized CD-ROM distribution of R with
>> pre-selected packages for high school? It will be distributed free,
>> of course. Also, we plan to make it available from the chap web or
>> deposit it to R-project, if requested.
>
> Generally the answer is yes, but read the GPL for the conditions. You
> do need to make the source code available.

I was under the impression that telling the user how to get the source code would satisfy the GPL, instead of distributing the source along with the binary. Is that right?

>> 2) If the customized distribution CD is OK, I also hope to get some
>> technical help/advice from the core group members if any one is
>> interested.
>
> See the R Installation and Administration manual first. It tells how
> to build R installers with non-standard included packages. Hopefully
> for 2.4.0 more customizations will be possible.

Yes, it's not all that hard. Follow the directions carefully and literally and there shouldn't be a problem.

Andy

> Duncan Murdoch
Re: [R] questions on local customized R distribution CD
On Tue, 27 Jun 2006, Liaw, Andy wrote:
> From: Duncan Murdoch
>> On 6/26/2006 3:14 PM, Dongseok Choi wrote:
>>> Hello all!
>>>
>>> I hope this is the right place to post this question.
>>> The Oregon Chapter of ASA is working with local high school teachers
>>> as one of its outreaching program. We hope to use and test R as
>>> teaching tools. So, we think that a menu system (like R commander)
>>> with a few packages and a bit simplified installation instruction
>>> need to be developed.
>>> The main question is:
>>>
>>> 1) Is it OK to develop a customized CD-ROM distribution of R with
>>> pre-selected packages for high school? It will be distributed free,
>>> of course. Also, we plan to make it available from the chap web or
>>> deposit it to R-project, if requested.
>>
>> Generally the answer is yes, but read the GPL for the conditions.
>> You do need to make the source code available.
>
> I was under the impression that telling the user how to get the source
> code would satisfy the GPL, instead of distributing the source along
> with the binary. Is that right?

No, the first part is definitely wrong. (However, you don't have to distribute 'the source along with the binary', unless it is on the Internet.) The obligation is on the distributor to make the exact sources available, not to rely on anyone else (e.g. CRAN, who might just lose them or not be available 2.99 years from now).

The relevant clauses are

    b) Accompany it with a written offer, valid for at least three
    years, to give any third party, for a charge no more than your
    cost of physically performing source distribution, a complete
    machine-readable copy of the corresponding source code, to be
    distributed under the terms of Sections 1 and 2 above on a medium
    customarily used for software interchange; or,

[The following clause c) does not apply if you repackage the distribution.]

    If distribution of executable or object code is made by offering
    access to copy from a designated place, then offering equivalent
    access to copy the source code from the same place counts as
    distribution of the source code, even though third parties are not
    compelled to copy the source along with the object code.

See e.g.

http://www.gnu.org/licenses/gpl-faq.html#DistributeWithSourceOnInternet
http://www.gnu.org/licenses/gpl-faq.html#SourceAndBinaryOnDifferentSites

The easiest way to meet the obligations is to put the sources on the CD, especially as the sources concerned are only around 5% of the capacity of the CD.

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] questions on local customized R distribution CD
On 6/27/2006 8:05 AM, Liaw, Andy wrote:
> From: Duncan Murdoch
>> On 6/26/2006 3:14 PM, Dongseok Choi wrote:
>>> Hello all!
>>>
>>> I hope this is the right place to post this question.
>>> The Oregon Chapter of ASA is working with local high school teachers
>>> as one of its outreaching program. We hope to use and test R as
>>> teaching tools. So, we think that a menu system (like R commander)
>>> with a few packages and a bit simplified installation instruction
>>> need to be developed.
>>> The main question is:
>>>
>>> 1) Is it OK to develop a customized CD-ROM distribution of R with
>>> pre-selected packages for high school? It will be distributed free,
>>> of course. Also, we plan to make it available from the chap web or
>>> deposit it to R-project, if requested.
>>
>> Generally the answer is yes, but read the GPL for the conditions.
>> You do need to make the source code available.
>
> I was under the impression that telling the user how to get the source
> code would satisfy the GPL, instead of distributing the source along
> with the binary. Is that right?

Possibly, but not necessarily. See section 3 of the GPL, distributed in the COPYING file with R.

Duncan Murdoch
[R] distribution of daily rainfall values in binned categories
Hi,

I'm a newbie in using R and I would like to have a few clues as to how I could compute and plot a distribution of daily rainfall intensity in different categories. I have daily values (mm/day) for several years and I need to show the frequency of 0-1, 1-2.5, 2.5-5, 5-10, 10-20, 20+ mm/day. Can this be done easily?

Thanks,
Etienne
Re: [R] distribution of daily rainfall values in binned categories
?hist

read about breaks

On 6/27/06, etienne [EMAIL PROTECTED] wrote:
> Hi,
>
> I'm a newbie in using R and I would like to have a few clues as to how
> I could compute and plot a distribution of daily rainfall intensity in
> different categories. I have daily values (mm/day) for several years
> and I need to show the frequency of 0-1, 1-2.5, 2.5-5, 5-10, 10-20,
> 20+ mm/day. Can this be done easily?
>
> Thanks,
> Etienne

-- 
HTH,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA
Re: [R] distribution of daily rainfall values in binned categories
Hi Etienne,

Somebody asked a somewhat related question recently.
http://tolstoy.newcastle.edu.au/R/help/06/06/29485.html

Take a look at ?cut, ?table and ?barplot

i.e.

  #Creates fake data from uniform(0,30)
  x=runif(50, 0,30)
  #Creates categories
  rain=cut(x,breaks=c( 0, 1,2.5,5, 10, 20, Inf))
  #Creates contingency table of categories
  tab=table(rain)
  #Plots frequencies of rainfall
  barplot(tab)

I hope this helps!

Francisco

Dr. Francisco J. Zagmutt
College of Veterinary Medicine and Biomedical Sciences
Colorado State University

> From: etienne [EMAIL PROTECTED]
> To: r-help@stat.math.ethz.ch
> Subject: [R] distribution of daily rainfall values in binned categories
> Date: Tue, 27 Jun 2006 11:28:59 -0700 (PDT)
>
> Hi, I'm a newbie in using R and I would like to have a few clues as to
> how I could compute and plot a distribution of daily rainfall intensity
> in different categories. I have daily values (mm/day) for several years
> and I need to show the frequency of 0-1, 1-2.5, 2.5-5, 5-10, 10-20,
> 20+ mm/day. Can this be done easily?
>
> Thanks,
> Etienne
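One detail worth checking when applying cut() to real rainfall data: by default the intervals are closed on the right, so a break at 0 gives the bin (0,1] and dry days with exactly 0 mm fall outside it. The sketch below uses made-up daily totals and illustrative label names to show the effect and one possible remedy, left-closed bins via right = FALSE.

```r
# Made-up daily rainfall totals in mm/day, including dry days
rain_mm <- c(0, 0, 0.4, 1, 1.2, 2.5, 3, 7.5, 10, 12, 20, 35)

brk <- c(0, 1, 2.5, 5, 10, 20, Inf)
lab <- c("0-1", "1-2.5", "2.5-5", "5-10", "10-20", "20+")

# Default: right-closed intervals (0,1], (1,2.5], ... so exact zeros become NA
default_bins <- cut(rain_mm, breaks = brk)
n_dropped <- sum(is.na(default_bins))

# Left-closed intervals [0,1), [1,2.5), ..., [20,Inf) keep the zeros
left_closed <- cut(rain_mm, breaks = brk, right = FALSE, labels = lab)

freq <- table(left_closed)
freq
round(prop.table(freq), 2)   # relative frequencies
```

Whether a day with exactly 1 mm belongs in "0-1" or "1-2.5" is a convention the analyst has to choose; right = FALSE puts it in the upper bin.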
[R] questions on local customized R distribution CD
Hello all!

I hope this is the right place to post this question.

The Oregon Chapter of ASA is working with local high school teachers as one of its outreaching program. We hope to use and test R as teaching tools. So, we think that a menu system (like R commander) with a few packages and a bit simplified installation instruction need to be developed.

The main question is:

1) Is it OK to develop a customized CD-ROM distribution of R with pre-selected packages for high school? It will be distributed free, of course. Also, we plan to make it available from the chap web or deposit it to R-project, if requested.

2) If the customized distribution CD is OK, I also hope to get some technical help/advice from the core group members if any one is interested.

Thank you very much in advance,

Dongseok Choi, PhD
The President of the Oregon Chapter of the ASA
Re: [R] questions on local customized R distribution CD
On 6/26/2006 3:14 PM, Dongseok Choi wrote:
> Hello all! I hope this is the right place to post this question. The Oregon Chapter of ASA is working with local high school teachers as one of its outreaching program. We hope to use and test R as teaching tools. So, we think that a menu system (like R commander) with a few packages and a bit simplified installation instruction need to be developed. The main question is:
> 1) Is it OK to develop a customized CD-ROM distribution of R with pre-selected packages for high school? It will be distributed free, of course. Also, we plan to make it available from the chap web or deposit it to R-project, if requested.

Generally the answer is yes, but read the GPL for the conditions. You do need to make the source code available.

> 2) If the customized distribution CD is OK, I also hope to get some technical help/advice from the core group members if any one is interested.

See the R Installation and Administration manual first. It tells how to build R installers with non-standard included packages. Hopefully for 2.4.0 more customizations will be possible.

Duncan Murdoch
[R] Distribution Fitting
Hi,

I know this is a bit off-topic, but I am quite puzzled. I am going through several papers about aerosol physics, and in this field you often have to determine the parameters of a distribution to match your experimental data (one typically uses a Gaussian mixture). However, in many cases people plot a normalized empirical distribution function and then perform some least-squares fitting rather than using likelihood functions.

As an undergrad, I was told that the least-squares approach is correct only if you have a model for the dynamics (e.g. Ohm's law, and you perform a least-squares fit), but not if you deal with a distribution and you pick random draws from it (in that case, one should maximize the probability of drawing the data which were actually observed, and this leads to likelihood functions).

The two approaches do not seem equivalent to me, but I cannot believe that this distinction is ignored in practice...

Many thanks

Lorenzo
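To make the distinction in the question concrete, here is a minimal sketch that fits the same simulated sample both ways. It assumes a single Gaussian rather than a mixture, uses fitdistr() from MASS for the maximum-likelihood fit, and uses a plain optim() least-squares fit to histogram densities as a stand-in for what the papers might be doing; the bin count and all variable names are arbitrary choices for illustration.

```r
library(MASS)   # for fitdistr()

set.seed(42)
x <- rnorm(500, mean = 2, sd = 0.5)   # simulated "measurements"

## (a) Maximum likelihood: uses every observation directly
ml <- fitdistr(x, "normal")

## (b) Least squares on a normalized histogram: depends on the binning
h <- hist(x, breaks = 20, plot = FALSE)
d <- data.frame(mid = h$mids, dens = h$density)

# sum of squared differences between bin densities and the normal pdf;
# the sd is parametrized on the log scale so it stays positive
sse <- function(p) sum((d$dens - dnorm(d$mid, mean = p[1], sd = exp(p[2])))^2)
ls_fit <- optim(c(mean(x), log(sd(x))), sse)
ls_est <- c(mean = ls_fit$par[1], sd = exp(ls_fit$par[2]))

print(ml$estimate)   # ML estimates
print(ls_est)        # least-squares estimates; change 'breaks' and they move
```

The two sets of estimates are usually close for well-behaved data like this, but only the least-squares one changes when the bins change, which is one practical way to see that the procedures are not equivalent.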
[R] Distribution Identification/Significance testing
Hi,

What are the methods for identifying the right distribution for a dataset? As far as I know, the Fisher test (p < alpha) for statistical significance and the minimum squared error are two criteria for deciding. What are the other alternatives? Confidence intervals? If there are any, how can I accomplish them in R?

Thanks in advance.

Sachin
Re: [R] distribution of the product of two correlated normal
Yu, Xuesong wrote:
> Many thanks to Peter for your quick and detailed response to my question. I tried to run your codes, but seems like u is not defined for functions fp and fm. what is u? I believe t=X1*X2
>
>     nen0 <- m2+c0*u ## for all u's used in integrate: never positive

no, this is not the problem; u is the local integration variable in the local functions f, fm, fp over which integrate() performs integration; it is rather the eps = eps default value passed in functions f, fm, fp which causes a recursive default value reference problem; change it as follows:

    ###
    #code by P. Ruckdeschel, [EMAIL PROTECTED], rev. 04-25-06
    ###
    #
    #pdf of X1X2, X1~N(m1,s1^2), X2~N(m2,s2^2), corr(X1,X2)=rho, evaluated at t
    #
    # eps is a very small number to catch errors in division by 0
    ###
    #
    dnnorm <- function(t, m1, m2, s1, s2, rho, eps = .Machine$double.eps ^ 0.5){
      a <- s1*sqrt(1-rho^2)
      b <- s1*rho
      c <- s2
      ### new:
      f <- function(u, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c, eps0 = eps)
      # new (04-25-06): eps0 instead of eps as local variable to f
      {
        nen0 <- m2+c0*u
        #catch a division by 0
        nen <- ifelse(abs(nen0)>eps0, nen0, ifelse(nen0>0, nen0+eps0, nen0-eps0))
        dnorm(u)/a0/nen * dnorm( t/a0/nen -(m1+b0*u)/a0)
      }
      -integrate(f, -Inf, -m2/c, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value+
       integrate(f, -m2/c, Inf, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value
    }

    ###
    #
    #cdf of X1X2, X1~N(m1,s1^2), X2~N(m2,s2^2), corr(X1,X2)=rho, evaluated at t
    #
    # eps is a very small number to catch errors in division by 0
    ###
    #
    pnnorm <- function(t, m1, m2, s1, s2, rho, eps = .Machine$double.eps ^ 0.5){
      a <- s1*sqrt(1-rho^2)
      b <- s1*rho
      c <- s2
      ### new:
      fp <- function(u, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c, eps0 = eps)
      # new (04-25-06): eps0 instead of eps as local variable to fp
      {
        nen0 <- m2+c0*u ## for all u's used in integrate: never negative
        #catch a division by 0
        nen <- ifelse(nen0>eps0, nen0, nen0+eps0)
        dnorm(u) * pnorm( t/a0/nen- (m1+b0*u)/a0)
      }
      ### new:
      fm <- function(u, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c, eps0 = eps)
      # new (04-25-06): eps0 instead of eps as local variable to fm
      {
        nen0 <- m2+c0*u ## for all u's used in integrate: never positive
        #catch a division by 0
        nen <- ifelse(nen0 < (-eps0), nen0, nen0-eps0)
        dnorm(u) * pnorm(-t/a0/nen+ (m1+b0*u)/a0)
      }
      integrate(fm, -Inf, -m2/c, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value+
      integrate(fp, -m2/c, Inf, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value
    }

## For me this gives, e.g.:

    pnnorm(0.5,m1=2,m2=3,s1=2,s2=1.4,rho=0.8)
    [1] 0.1891655
    dnnorm(0.5,m1=2,m2=3,s1=2,s2=1.4,rho=0.8)
    [1] 0.07805282

Hth, Peter
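One way to sanity-check a numerically integrated cdf like the one above is a quick Monte Carlo simulation. The sketch below is not part of Peter's code; it simply simulates the correlated pair using the same decomposition (X2 driven by u, X1 by u and an independent v) with the parameter values from the example call, and estimates P(X1*X2 <= 0.5). The sample size and seed are arbitrary.

```r
set.seed(2006)
n   <- 2e5
m1  <- 2;  m2 <- 3
s1  <- 2;  s2 <- 1.4
rho <- 0.8

# Two independent standard normals
u <- rnorm(n)
v <- rnorm(n)

# X2 depends on u only; X1 shares u, which induces corr(X1, X2) = rho
x2 <- m2 + s2 * u
x1 <- m1 + s1 * (rho * u + sqrt(1 - rho^2) * v)

# Empirical estimate of the cdf of the product at t = 0.5
p_hat <- mean(x1 * x2 <= 0.5)
se    <- sqrt(p_hat * (1 - p_hat) / n)   # binomial standard error

cat("estimate:", p_hat, " +/- ", 2 * se, "\n")
cat("sample correlation:", cor(x1, x2), "\n")
```

The estimate can be compared with the value pnnorm() reports in the message above; agreement within a couple of standard errors is what one would hope to see.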
Re: [R] distribution of the product of two correlated normal
Yu, Xuesong writes:
> Does anyone know what the distribution for the product of two correlated normal? Say I have X~N(a, \sigma1^2) and Y~N(b, \sigma2^2), and the \rho(X,Y) is not equal to 0, I want to know the pdf or cdf of XY. Thanks a lot in advance.

There is no closed-form expression (at least not to my knowledge), but you could easily write some code for a numerical evaluation of the pdf / cdf:

    ###
    #code by P. Ruckdeschel, [EMAIL PROTECTED] 04-24-06
    ###
    #
    #pdf of X1X2, X1~N(m1,s1^2), X2~N(m2,s2^2), corr(X1,X2)=rho, evaluated at t
    #
    # eps is a very small number to catch errors in division by 0
    ###
    #
    # NB: the self-referential default eps = eps in f, fp and fm below fails
    # when evaluated; the revised version (rev. 04-25-06) in the follow-up
    # reply in this thread renames it to eps0.
    dnnorm <- function(t, m1, m2, s1, s2, rho, eps = .Machine$double.eps ^ 0.5){
      a <- s1*sqrt(1-rho^2)
      b <- s1*rho
      c <- s2
      f <- function(u, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c, eps = eps)
      {
        nen0 <- m2+c0*u
        #catch a division by 0
        nen <- ifelse(abs(nen0)>eps, nen0, ifelse(nen0>0, nen0+eps, nen0-eps))
        dnorm(u)/a0/nen * dnorm( t/a0/nen -(m1+b0*u)/a0)
      }
      -integrate(f, -Inf, -m2/c, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value+
       integrate(f, -m2/c, Inf, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value
    }

    ###
    #
    #cdf of X1X2, X1~N(m1,s1^2), X2~N(m2,s2^2), corr(X1,X2)=rho, evaluated at t
    #
    # eps is a very small number to catch errors in division by 0
    ###
    #
    pnnorm <- function(t, m1, m2, s1, s2, rho, eps = .Machine$double.eps ^ 0.5){
      a <- s1*sqrt(1-rho^2)
      b <- s1*rho
      c <- s2
      fp <- function(u, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c, eps = eps)
      {
        nen0 <- m2+c0*u ## for all u's used in integrate: never negative
        #catch a division by 0
        nen <- ifelse(nen0>eps, nen0, nen0+eps)
        dnorm(u) * pnorm( t/a0/nen- (m1+b0*u)/a0)
      }
      fm <- function(u, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c, eps = eps)
      {
        nen0 <- m2+c0*u ## for all u's used in integrate: never positive
        #catch a division by 0
        nen <- ifelse(nen0 < -eps, nen0, nen0-eps)
        dnorm(u) * pnorm(-t/a0/nen+ (m1+b0*u)/a0)
      }
      integrate(fm, -Inf, -m2/c, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value+
      integrate(fp, -m2/c, Inf, t = t, m1 = m1, m2 = m2, a0 = a, b0 = b, c0 = c)$value
    }

## If you have to evaluate dnnorm() or pnnorm() at a lot of values of t for some given m1, m2, s1, s2, rho, then you should first evaluate [p,d]nnorm() on a (smaller) number of gridpoints of values for t and then use something like approxfun() or splinefun() to give you a function that is much faster to evaluate.

Hth, Peter
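The grid-then-interpolate advice at the end of the message can be sketched as follows. To keep the example self-contained it does not use pnnorm(); instead slow_cdf() is a hypothetical stand-in (a standard normal cdf computed via integrate()) for any cdf that is expensive to evaluate. The grid range, grid size and the monotone spline method are illustrative assumptions.

```r
# Stand-in for an expensive, scalar-only cdf such as pnnorm()
slow_cdf <- function(t) integrate(dnorm, -Inf, t)$value

# 1. Evaluate the slow function once on a moderate grid
grid <- seq(-4, 4, length.out = 81)
vals <- vapply(grid, slow_cdf, numeric(1))

# 2. Build a fast, vectorized interpolant; "monoH.FC" keeps it monotone,
#    which is a sensible property for a cdf
fast_cdf <- splinefun(grid, vals, method = "monoH.FC")

# 3. Use the interpolant at many new points inside the grid range
t_new   <- seq(-3, 3, by = 0.01)
max_err <- max(abs(fast_cdf(t_new) - pnorm(t_new)))
cat("max interpolation error:", max_err, "\n")
```

The interpolant is only trustworthy inside the grid range, so the grid should cover all values of t that will be needed.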
[R] distribution of the product of two correlated normal
Hi,

Does anyone know the distribution of the product of two correlated normal random variables? Say I have X~N(a, \sigma1^2) and Y~N(b, \sigma2^2), and \rho(X,Y) is not equal to 0; I want to know the pdf or cdf of XY.

Thanks a lot in advance.

yu
[R] distribution maps
Dear all,

I would like to know if there are any R packages on CRAN that can generate distribution maps of species. I think that this issue has not been discussed, but I did not search extensively on CRAN or in the help archives.

Best regards

Rogério
Re: [R] distribution maps
On Fri, 6 Jan 2006, Rogério Rosa da Silva wrote:
> Dears, I would like to know if there is a R package(s) on CRAN that can generate distribution maps of species. I think that this issue not has been discussed, but I did not search extensively on CRAN or help archives.

Could I suggest the Spatial and Environmetrics Task Views reached from the Task View item in the navigation bar on CRAN? You may also find the R-sig-geo mailing list a useful place to make your question a little more detailed - you do not say anything about your data, and a helpful reply would depend on knowing that.

> Best regards Rogério

--
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]
Re: [R] Distribution fitting problem
At 10:32 2/11/2005, you wrote:
> I am using the MASS library function
>
>     fitdistr(x, dpois, list(lambda=2))
>
> but I get
>
>     Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
>     Function cannot be evaluated at initial parameters
>     In addition: There were 50 or more warnings (use warnings() to see the first 50)
>
> and all the first 50 warnings say
>
>     1: non-integer x = 1.45
>
> etc. Can anyone tell me what I am doing wrong. p.s. the data was read in from a .csv file that I wrote using octave

Mark,

Try

    fitdistr(x, "Poisson")

I think this is enough to fit a Poisson distribution to your data.

Bernardo Rangel Tura, MD, MSc
National Institute of Cardiology Laranjeiras
Rio de Janeiro, Brazil
[R] Distribution fitting problem
I am using the MASS library function fitdistr(x, dpois, list(lambda=2)) but I get Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) : Function cannot be evaluated at initial parameters In addition: There were 50 or more warnings (use warnings() to see the first 50) and all the first 50 warnings say 1: non-integer x = 1.45 etc Can anyone tell me what I am doing wrong. p.s. the data was read in from a .csv file that I wrote using octave
Re: [R] Distribution fitting problem
Mark Miller <mmiller at nassp.uct.ac.za> writes:
> I am using the MASS library function
>
>     fitdistr(x, dpois, list(lambda=2))
>
> but I get
>
>     Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
>     Function cannot be evaluated at initial parameters
>     In addition: There were 50 or more warnings (use warnings() to see the first 50)

The docs say:

"For the following named distributions, reasonable starting values will be computed if start is omitted or only partially specified: cauchy, gamma, logistic, negative binomial (parametrized by mu and size), t and weibull."

dpois is not among them, so you probably have to provide reasonable starting values for the parameters.

Dieter
Re: [R] Distribution fitting problem
Can you advise another distribution? I was thinking of the exponential, but was advised to use the Poisson since they are independent; I forgot that it requires integers.

On Wednesday 02 November 2005 14:44, you wrote:
> Mark Miller wrote:
> > I am using the MASS library function
> >
> >     fitdistr(x, dpois, list(lambda=2))
> >
> > but I get
> >
> >     Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
> >     Function cannot be evaluated at initial parameters
> >     In addition: There were 50 or more warnings (use warnings() to see the first 50)
> >
> > and all the first 50 warnings say
> >
> >     1: non-integer x = 1.45
> >
> > etc. Can anyone tell me what I am doing wrong. p.s. the data was read in from a .csv file that I wrote using octave
>
> Hi, Mark,
>
> If you think the data are poisson, the observations should be integers.
>
> --sundar
Re: [R] Distribution fitting problem
On Wed, 2 Nov 2005 14:32:52 +0200, Mark Miller wrote:

MM> I am using the MASS library function
MM>
MM> fitdistr(x, dpois, list(lambda=2))
MM>
MM> but I get
MM>
MM> Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
MM> Function cannot be evaluated at initial parameters
MM> In addition: There were 50 or more warnings (use warnings() to see
MM> the first 50)
MM>
MM> and all the first 50 warnings say
MM>
MM> 1: non-integer x = 1.45
MM> etc
MM>

are the data integers (as implicit in the assumption of Poisson dist'n)? the above message seems to say that they are not

Adelchi Azzalini

--
Adelchi Azzalini [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147, http://azzalini.stat.unipd.it/
Re: [R] Distribution fitting problem
Hi, Mark,

Not without seeing your data. You only provide the first value in the warning message below.

--sundar

Mark Miller wrote:
> Can you advise another distribution, was thinking of exponential, but was advised poisson since independent, forgot about requiring integers
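To summarize the thread in runnable form, here is a minimal sketch with simulated data (not Mark's data, which was never posted). It assumes the MASS package and shows the two cases discussed: integer counts fitted as Poisson by name, and continuous positive values fitted as exponential, which is one candidate Mark mentioned for non-integer data. For both named distributions fitdistr() uses closed-form estimates, so no starting values are needed.

```r
library(MASS)   # for fitdistr()

set.seed(123)

## Case 1: genuine counts (integers), Poisson is a candidate
counts   <- rpois(200, lambda = 2)
fit_pois <- fitdistr(counts, "Poisson")
print(fit_pois)          # estimate of lambda with standard error

## Case 2: continuous positive data (e.g. waiting times), so Poisson
## is not applicable; an exponential fit is one simple alternative
waits   <- rexp(200, rate = 0.7)
fit_exp <- fitdistr(waits, "exponential")
print(fit_exp)           # estimate of the rate with standard error

## The maximum-likelihood estimates have closed forms:
##   Poisson lambda   = sample mean
##   exponential rate = 1 / sample mean
c(lambda = mean(counts), rate = 1 / mean(waits))
```

Whether an exponential (or gamma, Weibull, ...) is actually plausible depends on how the data arose, which is the point of sundar's request to see the data.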
Re: [R] Distribution
Srini,

You should probably look at ?hist. If you look at the value section, you will see that you can get the information you want from the values returned from hist. If these are microarray probes and intensities, there may be specific methods for visualizing the data available from the bioconductor project (www.bioconductor.org).

Hope this helps,

Sean

- Original Message -
From: Srinivas Iyyer [EMAIL PROTECTED]
To: Rhelp r-help@stat.math.ethz.ch
Sent: Monday, February 21, 2005 6:21 PM
Subject: [R] Distribution

> Dear group, apologies for asking a simple question. I have a file where the data looks like this:
>
>     Probe   Intensity
>     0:0     501.0
>     1:0     17760.5
>     2:0     511.0
>     3:0     18468.3
>     4:0     199.8
>     5:0     508.0
>     6:0     17241.8
>     7:0     507.5
>     8:0     17910.0
>     9:0     482.5
>     10:0    17480.3
>     11:0    434.0
>     12:0    17631.3
>     13:0    444.8
>     14:0    17423.0
>     15:0    505.3
>     16:0    16693.0
>     17:0    438.5
>     18:0    16920.0
>     19:0    491.3
>     20:0    16878.0
>     21:0    486.3
>     22:0    16582.0
>     23:0    483.8
>     24:0    16694.8
>     25:0    452.3
>     26:0    16221.5
>     27:0    438.3
>     28:0    17119.8
>     29:0    455.5
>     30:0    16579.0
>     31:0    424.5
>     32:0    16691.3
>     33:0    472.0
>
> My question is how do I know the distribution of the intensities. My aim is to find out the number of intensities or probes that fall in a certain range. For example 500 probes has intensities ranging from 50 to 150. 300 probes has intensities ranging from 151-250. I have no clue how to do it for 500,000 probes. Can any one please help doing it in R.
>
> thanks and apologies again
> srini
Re: [R] Distribution
Have you considered qqnorm or hist? If yes, "PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html". It might help you phrase your question so you are more likely to get a useful response, and it might help you get the answer for yourself without waiting for someone to reply.

hope this helps.

spencer graves
RE: [R] Distribution
You can use

    table(cut(intensity, breaks))

where `intensity' is the vector of intensity values, and `breaks' are the boundaries of the bins (e.g., c(0, 150, 250, ...)).

Andy
Re: [R] Distribution
You can read in the data using read.delim() or read.table(). For illustration let us generate some artificial data and suppose that you are interested in equal-sized breaks of 5 (you can define your own break points instead).

    x <- rchisq(50, df=10, ncp=5)
    brk <- seq(0, 5*ceiling(max(x)/5), by=5) # increments of size 5
    h <- hist(x, breaks=brk, plot=FALSE)

h$breaks and h$counts will give you the break points and the counts, but I always have trouble matching which interval the counts belong to.

Another, easier way is to use cut() followed by table(), where the labels produced by cut() are helpful.

    table( cut( x, breaks=brk ) )

As a bonus, you can simplify specifying the break points by including Inf as the endpoint in cut.

    brk2 <- seq(0, max(x), by=5) # increments of size 5
    table( cut( x, breaks=c(brk2, Inf) ) )

Regards, Adai

On Mon, 2005-02-21 at 18:44 -0500, Sean Davis wrote:
> [snip]
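On the difficulty of matching hist() counts to their intervals mentioned above: a small sketch, on simulated data with the same setup as Adai's example, that attaches cut()-style interval labels to h$counts. It relies on hist()'s defaults (right-closed bins, lowest value included), which correspond to cut(..., include.lowest = TRUE); the variable names are illustrative.

```r
set.seed(10)
x   <- rchisq(50, df = 10, ncp = 5)              # artificial data
brk <- seq(0, 5 * ceiling(max(x) / 5), by = 5)   # bins of width 5

# Counts from hist(), without drawing the plot
h <- hist(x, breaks = brk, plot = FALSE)

# Same bins via cut(); include.lowest = TRUE mirrors hist()'s default
bins <- cut(x, breaks = brk, include.lowest = TRUE)

# Label each hist() count with the interval it belongs to
labelled <- setNames(h$counts, levels(bins))
print(labelled)

# The two routes give the same counts bin by bin
print(table(bins))
```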
Re: [R] Distribution of Data (was: your reference on this problem highly appreciated)
There are many tools for this, e.g., qqnorm, density, and in library(MASS) fitdistr. Also do a literature search on transformations (especially on transformations to normality) and on mixture distributions, esp. Titterington, Smith and Makov (1986) Statistical Analysis of Finite Mixture Distributions (Wiley).

What is the nature of your application? If you tell us more about the context, many people could tell you which distributions might be plausible and which would not be credible except as an approximation, e.g., a normal distribution for numbers that cannot be negative and whose distribution might be positively skewed.

hope this helps.

spencer graves

p.s. PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Yong Wang wrote:
> please help me on this
>
> - Message Text -
> Dear all R users,
>
> first, sorry that this question might not be appropriate to ask here. I want to know theories or techniques aimed at the following questions: I have a sample, say,K(at the range from 0 to 2); the sample data's central moments m(1)---m(j) are estimated (j can be large). also, I can use some methodology to calculate the upper and lower bound of the probability of any interval of interest, say, for the interval (400--800). with all this information, I want to recover the distribution of the data, or at least recover some approximating analytic form. Does anybody know such theory or techniques? your help will be highly appreciated.
>
> best regards
> yong
[R] distribution of second order statistic
Hi,

I am getting some weird results here and I think I am missing something. I am trying to program a function that, for a set of random variables drawn from uniform distributions, plots the distribution of the second order statistic of the ordered variables. (i.e. I have n uniform distributions on [0, w_i], with w_i different from w_j, for i=1..n. I want to plot the distribution of the second order statistic, i.e. one below the maximum.)

I thought that the way to do this is to calculate:

    F = Sum over i { (1-Fi) * Product over all j different from i of Fj } + Product over all i of Fi

where the Fi are just the respective uniform cdfs for variable i.

The problem is that when I do this and plot F over the range from 0 to the highest of the w_i, I don't get a cdf but something that slopes down again at some point. What is going on?

Any help is greatly appreciated.

Thanks, eugene.
Re: [R] distribution of second order statistic
The order statistics have a beta distribution, so pbeta is all you need.

On Mon, 15 Dec 2003, Eugene Salinas (R) wrote:
> I am getting some weird results here and I think I am missing something. I am trying to program a function that for a set of random variables drawn from uniform distributions plots that distribution of the second order statistic of the ordered variables. (ie I have n uniform distributions on [0, w_i] for w_i different w_j and i=1..n. I want to plot the distribution of the second order statistic ie one less the maximum. I thought that the way to do this is to calculate: F= Sum over i { (1-Fi) * Product of all j different i of Fj} + Product over all i of Fi where Fi are just the respective uniform cdf for variable i. The problem is that when I do this and plot F over the range from 0 to the highest of the w_i I don't get a cdf but something that slopes down at some point again. What is going on?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
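A sketch connecting the two messages above. It implements the formula from the question (probability that at most one variable exceeds x) for uniforms on [0, w_i] with different w_i, and checks that in the identically distributed case on [0, 1] it reduces to the beta cdf given by pbeta(). Since Eugene's code was not posted, the following is only a guess at one possible cause of the downward slope: if each F_i is computed as x / w_i without capping it at 1 for x > w_i, then (1 - F_i) goes negative; punif() does the capping. The function name second_max_cdf and the example w values are made up.

```r
# cdf of the second-largest of independent U(0, w_i) variables
second_max_cdf <- function(x, w) {
  sapply(x, function(xx) {
    Fi <- punif(xx, min = 0, max = w)   # each F_i stays within [0, 1]
    all_below <- prod(Fi)               # no variable exceeds xx
    one_above <- sum(sapply(seq_along(w), function(i) {
      (1 - Fi[i]) * prod(Fi[-i])        # only variable i exceeds xx
    }))
    all_below + one_above
  })
}

# Different upper limits w_i (illustrative values)
w  <- c(1, 2, 3.5)
x  <- seq(0, max(w), length.out = 200)
Fx <- second_max_cdf(x, w)
plot(x, Fx, type = "l", ylab = "F(x)")  # non-decreasing, ends at 1

# Identically distributed case: second-largest of n = 5 U(0,1) variables
# is the 4th order statistic, i.e. Beta(4, 2)
x01 <- seq(0, 1, by = 0.05)
max(abs(second_max_cdf(x01, rep(1, 5)) - pbeta(x01, 4, 2)))
```

For n iid U(0,1) variables the k-th smallest is Beta(k, n - k + 1), which is where the pbeta() shortcut comes from; with unequal w_i the general formula is needed.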
[R] Distribution transformations
Dear R-Users,

I have a question that has bothered me for the last few days. It is supposed to be easy but I can't come up with a solution. Are there any functions in R dealing with transforming empirical and parametric distributions? I have two data sets of observed variables that I want to transform to Frechet and Uniform distributions. I would appreciate it if someone could inform me about R functions for this purpose or enlighten me how to do it by myself.

Thank you very much in advance,

Viola Rossini
Re: [R] Distribution transformations
You wrote:
> Dear R-Users, I have a question that bothers me in the last few days. It is supposed to be easy but I can't come up with a solution. Are there any functions in R dealing with transforming empirical and parametric distributions? I have two data sets of observed variables that I want to transform to Frechet and Uniform distribution. I would appreciate if someone could inform me about R-functions for this purpose or enligthen me how to do it by myself. Thank you very much in advance, Viola Rossini

Is this a homework question?

cheers,

Rolf Turner
[EMAIL PROTECTED]
Re: [R] Distribution transformations
For the uniform distribution, have you considered something like

    (((1:n)-0.5)/n)[rank(x)]

? For the Frechet distribution, a search -> R site search from www.r-project.org exposed something that should help. The information you need seems to be there.

hope this helps.

spencer graves
Re: [R] Distribution transformations
I am still not getting it. I am trying to understand multivariate distributions and copulas. At the beginning of each article it is said that the observed data must be transformed to a uniform or Frechet distribution by means of the probability integral transform. Apparently this is something easy and trivial and a standard procedure in introductory statistics. Well, I have some books on statistics of different degrees of complexity, but unfortunately I cannot find the answer there. All examples are only about how to generate a random sample with a desired (always exponential) distribution.

Now, I have two variables X and Y and I want to transform them to Frechet or uniform. I was just thinking, if this is as simple and trivial as all the stat books say, then there must exist a simple function for it in R.

P.s. @Rolf, I would like to have it as homework but I am afraid I am too old for school.
Re: [R] Distribution transformations
On 23 Nov 2003 at 19:35, Viola Rossini wrote:
> [snip]

The Frechet distribution is in the evd (extreme value dist) package on CRAN. The basic principle is that if U is uniform (0,1) and F is a cumulative distribution function, then F^{-1}(U) is distributed as F.

Kjetil Halvorsen
Re: [R] Distribution transformations
[EMAIL PROTECTED] writes:
> On 23 Nov 2003 at 19:35, Viola Rossini wrote:
>
> The Frechet distribution is in the evd (extreme value dist) package on CRAN. The basic principle is that if U is uniform (0,1) and F is a cumulative distribution function, then F^{-1}(U) is distributed as F.

[Slightly unfortunate double use of F there]

...and conversely, if X has distribution D with continuous cumulative distribution function F, then F(X) will have a uniform distribution. I suspect this was the clue that Viola was missing. This requires that you know F, though.

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
([EMAIL PROTECTED])
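The two directions of the probability integral transform mentioned in this thread can be sketched in a few lines of base R. The sketch assumes simulated exponential data; for the case where F is unknown it uses rescaled ranks as an empirical stand-in for F (dividing by n + 1 so that no value maps to exactly 1), and it uses the unit Frechet distribution, with cdf exp(-1/z), as the target. These choices are common conventions and not something the thread itself specifies.

```r
set.seed(1)
x <- rexp(500, rate = 2)            # observed data (simulated here)
n <- length(x)

## F known: F(X) is uniform on (0, 1)
u_known <- pexp(x, rate = 2)

## F unknown: replace F by the empirical cdf, via rescaled ranks
u_emp <- rank(x) / (n + 1)

## Uniform -> unit Frechet: invert the Frechet cdf exp(-1/z),
## i.e. apply F^{-1}(u) = -1 / log(u)
z <- -1 / log(u_emp)

summary(u_known)    # roughly flat on (0, 1)
summary(z)          # heavy right tail, all values positive
```

For two variables X and Y, as in the copula setting, the same transform would be applied to each margin separately; the ranks preserve the dependence between them.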