Re: [R] R Commander QQ Plot with triangular distribution
Thank you for your prompt reply. I apologize for posting this as an R Commander question. Your response points me in the right direction, but I am still not quite there.

The triangle package has qtriangle and dtriangle. Their formal arguments are a, b, and c (see formals(qtriangle)). When I use R Commander as:

qqPlot(Transfer5000$Transfer.Rate, dist="triangle", a=3000, b=5000, c=4000)

I get the error message:

Error in qqPlot.default(Transfer5000$Transfer.Rate, dist = "triangle", :
  argument 5 matches multiple formal arguments

I am unclear as to how to specify that the c argument pertains specifically to the qtriangle (or perhaps dtriangle) function, that is, how to avoid a conflict with the other qqPlot arguments that c matches ambiguously.

Again, sorry to post here, as this has clearly wandered into other territory. My exploratory searches to resolve this take me well beyond my current ability in R. If there is not an obvious answer, I will re-post specifically on the topic of specifying formal arguments of distributions to qqPlot.

Thanks.
Dick Males

On Tue, Mar 8, 2011 at 11:19 AM, John Fox j...@mcmaster.ca wrote:

Dear R. Males,

This isn't really an R Commander question, since the qqPlot() function is in the car package and is just invoked by the R Commander.

From ?qqPlot:

distribution: root name of comparison distribution - e.g., norm for the normal distribution ... Any distribution for which quantile and density functions exist in R (with prefixes q and d, respectively) may be used.

Thus, if there's a qtriangle() and dtriangle(), I suppose that you should be able to get a QQ plot.

Further from ?qqPlot:

...: arguments such as df to be passed to the appropriate quantile function.

Thus, you should *name* the arguments to be passed to qtriangle() -- perhaps (of course, you should use the correct names) min=3000, max=5000, mode=4000.

I hope this helps,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] R Commander QQ Plot with triangular distribution
Wow, quick response, works perfectly, just as needed. Thanks to both of you for pointing me in the right direction, and for your contributions to the R community.

Dick

On Wed, Mar 9, 2011 at 11:21 AM, Peter Ehlers ehl...@ucalgary.ca wrote:

On 2011-03-09 07:53, Richard and Barbara Males wrote:

Thank you for your prompt reply. I apologize for posting this as an R Commander question. Your response points me in the right direction, but I am still not quite there.

The triangle package has qtriangle and dtriangle. Their formal arguments are a, b, and c (see formals(qtriangle)). When I use R Commander as:

qqPlot(Transfer5000$Transfer.Rate, dist="triangle", a=3000, b=5000, c=4000)

I get the error message:

Error in qqPlot.default(Transfer5000$Transfer.Rate, dist = "triangle", :
  argument 5 matches multiple formal arguments

I am unclear as to how to specify that the c argument pertains specifically to the qtriangle (or perhaps dtriangle) function, that is, how to avoid a conflict with the other qqPlot arguments that c matches ambiguously.

Dick,

I think that the author of the triangle package has chosen an unfortunate name for one of the parameters. Here's a work-around:

## define two new functions dtri() and qtri():
dtri <- function(q, a=0, b=1, cc=.5) dtriangle(q, a, b, cc)
qtri <- function(p, a=0, b=1, cc=.5) qtriangle(p, a, b, cc)

## now use "tri" as the distribution to pass to qqPlot:
qqPlot(x, "tri", a=3000, b=5000, cc=4000)

Peter Ehlers
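The error in this thread comes from R's partial argument matching, and it can be reproduced without car or triangle. The sketch below is only an illustration under stated assumptions: toy_qq() is a hypothetical stand-in that has two formals starting with "c" (it is not the real qqPlot signature), and qtri() is a hand-written triangular quantile function rather than the triangle package's qtriangle().

```r
# Toy stand-in for a plotting function with formals that start with "c".
# Hypothetical: this is NOT the real car::qqPlot signature.
toy_qq <- function(x, distribution = "norm", col = "black", cex = 1, ...) {
  # Build the quantile function name from the root name, e.g. "tri" -> qtri
  q_fun <- match.fun(paste0("q", distribution))
  p <- ppoints(length(x))
  # Extra named arguments in ... are forwarded to the quantile function
  cbind(theoretical = q_fun(p, ...), sample = sort(x))
}

# Hand-written triangular quantile function; the mode is named cc, not c.
qtri <- function(p, a = 0, b = 1, cc = 0.5) {
  split <- (cc - a) / (b - a)
  ifelse(p < split,
         a + sqrt(p * (b - a) * (cc - a)),
         b - sqrt((1 - p) * (b - a) * (b - cc)))
}

set.seed(1)
x <- qtri(runif(200), a = 3000, b = 5000, cc = 4000)

# c= partially matches both col and cex, so argument matching fails
clash <- tryCatch(toy_qq(x, "tri", a = 3000, b = 5000, c = 4000),
                  error = function(e) conditionMessage(e))
print(clash)

# cc= matches no formal of toy_qq, so it passes through ... to qtri()
qq <- toy_qq(x, "tri", a = 3000, b = 5000, cc = 4000)
print(head(qq))
```

Renaming the parameter in a thin wrapper, as in the work-around above, avoids the clash because the new name is no longer a prefix of any formal argument of the calling function.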
[R] R Commander QQ Plot with triangular distribution
I am attempting to use the R Commander Graphs Quantile-Comparison functionality on a dataset, to compare with a triangular distribution. I have the package triangle. My question is on the syntax of how to specify the parameters of the theoretical distribution in the Parameters field of the dialog box.

For example, the theoretical distribution has min of 3000, max of 5000, mode of 4000. When I enter this info as 3000,5000,4000 in the parameters field, I get:

qqPlot(EmpiricalData$Value, dist="triangle", 3000,5000,4000)

It produces a plot, but I am not sure that this is correct. I have searched for examples, not found anything. Any help much appreciated.

R. Males
Cincinnati, Ohio, USA
Re: [R] generating correlated random variables from different distributions
Thank you for your reply.

The application is a Monte Carlo simulation in environmental planning. Different possible remediation measures have different costs, and produce different results. For example, a $20,000 plan may add 10 acres of wetlands and 12 acres of bird habitat. The desire is to describe the uncertainty in the cost and the outputs (acres of wetlands, acres of bird habitat) by distributions. The cost may be described by a normal distribution, mean $20k, $5k SD, and the 12 acres of birds may be described by a uniform distribution (10 to 14). [These are just examples, not representative of a real problem.]

We may know (or think) that wetlands and bird habitat are positively correlated (0.6), and that there is a stronger correlation of both with cost (0.85). So the effort is to generate, through MCS, values at each iteration of cost, acres of wetland, and acres of bird habitat, such that the resultant values give the same correlation, and the values of cost, bird habitat and wetland habitat return the input distributions. The overall desire is to compare different remediation measures, taking into account uncertainty in costs and results.

One possible approach (I have not tried it yet, but will do so in the near future) is to generate, for each iteration, three independent (0,1) random variables, correlate them via the Cholesky approach, and use them as input to the inverse normal, inverse uniform, etc. to get the three variables for each iteration. The primary distributions of interest are normal, uniform, triangular, gamma, and arbitrary cdf, so this approach seems plausible in that inverse distributions are readily available.

Thanks in advance.

Dick Males
Cincinnati, OH, USA

On Thu, Apr 29, 2010 at 12:31 PM, Greg Snow greg.s...@imail.org wrote:

The method you are using (multiply by cholesky) works for normal distributions, but not necessarily for others (if you want different means/sd, then add/multiply after transforming). For other distributions this process can sometimes give the correlation you want, but may change the variable(s) to no longer have the desired distribution.

The short answer to your question is "It Depends"; the full long answer could fill a full semester course. If you tell us more of your goal we may be able to give a more useful answer.

The copula package is one possibility. If you know the conditional distribution of each variable given the others then you can use Gibbs sampling.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
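The approach described in this reply (correlate standard normals, map them to uniforms, then apply each inverse CDF) is usually called a Gaussian copula. The following is a minimal base-R sketch of it, not a definitive implementation. Assumptions that are not from the thread: the wetland variable is given a triangular distribution with min 8, max 12, mode 10, and qtri() is a hand-written helper. The cost and bird-habitat distributions and the correlations (0.85, 0.85, 0.6) follow the example in the message.

```r
set.seed(1000)
n <- 10000

# Hand-written triangular quantile function (hypothetical helper)
qtri <- function(p, a, b, cc) {
  split <- (cc - a) / (b - a)
  ifelse(p < split,
         a + sqrt(p * (b - a) * (cc - a)),
         b - sqrt((1 - p) * (b - a) * (b - cc)))
}

# Target correlations; columns are cost, wetland, bird
target <- matrix(c(1.00, 0.85, 0.85,
                   0.85, 1.00, 0.60,
                   0.85, 0.60, 1.00), 3, 3)

# Step 1: independent standard normals, correlated via the Cholesky factor
z <- matrix(rnorm(n * 3), n, 3) %*% chol(target)

# Step 2: map each column to (0,1) with the normal CDF
u <- pnorm(z)

# Step 3: push each uniform column through its own inverse CDF
sims <- data.frame(
  cost    = qnorm(u[, 1], mean = 20000, sd = 5000),
  wetland = qtri(u[, 2], a = 8, b = 12, cc = 10),   # assumed distribution
  bird    = qunif(u[, 3], min = 10, max = 14)
)

print(round(cor(sims), 2))                       # Pearson, close to target
print(round(cor(sims, method = "spearman"), 2))  # rank correlation
```

Each margin keeps its intended distribution exactly, while the Pearson correlations come out close to, but not exactly at, the target, because the inverse-CDF step is nonlinear for the non-normal margins. If the match must be tighter, the input correlation matrix can be adjusted, or the copula package mentioned above can be used.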
[R] generating correlated random variables from different distributions
I need to generate a set of correlated random variables for a Monte Carlo simulation. The solutions I have found (http://www.stat.uiuc.edu/stat428/cndata.html, http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers), using Cholesky Decomposition, seem to work only if the variables come from the same distribution with the same parameters. My situation is that each variable may be described by a different distribution (or different parameters of the same distribution). This approach does not seem to work, see code and results below. Am I missing something here?

My math/statistics is not very good. Will I need to generate correlated uniform random variables on (0,1) and then use the inverse distributions to get the desired results I am looking for? That is acceptable, but I would prefer to just generate the individual distributions and then correlate them. Any advice much appreciated.

Thanks in advance

R. Males
Cincinnati, Ohio, USA

Sample Code:

# Testing Correlated Random Variables
# reference http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers
# reference http://www.stat.uiuc.edu/stat428/cndata.html

# create the correlation matrix
corMat=matrix(c(1,0.6,0.3,0.6,1,0.5,0.3,0.5,1),3,3)
cholMat=chol(corMat)

# create the matrix of random variables
set.seed(1000)
nValues=10000

# generate some random values
matNormalAllSame=cbind(rnorm(nValues),rnorm(nValues),rnorm(nValues))
matNormalDifferent=cbind(rnorm(nValues,1,1.5),rnorm(nValues,2,0.5),rnorm(nValues,6,1.8))
matUniformAllSame=cbind(runif(nValues),runif(nValues),runif(nValues))
matUniformDifferent=cbind(runif(nValues,1,1.5),runif(nValues,2,3.5),runif(nValues,6,10.8))

# bind to a matrix
print("correlation Matrix")
print(corMat)
print("Cholesky Decomposition")
print(cholMat)

# test same normal
resultMatNormalAllSame=matNormalAllSame%*%cholMat
print("correlation matNormalAllSame")
print(cor(resultMatNormalAllSame))

# test different normal
resultMatNormalDifferent=matNormalDifferent%*%cholMat
print("correlation matNormalDifferent")
print(cor(resultMatNormalDifferent))

# test same uniform
resultMatUniformAllSame=matUniformAllSame%*%cholMat
print("correlation matUniformAllSame")
print(cor(resultMatUniformAllSame))

# test different uniform
resultMatUniformDifferent=matUniformDifferent%*%cholMat
print("correlation matUniformDifferent")
print(cor(resultMatUniformDifferent))

and results

[1] "correlation Matrix"
     [,1] [,2] [,3]
[1,]  1.0  0.6  0.3
[2,]  0.6  1.0  0.5
[3,]  0.3  0.5  1.0
[1] "Cholesky Decomposition"
     [,1] [,2]      [,3]
[1,]    1  0.6 0.3000000
[2,]    0  0.8 0.4000000
[3,]    0  0.0 0.8660254
[1] "correlation matNormalAllSame"   <== ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.6036468 0.3013823
[2,] 0.6036468 1.0000000 0.5005440
[3,] 0.3013823 0.5005440 1.0000000
[1] "correlation matNormalDifferent"   <== no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.9141472 0.2676162
[2,] 0.9141472 1.0000000 0.2959178
[3,] 0.2676162 0.2959178 1.0000000
[1] "correlation matUniformAllSame"   <== ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.5971519 0.2959195
[2,] 0.5971519 1.0000000 0.5011267
[3,] 0.2959195 0.5011267 1.0000000
[1] "correlation matUniformDifferent"   <== no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.2312000 0.0351460
[2,] 0.2312000 1.0000000 0.1526293
[3,] 0.0351460 0.1526293 1.0000000
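For the normal case with different means and standard deviations, one way to get the intended correlations is to apply the Cholesky factor to standard normals first and only then rescale and shift each column. The sketch below reuses the correlation matrix, means, and standard deviations from the sample code above; the variable names are illustrative, and it is a minimal example rather than a general solution.

```r
set.seed(1000)
nValues <- 10000

corMat  <- matrix(c(1, 0.6, 0.3, 0.6, 1, 0.5, 0.3, 0.5, 1), 3, 3)
cholMat <- chol(corMat)

# Step 1: correlate independent standard normals (mean 0, sd 1)
z <- matrix(rnorm(nValues * 3), nValues, 3) %*% cholMat

# Step 2: rescale and shift each column afterwards; a positive linear
# transformation of a column leaves its correlations unchanged
means <- c(1, 2, 6)
sds   <- c(1.5, 0.5, 1.8)
x <- sweep(sweep(z, 2, sds, "*"), 2, means, "+")

print(round(cor(x), 3))            # close to corMat
print(round(colMeans(x), 3))       # close to means
print(round(apply(x, 2, sd), 3))   # close to sds
```

Multiplying columns that already have different variances by the Cholesky factor weights the mixture by those variances, which is why the "different" cases above drift from the target. This ordering only fixes the normal case; for non-normal margins the result no longer has the intended distribution.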
[R] determine upper convex hull, 2-dimensional case
For an environmental planning example that involves looking at the relative efficiencies of one plan over another, I need to determine the Pareto-efficient plans (which I have done), and then, within that set of plans, determine the convex hull representing the outer upper boundary of those points.

I have a data frame, dfPlans, as follows, representing the Pareto-efficient cost and benefit of different environmental restoration plans out of a larger set of plans, where A1, EC, A6, and A4 are identifiers for the plans.

       Cost   Benefit
A1     0.00     0.000
EC     0.00  7821.689
A6 76783.19 16094.142
A4 78703.73 22245.760

I am interested in determining what I believe is called the upper convex hull, i.e. the upper outer boundary if I plot benefit on the y axis and cost on the x axis. This should be plans A1, EC, and A4, and not point A6.

I have used chull, which returns all of the points, including A6, and have tried to use convhulln with the QU option, but I am unclear as to how to interpret the results, which are returned as follows:

chull(dfPlans)
[1] 1 2 3 4

convhulln(dfPlans, option="QU")
     [,1] [,2]
[1,]    3    2
[2,]    3    4
[3,]    1    2
[4,]    1    4

Any assistance greatly appreciated, any way to accomplish my goal (need not use convhulln or chull).

Thanks in advance.

--
Richard M. Males
Cincinnati, OH USA
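One way to get only the upper boundary without chull or convhulln is the upper half of a monotone-chain (Andrew's) hull scan. The sketch below is a base-R illustration under assumptions: upper_hull() is a hypothetical helper written for this example, and the data frame is rebuilt from the table in the post.

```r
dfPlans <- data.frame(
  Cost    = c(0.00, 0.00, 76783.19, 78703.73),
  Benefit = c(0.000, 7821.689, 16094.142, 22245.760),
  row.names = c("A1", "EC", "A6", "A4")
)

# Upper hull by a monotone-chain scan (hypothetical helper).
# Returns row indices of the hull vertices, ordered left to right.
upper_hull <- function(x, y) {
  ord  <- order(x, y)          # sort by x, ties broken by y
  hull <- integer(0)
  for (i in ord) {
    while (length(hull) >= 2) {
      a <- hull[length(hull) - 1]
      b <- hull[length(hull)]
      # z-component of (b - a) x (i - b); >= 0 means point b lies on or
      # below the segment a -> i, so b cannot be on the upper boundary
      cross <- (x[b] - x[a]) * (y[i] - y[b]) - (y[b] - y[a]) * (x[i] - x[b])
      if (cross >= 0) hull <- hull[-length(hull)] else break
    }
    hull <- c(hull, i)
  }
  hull
}

idx <- upper_hull(dfPlans$Cost, dfPlans$Benefit)
print(rownames(dfPlans)[idx])
```

Because ties in Cost are sorted by Benefit, the zero-cost plan A1 is kept as the starting vertex and EC follows it, which matches the boundary described in the post. Drop the first index if only strictly upper points are wanted.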
[R] fit.dist gnlm question, NaN and Inf results
I am attempting to fit discrete data (daily counts of arrivals of recreational vessels at locks on a river) using fit.dist from the gnlm package. Some distributions return values of NaN and Inf for certain situations; an example with Inf values is shown below.

# of vessels:                  1  2  3  4  5  6  7  8  9 10 11
# of days with # of vessels:  35 20 10  5  6  3  1  3  1  0  1   (stored in rTemp$counts)

Can anyone tell me under what conditions I will get these Inf/NaN?

Thanks in advance

Richard Males
Cincinnati, Ohio, USA

My function call is as follows, for each of the distributions:

fit.dist(c(1:length(rTemp$counts)), rTemp$counts, "binomial")

Output from fit.dist for the above input:

binomial distribution, n = 85
     mean  variance    nu.hat
2.6352941 4.6552249 0.2635294
-log likelihood             AIC
            Inf             Inf

beta binomial distribution, n = 85
     mean  variance    nu.hat   rho.hat
2.6352941 4.6552249 0.2947923 0.1403921
-log likelihood             AIC
            Inf             Inf

Poisson distribution, n = 85
    mean variance   mu.hat
2.635294 4.655225 2.635294
-log likelihood             AIC
       28.86352        29.86352
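The summary numbers in the output can be reproduced in base R from the frequency table, which also hints at where the Inf may come from. This is a hedged sketch, not a statement about the gnlm source: the binomial size of 10 is an assumption inferred only from the reported nu.hat, which equals mean/10.

```r
y      <- 1:11
counts <- c(35, 20, 10, 5, 6, 3, 1, 3, 1, 0, 1)

n      <- sum(counts)                        # number of days
mean_y <- sum(y * counts) / n                # weighted mean
var_y  <- sum(counts * (y - mean_y)^2) / n   # variance with divisor n

print(c(n = n, mean = mean_y, variance = var_y))

# Assumption: binomial size 10, inferred from nu.hat = mean / 10
size_assumed <- 10
nu_hat <- mean_y / size_assumed

# Under that size, a count of 11 has probability exactly zero
p <- dbinom(y, size = size_assumed, prob = nu_hat)
print(p[y == 11])

# One observed day with zero probability makes the -log likelihood infinite
neg_loglik <- -sum(counts * log(p))
print(neg_loglik)
```

If that reading is right, an observed value outside the support of the fitted distribution is enough to produce Inf. Whether recoding the counts to start at 0 or changing the assumed size removes it would need to be checked against the fit.dist documentation.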
Re: [R] flow map lines between point pairs (latitude/longitude)
Made my day, thanks. Worked just fine with a little bit of tweaking (flipping x and y around, basically). Rapid response much appreciated.

Dick

On Wed, Jul 2, 2008 at 5:25 PM, Ray Brownrigg [EMAIL PROTECTED] wrote:

Here's one solution, YMMV:

library(maps)
# Note, plot the map first to get the aspect ratio (or projection) right
map("county", xlim=range(df2VisitTrips[, c(5, 7)]),
    ylim=range(df2VisitTrips[, c(4, 6)]), col=8)
map("state", add=T)
map.axes()
for (i in 1:length(df2VisitTrips[, 1])) {
  lines(df2VisitTrips[i, c(5, 7)], df2VisitTrips[i, c(4, 6)],
        lwd=0.2 + df2VisitTrips[i, 3]/10, col=i+1)
}

Ray Brownrigg

On Thu, 03 Jul 2008, Richard and Barbara Males wrote:

I have a dataset giving traffic between pairs of ports, and the lat/lon of each port. Sample data as follows:

df2VisitTrips[1:4,]
  Origin Destination NumberOfTrips OriginLatitude OriginLongitude DestinationLatitude DestinationLongitude
1     P1         P16             1       39.45965       -80.15633            40.76111            -79.54583
2     P1          P3             1       39.45965       -80.15633            39.58861            -79.98222
3   P102        P108            19       36.98210       -88.21597            37.19667            -88.84389
4   P102        P109            71       36.98210       -88.21597            37.09472            -89.13694

I am interested in plotting variable-width lines, based on NumberOfTrips, between point pairs (OriginLongitude, OriginLatitude) and (DestinationLongitude, DestinationLatitude), e.g. a flow map between the ports. At some point, I may wish to add an underlay base map (say counties in the US), but that is not critical at this time.

There seem to be many packages in R that deal with spatial data (sp, maps, PBSMapping, etc.), but it is unclear to me which one would work best for this application. Any advice gratefully appreciated.

R. Males
Cincinnati, Ohio, USA
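When no base map is needed, the same flow lines can be drawn with base graphics alone, since segments() accepts a vector of line widths. The sketch below rebuilds the four sample rows from the post in a data frame named trips (a hypothetical name) and reuses the width rule 0.2 + NumberOfTrips/10 from the reply above. It draws to a null PDF device so that it runs without a display; that device call is an addition for the example only.

```r
trips <- data.frame(
  Origin               = c("P1", "P1", "P102", "P102"),
  Destination          = c("P16", "P3", "P108", "P109"),
  NumberOfTrips        = c(1, 1, 19, 71),
  OriginLatitude       = c(39.45965, 39.45965, 36.98210, 36.98210),
  OriginLongitude      = c(-80.15633, -80.15633, -88.21597, -88.21597),
  DestinationLatitude  = c(40.76111, 39.58861, 37.19667, 37.09472),
  DestinationLongitude = c(-79.54583, -79.98222, -88.84389, -89.13694)
)

# Line width grows with traffic, using the same rule as the reply above
line_width <- 0.2 + trips$NumberOfTrips / 10

pdf(NULL)  # null device; omit this line to draw on screen
# Longitude on x and latitude on y; empty frame sized to all endpoints
plot(range(trips$OriginLongitude, trips$DestinationLongitude),
     range(trips$OriginLatitude, trips$DestinationLatitude),
     type = "n", xlab = "Longitude", ylab = "Latitude")
# One vectorized call draws every origin-destination pair
segments(trips$OriginLongitude, trips$OriginLatitude,
         trips$DestinationLongitude, trips$DestinationLatitude,
         lwd = line_width, col = seq_len(nrow(trips)) + 1)
invisible(dev.off())
```

This skips the projection and aspect-ratio handling that the maps package provides, so it is better suited to a quick look at the flows than to a finished map.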