Re: [R] R Commander QQ Plot with triangular distribution

2011-03-09 Thread Richard and Barbara Males
Thank you for your prompt reply, and I apologize for posting this as an
R Commander question.  Your response points me in the right direction,
but I am still not quite there.  The triangle package has qtriangle and
dtriangle functions, whose formal arguments are a, b, and c (per
formals(qtriangle)).  When I run, from R Commander:

qqPlot(Transfer5000$Transfer.Rate, dist="triangle", a=3000,b=5000,c=4000)

I get the error message:

Error in qqPlot.default(Transfer5000$Transfer.Rate, dist = "triangle",  :
  argument 5 matches multiple formal arguments

and I am unclear as to how to specify that the c argument pertains
specifically to the qtriangle (or perhaps dtriangle) function, that is,
how to avoid a conflict with other qqPlot arguments that c matches
ambiguously.

Again, sorry to post here, as this has clearly wandered into other
territory.  My exploratory searches to resolve this take me well
beyond my current ability in R.  If there is not an obvious answer, I
will re-post specifically on the topic of specifying formal arguments
of distributions to qqPlot.

Thanks.

Dick Males

On Tue, Mar 8, 2011 at 11:19 AM, John Fox j...@mcmaster.ca wrote:
 Dear R. Males,

 This isn't really an R Commander question, since the qqPlot() function is in
 the car package and is just invoked by the R Commander.

 From ?qqPlot: "distribution: root name of comparison distribution - e.g.,
 "norm" for the normal distribution ... Any distribution for which quantile
 and density functions exist in R (with prefixes q and d, respectively) may
 be used." Thus, if there's a qtriangle() and dtriangle(), I suppose that you
 should be able to get a QQ plot. Further from ?qqPlot: "...: arguments
 such as df to be passed to the appropriate quantile function." Thus, you
 should *name* the arguments to be passed to qtriangle() -- perhaps (of
 course, you should use the correct names) min=3000, max=5000, mode=4000.

 I hope this helps,
  John

 
 John Fox
 Senator William McMaster
  Professor of Social Statistics
 Department of Sociology
 McMaster University
 Hamilton, Ontario, Canada
 http://socserv.mcmaster.ca/jfox







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Commander QQ Plot with triangular distribution

2011-03-09 Thread Richard and Barbara Males
Wow, quick response, works perfectly, just as needed.

Thanks to both of you for pointing me in the right direction, and for
your contributions to the R community.

Dick

On Wed, Mar 9, 2011 at 11:21 AM, Peter Ehlers ehl...@ucalgary.ca wrote:
 On 2011-03-09 07:53, Richard and Barbara Males wrote:

 Thank you for your prompt reply, and I apologize for posting this as an
 R Commander question.  Your response points me in the right direction,
 but I am still not quite there.  The triangle package has qtriangle and
 dtriangle functions, whose formal arguments are a, b, and c (per
 formals(qtriangle)).  When I run, from R Commander:

 qqPlot(Transfer5000$Transfer.Rate, dist="triangle", a=3000,b=5000,c=4000)

 I get the error message:

 Error in qqPlot.default(Transfer5000$Transfer.Rate, dist = "triangle",  :
   argument 5 matches multiple formal arguments

 and I am unclear as to how to specify that the c argument pertains
 specifically to the qtriangle (or perhaps dtriangle) function, that is,
 how to avoid a conflict with other qqPlot arguments that c matches
 ambiguously.

 Again, sorry to post here, as this has clearly wandered into other
 territory.  My exploratory searches to resolve this take me well
 beyond my current ability in R.  If there is not an obvious answer, I
 will re-post specifically on the topic of specifying formal arguments
 of distributions to qqPlot.

 Dick,

 I think that the author of the triangle package has chosen
 an unfortunate name for one of the parameters. Here's a
 work-around:

 ## define two new functions dtri() and qtri():
  dtri <- function(q, a=0, b=1, cc=.5) dtriangle(q, a, b, cc)
  qtri <- function(p, a=0, b=1, cc=.5) qtriangle(p, a, b, cc)

 ## now use "tri" as the distribution to pass to qqPlot:
  qqPlot(x, "tri", a=3000, b=5000, cc=4000)

 Peter Ehlers
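
The error and the work-around can be illustrated with R's partial argument matching. This is only a sketch: plot_like() is a hypothetical stand-in and does not reproduce the real signature of car::qqPlot. It assumes a function with two formals that both start with "c" ahead of its `...` argument, so a supplied `c=` is an ambiguous prefix, while `cc=` matches nothing and falls through to `...`.

```r
# Hypothetical stand-in for a plotting function with several formals that
# start with "c"; this is NOT the real signature of car::qqPlot.
plot_like <- function(x, dist, col = 1, cex = 1, ...) {
  names(list(...))  # names of the arguments that fell through to '...'
}

# 'c' is a prefix of both 'col' and 'cex', so partial matching is ambiguous
# and R refuses the call before the function body runs.
msg <- tryCatch(
  plot_like(1:3, "triangle", a = 3000, b = 5000, c = 4000),
  error = function(e) conditionMessage(e)
)
print(msg)

# 'cc' is not a prefix of any formal, so it is passed on through '...'.
passed <- plot_like(1:3, "triangle", a = 3000, b = 5000, cc = 4000)
print(passed)
```

Renaming the parameter in small wrapper functions, as in the reply above, avoids the clash without touching either package.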




[R] R Commander QQ Plot with triangular distribution

2011-03-08 Thread Richard and Barbara Males
I am attempting to use the R Commander Graphs Quantile-Comparison
functionality on a dataset, to compare with a triangular distribution.
  I have the package triangle.   My question is on the syntax of how
to specify the parameters of the theoretical distribution in the
Parameters field of the dialog box.  For example, the theoretical
distribution has min of 3000, max of 5000, mode of 4000.   When I
enter this info as 3000,5000,4000 in the parameters field, I get:

qqPlot(EmpiricalData$Value, dist="triangle", 3000,5000,4000)

it produces a plot, but I am not sure that this is correct.

I have searched for examples, not found anything.  Any help much appreciated.

R. Males
Cincinnati, Ohio, USA



Re: [R] generating correlated random variables from different distributions

2010-05-02 Thread Richard and Barbara Males
Thank you for your reply.  The application is a Monte Carlo simulation
in environmental planning.   Different possible remediation measures
have different costs, and produce different results.  For example, a
$20,000 plan may add 10 acres of wetlands and 12 acres of bird
habitat.  The desire is to describe the uncertainty in the cost and
the outputs (acres of wetlands, acres of bird habitat) by
distributions.  The cost may be described by a normal distribution,
mean $20k,  $5k SD, and the 12 acres of birds may be described by a
uniform distribution (10 to 14).  [These are just examples, not
representative of a real problem].  We may know (or think) that
wetlands and bird habitat are positively correlated (0.6), and that
there is a stronger correlation of both with cost (0.85).  So the
effort is to generate, through MCS, values at each iteration of cost,
acres of wetland, and acres of bird habitat, such that the resultant
values give the same correlation, and the values of cost, bird habitat
and wetland habitat return the input distributions.  The overall
desire is to compare different remediation measures, taking into account
uncertainty in costs and results.

One possible approach (although I have not tried it yet, but will do
so in the near future) is to generate, for each iteration, three
independent (0,1) random variables, correlate them via the Cholesky
approach, and use them as input to the inverse normal, inverse
uniform, etc. to get the three variables for each iteration.  The
primary distributions of interest are normal, uniform, triangular,
gamma, and arbitrary cdf, so this approach seems plausible in that
inverse distributions are readily available.
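
One common way to realize this idea is a Gaussian copula: correlate standard normals with the Cholesky factor, map them to (0,1) with pnorm(), then apply each inverse distribution. The sketch below is one possible reading of the approach, not necessarily what was eventually implemented. The uniform range for wetlands (8 to 12 acres) and the sample size are assumptions made for illustration; the other parameters come from the example in this message.

```r
set.seed(1)
n <- 10000  # assumed number of Monte Carlo iterations

# Target correlations from the example: order is cost, wetlands, birds.
corMat <- matrix(c(1.00, 0.85, 0.85,
                   0.85, 1.00, 0.60,
                   0.85, 0.60, 1.00), 3, 3)

# Step 1: independent standard normals, correlated via the Cholesky factor.
z <- matrix(rnorm(n * 3), n, 3) %*% chol(corMat)

# Step 2: map to correlated uniforms on (0,1).
u <- pnorm(z)

# Step 3: apply the inverse distribution of each variable.
cost     <- qnorm(u[, 1], mean = 20000, sd = 5000)
wetlands <- qunif(u[, 2], min = 8,  max = 12)  # assumed range, not from the post
birds    <- qunif(u[, 3], min = 10, max = 14)

print(round(cor(cbind(cost, wetlands, birds)), 2))
```

Each variable keeps its intended marginal distribution. The resulting Pearson correlations are close to, but in general not exactly equal to, the values in corMat, because the nonlinear transforms shift them slightly.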

Thanks in advance.

Dick Males
Cincinnati, OH, USA

On Thu, Apr 29, 2010 at 12:31 PM, Greg Snow greg.s...@imail.org wrote:
 The method you are using (multiply by Cholesky) works for normal 
 distributions, but not necessarily for others (if you want different 
 means/sd, then add/multiply after transforming).
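
 A minimal sketch of the order of operations described above, for the normal case: correlate standard normals first, then scale and shift each column. The means and standard deviations are the ones from the posted sample code; the sample size is an assumption.

```r
set.seed(1000)
n <- 10000  # assumed sample size for this sketch
corMat <- matrix(c(1, 0.6, 0.3, 0.6, 1, 0.5, 0.3, 0.5, 1), 3, 3)

# 1. Correlate standard normals with the Cholesky factor.
z <- matrix(rnorm(n * 3), n, 3) %*% chol(corMat)

# 2. Only then multiply by the sd and add the mean, column by column.
means <- c(1, 2, 6)
sds   <- c(1.5, 0.5, 1.8)
x <- sweep(sweep(z, 2, sds, "*"), 2, means, "+")

print(round(cor(x), 2))       # sample correlations, to compare with corMat
print(round(colMeans(x), 2))  # sample means, to compare with 'means'
```

 Scaling by a positive constant and shifting does not change a correlation, so the structure set up in step 1 carries over to x.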

 For other distributions this process can sometimes give the correlation you 
 want, but may change the variable(s) to no longer have the desired 
 distribution.

 The short answer to your question is "It Depends"; the full long answer could 
 fill a full semester course.  If you tell us more of your goal we may be able 
 to give a more useful answer.  The copula package is one possibility.  If you 
 know the conditional distribution of each variable given the others then you 
 can use Gibbs sampling.

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111



[R] generating correlated random variables from different distributions

2010-04-29 Thread Richard and Barbara Males
I need to generate a set of correlated random variables for a Monte
Carlo simulation.   The solutions I have found
(http://www.stat.uiuc.edu/stat428/cndata.html,
http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers), using
Cholesky Decomposition, seem to work only if the variables come from
the same distribution with the same parameters.  My situation is that
each variable may be described by a different distribution (or
different parameters of the same distribution).  This approach does
not seem to work, see code and results below.  Am I missing something
here?  My math/statistics is not very good, will I need to generate
correlated uniform random variables on (0,1) and then use the inverse
distributions to get the desired results I am looking for?  That is
acceptable, but I would prefer to just generate the individual
distributions and then correlate them.  Any advice much appreciated.
Thanks in advance

R. Males
Cincinnati, Ohio, USA

Sample Code:
# Testing Correlated Random Variables

# reference http://www.sitmo.com/doc/Generating_Correlated_Random_Numbers
# reference http://www.stat.uiuc.edu/stat428/cndata.html
# create the correlation matrix
corMat=matrix(c(1,0.6,0.3,0.6,1,0.5,0.3,0.5,1),3,3)
cholMat=chol(corMat)
# create the matrix of random variables
set.seed(1000)
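# sample size: the archived value is garbled (it reads "1"); 10000 is an
# assumed value, chosen because the correlations reported below need a large sample
nValues=10000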

# generate some random values

matNormalAllSame=cbind(rnorm(nValues),rnorm(nValues),rnorm(nValues))
matNormalDifferent=cbind(rnorm(nValues,1,1.5),rnorm(nValues,2,0.5),rnorm(nValues,6,1.8))
matUniformAllSame=cbind(runif(nValues),runif(nValues),runif(nValues))
matUniformDifferent=cbind(runif(nValues,1,1.5),runif(nValues,2,3.5),runif(nValues,6,10.8))

# print the target correlation matrix and its Cholesky factor
print("correlation Matrix")
print(corMat)
print("Cholesky Decomposition")
print (cholMat)

# test same normal

resultMatNormalAllSame=matNormalAllSame%*%cholMat
print("correlation matNormalAllSame")
print(cor(resultMatNormalAllSame))

# test different normal

resultMatNormalDifferent=matNormalDifferent%*%cholMat
print("correlation matNormalDifferent")
print(cor(resultMatNormalDifferent))

# test same uniform
resultMatUniformAllSame=matUniformAllSame%*%cholMat
print("correlation matUniformAllSame")
print(cor(resultMatUniformAllSame))

# test different uniform
resultMatUniformDifferent=matUniformDifferent%*%cholMat
print("correlation matUniformDifferent")
print(cor(resultMatUniformDifferent))

and results

[1] "correlation Matrix"
     [,1] [,2] [,3]
[1,]  1.0  0.6  0.3
[2,]  0.6  1.0  0.5
[3,]  0.3  0.5  1.0
[1] "Cholesky Decomposition"
     [,1] [,2]      [,3]
[1,]    1  0.6 0.3000000
[2,]    0  0.8 0.4000000
[3,]    0  0.0 0.8660254
[1] "correlation matNormalAllSame"   == ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.6036468 0.3013823
[2,] 0.6036468 1.0000000 0.5005440
[3,] 0.3013823 0.5005440 1.0000000
[1] "correlation matNormalDifferent"   == no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.9141472 0.2676162
[2,] 0.9141472 1.0000000 0.2959178
[3,] 0.2676162 0.2959178 1.0000000
[1] "correlation matUniformAllSame"   == ok
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.5971519 0.2959195
[2,] 0.5971519 1.0000000 0.5011267
[3,] 0.2959195 0.5011267 1.0000000
[1] "correlation matUniformDifferent"   == no good
          [,1]      [,2]      [,3]
[1,] 1.0000000 0.2312000 0.0351460
[2,] 0.2312000 1.0000000 0.1526293
[3,] 0.0351460 0.1526293 1.0000000
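
A possible explanation of the "no good" normal case, offered as a sketch rather than a definitive diagnosis: multiplying by the Cholesky factor produces the target correlations only when the input columns are independent with equal variances. With unequal standard deviations, the correlations that should result can be worked out directly. The standard deviations below are the ones passed to rnorm() for matNormalDifferent in the sample code.

```r
corMat <- matrix(c(1, 0.6, 0.3, 0.6, 1, 0.5, 0.3, 0.5, 1), 3, 3)
R   <- chol(corMat)       # upper-triangular factor, t(R) %*% R equals corMat
sds <- c(1.5, 0.5, 1.8)   # sds used for matNormalDifferent in the post

# Covariance of X %*% R when the columns of X are independent with these sds
covOut <- t(R) %*% diag(sds^2) %*% R

# Convert the covariance matrix to the implied correlation matrix
theory <- cov2cor(covOut)
print(round(theory, 4))
```

The entry in row 1, column 2 works out to 0.6 * 1.5 / sqrt(0.6^2 * 1.5^2 + 0.8^2 * 0.5^2), roughly 0.91, which is in line with the 0.914 reported above rather than the intended 0.6.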




[R] determine upper convex hull, 2-dimensional case

2010-03-22 Thread Richard and Barbara Males
For an environmental planning example that involves looking at the
relative efficiencies of one plan over another, I need to determine
the pareto-efficient plans (which I have done), and then, within that
set of plans, determine the convex hull representing the outer upper
boundary of those points.  I have a dataframe, dfPlans,  as follows,
representing the pareto-efficient  cost and benefit of different
environmental restoration plans out of a larger set of plans, where
A1, EC, A6, and A4 are identifiers for the plans.

        Cost   Benefit
A1      0.00     0.000
EC      0.00  7821.689
A6  76783.19 16094.142
A4  78703.73 22245.760

I am interested in determining what I believe is called the upper
convex hull, i.e. the upper outer boundary if I plot benefit on the y
axis, cost on the x axis.  This should be plans A1, EC, and A4, and
not point A6.   I have used chull, which returns all of the points,
including A6, and have tried to use convhulln with the QU option, but
I am unclear as to how to interpret the results, which are returned as
follows:

> chull(dfPlans)
[1] 1 2 3 4
> convhulln(dfPlans,option="QU")
     [,1] [,2]
[1,]    3    2
[2,]    3    4
[3,]    1    2
[4,]    1    4

Any assistance greatly appreciated, any way to accomplish my goal
(need not use convhulln or chull).

Thanks in advance.

--
Richard M. Males
Cincinnati, OH USA
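
One way to get only the upper boundary, sketched in base R without chull or convhulln, is the upper half of Andrew's monotone chain algorithm: sort the points by cost, then keep a point only while the path keeps turning clockwise. The helper name upper_hull() is hypothetical, and the data frame assumes the table above reads as four (Cost, Benefit) pairs.

```r
# Hypothetical helper: indices of the upper convex hull of 2-D points,
# ordered from left to right (upper half of Andrew's monotone chain).
upper_hull <- function(x, y) {
  ord  <- order(x, y)   # sort by x, ties broken by y
  hull <- integer(0)
  for (i in ord) {
    # Drop the last kept point while it would make a left turn
    # (or lie on a straight line) on the way to point i.
    while (length(hull) >= 2) {
      a <- hull[length(hull) - 1]
      b <- hull[length(hull)]
      cross <- (x[b] - x[a]) * (y[i] - y[b]) - (y[b] - y[a]) * (x[i] - x[b])
      if (cross >= 0) hull <- hull[-length(hull)] else break
    }
    hull <- c(hull, i)
  }
  hull
}

# Data as read from the table in the post (assumed reconstruction).
dfPlans <- data.frame(
  Cost    = c(0.00, 0.00, 76783.19, 78703.73),
  Benefit = c(0.000, 7821.689, 16094.142, 22245.760),
  row.names = c("A1", "EC", "A6", "A4")
)

keep <- upper_hull(dfPlans$Cost, dfPlans$Benefit)
print(rownames(dfPlans)[keep])
```

Because A1 and EC share the same cost, both are kept as the vertical left edge of the boundary; a variant that wants only the topmost point at each cost would drop A1.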



[R] fit.dist gnlm question, NaN and Inf results

2008-07-25 Thread Richard and Barbara Males
I am attempting to fit discrete data (daily counts of arrivals of
recreational vessels at locks on a river) using fit.dist from the gnlm
package.  Some distributions return NaN or Inf values in certain
situations; an example with Inf values is shown below.

# of vessels:                  1  2  3  4  5  6  7  8  9 10 11
# of days with # of vessels:  35 20 10  5  6  3  1  3  1  0  1
(the second row is stored in rTemp$counts)

Can anyone tell me under what conditions I will get these Inf/NaN?

Thanks in advance

Richard Males
Cincinnati, Ohio, USA

my function call is as follows, for each of the distributions.

fit.dist(c(1:length(rTemp$counts)),rTemp$counts,"binomial")

output from fit.dist for above input situation

binomial distribution,  n = 85

     mean  variance    nu.hat
2.6352941 4.6552249 0.2635294

-log likelihood             AIC
            Inf             Inf


beta binomial distribution,  n = 85

     mean  variance    nu.hat   rho.hat
2.6352941 4.6552249 0.2947923 0.1403921

-log likelihood             AIC
            Inf             Inf


Poisson distribution,  n = 85

    mean variance   mu.hat
2.635294 4.655225 2.635294

-log likelihood             AIC
       28.86352        29.86352
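
The n, mean and variance lines in this output can be checked by hand from the frequency table. The sketch below uses only base R and does not call fit.dist; it assumes the table reads as vessel counts 1 to 11 with the day counts listed above, and that the variance uses divisor n rather than n - 1.

```r
vessels <- 1:11
days    <- c(35, 20, 10, 5, 6, 3, 1, 3, 1, 0, 1)  # days observed with each count

n <- sum(days)                          # total number of days
m <- sum(vessels * days) / n            # frequency-weighted mean
v <- sum(days * (vessels - m)^2) / n    # variance with divisor n, not n - 1

print(c(n = n, mean = m, variance = v))
```

These agree with the n = 85, mean and variance values printed for all three distributions, which suggests the frequency table was read as intended.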



Re: [R] flow map lines between point pairs (latitude/longitude)

2008-07-02 Thread Richard and Barbara Males
Made my day, thanks. It worked just fine with a little bit of tweaking
(flipping x and y around, basically).

rapid response much appreciated

Dick

On Wed, Jul 2, 2008 at 5:25 PM, Ray Brownrigg
[EMAIL PROTECTED] wrote:
 Here's one solution, YMMV:
 library(maps)
 # Note, plot the map first to get the aspect ratio (or projection) right
 map("county", xlim=range(df2VisitTrips[, c(5, 7)]),
     ylim=range(df2VisitTrips[, c(4, 6)]), col=8)
 map("state", add=TRUE)
 map.axes()
 for (i in 1:length(df2VisitTrips[, 1])) {
   lines(df2VisitTrips[i, c(5, 7)], df2VisitTrips[i, c(4, 6)],
         lwd=0.2 + df2VisitTrips[i, 3]/10, col=i+1)
 }

 Ray Brownrigg

 On Thu, 03 Jul 2008, Richard and Barbara Males wrote:
 I have a dataset giving traffic between pairs of ports, and the
 lat/lon of each port, roughly as follows:

 sample data as follows

  df2VisitTrips[1:4,]

   Origin Destination NumberOfTrips OriginLatitude OriginLongitude DestinationLatitude DestinationLongitude
 1     P1         P16             1       39.45965       -80.15633            40.76111            -79.54583
 2     P1          P3             1       39.45965       -80.15633            39.58861            -79.98222
 3   P102        P108            19       36.98210       -88.21597            37.19667            -88.84389
 4   P102        P109            71       36.98210       -88.21597            37.09472            -89.13694


 I am interested in plotting variable-width lines, based on
 NumberOfTrips, between point pairs (OriginLongitude,OriginLatitude)
 and (DestinationLongitude,DestinationLatitude), e.g. a flow map
 between the ports.  At some point, I may wish to add an underlay base
 map (say counties in the US), but that is not critical at this time.

 There seem to be many packages in R that deal with spatial data (sp,
 maps, PBSMapping, etc.), but it is unclear to me which one would work
 best for this application.

 Any advice gratefully appreciated.

 R. Males
 Cincinnati, Ohio, USA
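
 For the variable-width lines alone, without a base map, a minimal sketch in base graphics could look like the following. It assumes the four sample rows read as in the table above; the shortened column names are introduced here for brevity and are not the original ones. The width rule 0.2 + trips/10 is borrowed from the reply above.

```r
# Sample rows from the post, with shortened (hypothetical) column names.
trips <- data.frame(
  origin = c("P1", "P1", "P102", "P102"),
  dest   = c("P16", "P3", "P108", "P109"),
  n      = c(1, 1, 19, 71),
  olat   = c(39.45965, 39.45965, 36.98210, 36.98210),
  olon   = c(-80.15633, -80.15633, -88.21597, -88.21597),
  dlat   = c(40.76111, 39.58861, 37.19667, 37.09472),
  dlon   = c(-79.54583, -79.98222, -88.84389, -89.13694)
)

# Line width grows with the number of trips.
widths <- 0.2 + trips$n / 10

# Empty plot spanning all longitudes (x) and latitudes (y),
# then one segment per origin/destination pair.
plot(range(trips$olon, trips$dlon), range(trips$olat, trips$dlat),
     type = "n", xlab = "Longitude", ylab = "Latitude")
segments(trips$olon, trips$olat, trips$dlon, trips$dlat, lwd = widths)
```

 segments() is vectorized over its coordinates and over lwd, so no loop is needed; longitude goes on the x axis and latitude on the y axis.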
