[R] mcmc simulation

2011-04-18 Thread Peter Francis
Dear List,

I am reading a method and am unsure how to do something similar, so have come 
here for advice!

I have an observed value and an expected value generated from a null model.  I 
am looking to see whether observed - expected is significantly different from 
zero.

The method in the paper states: "... significance of difference from zero was 
tested with Markov chain Monte Carlo simulation".

I have installed the package mcmc, as this seemed a good place to start, but I 
am unsure which function to call.

I have 600 data points in a data frame called diversity, where the observed 
values are in a column called observed and the expected values are in a column 
called expected.

Thanks for any advice.

Peter
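
A minimal sketch of one way to approach the question above, using a hand-written random-walk Metropolis sampler in base R rather than the mcmc package (whose sampler function is metrop()). The paper's actual procedure is not described in the post, so everything here is an assumption: the data are simulated stand-ins for the diversity data frame, the likelihood is normal with the sample standard deviation plugged in, and the prior on the mean difference is flat.

```r
# Simulated stand-in for the poster's data frame `diversity` (600 rows)
set.seed(1)
diversity <- data.frame(expected = rnorm(600, mean = 10, sd = 2))
diversity$observed <- diversity$expected + rnorm(600, mean = 0.3, sd = 1)

d <- diversity$observed - diversity$expected   # differences to test against zero
s <- sd(d)

# Log-posterior of the mean difference mu: normal likelihood with the
# sample sd plugged in and a flat prior (simplifying assumptions)
log_post <- function(mu) sum(dnorm(d, mean = mu, sd = s, log = TRUE))

# Random-walk Metropolis sampler
n_iter <- 5000
chain <- numeric(n_iter)                        # chain starts at mu = 0
step <- 2 * s / sqrt(length(d))                 # proposal standard deviation
for (i in 2:n_iter) {
  proposal <- chain[i - 1] + rnorm(1, sd = step)
  log_ratio <- log_post(proposal) - log_post(chain[i - 1])
  chain[i] <- if (log(runif(1)) < log_ratio) proposal else chain[i - 1]
}

draws <- chain[-(1:1000)]                       # discard burn-in
ci <- quantile(draws, c(0.025, 0.975))          # 95% credible interval
ci
excludes_zero <- ci[1] > 0 || ci[2] < 0         # TRUE if zero lies outside the interval
excludes_zero
```

If the interval excludes zero, the mean difference is credibly different from zero under these assumptions. With real data it would be worth checking the chain for convergence (for example with a trace plot) before trusting the interval.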

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mean from confidence intervals

2011-01-29 Thread Peter Francis
Dear all,

The data is generated from 1000 random samples of a phylogenetic tree to 
calculate phylogenetic diversity (PD). I sampled the tree 1000 times for each 
of the various species communities (600) to get a random PD per community. I 
then want to compare my observed PD with that of the random samples to test 
for significance.

However, the script I used outputs quantiles:

q0.005, q0.01, etc., up to q0.995

But I wanted to know the mean PD value per community based on that output, and 
that is where I am struggling.
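
A rough sketch of how a mean could be approximated from quantile output alone. The mean of a distribution equals the integral of its quantile function over probabilities 0 to 1, so a trapezoid rule over the available quantiles gives an approximation. The probability grid and the random PD values below are hypothetical; the script's real grid is not shown in the post. If the 1000 raw random PD values can still be obtained, taking mean() of those directly is simpler and exact.

```r
set.seed(42)
# Stand-in for the 1000 random PD values of one community
pd_random <- rnorm(1000, mean = 50, sd = 5)

# Assumed probability grid; replace with the grid the script really used
probs <- c(0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5,
           0.75, 0.9, 0.95, 0.975, 0.99, 0.995)
q <- unname(quantile(pd_random, probs))

# Trapezoid rule on the quantile function; the tails beyond the outermost
# quantiles are unknown, so the result is rescaled to the covered range
approx_mean <- sum(diff(probs) * (head(q, -1) + tail(q, -1)) / 2) /
  (max(probs) - min(probs))

approx_mean          # approximation from the quantiles only
mean(pd_random)      # exact mean, available only with the raw samples
```

The approximation degrades for strongly skewed distributions or coarse grids, because the ignored tails then carry more of the mean.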

Thanks,

Peter
On 29 Jan 2011, at 02:16, Joshua Wiley wrote:

Hi Peter,

Do you know the formula used to calculate the confidence interval?  I
suspect it is possible with minimal algebraic manipulation of the CI
formula to find what the mean is.  Assuming a normal distribution (as
David did), then it is certainly possible to find it.  This Wikipedia page
might help:

http://en.wikipedia.org/wiki/Confidence_interval

And no, this is not really the correct place to ask.  My basic rule of
thumb is: "Does my question have anything to do with R?"  If my answer
is "No", then I usually look for somewhere else to post.  Of course,
for a comprehensive list, see the posting guide.

If you are wondering if there is a function to do it for you, I am not
sure, but it would be trivial to programme and if you show us the
formula for it (the mean from the CI), we can certainly give you
pointers for how to write your own :)
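
As a sketch of that algebra, assuming the interval is symmetric about the mean (a normal or t based interval of the form mean +/- t * SE): the mean is the midpoint of the two limits, and the standard error follows from the half-width. The limits 0.25 and 0.95 are read from the original question; n = 30 is purely hypothetical.

```r
lower <- 0.25        # lower confidence limit (from the question)
upper <- 0.95        # upper confidence limit (from the question)
n <- 30              # hypothetical sample size
conf_level <- 0.95

mean_est <- (lower + upper) / 2                       # midpoint of a symmetric CI
half_width <- (upper - lower) / 2
t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)    # two-tailed critical value
se_est <- half_width / t_crit                         # implied standard error
sd_est <- se_est * sqrt(n)                            # implied standard deviation

c(mean = mean_est, se = se_est, sd = sd_est)
```

This does not work for asymmetric intervals (for example bootstrap percentile intervals or intervals computed on a log scale), where the midpoint is generally not the mean.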

Cheers,

Josh

On Fri, Jan 28, 2011 at 2:15 PM, Peter Francis peterfran...@me.com wrote:
 Dear List,
 
 I am not sure if A) this is possible or B) the correct place to ask.
 
 I am looking to find the mean - i have n, and the two-tailed confidence 
 intervals 0.95  0.25 with a  p-value of 0.05.
 
 Can i find the mean from this data ?
 
 Thanks
 
 Peter
 



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] Plot 1:1 relationship

2011-01-28 Thread Peter Francis
Dear list,

I am looking to plot a line on a graph to show the 1:1 relationship, in order 
to demonstrate that the pattern I have observed is off the line; however, I am 
unsure how. I have tried abline, but I cannot see how to use it to plot the 
1:1 line.


Any help would be greatly appreciated

# generate data
y <- rnorm(30, mean = 1.5, sd = 0.005)
x <- rnorm(30, mean = 1.5, sd = 0.005)
plot(x, y)

Thanks,

Peter
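
A short sketch of one way to draw the 1:1 line with abline(), which accepts an intercept a and a slope b; intercept 0 and slope 1 give the line y = x. Using the same limits on both axes is an optional choice here so that the line runs corner to corner.

```r
set.seed(1)
# generate data as in the post
y <- rnorm(30, mean = 1.5, sd = 0.005)
x <- rnorm(30, mean = 1.5, sd = 0.005)

lims <- range(c(x, y))                     # common limits for both axes
plot(x, y, xlim = lims, ylim = lims, asp = 1)
abline(a = 0, b = 1, lty = 2)              # intercept 0, slope 1: the 1:1 line
```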



[R] Mean from confidence intervals

2011-01-28 Thread Peter Francis
Dear List,

I am not sure whether A) this is possible or B) this is the correct place to 
ask.

I am looking to find the mean. I have n and the two-tailed confidence limits 
(0.95 and 0.25), with a p-value of 0.05.

Can I find the mean from this data?

Thanks

Peter



[R] Sum by column

2011-01-12 Thread Peter Francis
Dear List,

I have a question of convenience.

I am looking to sum the values of one column based on another column. An 
example may help explain better!

ED          ECOCODE
21.809467   AA0101
36.229566   PA1201
51.861284   PA1201
11.36232    PA1201
27.264634   PA1201
12.261986   PA1201
46.519313   PA1201
7.815376    PA1201
2.810428    PA1201
13.478372   PA1201
35.670182   PA1301
27.128715   AT0801
19.010294   AT1201
15.475368   AT1201
18.597983   AT0101
29.292615   AT0101
6.749846    AT0101
14.981488   AT0101
14.93511    AT0101
14.93511    AT0101
21.040785   AT0101
8.271615    AT0101
12.94232    AT0101
6.749846    AT0101
15.484412   AT0101
29.644494   AT0101
43.211212   AT0101

So for AA0101 it would be = 21.809467
AT1201 it would be = 19.010294+15.475368

etc

I would then like to be able to output a table with ECOCODE in one column and 
the sum of ED in the other.

This is stored in a data frame called ecoregion. I understand people like 
having code to change, but I have none as I am a relative beginner! Sorry in 
advance!

Thanks 

Peter



Re: [R] Sum by column

2011-01-12 Thread Peter Francis
David and Josh,

Thanks very much for your help, it is much appreciated.

Peter


On 12 Jan 2011, at 14:28, David Winsemius wrote:

There are two functions you need to become familiar with:

?tapply
?ave

If you wanted these summed values to be placed in another column of the same 
dataframe, you would use ave. If you wanted a new structure (somewhat shorter) 
you would use tapply with sum as the function. E. g:

tapply(ecoregion$ED, ecoregion$ECOCODE, sum)
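
To make the difference concrete, here is a small sketch using a few rows from the example data. It also shows aggregate(), which is not mentioned above but returns the two-column table (ECOCODE and summed ED) asked for in the question.

```r
# A few rows of the example data
ecoregion <- data.frame(
  ED      = c(21.809467, 35.670182, 19.010294, 15.475368),
  ECOCODE = c("AA0101", "PA1301", "AT1201", "AT1201")
)

# Named vector of sums, one element per ECOCODE
by_code <- tapply(ecoregion$ED, ecoregion$ECOCODE, sum)
by_code

# Data frame with ECOCODE in one column and the summed ED in the other
sums <- aggregate(ED ~ ECOCODE, data = ecoregion, FUN = sum)
sums

# Group sum repeated on every row of the original data frame
ecoregion$ED_total <- ave(ecoregion$ED, ecoregion$ECOCODE, FUN = sum)
ecoregion
```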

-- 
David.

David Winsemius, MD
West Hartford, CT



[R] Dispersion ( Plot error bars ) help

2010-10-26 Thread Peter Francis
Dear List,

I am looking to plot error bars on a line using dispersion.

I have values for both the upper and the lower limits; however, I am unsure 
how to plot different values for the upper CI and the lower CI.

I have been using 

dispersion(1:35, sim, simCI, col = "red")

where there are 35 points and simCI is a vector containing the lower 
confidence intervals. I want different upper and lower bands and am unsure how 
to specify this. My upper CI values are in a vector called simCIUPPER.

Any ideas?

Peter
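
A sketch of asymmetric error bars using only base graphics, with arrows() drawing a flat-capped bar from each lower limit to each upper limit. The data are simulated stand-ins, and it is an assumption that simCI and simCIUPPER hold distances below and above each point rather than absolute limits. As far as I recall, dispersion() in the plotrix package also takes separate ulim and llim arguments, so dispersion(1:35, sim, ulim = simCIUPPER, llim = simCI, col = "red") may do the same job; that is worth checking against ?dispersion.

```r
set.seed(3)
# Stand-ins for the poster's vectors (35 points)
sim        <- runif(35, min = 1, max = 10)
simCI      <- runif(35, min = 0.2, max = 1)    # distance below each point (assumed)
simCIUPPER <- runif(35, min = 0.5, max = 2)    # distance above each point (assumed)

lower <- sim - simCI
upper <- sim + simCIUPPER

plot(1:35, sim, type = "b", ylim = range(c(lower, upper)),
     xlab = "Index", ylab = "sim")
# angle = 90 with code = 3 draws flat caps at both ends of each bar
arrows(1:35, lower, 1:35, upper, angle = 90, code = 3,
       length = 0.03, col = "red")
```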



[R] Plot help

2010-10-20 Thread Peter Francis
Dear List,

I am relatively new to R and am trying to create more attractive plots than 
Excel can manage!

I have looked through the various packages (ggplot, lattice, Hmisc, etc.), but 
my case does not seem to be mentioned. Maybe it is and I have not noticed; if 
that is the case, I apologise.

*
# I have a series of simulated values, which are means

sim <- c(0.0012,0.0009,2,2,9,12,0.0009,2,19,1,1,0.0013,1,0.0009,0.0009,1,26,3,1,2,1,0.0009,1,0.2323,4,2,0.0009,0.0009,0.0009,52,49,1,3,7)

# and actual values

actual <- c(0,0,2,0,13,20,0,3,38,0,0,0,1,0,0,0,27,2,0,0,1,0,1,0,4,2,0,0,0,54,21,0,4,11)

# The X axis is family, ranging from 1-35, and the Y axis shows the sim and 
# actual values.

# What I want to do is plot the simulated values with the 95% CI values, and 
# then plot the actual values and see if they fall within the CIs, which they 
# do. The idea is that there is no significant difference between the actual 
# values and the simulated values.

# I have CIs for sim, and this is where the trouble begins!

simCI <- c(0.000908781,0.001248025,0.000928731,0.000885441,0.002384808,0.002700088,0.005377963,0.006202863,0.000918969,0.002566072,0.007687229,0.001593536,0.001578519,0.001299327,0.00217493,0.000908781,0.00090428,0.001550469,0.008840134,0.003300862,0.001546501,0.002775418,0.0014778,0.00090428,0.001546201,0.000898151,0.003446757,0.002854941,0.000863444,0.000918969,0.000924599,0.011732253,0.011488353,0.001788464)

# I then put this in a data frame
simvsact <- data.frame(sim = sim, actual = actual, simCI.lower = sim - simCI, 
                       simCI.upper = sim + simCI, 
                       fam = factor(paste('Family', 1:34, sep = '')))

*

As mentioned above, I was looking at getting an x/y scatter plot (I think this 
would be best; if not, other suggestions would be greatly appreciated) with the 
CI range highlighted as a block and the actual values as a line in a different 
colour running through the CI range.

I hope this makes sense.

Peter
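
One possible sketch of the plot described above, in base graphics: polygon() shades the CI range as a block, and lines() draws the actual values in a different colour over it. The data here are simulated stand-ins of the same length (34) as the vectors in the post; the variable names follow the post, while fam and inside are introduced only for illustration.

```r
set.seed(7)
n <- 34
# Simulated stand-ins for the poster's vectors
sim    <- runif(n, min = 0, max = 50)
simCI  <- runif(n, min = 1, max = 5)       # half-width of the CI around sim
actual <- sim + runif(n, min = -1, max = 1)
fam    <- seq_len(n)                       # family index for the x axis

lower <- sim - simCI
upper <- sim + simCI

# Empty plot with room for everything, then band, then lines on top
plot(fam, sim, type = "n", ylim = range(c(lower, upper, actual)),
     xlab = "Family", ylab = "Value")
polygon(c(fam, rev(fam)), c(lower, rev(upper)), col = "grey85", border = NA)
lines(fam, sim, lty = 2)                   # simulated means
lines(fam, actual, col = "red", lwd = 2)   # actual values

# Which actual values fall inside the CI?
inside <- actual >= lower & actual <= upper
table(inside)
```

Because families are categories rather than a continuous variable, per-family error bars may be an equally reasonable choice; the shaded band mainly helps when there are many families to scan.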



[R] Gini Coefficient

2010-10-19 Thread Peter Francis
Dear List,

I am unsure whether this is specifically an R question or a stats question. I 
thought I would ask here, and if I get no replies, that will answer it!

I am trying to calculate Gini coefficients in R, based on a slight modification 
of the typical equation that I have seen in a paper.



[Attachment: PastedGraphic-2.pdf (Adobe PDF document containing the equation)]




where X is the cumulated proportion of Cars and Y is the cumulated proportion 
of People. The value k indexes from the first to the next to last (n-1).

I have a rough idea of how to implement this in R; however, I am unsure how 
the data should be sorted. Typically, when I have calculated Gini coefficients 
in the past, I have sorted X into ascending order and then calculated the 
cumulative proportion from this. However, with two variables, X and Y, I am 
unsure how to sort the data. That is, do I sort X and also reorder Y based on 
the sorting of X, or do I sort X and calculate the coefficient, then sort Y 
and calculate the coefficient, and add them together?

Once again, I am sorry if this is completely the wrong place to ask such a 
question.

Peter
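
A hedged sketch only, since the attached equation did not survive. It assumes the formula is of the common trapezoid form G = |1 - sum over k of (X[k+1] - X[k]) * (Y[k+1] + Y[k])|, with both cumulated proportions starting at zero; the paper's version may differ. Under that assumption the pairs are kept together and sorted jointly by the ratio of the two variables, rather than sorting each column separately. The function name gini_xy and the example numbers are hypothetical.

```r
# Two-variable Gini coefficient under the assumed trapezoid formula.
# x and y must be paired, with all y > 0.
gini_xy <- function(x, y) {
  ord <- order(x / y)                   # sort the pairs jointly by x per unit of y
  X <- c(0, cumsum(x[ord]) / sum(x))    # cumulated proportion of x, starting at 0
  Y <- c(0, cumsum(y[ord]) / sum(y))    # cumulated proportion of y, starting at 0
  abs(1 - sum(diff(X) * (head(Y, -1) + tail(Y, -1))))
}

# Hypothetical counts for five areas
cars   <- c(5, 20, 10, 40, 25)
people <- c(30, 25, 20, 15, 10)

gini_xy(cars, people)
gini_xy(people, people)   # identical distributions give 0
```

Taking the absolute value means the result is the same whether the ratio is sorted ascending or descending, because the two orderings trace mirror-image curves on either side of the line of equality.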








[R] Randomly shuffle an array 1000 times

2010-10-18 Thread Peter Francis
Dear List,

I have a table I have read into R:

Name    Yes/No

John    0
Frank   1
Ann     0
James   1
Alex    1

etc. (800 rows in total).

What I want to do is shuffle the yes/no values and randomly re-assign them to 
the names.

I have used sample() and permute(); however, I cannot see a way to do this 1000 
times. Furthermore, I want to copy the data into an Excel spreadsheet in the 
same order as the data was input, so I can build up a distribution of the 
statistic for each name. When I shuffle, the data gets returned like this:

[1] 1 0 0 1 0 1 0 0 0 1 1 1 1 1 1 1 1 1 0 1 0 0 0 0 0 0 1 0 0 0 1 0 1
  [34] 0 1 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 1 0 0 0
  [67] 0 0 0 0 0 0 0 1 1 1 0 1 0 0 1 1 0 1 0 1 0 1 1 1 0 0 1 0 0 1 1 1 1
 [100] 1 1 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0
 [133] 0 0 0 0 0 0 1 0 1 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0
 [166] 0 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 1 1 0 0 1 1 0 1
 [199] 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1
 [232] 0 0 0 1 1 0 1 0 0 1 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 1 0 0 0 0 0 0 1
 [265] 0 1 0 0 0 1 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 1
 [298] 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 0 0 1 0
 [331] 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 1 1
  
etc

Rather than like this

John    0
Frank   1
Ann     0
James   1
Alex    1

Can anyone suggest a script that would achieve this?

Thanks

Peter
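
A minimal sketch of one way to do this with replicate() and sample(): each call to sample() permutes the yes/no column, and replicate() collects 1000 permutations as the columns of a matrix whose rows stay in the original name order. The five-row table, the column name YesNo and the output file name are stand-ins.

```r
set.seed(123)
# Stand-in for the table read into R (800 rows in the real data)
tab <- data.frame(Name  = c("John", "Frank", "Ann", "James", "Alex"),
                  YesNo = c(0, 1, 0, 1, 1))

n_perm <- 1000
# One column per shuffle; rows remain in the original name order
shuffled <- replicate(n_perm, sample(tab$YesNo))
rownames(shuffled) <- as.character(tab$Name)
colnames(shuffled) <- paste0("perm", seq_len(n_perm))

shuffled[, 1:3]       # first three shuffles, one row per name

# Proportion of shuffles in which each name received a 1
rowMeans(shuffled)

# Write to a CSV file that Excel can open (temporary location used here)
out_file <- file.path(tempdir(), "shuffled.csv")
write.csv(shuffled, out_file)
```

Every column contains the same number of ones as the original data, because sample() without further arguments only reorders the values.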
