[R] mcmc simulation
Dear List,

I am reading about a method and am unsure how to do something similar, so I have come here for advice. I have an observed value and an expected value generated from a null model, and I want to test whether observed - expected is significantly different from zero. The method in the paper states:

"... significance of difference from zero was tested with Markov chain Monte Carlo simulation"

I have installed the package mcmc, as this seemed a good place to start, but I am unsure which function to call. I have 600 data points in a data frame called diversity, with the observed values in a column called observed and the expected values in a column called expected.

Thanks for any advice.

Peter

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
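The post does not say which MCMC procedure the paper used, so the following is only a rough sketch of a related but simpler idea: a plain Monte Carlo sign-flip test on the differences (not a Markov chain method, and not the mcmc package). It assumes the differences are symmetric about zero under the null. The data frame below is a simulated stand-in for the poster's diversity data; only the column names observed and expected come from the post.

```r
set.seed(1)

# Simulated stand-in for the poster's 'diversity' data frame (600 rows).
diversity <- data.frame(expected = rnorm(600, mean = 10, sd = 2))
diversity$observed <- diversity$expected + rnorm(600, mean = 0.3, sd = 1)

# Test statistic: mean of the paired differences.
d <- diversity$observed - diversity$expected
obs_stat <- mean(d)

# Null distribution: randomly flip the sign of each difference,
# which leaves the distribution unchanged if it is symmetric about zero.
n_sim <- 9999
null_stats <- replicate(
  n_sim,
  mean(d * sample(c(-1, 1), length(d), replace = TRUE))
)

# Two-sided Monte Carlo p-value (the +1 counts the observed statistic itself).
p_value <- (sum(abs(null_stats) >= abs(obs_stat)) + 1) / (n_sim + 1)
p_value
```

If the paper's authors fitted a Bayesian model and checked whether a credible interval covered zero, this sketch would not reproduce their analysis; it only illustrates a simulation-based test of "difference from zero".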
Re: [R] Mean from confidence intervals
Dear all,

The data are generated from 1000 random samples of a phylogenetic tree, used to calculate phylogenetic diversity (PD). I sampled the tree 1000 times for various species communities (600) to get a random PD per community. I then want to compare my observed PD with that of the random samples to test for significance.

However, the script I used outputs q0.005, q0.01, etc., up to q0.995. I wanted to know the mean PD value per community based on that output, and that is where I am struggling.

Thanks,
Peter

On 29 Jan 2011, at 02:16, Joshua Wiley wrote:

Hi Peter,

Do you know the formula used to calculate the confidence interval? I suspect it is possible, with minimal algebraic manipulation of the CI formula, to find what the mean is. Assuming a normal distribution (as David did), it is certainly possible to find. This Wikipedia page might help: http://en.wikipedia.org/wiki/Confidence_interval

And no, this is not really the correct place to ask. My basic rule of thumb is, "Does my question have anything to do with R?" If my answer is "No", then I usually look for somewhere else to post. Of course, for a comprehensive list, see the posting guide.

If you are wondering whether there is a function to do it for you, I am not sure, but it would be trivial to program, and if you show us the formula for it (the mean from the CI), we can certainly give you pointers on how to write your own :)

Cheers,
Josh

On Fri, Jan 28, 2011 at 2:15 PM, Peter Francis peterfran...@me.com wrote:

Dear List,

I am not sure if A) this is possible or B) this is the correct place to ask. I am looking to find the mean. I have n and the two-tailed confidence intervals 0.95 0.25, with a p-value of 0.05. Can I find the mean from this data?

Thanks
Peter

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
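As a minimal sketch of the algebra Josh alludes to: if the interval is a symmetric, normal-theory interval of the form mean ± z * SE, the mean is simply the midpoint of the two limits. The code below assumes that 0.25 and 0.95 in the original question are the lower and upper limits of a 95% interval; the value of n is a made-up placeholder, since the post does not give it.

```r
# Assumed interpretation of the numbers in the question.
ci_lower <- 0.25
ci_upper <- 0.95
n <- 30  # hypothetical sample size, not from the post

# For a symmetric interval, the mean is the midpoint.
ci_mean <- (ci_lower + ci_upper) / 2

# The half-width gives the standard error, and with n the implied SD,
# again assuming mean +/- qnorm(0.975) * sd / sqrt(n).
half_width <- (ci_upper - ci_lower) / 2
implied_se <- half_width / qnorm(0.975)
implied_sd <- implied_se * sqrt(n)

ci_mean
```

This does not apply to percentile output such as q0.005 ... q0.995 from a skewed simulation: quantiles alone do not determine the mean, so in that case it is safer to compute mean() on the 1000 simulated values directly.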
[R] Plot 1:1 relationship
Dear list,

I am looking to plot a line on a graph to show the 1:1 relationship, in order to demonstrate that the pattern I have observed is off that line; however, I am unsure how. I have tried abline, but I cannot see how to use it to plot the 1:1 line. Any help would be greatly appreciated.

#generate data
y <- rnorm(30, mean = 1.5, sd = 0.005)
x <- rnorm(30, mean = 1.5, sd = 0.005)
plot(x, y)

Thanks,
Peter
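One way to draw the 1:1 line with base graphics is abline() with intercept 0 and slope 1. The sketch below reuses the simulated data from the post; the shared axis limits and the fixed seed are additions for illustration, so that the line runs corner to corner and the example is reproducible.

```r
set.seed(42)  # added for reproducibility, not in the original post

# Data as generated in the post.
y <- rnorm(30, mean = 1.5, sd = 0.005)
x <- rnorm(30, mean = 1.5, sd = 0.005)

# Use the same limits on both axes so the 1:1 line is the diagonal.
lims <- range(c(x, y))
plot(x, y, xlim = lims, ylim = lims, asp = 1)

# Intercept a = 0 and slope b = 1 give the line y = x.
abline(a = 0, b = 1, lty = 2)
```

Points above the dashed line have y > x and points below it have y < x, which makes any systematic departure from the 1:1 relationship easy to see.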
[R] Mean from confidence intervals
Dear List,

I am not sure if A) this is possible or B) this is the correct place to ask. I am looking to find the mean. I have n and the two-tailed confidence intervals 0.95 0.25, with a p-value of 0.05. Can I find the mean from this data?

Thanks
Peter
[R] Sum by column
Dear List,

I have a question of convenience: I am looking to sum the values of one column based on another column. An example may help explain better!

ED         ECOCODE
21.809467  AA0101
36.229566  PA1201
51.861284  PA1201
11.36232   PA1201
27.264634  PA1201
12.261986  PA1201
46.519313  PA1201
7.815376   PA1201
2.810428   PA1201
13.478372  PA1201
35.670182  PA1301
27.128715  AT0801
19.010294  AT1201
15.475368  AT1201
18.597983  AT0101
29.292615  AT0101
6.749846   AT0101
14.981488  AT0101
14.93511   AT0101
14.93511   AT0101
21.040785  AT0101
8.271615   AT0101
12.94232   AT0101
6.749846   AT0101
15.484412  AT0101
29.644494  AT0101
43.211212  AT0101

So for AA0101 it would be 21.809467, for AT1201 it would be 19.010294 + 15.475368, etc.

I would then like to be able to output a table with ECOCODE in one column and the sum of ED in the other. This is stored in a data frame called ecoregion. I understand people like having code to change, but I have none as I am a relative beginner! Sorry in advance!

Thanks
Peter
Re: [R] Sum by column
David and Josh,

Thanks very much for your help, it is much appreciated.

Peter

On 12 Jan 2011, at 14:28, David Winsemius wrote:

There are two functions you need to become familiar with:

?tapply
?ave

If you wanted these summed values to be placed in another column of the same dataframe, you would use ave. If you wanted a new structure (somewhat shorter), you would use tapply with sum as the function. E.g.:

tapply(ecoregion$ED, ecoregion$ECOCODE, sum)

--
David.

On Jan 12, 2011, at 5:38 AM, Peter Francis wrote:

Dear List,

I have a question of convenience: I am looking to sum the values of one column based on another column. An example may help explain better!

ED         ECOCODE
21.809467  AA0101
36.229566  PA1201
51.861284  PA1201
11.36232   PA1201
27.264634  PA1201
12.261986  PA1201
46.519313  PA1201
7.815376   PA1201
2.810428   PA1201
13.478372  PA1201
35.670182  PA1301
27.128715  AT0801
19.010294  AT1201
15.475368  AT1201
18.597983  AT0101
29.292615  AT0101
6.749846   AT0101
14.981488  AT0101
14.93511   AT0101
14.93511   AT0101
21.040785  AT0101
8.271615   AT0101
12.94232   AT0101
6.749846   AT0101
15.484412  AT0101
29.644494  AT0101
43.211212  AT0101

So for AA0101 it would be 21.809467, for AT1201 it would be 19.010294 + 15.475368, etc.

I would then like to be able to output a table with ECOCODE in one column and the sum of ED in the other. This is stored in a data frame called ecoregion. I understand people like having code to change, but I have none as I am a relative beginner! Sorry in advance!

Thanks
Peter

David Winsemius, MD
West Hartford, CT
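To make David's suggestion concrete, here is a small sketch using the first few rows of the posted data (a subset, chosen only to keep the example short). It shows tapply() and ave() as he describes, plus aggregate(), which is an addition not mentioned in the thread but returns the two-column table the question asks for.

```r
# Subset of the poster's 'ecoregion' data frame, for illustration only.
ecoregion <- data.frame(
  ED = c(21.809467, 36.229566, 51.861284, 35.670182, 19.010294, 15.475368),
  ECOCODE = c("AA0101", "PA1201", "PA1201", "PA1301", "AT1201", "AT1201")
)

# Named vector of sums, one element per ECOCODE (David's suggestion).
sums <- tapply(ecoregion$ED, ecoregion$ECOCODE, sum)

# Two-column data frame: ECOCODE and the summed ED.
sum_table <- aggregate(ED ~ ECOCODE, data = ecoregion, FUN = sum)

# Group total repeated on every row of the original data frame.
ecoregion$ED_total <- ave(ecoregion$ED, ecoregion$ECOCODE, FUN = sum)

sum_table
```

The result in sum_table can be written out with write.csv() if a file is needed.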
[R] Dispersion ( Plot error bars ) help
Dear List,

I am looking to plot error bars on a line using dispersion. I have values for the upper and the lower limits; however, I am unsure how to plot different values for the upper CI and the lower CI. I have been using

dispersion(1:35, sim, simCI, col = "red")

where there are 35 points and simCI is a vector containing the lower confidence limits. However, I want different upper and lower bands and am unsure how to specify this. My upper CI values are in a vector called simCIUPPER.

Any ideas?

Peter
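As far as I recall, plotrix's dispersion() accepts a separate llim argument alongside ulim for exactly this purpose; please check ?dispersion for how the limits are interpreted. Independently of that, asymmetric error bars can be drawn with base graphics using arrows(). The sketch below uses made-up values; only the names sim, simCI (lower limits) and simCIUPPER come from the post, and it assumes both vectors hold absolute limits rather than offsets from sim.

```r
set.seed(1)

# Made-up stand-ins for the poster's 35 points and their limits.
x <- 1:35
sim <- runif(35, min = 1, max = 10)
simCI <- sim - runif(35, min = 0.2, max = 1)       # lower limits (absolute)
simCIUPPER <- sim + runif(35, min = 0.5, max = 2)  # upper limits (absolute)

# Plot the line, leaving room for the full extent of the bars.
plot(x, sim, type = "b", ylim = range(simCI, simCIUPPER))

# Flat-headed arrows from the lower to the upper limit act as error bars;
# code = 3 puts a cap on both ends, angle = 90 makes the caps flat.
arrows(x, simCI, x, simCIUPPER, angle = 90, code = 3, length = 0.03, col = "red")
```

If the stored values are offsets instead of absolute limits, the bar ends would be sim - lower_offset and sim + upper_offset.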
[R] Plot help
Dear List,

I am relatively new to R and am trying to create more attractive plots than Excel can manage! I have looked through the various packages (ggplot, lattice, Hmisc, etc.), but my case does not seem to be mentioned. Maybe it is and I have not noticed; if this is the case, I apologise.

#I have a series of simulated values, which are means
sim <- c(0.0012,0.0009,2,2,9,12,0.0009,2,19,1,1,0.0013,1,0.0009,0.0009,1,26,
         3,1,2,1,0.0009,1,0.2323,4,2,0.0009,0.0009,0.0009,52,49,1,3,7)

#and actual values
actual <- c(0,0,2,0,13,20,0,3,38,0,0,0,1,0,0,0,27,2,0,0,1,0,1,0,4,2,0,0,0,
            54,21,0,4,11)

#The X axis is family, ranging from 1-35, and the Y axis is the sim and actual values.
#What I want to do is plot the simulated values with the 95% CI values, and then plot
#the actual values and see if they fall within the CIs, which they do. The idea is that
#there is no significant difference between the actual values and the simulated values.

#I have CIs for sim, and this is where the trouble begins!
simCI <- c(0.000908781,0.001248025,0.000928731,0.000885441,0.002384808,
           0.002700088,0.005377963,0.006202863,0.000918969,0.002566072,
           0.007687229,0.001593536,0.001578519,0.001299327,0.00217493,
           0.000908781,0.00090428,0.001550469,0.008840134,0.003300862,
           0.001546501,0.002775418,0.0014778,0.00090428,0.001546201,
           0.000898151,0.003446757,0.002854941,0.000863444,0.000918969,
           0.000924599,0.011732253,0.011488353,0.001788464)

#I then put this in a data frame
simvsact <- data.frame(sim = sim, actual = actual,
                       simCI.lower = sim - simCI, simCI.upper = sim + simCI,
                       fam = factor(paste('Family', 1:34, sep = '')))

As mentioned above, I was looking at getting an x/y scatter plot (I think this would be best; if not, other suggestions would be greatly appreciated) with the CI range highlighted as a block and the actual values as a line of a different colour running through the CI range. I hope this makes sense.

Peter
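One way to get a shaded CI band with the actual values drawn through it is base graphics with polygon(). This is only a sketch: it uses five of the posted sim and actual values, and the half-widths are made up and much larger than the posted simCI values so that the band is visible at this scale.

```r
# Five of the posted values, for a compact illustration.
sim    <- c(2, 9, 12, 19, 26)
actual <- c(2, 13, 20, 38, 27)

# Hypothetical CI half-widths (NOT the poster's simCI values).
simCI <- c(1.5, 2, 3, 4, 5)

fam   <- seq_along(sim)
lower <- sim - simCI
upper <- sim + simCI

# Empty plot with limits wide enough for the band and both lines.
plot(fam, sim, type = "n", ylim = range(lower, upper, actual),
     xlab = "Family", ylab = "Value")

# Shaded band: trace the lower limit left to right, then the upper limit back.
polygon(c(fam, rev(fam)), c(lower, rev(upper)), col = "grey85", border = NA)

# Simulated means and actual values on top of the band.
lines(fam, sim, lty = 2)
lines(fam, actual, col = "red", lwd = 2)

# Which actual values fall inside the band?
inside <- actual >= lower & actual <= upper
inside
```

The logical vector inside gives a direct check of the claim that the actual values fall within the intervals, which is worth running on the full data before relying on the plot alone.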
[R] Gini Coefficient
Dear List,

I am unsure whether this is specifically an R question or a stats question. I thought I would ask here, and if I get no replies that will answer it!

I am trying to calculate Gini coefficients in R, based on a slight modification of the typical equation that I have seen in a paper.

[Attachment: PastedGraphic-2.pdf (the equation)]

where X is the cumulated proportion of Cars and Y is the cumulated proportion of People. The value k indexes from the first to the next to last (n-1).

So I have a rough idea of how to implement this in R; however, I am unsure how the data should be sorted. Typically, when I have calculated Gini coefficients in the past, I have sorted X into ascending order and then calculated the cumulative proportion from this. However, if I have two variables, X and Y, I am unsure how to sort the data. That is, do I sort X and also reorder Y based on the sorting of X, or do I sort X and calculate the coefficient, then sort Y and calculate the coefficient, and add them together?

Once again, I am sorry if this is completely the wrong place to ask such a question.

Peter
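The attached equation is not preserved in the archive, so the following is a guess at its form, not the paper's formula: a trapezoid-rule Gini coefficient, G = 1 - sum over k of (X[k+1] - X[k]) * (Y[k+1] + Y[k]), with X and Y the cumulative proportions. On the sorting question, one common convention (assumed here, not taken from the paper) is to apply a single ordering to both columns, sorting the units by the ratio y/x. The function name gini_xy is made up for this sketch.

```r
# Sketch of a two-variable Gini coefficient under the assumptions above.
# x: cars per unit (must be > 0), y: people per unit.
gini_xy <- function(x, y) {
  stopifnot(length(x) == length(y), all(x > 0), all(y >= 0))

  # One ordering for both variables: ascending ratio of y to x.
  ord <- order(y / x)

  # Cumulative proportions, each starting from 0.
  X <- c(0, cumsum(x[ord]) / sum(x))
  Y <- c(0, cumsum(y[ord]) / sum(y))

  # Trapezoid rule: 1 minus twice the area under the Lorenz-type curve.
  n <- length(X)
  1 - sum(diff(X) * (Y[-1] + Y[-n]))
}

# People exactly proportional to cars: no inequality.
gini_xy(x = c(1, 2, 3), y = c(2, 4, 6))

# All people attached to one of two equally sized units.
gini_xy(x = c(1, 1), y = c(0, 1))
```

Sorting X and Y separately and adding two coefficients would break the pairing between the two variables, so it answers a different question than the joint curve does.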
[R] Randomly shuffle an array 1000 times
Dear List,

I have a table I have read into R:

Name   Yes/No
John   0
Frank  1
Ann    0
James  1
Alex   1

etc., for 800 different names. What I want to do is shuffle the yes/no values and randomly re-assign them to the names. I have used sample() and permute(); however, I cannot see a way to do this 1000 times. Furthermore, I want to copy the data into an Excel spreadsheet in the same order as the data was input, so I can build up a distribution of the statistic for each name.

When I use shuffle, the data gets returned like this:

[1] 1 0 0 1 0 1 0 0 0 1 1 1 1 1 1 1 1 1 0 1 0 0 0 0 0 0 1 0 0 0 1 0 1
[34] 0 1 0 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 1 0 0 0
[67] 0 0 0 0 0 0 0 1 1 1 0 1 0 0 1 1 0 1 0 1 0 1 1 1 0 0 1 0 0 1 1 1 1
[100] 1 1 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0
[133] 0 0 0 0 0 0 1 0 1 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0
[166] 0 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 1 1 0 0 1 1 0 1
[199] 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1
[232] 0 0 0 1 1 0 1 0 0 1 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 1 0 0 0 0 0 0 1
[265] 0 1 0 0 0 1 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 1
[298] 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 0 0 1 0
[331] 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1 0 1 1
etc.

rather than like this:

John   0
Frank  1
Ann    0
James  1
Alex   1

Can anyone suggest a script that would achieve this?

Thanks
Peter
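A minimal sketch of one way to do this in base R: replicate() repeats sample() 1000 times and collects the shuffles as columns, which can then be bound to the names in their original order and written to a CSV file that Excel can open. The five-row data frame, the column name YesNo and the output file name are placeholders; the real table has 800 rows.

```r
set.seed(123)

# Small stand-in for the poster's table (column name 'YesNo' is assumed).
dat <- data.frame(
  Name  = c("John", "Frank", "Ann", "James", "Alex"),
  YesNo = c(0, 1, 0, 1, 1)
)

# Each call to sample() permutes the yes/no values without replacement;
# replicate() returns a matrix with one column per shuffle.
n_perm <- 1000
shuffled <- replicate(n_perm, sample(dat$YesNo))

# Names stay in their input order; each row holds that name's 1000 draws.
result <- data.frame(Name = dat$Name, shuffled)

# Write to a temporary file here; any path could be used instead.
out_file <- file.path(tempdir(), "shuffled.csv")
write.csv(result, out_file, row.names = FALSE)
```

Each row of result then gives the distribution of shuffled values for one name, for example rowMeans(shuffled) is the proportion of shuffles in which each name received a 1.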