[R] factor() in lm
Dear R users, I am running a linear regression in R. My observations are Census Tracts in several metropolitan areas (MSAs). In my data set, each MSA has at least 50 observations. I use factor(msa_code) in the lm formula to control for metropolitan fixed effects. But I kept getting something like this: . factor(msa_code)12420 4.910e-01 1.517e-01 3.237 0.001221 ** factor(msa_code)12580 1.966e-01 6.861e-02 2.865 0.004194 ** factor(msa_code)14460 -3.892e-02 1.653e-02 -2.355 0.018601 * factor(msa_code)16980 -2.873e-01 3.278e-02 -8.764 2e-16 *** factor(msa_code)17140 1.088e-01 6.771e-02 1.607 0.108127 factor(msa_code)17460 -1.173e-01 4.380e-02 -2.678 0.007441 ** factor(msa_code)19100 1.368e-01 5.550e-02 2.465 0.013753 * factor(msa_code)19740 5.819e-01 1.173e-01 4.962 7.33e-07 *** factor(msa_code)19820 -4.214e-01 6.641e-02 -6.346 2.51e-10 *** factor(msa_code)26420 1.258e-01 7.541e-02 1.668 0.095486 . factor(msa_code)28140 2.010e-01 3.847e-02 5.224 1.85e-07 *** factor(msa_code)29820 7.102e-02 6.593e-02 1.077 0.281435 factor(msa_code)31100 -4.832e-01 1.088e-01 -4.440 9.28e-06 *** factor(msa_code)33100 -2.534e-01 6.391e-02 -3.965 7.49e-05 *** factor(msa_code)33460 5.229e-02 7.891e-02 0.663 0.507609 factor(msa_code)35620 -3.197e-01 7.565e-02 -4.225 2.45e-05 *** factor(msa_code)36740 1.269e-01 6.948e-02 1.826 0.067868 . factor(msa_code)37980 1.394e-01 4.388e-02 3.178 0.001497 ** factor(msa_code)38060 -6.935e-02 6.124e-02 -1.132 0.257540 factor(msa_code)38300 1.647e-01 3.986e-02 4.133 3.67e-05 *** factor(msa_code)38900 2.605e-01 1.420e-01 1.835 0.04 . factor(msa_code)39300 -9.612e-02 4.704e-02 -2.043 0.041103 * factor(msa_code)40140 -2.353e-01 3.562e-02 -6.605 4.59e-11 *** factor(msa_code)40900 NA NA NA NA factor(msa_code)41740 NA NA NA NA factor(msa_code)41860 NA NA NA NA factor(msa_code)42660 NA NA NA NA factor(msa_code)45300 NA NA NA NA factor(msa_code)47900 NA NA NA NA I wonder why I kep getting those NAs. Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] factor() in lm
Thanks, Bert. It seems I got these NAs because I already had MSA population controlled for in my model, besides the fixed effect variable, which led to overestimation. Those NAs disappeared after I dropped the population variable. Gary On Sun, Dec 1, 2013 at 10:27 AM, Bert Gunter gunter.ber...@gene.com wrote: You may wish to talk to a local statistician or read up on linear models, as you appear to not understand some basics. Anyway, either 1. You have other covariates in your model that you haven't shown and your model is overdetermined. 2. You have NA's in your data that causes 1) to occur. As an example of the above: x - rep(letters[1:3],e=5) y - factor(rep(1:3,c(5,8,2))) summary(lm(rnorm(15)~x+y)) Call: lm(formula = rnorm(15) ~ x + y) Residuals: Min 1Q Median 3Q Max -1.6768 -0.3865 -0.1108 0.3090 1.9632 Coefficients: (1 not defined because of singularities) Estimate Std. Error t value Pr(|t|) (Intercept) 0.041380.47160 0.0880.932 xb 1.592591.17111 1.3600.201 xc 0.368220.88228 0.4170.684 y2 -1.585170.96264 -1.6470.128 y3NA NA NA NA Incidentally, I was surprised to find in R3.0.2 that if some levels of a factor are missing either due to NA's in the response or otherwise, R estimates the coefficients for the remaining factor levels quite nicely. I expected it to complain, but it did not. Maybe it has always been so nicely behaved -- I don't fit overdetermined models and take care that my factor levels are actually present, so don't run into trouble. But if this is newish behavior and you are using an oldish version, you might try upgrading to the current version. Or (more likely) both clauses of this conditional are false and should be ignored, and I should preemptively apologize for my foolishness. Cheers, Bert On Sun, Dec 1, 2013 at 9:48 AM, Gary Dong pdxgary...@gmail.com wrote: Dear R users, I am running a linear regression in R. My observations are Census Tracts in several metropolitan areas (MSAs). In my data set, each MSA has at least 50 observations. I use factor(msa_code) in the lm formula to control for metropolitan fixed effects. But I kept getting something like this: . factor(msa_code)12420 4.910e-01 1.517e-01 3.237 0.001221 ** factor(msa_code)12580 1.966e-01 6.861e-02 2.865 0.004194 ** factor(msa_code)14460 -3.892e-02 1.653e-02 -2.355 0.018601 * factor(msa_code)16980 -2.873e-01 3.278e-02 -8.764 2e-16 *** factor(msa_code)17140 1.088e-01 6.771e-02 1.607 0.108127 factor(msa_code)17460 -1.173e-01 4.380e-02 -2.678 0.007441 ** factor(msa_code)19100 1.368e-01 5.550e-02 2.465 0.013753 * factor(msa_code)19740 5.819e-01 1.173e-01 4.962 7.33e-07 *** factor(msa_code)19820 -4.214e-01 6.641e-02 -6.346 2.51e-10 *** factor(msa_code)26420 1.258e-01 7.541e-02 1.668 0.095486 . factor(msa_code)28140 2.010e-01 3.847e-02 5.224 1.85e-07 *** factor(msa_code)29820 7.102e-02 6.593e-02 1.077 0.281435 factor(msa_code)31100 -4.832e-01 1.088e-01 -4.440 9.28e-06 *** factor(msa_code)33100 -2.534e-01 6.391e-02 -3.965 7.49e-05 *** factor(msa_code)33460 5.229e-02 7.891e-02 0.663 0.507609 factor(msa_code)35620 -3.197e-01 7.565e-02 -4.225 2.45e-05 *** factor(msa_code)36740 1.269e-01 6.948e-02 1.826 0.067868 . factor(msa_code)37980 1.394e-01 4.388e-02 3.178 0.001497 ** factor(msa_code)38060 -6.935e-02 6.124e-02 -1.132 0.257540 factor(msa_code)38300 1.647e-01 3.986e-02 4.133 3.67e-05 *** factor(msa_code)38900 2.605e-01 1.420e-01 1.835 0.04 . factor(msa_code)39300 -9.612e-02 4.704e-02 -2.043 0.041103 * factor(msa_code)40140 -2.353e-01 3.562e-02 -6.605 4.59e-11 *** factor(msa_code)40900 NA NA NA NA factor(msa_code)41740 NA NA NA NA factor(msa_code)41860 NA NA NA NA factor(msa_code)42660 NA NA NA NA factor(msa_code)45300 NA NA NA NA factor(msa_code)47900 NA NA NA NA I wonder why I kep getting those NAs. Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible
[R] summary many regressions
Dear R users, I have a large data set which includes data from 300 cities. I want to run a biviriate regression for each city and record the coefficient and the adjusted R square. For example, in the following, I have 10 cities represented by numbers from 1 to 10: x = cumsum(c(0, runif(999, -1, +1))) y = cumsum(c(0, runif(999, -1, +1))) city = rep(1:10,each=100) data-data.frame(cbind(x,y,city)) I can manually run regressions for each city: fit_city1 - lm(y ~ x,data=subset(data,data$city==1)) summary(fit_city1) Obvious, it is very tedious to run 300 regressions. I wonder if there is a quicker way to do this. Use for loop? what I want to see is something like this: CityCoefficient Adjusted R square 1 -0.05 0.36 2 -0.12 0.20 3 -0.05 0.32 . Any advice is appreciated! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] if else in R
Dear R users, I am a R beginner and I am having trouble in using If Else in R. Here is an example: ## create a data frame a-c(1,2,3,4) b-c(2,0,5,0) ab-data.frame(cbind(a,b)) ##calculate c, which is the ratio between a and b if(ab$b0) { ab$c-ab$a/ab$b } else { ab$c-0 } here is the error I got: Warning message: In if (ab$b 0) { : the condition has length 1 and only the first element will be used. Any help is appreciated! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] find max value in each row and return column number and column name
Dear R users, I wonder how I can use R to identify the max value of each row, the column number column name: For example: a - data.frame(x = rnorm(4), y = rnorm(4), z = rnorm(4)) a x y z 1 -0.7289964 0.2194702 -2.4674780 2 1.0889353 0.3167629 -0.9208548 3 -0.6374692 -1.7249049 0.6567313 4 -0.1348642 0.4507473 -1.7309010 In this data frame, I compare y and z only. What I need: x y z max max.col.num max.col.name 1 -0.7289964 0.2194702 -2.4674780 0.2194702 2 y 2 1.0889353 0.3167629 -0.9208548 0.3167629 2 y 3 -0.6374692 -1.7249049 0.6567313 0.6567313 3 z 4 -0.1348642 0.4507473 -1.7309010 0.4507473 2 y Any suggestion will be greatly appreciated! Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] extraction of roots in R
Dear R users, I wonder if R has a default function that I can use to do extraction of roots. Here is an example: X N 2.5 5 3.4 7 8.9 9 6.4 1 2.1 0 1.1 2 I want to calculate Y = root(X)^N, where N represents the power. what is the easy way to do this? Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot time series data in wide format
Dear R users, I wonder if there is a way that I can plot a time series data which is in a wide format like this: CITY_NAME 2000Q12000Q2 2000Q32000Q4 2001Q1 2001Q2 2001Q3 2001Q4 2002Q1 2002Q2 CITY1100.5210 101.9667 103.24933 104.0506 104.4317 105.3921 106.7643 107.5202 107.2561 107.8184 CITY2100.0412 100.6146 103.20293 104.0867 104.6612 106.6126 109.3514 110.1943 110.9480 113.0071 CITY3 99.589599.2298 99.2694799.4101 100.5776 101.3719 101.5957 102.2411 103.4390 105.1745 CITY4 99.6491 101.5386 104.90953 106.1065 108.1785 110.6845 113.3746 114.1254 116.2121 119.1033 CITY5100.9828 103.6847 105.04793 106.5925 108.7437 110.5549 111.9343 112.6704 113.6201 115.3020 Ideally, each city of the five city is represented by a line in the plot. Any suggestion is appreciated! Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Interpretation of coefficients in spatial lag models
Dear R users, I have estimated a spatial lag model using the spdep package. I understand that to interpret the magnitude of coefficients correctly, I have to use impact() to calculate the direct and indirect impacts. Here are the estimation results for a variable X, which is statistically significant in the model: Coefficient: 0.104 Direct impact: 0.105 Indirect impact: 0.02 Total impact: 0.107 What I want to know through the model is how the change in X impacts the dependent variable Y. Based on the results shown above, is this the correct way to interpret the impact of X on Y: The average total impact of a one-unit increase in X on Y is about 0.107. In other words, a one-unit increase in X is associated with a 0.107 unit increase in Y. Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Logged ratio as Dependent variable
Dear R users, I have a question regarding using logged ratio as dependent variable in regression. I am reading a paper discussing how a waste management facility has influenced housing price in surrounding areas. The author used Ln(P2/P1) as the dependent variable, where P1 and P2 represent trade prices of the same property before and after the opening of the facility. One key explanatory variable is the distance from the property to the facility. The author used spatial lag models to control for spatial autocorrelation between nearby properties. The author explained in the paper that the logged ratio is equivalent to the percentage change in the property price between time 1 and time 2. Thus, if the coefficient of the distance variable is 0.02, it means that with 1 mile closer to the facility, homes have a net appreciation rate 2% lower than comparable properties. I wonder if this is the appropriate way to interpret model results. If yes, I also wonder how the distance coefficient should be interpreted if the distance variable takes natural log form in the regression. Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in x[[1]] : subscript out of bounds
Dear R users, I am estimating a nested logit model using mlogit, but keep getting this error message: Error in x[[1]] : subscript out of bounds Anyone can tell me what this error means? Simple MNL model worked with the same data. Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mlogit does not report number of observation?
Dear R users, I am wondering if there is an easy way to have mlogit to report number of observations (N) used in the model. It seems summary() does not work. Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Mlogit error
Hi, Graham, Thank you for sharing this. I encountered the same problem. My choice set was unbalanced because some alternatives were not available for some observations. If I forced the choice sets balanced, model estimation results would be biased. However, after including chid.var = and alt.var= in the script, the model worked fined, with unbalanced choice set. Gary On Mon, Apr 8, 2013 at 1:13 AM, Leask, Graham g.le...@aston.ac.uk wrote: Dear Listmembers Thank you for your help in resolving the duplicate row.names error. The solution was very straightforward to ensure that the choice set is completely balanced. Once that was achieved the program worked fine. Kind Regards Graham __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] write table in ascii
Dear R users, I have a data set in .csv and I hope to convert it to .dat (ascii) so it can work in an UNIX environment. Anyone can help me? Thank you! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] write table in ascii
I am working in Windows system. But a software I am using requires data input in ascii (.dat, delimiters can be tabs or spaces). I was able to save the data into .dat (ascii) from SPSS, but it keeps causing problems. Thank you! Gary On Thu, May 30, 2013 at 3:51 PM, Sarah Goslee sarah.gos...@gmail.comwrote: On Thu, May 30, 2013 at 6:35 PM, Gary Dong pdxgary...@gmail.com wrote: Dear R users, I have a data set in .csv and I hope to convert it to .dat (ascii) so it can work in an UNIX environment. Anyone can help me? Thank you! Please expand. Why can't you work with csv in a UNIX environment? I do it all the time. Sarah -- Sarah Goslee http://www.functionaldiversity.org [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cannot print neighbor list object
Hi, R users, I am estimating a spatial lag model using the spdep package. Before running the model, I was creating K nearest neighbours for spatial weights. My observations are points. Let me use the coords in Columbus dataset as an example, and I consider 5 nearest neighbors. library(spdep) data(columbus) ### create neighbor list using point coordinates coords col_nb5-knn2nb(knearneigh(coords,k=5,longlat=FALSE,RANN=TRUE)) ### calculate weighting col_wts-nb2listw(col_nb5, style=W,zero.policy=FALSE) ### now I want to see the neighborhood list: col_nb5 print(col_nb5) ### instead of a neighbor list, what I get is: Neighbour list object: Number of regions: 49 Number of nonzero links: 245 Percentage nonzero weights: 10.20408 Average number of links: 5 Non-symmetric neighbours list I had similar problem when trying to view col_wts My question is: what is the right way to view the neighbor list? Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Warning: Spatial weights matrix not row standardized
Hi, R users, I am doing Lagrange Multiplier Test Statistics for Spatial Autocorrelation with spdep and got this warning message: Spatial weights matrix not row standardized. It is a warning, not an error. I am wondering if this is a problem. Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multi lines in R plot
Dear R users, I'm plotting 5 loess smooth lines in one paragraph. Since the publisher does not print colorful pictures, I differentiate them by using different line types. I'm wondering if there are other options to make the graph more readable. It is really difficult for readers to tell the differences between those dash lines. Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Pseudo R squared in gls model
Dear R users, I'm wondering if the gls function reports pseudo R. I do not see it by summary(). If the package does not report, can I calculate it in this way? Adjusted pseudo R squared = 1 - [(Loglik(beta) - k ) / Loglik(null)] where k is the number of IVs. Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] negative AIC and BIC values in gls
Dear R users, I obtained negative AIC and BIC and positive Loglik values in a gls model. Is this normal? how should I interpret them? Thanks! AIC BIC logLik -659.978 -587.5541 345.989 Best Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] spatial auto-correlation structure in nlme
Dear R users, I'm estimating a mixed effects model in which the spatial correlation is controlled for by the corGaus structure. I'm wondering if there is a document or paper that explains how the spatial correlation structure (such as corExp or corGaus) works. Let me use the example and data posted on UCLA's R FAQ webpage to explain my problems. The link for the webpage is: http://www.ats.ucla.edu/stat/r/faq/spatial_regression.htm install.packages(nlme) library(nlme) spdata - read.table(http://www.ats.ucla.edu/stat/R/faq/thick.csv;, header = T, sep = ,) dummy - rep(1, 75) spdata - cbind(spdata, dummy) ### estimate the null model ### soil.model - lme(fixed = thick ~ soil, data = spdata, random = ~ 1 | dummy, method = ML) summary(soil.model) plot(Variogram(soil.model,form=~north+east)) ### updated the model by the spatial correlation structure soil.gaus - update(soil.model, correlation=corGaus(1,form=~north+east)) summary(soil.gaus) plot(Variogram(soil.gaus,form=~north+east)) My questions are: 1) Is there a way that I can tell, to what extent the spatial correlation has been controlled for by the model, besides the improvements of AIC, BIC, and Loglik? 2) where can I find the formulas for the corExp or corGaus? Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] memory size
Dear R community, I'm running a mlogit function with a very large data set. Since the data size is so large, I got the Error: Error: cannot allocate vector of size 5.4 Mb In addition: Warning messages: 1: In mapply(*, X, P, SIMPLIFY = FALSE) : Reached total allocation of 16340Mb: see help(memory.size) 2: In mapply(*, X, P, SIMPLIFY = FALSE) : Reached total allocation of 16340Mb: see help(memory.size) 3: In mapply(*, X, P, SIMPLIFY = FALSE) : Reached total allocation of 16340Mb: see help(memory.size) 4: In mapply(*, X, P, SIMPLIFY = FALSE) : Reached total allocation of 16340Mb: see help(memory.size) I'm using win7 and my computer has 16G memory. Is there a way that I can make R use the memory more efficiently? Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotting points to a map
Dear R users, I have a city map in shape file (polygon). I also have some points that I hope to plot them to the city map. The only information I have about those points are their relative longitude and latitude to the city center by miles. Is there a way that R can help me to do this? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting points to a map
Thanks, Sara. For your first question: Geographic Coordinate System: GCS_North_American_1983_HARN Projected Coordinate System: NAD_1983_HARN_StatePlane_Oregon_North_FIPS_3601_Feet_Intl I have imported the shp file in R by read.shaplefiles(). For your second question: 1) by relative long and lat, I mean distances to the city center from south-north and east-west directions. 2) I know the coordinate of the reference point. Thanks. Best Gary On Wed, Jul 18, 2012 at 11:30 AM, Sarah Goslee sarah.gos...@gmail.comwrote: Hi, On Wed, Jul 18, 2012 at 2:11 PM, Gary Dong pdxgary...@gmail.com wrote: Dear R users, I have a city map in shape file (polygon). I also have some points that I hope to plot them to the city map. What projection is the shape file in? Have you successfully imported it into R yet? This page has a nice overview on dealing with shapefiles in R. http://www.nceas.ucsb.edu/scicomp/usecases/ReadWriteESRIShapeFiles The only information I have about those points are their relative longitude and latitude to the city center by miles. Is there a way that R can help me to do this? Thanks. What does relative longitude and latitude in miles mean? That makes no sense to me. Do you mean distance and direction? (Which wouldn't be latitude and longitude.) If the latter, do you know the coordinates of the reference point? I think we need more information to be able to help. You might also be better served by asking on the r-sig-geo list instead of the main R-help list. Sarah -- Sarah Goslee http://www.functionaldiversity.org [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] 3-d kernel smooth by the kde function
Dear R community, I'm having hard time to understand the kde function in ks package. Let me use a 3-dimensional kernel smooth example to explain my question using the elevation data in geoR. ### here is what I did ### library(ks) require(geoR) data(elevation) elev.df - data.frame(x = elevation$coords[,x], y = elevation$coords[,y], z = elevation$data) H-Hpi(elev.df) elev.smt-kde(elev.df,H=H) plot(elev.smt, drawpoints=TRUE) If I understand it correctly, with the kde function, I'm using the two coordinate variables x and y to estimate (or say smooth) elevation (z). Is this correct? With this kernel smooth, my goal is to identify peaks, which are defined as areas that have estimated elevations =950. Can someone show me how to do this? Thanks! Best Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Kernel smoothing (ks package)
Dear R users, I'm trying to determine local population centers in a region through kernel smoothing. What I have is population density in each neighborhood, which can be presented as point density. So in my data set pop, I have three columns for about 1000 neighborhoods: x (x coordinate), y (y coordinate), and z (population density at xy). I searched R documentations and found that the kde function in the ks package might be the one I'm looking for. To illustrate what I have done, I use the elevation data in GeoR: # Kernel smooth require(geoR) data(elevation) elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 * elevation$coords[,y], z = 10 * elevation$data) H-Hpi(elev.df) popcnt-kde(elev.df,H=H) plot(popcnt, axes=TRUE, box=TRUE, drawpoints=TRUE) To confirm this is what I want, I also created Loess density surface require(geoR) data(elevation) elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 * elevation$coords[,y], z = 10 * elevation$data) elev.loess - loess(z ~ x + y, data = elev.df, degree = 2, span = 0.15) grid.x - seq(10, 300, 1) grid.y - seq(10, 300, 1) grid.mar - list(x=grid.x, y=grid.y) elev.fit-expand.grid(grid.mar) z.pred - predict(elev.loess, newdata = elev.fit) persp(seq(10, 300, 1), seq(10, 300, 1), z.pred, phi = 45, theta = 45) # After comparing the two, I felt I did something wrong with the kernel smooth, but could not figure out the right way to do it. Your help would be greatly appreciated! Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] identifying local maxima
Thanks, Hans. Much appreciated! I'm also wondering, if a population center (local maxima) is defined as a contiguous cluster of neighborhoods with population density that is significantly higher than the neighborhoods surrounding it, will things become easier? I mean, the identification of local maxima. Thanks. Gary On Thu, Jul 12, 2012 at 5:26 AM, Hans W Borchers hwborch...@googlemail.comwrote: Gary Dong pdxgary163 at gmail.com writes: Dear R users, I have created a Loess surface in R, in which x is relative longitude by miles, y is relative latitude by miles, and z is population density at the neighborhood level. The purpose is to identify some population centers in the region. I'm wondering if there is a way to determine the coordinates (x,y) of each center, so I can know exactly where they are. Let me use the elevation data set in geoR to illustrate what have done: require(geoR) data(elevation) elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 * elevation$coords[,y], z = 10 * elevation$data) elev.loess - loess(z ~ x * y, data = elev.df, degree = 2, span = 0.1) grid.x - seq(10, 300, 1) grid.y - seq(10, 300, 1) grid.mar - list(x=grid.x, y=grid.y) elev.fit-expand.grid(grid.mar) z.pred - predict(elev.loess, newdata = elev.fit) persp(seq(10, 300, 5), seq(10, 300, 5), emp.pred, phi = 45, theta = 45, xlab = X Coordinate (feet), ylab = Y Coordinate (feet), main = Surface elevation data) With this, what's the right way to determine the coordinates of the local maixma on the surface? If there are more local maxima, you probably need to start the optimization process from each point of your grid and see if you stumble into different maxima. Here is how to find the one maximum in your example. require(geoR); data(elevation) elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 *elevation$coords[,y], z = 10 * elevation$data) ## Compute the surface on an irregular grid library(akima) foo - function(xy) with( elev.df, -interp(x, y, z, xy[1], xy[2], linear=FALSE, extrap=TRUE)$z ) ## Use Nelder-Mead to find a local maximum optim(c(150, 150), foo) # $par # [1] 195.8372 21.5866 # $value # [1] -9705.536 ## See contour plot if this is the only maximum with(elev.df, {A - akima::interp(x, y, z, linear=FALSE) plot(x, y, pch=20, col=blue) contour(A$x, A$y, A$z, add=TRUE) }) points(195.8372, 21.5866, col=red) Thanks. Gary __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] identifying local maxima
Dear R users, I have created a Loess surface in R, in which x is relative longitude by miles, y is relative latitude by miles, and z is population density at the neighborhood level. The purpose is to identify some population centers in the region. I'm wondering if there is a way to determine the coordinates (x,y) of each center, so I can know exactly where they are. Let me use the elevation data set in geoR to illustrate what have done: require(geoR) data(elevation) elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 * elevation$coords[,y], z = 10 * elevation$data) elev.loess - loess(z ~ x * y, data = elev.df, degree = 2, span = 0.1) grid.x - seq(10, 300, 1) grid.y - seq(10, 300, 1) grid.mar - list(x=grid.x, y=grid.y) elev.fit-expand.grid(grid.mar) z.pred - predict(elev.loess, newdata = elev.fit) persp(seq(10, 300, 5), seq(10, 300, 5), emp.pred, phi = 45, theta = 45, xlab = X Coordinate (feet), ylab = Y Coordinate (feet), main = Surface elevation data) With this, what's the right way to determine the coordinates of the local maixma on the surface? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] calculating inflection point in mixed effect model
Hi, dear R users, I estimated a mixed-effects model with the lme function in R. In this model, the relationship between my predictor x and the DV y follows quadratic function. In other words, in the model, x has a positive coefficient while x squared has a negative coefficient. I'm wondering how I can calculate the inflection point, at which the impacts of x on y turns positive into negative. Thanks! Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Multi-threads in R
Dear R users, I'm wonder if there is a easy way to make R use multi-CPUs on my computer. My computer has four CPUs but R uses only one. Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multi-threads in R
Thanks for all replied. I read the introduction of R parallel. Is it for loops only? Gary On Sun, Jun 17, 2012 at 10:04 AM, R. Michael Weylandt michael.weyla...@gmail.com michael.weyla...@gmail.com wrote: Take a look at the parallel package which ships with all current versions of R. Michael On Jun 17, 2012, at 11:39 AM, Gary Dong pdxgary...@gmail.com wrote: Dear R users, I'm wonder if there is a easy way to make R use multi-CPUs on my computer. My computer has four CPUs but R uses only one. Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Specifying spatial correlation Form in nmle
Dear R users, I'm applying a correlation structure in a mixed model (nmle function) to control for spatial correlation between land parcels that are adjacent to each other. I generated X,Y coordinates in ArcGIS for each land parcel and used them in the correlation form like this: test.exp-corExp(1, form = ~ X + Y) test.exp- Initialize(test.exp,dataset) However, the correlation Matrix generated does not look right. Here is a portion of the matrix, in which all correlations are 0: corMatrix(pdxspc.exp)[1:5, 1:5] [,1] [,2] [,3] [,4] [,5] [1,]10000 [2,]01000 [3,]00100 [4,]00010 [5,]00001 It seems I'm not using the X Y coordinates correctly in the form. I saw two examples that used east and north in their form. But I do not quite know what east and north are. Any help would be greatly appreciated. Best Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] controlling spatial autocorrelation in linear regression models
Thanks! This link is very helpful. Best Gary On Mon, Jun 11, 2012 at 4:28 AM, D.Soudis d.sou...@rug.nl wrote: Hi, Maybe this is helpful..?? http://www.ats.ucla.edu/stat/r/faq/spatial_regression.htm Best, Dimitrios -- View this message in context: http://r.789695.n4.nabble.com/controlling-spatial-autocorrelation-in-linear-regression-models-tp4632940p4632998.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] controlling spatial autocorrelation in linear regression models
Dear R users, I'm estimating a hedonic land price model at the parcel level, basically a linear regression model with land price as the dependent variable. I'm wondering if there is a function in R that I can use to control spatial auto-correlation among parcels that are adjacent to each other. I know multi-level regression can solve the problem partially by grouping parcels to a higher spatial level, such as block group or tract. However, the grouping could be very arbitrary. Is there a better way to do this? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lmer function in R
Dear R users, I'm estimating a two-level regresion model but having difficuly in finding a good R syntax example. Let me use an example to explain what I'm doing. The dependent variable is math score (mathscore). Predictors are at two levels. At the student level, they are household income (hinc) and IQ (IQ). At the school level, there is a dummy variable indicating if the school is a private school (prvsch). I also have school IDs (id) that can be used as the subject. I'm wondering what the syntax would be in R. Any advice is appreciated. Best Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Two Y axes (same scale) in ggplot2 or plot
Thanks, Duncan. I tried axis(). It appears it allows you to add an axis, but does not say you can plot a second Y in the graph. Maybe I'm understanding it correctly. Any help will be appreciated! Gary On Wed, May 9, 2012 at 7:26 AM, Duncan Murdoch murdoch.dun...@gmail.comwrote: On 08/05/2012 3:23 PM, Gary Dong wrote: Dear R users, I'm plotting housing prices in City A over past 30 years in ggplot2. The Xs are years since 1980. I have two housing price variables: new home prices and old home prices, both of them measured by $/sqft. I have searched related threads on multiple Y axes in ggplot2 and I understand that multiple Y axes in different scales are not possible. I'm wondering if it is possible to have multiple Y axes with the same scale in ggplot2, like in my case. If still not possible, is there a easy way to do it in R's default plot function? Thanks. In base graphics, you can have as many axes as you like, displaying anything. Use the axis() function. See ?axis for the arguments that determine placement, ticks, etc. I would guess the same flexibility is there in ggplot2, but I don't know how to do it. Duncan Murdoch [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] two Y Axes (in the same scale) in ggplot2
Dear R users, I'm plotting housing prices in City A over past 30 years in ggplot2. The Xs are years since 1980. I have two housing price variables: new home prices and old home prices, both of them measured by $/sqft. I have searched related threads on multiple Y axes in ggplot2 and I understand that multiple Y axes in different scales are not possible. I'm wondering if it is possible to have multiple Y axes with the same scale in ggplot2, like in my case. If still not possible, is there a easy way to do it in R's default plot function? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Two Y axes (same scale) in ggplot2 or plot
Dear R users, I'm plotting housing prices in City A over past 30 years in ggplot2. The Xs are years since 1980. I have two housing price variables: new home prices and old home prices, both of them measured by $/sqft. I have searched related threads on multiple Y axes in ggplot2 and I understand that multiple Y axes in different scales are not possible. I'm wondering if it is possible to have multiple Y axes with the same scale in ggplot2, like in my case. If still not possible, is there a easy way to do it in R's default plot function? Thanks. Gary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.