[R] factor() in lm

2013-12-01 Thread Gary Dong
Dear R users,

I am running a linear regression in R. My observations are Census Tracts in
several metropolitan areas (MSAs). In my data set, each MSA has at least 50
observations. I use factor(msa_code) in the lm formula to control for
metropolitan fixed effects. But I kept getting something like this:

.
factor(msa_code)12420  4.910e-01  1.517e-01   3.237 0.001221 **
factor(msa_code)12580  1.966e-01  6.861e-02   2.865 0.004194 **
factor(msa_code)14460 -3.892e-02  1.653e-02  -2.355 0.018601 *
factor(msa_code)16980 -2.873e-01  3.278e-02  -8.764   2e-16 ***
factor(msa_code)17140  1.088e-01  6.771e-02   1.607 0.108127
factor(msa_code)17460 -1.173e-01  4.380e-02  -2.678 0.007441 **
factor(msa_code)19100  1.368e-01  5.550e-02   2.465 0.013753 *
factor(msa_code)19740  5.819e-01  1.173e-01   4.962 7.33e-07 ***
factor(msa_code)19820 -4.214e-01  6.641e-02  -6.346 2.51e-10 ***
factor(msa_code)26420  1.258e-01  7.541e-02   1.668 0.095486 .
factor(msa_code)28140  2.010e-01  3.847e-02   5.224 1.85e-07 ***
factor(msa_code)29820  7.102e-02  6.593e-02   1.077 0.281435
factor(msa_code)31100 -4.832e-01  1.088e-01  -4.440 9.28e-06 ***
factor(msa_code)33100 -2.534e-01  6.391e-02  -3.965 7.49e-05 ***
factor(msa_code)33460  5.229e-02  7.891e-02   0.663 0.507609
factor(msa_code)35620 -3.197e-01  7.565e-02  -4.225 2.45e-05 ***
factor(msa_code)36740  1.269e-01  6.948e-02   1.826 0.067868 .
factor(msa_code)37980  1.394e-01  4.388e-02   3.178 0.001497 **
factor(msa_code)38060 -6.935e-02  6.124e-02  -1.132 0.257540
factor(msa_code)38300  1.647e-01  3.986e-02   4.133 3.67e-05 ***
factor(msa_code)38900  2.605e-01  1.420e-01   1.835 0.04 .
factor(msa_code)39300 -9.612e-02  4.704e-02  -2.043 0.041103 *
factor(msa_code)40140 -2.353e-01  3.562e-02  -6.605 4.59e-11 ***
factor(msa_code)40900 NA NA  NA   NA
factor(msa_code)41740 NA NA  NA   NA
factor(msa_code)41860 NA NA  NA   NA
factor(msa_code)42660 NA NA  NA   NA
factor(msa_code)45300 NA NA  NA   NA
factor(msa_code)47900 NA NA  NA   NA

 I wonder why I kep getting those NAs. Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factor() in lm

2013-12-01 Thread Gary Dong
Thanks, Bert. It seems I got these NAs because I already had MSA population
controlled for in my model, besides the fixed effect variable, which led to
overestimation. Those NAs disappeared after I dropped the population
variable.

Gary


On Sun, Dec 1, 2013 at 10:27 AM, Bert Gunter gunter.ber...@gene.com wrote:

 You may wish to talk to a local statistician or read up on linear
 models, as you appear to not understand some basics. Anyway,  either

 1. You have other covariates in your model that you haven't shown and
 your model is overdetermined.
 2. You have NA's in your data that causes 1) to occur.

 As an example of the above:

 x - rep(letters[1:3],e=5)
 y - factor(rep(1:3,c(5,8,2)))
 summary(lm(rnorm(15)~x+y))

 Call:
 lm(formula = rnorm(15) ~ x + y)

 Residuals:
 Min  1Q  Median  3Q Max
 -1.6768 -0.3865 -0.1108  0.3090  1.9632

 Coefficients: (1 not defined because of singularities)
 Estimate Std. Error t value Pr(|t|)
 (Intercept)  0.041380.47160   0.0880.932
 xb   1.592591.17111   1.3600.201
 xc   0.368220.88228   0.4170.684
 y2  -1.585170.96264  -1.6470.128
 y3NA NA  NA   NA


 Incidentally, I was surprised to find in R3.0.2 that if some levels of
 a factor are missing either due to NA's in the response or otherwise,
 R estimates the coefficients for the remaining factor levels quite
 nicely. I expected it to complain, but it did not. Maybe it has always
 been so nicely behaved -- I don't fit overdetermined models and take
 care that my factor levels are actually present, so don't run into
 trouble. But if this is newish behavior and you are using an oldish
 version, you might try upgrading to the current version. Or (more
 likely) both clauses of this conditional are false and should be
 ignored, and I should preemptively apologize for my foolishness.

 Cheers,
 Bert

 On Sun, Dec 1, 2013 at 9:48 AM, Gary Dong pdxgary...@gmail.com wrote:
  Dear R users,
 
  I am running a linear regression in R. My observations are Census Tracts
 in
  several metropolitan areas (MSAs). In my data set, each MSA has at least
 50
  observations. I use factor(msa_code) in the lm formula to control for
  metropolitan fixed effects. But I kept getting something like this:
 
  .
  factor(msa_code)12420  4.910e-01  1.517e-01   3.237 0.001221 **
  factor(msa_code)12580  1.966e-01  6.861e-02   2.865 0.004194 **
  factor(msa_code)14460 -3.892e-02  1.653e-02  -2.355 0.018601 *
  factor(msa_code)16980 -2.873e-01  3.278e-02  -8.764   2e-16 ***
  factor(msa_code)17140  1.088e-01  6.771e-02   1.607 0.108127
  factor(msa_code)17460 -1.173e-01  4.380e-02  -2.678 0.007441 **
  factor(msa_code)19100  1.368e-01  5.550e-02   2.465 0.013753 *
  factor(msa_code)19740  5.819e-01  1.173e-01   4.962 7.33e-07 ***
  factor(msa_code)19820 -4.214e-01  6.641e-02  -6.346 2.51e-10 ***
  factor(msa_code)26420  1.258e-01  7.541e-02   1.668 0.095486 .
  factor(msa_code)28140  2.010e-01  3.847e-02   5.224 1.85e-07 ***
  factor(msa_code)29820  7.102e-02  6.593e-02   1.077 0.281435
  factor(msa_code)31100 -4.832e-01  1.088e-01  -4.440 9.28e-06 ***
  factor(msa_code)33100 -2.534e-01  6.391e-02  -3.965 7.49e-05 ***
  factor(msa_code)33460  5.229e-02  7.891e-02   0.663 0.507609
  factor(msa_code)35620 -3.197e-01  7.565e-02  -4.225 2.45e-05 ***
  factor(msa_code)36740  1.269e-01  6.948e-02   1.826 0.067868 .
  factor(msa_code)37980  1.394e-01  4.388e-02   3.178 0.001497 **
  factor(msa_code)38060 -6.935e-02  6.124e-02  -1.132 0.257540
  factor(msa_code)38300  1.647e-01  3.986e-02   4.133 3.67e-05 ***
  factor(msa_code)38900  2.605e-01  1.420e-01   1.835 0.04 .
  factor(msa_code)39300 -9.612e-02  4.704e-02  -2.043 0.041103 *
  factor(msa_code)40140 -2.353e-01  3.562e-02  -6.605 4.59e-11 ***
  factor(msa_code)40900 NA NA  NA   NA
  factor(msa_code)41740 NA NA  NA   NA
  factor(msa_code)41860 NA NA  NA   NA
  factor(msa_code)42660 NA NA  NA   NA
  factor(msa_code)45300 NA NA  NA   NA
  factor(msa_code)47900 NA NA  NA   NA
 
   I wonder why I kep getting those NAs. Thank you!
 
  Gary
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 (650) 467-7374


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible

[R] summary many regressions

2013-11-25 Thread Gary Dong
Dear R users,

I have a large data set which includes data from 300 cities. I want to run
a biviriate regression for each city and record the coefficient and the
adjusted R square.

For example, in the following, I have 10 cities represented by numbers from
1 to 10:

x = cumsum(c(0, runif(999, -1, +1)))
y = cumsum(c(0, runif(999, -1, +1)))
city = rep(1:10,each=100)
data-data.frame(cbind(x,y,city))

I can manually run regressions for each city:
fit_city1 - lm(y ~ x,data=subset(data,data$city==1))
summary(fit_city1)

Obvious, it is very tedious to run 300 regressions. I wonder if there is a
quicker way to do this. Use for loop?  what I want to see is something like
this:

CityCoefficient   Adjusted R square
1  -0.05  0.36
2  -0.12  0.20
3  -0.05  0.32
.

Any advice is appreciated!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] if else in R

2013-11-19 Thread Gary Dong
Dear R users,

I am a R beginner and I am having trouble in using If Else in R. Here is
an example:

## create a data frame

a-c(1,2,3,4)
b-c(2,0,5,0)
ab-data.frame(cbind(a,b))

##calculate c, which is the ratio between a and b

if(ab$b0) {
 ab$c-ab$a/ab$b
} else {
 ab$c-0
 }

here is the error I got:

Warning message:
In if (ab$b  0) { :
  the condition has length  1 and only the first element will be used.

Any help is appreciated!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] find max value in each row and return column number and column name

2013-11-01 Thread Gary Dong
Dear R users,

I wonder how I can use R to identify the max value of each row, the column
number column name:

For example:

a - data.frame(x = rnorm(4), y = rnorm(4), z = rnorm(4))

 a
   x  y  z
1 -0.7289964  0.2194702 -2.4674780
2  1.0889353  0.3167629 -0.9208548
3 -0.6374692 -1.7249049  0.6567313
4 -0.1348642  0.4507473 -1.7309010

In this data frame, I compare y and z only.

What I need:

x y z
 max max.col.num max.col.name
1 -0.7289964  0.2194702 -2.4674780 0.2194702   2
 y
2  1.0889353  0.3167629 -0.9208548 0.3167629   2
 y
3 -0.6374692 -1.7249049  0.6567313 0.6567313   3
z
4 -0.1348642  0.4507473 -1.7309010 0.4507473   2
y


Any suggestion will be greatly appreciated!

Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] extraction of roots in R

2013-11-01 Thread Gary Dong
Dear R users,

I wonder if R has a default function that I can use to do extraction of
roots.

Here is an example:

X  N
2.5  5
3.4  7
8.9  9
6.4  1
2.1  0
1.1 2

I want to calculate Y = root(X)^N, where N represents the power. what is
the easy way to do this?

Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot time series data in wide format

2013-11-01 Thread Gary Dong
Dear R users,

I wonder if there is a way that I can plot a time series data which is in a
wide format like this:

CITY_NAME   2000Q12000Q2  2000Q32000Q4 2001Q1
2001Q2  2001Q3 2001Q4 2002Q1  2002Q2
CITY1100.5210   101.9667  103.24933   104.0506   104.4317
105.3921   106.7643   107.5202   107.2561   107.8184
CITY2100.0412   100.6146  103.20293   104.0867   104.6612
106.6126   109.3514   110.1943   110.9480   113.0071
CITY3 99.589599.2298   99.2694799.4101   100.5776
101.3719   101.5957   102.2411   103.4390   105.1745
CITY4 99.6491   101.5386  104.90953   106.1065   108.1785
110.6845   113.3746   114.1254   116.2121   119.1033
CITY5100.9828   103.6847  105.04793   106.5925   108.7437
110.5549   111.9343   112.6704   113.6201   115.3020

Ideally, each city of the five city is represented by a line in the plot.

Any suggestion is appreciated!

Thanks!
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Interpretation of coefficients in spatial lag models

2013-10-07 Thread Gary Dong
Dear R users,

I have estimated a spatial lag model using the spdep package. I
understand that to interpret the magnitude of coefficients correctly, I
have to use impact() to calculate the direct and indirect impacts.

Here are the estimation results for a variable X, which is statistically
significant in the model:

Coefficient: 0.104
Direct impact: 0.105
Indirect impact: 0.02
Total impact: 0.107

What I want to know through the model is how the change in X impacts the
dependent variable Y.  Based on the results shown above, is this the
correct way to interpret the impact of X on Y: The average total impact of
a one-unit increase in X on Y is about 0.107. In other words, a one-unit
increase in X is associated with a 0.107 unit increase in Y.

Thanks!
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Logged ratio as Dependent variable

2013-09-12 Thread Gary Dong
Dear R users,

I have a question regarding using logged ratio as dependent variable in
regression. I am reading a paper discussing how a waste management facility
has influenced housing price in surrounding areas. The author used
Ln(P2/P1) as the dependent variable, where P1 and P2 represent trade prices
of the same property before and after the opening of the facility. One key
explanatory variable is the distance from the property to the facility. The
author used spatial lag models to control for spatial autocorrelation
between nearby properties.

The author explained in the paper that the logged ratio is equivalent to
the percentage change in the property price between time 1 and time 2.
Thus, if the coefficient of the distance variable is 0.02, it means that
with 1 mile closer to the facility, homes have a net appreciation rate 2%
lower than comparable properties.

I wonder if this is the appropriate way to interpret model results. If yes,
I also wonder how the distance coefficient should be interpreted if the
distance variable takes natural log form in the regression. Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error in x[[1]] : subscript out of bounds

2013-05-31 Thread Gary Dong
Dear R users,

I am estimating a nested logit model using mlogit, but keep getting this
error message:

Error in x[[1]] : subscript out of bounds

Anyone can tell me what this error means?

Simple MNL model worked with the same data.

Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mlogit does not report number of observation?

2013-05-30 Thread Gary Dong
Dear R users,

I am wondering if there is an easy way to have mlogit to report number of
observations (N) used in the model. It seems summary() does not work.
Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mlogit error

2013-05-30 Thread Gary Dong
Hi, Graham, Thank you for sharing this.

I encountered the same problem. My choice set was unbalanced because some
alternatives were not available for some observations. If I forced the
choice sets balanced, model estimation results would be biased. However,
after including chid.var =  and alt.var= in the script, the model
worked fined, with unbalanced choice set.

Gary



On Mon, Apr 8, 2013 at 1:13 AM, Leask, Graham g.le...@aston.ac.uk wrote:

 Dear Listmembers

 Thank you for your help in resolving the duplicate row.names error. The
 solution was very straightforward to ensure that the choice set is
 completely balanced. Once that was achieved the program worked fine.

 Kind Regards


 Graham
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] write table in ascii

2013-05-30 Thread Gary Dong
Dear R users,

I have a data set in .csv and I hope to convert it to .dat (ascii) so it
can work in an UNIX environment. Anyone can help me? Thank you!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write table in ascii

2013-05-30 Thread Gary Dong
I am working in Windows system. But a software I am using requires data
input in ascii (.dat, delimiters can be tabs or spaces). I was able to save
the data into .dat (ascii) from SPSS, but it keeps causing problems. Thank
you!

Gary


On Thu, May 30, 2013 at 3:51 PM, Sarah Goslee sarah.gos...@gmail.comwrote:

 On Thu, May 30, 2013 at 6:35 PM, Gary Dong pdxgary...@gmail.com wrote:
  Dear R users,
 
  I have a data set in .csv and I hope to convert it to .dat (ascii) so it
  can work in an UNIX environment. Anyone can help me? Thank you!

 Please expand. Why can't you work with csv in a UNIX environment?

 I do it all the time.

 Sarah

 --
 Sarah Goslee
 http://www.functionaldiversity.org


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cannot print neighbor list object

2013-01-29 Thread Gary Dong
Hi, R users,

I am estimating a spatial lag model using the spdep package. Before
running the model, I was creating K nearest neighbours for spatial weights.
My observations are points.

Let me use the coords in Columbus dataset as an example, and I consider 5
nearest neighbors.

library(spdep)
data(columbus)
### create neighbor list using point coordinates coords
col_nb5-knn2nb(knearneigh(coords,k=5,longlat=FALSE,RANN=TRUE))
### calculate weighting
col_wts-nb2listw(col_nb5, style=W,zero.policy=FALSE)

### now I want to see the neighborhood list: col_nb5

print(col_nb5)

### instead of a neighbor list, what I get is:

Neighbour list object:
Number of regions: 49
Number of nonzero links: 245
Percentage nonzero weights: 10.20408
Average number of links: 5
Non-symmetric neighbours list

I had similar problem when trying to view col_wts

My question is: what is the right way to view the neighbor list?

Thanks!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Warning: Spatial weights matrix not row standardized

2013-01-29 Thread Gary Dong
Hi, R users,

I am doing Lagrange Multiplier Test Statistics for Spatial Autocorrelation
with spdep and got this warning message: Spatial weights matrix not row
standardized.

It is a warning, not an error.

I am wondering if this is a problem. Thanks!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multi lines in R plot

2012-09-19 Thread Gary Dong
Dear R users,

I'm plotting 5 loess smooth lines in one paragraph. Since the publisher
does not print colorful pictures, I differentiate them by using different
line types. I'm wondering if there are other options to make the graph more
readable. It is really difficult for readers to tell the differences
between those dash lines. Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Pseudo R squared in gls model

2012-08-23 Thread Gary Dong
Dear R users,

I'm wondering if the gls function reports pseudo R. I do not see it by
summary(). If the package does not report, can I calculate it in this way?

Adjusted pseudo R squared = 1 - [(Loglik(beta) - k ) / Loglik(null)] where
k is the number of IVs.

Thanks!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] negative AIC and BIC values in gls

2012-08-22 Thread Gary Dong
Dear R users,

I obtained negative AIC and BIC and positive Loglik values in a gls model.
Is this normal? how should I interpret them? Thanks!

   AIC   BIC   logLik
  -659.978  -587.5541   345.989

Best
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] spatial auto-correlation structure in nlme

2012-08-17 Thread Gary Dong
Dear R users,

I'm estimating a mixed effects model in which the spatial correlation is
controlled for by the corGaus structure. I'm wondering if there is a
document or paper that explains how the spatial correlation structure (such
as corExp or corGaus) works.

Let me use the example and data posted on UCLA's R FAQ webpage to explain
my problems. The link for the webpage is:
http://www.ats.ucla.edu/stat/r/faq/spatial_regression.htm

install.packages(nlme)
library(nlme)

spdata - read.table(http://www.ats.ucla.edu/stat/R/faq/thick.csv;, header
= T, sep = ,)
dummy - rep(1, 75)
spdata - cbind(spdata, dummy)

### estimate the null model ###
soil.model - lme(fixed = thick ~ soil, data = spdata, random = ~ 1 |
dummy, method = ML)
summary(soil.model)
plot(Variogram(soil.model,form=~north+east))

### updated the model by the spatial correlation structure 
soil.gaus - update(soil.model, correlation=corGaus(1,form=~north+east))
summary(soil.gaus)
plot(Variogram(soil.gaus,form=~north+east))

My questions are:

1) Is there a way that I can tell, to what extent the spatial correlation
has been controlled for by the model, besides the improvements of AIC, BIC,
and Loglik?

2) where can I find the formulas for the corExp or corGaus?

Thanks!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] memory size

2012-07-21 Thread Gary Dong
Dear R community,

I'm running a mlogit function with a very large data set. Since the data
size is so large, I got the Error:

Error: cannot allocate vector of size 5.4 Mb
In addition: Warning messages:
1: In mapply(*, X, P, SIMPLIFY = FALSE) :
  Reached total allocation of 16340Mb: see help(memory.size)
2: In mapply(*, X, P, SIMPLIFY = FALSE) :
  Reached total allocation of 16340Mb: see help(memory.size)
3: In mapply(*, X, P, SIMPLIFY = FALSE) :
  Reached total allocation of 16340Mb: see help(memory.size)
4: In mapply(*, X, P, SIMPLIFY = FALSE) :
  Reached total allocation of 16340Mb: see help(memory.size)

I'm using win7 and my computer has 16G memory. Is there a way that I can
make R use the memory more efficiently?

Thanks!
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plotting points to a map

2012-07-18 Thread Gary Dong
Dear R users,

I have a city map in shape file (polygon). I also have some points that I
hope to plot them to the city map. The only information I have about those
points are their relative longitude and latitude to the city center by
miles. Is there a way that R can help me to do this? Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting points to a map

2012-07-18 Thread Gary Dong
Thanks, Sara.

For your first question:

Geographic Coordinate System: GCS_North_American_1983_HARN

Projected Coordinate System:
NAD_1983_HARN_StatePlane_Oregon_North_FIPS_3601_Feet_Intl

I have imported the shp file in R by read.shaplefiles().

For your second question:
1) by relative long and lat, I mean distances to the city center from
south-north and east-west directions.
2) I know the coordinate of the reference point.

Thanks.

Best
Gary


On Wed, Jul 18, 2012 at 11:30 AM, Sarah Goslee sarah.gos...@gmail.comwrote:

 Hi,

 On Wed, Jul 18, 2012 at 2:11 PM, Gary Dong pdxgary...@gmail.com wrote:
  Dear R users,
 
  I have a city map in shape file (polygon). I also have some points that I
  hope to plot them to the city map.

 What projection is the shape file in? Have you successfully imported
 it into R yet?

 This page has a nice overview on dealing with shapefiles in R.

 http://www.nceas.ucsb.edu/scicomp/usecases/ReadWriteESRIShapeFiles


  The only information I have about those
  points are their relative longitude and latitude to the city center by
  miles. Is there a way that R can help me to do this? Thanks.

 What does relative longitude and latitude in miles mean? That makes
 no sense to me. Do you mean distance and direction? (Which wouldn't be
 latitude and longitude.)

 If the latter, do you know the coordinates of the reference point?

 I think we need more information to be able to help. You might also be
 better served by asking on the r-sig-geo list instead of the main
 R-help list.

 Sarah

 --
 Sarah Goslee
 http://www.functionaldiversity.org


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3-d kernel smooth by the kde function

2012-07-18 Thread Gary Dong
Dear R community,

I'm having hard time to understand the kde function in ks package. Let me
use a 3-dimensional kernel smooth example to explain my question using the
elevation data in geoR.

### here is what I did ###

library(ks)
require(geoR)
data(elevation)
elev.df - data.frame(x = elevation$coords[,x], y =
elevation$coords[,y], z = elevation$data)
H-Hpi(elev.df)
elev.smt-kde(elev.df,H=H)
plot(elev.smt, drawpoints=TRUE)

If I understand it correctly, with the kde function, I'm using the two
coordinate variables x and y to estimate (or say smooth) elevation (z). Is
this correct?

With this kernel smooth, my goal is to identify peaks, which are defined as
areas that have estimated elevations =950. Can someone show me how to do
this?

Thanks!


Best
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Kernel smoothing (ks package)

2012-07-17 Thread Gary Dong
Dear R users,

I'm trying to determine local population centers in a region through kernel
smoothing. What I have is population density in each neighborhood, which
can be presented as point density. So in my data set “pop”, I have three
columns for about 1000 neighborhoods: x (x coordinate), y (y coordinate),
and z (population density at xy). I searched R documentations and found
that the kde function in the ks package might be the one I'm looking
for.

To illustrate what I have done, I use the elevation data in GeoR:

# Kernel smooth 

require(geoR)

data(elevation)

elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 *
elevation$coords[,y], z = 10 * elevation$data)

H-Hpi(elev.df)

popcnt-kde(elev.df,H=H)

plot(popcnt, axes=TRUE, box=TRUE, drawpoints=TRUE)

 To confirm this is what I want, I also created Loess density surface


require(geoR)

data(elevation)

elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 *
elevation$coords[,y], z = 10 * elevation$data)

elev.loess - loess(z ~ x + y, data = elev.df, degree = 2, span = 0.15)

grid.x - seq(10, 300, 1)
grid.y - seq(10, 300, 1)
grid.mar - list(x=grid.x, y=grid.y)
elev.fit-expand.grid(grid.mar)

z.pred - predict(elev.loess, newdata = elev.fit)

persp(seq(10, 300, 1), seq(10, 300, 1), z.pred, phi = 45, theta = 45)

#

After comparing the two, I felt I did something wrong with the kernel
smooth, but could not figure out the right way to do it. Your help would be
greatly appreciated!

Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identifying local maxima

2012-07-16 Thread Gary Dong
Thanks, Hans. Much appreciated!

I'm also wondering, if a population center (local maxima) is defined as
a contiguous cluster of neighborhoods with population density that is
significantly higher than the neighborhoods surrounding it, will things
become easier? I mean, the identification of local maxima. Thanks.

Gary


On Thu, Jul 12, 2012 at 5:26 AM, Hans W Borchers
hwborch...@googlemail.comwrote:

 Gary Dong pdxgary163 at gmail.com writes:
 
  Dear R users,
 
  I have created a Loess surface in R, in which x is relative longitude by
  miles, y is relative latitude by miles, and z is population density at
 the
  neighborhood level. The purpose is to identify some population centers in
  the region. I'm wondering if there is a way to determine the coordinates
  (x,y) of each center, so I can know exactly where they are.
 
  Let me use the elevation data set in geoR to illustrate what have done:
 
  require(geoR)
 
  data(elevation)
 
  elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 *
  elevation$coords[,y], z = 10 * elevation$data)
 
  elev.loess - loess(z ~ x * y, data = elev.df, degree = 2, span = 0.1)
 
  grid.x - seq(10, 300, 1)
  grid.y - seq(10, 300, 1)
  grid.mar - list(x=grid.x, y=grid.y)
  elev.fit-expand.grid(grid.mar)
 
  z.pred - predict(elev.loess, newdata = elev.fit)
 
  persp(seq(10, 300, 5), seq(10, 300, 5), emp.pred, phi = 45, theta = 45,
  xlab = X Coordinate (feet), ylab = Y Coordinate (feet), main =
 Surface
  elevation data)
 
  With this, what's the right way to determine the coordinates of the local
  maixma on the surface?

 If there are more local maxima, you probably need to start the optimization
 process from each point of your grid and see if you stumble into different
 maxima. Here is how to find the one maximum in your example.

 require(geoR); data(elevation)
 elev.df - data.frame(x = 50 * elevation$coords[,x],
   y = 50 *elevation$coords[,y],
   z = 10 * elevation$data)

 ##  Compute the surface on an irregular grid
 library(akima)
 foo - function(xy) with( elev.df,
 -interp(x, y, z, xy[1], xy[2], linear=FALSE, extrap=TRUE)$z )

 ##  Use Nelder-Mead to find a local maximum
 optim(c(150, 150), foo)
 # $par
 # [1] 195.8372  21.5866
 # $value
 # [1] -9705.536

 ##  See contour plot if this is the only maximum
 with(elev.df,
 {A - akima::interp(x, y, z, linear=FALSE)
  plot(x, y, pch=20, col=blue)
  contour(A$x, A$y, A$z, add=TRUE) })
 points(195.8372, 21.5866, col=red)

  Thanks.
 
  Gary

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identifying local maxima

2012-07-11 Thread Gary Dong
Dear R users,

I have created a Loess surface in R, in which x is relative longitude by
miles, y is relative latitude by miles, and z is population density at the
neighborhood level. The purpose is to identify some population centers in
the region. I'm wondering if there is a way to determine the coordinates
(x,y) of each center, so I can know exactly where they are.

Let me use the elevation data set in geoR to illustrate what have done:

require(geoR)

data(elevation)

elev.df - data.frame(x = 50 * elevation$coords[,x], y = 50 *
elevation$coords[,y], z = 10 * elevation$data)

elev.loess - loess(z ~ x * y, data = elev.df, degree = 2, span = 0.1)

grid.x - seq(10, 300, 1)
grid.y - seq(10, 300, 1)
grid.mar - list(x=grid.x, y=grid.y)
elev.fit-expand.grid(grid.mar)

z.pred - predict(elev.loess, newdata = elev.fit)

persp(seq(10, 300, 5), seq(10, 300, 5), emp.pred, phi = 45, theta = 45,
xlab = X Coordinate (feet), ylab = Y Coordinate (feet), main = Surface
elevation data)

With this, what's the right way to determine the coordinates of the local
maixma on the surface?

Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] calculating inflection point in mixed effect model

2012-06-21 Thread Gary Dong
Hi, dear R users,

I estimated a mixed-effects model with the lme function in R. In this
model, the relationship between my predictor x and the DV y follows
quadratic function. In other words, in the model, x has a positive
coefficient while x squared has a negative coefficient. I'm wondering how I
can calculate the inflection point, at which the impacts of x on y turns
positive into negative. Thanks!

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multi-threads in R

2012-06-17 Thread Gary Dong
Dear R users,

I'm wonder if there is a easy way to make R use multi-CPUs on my computer.
My computer has four CPUs but R uses only one. Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multi-threads in R

2012-06-17 Thread Gary Dong
Thanks for all replied.

I read the introduction of R parallel. Is it for loops only?

Gary


On Sun, Jun 17, 2012 at 10:04 AM, R. Michael Weylandt 
michael.weyla...@gmail.com michael.weyla...@gmail.com wrote:

 Take a look at the parallel package which ships with all current versions
 of R.

 Michael

 On Jun 17, 2012, at 11:39 AM, Gary Dong pdxgary...@gmail.com wrote:

  Dear R users,
 
  I'm wonder if there is a easy way to make R use multi-CPUs on my
 computer.
  My computer has four CPUs but R uses only one. Thanks.
 
  Gary
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Specifying spatial correlation Form in nmle

2012-06-12 Thread Gary Dong
Dear R users,

I'm applying a correlation structure in a mixed model (nmle function) to
control for spatial correlation between land parcels that are adjacent to
each other. I generated X,Y coordinates in ArcGIS for each land parcel and
used them in the correlation form like this:

test.exp-corExp(1, form = ~ X + Y)

test.exp- Initialize(test.exp,dataset)

However, the correlation Matrix generated does not look right. Here is a
portion of the matrix, in which all correlations are 0:

 corMatrix(pdxspc.exp)[1:5, 1:5]

 [,1] [,2] [,3] [,4] [,5]
[1,]10000
[2,]01000
[3,]00100
[4,]00010
[5,]00001

It seems I'm not using the X Y coordinates correctly in the form. I saw two
examples that used east and north in their form. But I do not quite
know what east and north are. Any help would be greatly appreciated.

Best
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] controlling spatial autocorrelation in linear regression models

2012-06-11 Thread Gary Dong
Thanks! This link is very helpful.

Best
Gary

On Mon, Jun 11, 2012 at 4:28 AM, D.Soudis d.sou...@rug.nl wrote:


 Hi,

 Maybe this is helpful..??

 http://www.ats.ucla.edu/stat/r/faq/spatial_regression.htm

 Best,
 Dimitrios

 --
 View this message in context:
 http://r.789695.n4.nabble.com/controlling-spatial-autocorrelation-in-linear-regression-models-tp4632940p4632998.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] controlling spatial autocorrelation in linear regression models

2012-06-10 Thread Gary Dong
Dear R users,

I'm estimating a hedonic land price model at the parcel level, basically a
linear regression model with land price as the dependent variable. I'm
wondering if there is a function in R that I can use to control
spatial auto-correlation among parcels that are adjacent to each other. I
know multi-level regression can solve the problem partially by grouping
parcels to a higher spatial level, such as block group or tract. However,
the grouping could be very arbitrary. Is there a better way to do this?
Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmer function in R

2012-06-09 Thread Gary Dong
Dear R users,

I'm estimating a two-level regresion model but having difficuly in finding
a good R syntax example.

Let me use an example to explain what I'm doing. The dependent variable is
math score (mathscore). Predictors are at two levels. At the student level,
they are household income (hinc) and IQ (IQ). At the school level, there is
a dummy variable indicating if the school is a private school (prvsch). I
also have school IDs (id) that can be used as the subject.

I'm wondering what the syntax would be in R. Any advice is appreciated.

Best
Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two Y axes (same scale) in ggplot2 or plot

2012-05-09 Thread Gary Dong
Thanks, Duncan.

I tried axis(). It appears it allows you to add an axis, but does not say
you can plot a second Y in the graph. Maybe I'm understanding it correctly.
Any help will be appreciated!

Gary


On Wed, May 9, 2012 at 7:26 AM, Duncan Murdoch murdoch.dun...@gmail.comwrote:

 On 08/05/2012 3:23 PM, Gary Dong wrote:

 Dear R users,

 I'm plotting housing prices in City A over past 30 years in ggplot2. The
 Xs
 are years since 1980. I have two housing price variables: new home prices
 and old home prices, both of them measured by $/sqft. I have searched
 related threads on multiple Y axes in ggplot2 and I understand that
 multiple Y axes in different scales are not possible. I'm wondering if it
 is possible to have multiple Y axes with the same scale in ggplot2, like
 in
 my case. If still not possible, is there a easy way to do it in R's
 default
 plot function? Thanks.


 In base graphics, you can have as many axes as you like, displaying
 anything.  Use the axis() function.  See ?axis for the arguments that
 determine placement, ticks, etc.

 I would guess the same flexibility is there in ggplot2, but I don't know
 how to do it.

 Duncan Murdoch


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] two Y Axes (in the same scale) in ggplot2

2012-05-08 Thread Gary Dong
Dear R users,

I'm plotting housing prices in City A over past 30 years in ggplot2. The Xs
are years since 1980. I have two housing price variables: new home prices
and old home prices, both of them measured by $/sqft. I have searched
related threads on multiple Y axes in ggplot2 and I understand that
multiple Y axes in different scales are not possible. I'm wondering if it
is possible to have multiple Y axes with the same scale in ggplot2, like in
my case. If still not possible, is there a easy way to do it in R's default
plot function? Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Two Y axes (same scale) in ggplot2 or plot

2012-05-08 Thread Gary Dong
Dear R users,

I'm plotting housing prices in City A over past 30 years in ggplot2. The Xs
are years since 1980. I have two housing price variables: new home prices
and old home prices, both of them measured by $/sqft. I have searched
related threads on multiple Y axes in ggplot2 and I understand that
multiple Y axes in different scales are not possible. I'm wondering if it
is possible to have multiple Y axes with the same scale in ggplot2, like in
my case. If still not possible, is there a easy way to do it in R's default
plot function? Thanks.

Gary

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.