Re: [R] why does lm() not allow for negative weights?
Thanks Duncan Murdoch,

> > Why do commonly used estimator functions (such as lm(),
> > glm(), etc.)
> > not allow negative case weights?
>
> Residual sums of squares (or deviances) could be negative
> with negative case weights. This doesn't seem like a good
> thing: would you really want the fit to be far from those points?

Yes, this is actually what I want for this particular estimator. But I can see now why this generally doesn't seem like a good idea.

Best,
Jens

> -----Original Message-----
> From: Duncan Murdoch [mailto:[EMAIL PROTECTED]
> Sent: Friday, August 04, 2006 7:36 PM
> To: Jens Hainmueller
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] why does lm() not allow for negative weights?
>
> On 8/4/2006 1:26 PM, Jens Hainmueller wrote:
> > Dear List,
> >
> > I suspect that there is a good reason for this.
> > Yet, I can see reasonable cases when one wants to use
> > negative case weights.
> >
> > Take lm() for example:
> >
> > ###
> >
> > n <- 20
> > Y <- rnorm(n)
> > X <- cbind(rep(1,n),runif(n),rnorm(n))
> > Weights <- rnorm(n) # Includes Pos and Neg Weights
> > Weights
> >
> > # Now do Weighted LS and get beta coeffs:
> > b <- solve(t(X)%*%diag(Weights)%*%X) %*% t(X) %*% diag(Weights)%*%Y
>
> That formula does not necessarily give least squares
> estimates in the case where weights might be negative. For
> example, with a single observation y, a single parameter mu,
> design matrix X = 1, and weight -1, that formula becomes
>
> b <- y,
>
> but that is the worst possible estimator in a least squares
> sense. The residual sum of squares can be made arbitrarily
> large and negative by setting b to a large value.
>
> Duncan Murdoch
>
> > b
> >
> > # This seems like a valid model, but when I try
> > lm(Y ~ X[,2:3],weights=Weights)
> >
> > # I get: "missing or negative weights not allowed"
> >
> > ###
> >
> > What is the rationale for not allowing negative weights? I ask this,
> > because I am currently trying to implement a (two stage) estimator
> > into R that involves negative case weights. Weights are generated in
> > the first stage, so it would be nice if I could use canned functions
> > such as
> > lm(,weights=Weights) in the second stage.
> >
> > Thank you for your help.
> >
> > Best,
> > Jens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
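Duncan Murdoch's single-observation argument can be checked numerically. The following is a minimal sketch, assuming an arbitrary value y = 3 (not from the thread); `wrss` and `b_formula` are names introduced here for illustration only.

```r
# Single observation, single parameter, design matrix X = 1, weight -1.
y <- 3                 # arbitrary illustrative value (assumption)
w <- -1
X <- matrix(1)

# Weighted residual sum of squares as a function of the coefficient b.
wrss <- function(b) sum(w * (y - X %*% b)^2)

# The closed-form expression from the post, with diag(Weights) reduced to w.
b_formula <- solve(t(X) %*% (w * X)) %*% t(X) %*% (w * y)

as.numeric(b_formula)  # equals y
wrss(b_formula)        # 0, the largest value the criterion can take
wrss(10)               # -49
wrss(1000)             # far more negative: the criterion has no minimum
```

With a negative weight the closed-form solution sits at the maximum of the weighted criterion, which is presumably why the fitting functions refuse such weights.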
[R] why does lm() not allow for negative weights?
Dear List,

Why do commonly used estimator functions (such as lm(), glm(), etc.) not allow negative case weights? I suspect that there is a good reason for this. Yet, I can see reasonable cases when one wants to use negative case weights.

Take lm() for example:

###
n <- 20
Y <- rnorm(n)
X <- cbind(rep(1,n),runif(n),rnorm(n))
Weights <- rnorm(n) # Includes Pos and Neg Weights
Weights

# Now do Weighted LS and get beta coeffs:
b <- solve(t(X)%*%diag(Weights)%*%X) %*% t(X) %*% diag(Weights)%*%Y
b

# This seems like a valid model, but when I try
lm(Y ~ X[,2:3],weights=Weights)
# I get: "missing or negative weights not allowed"
###

What is the rationale for not allowing negative weights? I ask this, because I am currently trying to implement a (two stage) estimator into R that involves negative case weights. Weights are generated in the first stage, so it would be nice if I could use canned functions such as lm(,weights=Weights) in the second stage.

Thank you for your help.

Best,
Jens
[R] sensitivity tests for causal inference
Hi all,

Following up on Holger's email last week: Does anyone know if there exists a library that implements the sensitivity tests for hidden bias for matched pairs and unmatched groups as proposed in Rosenbaum's Observational Studies (2002: ch.4)?

Thanks.

Best,
Jens
[R] Optim with two constraints
Hi R-list,

I am new to optimization in R and would appreciate help on the following question. I would like to minimize the following function using two constraints:

##
fn <- function(par,H,F){
  fval <- 0.5 * t(par) %*% H %*% par + F%*% par
  fval
}
# matrix H is (n by k)
# matrix F is (n by 1)
# par is a (n by 1) set of weights
# I need two constraints:
# 1. each element in par needs to be between 0 and 1
# 2. sum(par)=1 i.e. the elements in par need to sum to 1
##

I try to use optim:

res <- optim(c(runif(16)), fn, method="L-BFGS-B", H=H, F=f, control=list(fnscale=-1), lower=0, upper=1)
##

If I understand this correctly, using L-BFGS-B with lower=0 and upper=1 should take care of constraint 1 (box constraints). What I am lacking is the skill to include constraint no. 2. I guess I could solve this by reparametrization, but I am not sure how exactly. I could not find (i.e. wasn't able to infer) the answer to this in the archives despite the many comments on optim and constrained optimization (sorry if I missed it there).

I am using version 2.1.1 under Windows XP.

Thank you very much.

Jens
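One way the reparametrization mentioned above could look is a softmax transform, which maps an unconstrained vector to weights that lie in (0, 1) and sum to 1, so plain `optim()` can be used. This is only a sketch under assumptions: `H` is taken to be a square, positive definite n by n matrix, the linear term is called `Fvec` to avoid masking R's `F`, and both are simulated here rather than taken from the post. `to_weights` is a helper name introduced for illustration.

```r
# Objective from the post: 0.5 * w' H w + F' w, returned as a plain number.
fn <- function(w, H, Fvec) {
  as.numeric(0.5 * t(w) %*% H %*% w + crossprod(Fvec, w))
}

# Softmax: any real vector theta -> weights in (0, 1) that sum to 1.
to_weights <- function(theta) {
  e <- exp(theta - max(theta))  # subtracting the max avoids overflow
  e / sum(e)
}

set.seed(1)
n <- 16
A <- matrix(rnorm(n * n), n, n)
H <- crossprod(A) + diag(n)     # hypothetical positive definite H (assumption)
Fvec <- rnorm(n)                # hypothetical linear term (assumption)

# Optimize over the unconstrained theta; theta = 0 means equal weights.
res <- optim(
  par = rep(0, n),
  fn = function(theta) fn(to_weights(theta), H, Fvec),
  method = "BFGS"
)
w_hat <- to_weights(res$par)

sum(w_hat)    # 1 up to rounding error
range(w_hat)  # inside [0, 1]
```

A limitation of this design is that softmax weights can only approach 0, never reach it exactly, so a solution on the boundary is found only in the limit. Since the objective is quadratic with linear constraints, a dedicated quadratic programming solver (for example `solve.QP` in the quadprog package) is likely the more direct tool.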
[R] local average
Hello,

probably this isn't hard, but I can't get R to do this. Thanks for your help!

Assume I have a matrix of two covariates:

n<- 1000
Y<- runif(n)
X<- runif(n,min=0,max=100)
data <- cbind(Y,X)

Now, I would like to compute the local average of Y for each X interval 0-1, 1-2, 2-3, ... 99-100. In other words, I would like to obtain 100 (local) Ybars, one for each X interval with width 1.

Also, I would like to do the same but instead of local means of Y obtain local medians of Y for each X interval.

Best,
Jens
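A possible approach, sketched here with base R only, is to bin X with `cut()` and then summarize Y within each bin using `tapply()`. The seed and the names `bins`, `local_means` and `local_medians` are choices made for this illustration.

```r
set.seed(42)
n <- 1000
Y <- runif(n)
X <- runif(n, min = 0, max = 100)

# Assign each X to one of the 100 unit-width intervals [0,1], (1,2], ..., (99,100].
bins <- cut(X, breaks = 0:100, include.lowest = TRUE)

# One summary of Y per interval; an interval with no observations yields NA.
local_means   <- tapply(Y, bins, mean)
local_medians <- tapply(Y, bins, median)

head(local_means)
```

Both results are named vectors of length 100, with the interval labels as names.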
[R] help with multiple imputation using imp.mix
I am desperately trying to impute missing data using 'imp.mix' but always run into this yucky error message to which I cannot find the solution. It's the first time I am using mix and I'm trying really hard to understand, but there's just this one step I don't get... perhaps someone knows the answer?

Thanks!
Jens

My code runs:

data<-read.table('http://www.courses.fas.harvard.edu/~gov2001/Data/immigration.dat',header=TRUE)
library(mix)
rngseed(12345678)

# Prepare data for imputation
gender1<-c()
gender1<-as.integer(data$gender)
gender1[gender1==1]<-2
gender1[gender1==0]<-1
data$gender<-gender1
x<-cbind(data$gender,data$ipip,data$ideol,data$prtyid, data$wage1992)
colnames(x)<-c("gender","ipip", "ideol", "prtyid","wage")

# start imputation
s <- prelim.mix(x,4)
thetahat <- em.mix(s)

And here comes the error message:

> newtheta <- da.mix(s,thetahat, steps=100,showits=TRUE)
Steps of Data Augmentation:
1...Error in da.mix(s, thetahat, steps = 100, showits = TRUE) :
        Improper posterior--empty cells
> imp.mix(s, newtheta, x)
[R] control of font size & colour for title, subtitles, axis, and tick marks in LATTICE graph
Hi,

I very much appreciate any help on this "fine tuning" problem in a lattice graph (I am new to LATTICE and could not find an example in the help files that worked for me. My apologies if I missed it there).

I am running the following box plots to compare conditional distributions of x at different levels of y under two treatment conditions, ID=1 (upper panel) & ID=0 (lower panel of the plot).

bwplot(HF.ELECYEAR ~ stparvotech | ID , data=data,
       aspect=1, layout=c(1,2),
       xlab="Changes in Party Vote Shares", xlim=(-20:20),
       ylab="Periods Following Last Federal Election",
       main="Divided Government",
       panel = function(x,y) {
         panel.bwplot(x,y)
         panel.abline(v=0, col="red")
       }
)

How can I:

1. Control the font size of the main title, the panel titles, the axis, and the tick marks? The usual cex.main=1/3, cex.xlab etc. do not work.
2. Change the color of the boxes where the panel titles (ID=1 & ID=0) are located?

Thank you very much.

Best,
Jens
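In lattice, text sizes and colours are generally passed as lists rather than through base-graphics parameters such as `cex.main`. The sketch below shows one way this might be done; the data frame `d` and its columns `period`, `change` and `ID` are simulated stand-ins for the poster's data, and the sizes and colours are arbitrary.

```r
library(lattice)

set.seed(1)
# Simulated stand-in for the poster's data (assumption, not the real data).
d <- data.frame(
  period = factor(rep(1:4, each = 50)),
  change = rnorm(200, sd = 5),
  ID     = factor(rep(c(0, 1), times = 100))
)

p <- bwplot(period ~ change | ID, data = d,
  layout = c(1, 2),
  # Titles and axis labels accept a list with label, cex, col, ...
  main = list(label = "Divided Government", cex = 1.5, col = "darkblue"),
  xlab = list(label = "Changes in Party Vote Shares", cex = 1.2),
  ylab = list(label = "Periods Following Last Federal Election", cex = 1.2),
  # Tick labels on the axes.
  scales = list(cex = 0.8, col = "black"),
  # Text in the strips that carry the panel titles.
  par.strip.text = list(cex = 1.1, col = "black"),
  # Background colour of the strips.
  par.settings = list(strip.background = list(col = "lightyellow")),
  panel = function(x, y, ...) {
    panel.bwplot(x, y, ...)
    panel.abline(v = 0, col = "red")
  })

# Printing the object (print(p), or typing p at the prompt) draws the plot.
```

Keeping the plot in an object and printing it explicitly also makes it usable inside functions and loops, where lattice plots are not drawn automatically.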
[R] vector extraction
Hello,

I could use some help on this one:

From the data.frame "Test.dataset2" below (TSCS data for 151 "countries.to.map" for "year" 1973-95; each "country.to.map" is described by a unique code), I would like to extract a vector "color" that for each "country.to.map" takes on the value of "dv" (a categorical variable with values 1,2,..4) for a specified "year". Thus, for a specified "year", "color" should have 151 obs (one for each "country.to.map", represented by its respective value of "dv").

I tried this:

> color <- dv[country.to.map][(year == 1980)[country.to.map]]

but it does not give me what I need. I can't figure out where my error is, however.

Thanks,
Jens

> Test.dataset2[1:40,]
   country.to.map dv year
1            1936 NA 1973
2            1936 NA 1974
3            1936 NA 1975
4            1936 NA 1976
5            1936 NA 1977
6            1936 NA 1978
7            1936 NA 1979
8            1936 NA 1980
9            1936 NA 1981
10           1936 NA 1982
11           1936 NA 1983
12           1936 NA 1984
13           1936 NA 1985
14           1936 NA 1986
15           1936 NA 1987
16           1936 NA 1988
17           1936  4 1989
18           1936  4 1990
19           1936  4 1991
20           1936  4 1992
21           1936  4 1993
22           1936  4 1994
23           1936  4 1995
24             56  4 1973
25             56  4 1974
26             56  2 1975
27             56  2 1976
28             56  2 1977
29             56  2 1978
30             56  2 1979
31             56  2 1980
32             56  2 1981
33             56  2 1982
34             56  4 1983
35             56  4 1984
36             56  4 1985
37             56  4 1986
38             56  4 1987
39             56  4 1988
40             56  4 1989
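One way to get such a vector is to subset the rows for the chosen year first and then read off "dv", rather than indexing "dv" by the country codes. A minimal sketch follows, using a six-row excerpt of the values shown above; the name `sel` is introduced here for illustration.

```r
# Six rows taken from the listing above (two countries, three years each).
Test.dataset2 <- data.frame(
  country.to.map = rep(c(1936, 56), each = 3),
  dv             = c(NA, NA, 4, 2, 2, 4),
  year           = rep(c(1979, 1980, 1989), times = 2)
)

# Keep only the rows for the year of interest: one row per country.
sel <- Test.dataset2[Test.dataset2$year == 1980, ]

# The dv values for that year, labelled by country code.
color <- sel$dv
names(color) <- sel$country.to.map
color
```

On the full data set the same two steps would give a vector with one entry per country, in the order the countries appear in the data frame.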
[R] drawing filled countries according to data using map('world')? - follow up
Hello,

this is a follow up on my previous inquiry regarding the use of the map library (Becker and Wilks 1993). Using the 'world' database I would like to draw filled countries in a world map so that the fill color of each country corresponds to the value of a policy variable "fix.float" at a specific "year" (the goal is to visualize a policy diffusion pattern over time using different maps for year=1985, 1990, etc.).

In my dataset [Test] I have created a vector 'map.name' that contains country names that I have made identical to the country names in file world.N in .../library/maps/mapdata/.

> Test[1:10,]
   region fix.float wbcode  name year dv dv.lag map.name polygon
1     lac        NA    ABW Aruba 1973 NA     NA    Aruba    1936
2     lac        NA    ABW Aruba 1974 NA     NA    Aruba    1936
3     lac        NA    ABW Aruba 1975 NA     NA    Aruba    1936
4     lac        NA    ABW Aruba 1976 NA     NA    Aruba    1936
5     lac        NA    ABW Aruba 1977 NA     NA    Aruba    1936
6     lac        NA    ABW Aruba 1978 NA     NA    Aruba    1936
7     lac        NA    ABW Aruba 1979 NA     NA    Aruba    1936
8     lac        NA    ABW Aruba 1980 NA     NA    Aruba    1936
9     lac        NA    ABW Aruba 1981 NA     NA    Aruba    1936
10    lac        NA    ABW Aruba 1982 NA     NA    Aruba    1936

Now I would like to translate the country names in the 'world' database to the country names in my dataset (following Becker and Wilks 1993). For some reason, the translation does not work.

> map.country<- map(database = "world", names=T,plot=F)
> state.to.map <- match(map.name,map.country)
> color <- dv[state.to.map, year == 1980]
Error in dv[state.to.map, year == 1980] : incorrect number of dimensions
> color <- dv[state.to.map, year == 1980]/100
Error in dv[state.to.map, year == 1980] : incorrect number of dimensions

What am I doing wrong? (There are a few values missing in "fix.float".)

Thanks for your help!

Best,
Jens Hainmueller
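The "incorrect number of dimensions" error is what R reports when a plain vector is indexed with two subscripts, as in `dv[i, j]`; that form only works for a matrix such as the states-by-years table in the Becker and Wilks example. With country-year rows, one possible route is to select the year first and then match each polygon to a row. The sketch below assumes a hypothetical vector of polygon names standing in for what `map(..., plot=F)` returns; the polygon names and the "dv" values are invented for illustration.

```r
# Small country-year data set in the same long layout as Test (values invented).
Test <- data.frame(
  map.name = rep(c("Aruba", "Belgium"), each = 2),
  year     = rep(c(1980, 1985), times = 2),
  dv       = c(NA, 4, 2, 4)
)

# Hypothetical polygon names; a real map database would supply these.
map.country <- c("Belgium", "Aruba", "Belgium:hypothetical island")

# Step 1: one row per country for the chosen year.
one.year <- Test[Test$year == 1980, ]

# Step 2: reduce each polygon name to its country, then look up that
# country's row. The polygons come first in match(), so the result has
# one entry per polygon, in map order.
polygon.country <- sub(":.*", "", map.country)
country.to.map  <- match(polygon.country, one.year$map.name)

# Step 3: one dv value per polygon; NA where the data are missing.
color <- one.year$dv[country.to.map]
color
```

The result has one entry per polygon, so every piece of a country receives the same value, which is the property the Becker and Wilks translation vector is meant to provide.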
[R] drawing filled countries according to data using map('world')?
Hello,

I am looking for somebody who has experience with the map library (Becker and Wilks 1993) and might be able to help me with the following problem:

Using the 'world' database I would like to draw filled countries in a world map so that the fill color of each country corresponds to the value of a policy variable X at time t (the goal is to visualize a policy diffusion pattern over time using different maps for t=1985, 1990, etc.).

In their explanatory note, Becker and Wilks show how to accomplish this with the 'states' database, for filling US states with color according to the republican vote in 1900.

> state.names <- unix(tr "[A-Z]" "[a-z]", state.name)
> map.states <- unix(sed "s/:.*//", map(names=T, plot=F))
> state.to.map <- match(map.states, state.names)
> color <- votes.repub[state.to.map, votes.year == 1900] / 100
> map(state, fill=T, col=color)
> map(state, add=T)

"The first expression changes uppercase to lowercase in the standard S dataset giving state names, so that these can be compared with the names returned by map. Next the complete set of state polygon names is requested (using map(names=T,plot=F); the default database is state) and the trailing portions (from the : onwards) are removed so that we have a list of the state for which each polygon is a part or the whole. Then we create state.to.map that gives the translation from the ordering of the states known to S (alphabetical) to the ordering known to the mapping mechanism. By using this vector, as in the next expression, all the pieces of a state will be colored the same color. The state.to.map vector is a useful one to keep around, for it will work in any context where the ordering of the state data is as here. Notice that unless such a vector is being reused, it will usually be the case that there will be a step like this one, finding the translation between the ordering for the regions in your data and the ordering according to map. In general, the translation will have to be computed each time the set of selected polygons changes."

My question then is, how to compute a similar procedure using the 'world' database. Specifically, how can I access the country names in the 'world' database to accomplish the translation to the country names in my dataset? Is there any way to unpack the 'world' database to do the matching in an external program? And does anybody know of other (more recent) world maps that I could use?

Thanks very much!

Best,
Jens
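The first three expressions of the quoted S code rely on `unix()` calls to `tr` and `sed`; in R the same steps can be sketched with `tolower()` and `sub()`. The polygon names below are an invented sample in the "region:subregion" style rather than the contents of any map database, and `value` is a made-up per-state quantity, so this only illustrates the translation step.

```r
# Invented sample of polygon names (assumption: a real database supplies these).
polygon.names <- c("alabama", "massachusetts:main",
                   "massachusetts:nantucket", "new york:manhattan")

state.names  <- tolower(state.name)            # built-in vector of 50 state names
map.states   <- sub(":.*", "", polygon.names)  # drop everything from ":" onwards
state.to.map <- match(map.states, state.names) # polygon order -> data order

# Made-up values in state.name order (not real vote shares).
value <- seq_along(state.name) / 50

# One grey level per polygon; pieces of the same state share a colour.
color <- gray(1 - value[state.to.map])

state.name[state.to.map]
```

For a world map the same pattern would presumably apply, with the country names in the data playing the role of `state.names`.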