At 08:31 AM 9/5/2003 -0500, Francisco J. Bido wrote:
Hi There,
While looking through the mailing list archive, I did not come across a simple-minded example regarding the creation of dummy variables. The Gauss language provides the command "y = dummydn(x,v,p)" for creating dummy variables.
Here:
x = Nx1 vector of data to be broken up into dummy variables.
v = (K-1)x1 vector specifying the K-1 breakpoints
p = positive integer in the range [1,K], specifying which column should be dropped in the matrix of dummy variables.
y = Nx(K-1) matrix containing the K-1 dummy variables.
My recent mailing list archive inquiry has led me to examine R's "model.matrix", but it has so many options that I can't see the forest for the trees. Is that really the easiest way? Or is there something similar to the dummydn command described above?
To provide a concrete scenario, please consider the following. Using the above notation, say, I had:
x <- c(1:10)    #data to be broken up into dummy variables
v <- c(3,5,7)   #breakpoints
p = 1           #drop this column to avoid dummy variable trap
How can I get a matrix "y" that has the associated dummy variables for columns?
Thank You,
-Francisco
My initial question would be: why do you want to do this? Statistical-model formulas in R implicitly generate dummy variables (and other contrasts) directly from factors, so if this is the context that you had in mind, there's no need to generate the dummy variables explicitly.
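As a minimal sketch of that point: the factor "g" and the response "resp" below are made up purely for illustration and are not part of the original question. Fitting a linear model with a factor on the right-hand side produces one coefficient per non-baseline level, without any dummy variables being built by hand.

```r
set.seed(1)                                   # reproducible made-up data
g    <- factor(rep(c("a", "b", "c"), each = 4))  # hypothetical 3-level factor
resp <- rnorm(12)                             # hypothetical response

# lm() expands the factor into 0/1 dummy regressors internally;
# level "a" is the baseline under the default treatment contrasts
fit <- lm(resp ~ g)
names(coef(fit))   # "(Intercept)" "gb" "gc"
```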
If you really do want the matrix of dummy regressors, say for a factor named "factor," then you can use model.matrix() to get them. Because the default contrast type for unordered factors is "contr.treatment", which corresponds to 0/1 dummy regressors, you can get the dummy variables as model.matrix(~factor)[,-1]. Here I've removed the initial column of ones returned by model.matrix(). Alternatively, model.matrix(~ factor - 1) gives you a complete set of dummy regressors; you could then drop whichever column you wanted to.
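Applied to the scenario in the question, this might look like the sketch below. It assumes that the breakpoints define right-closed intervals (x <= 3, 3 < x <= 5, 5 < x <= 7, x > 7); whether that matches dummydn exactly is an assumption. cut() is used here only to turn the numeric vector into a factor, and the names "f", "full", "y" and "y2" are illustrative.

```r
x <- 1:10          # data to be broken up into dummy variables
v <- c(3, 5, 7)    # breakpoints
p <- 1             # column to drop

# Bin x into K = length(v) + 1 categories (right-closed intervals assumed)
f <- cut(x, breaks = c(-Inf, v, Inf))

# Complete set of K dummy regressors, then drop column p
full <- model.matrix(~ f - 1)
y    <- full[, -p, drop = FALSE]   # N x (K-1) matrix of 0/1 dummies

# Dropping the first category is also what the default treatment
# contrasts give once the column of ones is removed
y2 <- model.matrix(~ f)[, -1]
```

Changing p selects a different baseline category; the second form always uses the first level as the baseline.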
More generally, if you haven't already done so, you might want to read about how linear-model formulas are implemented in R. All of the introductions to R cover this topic. I think that this is one of the strengths of the S language, by the way.
I hope that this helps,
 John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: [EMAIL PROTECTED]
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
