[R] creating dummy variables based on conditions

2013-07-14 Thread Anup Nandialath
Hello everyone,

I have a dataset which includes the first three variables from the demo
data below (year, id and var). I need to create the new variable ans as
follows

For each year, I need to create a new dummy variable ans that takes the value
1 for every row of an id that has at least one var=1 recorded in that year, and
0 otherwise. Sample data with the desired output is shown below.

     year id var ans
[1,] 2010  1   1   1
[2,] 2010  2   0   0
[3,] 2010  1   0   1
[4,] 2010  1   0   1
[5,] 2011  2   1   1
[6,] 2011  2   0   1
[7,] 2011  1   0   0
[8,] 2011  1   0   0

Any help on how to achieve this is much appreciated.

Thanks
Anup

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating dummy variables based on conditions

2013-07-14 Thread Rui Barradas

Hello,

Your data seems to be of class 'matrix'. The following code needs it to 
be a data.frame.


dat <- as.data.frame(mat)  # 'mat' stands for your input matrix

res <- do.call(rbind, lapply(split(dat, list(dat$id, dat$year)),
function(x){
    # the whole id/year group gets 1 if any of its rows has var == 1
    x$ans <- if(any(x$var == 1)) 1 else 0
    x}))
rownames(res) <- NULL
res
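
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the split/lapply approach above. The matrix name mat is only a placeholder, the values are the sample data from the original post, and drop = TRUE is an addition (not in the code above) so that id/year combinations that never occur do not produce empty pieces.

```r
# Sample data from the original post, as a matrix ('mat' is a placeholder name)
mat <- cbind(year = c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011),
             id   = c(1, 2, 1, 1, 2, 2, 1, 1),
             var  = c(1, 0, 0, 0, 1, 0, 0, 0))
dat <- as.data.frame(mat)

# Split by id/year, flag each group, then stack the pieces again.
# drop = TRUE skips id/year combinations that never occur.
pieces <- split(dat, list(dat$id, dat$year), drop = TRUE)
res <- do.call(rbind, lapply(pieces, function(x) {
  x$ans <- if (any(x$var == 1)) 1 else 0
  x
}))
rownames(res) <- NULL
res  # rows come back grouped by year and id, not in the input order
```

Note that the rows of res are reordered by group, so compare against the expected ans per id/year rather than row by row with the original matrix.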


Hope this helps,

Rui Barradas

On 14-07-2013 12:30, Anup Nandialath wrote:



Re: [R] creating dummy variables based on conditions

2013-07-14 Thread arun
Hi,
You could try this: (if I understand it correctly)
dat1<- read.table(text="
year id var ans
2010  1   1   1
2010  2   0   0
2010  1   0   1
2010  1   0   1
2011  2   1   1
2011  2   0   1
2011  1   0   0
2011  1   0   0
",sep="",header=TRUE,stringsAsFactors=FALSE)

dat1$newres<-with(dat1,ave(var,id,year,FUN=function(x) any(x==1)*1))
dat1
#  year id var ans newres
#1 2010  1   1   1      1
#2 2010  2   0   0      0
#3 2010  1   0   1      1
#4 2010  1   0   1      1
#5 2011  2   1   1      1
#6 2011  2   0   1      1
#7 2011  1   0   0      0
#8 2011  1   0   0      0
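
A short sketch of what ave() is doing here, in case it helps: it splits var by the id/year combination, applies FUN to each piece, and puts the results back in the original row order. The tapply() version below is only an illustration of the same grouping logic; the column name ans2 is made up.

```r
dat1 <- data.frame(year = rep(c(2010, 2011), each = 4),
                   id   = c(1, 2, 1, 1, 2, 2, 1, 1),
                   var  = c(1, 0, 0, 0, 1, 0, 0, 0))

# ave(): one call, result already aligned with the rows of dat1
dat1$newres <- with(dat1, ave(var, id, year, FUN = function(x) any(x == 1) * 1))

# The same logic spelled out: one TRUE/FALSE per id/year group,
# then looked up again for every row
grp  <- interaction(dat1$id, dat1$year, drop = TRUE)
flag <- tapply(dat1$var == 1, grp, any)
dat1$ans2 <- as.integer(flag[as.character(grp)])

dat1
```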

A.K.

- Original Message -
From: Anup Nandialath anupme...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Sunday, July 14, 2013 7:30 AM
Subject: [R] creating dummy variables based on conditions



Re: [R] creating dummy variables based on conditions

2013-07-14 Thread Anup Nandialath
hi Arun,

Thanks for this. This solution works great.

Kind regards

Anup


On Sun, Jul 14, 2013 at 8:07 PM, arun smartpink...@yahoo.com wrote:



Re: [R] creating dummy variables

2013-04-21 Thread Santosh
R is for dummies (like me, but I don't use dummy variables) or for the
non-dummies like all the experts who help us all the time... so dummy
variables are not needed! :)
QED...


On Sat, Apr 20, 2013 at 6:16 PM, Rolf Turner rolf.tur...@xtra.co.nz wrote:



[R] creating dummy variables

2013-04-20 Thread shyam basnet
Hello R-users,
  
The below is a snippet of my data:
 
  
fid  crop  year  value   
5_1_1  SWHE  1995  171   
5_1_1  SWHE  1997  696   
5_1_1  BARL  1996  114   
5_1_1  BARL  1997  344   
5_2_2  SWHE  1995  120   
5_2_2  SWHE  1996  511   
5_2_2  BARL  1996  239   
5_2_2  BARL  1997  349   
 
Here, I want to create dummy variables with the names of the content of a 
column 'crop' in a way that the new variable 'SWHE' would receive a value of 1 
if the column 'crop' contains 'SWHE' and 0 otherwise. So, I would have two new 
variables SWHE and BARL as below:
 
  
fid  crop  year  value  SWHE  BARL   
5_1_1  SWHE  1995  171  1  0   
5_1_1  SWHE  1997  696  1  0   
5_1_1  BARL  1996  114  0  1   
5_1_1  BARL  1997  344  0  1   
5_2_2  SWHE  1995  120  1  0   
5_2_2  SWHE  1996  511  1  0   
5_2_2  BARL  1996  239  0  1   
5_2_2  BARL  1997  349  0  1   
 
 
Cheers,
Shyam
Nepal


Re: [R] creating dummy variables

2013-04-20 Thread Bert Gunter
Dummy variables are not needed in R.

Bert

Sent from my iPhone -- please excuse typos.

On Apr 20, 2013, at 11:23 AM, shyam basnet shyamabc2...@yahoo.com wrote:



Re: [R] creating dummy variables

2013-04-20 Thread Patrick Coulombe
Hello Shyam,

if your data is stored in variable dataset, for example, the following
code will create the desired dummy-coded variables and attach them to the
dataset:


##

#init vars
SWHE <- BARL <- numeric(nrow(dataset)) #initialize dummy-coded vars with all 0s

#fill in variables
#grep returns the indices where a match is found, see ?grep
SWHE[grep("SWHE", dataset$crop)] <- 1
BARL[grep("BARL", dataset$crop)] <- 1

#attach new dummy codes to dataset
dataset$SWHE <- SWHE
dataset$BARL <- BARL

##
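
A more compact variation on the same idea, as a sketch only. The data frame name dataset follows the code above and the values are those from the original post; looping over unique() and testing with == (exact match rather than the pattern match that grep does) is just one way to do it.

```r
dataset <- data.frame(
  fid   = rep(c("5_1_1", "5_2_2"), each = 4),
  crop  = rep(c("SWHE", "SWHE", "BARL", "BARL"), times = 2),
  year  = c(1995, 1997, 1996, 1997, 1995, 1996, 1996, 1997),
  value = c(171, 696, 114, 344, 120, 511, 239, 349),
  stringsAsFactors = FALSE
)

# One 0/1 column per distinct crop: the comparison gives TRUE/FALSE,
# and as.integer() turns that into 1/0
for (lev in unique(dataset$crop)) {
  dataset[[lev]] <- as.integer(dataset$crop == lev)
}
dataset
```

This scales to any number of crops without writing one line per level.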



Hope this helps,

Patrick


2013/4/20 shyam basnet shyamabc2...@yahoo.com



Re: [R] creating dummy variables

2013-04-20 Thread David Winsemius

On Apr 20, 2013, at 2:03 PM, Bert Gunter wrote:

 Dummy variables are not needed in R.
 
 Bert
 

Bert is correct on this point, but if you want to know how the regression 
functions in R do this behind the scenes then you could always look at:

?model.matrix # where _some_ of the automagical stuff happens

> model.matrix( ~ crop, data=dat[,"crop", drop=FALSE])
  (Intercept) cropSWHE
1           1        1
2           1        1
3           1        0
4           1        0
5           1        1
6           1        1
7           1        0
8           1        0
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$crop
[1] "contr.treatment"
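
If what is wanted is literally one 0/1 column per crop (as in the original post) rather than the treatment-contrast coding shown above, one option is to drop the intercept from the formula. A sketch, assuming the data frame is called dat as above; the cbind step, the name dat2 and the renamed columns are illustrative only.

```r
dat <- data.frame(
  crop = c("SWHE", "SWHE", "BARL", "BARL", "SWHE", "SWHE", "BARL", "BARL"),
  stringsAsFactors = TRUE
)

# Without an intercept, the factor gets one indicator column per level
mm <- model.matrix(~ crop - 1, data = dat)
colnames(mm)

# Strip the "crop" prefix and bind the indicator columns to the data
colnames(mm) <- sub("^crop", "", colnames(mm))
dat2 <- cbind(dat, mm)
dat2
```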




David Winsemius
Alameda, CA, USA



Re: [R] creating dummy variables

2013-04-20 Thread Eva Prieto Castro

Hi,

Why do you write that dummy variables are not needed in R? I would like you 
to explain it.

Thanks, 

Eva

--- On Sun, 21/4/13, David Winsemius dwinsem...@comcast.net wrote:

From: David Winsemius dwinsem...@comcast.net
Subject: Re: [R] creating dummy variables
To: Bert Gunter gunter.ber...@gene.com
CC: r-help@R-project.org r-help@r-project.org, shyam basnet 
shyamabc2...@yahoo.com
Date: Sunday, 21 April 2013 00:38




Re: [R] creating dummy variables

2013-04-20 Thread David Winsemius

On Apr 20, 2013, at 3:56 PM, Eva Prieto Castro wrote:

 
 Hi,
 
 Why do you write that dummy variables are not needed in R?. I would like you 
 explain it.

I suppose you might want individual instruction, but Rhelp was established with 
certain principles (expressed in the Posting Guide), one of which is that 
persons posting to Rhelp should have made a demonstrated effort on their own to 
study the offered documentation. You have not yet demonstrated that you 
understand this principle.

-- 
David.
 

David Winsemius
Alameda, CA, USA



Re: [R] creating dummy variables

2013-04-20 Thread Bert Gunter
To all who ask about dummy variables in R:

Please, please read the "Statistical models in R" chapter of An Introduction
to R, or another tutorial (there are many on the web) on regression modeling
in R. (For that matter, please read all of it, or another basic R tutorial,
before posting here!) An excellent but somewhat terse and technical
discussion can be found in the latest edition of V&R's MASS, in the chapter
on linear models. I do not intend to (poorly) try to recapitulate what
others have already (well) explained. Do Your Homework!

Documentation within R can be found in

?lm
?formula
?contrasts
?model.matrix

These may not make much sense unless you have done your homework first --
or already understand the statistical issues.
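
To make the ?contrasts pointer a bit more concrete, here is a small sketch using the crop values from the thread. With R's default treatment contrasts for unordered factors, the first level is the baseline and each other level gets one 0/1 column; relevel() changes the baseline. The vector name crop is just for illustration.

```r
crop <- factor(c("SWHE", "SWHE", "BARL", "BARL", "SWHE", "SWHE", "BARL", "BARL"))

levels(crop)     # levels are sorted alphabetically by default, so BARL comes first
contrasts(crop)  # one column (SWHE); BARL is the baseline

# Changing the reference level changes which dummy column is generated
contrasts(relevel(crop, ref = "SWHE"))
```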

-- Bert





On Sat, Apr 20, 2013 at 3:56 PM, Eva Prieto Castro evapcas...@yahoo.es wrote:


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] creating dummy variables

2013-04-20 Thread Rolf Turner

On 21/04/13 10:56, Eva Prieto Castro wrote:

Hi,

Why do you write that dummy variables are not needed in R?. I would like you 
explain it.


As others have said --- do some self-study.  But a brief answer is that in
any reasonable modelling problem in which dummy variables might arise, R
creates the dummy variables that it uses automagically, behind the scenes,
from the *factors* whose levels correspond to the dummy variables.

Summary:  Learn about and understand *factors*; forget about dummy
variables.
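
A minimal sketch of that point, assuming the data from the original post sit in a data frame (the name dat is only for illustration): crop goes into the formula as a factor and lm() builds the 0/1 column itself.

```r
dat <- data.frame(
  crop  = factor(c("SWHE", "SWHE", "BARL", "BARL", "SWHE", "SWHE", "BARL", "BARL")),
  value = c(171, 696, 114, 344, 120, 511, 239, 349)
)

fit <- lm(value ~ crop, data = dat)
coef(fit)                # intercept = mean of BARL; cropSWHE = SWHE mean minus BARL mean
head(model.matrix(fit))  # the dummy column that lm() created behind the scenes
```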

cheers,

Rolf Turner



Re: [R] Creating dummy variables in r

2013-01-30 Thread peter dalgaard

On Jan 30, 2013, at 04:58 , Bert Gunter wrote:

 You almost never need dummy variables in R. R creates them
 automatically from factors given model and possibly contrasts
 specification.
 
 ?contrasts  ## for some technical details.
 
 If you have not read "An Introduction to R" do so now. Pay particular
 attention to the chapter on modeling and categorical variables. You
 can also google around to find appropriate tutorials. Here is one:
 
 http://www.ats.ucla.edu/stat/r/modules/dummy_vars.htm
 
 I repeat: DO not create dummy variables by hand in R unless you have
 understood the above and have good reason to do so.

In this case it's a cutpoint-type situation, and the user might be excused for 
not wanting to deal with the mysteries of cut() (yet). 

More importantly, the main issue here seems to be a lack of understanding of 
where new variables are located. I.e., if the data set is called dd, you need

dd$prev1 <- (etc)

and if you use attach(), do it _after_ modifying the data (or detach() and 
reattach).

Otherwise, new variables end up in the global environment. (This is logical 
enough once you realize that the result of a computation does not necessarily 
fit into the dataset.)

By the way: You don't need ifelse(): as.numeric(ret1 >= .5) or even just
(ret1 >= .5) works.
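
A small sketch of the point about where the new variable lives. The data frame name dd follows the text above; the ret1 values are made up, and the >= comparison is an assumed reading of the cutoff in the original question.

```r
dd <- data.frame(ret1 = c(0.2, 0.7, 0.5, -0.1))  # made-up returns

prev1 <- as.numeric(dd$ret1 >= 0.5)  # a free-standing vector in the workspace
names(dd)                            # dd itself is unchanged

dd$prev1 <- as.numeric(dd$ret1 >= 0.5)  # stored as a column of dd
names(dd)                               # now includes prev1

# Only columns of dd end up in an exported file
out <- tempfile(fileext = ".csv")
write.csv(dd, out, row.names = FALSE)
names(read.csv(out))
```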

 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] Creating dummy variables in r

2013-01-29 Thread Joseph Norman Thomson
Hello,

Semi-new r user here and still learning the ropes. I am creating dummy
variables for a dataset on stock prices in r. One dummy variable is
called prev1 and is:

prev1 <- ifelse(ret1 >= .5, 1, 0)

where ret1 is the previous day's return.

The variable prev1 is created fine and works in my regression model
and for running conditional statistics. However, when I call the
names() function on the dataset the freshly created variable (prev1)
doesn't show up; also, when I export the dataset the prev1 variable
doesn't show up in the exported file. Is there a way to make the
variable show up on both the call function but more importantly on the
exported file? Or am I forced to create dummy variables elsewhere(much
tougher)?


Thanks,

Joe



Re: [R] Creating dummy variables in r

2013-01-29 Thread Bert Gunter
You almost never need dummy variables in R. R creates them
automatically from factors given model and possibly contrasts
specification.

?contrasts  ## for some technical details.

If you have not read "An Introduction to R" do so now. Pay particular
attention to the chapter on modeling and categorical variables. You
can also google around to find appropriate tutorials. Here is one:

http://www.ats.ucla.edu/stat/r/modules/dummy_vars.htm

I repeat: DO not create dummy variables by hand in R unless you have
understood the above and have good reason to do so.

-- Bert

On Tue, Jan 29, 2013 at 7:21 PM, Joseph Norman Thomson
thoms...@email.arizona.edu wrote:

-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



[R] Creating dummy variables

2010-06-03 Thread Arantzazu Blanco Bernardeau

Hello R project
I am an R beginner trying to create dummy variables to classify soil types.
So, I have a column in my database called codtipo (type code in English) where 
soil type is coded as 
1.1 to 1.4 arenosols (I have 4 types)
2.1 to 2.3 calcisols 
4.1 to 4.4 fluvisols
and so on
To make dummy variables, I understand that I create a separate column for each 
soil type, e.g. for gypsisols
datos$gipsi=datos$codsuelo
for (i in 1:length(datos$gipsi)){if(datos$codsuelo[i]>=5.1 & 
datos$codsuelo[i]<=5.4){datos$gipsi[i]=1}else{0}
}
for cambisols it should be
datos$cambi=datos$codsuelo
for (i in 1:length(datos$cambi)){if(datos$codsuelo[i]>=3.1 & 
datos$codsuelo[i]<=3.3){datos$cambi[i]=1}else{0} 
}
and so on... 
but R complains that a value is missing where TRUE/FALSE is needed.
What can I do?
thanks a lot!!


Arantzazu Blanco Bernardeau
Dpto de Química Agrícola, Geología y Edafología  
Universidad de Murcia-Campus de Espinardo







 Date: Thu, 3 Jun 2010 06:51:42 -0700
 From: lampria...@yahoo.com
 To: jorism...@gmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] ordinal variables
 
 Thank you Joris,
 I'll have a look into the commands you sent me. They look convincing. I hope 
 my students will also see them in a positive way (although I can force them 
 to pretend that they have a positive attitude)!
 
 Dr. Iasonas Lamprianou

 Assistant Professor (Educational Research and Evaluation)
 Department of Education Sciences
 European University-Cyprus
 P.O. Box 22006
 1516 Nicosia
 Cyprus
 Tel.: +357-22-713178
 Fax: +357-22-590539

 Honorary Research Fellow
 Department of Education
 The University of Manchester
 Oxford Road, Manchester M13 9PL, UK
 Tel. 0044  161 275 3485
 iasonas.lampria...@manchester.ac.uk
 
 --- On Thu, 3/6/10, Joris Meys jorism...@gmail.com wrote:
 
 From: Joris Meys jorism...@gmail.com
 Subject: Re: [R] ordinal variables
 To: Iasonas Lamprianou lampria...@yahoo.com
 Cc: r-help@r-project.org
 Date: Thursday, 3 June, 2010, 14:35
 
 see ?factor and ?as.factor. On ordered factors you can technically do a 
 spearman without problem, apart from the fact that a spearman test by 
 definition cannot give exact p-values with ties present.
 
 x <- sample(c("a","b","c","d","e"),100,replace=TRUE)
 y <- sample(c("a","b","c","d","e"),100,replace=TRUE)

 x.ordered <- factor(x,levels=c("e","b","a","d","c"),ordered=TRUE)
 x.ordered
 y.ordered <- factor(y,levels=c("e","b","a","d","c"),ordered=TRUE)
 y.ordered

 cor.test(x.ordered,y.ordered,method="spearman")

 require(pspearman)
 spearman.test(x.ordered,y.ordered)
 
 R commander has some menu options to deal with factors. R commander also 
 provides a scripting window. Please do your students a favor, and show them 
 how to use those commands. 
 
 
 Cheers
 Joris
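
As a sketch of how an ordered factor covers the three tasks mentioned in the question (frequency table or bar chart, median, Spearman correlation): the data below are random and made up, and the conversion with as.numeric() is an assumption on my part, since recent versions of cor.test() expect numeric vectors rather than factors.

```r
set.seed(42)  # arbitrary seed, only to make the sketch reproducible
lev <- c("e", "b", "a", "d", "c")
x.ordered <- factor(sample(lev, 100, replace = TRUE), levels = lev, ordered = TRUE)
y.ordered <- factor(sample(lev, 100, replace = TRUE), levels = lev, ordered = TRUE)

# Frequency table with category labels; barplot(freq) would plot it.
freq <- table(x.ordered)
freq

# Median category: quantile() accepts ordered factors for type 1 or 3.
med <- quantile(x.ordered, probs = 0.5, type = 1)
med

# Spearman correlation computed on the integer codes of the ordered levels.
# exact = FALSE because exact p-values are not available with ties.
res <- cor.test(as.numeric(x.ordered), as.numeric(y.ordered),
                method = "spearman", exact = FALSE)
res$estimate
```

The integer codes follow the order given in `levels`, so the level order must reflect the intended ranking of the categories.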
 
 
 On Thu, Jun 3, 2010 at 2:25 PM, Iasonas Lamprianou lampria...@yahoo.com 
 wrote:
 
 Dear colleagues,

 I teach statistics using SPSS. I want to use R instead. I hit on one problem
 and I need some quick advice. When I want to work with ordinal variables, in
 SPSS I can compute the median or create a bar chart or compute a Spearman
 correlation with no problems. In R, if I read the ordinal variable as
 numeric, then I cannot do a barplot because I miss the category names. If I
 read the variables as characters, then I cannot run a Spearman test. How can I
 read a variable as numeric, still have the chance to assign value labels, and
 be able to get a table of frequencies etc.? I want to be able to do all these
 things in R Commander. My students will probably be scared away if I try
 anything other than R Commander (just writing commands will not make
 them happy).

 I hope I am not asking for too much. Hopefully there is a way.
 
 
 
 
 
 
 
 
 
 
 
 
 
 -- 
 Joris Meys
 Statistical Consultant
 
 Ghent University
 Faculty of Bioscience Engineering 
 Department of Applied mathematics, biometrics and process control
 
 
 Coupure Links 653
 B-9000 Gent
 
 tel : +32 9 264 59 87
 joris.m...@ugent.be 

Re: [R] Creating dummy variables

2010-06-03 Thread Jannis
Was that the original code that you ran? There appear to be several mistakes
in it:

1. In the gipsisol code, there is one parenthesis too many.
2. In the cambisol code, both comparison signs point in the same direction; you
probably want one >= and one <=.


My general suggestion would be to skip the loops altogether and vectorize your
code:

datos$cambi=datos$codsuelo
datos$cambi[datos$codsuelo >= 3.1 & datos$codsuelo <= 3.3] <- 1

Another source of your error could be that datos$codtipo is not numeric. What
does class(datos$codsuelo) say?


HTH
Jannis
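
As a sketch of the vectorized approach suggested above: a logical comparison on the whole column can be converted directly to a 0/1 indicator, so neither a loop nor an else branch is needed. The data values below are made up for illustration; the column names follow the original post.

```r
# Made-up soil codes; one NA is included to show how missing values behave.
datos <- data.frame(codsuelo = c(1.1, 3.1, 3.2, 5.4, 3.3, NA))

# The comparison yields TRUE/FALSE for the whole column; as.numeric() gives 1/0.
datos$cambi <- as.numeric(datos$codsuelo >= 3.1 & datos$codsuelo <= 3.3)
datos$gipsi <- as.numeric(datos$codsuelo >= 5.1 & datos$codsuelo <= 5.4)

datos
```

Unlike the subset assignment, this sets every non-matching row to 0 instead of leaving the original code in place. A missing value in codsuelo would be one possible cause of the reported error, since if() cannot branch on NA; the vectorized form simply propagates the NA.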

Re: [R] Creating dummy variables

2010-06-03 Thread Arantzazu Blanco Bernardeau

hey thanks
I did solve it already, it had more mistakes as you see :S
bye

Arantzazu Blanco Bernardeau
Dpto de Química Agrícola, Geología y Edafología  
Universidad de Murcia-Campus de Espinardo








Re: [R] Creating dummy variables

2010-06-03 Thread Bert Gunter
Do **NOT** use dummy variables in R. R's modeling functions take care of
this themselves using factors. You say you are a beginner. OK, so begin
**properly** -- by reading "An Introduction to R". Chapter 11, "Statistical
Models in R", was written precisely to help people like you learn what to do
and avoid asking inappropriate questions like this on this list.

Bert Gunter
Genentech Nonclinical Biostatistics
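
To illustrate the point about factors, here is a sketch that assumes the integer part of the soil code identifies the soil type, as the ranges in the original post suggest. The data values and the response `y` are made up; the labels for codes 3 and 5 are taken from the poster's own code (cambisols, gipsisols).

```r
# Made-up data; 'y' is a hypothetical response used only for illustration.
datos <- data.frame(
  codsuelo = c(1.1, 1.3, 2.2, 2.3, 3.1, 3.3, 4.2, 4.4, 5.1, 5.4),
  y        = c(2.0, 2.4, 3.1, 2.9, 4.2, 4.0, 5.5, 5.1, 6.3, 6.0)
)

# One factor holds all soil types; the integer part of the code picks the type.
datos$suelo <- factor(
  floor(datos$codsuelo),
  levels = 1:5,
  labels = c("arenosol", "calcisol", "cambisol", "fluvisol", "gipsisol")
)

table(datos$suelo)

# lm() expands the factor into indicator columns internally.
fit <- lm(y ~ suelo, data = datos)
coef(fit)
```

With the default coding, the first level (arenosol) is the reference group and each remaining coefficient is a difference from it.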
 
 


[R] Creating Dummy Variables in R

2009-12-16 Thread whitaker m. (mw1006)
Hi,
I am trying to create a set of dummy variables to use within a multiple linear
regression and am unable to find the code for this in the manuals.

For example, I have:
Price  Weight  Clarity
               IF  VVS1  VVS2
 500   8        1   0     0
1000   5.2      0   0     1
 864   3        0   1     0
 340   2.6      0   0     1
  90   0.5      1   0     0
 450   2.3      0   1     0

Where price is dependent upon weight (single value in each observation) and
clarity (split into three levels, IF, VVS1, VVS2).
I am having trouble telling the program that clarity is a set of 3 dummy
variables and keep getting error messages. What is the correct way?

Any help is greatly appreciated.
Matthew



Re: [R] Creating Dummy Variables in R

2009-12-16 Thread S Devriese
 

Without an example of your code, it's a bit difficult. But it might be
easier to use one variable Clarity with three possible values ("IF",
"VVS1", "VVS2"), defined as a factor.
lm(Price ~ Weight + Clarity) should then do the trick (unless you
explicitly want to use a different dummy coding than the default).

Stephan



Re: [R] Creating Dummy Variables in R

2009-12-16 Thread Achim Zeileis



You should code the categorical variable Clarity as a factor so that R 
knows that this is a categorical variable and can deal with it 
appropriately in subsequent computations such as summary() or lm().


Thus, I would recommend storing your data as

# factor() is explicit here: newer R versions do not convert strings to factors
dat <- data.frame(
  Price = c(500, 1000, 864, 340, 90, 450),
  Weight = c(8, 5.2, 3, 2.6, 0.5, 2.3),
  Clarity = factor(c("IF", "VVS1", "VVS2")[c(1, 3, 2, 3, 1, 2)]))

which yields, e.g.,

R> summary(dat)
     Price            Weight       Clarity
 Min.   :  90.0   Min.   :0.500   IF  :2
 1st Qu.: 367.5   1st Qu.:2.375   VVS1:2
 Median : 475.0   Median :2.800   VVS2:2
 Mean   : 540.7   Mean   :3.600
 3rd Qu.: 773.0   3rd Qu.:4.650
 Max.   :1000.0   Max.   :8.000

and then you can also do

R> lm(Price ~ Weight + Clarity, data = dat)

Call:
lm(formula = Price ~ Weight + Clarity, data = dat)

Coefficients:
(Intercept)       Weight  ClarityVVS1  ClarityVVS2
     -45.05        80.01       490.02       403.00

or, if you wish to choose a different coding,

R> lm(Price ~ 0 + Weight + Clarity, data = dat)

Call:
lm(formula = Price ~ 0 + Weight + Clarity, data = dat)

Coefficients:
     Weight    ClarityIF  ClarityVVS1  ClarityVVS2
      80.01       -45.05       444.97       357.95


Some further reading of introductory material on linear regression in R 
would be useful. Also look at ?lm, ?factor, ?model.matrix, ?contrasts etc.
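
To see the indicator columns that lm() builds from a factor, model.matrix() can be called on the same formula. A sketch using the data from this message (the factor() call is an addition so that it also works in R versions where strings are not converted to factors automatically):

```r
dat <- data.frame(
  Price   = c(500, 1000, 864, 340, 90, 450),
  Weight  = c(8, 5.2, 3, 2.6, 0.5, 2.3),
  Clarity = factor(c("IF", "VVS1", "VVS2")[c(1, 3, 2, 3, 1, 2)])
)

# Default treatment coding: intercept plus k - 1 = 2 indicator columns.
mm <- model.matrix(~ Weight + Clarity, data = dat)
colnames(mm)

# Without an intercept, all three levels get their own indicator column.
mm0 <- model.matrix(~ 0 + Weight + Clarity, data = dat)
colnames(mm0)
```

The column names correspond to the coefficient names in the two lm() fits, which is why the intercept of the first fit reappears as the ClarityIF coefficient in the second.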


hth,
Z




Re: [R] Creating Dummy Variables in R

2009-12-16 Thread Tom Fletcher
Is your variable Clarity a categorical with 4 levels? Thus, the need for
k-1 (3) dummies? Your error may be the result of creating k instead of
k-1 dummies, but can't be sure from the example.

In R, you don't have to (unless you really want to) explicitly create
separate variables. You can use the internal contrast functions. 

See

?contr.treatment

Which is dummy coding by default. You can specify which group is the
reference group. 

Alternatively, if you prefer effects coding, you can see
?contr.sum 

There are others as well. 
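
A short sketch of the contrast functions mentioned above, using a made-up factor with the three clarity levels from the question. The choice of VVS2 as the alternative reference group is arbitrary and only for illustration.

```r
clarity <- factor(c("IF", "VVS2", "VVS1", "VVS2", "IF", "VVS1"))

# Treatment (dummy) coding with the first level as reference (the default).
ct_default <- contr.treatment(levels(clarity))
ct_default

# Same coding, but with the third level (VVS2) as the reference group.
ct_base3 <- contr.treatment(levels(clarity), base = 3)
ct_base3

# Effects (sum-to-zero) coding.
cs <- contr.sum(levels(clarity))
cs

# Attach a coding to the factor so that modeling functions use it.
contrasts(clarity) <- contr.sum(nlevels(clarity))
contrasts(clarity)
```

Each matrix has one row per level and k - 1 columns; the reference level is the row of zeros under treatment coding.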

Tom Fletcher


