Re: [R] how to Subset based on partial matching of columns?

2015-04-09 Thread samarvir singh
Thank you, Sarah Goslee. I am rather new to learning R, so people like you
are a great support. I really appreciate you taking the time to correct my
mistakes. Thanks.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to Subset based on partial matching of columns?

2015-04-09 Thread Sarah Goslee
Hi,

Please don't put quotes around your code. It makes it hard to copy and
paste. Alternatively, don't post in HTML, because it screws up your
code.

On Wed, Apr 8, 2015 at 8:57 PM, samarvir singh samarvir1...@gmail.com wrote:
> So I have a list that contains certain characters as shown below
>
> `list <- c("MY","GM+","TY","RS","LG")`

That's a character vector, not a list. A list is a specific type of object in R.
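
To make the distinction concrete, here is a minimal sketch. The names `codes` and `codes_list` are assumptions for illustration (chosen so the example does not reuse the name of the base function list()); they do not appear in the original post.

```r
# A character vector: every element is a string of the same basic type.
codes <- c("MY", "GM+", "TY", "RS", "LG")
is.character(codes)   # TRUE
is.list(codes)        # FALSE

# A list is a different kind of object: its elements may differ in
# type and length.
codes_list <- list("MY", 1:3, TRUE)
is.list(codes_list)   # TRUE
```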

> And I have a variable named CODE in the data frame as follows
>
> `code <- c("MY GM+", ,"LGTY", "RS","TY")`

That doesn't work, and I have no idea what you expect to have there,
so I'm deleting the extra comma. Also, your vector is named code, not
CODE.

code <- c("MY GM+", "LGTY", "RS","TY")
x <- c(1:4)

> 'x <- c(1:5)
> `df <- data.frame(x,code)`

You probably actually want
mydf <- data.frame(x, code, stringsAsFactors=FALSE)

Note I changed the name, because df() is a base R function.
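
A small sketch of what those two choices do in practice. The object name `as_factor_df` is hypothetical and only there for comparison. Note that stringsAsFactors defaulted to TRUE when this thread was written and has defaulted to FALSE since R 4.0.0, so setting it explicitly gives the same result on either version.

```r
x <- 1:4
code <- c("MY GM+", "LGTY", "RS", "TY")

# With stringsAsFactors = FALSE the code column stays character,
# which is what pattern matching with grepl() expects.
mydf <- data.frame(x, code, stringsAsFactors = FALSE)
is.character(mydf$code)

# For comparison: the same call with stringsAsFactors = TRUE
# stores the column as a factor instead.
as_factor_df <- data.frame(x, code, stringsAsFactors = TRUE)
is.factor(as_factor_df$code)

# df is already the name of a function shipped with R (the F density
# in the stats package), which is why another name is easier to read.
is.function(stats::df)
```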


> Now I want to create 5 new variables named MY,GM+,TY,RS,LG
>
> Which takes binary value, 1 if there's a match case in the CODE variable
>
> df
>   x  code    MY GM+ TY RS LG
>   1  MY GM+   1   1  0  0  0
>   2           0   0  0  0  0
>   3  LGTY     0   0  1  0  1
>   4  RS       0   0  0  1  0
>   5  TY       0   0  1  0  0

grepl() will give you a logical match

data.frame(mydf, sapply(code, function(x)grepl(x, mydf$code)),
stringsAsFactors=FALSE, check.names=FALSE)
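
A variant of the same idea, offered as a sketch rather than as the method above: it loops over the five labels from the original question instead of over code, so the new columns are MY, GM+, TY, RS and LG. The names `patterns`, `hits` and `result` are assumptions for illustration. fixed = TRUE is used so that the "+" in "GM+" is matched literally instead of being read as a regular-expression quantifier.

```r
patterns <- c("MY", "GM+", "TY", "RS", "LG")

x <- 1:4
code <- c("MY GM+", "LGTY", "RS", "TY")
mydf <- data.frame(x, code, stringsAsFactors = FALSE)

# One logical column per pattern; sapply() names the columns after
# the patterns because the input is a character vector.
hits <- sapply(patterns, function(p) grepl(p, mydf$code, fixed = TRUE))

# Adding 0L turns TRUE/FALSE into 1/0 while keeping the column names;
# check.names = FALSE stops data.frame() from renaming "GM+".
result <- data.frame(mydf, hits + 0L, check.names = FALSE)
result
```

Because this is substring matching, "LGTY" counts as a hit for both LG and TY, which is what the desired output in the question shows.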

Sarah


-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] how to Subset based on partial matching of columns?

2015-04-09 Thread David L Carlson
From Sarah's data frame you can get what you want directly with the table()
function, which creates a table object, mydf.tbl. If you want a data frame,
convert the table with as.data.frame.matrix() to make mydf.df. Finally, if
your x column consists of unique values in ascending order, combine the two
data frames to make mydf.all.

> mydf.tbl <- table(mydf$x, mydf$code)
> mydf.tbl
   
    LGTY MY GM+ RS TY
  1    0      1  0  0
  2    1      0  0  0
  3    0      0  1  0
  4    0      0  0  1
> mydf.df <- as.data.frame.matrix(mydf.tbl)
> mydf.df
  LGTY MY GM+ RS TY
1    0      1  0  0
2    1      0  0  0
3    0      0  1  0
4    0      0  0  1
> mydf.all <- data.frame(mydf, mydf.df)
> mydf.all
  x   code LGTY MY.GM. RS TY
1 1 MY GM+    0      1  0  0
2 2   LGTY    1      0  0  0
3 3     RS    0      0  1  0
4 4     TY    0      0  0  1
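
In the last step the column "MY GM+" is renamed to "MY.GM." because data.frame() makes column names syntactically valid by default. A minimal sketch of keeping the original name, assuming that is wanted; the names `mangled` and `kept` are hypothetical:

```r
x <- 1:4
code <- c("MY GM+", "LGTY", "RS", "TY")
mydf <- data.frame(x, code, stringsAsFactors = FALSE)
mydf.df <- as.data.frame.matrix(table(mydf$x, mydf$code))

# Default behaviour: names are passed through make.names(),
# so the space and the plus sign become dots.
mangled <- data.frame(mydf, mydf.df)
names(mangled)

# check.names = FALSE leaves the column names exactly as they are.
kept <- data.frame(mydf, mydf.df, check.names = FALSE)
names(kept)
```

A column such as "MY GM+" then has to be accessed with backticks or double brackets, for example kept[["MY GM+"]].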

