[R] dataframe, transform, strsplit

2010-10-25 Thread Matthew Pettis
Hi,

I have a dataframe with a column of strings, and I need to extract the part
of each string before the first '.' character and put it into a
separate column.  I thought I could use 'strsplit' for it within
'transform', but I can't seem to get the right invocation.  Here is a sample
dataframe that has what I have, what I want, and what I get.  Can someone
tell me how to get what is in the 'want' column from the 'have' column
programmatically?

tia,
Matt

df <- data.frame(have=c("a.b.c", "d.e.f", "g.h.i"), want=c("a","d","g"))
df.xform <- transform(df, get=strsplit(as.character(have), split=".",
fixed=TRUE)[[1]][1])
df.xform


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dataframe, transform, strsplit

2010-10-25 Thread jim holtman
try this:

> df
   have want
1 a.b.c    a
2 d.e.f    d
3 g.h.i    g
> df$get <- gsub("^([^.]+).*", "\\1", df$have)
> df
   have want get
1 a.b.c    a   a
2 d.e.f    d   d
3 g.h.i    g   g
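
As a rough sketch of how that pattern behaves, the snippet below runs it on the sample values plus one extra string without any dot. The value "no_dot" and the variable name `prefix` are assumptions added for illustration; they are not from the original post.

```r
# Sample values from the original post; "no_dot" is a hypothetical extra
# value, added only to show what happens when a string contains no dot.
have <- c("a.b.c", "d.e.f", "g.h.i", "no_dot")

# ^([^.]+) captures the leading run of non-dot characters,
# .* consumes the rest of the string,
# and "\\1" replaces the whole match with just the captured part.
prefix <- gsub("^([^.]+).*", "\\1", have)
print(prefix)
```

A string without a dot should come back unchanged, since the capture group then covers the whole string.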


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] dataframe, transform, strsplit

2010-10-25 Thread Gabor Grothendieck
Try replacing the first dot, [.], and everything after it, .*, with nothing,
like this:

transform(df, want = sub("[.].*", "", have))
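
To illustrate why the dot is wrapped in brackets, here is a small sketch comparing the character-class form with a backslash escape and with an unescaped dot. The variable names are made up for the example, and stringsAsFactors = FALSE is set only so the column is character under any R version.

```r
df <- data.frame(have = c("a.b.c", "d.e.f", "g.h.i"),
                 stringsAsFactors = FALSE)

# Two equivalent ways to say "a literal dot followed by anything":
by_class  <- sub("[.].*", "", df$have)   # dot inside a character class
by_escape <- sub("\\..*", "", df$have)   # dot escaped with a backslash

# An unescaped dot matches any character, so the match starts at the
# first character and the whole string is removed.
unescaped <- sub("..*", "", df$have)

print(by_class)
print(by_escape)
print(unescaped)
```

Because sub() only replaces the first match, and that match runs from the first dot to the end of the string, everything after the first dot is dropped in one step.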

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] dataframe, transform, strsplit

2010-10-25 Thread Matthew Pettis
Thanks Gabor and Jim,

Both solutions worked equally well for me (now I have an embarrassment of
riches for a solution :-) ).

Now that my main problem is solved, I am happy, but I was wondering if
anyone would care to comment as to why my 'strsplit' solution doesn't behave
the way I think it should...

Thank you both again,
Matt


-- 
Seven Deadly Sins (Gandhi):
  - Wealth without work - Politics without principle
  - Pleasure without conscience - Commerce without morality
  - Science without humanity- Worship without sacrifice
  - Knowledge without character



Re: [R] dataframe, transform, strsplit

2010-10-25 Thread Gabor Grothendieck
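1. By default, split = "." is a regular expression, in which the dot matches
any character, so every character is a split character, not just the dot.
fixed = TRUE, as in your call, is what makes it a literal dot.

2. Even with that corrected, [[1]] picks off the first component of the
result, which is c("a", "b", "c"), and [1] then takes its first element,
"a". What we want is the first element of each component of the result,
not the first element of the first component.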
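
The two points above can be checked with a short sketch like the following. The intermediate names (`regex_split`, `pieces`, `first_only`, `recycled`) are made up for the example, and stringsAsFactors = FALSE is an assumption so the column is character under any R version.

```r
df <- data.frame(have = c("a.b.c", "d.e.f", "g.h.i"),
                 stringsAsFactors = FALSE)

# Point 1: as a regular expression, "." matches every character,
# so only empty strings are left between the "separators".
regex_split <- strsplit("a.b.c", ".")[[1]]

# With fixed = TRUE the dot is literal; the result is a list with one
# character vector per row of the data frame.
pieces <- strsplit(df$have, ".", fixed = TRUE)

# Point 2: [[1]] selects the first list component, i.e. the pieces of the
# first row only, and [1] then takes the first of those pieces.
first_only <- pieces[[1]][1]

# transform() recycles that single value down the whole new column.
recycled <- transform(df, get = pieces[[1]][1])
print(recycled)
```

So the original call does run without error; it just fills the new column with the value computed from the first row.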

A corrected version using the same approach looks like this:

   transform(df, want = sapply(strsplit(as.character(have), ".",
   fixed = TRUE), "[", 1))
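
As a possible variation on the same idea, the sketch below wraps the split in a small helper and uses vapply() instead of sapply(), so that R checks each result is a single string. The helper name `first_piece` is hypothetical and not part of the thread.

```r
df <- data.frame(have = c("a.b.c", "d.e.f", "g.h.i"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split each string on a literal dot, then take the
# first piece of every list component. vapply() with character(1) signals
# an error if any result is not exactly one string.
first_piece <- function(x) {
  pieces <- strsplit(as.character(x), ".", fixed = TRUE)
  vapply(pieces, `[`, character(1), 1)
}

df.xform <- transform(df, get = first_piece(have))
print(df.xform)
```

Whether vapply() is worth the extra argument over sapply() is a matter of taste here; it mainly helps when the input might contain values that split into something unexpected.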

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
