[R] dataframe, transform, strsplit
Hi,

I have a dataframe with a column of strings, and I need to extract the part of each string before the first '.' character and put it into a separate column. I thought I could use 'strsplit' for it within 'transform', but I can't seem to get the right invocation. Here is a sample dataframe that has what I have, what I want, and what I get. Can someone tell me how to get what is in the 'want' column from the 'have' column programmatically?

tia,
Matt

df <- data.frame(have=c("a.b.c", "d.e.f", "g.h.i"), want=c("a","d","g"))
df.xform <- transform(df, get=strsplit(as.character(have), split=".", fixed=TRUE)[[1]][1])
df.xform

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] dataframe, transform, strsplit
try this:

df
   have want
1 a.b.c    a
2 d.e.f    d
3 g.h.i    g
df$get <- gsub("^([^.]+).*", "\\1", df$have)
df
   have want get
1 a.b.c    a   a
2 d.e.f    d   d
3 g.h.i    g   g

On Mon, Oct 25, 2010 at 12:53 PM, Matthew Pettis matthew.pet...@gmail.com wrote:
> Can someone tell me how to get what is in the 'want' column from the 'have' column programmatically?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
Re: [R] dataframe, transform, strsplit
On Mon, Oct 25, 2010 at 12:53 PM, Matthew Pettis matthew.pet...@gmail.com wrote:
> df.xform <- transform(df, get=strsplit(as.character(have), split=".", fixed=TRUE)[[1]][1])
> df.xform

Try replacing the dot [.] and everything thereafter .* with nothing like this:

transform(df, want = sub("[.].*", "", have))

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
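For comparison, here is a minimal sketch that runs the two regular-expression answers from this thread side by side. The extra test strings "nodot" and ".x" are illustrative additions, not part of the original posts; they only show where the two patterns agree and where they differ.

```r
# Sample strings from the thread, plus two hypothetical edge cases.
have <- c("a.b.c", "d.e.f", "g.h.i", "nodot")

# Delete the first literal dot and everything after it.
by_delete <- sub("[.].*", "", have)

# Capture the leading run of non-dot characters and keep only that.
by_capture <- sub("^([^.]+).*", "\\1", have)

print(by_delete)
print(by_capture)

# A string that starts with a dot is where the two patterns differ:
# the first deletes everything, the second finds no match and
# returns the input unchanged.
edge_delete  <- sub("[.].*", "", ".x")
edge_capture <- sub("^([^.]+).*", "\\1", ".x")
print(c(edge_delete, edge_capture))
```

On the data in this thread the two give the same result, so the choice mostly comes down to which pattern is easier to read.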
Re: [R] dataframe, transform, strsplit
Thanks Gabor and Jim,

Both solutions worked equally well for me (now I have an embarrassment of riches for a solution :-) ). Now that my main problem is solved, I am happy, but I was wondering if anyone would care to comment as to why my 'strsplit' solution doesn't behave the way I think it should...

Thank you both again,
Matt

On Mon, Oct 25, 2010 at 12:09 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Try replacing the dot [.] and everything thereafter .* with nothing like this:
> transform(df, want = sub("[.].*", "", have))

--
Seven Deadly Sins (Gandhi):
- Wealth without work
- Politics without principle
- Pleasure without conscience
- Commerce without morality
- Science without humanity
- Worship without sacrifice
- Knowledge without character
Re: [R] dataframe, transform, strsplit
On Mon, Oct 25, 2010 at 1:20 PM, Matthew Pettis matthew.pet...@gmail.com wrote:
> I was wondering if anyone would care to comment as to why my 'strsplit' solution doesn't behave the way I think it should...

1. split = "." without fixed = TRUE is a regular expression, which means every character is a split character, not just dot. Your call does pass fixed = TRUE, so there the dot is taken literally.

2. Either way, picking off [[1]] means picking off the first component of the list that strsplit returns, which would be c("a", "b", "c"), and [1] then takes "a" from it. transform recycles that single value to every row. What we want is the first element of each component of the result, not the first element overall.

A corrected version using the same approach looks like this:

transform(df, want = sapply(strsplit(as.character(have), ".", fixed = TRUE), "[", 1))

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
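The two points above can be checked step by step. This is a minimal sketch using the sample data from the original post; the variable names (parts, first_only, firsts, regex_split) are made up for illustration, and vapply is swapped in for sapply only because it checks the result type.

```r
# Sample data from the original post (quotes restored).
df <- data.frame(have = c("a.b.c", "d.e.f", "g.h.i"),
                 stringsAsFactors = FALSE)

# strsplit() returns a list with one character vector per input string.
parts <- strsplit(df$have, ".", fixed = TRUE)

# [[1]][1] looks only at the first row's pieces, so it is a single "a";
# transform() would recycle that one value down the whole column.
first_only <- parts[[1]][1]

# Taking element 1 of every list component gives one value per row.
# vapply() is a type-checked stand-in for the sapply() call in the thread.
firsts <- vapply(parts, `[`, character(1), 1)

# Without fixed = TRUE the "." is a regex matching any character,
# so every character is a delimiter and only empty strings remain.
regex_split <- strsplit("a.b.c", ".")[[1]]

print(first_only)
print(firsts)
print(regex_split)
```

Assigning firsts back with df$want <- firsts, or calling it inside transform, then gives the column asked for in the original question.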