Re: [R] paste first row string onto every string in column

Gene Leynes Thu, 13 Aug 2009 08:57:55 -0700

I'm also a newbie, but I've been getting loads of utility out of the "grep"
function and its cousin "gsub".


Asterisks are tricky because * often means "anything of any length" in
a search pattern (e.g. delete *.* means delete all your files!), and in the
regular expressions that grep uses it means "the preceding character, zero or
more times".  To find the literal * using grep you need to put two backslashes
in front of it (\\*), so that R knows you mean the character * and not the
special meaning.
e.g.:
> txt=c('a','a*')
> txt
[1] "a"  "a*"
> grep('\\*',txt)         # tell me, where is the star?
[1] 2
> gsub('\\*', ' success',txt)  #Replace the star with " success"
[1] "a"        "a success"
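
As a side note, here is a minimal sketch of two other ways to match a literal
star in base R without backslashes: fixed = TRUE, which switches off regular
expression matching, and the character class [*]. The vector txt is the same
toy example as above; the result names are only for illustration.

```r
txt <- c('a', 'a*')

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed
star_fixed <- grep('*', txt, fixed = TRUE)

# inside square brackets the star loses its special meaning
star_class <- grep('[*]', txt)

# the same idea works for replacement with gsub
replaced <- gsub('*', ' success', txt, fixed = TRUE)

print(star_fixed)
print(star_class)
print(replaced)
```

Which form to use is mostly taste; fixed = TRUE is handy when the whole
pattern is literal, but it also disables ^, $ and the other symbols.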

However, with the grep/gsub commands there are some other important symbols:
^    Pattern occurs at beginning
$    Pattern occurs at end
.    Means "anything", but only once
+    Preceding character occurs one or more times... so
.+   Means "anything, one or more times"
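
A small sketch of how the symbols in the list above behave, using grepl (which
returns TRUE/FALSE per element instead of positions). The test strings and
variable names are made up for illustration.

```r
txt <- c('abc', 'cab', 'b')

# ^ anchors the pattern at the beginning of the string
starts_with_a <- grepl('^a', txt)

# $ anchors the pattern at the end of the string
ends_with_b <- grepl('b$', txt)

# . stands for exactly one character of any kind
a_then_any <- grepl('a.', txt)

# .+ needs at least one character, so a lone "b" does not match
b_after_some <- grepl('.+b', txt)

print(rbind(starts_with_a, ends_with_b, a_then_any, b_after_some))
```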

So, to strip off everything up to and including the *, I would try something
like this:
> txt=c('aa*very important','a*important')
> txt
[1] "aa*very important" "a*important"
> gsub('^.+\\*', 'success',txt)
[1] "successvery important" "successimportant"
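
To actually delete the prefix rather than mark it, the replacement can be an
empty string. The sketch below is only an illustration under some assumptions:
df2 here is a made-up stand-in for the recoded data frame, its columns are
character (not factor), and every column should be stripped. Note that .+ is
greedy, so with several stars it runs to the last one; the pattern [^*]* is one
way to stop at the first star instead.

```r
txt <- c('DPA1*0103', 'DPB1*0401', 'a*b*c')

# empty replacement drops the match; greedy .+ runs up to the LAST star
strip_to_last <- sub('^.+\\*', '', txt)

# [^*]* matches only non-star characters, so this stops at the FIRST star
strip_to_first <- sub('^[^*]*\\*', '', txt)

# hypothetical data frame standing in for the recoded df
df2 <- data.frame(V1 = c('DPA1*0103', 'DPA1*0103'),
                  V3 = c('DPB1*0401', 'DPB1*0301'),
                  stringsAsFactors = FALSE)

# apply the same substitution to every column, keeping the data frame shape
df2[] <- lapply(df2, function(x) sub('^[^*]*\\*', '', x))

print(strip_to_last)
print(strip_to_first)
print(df2)
```

Assigning into df2[] keeps the result a data frame with the original column
names, which fits the lapply approach used earlier in the thread.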

On Wed, Aug 12, 2009 at 11:06 AM, Jill Hollenbach <jhollenb...@chori.org> wrote:

>
> Thanks so much everybody, this has been incredibly helpful--not only is my
> immediate issue solved but I've learned a lot in the process. The lapply
> solution is best for me, as I need flexibility to edit df's with varying
> numbers of columns.
>
> Now, one more question: after appending the string from the first line, I am
> manipulating the df further (recoding the original contents; this I have
> working fine), and afterwards I will need to strip back off that string. It
> seems relatively straightforward, except that, as shown in the example above
> (df2), there is an asterisk involved (I need to remove all characters up to
> and including the asterisk) which seems problematic.
> Any suggestions?
> Many thanks,
> Jill
>
>
>
> Don MacQueen wrote:
> >
> > Let's start with something simple and relatively easy to understand,
> > since you're new to this.
> >
> > First, here's an example of the core of the idea:
> >>  paste('a',1:4)
> > [1] "a 1" "a 2" "a 3" "a 4"
> >
> > Make it a little closer to your situation:
> >>  paste('a*',1:4, sep='')
> > [1] "a*1" "a*2" "a*3" "a*4"
> >
> > Sometimes it helps to save the number of rows in your dataframe in a
> > new variable
> >
> > nr <- nrow(df)
> >
> > Then, for your first column, the "a*" in the above example is df$V1[1]
> > For the 1:4 in the example, you use  df$V1[ 2:nr]
> > Put it together and you have:
> >
> >     dfnew <- df
> >     dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )
> >
> > But you can use "-1" instead of "2:nr", and you get
> >
> >    dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )
> >
> > That's how you can do it one column at a time.
> > Since you have only four columns, just do the same thing to V2, V3, and
> > V4.
> >
> > But if you want a more general method, one that works no matter how
> > many columns you have, and no matter what they are named, then you
> > can use lapply() to loop over the columns. This is what Patrick
> > Connolly suggested, which is
> >
> >     as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))
> >
> > Note, though, that this will do it to all columns, so if you ever
> > happen to have a dataframe where you don't want to do all columns,
> > you'll have to be a little trickier with the lapply() solution.
> >
> > -Don
> >
> > At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:
> >>Hi,
> >>I am trying to edit a data frame such that the string in the first line is
> >>appended onto the beginning of each element in the subsequent rows. The data
> >>looks like this:
> >>
> >>>  df
> >>       V1   V2   V3   V4
> >>1   DPA1* DPA1* DPB1* DPB1*
> >>2   0103 0104 0401 0601
> >>3   0103 0103 0301 0402
> >>.
> >>.
> >>  and what I want is this:
> >>
> >>>dfnew
> >>       V1   V2   V3   V4
> >>1   DPA1* DPA1* DPB1* DPB1*
> >>2   DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
> >>3   DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402
> >>
> >>any help is much appreciated, I am new to this and struggling.
> >>Jill
> >>
> >>___
> >>  Jill Hollenbach, PhD, MPH
> >>     Assistant Staff Scientist
> >>     Center for Genetics
> >>     Children's Hospital Oakland Research Institute
> >>     jhollenb...@chori.org
> >>
> >>--
> >>View this message in context:
> >>http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
> >>Sent from the R help mailing list archive at Nabble.com.
> >>
> >>______________________________________________
> >>R-help@r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >
> >
> > --
> > --------------------------------------
> > Don MacQueen
> > Environmental Protection Department
> > Lawrence Livermore National Laboratory
> > Livermore, CA, USA
> > 925-423-1062
> >
>
> --
> View this message in context:
> http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24939755.html
> Sent from the R help mailing list archive at Nabble.com.
>

