I'm also a newbie, but I've been getting loads of utility out of the "grep" function and its cousin "gsub".
Asterisks are tricky because * is a special character in search patterns. In file-name wildcards it means "anything of any length" (e.g. delete *.* means delete all your files!), and in the regular expressions that grep uses it means "the preceding character, zero or more times". To find a literal * with grep you need to put two backslashes in front of it, so that R knows you mean the character * and not the special symbol. E.g.:

  > txt=c('a','a*')
  > txt
  [1] "a"  "a*"
  > grep('\\*',txt) # tell me, where is the star?
  [1] 2
  > gsub('\\*', ' success',txt) # replace the star with " success"
  [1] "a"         "a success"

However, with the grep/gsub commands there are some other important symbols:

  ^   pattern occurs at the beginning
  $   pattern occurs at the end
  .   means "any character", but only once
  +   preceding character occurs one or more times, so .+ means "anything, at least once"

So, to strip off everything up to the first *, I would try something like this:

  > txt=c('aa*very important','a*important')
  > txt
  [1] "aa*very important" "a*important"
  > gsub('^.+\\*', 'success',txt)
  [1] "successvery important" "successimportant"

Use '' instead of 'success' as the replacement to delete the matched text outright. Note that .+ is greedy, so if a string contains more than one *, this pattern matches up to the last one.

On Wed, Aug 12, 2009 at 11:06 AM, Jill Hollenbach <jhollenb...@chori.org> wrote:

>
> Thanks so much everybody, this has been incredibly helpful--not only is my
> immediate issue solved but I've learned a lot in the process. The lapply
> solution is best for me, as I need flexibility to edit df's with varying
> numbers of columns.
>
> Now, one more question: after appending the string from the first line, I am
> manipulating the df further (recoding the original contents; this I have
> working fine), and afterwards I will need to strip back off that string. It
> seems relatively straightforward, except that, as shown in the example above
> (df2), there is an asterisk involved (I need to remove all characters up to
> and including the asterisk) which seems problematic.
> Any suggestions?
> Many thanks,
> Jill
>
>
>
> Don MacQueen wrote:
> >
> > Let's start with something simple and relatively easy to understand,
> > since you're new to this.
> >
> > First, here's an example of the core of the idea:
> >> paste('a',1:4)
> > [1] "a 1" "a 2" "a 3" "a 4"
> >
> > Make it a little closer to your situation:
> >> paste('a*',1:4, sep='')
> > [1] "a*1" "a*2" "a*3" "a*4"
> >
> > Sometimes it helps to save the number of rows in your dataframe in a
> > new variable
> >
> > nr <- nrow(df)
> >
> > Then, for your first column, the "a*" in the above example is df$V1[1]
> > For the 1:4 in the example, you use df$V1[ 2:nr]
> > Put it together (with sep='' so that no space is inserted) and you have:
> >
> > dfnew <- df
> > dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )
> >
> > But you can use "-1" instead of "2:nr", and you get
> >
> > dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )
> >
> > That's how you can do it one column at a time.
> > Since you have only four columns, just do the same thing to V2, V3, and
> > V4.
> >
> > But if you want a more general method, one that works no matter how
> > many columns you have, and no matter what they are named, then you
> > can use lapply() to loop over the columns. This is what Patrick
> > Connolly suggested, which is
> >
> > as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))
> >
> > Note, though, that this will do it to all columns, so if you ever
> > happen to have a dataframe where you don't want to do all columns,
> > you'll have to be a little trickier with the lapply() solution.
> >
> > -Don
> >
> > At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:
> >>Hi,
> >>I am trying to edit a data frame such that the string in the first line is
> >>appended onto the beginning of each element in the subsequent rows. The data
> >>looks like this:
> >>
> >>> df
> >>      V1    V2    V3    V4
> >>1 DPA1* DPA1* DPB1* DPB1*
> >>2  0103  0104  0401  0601
> >>3  0103  0103  0301  0402
> >>.
> >>.
> >> and what I want is this:
> >>
> >>> dfnew
> >>         V1        V2        V3        V4
> >>1     DPA1*     DPA1*     DPB1*     DPB1*
> >>2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
> >>3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402
> >>
> >>any help is much appreciated, I am new to this and struggling.
> >>Jill
> >>
> >>___
> >> Jill Hollenbach, PhD, MPH
> >> Assistant Staff Scientist
> >> Center for Genetics
> >> Children's Hospital Oakland Research Institute
> >> jhollenb...@chori.org
> >>
> >>--
> >>View this message in context:
> >>http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
> >>Sent from the R help mailing list archive at Nabble.com.
> >>
> >>______________________________________________
> >>R-help@r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >
> >
> > --
> > --------------------------------------
> > Don MacQueen
> > Environmental Protection Department
> > Lawrence Livermore National Laboratory
> > Livermore, CA, USA
> > 925-423-1062
> >
>
> --
> View this message in context:
> http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24939755.html
> Sent from the R help mailing list archive at Nabble.com.
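P.S. On the question of removing all characters up to and including the asterisk: below is a minimal sketch, not a definitive solution. It assumes the columns hold character strings (not factors), and the sample values in txt and df2 are made up for illustration, modeled on the strings shown in the thread. It uses sub() with a pattern that stops at the first asterisk, and compares it with the greedy '^.+\\*' pattern.

```r
# Hypothetical sample strings, modeled on the ones in the thread
txt <- c("DPA1*0103", "DPB1*0401", "A*B*C", "0601")

# '^[^*]*\\*' means: from the start, any run of characters that are
# not an asterisk (possibly empty), followed by one literal asterisk.
# sub() replaces only the first match; '' deletes it.
stripped <- sub("^[^*]*\\*", "", txt)
print(stripped)
# [1] "0103" "0401" "B*C"  "0601"

# For comparison: the greedy '^.+\\*' runs up to the LAST asterisk
greedy <- sub("^.+\\*", "", txt)
print(greedy)
# [1] "0103" "0401" "C"    "0601"

# Applying the same idea to every column of a data frame
# (df2 here is a hypothetical stand-in with character columns)
df2 <- data.frame(V1 = c("DPA1*0103", "DPA1*0103"),
                  V2 = c("DPB1*0401", "DPB1*0301"),
                  stringsAsFactors = FALSE)
df2[] <- lapply(df2, function(x) sub("^[^*]*\\*", "", x))
print(df2)
```

Strings without an asterisk are left untouched, and assigning into df2[] keeps the result a data frame with the original column names.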