Hi Joshua,

This is one way to do it. I am not sure whether this is an efficient implementation for your needs; it depends on the size of your data.

string1 <- "ATCGCCCGTA[AGA]TAACCG"
string2 <- "ATTATACGCA[AAATGCCCCA]GCTA[AT]GCATTA"

foo <- function(genes){
  # wrap each character of x in its own brackets: c("A", "G") -> "[A][G]"
  mypaste <- function(x) paste("[", paste(x, collapse = "]["), "]", sep = "")
  # split the string at the brackets
  tmp <- strsplit(genes, "[[:punct:]]")[[1]]
  # positions of the opening and closing brackets
  str <- gregexpr("\\[", genes)[[1]]
  stp <- gregexpr("\\]", genes)[[1]]
  # contents of each bracketed group
  tmp2 <- substring(genes, str + 1, stp - 1)
  # locate the bracketed groups among the pieces; match() returns the first
  # hit, so a group identical to an earlier unbracketed piece is misplaced
  ndx <- match(tmp2, tmp)
  tmp[ndx] <- vapply(strsplit(tmp2, ""), mypaste, character(1))
  result <- paste(tmp, collapse = "")
  return(result)
}

> foo(string2)
[1] "ATTATACGCA[A][A][A][T][G][C][C][C][C][A]GCTA[A][T]GCATTA"
> foo(string1)
[1] "ATCGCCCGTA[A][G][A]TAACCG"
>

Yours sincerely,

Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling
Technology & Service Solutions
Vestas Wind Systems A/S
http://www.vestas.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Joshua Banta
> Sent: 2 January 2014 04:56
> To: R Help
> Subject: [R] Data parsing question: adding characters within a string of
> characters
>
> Dear Listserve,
>
> I have a data-parsing question for you. I recognize this is more in the domain
> of PERL/Python, but I don't know those languages! On the other hand, I am
> pretty good overall with R, so I'd rather get the job done within the R
> "ecosphere."
>
> Here is what I want to do.
> Consider the following data:
>
> string <- "ATCGCCCGTA[AGA]TAACCG"
>
> I want to alter string so that it looks like this:
>
> ATCGCCCGTA[A][G][A]TAACCG
>
> In other words, I want to design a piece of code that will scan a character
> string, find bracketed groups of characters, break up each character within
> the bracket into its own individual bracketed character, and then put the
> group of individually bracketed characters back into the character string. The
> lengths of the character strings enclosed by a bracket will vary, but in every
> case, I want to do the same thing: break up each character within the bracket
> into its own individual bracketed character, and then put the group of
> individually bracketed characters back into the character string.
>
> So, for example, another string may look like this:
>
> string2 <- "ATTATACGCA[AAATGCCCCA]GCTA[AT]GCATTA"
>
> I want to alter string2 so that it looks like this:
>
> "ATTATACGCA[A][A][A][T][G][C][C][C][C][A]GCTA[A][T]GCATTA"
>
> Thank you all in advance and have a great 2014!
>
> -----------------------------------
> Josh Banta, Ph.D
> Assistant Professor
> Department of Biology
> The University of Texas at Tyler
> Tyler, TX 75799
> Tel: (903) 565-5655
> http://plantevolutionaryecology.org
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
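
The task described in the question (scan a string, find each bracketed group, and give every character in the group its own brackets) can also be sketched with gregexpr() and regmatches(). This is only an illustrative alternative, not the method from the reply: the function name split_brackets is hypothetical, and the sketch assumes brackets are balanced and never nested. Because it replaces groups by position rather than looking them up with match(), an input such as "AT[AT]GC", where a bracketed group repeats an unbracketed segment, needs no special handling, and a whole character vector is processed in one call.

```r
# Hypothetical helper (name not from the thread): give every character
# inside a [...] group its own pair of brackets.
# Assumes balanced, non-nested brackets.
split_brackets <- function(x) {
  # Locate every bracketed group, e.g. "[AGA]", in each element of x.
  m <- gregexpr("\\[[^]]*\\]", x)
  # Rewrite each matched group in place.
  regmatches(x, m) <- lapply(regmatches(x, m), function(groups) {
    vapply(groups, function(g) {
      # Drop the outer brackets, then split into single characters.
      chars <- strsplit(substring(g, 2, nchar(g) - 1), "")[[1]]
      # Re-wrap each character: c("A", "G", "A") -> "[A][G][A]"
      paste0("[", chars, "]", collapse = "")
    }, character(1), USE.NAMES = FALSE)
  })
  x
}

# The two strings from the thread, plus one with a repeated segment.
genes <- c("ATCGCCCGTA[AGA]TAACCG",
           "ATTATACGCA[AAATGCCCCA]GCTA[AT]GCATTA",
           "AT[AT]GC")
res <- split_brackets(genes)
print(res)
# [1] "ATCGCCCGTA[A][G][A]TAACCG"
# [2] "ATTATACGCA[A][A][A][T][G][C][C][C][C][A]GCTA[A][T]GCATTA"
# [3] "AT[A][T]GC"
```

Strings without any brackets are returned unchanged, since there is nothing for regmatches() to replace.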