On Wed, Jan 1, 2014 at 10:55 PM, Joshua Banta <jba...@uttyler.edu> wrote:
> Dear Listserve,
>
> I have a data-parsing question for you. I recognize this is more in the 
> domain of PERL/Python, but I don't know those languages! On the other hand, I 
> am pretty good overall with R, so I'd rather get the job done within the R 
> "ecosphere."
>
> Here is what I want to do. Consider the following data:
>
> string <- "ATCGCCCGTA[AGA]TAACCG"
>
> I want to alter string so that it looks like this:
>
> ATCGCCCGTA[A][G][A]TAACCG
>
> In other words, I want to design a piece of code that will scan a character 
> string, find bracketed groups of characters, break up each character within 
> the bracket into its own individual bracketed character, and then put the 
> group of individually bracketed characters back into the character string. 
> The lengths of the character strings enclosed by a bracket will vary, but in 
> every case, I want to do the same thing: break up each character within the 
> bracket into its own individual bracketed character, and then put the group 
> of individually bracketed characters back into the character string.
>
> So, for example, another string may look like this:
>
> string2 <- "ATTATACGCA[AAATGCCCCA]GCTA[AT]GCATTA"
>
> I want to alter string2 so that it looks like this:
>
> "ATTATACGCA[A][A][A][T][G][C][C][C][C][A]GCTA[A][T]GCATTA"
>

Here is a one-line solution using the gsubfn package:

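library(gsubfn)

# The regex captures the text between each pair of brackets; gsubfn passes
# that captured text to the formula as x, which is split into single
# characters, each wrapped in its own brackets, and pasted back together.
gsubfn("\\[([^]]+)\\]",
       ~ paste(paste0("[", strsplit(x, "")[[1]], "]"), collapse = ""),
       string)
# [1] "ATCGCCCGTA[A][G][A]TAACCG"

gsubfn("\\[([^]]+)\\]",
       ~ paste(paste0("[", strsplit(x, "")[[1]], "]"), collapse = ""),
       string2)
# [1] "ATTATACGCA[A][A][A][T][G][C][C][C][C][A]GCTA[A][T]GCATTA"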
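
If you would rather not depend on an add-on package, roughly the same idea can be sketched in base R with gregexpr() and regmatches<-(). This is only an illustration, not part of the solution above: the helper name bracket_each is made up for this example, and it assumes that brackets are never nested and that every "[" has a matching "]".

```r
# Hypothetical helper (name invented for this sketch): rewrite every
# bracketed group so that each character gets its own pair of brackets.
# Assumes brackets are not nested and are always properly closed.
bracket_each <- function(s) {
  # Locate every "[...]" group in each element of s
  m <- gregexpr("\\[[^]]+\\]", s)
  regmatches(s, m) <- lapply(regmatches(s, m), function(groups) {
    # Drop the outer brackets, then wrap each remaining character in "[ ]"
    inner <- substr(groups, 2, nchar(groups) - 1)
    gsub("(.)", "[\\1]", inner)
  })
  s
}

string  <- "ATCGCCCGTA[AGA]TAACCG"
string2 <- "ATTATACGCA[AAATGCCCCA]GCTA[AT]GCATTA"

bracket_each(string)
# [1] "ATCGCCCGTA[A][G][A]TAACCG"
bracket_each(string2)
# [1] "ATTATACGCA[A][A][A][T][G][C][C][C][C][A]GCTA[A][T]GCATTA"
```

Because gregexpr() and regmatches() are vectorized, bracket_each(c(string, string2)) should process both strings in one call.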

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
