After converting a miRNA file from FASTA to character format, I get a vector that looks like the following:
> nml
  [1] "hsa-let-7a MIMAT0000062 Homo sapiens let-7a"
  [2] "hsa-let-7b MIMAT0000063 Homo sapiens let-7b"
  [3] "hsa-let-7c MIMAT0000064 Homo sapiens let-7c"
  [4] "hsa-let-7d MIMAT0000065 Homo sapiens let-7d"
  [5] "hsa-let-7e MIMAT0000066 Homo sapiens let-7e"
  [6] "hsa-let-7f MIMAT0000067 Homo sapiens let-7f"
  [7] "hsa-miR-15a MIMAT0000068 Homo sapiens miR-15a"
  [8] "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16"
  [9] "hsa-miR-17 MIMAT0000070 Homo sapiens miR-17"
 [10] "hsa-miR-18a MIMAT0000072 Homo sapiens miR-18a"
.......................................................
[888] "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*"
[889] "hsa-miR-888* MIMAT0004917 Homo sapiens miR-888*"
[890] "hsa-miR-541* MIMAT0004919 Homo sapiens miR-541*"

My goal is to extract into a new vector only the first string of each element, that is, everything preceding the first space counting from the left. For the records listed above I would obtain:

  [1] "hsa-let-7a"
  [2] "hsa-let-7b"
  [3] "hsa-let-7c"
  [4] "hsa-let-7d"
  [5] "hsa-let-7e"
  [6] "hsa-let-7f"
  [7] "hsa-miR-15a"
  [8] "hsa-miR-16"
  [9] "hsa-miR-17"
 [10] "hsa-miR-18a"
.......................................................
[888] "hsa-miR-675*"
[889] "hsa-miR-888*"
[890] "hsa-miR-541*"

I tried using strsplit as follows:

strsplit(nml,"MIMAT[0-9]*")

This returns a list of character vectors, and I can extract the string I need with the [[]] operator, as shown below (note that the result keeps a trailing space):

> strsplit(nml,"MIMAT[0-9]*")[[1]][1]
[1] "hsa-let-7a "
> strsplit(nml,"MIMAT[0-9]*")[[2]][1]
[1] "hsa-let-7b "

Unfortunately, the [[]] operator acts on one list element at a time. In fact:

> strsplit(nml,"MIMAT[0-9]*")[[]][1]
Error in strsplit(nml, "MIMAT[0-9]*")[[]] :
  invalid subscript type 'symbol'

I guess a smart combination of strsplit and sapply or lapply could do the job in a single command line, but I haven't been able to get the syntax right.
I would greatly appreciate some help from R language experts. I know I can use a for-loop to get what I am after, but I definitely wish to learn to use a high-level language as it deserves rather than in C style.

Thank you in advance,
Maura

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
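
The strsplit/sapply combination mentioned above can be sketched as follows. This is only a minimal illustration, not a definitive answer: it assumes nml is a plain character vector in which the first space always ends the miRNA name, and it uses a three-element sample in place of the full 890-element vector. The names parts, ids and ids_sub are made up for the example.

```r
# Small sample standing in for the full vector `nml` (assumption: same layout)
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-miR-15a MIMAT0000068 Homo sapiens miR-15a",
         "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*")

# Split every element on a literal space; the result is a list with
# one character vector per element of nml
parts <- strsplit(nml, " ", fixed = TRUE)

# Apply the `[` extractor to each list element to keep its first piece;
# sapply simplifies the result back to a character vector
ids <- sapply(parts, `[`, 1)

# Alternative without splitting: delete everything from the first space onward
ids_sub <- sub(" .*", "", nml)

print(ids)
```

Splitting on the space rather than on "MIMAT[0-9]*" avoids the trailing space seen in the attempts above. The sub() variant is vectorised over nml, so it needs neither a loop nor an apply call.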