On Tue, 29 Sep 2009, mau...@alice.it wrote:

After converting a miRNA file from FASTA to character format, I get a vector
that looks like this:

nml
 [1] "hsa-let-7a MIMAT0000062 Homo sapiens let-7a"
 [2] "hsa-let-7b MIMAT0000063 Homo sapiens let-7b"
 [3] "hsa-let-7c MIMAT0000064 Homo sapiens let-7c"
 [4] "hsa-let-7d MIMAT0000065 Homo sapiens let-7d"
 [5] "hsa-let-7e MIMAT0000066 Homo sapiens let-7e"
 [6] "hsa-let-7f MIMAT0000067 Homo sapiens let-7f"
 [7] "hsa-miR-15a MIMAT0000068 Homo sapiens miR-15a"
 [8] "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16"
 [9] "hsa-miR-17 MIMAT0000070 Homo sapiens miR-17"
[10] "hsa-miR-18a MIMAT0000072 Homo sapiens miR-18a"
       
.......................................................................................................
[888] "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*"
[889] "hsa-miR-888* MIMAT0004917 Homo sapiens miR-888*"
[890] "hsa-miR-541* MIMAT0004919 Homo sapiens miR-541*"


My goal is to extract into a new vector only the first string of each element,
i.e. everything that precedes the first space, reading from the left.
For the records listed above I would like to obtain:
 [1] "hsa-let-7a"
 [2] "hsa-let-7b"
 [3] "hsa-let-7c"
 [4] "hsa-let-7d"
 [5] "hsa-let-7e"
 [6] "hsa-let-7f"
 [7] "hsa-miR-15a"
 [8] "hsa-miR-16"
 [9] "hsa-miR-17"
[10] "hsa-miR-18a"
       
.......................................................................................................
[888] "hsa-miR-675*"
[889] "hsa-miR-888*"
[890] "hsa-miR-541*"



sub( "[ ].*", "", nml )
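
As a minimal sketch of how this one-liner behaves, the example below applies it
to three of the records shown above. The short nml vector is retyped here for
illustration only, and the name ids is an assumption, not part of the original
post.

```r
# Three records from the vector shown above, retyped for illustration
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-miR-15a MIMAT0000068 Homo sapiens miR-15a",
         "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*")

# "[ ].*" matches the first space and everything after it;
# replacing that match with "" keeps only the leading name.
# sub() is vectorized, so no loop or sapply is needed.
ids <- sub("[ ].*", "", nml)
print(ids)
```

Because sub() works on the whole character vector at once, the result has the
same length as nml, and a trailing "*" in a name is kept as ordinary text.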



I tried using strsplit as follows:
strsplit(nml,"MIMAT[0-9]*")
This gives me a list of character vectors, and I can extract the string I need
with the [[]] operator, as shown below:
strsplit(nml,"MIMAT[0-9]*")[[1]][1]
[1] "hsa-let-7a "
strsplit(nml,"MIMAT[0-9]*")[[2]][1]
[1] "hsa-let-7b "

Unfortunately, the [[]] operator acts on only one list element at a time. In fact:
strsplit(nml,"MIMAT[0-9]*")[[]][1]
Error in strsplit(nml, "MIMAT[0-9]*")[[]] :
 invalid subscript type 'symbol'
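
To see why [[]] needs an explicit index, it may help to inspect what strsplit
returns. The sketch below uses two records retyped from the listing above; the
name res is hypothetical and introduced only for this illustration.

```r
# Two records from the vector shown above, retyped for illustration
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-let-7b MIMAT0000063 Homo sapiens let-7b")

# strsplit returns a list with one character vector per input element
res <- strsplit(nml, "MIMAT[0-9]*")
print(class(res))
print(length(res))

# [[i]] selects exactly one of those vectors, so an index is required
print(res[[1]])

# Splitting on the MIMAT accession leaves the trailing space in place;
# trimws() is one way to drop it
print(trimws(res[[1]][1]))
```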

I guess a smart combination of strsplit and sapply or lapply could do the job
in a single command line,
but I haven't been able to get the syntax right. I would greatly appreciate
some help from R language experts.
I know I can use a for-loop to get what I am after, but I definitely
wish to learn to use a high-level language
as it deserves, rather than in C style.
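
For the strsplit-plus-sapply combination asked about here, one possible sketch
is shown below. It splits on a single space rather than on the MIMAT pattern,
which is a choice made for this illustration; the names parts, ids and ids_safe
are likewise hypothetical.

```r
# Three records from the vector shown above, retyped for illustration
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16",
         "hsa-miR-888* MIMAT0004917 Homo sapiens miR-888*")

# Split every element on a literal space: a list of character vectors
parts <- strsplit(nml, " ", fixed = TRUE)

# Apply the "[" function to each list element, asking for position 1
ids <- sapply(parts, "[", 1)

# vapply does the same but also checks that each result is one string
ids_safe <- vapply(parts, function(p) p[1], character(1))

print(ids)
```

Splitting on the space avoids the trailing blank that the MIMAT pattern leaves
behind, and vapply is the stricter variant when the result type matters.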

Thank you in advance,
Maura








______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
