On Tue, 29 Sep 2009, mau...@alice.it wrote:
After converting a miRNA file from FASTA to character format, I get a vector
that looks like the following:
nml
[1] "hsa-let-7a MIMAT0000062 Homo sapiens let-7a"
[2] "hsa-let-7b MIMAT0000063 Homo sapiens let-7b"
[3] "hsa-let-7c MIMAT0000064 Homo sapiens let-7c"
[4] "hsa-let-7d MIMAT0000065 Homo sapiens let-7d"
[5] "hsa-let-7e MIMAT0000066 Homo sapiens let-7e"
[6] "hsa-let-7f MIMAT0000067 Homo sapiens let-7f"
[7] "hsa-miR-15a MIMAT0000068 Homo sapiens miR-15a"
[8] "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16"
[9] "hsa-miR-17 MIMAT0000070 Homo sapiens miR-17"
[10] "hsa-miR-18a MIMAT0000072 Homo sapiens miR-18a"
.......................................................................................................
[888] "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*"
[889] "hsa-miR-888* MIMAT0004917 Homo sapiens miR-888*"
[890] "hsa-miR-541* MIMAT0004919 Homo sapiens miR-541*"
My goal is to extract into a vector only the first string of each element,
i.e. the text preceding the first space from the left.
For the records listed above I would obtain:
[1] "hsa-let-7a"
[2] "hsa-let-7b"
[3] "hsa-let-7c"
[4] "hsa-let-7d"
[5] "hsa-let-7e"
[6] "hsa-let-7f"
[7] "hsa-miR-15a"
[8] "hsa-miR-16"
[9] "hsa-miR-17"
[10] "hsa-miR-18a"
.......................................................................................................
[888] "hsa-miR-675*"
[889] "hsa-miR-888*"
[890] "hsa-miR-541*"
sub( "[ ].*", "", nml )
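The one-liner above works because sub() is vectorised over its third argument and replaces only the first match in each string; the pattern "[ ].*" matches the first space and everything after it. A small self-contained check, assuming a three-element stand-in for nml built from records shown in the post (ids is a name introduced here for illustration only):

```r
# Stand-in for the poster's vector, using three of the records shown above.
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16",
         "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*")

# Delete everything from the first space to the end of each string.
ids <- sub("[ ].*", "", nml)
print(ids)
# [1] "hsa-let-7a"   "hsa-miR-16"   "hsa-miR-675*"
```

No loop or apply call is needed, since the regular-expression functions already operate on every element of a character vector.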
I tried using strsplit as follows:
strsplit(nml,"MIMAT[0-9]*")
from here I get a list of character vectors, and I can extract the string I
need with the [[]] operator, as shown in the following.
strsplit(nml,"MIMAT[0-9]*")[[1]][1]
[1] "hsa-let-7a "
strsplit(nml,"MIMAT[0-9]*")[[2]][1]
[1] "hsa-let-7b "
Unfortunately the [[]] operator acts on one list element at a time. In fact:
strsplit(nml,"MIMAT[0-9]*")[[]][1]
Error in strsplit(nml, "MIMAT[0-9]*")[[]] :
invalid subscript type 'symbol'
I guess a smart combination of strsplit and sapply or lapply could do the job
with one command line only ...
but I haven't been able to get the syntax right ... I would greatly appreciate
some help from R language experts.
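For the strsplit route asked about here, one possible sketch is to split on a literal space and let vapply() pull the first piece out of every list element by calling the `[` function with index 1. The three-element nml below is a stand-in built from records in the post, and pieces and first_field are hypothetical names used only for this illustration:

```r
# Stand-in for the poster's vector, using three of the records shown above.
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16",
         "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*")

# strsplit() returns a list holding one character vector per input string.
# fixed = TRUE treats " " as a literal space rather than a regular expression.
pieces <- strsplit(nml, " ", fixed = TRUE)

# Call `[`(x, 1) on every list element; vapply() also checks that each
# result is a single character string.
first_field <- vapply(pieces, `[`, character(1), 1)
print(first_field)
# [1] "hsa-let-7a"   "hsa-miR-16"   "hsa-miR-675*"
```

Splitting on the space itself, rather than on "MIMAT[0-9]*", avoids the trailing blank seen in the output above. sapply(pieces, "[", 1) gives the same result with less typing; vapply() is used in this sketch only because it makes the expected result type explicit.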
I know I can use a for-loop to get what I am after. But I definitely
wish to learn to use a high-level language
as it deserves rather than in C style.
Thank you in advance,
Maura
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901