Hello,
Please have a look at the code below, which I use to read in the attached
file. Line 18 of the file reads "1065:>sp|Q9V3T9|ADRO_DROME
NADPH:adrenodoxin oxidoreductase, mitochondrial OS=Drosophila melanogaster
GN=dare PE=2 SV=1", so I expect the code to produce a three-column data frame
in which the last column is mostly empty, and line 18 to become a data.frame
row like this:
V1    V2                           V3
1065  >sp|Q9V3T9|ADRO_DROME NADPH  adrenodoxin oxidoreductase, mitochondrial OS=Drosophila melanogaster GN=dare PE=2 SV=1
Why is that not so?
Thanks for any hint.
Sincerely, Joh
read.table(
  "/tmp/testfile.txt",
  sep = ":",
  header = FALSE,
  quote = "",
  fill = TRUE
)[19, ]
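For reference, here is a minimal sketch of how the field counts per line could be checked. It does not use the attached file itself: the three sample lines and the temporary file are stand-ins for /tmp/testfile.txt. It relies on the documented behaviour that read.table() infers the number of columns from the first five lines of input unless a longer col.names is supplied.

```r
# Three sample records in the same "offset:text" shape as the attached file
# (stand-ins for /tmp/testfile.txt, written to a temporary file).
sample_lines <- c(
  "1060:EKQK",
  "1065:>sp|Q9V3T9|ADRO_DROME NADPH:adrenodoxin oxidoreductase, mitochondrial OS=Drosophila melanogaster GN=dare PE=2 SV=1",
  "1180:MGINCLNIFRRGLHTSSARLQVIQSTTPTKRICIVGAGPAGFYAAQLILKQLDNCVVDVV"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Count the ":"-separated fields on every line, with the same quoting
# settings as the read.table() call.
n_fields <- count.fields(tmp, sep = ":", quote = "", comment.char = "")
print(n_fields)

# Supply col.names explicitly so the number of columns is taken from the
# widest line rather than inferred from the first few lines.
df <- read.table(
  tmp,
  sep = ":",
  header = FALSE,
  quote = "",
  fill = TRUE,
  comment.char = "",
  stringsAsFactors = FALSE,
  col.names = paste0("V", seq_len(max(n_fields)))
)
print(df[2, ])
```

On the real file, which(n_fields > 2) would list the lines whose text contains an additional ":".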
0:>sp|Q7K2G1|ADRM1_DROME Proteasomal ubiquitin receptor ADRM1 homolog OS=Drosophila melanogaster GN=CG13349 PE=1 SV=1
116:MFGRQSGLGSSSNSSNLVEFRAGRMNMVGKMVHPDPRKGLVYMTQSDDGLMHFCWKDRTS
177:GKVEDDLIVFPDDFEYKRVDQCKTGRVYVLKFKSSTRRMFFWMQEPKTDKDDEQCRRINE
238:LLNNPPSAHQRGGGGSNDGDLQYMLNNMSQQQLMQLFGGVGQMGGLSSLLGQMNSRTPSS
299:RNTSSSGGGGASALQTPENVSVPRTPSAPSKSGSSRSSSNVNSQVGEGAGSSVDADAPGR
360:SLNIDLSTALPGADAINQIIADPEHVKTLIVHLPESEDVDDDRKQQIKDNITSPQFQQAL
421:AQFSSALQSAQLGPVIKQFELSNEAVAAAFSGNLEDFVRALEKSLPPGATMGGKPSASEK
482:KASDPETPTSVARDENTDPATEKQEEKQK
512:>sp|Q7K2G1-2|ADRM1_DROME Isoform 2 of Proteasomal ubiquitin receptor ADRM1 homolog OS=Drosophila melanogaster GN=CG13349
633:MFGRQSGLGSSSNSSNLVEFRAGRMNMVGKMVHPDPRKGLVYMTQSDDGLMHFCWKDRTS
694:GKVEDDLIVFPDDFEYKRVDQCKTGRVYVLKFKSSTRRMFFWMQEPKTDKDDEQCRRINE
755:LLNNPPSAHQRGGGGSNDGDLQYMLNNMSQQQLMQLFGGVGQMGGLSSLLGQMNSRTPSS
816:RNTSSSGGGGASALQTPENVSVPRTPSAPSKSGSSRSSSNVNSQVGEGAGSSVDADAPGK
877:NSTTSTTTASKSTGAYANPFQAYLSNLSPEHGAGRSLNIDLSTALPGADAINQIIADPEH
938:VKTLIVHLPESEDVDDDRKQQIKDNITSPQFQQALAQFSSALQSAQLGPVIKQFELSNEA
999:VAAAFSGNLEDFVRALEKSLPPGATMGGKPSASEKKASDPETPTSVARDENTDPATEKQE
1060:EKQK
1065:>sp|Q9V3T9|ADRO_DROME NADPH:adrenodoxin oxidoreductase, mitochondrial OS=Drosophila melanogaster GN=dare PE=2 SV=1
1180:MGINCLNIFRRGLHTSSARLQVIQSTTPTKRICIVGAGPAGFYAAQLILKQLDNCVVDVV
1241:EKLPVPFGLVRFGVAPDHPEVKNVINTFTKTAEHPRLRYFGNISLGTDVSLRELRDRYHA
1302:VLLTYGADQDRQLELENEQLDNVISARKFVAWYNGLPGAENLAPDLSGRDVTIVGQGNVA
1363:VDVARMLLSPLDALKTTDTTEYALEALSCSQVERVHLVGRRGPLQAAFTIKELREMLKLP
1424:NVDTRWRTEDFSGIDMQLDKLQRPRKRLTELMLKSLKEQGRISGSKQFLPIFLRAPKAIA
1485:PGEMEFSVTELQQEAAVPTSSTERLPSHLILRSIGYKSSCVDTGINFDTRRGRVHNINGR
1546:ILKDDATGEVDPGLYVAGWLGTGPTGVIVTTMNGAFAVAKTICDDINTNALDTSSVKPGY
1607:DADGKRVVTWDGWQRINDFESAAGKAKGKPREKIVSIEEMLRVAGV
1654:>sp|Q26365|ADT_DROME ADP,ATP carrier protein OS=Drosophila melanogaster GN=sesB PE=2 SV=4
1744:MGNISASITSQSKMGKDFDAVGFVKDFAAGGISAAVSKTAVAPIERVKLLLQVQHISKQI
1805:SPDKQYKGMVDCFIRIPKEQGFSSFWRGNLANVIRYFPTQALNFAFKDKYKQVFLGGVDK
1866:NTQFWRYFAGNLASGGAAGATSLCFVYPLDFARTRLAADTGKGGQREFTGLGNCLTKIFK
1927:SDGIVGLYRGFGVSVQGIIIYRAAYFGFYDTARGMLPDPKNTPIYISWAIAQVVTTVAGI
1988:VSYPFDTVRRRMMMQSGRKATEVIYKNTLHCWATIAKQEGTGAFFKGAFSNILRGTGGAF
2049:VLVLYDEIKKVL
2062:>sp|Q26365-2|ADT_DROME Isoform A of ADP,ATP carrier protein OS=Drosophila melanogaster GN=sesB
2157:MGKDFDAVGFVKDFAAGGISAAVSKTAVAPIERVKLLLQVQHISKQISPDKQYKGMVDCF
2218:IRIPKEQGFSSFWRGNLANVIRYFPTQALNFAFKDKYKQVFLGGVDKNTQFWRYFAGNLA
2279:SGGAAGATSLCFVYPLDFARTRLAADTGKGGQREFTGLGNCLTKIFKSDGIVGLYRGFGV
2340:SVQGIIIYRAAYFGFYDTARGMLPDPKNTPIYISWAIAQVVTTVAGIVSYPFDTVRRRMM
2401:MQSGRKATEVIYKNTLHCWATIAKQEGTGAFFKGAFSNILRGTGGAFVLVLYDEIKKVL
2461:>sp|P37193|ADXH_DROME Adrenodoxin-like protein, mitochondrial OS=Drosophila melanogaster GN=Fdxh PE=2 SV=3
2568:MFCLLLRRSAVHNSCKLISKQIAKPAFYTPHNALHTTIPRRHGEFEWQDPKSTDEIVNIT
2629:YVDKDGKRTKVQGKVGDNVLYLAHRHGIEMEGACEASLACTTCHVYVQHDYLQKLKEAEE
2690:QEDDLLDMAPFLRENSRLGCQILLDKSMEGMELELPKATRNFYVDGHKPKPH
2743:>sp|P39413|AEF1_DROME Adult enhancer factor 1 OS=Drosophila melanogaster GN=Aef1 PE=1 SV=1
2834:MMHIKSLPHAHAAATAMSSNCDIVIVAAQPQTTIANNNNNETVTQATHPAHMAAVQQQQQ
2895:QQQQQQQQHHQQQQQQSSGPPSVPPPPTELPLPFQMHLSGISAEAHSAAQAAAMAAAQAA
2956:AAQAAAAEQQQPPPPTSHLTHLTTHSPTTIHSEHYLANGHSEHPGEGNAAVGVGGAVREP
3017:EKPFHCTVCDRRFRQLSTLTNHVKIHTGEKPYKCNVCDKTFRQSSTLTNHLKIHTGEKPY
3078:NCNFCPKHFRQLSTLANHVKIHTGEKPFECVICKKQFRQSSTLNNHIKIHVMDKVYVPVK
3139:IKTEEDEG
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.