Greetings again,

If I may, another question on the issue of IG format: how difficult would it be to support database indexing for this format?
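To make the question concrete, here is a minimal Python sketch of the kind of index I have in mind. It is not EMBOSS code and says nothing about how EMBOSS indexes databases internally; the function names `index_ig` and `fetch_ig` are hypothetical. It assumes each record consists of zero or more `;` annotation lines, one ID line, and sequence lines whose last line ends in `1` or `2`.

```python
import io


def index_ig(handle):
    """Map each record ID to the byte offset where that record starts.

    Assumed layout (an assumption of this sketch): ';' annotation lines,
    then one ID line, then sequence lines, the last ending in '1' or '2'.
    `handle` must be a seekable binary file object.
    """
    index = {}
    record_start = None  # offset of the first line of the current record
    have_id = False      # True once the ID line of the record was seen
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if record_start is None:
            record_start = offset
        if not have_id:
            if text.startswith(b";"):
                continue  # annotation line, still before the ID
            record_id = text.split()[0].decode("ascii", "replace")
            index[record_id] = record_start
            have_id = True
        elif text.endswith((b"1", b"2")):
            # Terminator reached: the next non-blank line opens a new record.
            record_start = None
            have_id = False
    return index


def fetch_ig(handle, offset):
    """Return one record as text, keeping every annotation line separate."""
    handle.seek(offset)
    lines = []
    have_id = False
    for raw in iter(handle.readline, b""):
        text = raw.decode("ascii", "replace").strip()
        if not text:
            continue
        lines.append(text)
        if not have_id:
            if not text.startswith(";"):
                have_id = True  # this was the ID line
        elif text.endswith(("1", "2")):
            break  # stop at the terminator, before the next record's ';'
    return "\n".join(lines)
```

With `index = index_ig(handle)`, a call such as `fetch_ig(handle, index["EMBOSS_001"])` would read only that record. Because the terminator check applies only after the ID line, an ID that itself ends in `1` (like `EMBOSS_001`) is not mistaken for the end of a sequence.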

With best regards,
Daniel

--
Daniel Rozenbaum
Biocceleration, Inc.
OCIO/ Office of Application Engineering & Development/ Patent System Division
600 Dulany St.
Alexandria, VA 22314

On Sep 14, 2012, at 9:36 AM, "Rozenbaum, Daniel (Biocceleration Inc)" <daniel.rozenb...@uspto.gov> wrote:

> Hello Peter and everyone,
>
> I was wondering if I could revive the discussion about the support of IG
> format if possible. I'm helping deploy EMBOSS at the US Patent and Trademark
> Office, where this format, in its multi-line sequence annotation form, is
> used extensively.
>
> Here's an example of an additional issue I've run into when trying to work
> with IG format in EMBOSS:
>
> % makeprotseq -amount 10 -length 10 -nouseinsert -osformat ig -auto -osname
> ig1
>
> % cat ig1.ig
> ;, 10 bases
> EMBOSS_001
> hcsptpstas1
> ;, 10 bases
> EMBOSS_002
> rdgwcvmtrm1
> ;, 10 bases
> EMBOSS_003
> fgtifgdgid1
> <snip>
>
> % entret -sequence ig1.ig:EMBOSS_001 -nofirstonly -auto -stdout
> ;, 10 bases
> EMBOSS_001
> hcsptpstas1
> ;, 10 bases
>
> In the entret result above the first annotation line of the subsequent record
> is returned as part of the requested record.
>
> Many thanks,
> Daniel
> --
> Daniel Rozenbaum
> Biocceleration, Inc.
> OCIO/ Office of Application Engineering & Development/ Patent System Division
> 600 Dulany St.
> Alexandria VA 22314
>
> -------------------------
> On 15/08/2012 17:57, Daniel Rozenbaum wrote:
>> Dear list,
>>
>> (Peter, many thanks for your prompt reply to my previous inquiry!)
>>
>> We need to deal with extensive databases in Intelligenetics format with
>> multiple lines in annotation of each record. It appears however that EMBOSS
>> concatenates all annotation lines into a single line when building its
>> internal representation of the sequence description:
>>
>> % cat /tmp/IGSEQ.ig
>> ; Annotation line 1
>> ; Annotation line 2
>> ; Annotation line 3
>> IGSEQ
>> ACGCATCGCATCAGACTACGC1
>>
>>
>> % seqret /tmp/IGSEQ.ig -osformat2 ig -auto -osname IGSEQ.emboss_ig2ig
>> -osdirectory /tmp
>>
>>
>> % cat /tmp/IGSEQ.emboss_ig2ig.ig
>> ;Annotation line 1 Annotation line 2 Annotation line 3, 21 bases
>> IGSEQ
>> ACGCATCGCATCAGACTACGC1
>>
>> Are there any plans to support multi-line annotation in this format?
>
> Interesting thought. We will take a look. It will need some care to
> maintain compatibility with other formats that have single (FASTA) or
> multiple (swissprot) descriptions.
>
> Which package is using this IG format?
>
> regards,
>
> Peter Rice
> EMBOSS Team
>
>
>
> _______________________________________________
> EMBOSS mailing list
> EMBOSS@lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss