Hi Daniel,

We index the Geneseq databases in EMBL format. The problem with the Geneseq format is that each entry has several lines starting with "SQ ". To make indexing work, you just need to write a program that prints only the first line starting with "SQ " in each entry and skips the following SQ lines, and then index its output as EMBL.

Hope this helps!
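A minimal sketch of such a filter in Python, in case it is useful. It is not the program we run here, just an illustration. It assumes that each entry ends with a "//" line, as in standard EMBL flat files, and that the extra SQ lines are not needed for indexing. The function name `keep_first_sq`, the command-line arguments, and the Latin-1 encoding are all choices made for this example.

```python
import sys
from typing import Iterable, Iterator


def keep_first_sq(lines: Iterable[str]) -> Iterator[str]:
    """Yield all lines, but only the first "SQ " line of each entry.

    Assumption: an entry is terminated by a line starting with "//".
    """
    seen_sq = False  # True once the current entry's first SQ line was emitted
    for line in lines:
        if line.startswith("//"):
            # End of entry: reset the flag for the next entry.
            seen_sq = False
            yield line
        elif line.startswith("SQ "):
            if not seen_sq:
                seen_sq = True
                yield line
            # Any further SQ line in the same entry is dropped.
        else:
            yield line


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python filter_sq.py geneseq_in.dat geneseq_out.dat
    # Latin-1 is an assumption; it maps every byte to a character and back,
    # so unknown bytes pass through unchanged.
    with open(sys.argv[1], encoding="latin-1") as src, \
            open(sys.argv[2], "w", encoding="latin-1") as dst:
        dst.writelines(keep_first_sq(src))
```

The filter streams line by line, so it should cope with large release files without loading them into memory. The filtered file is then the one to index as EMBL.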
Best regards,
Isabelle Wells
F. Hoffmann-La Roche Ltd

-----Original Message-----
From: emboss-boun...@lists.open-bio.org [mailto:emboss-boun...@lists.open-bio.org] On Behalf Of Rozenbaum, Daniel (Biocceleration Inc)
Sent: Wednesday, 6. February 2013 16:50
To: emboss@lists.open-bio.org
Subject: [EMBOSS] Working with Geneseq databases

Dear all,

Does anyone have experience getting EMBOSS to work with the Geneseq database distributed by Thomson Reuters ( http://thomsonreuters.com/products_services/science/science_products/a-z/geneseq/ ) ? This database comes in "EMBL-like" format that uses some line codes that are not defined in EMBL format proper, which in our experiments has caused problems when, for example, trying to index these databases as EMBL-formatted.

--
Daniel Rozenbaum
Biocceleration, Inc.
OCIO/Office of Application Engineering and Development/Patent System Division
600 Dulany St., Alexandria, VA 22314

_______________________________________________
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss