Yes, I meant both input and output. It would not be the default, so hopefully no programs would get a long-line surprise. The speed advantage is a single read for the whole sequence and not having to remove newlines. Indexing sub-sequences with locators becomes straightforward, since the newlines don't get in the way. Most genome packages use it, I think, including mine. Thanks, yes, I thought it must be quite easy to do.

Niels

On Tue, 2013-08-27 at 10:41 +0100, Peter Rice wrote:
> On 27/08/2013 09:40, Niels Larsen wrote:
> > EMBOSS list,
> >
> > I could not find a fasta single-line sequence format, is it
> > missing? having the sequence as a single line does not
> > violate fasta format i think, and many programs use it
> > because of speed and indexing convenience.
>
> You mean as an output format I assume? (it would be no problem for input).
>
> Easy to implement, but needs a name so you can so specify
> -osformat fastasingle (for example)
>
> It can also be an issue for applications that fail to check for very
> long input lines.
>
> I don't see any real benefit for indexing - you only need to point to
> the start of the ID line for that. Maybe there are applications that map
> the sequence string and want to have no extra characters.
>
> regards,
>
> Peter Rice
> EMBOSS Team

_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
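
P.S. To illustrate the indexing point, here is a rough Python sketch of how a sub-sequence locator can map directly to a file offset when each sequence sits on one line. It is not EMBOSS code or the code from any particular package; the function names (write_single_line_fasta, build_index, fetch) are made up for this example, and it assumes ASCII data, one header line followed by exactly one sequence line per entry, and 0-based, end-exclusive locators.

```python
import os
import tempfile


def write_single_line_fasta(path, records):
    """Write (id, sequence) pairs with each sequence on exactly one line."""
    with open(path, "wb") as out:
        for seq_id, seq in records:
            out.write(b">" + seq_id.encode("ascii") + b"\n")
            out.write(seq.encode("ascii") + b"\n")


def build_index(path):
    """Map each id to (byte offset of the first residue, sequence length)."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            header = handle.readline()
            if not header:
                break
            # The id is the first word after '>'.
            seq_id = header[1:].split()[0].decode("ascii")
            start = handle.tell()
            seq_line = handle.readline()
            index[seq_id] = (start, len(seq_line.rstrip(b"\r\n")))
    return index


def fetch(path, index, seq_id, begin, end):
    """Return residues begin..end (0-based, end exclusive) with one seek and one read."""
    start, length = index[seq_id]
    if not 0 <= begin <= end <= length:
        raise ValueError("locator outside sequence")
    with open(path, "rb") as handle:
        # No newlines inside the sequence, so the residue position
        # is simply added to the start offset.
        handle.seek(start + begin)
        return handle.read(end - begin).decode("ascii")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.fasta")
    write_single_line_fasta(path, [("seq1", "ACGTACGTAC"), ("seq2", "TTTTGGGGCC")])
    idx = build_index(path)
    print(fetch(path, idx, "seq2", 4, 8))  # GGGG
```

With wrapped lines, the same lookup would also need the line width to correct the offset for embedded newlines, and the bytes read would still have to be stripped of them.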
