Hi all,

On the continuing topic of the nebulous FASTQ format, are there any strong views as to whether a FASTQ file could hold records without a sequence (and therefore without quality scores)? This could make sense as output from an (aggressive) quality filter.
This was a discussion I meant to start on the OBF list, not the EMBOSS list - so here is the start of the thread:
http://lists.open-bio.org/pipermail/emboss/2009-July/003707.html

Basically, in some contexts an empty FASTQ record makes sense, so perhaps we should include examples of this in our test suite. However, there is more than one reasonable way to represent such a record (either omitting the sequence and quality lines, or including blank sequence and quality lines).

On Thu, Jul 30, 2009 at 4:09 PM, Peter Rice <[email protected]> wrote:
>
> Peter C. wrote:
>
>> As we are recommending no line wrapping on output this means
>> typical FASTQ records would be four lines - so doing the same
>> makes sense here too.
>
> I vote for 4 lines on output.

If we want to allow zero length sequences, then yes, I would also vote for the 4 line output (i.e. blank lines for the sequence and the quality string).

> It should be possible to allow zero lines on input depending on
> where the '+' check is.

Yes, I'm pretty sure a parser could cope with any of the zero length sequence FASTQ examples I gave.

Peter
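
To illustrate the point about where the '+' check sits, here is a minimal Python sketch of a lenient reader and a four-line writer. The function names parse_fastq and format_fastq are made up for this example and are not taken from EMBOSS or any other project. The sketch assumes that blank lines between records can be ignored, and it only shows one way such a parser might cope with both representations of an empty record.

```python
def parse_fastq(lines):
    """Yield (title, sequence, quality) tuples from an iterable of text lines.

    Hypothetical helper for illustration only. Sequence lines are collected
    until the '+' line, then quality lines are read until they cover the
    sequence length, so a leading '@' in a quality string is not mistaken
    for the next title.
    """
    it = iter(lines)
    for line in it:
        line = line.rstrip("\r\n")
        if not line:
            # Assumption: blank lines between records are skipped. This also
            # absorbs the blank quality line of a four-line empty record.
            continue
        if not line.startswith("@"):
            raise ValueError(f"expected '@' title line, got {line!r}")
        title = line[1:]

        # Collect sequence lines (possibly none, possibly blank) up to '+'.
        seq_parts = []
        for line in it:
            line = line.rstrip("\r\n")
            if line.startswith("+"):
                break
            seq_parts.append(line)
        else:
            raise ValueError(f"record {title!r} has no '+' line")
        seq = "".join(seq_parts)

        # Read quality lines until they match the sequence length.
        # For a zero-length sequence this loop does not run at all.
        qual = ""
        while len(qual) < len(seq):
            try:
                qual += next(it).rstrip("\r\n")
            except StopIteration:
                raise ValueError(f"record {title!r} has truncated quality") from None
        if len(qual) != len(seq):
            raise ValueError(f"record {title!r} has mismatched quality length")
        yield title, seq, qual


def format_fastq(title, seq, qual):
    """Return one record as exactly four unwrapped lines, even when empty."""
    return f"@{title}\n{seq}\n+\n{qual}\n"


# The same empty record written both ways, each followed by a normal record
# whose quality string happens to start with '@'.
four_line = ["@empty\n", "\n", "+\n", "\n", "@r1\n", "ACGT\n", "+\n", "@III\n"]
two_line = ["@empty\n", "+\n", "@r1\n", "ACGT\n", "+\n", "@III\n"]

records_four = list(parse_fastq(four_line))
records_two = list(parse_fastq(two_line))
```

Both inputs parse to the same two records here, and format_fastq always emits the four-line form, with blank lines for the sequence and the quality string of an empty record.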
