On 28 Nov 2016, at 17:51, Sebastian Gregoricchio <sebastian.gregoricc...@gmail.com> wrote:
> I am a student of molecular biology (master degree) and I am studying genome
> assembly against a reference genome.
> I have a, maybe stupid, question: if i have a quality raw in a fastq file
> that starts for symbol @, how can samtools distinguish that raw from the
> "description" one?

Samtools itself doesn't currently read FASTQ files much, but of course other tools do. FASTQ is not particularly well specified (see https://en.wikipedia.org/wiki/FASTQ_format and the paper referenced therein), and there are essentially two ways to handle this problem, depending on whether you insist that the sequence and quality strings appear on one line apiece or allow them to be split over multiple lines.

1. As FASTQ files typically represent shortish sequencing reads, the great majority of FASTQ files in the wild likely have sequence and quality on one line each, as described in the Wikipedia article. (FASTA files are the opposite and are usually line-wrapped.) So if you require your input to be formatted like this, you effectively just read four lines at a time. Line 1 is then the description and line 4 is the quality, regardless of what character it starts with.

2. If you allow them to be wrapped over multiple lines, parsing is a little harder. But by the time you've read the '+' line, you already know how many letters there are in the sequence, so you can read that same number of quality letters without caring whether any of them might be '@'. You will then be in the right place to read the @description line of the following record.

    John

_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
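
For illustration, the two approaches described in the reply above could be sketched in Python roughly as follows. This is only a minimal sketch, not how samtools or any particular tool does it: the function names read_fastq_4line and read_fastq_wrapped are made up for this example, and it assumes plain uncompressed text input in which blank lines between records can be skipped.

```python
import io
from itertools import islice


def read_fastq_4line(handle):
    """Approach 1 (hypothetical helper): every record is exactly four lines."""
    while True:
        lines = [line.rstrip("\r\n") for line in islice(handle, 4)]
        if not lines:
            return  # clean end of file
        if len(lines) < 4 or not lines[0].startswith("@") or not lines[2].startswith("+"):
            raise ValueError("malformed four-line FASTQ record")
        name, seq, _, qual = lines
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Line 4 is the quality whatever character it starts with.
        yield name[1:], seq, qual


def read_fastq_wrapped(handle):
    """Approach 2 (hypothetical helper): sequence and quality may be line-wrapped."""
    line = handle.readline()
    while line:
        if not line.strip():
            line = handle.readline()  # assumption: blank lines between records are ignored
            continue
        if not line.startswith("@"):
            raise ValueError("expected '@' at the start of a record")
        name = line[1:].rstrip("\r\n")

        # Sequence lines run up to the '+' separator line.
        seq_parts = []
        line = handle.readline()
        while line and not line.startswith("+"):
            seq_parts.append(line.rstrip("\r\n"))
            line = handle.readline()
        if not line:
            raise ValueError("missing '+' line")
        seq = "".join(seq_parts)

        # Read quality lines until we have as many characters as sequence
        # letters; a leading '@' is just another quality character here.
        qual_parts, count = [], 0
        while count < len(seq):
            line = handle.readline()
            if not line:
                raise ValueError("truncated quality string")
            chunk = line.rstrip("\r\n")
            qual_parts.append(chunk)
            count += len(chunk)
        qual = "".join(qual_parts)
        if len(qual) != len(seq):
            raise ValueError("sequence and quality lengths differ")

        yield name, seq, qual
        line = handle.readline()  # should be the next '@description' line


if __name__ == "__main__":
    # Both quality strings below begin with '@'.
    example = "@read1\nACGT\n+\n@III\n@read2\nGG\n+\n@@\n"
    for record in read_fastq_wrapped(io.StringIO(example)):
        print(record)
```

The second function also accepts ordinary four-line files, so the stricter first one is mainly worthwhile when you would rather reject wrapped input than guess at it.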