On 28 Nov 2016, at 17:51, Sebastian Gregoricchio 
<sebastian.gregoricc...@gmail.com> wrote:
> I am a student of molecular biology (master degree) and I am studying genome 
> assembly against a reference genome.
> I have a, maybe stupid, question: if I have a quality row in a FASTQ file 
> that starts with the symbol @, how can samtools distinguish that row from the 
> "description" one?

Samtools itself currently does little reading of FASTQ files, but of course other 
tools do.  FASTQ is not particularly well specified (see 
https://en.wikipedia.org/wiki/FASTQ_format and the paper referenced therein), 
and there are essentially two ways to handle this problem, depending on whether 
you insist that the sequence and quality strings each appear on a single line or 
allow them to be split over multiple lines.

1. As FASTQ files typically represent shortish sequencing reads, the great 
majority of FASTQ files in the wild likely have sequence and quality on one 
line each, as described in the Wikipedia article.  (FASTA files are the 
opposite and are usually line-wrapped.) 

So if you require your input to be formatted like this, you simply read four 
lines at a time.  Then line 1 is the description and line 4 is the quality, 
regardless of what character it starts with.
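
As a rough illustration of the four-lines-at-a-time approach, here is a minimal 
Python sketch.  The function name read_fastq_4line is made up for this example 
and is not part of samtools or any other tool; it assumes every record occupies 
exactly four lines and does only basic sanity checks.

```python
import io


def read_fastq_4line(handle):
    """Yield (description, sequence, quality) tuples.

    Assumes each record is exactly four lines: @description, sequence,
    '+' line, quality.  The quality line is identified purely by its
    position, so it may start with '@' without causing confusion.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header.strip())
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:].rstrip("\r\n"), seq, qual


# Usage: the quality string of the first record starts with '@'.
example = "@r1\nACGT\n+\n@III\n@r2\nGG\n+\n@@\n"
for description, sequence, quality in read_fastq_4line(io.StringIO(example)):
    print(description, sequence, quality)
```

Because the parser counts lines instead of looking at leading characters, the 
'@' at the start of a quality line is never examined as a possible record start.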

2. If you allow them to be wrapped over multiple lines, parsing is a little 
harder.  But after you've read the '+' line, you already know how many letters 
there are in the sequence, so you can read that same number of quality 
letters without caring whether they might be '@' or not.  You will then be in 
the right place to read the @description line of the following record.
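
The length-counting approach might be sketched in Python as follows.  Again, 
read_fastq_wrapped is a hypothetical name used only for illustration, and the 
sketch assumes that line breaks are not part of the sequence or quality strings 
and that no sequence line begins with '+'.

```python
import io


def read_fastq_wrapped(handle):
    """Yield (description, sequence, quality) tuples.

    Sequence and quality may each be wrapped over several lines.  The
    number of sequence letters read before the '+' line tells us how
    many quality letters to consume afterwards.
    """
    lines = (line.rstrip("\r\n") for line in handle)
    for header in lines:
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError("expected '@' description line, got: " + header)
        # Collect sequence lines until the '+' separator line.
        seq_parts = []
        for line in lines:
            if line.startswith("+"):
                break
            seq_parts.append(line)
        else:
            raise ValueError("missing '+' line")
        seq = "".join(seq_parts)
        # Read quality lines until we have as many letters as the sequence,
        # without caring whether any of those lines starts with '@'.
        qual_parts = []
        remaining = len(seq)
        while remaining > 0:
            line = next(lines, None)
            if line is None:
                raise ValueError("truncated quality string")
            qual_parts.append(line)
            remaining -= len(line)
        if remaining < 0:
            raise ValueError("quality string longer than sequence")
        yield header[1:], seq, "".join(qual_parts)


# Usage: both wrapped quality lines of the first record start with '@'.
example = "@r1\nACGT\nAC\n+\n@III\n@I\n@r2\nGG\n+r2\n@@\n"
for description, sequence, quality in read_fastq_wrapped(io.StringIO(example)):
    print(description, sequence, quality)
```

Once the quality letters have been counted off, the next non-blank line must be 
an @description line, which is why the sketch treats anything else as an error.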

    John
