On Tue, Dec 21, 2021 at 01:55:15PM +0100, Loïc Meunier wrote:
> I'm currently using the text-formatted alignment produced with the tview
> command to discard reads that do not fully cover my region of interest.
> 
> For this, I filter out the alignment lines which contain a space character,
> my assumption being that these delimit reads. However, as I observe some
> oddities with this filtering method, could you give me more information
> about the meaning of the space characters in the tview command output?

The tview output is designed for human readability, as a
pictorial-style representation of the sequence alignments.  It's
absolutely not the right format for parsing or filtering; besides,
the whitespace used may well differ between curses libraries and/or
terminal types.

I'd suggest giving up on this avenue and looking at the pileup
command instead.  Note that modern versions of it have a bunch of
command-line options for filtering out things that you may not be
interested in, such as indels or read start/end markers.  That can
make parsing easier.
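
To illustrate what parsing pileup output involves, here is a minimal
Python sketch, not an official parser.  It assumes the six-column
tab-separated text written by "samtools mpileup" (chromosome, 1-based
position, reference base, depth, read bases, base qualities) and skips
read start/end markers and indels by hand.  The function names
count_pileup_bases and parse_pileup_line are made up for this example.

```python
from collections import Counter


def count_pileup_bases(ref_base, bases):
    """Count base calls in the read-bases column of one pileup line.

    Hypothetical helper for illustration; strand is ignored.
    """
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read-start marker: the next character encodes mapping quality.
            i += 2
        elif c == "$":
            # Read-end marker.
            i += 1
        elif c in "+-":
            # Indel: sign, decimal length, then that many sequence characters.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            i = j + length
        elif c in ".,":
            # Match to the reference on the forward / reverse strand.
            counts[ref_base.upper()] += 1
            i += 1
        elif c in "*#":
            # Placeholder for a base deleted in this read.
            counts["del"] += 1
            i += 1
        elif c in "<>":
            # Reference skip (e.g. a spliced read).
            counts["skip"] += 1
            i += 1
        else:
            # Mismatch: the base itself, lower case on the reverse strand.
            counts[c.upper()] += 1
            i += 1
    return counts


def parse_pileup_line(line):
    """Return (chromosome, position, base counts) for one pileup line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, depth, bases = fields[:5]
    if int(depth) == 0:
        # Zero-depth positions carry "*" placeholders rather than bases.
        return chrom, int(pos), Counter()
    return chrom, int(pos), count_pileup_bases(ref, bases)


if __name__ == "__main__":
    # Made-up example line, not real data.
    print(parse_pileup_line("chr1\t100\tA\t3\t.,^~t\tIII\n"))
```

If your samtools version can suppress those markers on output (recent
releases document options along the lines of --no-output-ends,
--no-output-ins and --no-output-del; check the mpileup man page for
your version), the corresponding branches above simply never trigger.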

James

-- 
James Bonfield (j...@sanger.ac.uk)
The Sanger Institute, Hinxton, Cambs, CB10 1SA



