On Tue, May 17, 2016 at 09:17:20PM +0000, Lin, Chih-Hsu wrote:
> During the conversion, BAM->CRAM->BAM, using samtools 1.3.1, I found the NM
> tags were changed. Does anyone have solution to that?

Can you give us a concrete example, please: an alignment record along with the relevant @SQ line, so we can see what the reference is?

The reason for the NM and MD changes is that CRAM doesn't explicitly store these tags (it could, but that leads to larger files). Instead it uses the reference to compute them on the fly. However, we have seen a number of cases where NM/MD in the original BAM file are incorrect due to bugs in aligners. This leads to changes after round-tripping through CRAM.

So, that said, are the NM values output by CRAM->BAM the same values that samtools calmd generates? If so, then this is fixing your data! If not, then we possibly have a bug that needs fixing.

Some have asked for a way to store the invalid data in CRAM regardless. There is perhaps some (albeit twisted) logic to this, as it makes the validation of the data the responsibility of other tools and not the file format itself. I experimented with this by writing out all NM/MD, but it usually leads to 5-10% growth. A better solution would be to check when they differ from the computed values and only store them then, although that will slow down CRAM encoding somewhat, so I'm not convinced yet that this is a problem in need of a solution.

James

-- 
James Bonfield (j...@sanger.ac.uk) | Hora aderat briligi. Nunc et Slythia Tova
                                  | Plurima gyrabant gymbolitare vabo;
  A Staden Package developer:     | Et Borogovorum mimzebant undique formae,
https://sf.net/projects/staden/   | Momiferique omnes exgrabure Rathi.

-- 
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
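
To illustrate what "computing NM from the reference" involves, here is a minimal sketch in plain Python. It is not the samtools or htslib implementation; the function names compute_nm and nm_differs are hypothetical, the reference is assumed to be an in-memory string, and pos is assumed to be 0-based. It follows the SAM definition of NM as mismatches plus inserted and deleted bases, with anything other than A, C, G or T counted as a mismatch.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def compute_nm(ref, pos, cigar, seq):
    """Recompute NM for one alignment (illustrative sketch only).

    ref   -- reference sequence as a string (assumption: fully in memory)
    pos   -- 0-based leftmost reference position of the alignment
    cigar -- CIGAR string, e.g. "4M1I3M1D2M"
    seq   -- read sequence as stored in the record
    """
    nm = 0
    r = pos  # current position on the reference
    q = 0    # current position on the read
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            # Aligned bases: count mismatches, case-insensitively.
            for i in range(n):
                a = seq[q + i].upper()
                b = ref[r + i].upper()
                if a != b or a not in "ACGT":
                    nm += 1
            q += n
            r += n
        elif op == "I":
            nm += n  # inserted bases count towards NM
            q += n
        elif op == "D":
            nm += n  # deleted bases count towards NM
            r += n
        elif op == "N":
            r += n   # reference skip: not counted
        elif op == "S":
            q += n   # soft clip: not counted
        # H and P consume neither read nor reference
    return nm


def nm_differs(stored_nm, ref, pos, cigar, seq):
    """True if the stored NM disagrees with the recomputed value."""
    return stored_nm != compute_nm(ref, pos, cigar, seq)


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    # 1 insertion + 1 mismatch + 1 deletion
    print(compute_nm(ref, 0, "4M1I3M1D2M", "ACGTTATGAC"))
    print(nm_differs(1, ref, 0, "4M1I3M1D2M", "ACGTTATGAC"))
```

A record for which nm_differs returns True is one whose NM would change after a round trip through a format that recomputes the tag from the reference. Comparing against the output of samtools calmd on real data remains the authoritative check.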