On Tue, Nov 07, 2017 at 07:11:06AM -0700, Brent Pedersen wrote:
> 1 243185013 4362 16775145505 611 180875
> 1 10173 13477 16775327057 623 260625

I assume this is somewhere in the middle of your index file, given
16775145505 is a very large offset.

> note that 10173 follows 243185013. Is there any way this can occur for
> a valid crai?

If the input data is position sorted then this should not occur, as
1:10173 would come before 1:243185013.

I didn't think it would be possible to produce an index on an unsorted
file, but apparently this isn't detected by samtools index for CRAM
files (it is for BAM). I am assuming this is how your index was
created. Can you confirm that your data is not position sorted?

That said, actually indexing an unsorted file may be a "feature" as
the CRAM index also permits random access to retrieve blocks of data
by Nth block as well as by genomic region, which theoretically can be
used for distributed processing. I'm sure it's just a bug though
rather than a deliberate feature.

James

-- 
James Bonfield (j...@sanger.ac.uk) | Hora aderat briligi. Nunc et Slythia Tova
                                   | Plurima gyrabant gymbolitare vabo;
  A Staden Package developer:      | Et Borogovorum mimzebant undique formae,
https://sf.net/projects/staden/    | Momiferique omnes exgrabure Rathi.

_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
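
For anyone wanting to check their own index for the ordering problem discussed above, here is a minimal Python sketch. It assumes the .crai file is gzip-compressed text with six whitespace-separated integer columns per line (sequence id, alignment start, alignment span, container byte offset, slice byte offset within the container, slice size), which is my reading of the CRAM index layout and matches the two lines quoted in this thread. The function names `parse_crai_lines`, `find_unsorted_entries` and `check_crai` are made up for illustration and are not part of samtools or htslib. It only flags a start position that decreases between consecutive entries on the same sequence id, so it is a rough check rather than a full validation.

```python
import gzip


def parse_crai_lines(lines):
    """Yield one tuple of six ints per index line.

    Assumed column order: seq_id, start, span, container_offset,
    slice_offset, slice_size.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        yield tuple(int(f) for f in fields)


def find_unsorted_entries(entries):
    """Return (previous, current) pairs where the alignment start
    decreases between consecutive entries with the same seq_id."""
    problems = []
    prev = None
    for cur in entries:
        if prev is not None and cur[0] == prev[0] and cur[1] < prev[1]:
            problems.append((prev, cur))
        prev = cur
    return problems


def check_crai(path):
    """Hypothetical helper: run the check on a gzip-compressed .crai file."""
    with gzip.open(path, "rt") as fh:
        return find_unsorted_entries(parse_crai_lines(fh))


# The two index lines quoted earlier in the thread.
sample = [
    "1\t243185013\t4362\t16775145505\t611\t180875",
    "1\t10173\t13477\t16775327057\t623\t260625",
]
problems = find_unsorted_entries(parse_crai_lines(sample))
for prev, cur in problems:
    print(f"seq {cur[0]}: start {cur[1]} follows {prev[1]}")
```

On the two quoted lines this reports one out-of-order pair. For a real index, `check_crai("sample.cram.crai")` (the path is a placeholder) returns the same kind of list; an empty list means no decreasing start was seen within any run of entries sharing a sequence id.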