On Tue, Nov 07, 2017 at 07:11:06AM -0700, Brent Pedersen wrote:
> 1       243185013       4362    16775145505     611     180875
> 1       10173   13477   16775327057     623     260625

I assume this is somewhere in the middle of your index file, given
16775145505 is a very large offset.

> note that 10173 follows 243185013. Is there any way this can occur for
> a valid crai?

If the input data is position sorted then this should not occur, as
1:10173 would come before 1:243185013.  I didn't think it would be
possible to produce an index on an unsorted file, but apparently this
isn't detected by samtools index for CRAM files (it is for BAM).

I am assuming this is how your index was created.  Can you confirm
that your data is not position sorted?
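
A quick way to see how widespread the out-of-order entries are is to scan
the index directly. The following is a minimal sketch, not part of
samtools: it assumes the .crai is gzip-compressed, whitespace-separated
text with six integer columns (reference id, alignment start, alignment
span, container byte offset, slice offset, slice size), as in the two
lines quoted above. The function names `read_crai` and
`find_out_of_order` are made up for illustration.

```python
import gzip


def read_crai(path):
    """Yield one 6-tuple of ints per index line:
    (seq_id, start, span, container_offset, slice_offset, slice_size).
    Assumes a gzip-compressed, whitespace-separated text index."""
    with gzip.open(path, "rt") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) != 6:
                continue  # skip blank or malformed lines
            yield tuple(int(f) for f in fields)


def find_out_of_order(entries):
    """Return (entry_number, previous, current) for every entry whose
    alignment start is lower than that of the preceding entry on the
    same reference. Entry numbers are 1-based."""
    problems = []
    prev = None
    for n, cur in enumerate(entries, start=1):
        if prev is not None and cur[0] == prev[0] and cur[1] < prev[1]:
            problems.append((n, prev, cur))
        prev = cur
    return problems
```

Calling `find_out_of_order(read_crai("sample.cram.crai"))` (the file name
is a placeholder) on a position-sorted file should return an empty list.
Only consecutive entries on the same reference are compared, so this
flags the pair quoted above but says nothing about the order of the
references themselves.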

That said, being able to index an unsorted file may be a "feature": the
CRAM index also permits random access to blocks of data by Nth block,
as well as by genomic region, which in theory can be used for
distributed processing.  I'm sure it's just a bug, though, rather than
a deliberate feature.
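
To illustrate the "Nth block" idea, here is a rough sketch of dividing
index entries between workers by container rather than by genomic
region. It assumes each entry is a 6-tuple of integers whose fourth
field is the container byte offset, as in the lines quoted above. The
function name `split_containers` is hypothetical, and decoding the
containers at those offsets is left out.

```python
def split_containers(entries, n_chunks):
    """Split the distinct container offsets, in index order, into
    n_chunks contiguous lists of near-equal length."""
    offsets = []
    seen = set()
    for entry in entries:
        off = entry[3]  # fourth column: container byte offset
        if off not in seen:
            seen.add(off)
            offsets.append(off)

    # Spread any remainder over the first few chunks.
    size, extra = divmod(len(offsets), n_chunks)
    chunks = []
    pos = 0
    for i in range(n_chunks):
        end = pos + size + (1 if i < extra else 0)
        chunks.append(offsets[pos:end])
        pos = end
    return chunks
```

Each worker would then handle only the containers in its own list. This
never looks at the genomic coordinates, so it works the same way
whether or not the file is position sorted.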

James

-- 
James Bonfield (j...@sanger.ac.uk) | Hora aderat briligi. Nunc et Slythia Tova
                                  | Plurima gyrabant gymbolitare vabo;
  A Staden Package developer:     | Et Borogovorum mimzebant undique formae,
https://sf.net/projects/staden/   | Momiferique omnes exgrabure Rathi. 


