Sergej -
I can't give you the details, but you should look at the SAM/BAM
Format document from www.htslib.org. My old copy is dated 28 Dec
2014. Go to page 15, the two-line paragraph just before 5.1.2.
This suggests using CSI index format rather than the default BAI
index format. I think samtools supports both. Let the list know if
this helps.
- tom blackwell -
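
For reference, a rough sketch of the arithmetic behind the BAI limit. It assumes the binning scheme described in the SAM/BAM Format document: the smallest bin spans 2^14 bp, each level above it is 8 times wider, and BAI fixes the number of levels at 5. The function names below are illustrative only and are not part of samtools or htslib.

```python
# Assumed BAI parameters, taken from the binning scheme in the SAM/BAM spec.
BAI_MIN_SHIFT = 14  # smallest bin spans 2**14 bp
BAI_DEPTH = 5       # five levels of bins below the root bin


def max_indexable_length(min_shift: int, depth: int) -> int:
    """Longest reference a binning index with these parameters can cover."""
    return 1 << (min_shift + 3 * depth)


def csi_depth_needed(ref_length: int, min_shift: int = 14) -> int:
    """Smallest number of levels whose root bin spans ref_length."""
    depth = 0
    span = 1 << min_shift
    while span < ref_length:
        span <<= 3  # each extra level widens the covered span 8-fold
        depth += 1
    return depth


bai_limit = max_indexable_length(BAI_MIN_SHIFT, BAI_DEPTH)
scaffold_length = 1_500_000_000  # the ~1.5 Gbp scaffold from the question

print(bai_limit)                          # 536870912, i.e. 2**29
print(scaffold_length > bai_limit)        # True: too long for BAI
print(csi_depth_needed(scaffold_length))  # 6 levels at min_shift 14
```

The 2^29 figure is close to the ~500,000,000 position at which the problems were observed. A CSI index is not bound to 5 levels, so under these assumptions a 1.5 Gbp scaffold would need 6 levels at the default minimum interval of 2^14.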
On Fri, 9 Mar 2018, Nowoshilow,Sergej wrote:
Dear SAMtools developers and community
Our group is working with the axolotl genome, which is 10x larger than that of
the human. It has 14 chromosomes and, thus, some (if not all) of the
chromosomes are longer than 2 Gbp. Although we don't have chromosome-size
scaffolds yet, we are trying our best and managed to assemble some very long
scaffolds (with quite some gaps, i.e. N's): ~1.5 Gbp.
Now I am running into problems with those long scaffolds: although it is
perfectly possible to map the RNA/DNA-seq reads to the scaffolds, it is not
possible to sort and index the resulting BAM files, which means that they
cannot be viewed in the genome browser.
I tried "samtools index -c -m", but without success, irrespective of the value
specified by the "-m" option.
The problem seems to be the LENGTH of the scaffold. I also looked at the source
code (however, not deeply enough, so excuse me if I'm wrong) and did some
testing with different datasets and reference sequences, and have a feeling that
some internal variables might "overrun" if the position of a read within the
scaffold exceeds ~500,000,000. Is that right?
If I'm right, is there any fix for that problem, or is it an inherent issue that
cannot be fixed easily?
I would highly appreciate any advice on how to deal with that issue.
Theoretically, I could split our long scaffolds into shorter pieces, however,
that would defeat the notion of assembling chromosome-size scaffolds.
Thank you very much in advance!
Best regards
Sergej
Dr. Sergej Nowoshilow
Post-doc/Bioinformatician in Tanaka Lab
Elly Tanaka group
Animal models of regeneration
Campus-Vienna-Biocenter 1
1030 Vienna
email: sergej.nowoshi...@imp.ac.at
phone: +43 (0) 1 79730 3203
This message is confidential and may contain privileged information. It is
intended for the named recipients only. If you receive it in error please
notify me and permanently delete the original message and any copies.