Re: [Samtools-help] samtools sort command results in symbols calculation

2020-04-23 Thread John Marshall
On 22 Apr 2020, at 17:10, Pedro Hollanda Carvalho <hollandacarva...@gmail.com> wrote: I'm unable to perform samtools sort command. Result is a weird bunch of symbols followed by letters and numbers as follows (just a part of it): yOR�G(��< {`�m�wF�D/�q|lOS�}�H This weird bunch of symbo

Re: [Samtools-help] samtools sort for paired-end .bam cannot find chromosome name in text header

2017-01-04 Thread Holbrook J.
Davies <r...@sanger.ac.uk>, John Marshall <j...@sanger.ac.uk> Cc: "samtools-help@lists.sourceforge.net" <samtools-help@lists.sourceforge.net> Subject: RE: [Samtools-help] samtools sort for paired-end .bam can

Re: [Samtools-help] samtools sort for paired-end .bam cannot find chromosome name in text header

2017-01-04 Thread Holbrook J.
ubject: Re: [Samtools-help] samtools sort for paired-end .bam cannot find chromosome name in text header On Wed, 4 Jan 2017, John Marshall wrote: > TL;DR i.e. what Rob said. But note that when you output SAM with >samtools view, samtools appends basic @SQ headers if there aren't >

Re: [Samtools-help] samtools sort for paired-end .bam cannot find chromosome name in text header

2017-01-04 Thread Robert Davies
On Wed, 4 Jan 2017, John Marshall wrote: > TL;DR i.e. what Rob said. But note that when you output SAM with >samtools view, samtools appends basic @SQ headers if there aren't already >any. So Rob's `samtools view -H ... | grep '^@SQ'` will display some >headers whether the file contains text

Re: [Samtools-help] samtools sort for paired-end .bam cannot find chromosome name in text header

2017-01-04 Thread John Marshall
On 3 Jan 2017, at 16:21, Holbrook J. wrote: > I am trying to manipulate .bam files created by ernebs5 > (http://erne.sourceforge.net) aligning against hg19. > I am running samtools 1.3.1 [...] > > samtools sort -T /dev/shm/jostemp -@ 8 -m 4G -o sample1b_paired_sorted.bam > Sample1b_unmasked.bam

Re: [Samtools-help] samtools sort for paired-end .bam cannot find chromosome name in text header

2017-01-04 Thread Robert Davies
On Tue, 3 Jan 2017, Holbrook J. wrote: Dear Samtools help members Happy 2017! I would be very grateful for your help: I am trying to manipulate .bam files created by ernebs5 (http://erne.sourceforge.net) aligning against hg19. I am running samtools 1.3.1 I have ~ 1.5 mil singleton and ~ 100

Re: [Samtools-help] samtools sort -n

2016-10-10 Thread Colin Hercus
Hi Rebecca, If you don't mind commercial software novosort will happily merge files with different @SQ orders and output order will follow from the first file. novosort -m 16G -i -o out.bam NA18867.bam Denisovan.bam Output will be coordinate sorted and indexed. With a license file it will run

Re: [Samtools-help] samtools sort -n

2016-10-10 Thread Andrew Bjonnes
Hi Rebecca, samtools sort does not sort the sequence dictionary in the header, it sorts only the reads (by read name with the -n flag that you're using). Look into PicardTools ReorderSam ( https://broadinstitute.github.io/picard/command-line-overview.html#ReorderSam) to reorder your sequence dicti

Re: [Samtools-help] samtools sort -n

2016-10-10 Thread Juan Daniel Montenegro Cabrera
Hi Rebecca, I think the merge command expects a coordinate sorted bam and you are sorting it by query name. You should remove the "-n" flag in your sort command and then merge the sorted files. Regards, Juan Montenegro On 11 Oct 2016 6:52 AM, "Rebecca Harris" wrote: > Hi, > > I have a couple of g

Re: [Samtools-help] samtools sort problem

2016-10-10 Thread Juan Daniel Montenegro Cabrera
It does. I have tried all versions from 0.1.18 to 1.3.1-dev

Re: [Samtools-help] samtools sort problem

2016-10-10 Thread Peter Cock
On Mon, Oct 10, 2016 at 9:37 AM, John Marshall wrote: > On 10 Oct 2016, at 05:35, Juan Daniel Montenegro Cabrera > wrote: >> I did a few tests in my spare time. All samtools versions from 0.1.19 >> have the same sorting problem, with or without the use of (-@) multiple >> threads. >> Version 0.1.1

Re: [Samtools-help] samtools sort problem

2016-10-10 Thread John Marshall
On 10 Oct 2016, at 05:35, Juan Daniel Montenegro Cabrera wrote: > I did a few tests in my spare time. All samtools versions from 0.1.19 have the > same sorting problem, with or without the use of (-@) multiple threads. > Version 0.1.18 is able to sort the file correctly, but is slower than > sam

Re: [Samtools-help] samtools sort problem

2016-10-09 Thread Juan Daniel Montenegro Cabrera
Hi, You can download the unsorted bam file from here: https://cloudstor.aarnet.edu.au/plus/index.php/s/p90bYldJoqE5Fbv Regards, Juan Montenegro 2016-10-10 14:35 GMT+10:00 Juan Daniel Montenegro Cabrera <jdmonteneg...@gmail.com>: > Dear John, > > I did a few tests in my spare time. All samtools

Re: [Samtools-help] samtools sort problem

2016-10-09 Thread Juan Daniel Montenegro Cabrera
Dear John, I did a few tests in my spare time. All samtools versions from 0.1.19 have the same sorting problem, with or without the use of (-@) multiple threads. Version 0.1.18 is able to sort the file correctly, but is slower than sambamba, especially for really big bam files. I have a reduce unsor

Re: [Samtools-help] samtools sort problem

2016-10-07 Thread John Marshall
On 7 Oct 2016, at 06:24, Juan Daniel Montenegro Cabrera wrote: > > samtools view -bh@ 15 in.sam | samtools sort -T tmp -@ 15 -o out.sorted.bam - > > When I try to index the sorted bam file it complains: > > samtools index out.sorted.bam > [E::hts_idx_push] NO_COOR reads not in a single block

Re: [Samtools-help] samtools sort problem

2016-10-07 Thread Juan Daniel Montenegro Cabrera
Quick update. I just used sambamba_v0.6.4 for sorting the same file and the sorted bam file was as expected, the unmapped reads at the end and the reads mapping to each reference grouped by sequence and sorted by coordinates. From this, it seems to me it was a bug in samtools sort. Cheers, Juan

Re: [Samtools-help] samtools sort output

2016-03-09 Thread Tommy Carstensen
Hi Carol, I believe the current version is 1.3. Version 0.1.19 is quite dated. Best wishes, Tommy From: carol white <wht_...@yahoo.com> Reply-To: carol white <wht_...@yahoo.com> Date: Wednesday, 9 March 2016 12:48:00 To: Samtools Help <samtools-help@lists.sourceforge.net> S

Re: [Samtools-help] samtools sort

2015-08-20 Thread Colin Hercus
Hi Joao, Recent versions of samtools write to stdout and let you choose the output format Program: samtools (Tools for alignments in the SAM format) Version: 1.2 (using htslib 1.2.1) Usage: samtools sort [options...] [in.bam] Options: -l INT Set compression level, from 0 (uncompressed) to

Re: [Samtools-help] samtools sort

2015-08-20 Thread Peter Cock
Hi Joao, The short answer is no. For most uses of coordinate-sorted BAM files (e.g. mpileup) random access is performed (via the associated BAM index), and therefore having the input BAM file on stdin is not possible. However, you can output samtools sort to stdout (see the -o option). For

Re: [Samtools-help] samtools sort

2015-08-20 Thread Joao Dias
Good morning, I am wondering if you received my previous email. Thank you in advance. Regards, João Dias From: Joao Dias <j...@sanger.ac.uk> Date: Wednesday, 12 August 2015 09:34 To: "samtools-help@lists.sourceforge.net" <samtools-help@l

Re: [Samtools-help] samtools sort removes @RG header line, but leaves RG tags in alignments

2015-07-27 Thread John Marshall
On 27 Jul 2015, at 17:36, Patrick Reilly wrote: > I'm using BWA v0.7.12-r044 and samtools v1.2-99-ge2bb18f > [...] sorting with output as BAM or SAM results in the loss of the RG header, > while the alignment RG tags remain (thereby corrupting the SAM/BAM file, > according to GATK). > > I've a

Re: [Samtools-help] samtools sort crashed on 110G bam file. memory leakage?

2015-04-28 Thread Colin Hercus
Hi Vladimir, That's good. Increasing the -m setting to 2 to 4 G may reduce maximum memory. There's a section in the manual that discusses memory usage and the merge phase memory increases as -m is reduced and it's possible to set -m too low. Anyway, I'm glad it's done. If you try biobambam could

Re: [Samtools-help] samtools sort crashed on 110G bam file. memory leakage?

2015-04-28 Thread Vladimir Morozov
I was able to finish the job with -@ 2 -m 1G parameters. Though it occupied about 4.5G memory by the end of the job On Mon, Apr 27, 2015 at 9:33 PM, Colin Hercus wrote: > Hi Vladimir, > > You could try using novosort, it's included in download at > http://www.novocraft.com/support/download/ >

Re: [Samtools-help] samtools sort crashed on 110G bam file. memory leakage?

2015-04-27 Thread Colin Hercus
Hi Vladimir, You could try using novosort, it's included in download at http://www.novocraft.com/support/download/ novosort -m 4G -t . my_sample.bam >sorted.bam To get multi-threading you need a license and I've attached a 1 month trial. Just extract the file novoalign.lic from the attache

Re: [Samtools-help] samtools sort creates millions of files

2015-03-11 Thread Bob Harris
On 11 Mar 2015, at 14:24, Bob Harris wrote: > That's one seriously broken grammar that considers "1750M" to mean 175 > but thinks "1.75G" means 1. to which, on Mar 11, 2015, at 12:18 PM, John Marshall replied: > You are not wrong. Coincidentally we've been working on improving ways to >

Re: [Samtools-help] samtools sort creates millions of files

2015-03-11 Thread John Marshall
On 11 Mar 2015, at 14:24, Bob Harris wrote: > That's one seriously broken grammar that considers "1750M" to mean 175 > but thinks "1.75G" means 1. You are not wrong. Coincidentally we've been working on improving ways to specify numbers in htslib for a while, and can probably take advantag

Re: [Samtools-help] samtools sort creates millions of files

2015-03-11 Thread Bob Harris
My two cents, at the risk of just sounding like an old curmudgeon: That's one seriously broken grammar that considers "1750M" to mean 175 but thinks "1.75G" means 1. If the latter doesn't qualify as an INT, the input parser should tell the user, just like it would if she tried "-m unicorn",

Re: [Samtools-help] samtools sort creates millions of files

2015-03-11 Thread Ashish Agarwal
Sigh. Of course it was going to be something trivial. Thanks for the help. A minor other point, it seems samtools goes above the requested memory by a non-negligible amount. I was already accounting for the memory needed by basic system processes, by asking samtools to use 0.5G less than the avail

Re: [Samtools-help] samtools sort creates millions of files

2015-03-10 Thread Colin Hercus
The problem is the decimal point in the -m setting, -m 1.75G. The code will pick this up as a request for 1 byte of RAM. The help for -m is: -m INT, where INT stands for integer. So try -m 1750M On 11 March 2015 at 10:24, Ashish Agarwal wrote: > On Mon, Mar 9, 2015 at 6:24 PM, Thomas W. Blac
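
The behaviour described above can be illustrated with a small Python sketch. It is not the samtools source: it only assumes, as the message suggests, that the option is read strtol-style (leading digits first, then a single suffix character). The function names `lenient_parse` and `strict_parse` are hypothetical.

```python
import re

# Binary multipliers for the K/M/G suffixes mentioned in the -m help text.
SUFFIXES = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def lenient_parse(text):
    """Assumed strtol-like behaviour: take the leading digits, then treat
    only the very next character as a possible suffix."""
    match = re.match(r"\d+", text)
    if match is None:
        return 0
    value = int(match.group())
    suffix = text[match.end():][:1].lower()
    # Any character that is not K/M/G (such as '.') leaves the value unscaled.
    return value * SUFFIXES.get(suffix, 1)


def strict_parse(text):
    """Alternative that rejects anything other than INT plus optional K/M/G."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if match is None:
        raise ValueError(f"-m expects an integer with optional K/M/G suffix, got {text!r}")
    digits, suffix = match.groups()
    return int(digits) * SUFFIXES.get(suffix.lower(), 1)


if __name__ == "__main__":
    print(lenient_parse("1.75G"))  # parsing stops at the decimal point
    print(lenient_parse("1750M"))  # integer plus suffix is scaled as intended
```

Under these assumptions "1.75G" comes back as 1 byte while "1750M" is scaled normally, which matches the advice to write the value as an integer. A strict parser would instead report the bad value to the user.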

Re: [Samtools-help] samtools sort creates millions of files

2015-03-10 Thread Monica Britton
Hi Ashish: The usage specifies that -m needs an integer value: -m INT max memory per thread; suffix K/M/G recognized [768M] So, -m 2G should properly allocate ~2G, while -m 1.75G allocates only a tiny amount of memory, and splits the sorted sub-files into 2.2k chunks ... thus, millions of fi

Re: [Samtools-help] samtools sort creates millions of files

2015-03-10 Thread Ashish Agarwal
> there is no problem on a normal server. We see the problematic behavior only on AWS instances. I take that back. Just tried on a regular server, and we do get the problem there too. Sorry, we've tried so many combinations of things, I'm losing track of what worked when. On Tue, Mar 10, 2015 a

Re: [Samtools-help] samtools sort creates millions of files

2015-03-10 Thread Ashish Agarwal
More tests show that setting only -m leads to the same problem, but setting only -@ works fine. Also, I believe there is no problem on a normal server. We see the problematic behavior only on AWS instances. On Tue, Mar 10, 2015 at 10:24 PM, Ashish Agarwal wrote: > On Mon, Mar 9, 2015 at 6:24 PM

Re: [Samtools-help] samtools sort creates millions of files

2015-03-10 Thread Ashish Agarwal
On Mon, Mar 9, 2015 at 6:24 PM, Thomas W. Blackwell wrote: Can you give us the actual command line ? On a node with 7.5 Gb of RAM and 4 cores, we did: samtools sort -@ 4 -m 1.75G file.bam file_sorted > stdout 2> stderr We've tried variations of this, such as: samtools sort -@ 4 -m 7

Re: [Samtools-help] samtools sort creates millions of files

2015-03-09 Thread Thomas W. Blackwell
Can you give us the actual command line? And the number of @SQ records when you do 'samtools view -H file.bam | egrep -e '^@SQ' | wc'? - thanks - tom blackwell - On Mon, 9 Mar 2015, Ashish Agarwal wrote: > `samtools sort` is often creating
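
For readers without egrep to hand, the same count can be sketched in Python. This assumes the header text has already been captured (for example from `samtools view -H file.bam`); the function name `count_sq_records` and the sample header are made up for illustration.

```python
def count_sq_records(header_text):
    """Count header lines that start with @SQ, like egrep '^@SQ' | wc -l."""
    return sum(1 for line in header_text.splitlines() if line.startswith("@SQ"))


if __name__ == "__main__":
    # Hypothetical header with two reference sequences.
    sample_header = (
        "@HD\tVN:1.4\tSO:unsorted\n"
        "@SQ\tSN:chr1\tLN:1000\n"
        "@SQ\tSN:chr2\tLN:2000\n"
        "@PG\tID:example\n"
    )
    print(count_sq_records(sample_header))
```

A very large count here would point at a reference with many sequences, which is what the question is probing for.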

Re: [Samtools-help] samtools sort

2015-01-19 Thread Devon Ryan
As long as it finished creating all of the temporary files then you can simply run "samtools merge" yourself. BTW, make sure that you haven't run out of room on the drive. That's a likely cause of this problem. Devon -- Devon Ryan, Ph.D. Email: dpr...@dpryan.com Laboratory for Molecular and Cell

Re: [Samtools-help] samtools sort

2014-11-10 Thread Ann Mongan
Hi, I'm interested in reading about the sorting algorithm used by Samtools. Could you briefly explain the algorithm and/or point me to a reference? Thanks, Ann

Re: [Samtools-help] samtools sort crashing, truncating outputs

2014-10-15 Thread Colin Hercus
Hi Amy, You could try novosort -n, it doesn't do any validation, just sorting. You can download as part of novocraft suite from www.novocraft.com. Without a license it will run single threaded so I've attached a tar with a license file that will enable threading. Just extract the file novoalign.li

Re: [Samtools-help] Samtools sort (merging) is slow

2014-09-05 Thread Adam Skarshewski
...@sanger.ac.uk] Sent: Friday, September 05, 2014 8:31 PM To: Adam Skarshewski Cc: samtools-help@lists.sourceforge.net Subject: Re: [Samtools-help] Samtools sort (merging) is slow On 5 Sep 2014, at 06:52, Adam Skarshewski wrote: > Running into a problem with samtools 1.0 sort, specifically the merg

Re: [Samtools-help] Samtools sort (merging) is slow

2014-09-05 Thread John Marshall
On 5 Sep 2014, at 06:52, Adam Skarshewski wrote: > Running into a problem with samtools 1.0 sort, specifically the merging > phase. It's very slow compared to the previous version. I have a 965MB bam > file (which is already sorted) Does your BAM file have rather a lot of reference sequences?

Re: [Samtools-help] Samtools sort -n problems

2014-08-28 Thread Devon Ryan
Hi Mark, It would be helpful if you showed the exact command used and also how you confirmed that an alignment was lost. Just this week I saw a post (elsewhere) where someone was losing reads in converting from SAM->BAM only to find that it was due to misusing an obscure (present but undocument

Re: [Samtools-help] samtools sort RAM

2014-06-12 Thread Louis Letourneau
Ok thanks, I was afraid it was something on my system or the like. I don't always need twice as much, but always quite a bit more Louis On 14-06-12 05:53 AM, Colin Hercus wrote: > Hi Louis, > > Same here. It seems to use twice what you would expect from the settings. > > Colin > > > On 11 Ju

Re: [Samtools-help] samtools sort RAM

2014-06-12 Thread Colin Hercus
Hi Louis, Same here. It seems to use twice what you would expect from the settings. Colin On 11 June 2014 20:16, Louis Letourneau wrote: > Hello, > > When I use samtools 0.1.19 multithreaded sort > > samtools sort -@ 3 -m 5000M > > I expect the process to use between 15-20Gb (I don't know if
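
The expectation in the quoted message is simple arithmetic, sketched below in Python. It assumes -m is a per-thread limit (as the usage text quoted elsewhere in this listing states) and that K/M/G are binary multiples; the function name `requested_sort_memory` is hypothetical, and the figure is the requested amount, not what the process actually uses.

```python
MIB = 1 << 20
GIB = 1 << 30


def requested_sort_memory(threads, per_thread_bytes):
    """Memory requested from the sort: thread count times the per-thread -m value."""
    return threads * per_thread_bytes


if __name__ == "__main__":
    # samtools sort -@ 3 -m 5000M, as in the quoted command.
    total = requested_sort_memory(3, 5000 * MIB)
    print(f"{total / GIB:.2f} GiB requested")  # prints "14.65 GiB requested"
```

The reports in this thread suggest observed usage can be well above this figure, so it is better read as a lower bound when sizing a job.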