Hi Kar-Tong,
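As I understand it, the fastq command writes reads in the order it
encounters them in the BAM file, so if the input BAM file is not sorted
by read name, the output fastq files will not be sorted by read name
either. With a coordinate-sorted BAM, read 1 and read 2 of a pair are met
at different points in the file, so 1.fq and 2.fq can also end up in
different orders from each other. This matches the example input and
output in your email.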
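
To illustrate with the records from your email, here is a rough sketch in
Python. It is only a simplified model of the behaviour described above, not
how samtools is implemented; the `records` list and the `split_by_mate`
helper are made up for this example, and the names and flags are copied from
your BAM excerpt.

```python
# (read name, FLAG) pairs, in the coordinate-sorted order of the quoted BAM excerpt.
records = [
    ("UNC13-SN749_82:3:1102:14504:162540/2", 163),
    ("UNC13-SN749_82:3:1102:14504:162540/1", 83),
    ("UNC13-SN749_82:3:1207:3243:175188/2", 163),
    ("UNC13-SN749_82:3:2105:9477:158884/2", 163),
    ("UNC13-SN749_82:3:2105:9477:158884/1", 83),
    ("UNC13-SN749_82:3:1201:10653:108594/2", 137),
    ("UNC13-SN749_82:3:1207:3243:175188/1", 83),
    ("UNC13-SN749_82:3:1108:9942:173119/1", 89),
]

READ1 = 0x40  # SAM FLAG bit: first segment in the template
READ2 = 0x80  # SAM FLAG bit: last segment in the template


def split_by_mate(recs):
    """Hypothetical helper: route each record to a read-1 or read-2 list,
    preserving input order, as a single pass over the file would."""
    read1, read2 = [], []
    for name, flag in recs:
        template = name.rsplit("/", 1)[0]  # drop the /1 or /2 suffix
        if flag & READ1:
            read1.append(template)
        elif flag & READ2:
            read2.append(template)
    return read1, read2


read1, read2 = split_by_mate(records)

# Compare the two lists position by position.
for a, b in zip(read1, read2):
    print(a, b, "same" if a == b else "DIFFERENT")
```

Only the first pair lines up; after that the two lists diverge in the same
way as the 1.fq and 2.fq listings in your email.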
Andrew
On Sun, Mar 20, 2016 at 11:12 PM Kar Tong Tan <karto...@gmail.com> wrote:
> I like the new Samtools fastq function which allows me to convert a
> coordinate sorted bam file into a fastq file without having to sort the
> file by read name (which can take forever especially for a really big bam
> file).
>
> However, I have noticed what seems like a bug while trying to convert a
> bam file recently.
>
> Using the following command (samtools v1.3), I converted my BAM file to
> fastq files:
>
> $ samtools fastq -1 1.fq -2 2.fq ./alignments.bam
>
> However, if I take a look at the 1.fq and 2.fq files, I notice that the
> reads in the fastq files are not sorted properly according to read names.
>
> $ head 1.fq
> @UNC13-SN749_82:3:1102:14504:162540/1
> GTTAGGGTTGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTT
> +
> BBCFFFFDHHHHDHGHIJFGIHIJ?GHIJJFHHIJJDGHHJJDGEHIJ.B
> @UNC13-SN749_82:3:2105:9477:158884/1
> GCTCCTCTCCACAGGAAAACTCCACTCCAGTGCTCAGCTTGCACCCTGGC
> +
> ?B@FFFFFHHHHHJJJJJJJJJGJJJJJIIIIJJJIJJJJJJJJIGIJJJ
> @UNC13-SN749_82:3:1207:3243:175188/1
> TATTAAGTTACATGCAGACAACAGGGGCCAGAAGATGAACAATGGCCCAT
>
> $ head 2.fq
> @UNC13-SN749_82:3:1102:14504:162540/2
> TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA
> +
> CCCFFFFFHHHHHJJJJJJJJJJJIIIJJJIJJJJJJJJJJJJJJIJJIG
> @UNC13-SN749_82:3:1207:3243:175188/2
> ATTTTCTTTGACCTCTTCCTTCTGTTCATGTGTATTTGCTGTCTCTTAGC
> +
> <@@FFFDFHFAHHIJIJG4FFIHIIIIHGIEHH>HHGHICHHIGEHHIII
> @UNC13-SN749_82:3:2105:9477:158884/2
> CTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCAGACTTCCCGTG
>
>
>
> If I look at the BAM file, this is what I see:
>
> UNC13-SN749_82:3:1102:14504:162540/2 163 chr1 10019 69 50M = 10068 99 TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA CCCFFFFFHHHHHJJJJJJJJJJJIIIJJJIJJJJJJJJJJJJJJIJJIG RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:1 HI:i:1 NM:i:0
> UNC13-SN749_82:3:1102:14504:162540/1 83 chr1 10068 69 50M = 10019 -99 AACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAAC B.JIHEGDJJHHGDJJIHHFJJIHG?JIHIGFJIHGHDHHHHDFFFFCBB RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:1 HI:i:1 NM:i:0
> UNC13-SN749_82:3:1207:3243:175188/2 163 chr1 11886 56 50M = 12105 269 ATTTTCTTTGACCTCTTCCTTCTGTTCATGTGTATTTGCTGTCTCTTAGC <@@FFFDFHFAHHIJIJG4FFIHIIIIHGIEHH>HHGHICHHIGEHHIII RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:7 HI:i:1 NM:i:2
> UNC13-SN749_82:3:2105:9477:158884/2 163 chr1 11900 69 50M = 12040 190 CTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCAGACTTCCCGTG CCCFFFFFHHHHHJJJIIHJIJJJJJJJJJJJJJJJJJJIJJIIIJJJIJ RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:6 HI:i:1 NM:i:0
> UNC13-SN749_82:3:2105:9477:158884/1 83 chr1 12040 69 50M = 11900 -190 GCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTGTGGAGAGGAGC JJJIGIJJJJJJJJIJJJIIIIJJJJJGJJJJJJJJJHHHHHFFFFF@B? RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:6 HI:i:1 NM:i:0
> UNC13-SN749_82:3:1201:10653:108594/2 137 chr1 12085 59 50M * 0 0 GGAGCCATGCCTAGAGTGGGATGGGCCATTGTTCATATTCTGGCCCCTGT CCCFFFFFHHHHHIJIEFHGIGGJJJJJJJJJJJJJJIJIJJIIJJJIJJ RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:8 HI:i:1 NM:i:1
> UNC13-SN749_82:3:1207:3243:175188/1 83 chr1 12105 69 50M = 11886 -269 ATGGGCCATTGTTCATCTTCTGGCCCCTGTTGTCTGCATGTAACTTAATA IIIIIIIHDGIJIIHCIIIJJJIGIIJJJIIJIIJIGHGHFHDFFFFCC@ RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:7 HI:i:1 NM:i:0
> UNC13-SN749_82:3:1108:9942:173119/1 89 chr1 12110 39 50M * 0 0 CCATTGTTCATATTCTGGCCCCTGTTGTCTGCATGTAACCTAATACCACG EGIGJJIGJJJJJJJIJHHIGJJJJJJIJIJIJJIJIHGHGHDDBFD@BB RG:Z:110714_UNC13-SN749_0082_AD0DGMABXX_3_ IH:i:8 HI:i:1 NM:i:3
>
>
>
> Does anyone know what might be causing this error? By the way, this is an
> RNA-seq BAM file.
>
>
>
> Thank you.
>
>
> Kar-Tong
>
>
>
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help
>