Dear samtools team,
For a downstream application I need to change the chromosome names in my BAM files.
samtools view example.bam | head -n2
FCC0U42ACXX:2:2103:13375:132773#ACTACAAG 99 SL2.40ch04 1 29
100M = 358 459
CATCACGGCCAATCCAACTCATTTTCAATGTCAAACGAGCCTCGAAGCGCGCATACCCCCCATTTCGACGATTTTTGTGTGCTATAGCAAAACACTTTTT
@@CFDDFADHHDHIGBBHHG@HHHGGEHIIGGGIDH@@GGGCEFAECFGG<?<ABEECC?;233@A
?BB3)2??CA@BC<8CA>AD@@>C9?AA<:((:@ XT:A:U NM:i:3 SM:i:29
AM:i:29 X0:i:1 X1:i:0 XM:i:3 XO:i:0 XG:i:0
MD:Z:86C4C2T5
RG:Z:120512_I238_FCC0U42ACXX_L2_SZAXPI008746-45_1.fq.gz.clean.dup.clean.gz.part5.1.fq.bam
FCC0U42ACXX:2:1306:12617:192543#ACTACAAG 89 SL2.40ch04 2 37
100M = 2 0
ATCACGGCCAATCCACCTCATTTTCAAGGTCAAACGAGCCTCGAAGCGCGCATACCCCCCATTTCGACGATTTTTGTGTGCTATACCAAACCATTTTTTG
?85&>>3B<:42213(3>@@CC>4(CCDCC<D?55BDDBDDDDDDDDDDDDC:5DJIIEIIIG@GGGJIFJJIJJIIJJIJIJIHIJHHDHHFFFFFCCB
XT:A:U NM:i:2 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:2
XO:i:0 XG:i:0 MD:Z:15A11T72
RG:Z:120512_I238_FCC0U42ACXX_L2_SZAXPI008746-45_1.fq.gz.clean.dup.clean.gz.part4.1.fq.bam
Then I applied the command:
samtools view -H example.bam | sed -r 's/SL2.40ch//g' | samtools reheader - example.bam >test.bam 2>outErr.txt
Then I checked the result with the command:
samtools view test.bam | head -n2
FCC0U42ACXX:2:2103:13375:132773#ACTACAAG 99 04 1 29 100M
= 358 459
CATCACGGCCAATCCAACTCATTTTCAATGTCAAACGAGCCTCGAAGCGCGCATACCCCCCATTTCGACGATTTTTGTGTGCTATAGCAAAACACTTTTT
@@CFDDFADHHDHIGBBHHG@HHHGGEHIIGGGIDH@@GGGCEFAECFGG<?<ABEECC?;233@A
?BB3)2??CA@BC<8CA>AD@@>C9?AA<:((:@ XT:A:U NM:i:3 SM:i:29
AM:i:29 X0:i:1 X1:i:0 XM:i:3 XO:i:0 XG:i:0
MD:Z:86C4C2T5 RG:Z:120512_I238_FCC0U42ACXX_L2_SZAXPI008746-45_1.fq.gz.clean.dup.clean.gz.part5.1.fq.bam
FCC0U42ACXX:2:1306:12617:192543#ACTACAAG 89 04 2 37 100M
= 2 0
ATCACGGCCAATCCACCTCATTTTCAAGGTCAAACGAGCCTCGAAGCGCGCATACCCCCCATTTCGACGATTTTTGTGTGCTATACCAAACCATTTTTTG
?85&>>3B<:42213(3>@@CC>4(CCDCC<D?55BDDBDDDDDDDDDDDDC:5DJIIEIIIG@GGGJIFJJIJJIIJJIJIJIHIJHHDHHFFFFFCCB
XT:A:U NM:i:2 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:2
XO:i:0 XG:i:0
MD:Z:15A11T72 RG:Z:120512_I238_FCC0U42ACXX_L2_SZAXPI008746-45_1.fq.gz.clean.dup.clean.gz.part4.1.fq.bam
So it looks OK.
But then the command:
samtools view -h test.bam | sed -rn '/SL2.40ch/p' >test.txt 2>outErr.txt
finds a number of lines like these:
FCC0U42ACXX:2:1104:6995:181777#ACTACAAG 83 04 623 29
59M1I40M = 234 -488
CTTGTGCTATAGCAAACCATTTTTTGGGTTATCCGGATTCCGACGTTAAAAATGCCATATTTTTTTGTGGACGTCTGTCAAGACCTTGGCTATGCATCCG
??94CCCEDDCCBBCCCC@CBBA?BBCC@2BBBCCCB@CEDAAECAIIIIIIIIGHGFGBGIGIGEIGHIGGGIHCGGCIIGFIIIIHHGHHFFFFFCCC
XT:A:R NM:i:2 SM:i:0 AM:i:0 X0:i:2 X1:i:1 XM:i:1
XO:i:1 XG:i:1 MD:Z:29G69
XA:Z:SL2.40ch01,-74867066,100M,4;SL2.40ch04,-623,59M1I40M,2;
RG:Z:120512_I238_FCC0U42ACXX_L2_SZAXPI008746-45_1.fq.gz.clean.dup.clean.gz.part0.1.fq.bam
FCC0U42ACXX:2:1305:8246:167295#ACTACAAG 83 04 679 60 100M
= 309 -470
TATTTTTTTGTGGACGTCTGTCAAGACCTTGGCTATGCATCCGATTTTCCTTCATGGACATTCCAACCCATTTTCAAGGTCAAACGAGCCCCGAAGTGCG
>BBB?A8DDDDDDBDDCDDCCDCC@
:DDDDDCCDECBDDDFFCFGHHEJIHEHEAGGAIHF@IGGBJJIGJJIEIJJIIHIGGGJJJHHHGHFFFFFCCC
XT:A:U NM:i:3 SM:i:23 AM:i:23 X0:i:1 X1:i:1 XM:i:3
XO:i:0 XG:i:0 MD:Z:0A0T0A97 XA:Z:SL2.40ch04,-680,2M1I97M,1;
RG:Z:120512_I238_FCC0U42ACXX_L2_SZAXPI008746-45_1.fq.gz.clean.dup.clean.gz.part4.1.fq.bam
As you can see, these lines still contain the old chromosome names (in the
XA:Z tag), and the downstream software (Conifer) also finds them and raises
an error.
My question: what is wrong with my commands? How can I completely remove the
old chromosome names from my BAM files?
Regards,
Denis
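
A possible explanation and workaround, offered as a sketch rather than a confirmed answer: samtools reheader replaces only the header. Alignment records store their reference as an index into the header, so the RNAME column follows the new names, but the XA:Z tag is free text inside each record and keeps the old ones. Rewriting the whole SAM text stream instead of just the header would cover both. The helper name strip_chr_prefix below is hypothetical, and the prefix SL2.40ch is taken from the output above.

```shell
# Hypothetical helper: strip the "SL2.40ch" prefix from every SAM text line.
# Because it filters the full text stream, it touches @SQ header lines,
# the RNAME/RNEXT columns, and XA:Z tag values alike.
# The dot is escaped so that it matches a literal "." only.
strip_chr_prefix() {
    sed 's/SL2\.40ch//g'
}

# Example on one XA tag copied from the output above
printf 'XA:Z:SL2.40ch04,-680,2M1I97M,1;\n' | strip_chr_prefix
```

In a full pipeline the helper would sit between two samtools calls, for example: samtools view -h example.bam | strip_chr_prefix | samtools view -b -o test.bam - (older samtools releases may also need -S to read SAM input). Note that the substitution is global, so it would also change any read name or other tag that happens to contain the same string.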
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help