[Samtools-help] Problems with picard 1.119. CollectAlignmentMetrics and CollectInsertSizeMetrics

2014-10-08 Thread Konstantinos Mavrommatis
Hi,
When I run CollectAlignmentMetrics and CollectInsertSizeMetrics with the
latest Picard (1.119), I get the error message below.
The same command line runs without issues and produces output on earlier
versions of Picard (I tried 1.107 and 1.96).
The BAM file was generated using STAR and was sorted using Picard SortSam.
Thanks in advance for your help
K

Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/scratch/RED/tmp/
[Wed Oct 08 00:15:12 PDT 2014] picard.analysis.CollectInsertSizeMetrics
    HISTOGRAM_FILE=/gpfs/scratch/RED/tmp/InsertSize/3966/gpfs/archive/RED/RNA-Seq/Processed/STARaln.human-bamQC-SortByCoordinates/InsertSize/O2_5.InsertSize.pdf
    METRIC_ACCUMULATION_LEVEL=[ALL_READS]
    INPUT=/gpfs/scratch/RED/tmp/InsertSize/3966/gpfs/archive/RED/RNA-Seq/Processed/STARaln.human-bamfiles-SortByCoordinates/O2_5.coord.bam
    OUTPUT=/gpfs/scratch/RED/tmp/InsertSize/3966/gpfs/archive/RED/RNA-Seq/Processed/STARaln.human-bamQC-SortByCoordinates/InsertSize/O2_5.InsertSize.qcstats
    REFERENCE_SEQUENCE=/gpfs/scratch/RED/tmp/InsertSize/3966/gpfs/reference/v1/Homo-sapiens/GRCh37.p12/WholeGenome/genome.fa
    TMP_DIR=[/scratch/RED/tmp/InsertSize/3966] VERBOSITY=WARNING
    VALIDATION_STRINGENCY=SILENT DEVIATIONS=10.0 MINIMUM_PCT=0.05
    ASSUME_SORTED=true STOP_AFTER=0 QUIET=false COMPRESSION_LEVEL=5
    MAX_RECORDS_IN_RAM=50 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Oct 08 00:15:12 PDT 2014] Executing as @ussdgsphpccmp06 on Linux 2.6.32-358.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_67-b01; Picard version: 1.119(d44cdb51745f5e8075c826430a39d8a61f1dd832_1408991805) IntelDeflater
[Wed Oct 08 00:15:12 PDT 2014] picard.analysis.CollectInsertSizeMetrics done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=2026373120
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.NullPointerException
   at htsjdk.samtools.reference.ReferenceSequenceFileWalker.get(ReferenceSequenceFileWalker.java:87)
   at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:113)
   at picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:53)
   at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
   at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
   at picard.analysis.CollectInsertSizeMetrics.main(CollectInsertSizeMetrics.java:83)
child process exited with value 1


Konstantinos Mavrommatis, PhD
Senior Bioinformatics Scientist,
Celgene Corp, San Francisco
+1 415 839 7061

*
THIS ELECTRONIC MAIL MESSAGE AND ANY ATTACHMENT IS
CONFIDENTIAL AND MAY CONTAIN LEGALLY PRIVILEGED
INFORMATION INTENDED ONLY FOR THE USE OF THE INDIVIDUAL
OR INDIVIDUALS NAMED ABOVE.
If the reader is not the intended recipient, or the
employee or agent responsible to deliver it to the
intended recipient, you are hereby notified that any
dissemination, distribution or copying of this
communication is strictly prohibited. If you have
received this communication in error, please reply to the
sender to notify us of the error and delete the original
message. Thank You.
--
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311iu=/4140/ostg.clktrk___
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help


Re: [Samtools-help] Problems with picard 1.119. CollectAlignmentMetrics and CollectInsertSizeMetrics

2014-10-08 Thread John Damm Sørensen
Could it be that you are using Java 1.7?
Picard used to run on Java 1.6 only.
From the Website:
The Picard command-line tools are packaged as executable jar files. They 
require Java 1.6. They can be invoked as follows:

Best
John



Re: [Samtools-help] How to create consensus sequence

2014-10-08 Thread Colin Hercus
Hi Sami,

I'm not sure why that wouldn't work. Try

samtools mpileup -f ref.fasta sorted.bam | less

and see what you get. If that works I'd expect bcf output would work.

Colin

On 8 October 2014 16:32, Sami Pietilä sami.piet...@gmail.com wrote:

 Hi Colin,

 Thanks for a reply! Novocraft package looks interesting. However, I am not
 quite sure how to use it in this case. I have a bam file containing the
 alignments and a fasta file for the reference sequences. I am not able to
 produce VCF (at least with samtools) because mpileup does not give anything.

 Is there a way of using novoutil to generate a consensus sequence(s) from
 bam without having a vcf file?

 Many thanks

 Sami

 2014-10-08 5:47 GMT+03:00 Colin Hercus co...@novocraft.com:

 Hi Sami,

 You can use novoutil iupac to generate a new consensus sequence from an
 original fasta and a vcf file. There are options for selecting variant
 quality and whether to include indel calls. It can also lift over
 coordinates on a bed file when applying inserts.

 It's free to use and part of Novoalign download at www.novocraft.com.

 Kind Regards, Colin
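
 The idea of building a consensus from a reference plus variant calls can be sketched in a few lines of Python. This is only an illustration of the principle, not how novoutil iupac is implemented: the function names, the `snps` dictionary layout, and the example call are assumptions made for the sketch, and indels are ignored entirely. Heterozygous SNPs are written as IUPAC ambiguity codes (for example C/T becomes Y), homozygous SNPs replace the reference base.

```python
# IUPAC ambiguity codes for sets of two or more bases.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def iupac_code(bases):
    """Collapse a collection of A/C/G/T bases into one IUPAC symbol."""
    base_set = frozenset(b.upper() for b in bases)
    if len(base_set) == 1:
        return next(iter(base_set))
    return IUPAC[base_set]


def apply_snps(reference, snps):
    """Return a consensus string with SNP calls applied to the reference.

    snps maps a 1-based position to (alt_base, is_heterozygous).
    Heterozygous calls become the IUPAC code for {ref, alt};
    homozygous calls replace the reference base outright.
    Indels are not handled in this sketch.
    """
    consensus = list(reference)
    for pos, (alt, is_het) in snps.items():
        ref_base = consensus[pos - 1]
        consensus[pos - 1] = iupac_code([ref_base, alt]) if is_het else alt
    return "".join(consensus)


if __name__ == "__main__":
    # Hypothetical reference fragment and a hypothetical het C/T call at position 9.
    print(apply_snps("CCGCAAGGCTGAAACT", {9: ("T", True)}))
```

 With the hypothetical input above, the C at position 9 is replaced by Y, which is the kind of ambiguity code visible in the IGV consensus quoted in this thread.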

 On 7 October 2014 21:22, Sami Pietilä sami.piet...@gmail.com wrote:

 Hi,

 I haven't been able to create a consensus sequence with samtools. I have
 a ref.fasta containing just one sequence and then sorted.bam containing
 alignments against the reference.

 samtools mpileup -uf ref.fasta sorted.bam | bcftools view -cg -

 gives nothing after the #CHROM POS ID REF .. line.

 IGV is able to give the consensus sequence (see below).

 Is there a way to get a consensus sequence using samtools?

 Thanks

 /Sami

 NNN
 NNN
 NNN
 NNN
 NNTCACCAGGGC
 GACGATGCGTAGCCGGCCTGAGAGGGTGGACGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTA
 GGGAA

 NNNCAATAAGTATCCCGCCTAGTACGG
 CCGCAAGGYTGAAACTCAAAGGAATTGACGCCCGCACAAGCGGTGGAGYATGTGGTTTAATTCGAMGCAACGCGAAGAA
 CCTTACCAGGTCTTGACATCNN
 NNN
 NNN
 NNN
 NNN
 NNN
 




Re: [Samtools-help] How to create consensus sequence

2014-10-08 Thread Sami Pietilä
Hi Colin,

Running the following command:

samtools mpileup -f ref.fasta sorted.bam | less

shows:

--- less output ---
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
(END)


Many Thanks

Sami




[Samtools-help] HardFiltering variants

2014-10-08 Thread mehar

Hi all,

I know that filtering variants manually, using thresholds on quality
values, is subject to all sorts of caveats. Still, hard filtering is
better than nothing, so I am writing to ask for suggestions.


Could someone provide *generic recommendations* using samtools that
would at least give a starting point for analysing the data?


I look forward to your suggestions!



Re: [Samtools-help] Problems with picard 1.119. CollectAlignmentMetrics and CollectInsertSizeMetrics

2014-10-08 Thread Nils Homer
Could you let me know if you see a sequence dictionary next to the
reference sequence fasta?  The file would be named:
/gpfs/scratch/RED/tmp/InsertSize/3966/gpfs/reference/v1/Homo-sapiens/GRCh37.p12/WholeGenome/genome.dict

Thanks,

N
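
The check asked for above can be scripted. The following is a minimal Python sketch, assuming only the naming stated in this message (genome.fa sits next to genome.dict); the function names are made up for illustration, and the thread does not confirm that a missing dictionary is the cause of the NullPointerException.

```python
from pathlib import Path


def expected_dict_path(fasta_path):
    """Return the .dict path that should sit next to a reference FASTA.

    Follows the naming in the message above: genome.fa -> genome.dict.
    """
    return Path(fasta_path).with_suffix(".dict")


def has_sequence_dictionary(fasta_path):
    """True if the expected .dict file exists and is not empty."""
    dict_path = expected_dict_path(fasta_path)
    return dict_path.is_file() and dict_path.stat().st_size > 0


if __name__ == "__main__":
    ref = ("/gpfs/scratch/RED/tmp/InsertSize/3966/gpfs/reference/v1/"
           "Homo-sapiens/GRCh37.p12/WholeGenome/genome.fa")
    print(expected_dict_path(ref))
    print("dictionary present:", has_sequence_dictionary(ref))
```

If the file turns out to be missing, Picard's CreateSequenceDictionary tool can generate one from the FASTA.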




Re: [Samtools-help] HardFiltering variants

2014-10-08 Thread Tim Fennell
Depending on a) whether you’re dealing with human, another diploid organism or 
something else and b) what kind of data you have (wgs, exome, other) you might 
start with Heng’s CHM1 paper as an interesting read:
http://arxiv.org/pdf/1404.0929.pdf

-t
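
For the mechanics of hard filtering (as opposed to which thresholds to choose), a small Python sketch may help. It keeps VCF records whose QUAL and INFO/DP values meet caller-supplied cut-offs. The function names and the default values of `min_qual` and `min_depth` are placeholders invented for this example, not recommendations from this thread or from the paper above; suitable values depend on the organism and data type.

```python
def passes_hard_filter(vcf_line, min_qual=20.0, min_depth=10):
    """Return True if a VCF data line meets the QUAL and INFO/DP cut-offs.

    The default thresholds are placeholders for illustration only.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    qual = cols[5]
    if qual == "." or float(qual) < min_qual:
        return False
    # Parse key=value pairs from the INFO column; flag-only keys are skipped.
    info = dict(kv.split("=", 1) for kv in cols[7].split(";") if "=" in kv)
    depth = int(info.get("DP", 0))
    return depth >= min_depth


def hard_filter(lines, **thresholds):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or passes_hard_filter(line, **thresholds):
            yield line


if __name__ == "__main__":
    # Fixed columns and a shortened INFO field of a record posted on this list.
    record = "seq1\t6554\t.\tG\tA\t121.224\t.\tDP=102;MQ=42\n"
    print(passes_hard_filter(record))
    print(passes_hard_filter(record, min_qual=200))
```

Keeping the thresholds as parameters makes it easy to try several settings and compare how many calls survive each one.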




[Samtools-help] Picard Release 1.122

2014-10-08 Thread George Grant
Picard Release 1.122
8 October 2014

- New Command Line Program GenotypeConcordance
-- Calculates the concordance between genotype data for two samples in
two different VCFs - one being considered the truth (or reference) the
other being considered the call.  The concordance is broken into separate
results sections for SNPs and indels.  Summary and detailed statistics are
reported.
Note that for any pair of variants to compare, only the alleles for the
samples under interrogation are considered and MNP, Symbolic, and Mixed
classes of variants are not included.

- New Command Line Program UpdateVcfDictionary
-- Updates the sequence dictionary of a VCF from another file (SAM,
BAM, VCF, dictionary, interval_list, fasta, etc).

- New Command Line Program VcfToIntervalList
-- Create an interval list from a VCF

- New Command Line Program MarkDuplicatesWithMateCigar
-- A new tool with which to mark duplicates:
This tool can replace MarkDuplicates if the input SAM/BAM has Mate
CIGAR (MC) optional tags pre-computed (see the tools
RevertOriginalBaseQualitiesAndAddMateCigar and FixMateInformation).
This allows the new tool to perform a streaming duplicate marking
routine (i.e. a single pass).  This tool cannot be used with
alignments that have large gaps or reference skips, which happen
frequently in RNA-seq data.

There were many refactors of the old MarkDuplicates and
MarkDuplicatesWithMateCigar, since they share common code.
EstimateLibraryComplexity was caught up in this too.

Many, many, many unit tests were added to prove equivalency of
MarkDuplicatesWithMateCigar to MarkDuplicates.  This also exposed a
few one-in-a-million corner cases in MarkDuplicates, both in duplicate
marking and in optical duplicate detection.  As a result,
MarkDuplicates needs to write slightly larger temporary files when
running.  SamFileTester was also improved to handle the various test
cases for duplicate marking testing.

- Updates to IntervalList:
-- Added capacity to create a simple interval list from a string (the
name of the contig)
-- Added the capacity to subtract one interval list from another
(currently it would only work if they were both wrapped inside a container)

- Updates to SamLocusIterator
-- Performance optimizations gaining about 35% speed up...

- Updates to MarkDuplicates:
-- Removed unnecessary storage of a string in the Read Ends in Mark
-- Clarified the size of ReadEndsForMarkDuplicates

- Updated the minimum number of times that the BAIT_INTERVALS (in
CalculateHsMetrics) and TARGET_INTERVALS (in CollectTargetedMetrics) must
be set to one.

- Moved CollectHiSeqPfFailMetrics into picard public

- Updates to documentation generation (internal):
-- changed link to IntervalList.java documentation
-- updated how _includes/command-line-usage.html is generated

- Moved SAMSequenceDictionaryExtractor and tests from picard to htsjdk
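
The interval-list subtraction mentioned under the IntervalList updates can be pictured with a short Python sketch. This is not Picard's Java implementation; the function name and the tuple representation are assumptions made for illustration. It works on a single contig, with intervals given as 1-based inclusive (start, end) tuples, and assumes the intervals being subtracted from do not overlap one another.

```python
def subtract_intervals(minuend, subtrahend):
    """Subtract one list of intervals from another on a single contig.

    Intervals are (start, end) tuples, 1-based and inclusive.
    Returns the parts of `minuend` not covered by `subtrahend`, sorted.
    """
    result = []
    cuts = sorted(subtrahend)
    for start, end in sorted(minuend):
        cursor = start  # first position not yet emitted or removed
        for c_start, c_end in cuts:
            if c_end < cursor:
                continue  # cut lies entirely before the remaining part
            if c_start > end:
                break  # cuts are sorted, so later ones cannot overlap
            if c_start > cursor:
                result.append((cursor, c_start - 1))
            cursor = max(cursor, c_end + 1)
            if cursor > end:
                break
        if cursor <= end:
            result.append((cursor, end))
    return result


if __name__ == "__main__":
    print(subtract_intervals([(1, 100)], [(10, 20), (50, 60)]))
    # -> [(1, 9), (21, 49), (61, 100)]
```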

- George


Re: [Samtools-help] samtools V1.1 DPR option

2014-10-08 Thread Petr Danecek
Hi Chushin,

The DPR fields will now also be trimmed with -c; the fixed version is on GitHub:
http://samtools.github.io/bcftools/

Cheers,
Petr
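
A quick way to spot records affected by this issue is to compare the number of DPR values with the number of alleles (REF plus each ALT). The Python sketch below does that for one sample column; the function name is hypothetical, and the example input is the first sample of the seq1:6554 record quoted further down in this thread.

```python
def dpr_length_ok(ref, alt, fmt, sample):
    """Check that DPR has one value per allele (REF plus each ALT).

    ref, alt: REF and ALT columns of a VCF record (ALT comma-separated).
    fmt, sample: FORMAT column and one sample column, colon-separated.
    Returns (is_ok, n_alleles, dpr_values).
    """
    n_alleles = 1 + len([a for a in alt.split(",") if a != "."])
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    dpr = [int(x) for x in fields["DPR"].split(",")]
    return len(dpr) == n_alleles, n_alleles, dpr


if __name__ == "__main__":
    # REF=G, ALT=A: a biallelic site, so two DPR values are expected.
    print(dpr_length_ok("G", "A", "GT:PL:DP:DPR", "1/1:61,15,0:5:0,5,0"))
    # -> (False, 2, [0, 5, 0])
```

A result of False here means the DPR field carries more values than the record has alleles, which is the untrimmed output described in this thread.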

On 7 Oct 2014, at 20:02, Petr Danecek wrote:

 Thank you. I see what the problem is now; I did not notice the -c in the
 bcftools call command. DPR output is currently supported only with -m, sorry.
 
 Petr
 
 
 On 7 Oct 2014, at 19:39, Koh, Chushin wrote:
 
 Hi Petr,
 
 Thank you for your reply. Here is the Version info 'grep' from the same VCF 
 file:
 
 ##samtoolsVersion=1.1+htslib-1.1
 ##bcftools_callVersion=1.1+htslib-1.1
 
 With Regards, 
 ChuShin.
 
 
 From: Petr Danecek [p...@sanger.ac.uk]
 Sent: Tuesday, October 07, 2014 12:28 PM
 To: Koh, Chushin
 Cc: samtools-help@lists.sourceforge.net
 Subject: Re: [Samtools-help] samtools V1.1 DPR option
 
 Hi ChuShin,
 
 only two values should appear on output at biallelic sites. I believe this 
 was fixed some time ago
 http://github.com/samtools/bcftools/commit/91826ee6
 
 Are you sure the output is really from bcftools v1.1? It's easy to check - 
 there is a version string in the VCF header.
 
 Petr
 
 
 On 7 Oct 2014, at 18:24, Koh, Chushin wrote:
 
 Hi all,
 
 I need clarification on the DPR option (see the example output below). There
 are two alleles reported for each locus; however, the DPR field has 3
 values at pos 6554 and 4 values at pos 6690. What do the 3rd and 4th
 values correspond to (e.g. which nucleotide) in these cases?  Thanks!
 
 Cheers,
 ChuShin
 
 
 --
 command:
 $ samtools mpileup -B -d 1 -t DP,DPR -uf reference.fa BAM_FILES | bcftools call -cv -o raw.vcf
 --

 --
 output:
 seq1  6554  .  G  A  121.224  .
   DP=102;VDB=0.360777;SGB=17.9188;RPB=0.404703;MQB=2.10899e-12;MQSB=0.980733;BQB=0.730302;MQ0F=0.147059;AF1=0.293951;G3=0.669799,2.37878e-06,0.330198;HWE=6.51903e-05;AC1=10;DP4=43,35,21,3;MQ=42;FQ=122.706;PV4=0.0038602,0.458111,9.36477e-23,0.161162
   GT:PL:DP:DPR
   1/1:61,15,0:5:0,5,0  0/0:0,9,103:3:3,0,0  0/0:0,24,211:8:8,0,0
   0/0:0,30,238:10:10,0,0  0/0:0,36,255:12:12,0,0  0/0:0,18,166:6:6,0,0
   0/0:0,30,239:10:10,0,0  0/0:0,0,0:0:0,0,0  0/0:0,21,196:7:7,0,0
   1/1:99,21,0:7:0,7,0  0/0:0,24,219:8:8,0,0  0/0:0,24,222:8:8,0,0
   1/1:18,21,0:7:0,7,0  0/0:0,0,0:0:0,0,0  1/1:28,9,0:3:0,3,0
   0/0:0,18,190:6:6,0,0  0/1:22,6,0:2:0,2,0
 seq1  6690  .  C  T  999  .
   DP=152;VDB=0.981285;SGB=3.7794;RPB=0.326667;MQB=0;MQSB=8.54057e-07;BQB=1;MQ0F=0.151316;AF1=0.967821;AC1=33;DP4=1,1,86,64;MQ=35;FQ=15.3744;PV4=1,0.39789,0.00644419,0.357468
   GT:PL:DP:DPR
   1/1:145,39,0:13:0,13,0,0  1/1:157,24,0:8:0,8,0,0  1/1:241,39,0:13:0,13,0,0
   1/1:238,33,0:12:0,11,1,0  1/1:255,45,0:15:0,15,0,0  1/1:128,15,0:5:0,5,0,0
   1/1:191,30,0:10:0,10,0,0  0/1:0,6,63:2:2,0,0,0  1/1:163,27,0:9:0,9,0,0
   1/1:161,30,0:10:0,10,0,0  1/1:255,54,0:18:0,18,0,0  1/1:255,48,0:16:0,16,0,0
   1/1:88,24,0:8:0,8,0,0  1/1:51,9,0:3:0,3,0,0  1/1:71,9,0:3:0,3,0,0
   1/1:98,9,0:3:0,3,0,0  1/1:39,12,0:4:0,4,0,0
 --
 
 
 
 --
 The Wellcome Trust Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.
 
 
 



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in