Dear Samtools-help list members,

I want to run EstimateLibraryComplexity.jar on a 9.8 GB BAM file, but I 
always get an OutOfMemoryError. I have already tried -Xmx values up to 60 GB 
and still get the error. Does anybody have an idea how to run 
EstimateLibraryComplexity on larger BAM files?

 
That's my call and the error message:
=============================

$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity

[Wed Jun 04 21:43:08 CEST 2014] picard.sam.EstimateLibraryComplexity INPUT=[file.bam] OUTPUT=file.libraryComplexity    MIN_IDENTICAL_BASES=5 MAX_DIFF_RATE=0.03 MIN_MEAN_QUALITY=20 MAX_GROUP_RATIO=500 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 04 21:43:08 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO    2014-06-04 21:43:08     EstimateLibraryComplexity       Will store 15494157 read pairs in memory before sorting.
INFO    2014-06-04 21:43:13     EstimateLibraryComplexity       Read     1,000,000 records.  Elapsed time: 00:00:05s.  Time for last 1,000,000:    5s.  Last read position: chr10:38,239,480

....

INFO    2014-06-04 21:53:21     EstimateLibraryComplexity       Read    30,000,000 records.  Elapsed time: 00:10:13s.  Time for last 1,000,000:  183s.  Last read position: chr15:34,522,127

[Wed Jun 04 22:54:26 CEST 2014] picard.sam.EstimateLibraryComplexity done. Elapsed time: 71.30 minutes.
Runtime.totalMemory()=5801312256
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:2694)
        at java.lang.String.<init>(String.java:203)
        at java.lang.String.substring(String.java:1913)
        at htsjdk.samtools.util.StringUtil.split(StringUtil.java:89)
        at picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:71)
        at picard.sam.EstimateLibraryComplexity.doWork(EstimateLibraryComplexity.java:256)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
        at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
        at picard.sam.EstimateLibraryComplexity.main(EstimateLibraryComplexity.java:217)
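
For reference, the Runtime.totalMemory() figure in the log above is given in 
bytes. Here is a minimal sketch in Python (the helper name is my own, it is 
not part of Picard) that converts it to GiB so it can be compared with the 
-Xmx10g setting from the call. As far as I understand, Runtime.totalMemory() 
is the heap the JVM currently holds rather than the -Xmx ceiling, so this is 
only a rough sanity check:

```python
GIB = 1024 ** 3  # number of bytes in one GiB


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB (hypothetical helper, not a Picard API)."""
    return n_bytes / GIB


# Value copied from the "Runtime.totalMemory()=..." line of the log.
total_memory_bytes = 5801312256
# Heap ceiling requested with -Xmx10g in the call.
xmx_gib = 10

total_memory_gib = bytes_to_gib(total_memory_bytes)
print(f"Runtime.totalMemory() = {total_memory_gib:.2f} GiB "
      f"(heap ceiling requested: {xmx_gib} GiB)")
```
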

 

And this is the Java version:
=====================

$ java -showversion
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)


I also tried ValidateSamFile.jar:
========================


$ java -jar /scr/k41san/tools/picard/picard-tools-1.114/ValidateSamFile.jar INPUT=file.bam MODE=SUMMARY

[Thu Jun 05 12:12:17 CEST 2014] picard.sam.ValidateSamFile INPUT=file.bam MODE=SUMMARY    MAX_OUTPUT=100 IGNORE_WARNINGS=false VALIDATE_INDEX=true IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Thu Jun 05 12:12:17 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO    2014-06-05 12:13:18     SamFileValidator        Validated Read    10,000,000 records.  Elapsed time: 00:01:00s.  Time for last 10,000,000:   60s.  Last read position: chr11:67,275,063
INFO    2014-06-05 12:14:36     SamFileValidator        Validated Read    20,000,000 records.  Elapsed time: 00:02:18s.  Time for last 10,000,000:   77s.  Last read position: chr12:112,229,147
INFO    2014-06-05 12:15:45     SamFileValidator        Validated Read    30,000,000 records.  Elapsed time: 00:03:27s.  Time for last 10,000,000:   69s.  Last read position: chr15:34,522,127
INFO    2014-06-05 12:18:05     SamFileValidator        Validated Read    40,000,000 records.  Elapsed time: 00:05:47s.  Time for last 10,000,000:  140s.  Last read position: chr16:56,362,603
INFO    2014-06-05 12:20:07     SamFileValidator        Validated Read    50,000,000 records.  Elapsed time: 00:07:49s.  Time for last 10,000,000:  121s.  Last read position: chr17:65,979,420
INFO    2014-06-05 12:21:11     SamFileValidator        Validated Read    60,000,000 records.  Elapsed time: 00:08:53s.  Time for last 10,000,000:   64s.  Last read position: chr19:38,049,399
INFO    2014-06-05 12:27:34     SamFileValidator        Validated Read    70,000,000 records.  Elapsed time: 00:15:16s.  Time for last 10,000,000:  383s.  Last read position: chr1:43,396,405
INFO    2014-06-05 12:48:18     SamFileValidator        Validated Read    80,000,000 records.  Elapsed time: 00:36:00s.  Time for last 10,000,000: 1,243s.  Last read position: chr1:246,706,542

>> Still running  2014-06-05 15:37
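
To make the slowdown in the ValidateSamFile run easier to see, here is a 
small Python sketch (my own, with hypothetical names, not a Picard tool) 
that pulls the record count, the time for the last batch, and the last read 
position out of the progress lines and prints the throughput per batch. The 
log lines are copied from the output above, with runs of whitespace 
collapsed to single spaces:

```python
import re

# Progress lines from the ValidateSamFile log, one per line.
LOG = """\
INFO 2014-06-05 12:13:18 SamFileValidator Validated Read 10,000,000 records. Elapsed time: 00:01:00s. Time for last 10,000,000: 60s. Last read position: chr11:67,275,063
INFO 2014-06-05 12:14:36 SamFileValidator Validated Read 20,000,000 records. Elapsed time: 00:02:18s. Time for last 10,000,000: 77s. Last read position: chr12:112,229,147
INFO 2014-06-05 12:15:45 SamFileValidator Validated Read 30,000,000 records. Elapsed time: 00:03:27s. Time for last 10,000,000: 69s. Last read position: chr15:34,522,127
INFO 2014-06-05 12:18:05 SamFileValidator Validated Read 40,000,000 records. Elapsed time: 00:05:47s. Time for last 10,000,000: 140s. Last read position: chr16:56,362,603
INFO 2014-06-05 12:20:07 SamFileValidator Validated Read 50,000,000 records. Elapsed time: 00:07:49s. Time for last 10,000,000: 121s. Last read position: chr17:65,979,420
INFO 2014-06-05 12:21:11 SamFileValidator Validated Read 60,000,000 records. Elapsed time: 00:08:53s. Time for last 10,000,000: 64s. Last read position: chr19:38,049,399
INFO 2014-06-05 12:27:34 SamFileValidator Validated Read 70,000,000 records. Elapsed time: 00:15:16s. Time for last 10,000,000: 383s. Last read position: chr1:43,396,405
INFO 2014-06-05 12:48:18 SamFileValidator Validated Read 80,000,000 records. Elapsed time: 00:36:00s. Time for last 10,000,000: 1,243s. Last read position: chr1:246,706,542
"""

# Captures: total records, batch size, seconds for the batch, last position.
PATTERN = re.compile(
    r"Read\s+([\d,]+) records\.\s+Elapsed time: \S+\s+"
    r"Time for last ([\d,]+):\s+([\d,]+)s\.\s+Last read position: (\S+)"
)


def parse_progress(log_text):
    """Return (total_records, last_position, records_per_second) tuples."""
    rows = []
    for match in PATTERN.finditer(log_text):
        total = int(match.group(1).replace(",", ""))
        batch = int(match.group(2).replace(",", ""))
        seconds = int(match.group(3).replace(",", ""))
        rows.append((total, match.group(4), batch / seconds))
    return rows


for total, position, rate in parse_progress(LOG):
    print(f"{total:>11,} records  {rate:>9,.0f} records/s  at {position}")
```

The throughput drops sharply once the reader reaches the chr1 reads at the 
end of the file, which is also where the run was still busy when I wrote 
this mail.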


I also posted the problem at Biostars (https://www.biostars.org/p/102538/) and 
SEQanswers (http://seqanswers.com/forums/showthread.php?t=43910).


Thanks for your help,
David Langenberger