Dear Samtools-help list members,

I want to run EstimateLibraryComplexity.jar on a 9.8 GB BAM file, but I always get an OutOfMemoryError. I already tried raising -Xmx (up to 60 GB) and still get the error. Does anybody have an idea how to run EstimateLibraryComplexity on bigger BAM files?
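As a rough sanity check on the numbers in the log below, this small Python sketch relates the requested heap (-Xmx10g) to the 15,494,157 read pairs that EstimateLibraryComplexity says it will hold in memory, and converts the Runtime.totalMemory() value printed at exit into GiB. The function names are made up for illustration and are not part of Picard. The sketch only does arithmetic on figures copied from the log and makes no claim about the cause of the error.

```python
# Figures copied from the log in this post.
XMX_BYTES = 10 * 1024**3               # heap cap requested with -Xmx10g
PAIRS_IN_MEMORY = 15_494_157           # "Will store 15494157 read pairs in memory before sorting."
TOTAL_MEMORY_AT_EXIT = 5_801_312_256   # Runtime.totalMemory() printed when the tool exited


def bytes_per_pair(heap_bytes: int, pairs: int) -> float:
    """Average share of the heap available per buffered read pair (hypothetical helper)."""
    return heap_bytes / pairs


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (hypothetical helper)."""
    return n_bytes / 1024**3


if __name__ == "__main__":
    print(f"requested heap:       {to_gib(XMX_BYTES):.2f} GiB")
    print(f"heap share per pair:  {bytes_per_pair(XMX_BYTES, PAIRS_IN_MEMORY):.0f} bytes")
    print(f"total memory at exit: {to_gib(TOTAL_MEMORY_AT_EXIT):.2f} GiB")
```

With the values above this comes to roughly 693 bytes of requested heap per buffered read pair, and about 5.4 GiB of total heap reported at exit, which is below the 10 GiB cap.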
That's my call and the error message:
=============================
$ java -Xmx10g -jar EstimateLibraryComplexity.jar INPUT=file.bam OUTPUT=file.libraryComplexity
[Wed Jun 04 21:43:08 CEST 2014] picard.sam.EstimateLibraryComplexity INPUT=[file.bam] OUTPUT=file.libraryComplexity MIN_IDENTICAL_BASES=5 MAX_DIFF_RATE=0.03 MIN_MEAN_QUALITY=20 MAX_GROUP_RATIO=500 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed Jun 04 21:43:08 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO 2014-06-04 21:43:08 EstimateLibraryComplexity Will store 15494157 read pairs in memory before sorting.
INFO 2014-06-04 21:43:13 EstimateLibraryComplexity Read 1,000,000 records. Elapsed time: 00:00:05s. Time for last 1,000,000: 5s. Last read position: chr10:38,239,480
....
INFO 2014-06-04 21:53:21 EstimateLibraryComplexity Read 30,000,000 records. Elapsed time: 00:10:13s. Time for last 1,000,000: 183s. Last read position: chr15:34,522,127
[Wed Jun 04 22:54:26 CEST 2014] picard.sam.EstimateLibraryComplexity done. Elapsed time: 71.30 minutes.
Runtime.totalMemory()=5801312256
To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.String.substring(String.java:1913)
    at htsjdk.samtools.util.StringUtil.split(StringUtil.java:89)
    at picard.sam.AbstractDuplicateFindingAlgorithm.addLocationInformation(AbstractDuplicateFindingAlgorithm.java:71)
    at picard.sam.EstimateLibraryComplexity.doWork(EstimateLibraryComplexity.java:256)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
    at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:124)
    at picard.sam.EstimateLibraryComplexity.main(EstimateLibraryComplexity.java:217)

And that's the java version:
=====================
$ java -showversion
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

I also tried ValidateSamFile.jar:
========================
$ java -jar /scr/k41san/tools/picard/picard-tools-1.114/ValidateSamFile.jar INPUT=file.bam MODE=SUMMARY
[Thu Jun 05 12:12:17 CEST 2014] picard.sam.ValidateSamFile INPUT=file.bam MODE=SUMMARY MAX_OUTPUT=100 IGNORE_WARNINGS=false VALIDATE_INDEX=true IS_BISULFITE_SEQUENCED=false MAX_OPEN_TEMP_FILES=8000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Thu Jun 05 12:12:17 CEST 2014] Executing as me@work on Linux 3.6.2-1.fc16.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_07-b10; Picard version: 1.114(444810c1de1433d9eca8130be63ccc7fd70a9499_1400593393) JdkDeflater
INFO 2014-06-05 12:13:18 SamFileValidator Validated Read 10,000,000 records. Elapsed time: 00:01:00s. Time for last 10,000,000: 60s. Last read position: chr11:67,275,063
INFO 2014-06-05 12:14:36 SamFileValidator Validated Read 20,000,000 records. Elapsed time: 00:02:18s. Time for last 10,000,000: 77s. Last read position: chr12:112,229,147
INFO 2014-06-05 12:15:45 SamFileValidator Validated Read 30,000,000 records. Elapsed time: 00:03:27s. Time for last 10,000,000: 69s. Last read position: chr15:34,522,127
INFO 2014-06-05 12:18:05 SamFileValidator Validated Read 40,000,000 records. Elapsed time: 00:05:47s. Time for last 10,000,000: 140s. Last read position: chr16:56,362,603
INFO 2014-06-05 12:20:07 SamFileValidator Validated Read 50,000,000 records. Elapsed time: 00:07:49s. Time for last 10,000,000: 121s. Last read position: chr17:65,979,420
INFO 2014-06-05 12:21:11 SamFileValidator Validated Read 60,000,000 records. Elapsed time: 00:08:53s. Time for last 10,000,000: 64s. Last read position: chr19:38,049,399
INFO 2014-06-05 12:27:34 SamFileValidator Validated Read 70,000,000 records. Elapsed time: 00:15:16s. Time for last 10,000,000: 383s. Last read position: chr1:43,396,405
INFO 2014-06-05 12:48:18 SamFileValidator Validated Read 80,000,000 records. Elapsed time: 00:36:00s. Time for last 10,000,000: 1,243s. Last read position: chr1:246,706,542
>> Still running 2014-06-05 15:37

I also posted the problem at Biostars (https://www.biostars.org/p/102538/) and SEQanswers (http://seqanswers.com/forums/showthread.php?t=43910).

Thanks for your help,
David Langenberger