Hi Bastian,

CreateSequenceDictionary attempts to build the entire sequence 
dictionary as a single Java String.  A Java String is backed by an array, 
so it cannot hold more than about 2^31-1 (roughly 2 billion) characters, 
I believe.  Your fasta has too many sequences for the program to handle, 
and none of the Picard programs would be able to handle such a sequence 
dictionary.  This limit is fixed in Java, so increasing -Xmx won't help.
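
To make the arithmetic concrete, here is a rough sketch (not Picard code) 
that estimates whether the @SQ lines for a reference would fit in one 
String.  The class name, the sequence counts and the 150-character average 
line length are all made-up assumptions; substitute the numbers for your 
own fasta.

```java
// Rough estimate of whether a SAM text header made of @SQ lines fits in
// one Java String. Illustration only: the class name and every figure
// used in main() are hypothetical, not taken from Picard or from the
// fasta discussed in this thread.
final class SqHeaderEstimate {

    // A String is backed by an array indexed by int, so its length can
    // never exceed Integer.MAX_VALUE characters, whatever -Xmx is set to.
    static final long MAX_STRING_CHARS = Integer.MAX_VALUE;

    // Total characters needed for numSequences @SQ lines that average
    // avgLineChars characters each (long arithmetic avoids int overflow).
    static long estimatedHeaderChars(long numSequences, long avgLineChars) {
        return numSequences * avgLineChars;
    }

    // True if the estimated header stays within the String length ceiling.
    static boolean fitsInOneString(long numSequences, long avgLineChars) {
        return estimatedHeaderChars(numSequences, avgLineChars) <= MAX_STRING_CHARS;
    }

    public static void main(String[] args) {
        long avgLineChars = 150L; // assumed average @SQ line length

        // 1,000,000 sequences: 150,000,000 chars, under the ceiling.
        System.out.println(fitsInOneString(1000000L, avgLineChars));  // prints "true"

        // 20,000,000 sequences: 3,000,000,000 chars, over the ceiling.
        System.out.println(fitsInOneString(20000000L, avgLineChars)); // prints "false"
    }
}
```

If the estimate comes out over the ceiling, no heap setting will get the 
header into a single String; the number of sequences in the reference is 
what has to change.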

-Alec

On 7/19/14, 7:23 PM, Bastian Schiffthaler wrote:
> Hi,
>
> I have the following problem when trying to execute 
> CreateSequenceDictionary.jar on an Ubuntu Linux server:
>
> Exception in thread "main" java.lang.OutOfMemoryError: Requested array size 
> exceeds VM limit
>
> The genome is rather big (around 12G). I have tried controlling memory usage 
> with:
>
> java -jar -Xmx <value> …cmd
>
> with <value>(s) 4G, 12G, 14G, 30G, 100G, 400G, 500G
>
> The error stays the same (system call attached, as well as error message)
>
> I have also tried setting MAX_RECORDS_IN_RAM to 5,000,000 or 20,000,000
>
>
> Additional information:
>
> ######## Java Version ############
>
> bastian@[hostname redacted]:~$ java -version
> java version "1.7.0_55"
> Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
>
> ######## Server system ############
>
> Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.8.0-29-generic x86_64)
>
> ######## Genome ##########
>
> 12666745333 Jul 16 19:41 picea_abies.master-rna-scaff.nov2012.fa
>
> ######## Server hardware ###########
>
> bastian@[hostname redacted]:~$ free
>               total       used       free     shared    buffers     cached
> Mem:     528344012  244740016  283603996          0     164596  224830244
> -/+ buffers/cache:   19745176  508598836
> Swap:     31583228     190760   31392468
>
> ######## The system has 64 CPUs: ##########
>
> cat /proc/cpuinfo
> .
> .
> .
> processor     : 63
> vendor_id     : AuthenticAMD
> cpu family    : 21
> model         : 2
> model name    : AMD Opteron(tm) Processor 6386 SE
> stepping      : 0
> microcode     : 0x6000822
> cpu MHz               : 2792.892
> cache size    : 2048 KB
> physical id   : 3
> siblings      : 16
> core id               : 7
> cpu cores     : 8
> apicid                : 143
> initial apicid        : 111
> fpu           : yes
> fpu_exception : yes
> cpuid level   : 13
> wp            : yes
> flags         : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
> rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm 
> aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes 
> xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a 
> misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm 
> topoext perfctr_core arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale 
> vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1
> bogomips      : 5586.23
> TLB size      : 1536 4K pages
> clflush size  : 64
> cache_alignment       : 64
> address sizes : 48 bits physical, 48 bits virtual
> power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro
>
> ######## File system ############
>
> df:
>
> [IP address redacted]:/export/[hostname redacted] 61201197056 29745669120 
> 31455527936  49% /mnt/[hostname redacted]
>
> ######## Command line call ##########
>
> [Sat Jul 19 23:57:33 CEST 2014] picard.sam.CreateSequenceDictionary 
> REFERENCE=picea_abies.master-rna-scaff.nov2012.fa 
> OUTPUT=picea_abies.master-rna-scaff.nov2012.dict MAX_RECORDS_IN_RAM=20000000  
>   TRUNCATE_NAMES_AT_WHITESPACE=true NUM_SEQUENCES=2147483647 VERBOSITY=INFO 
> QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 
> CREATE_INDEX=false CREATE_MD5_FILE=false
> [Sat Jul 19 23:57:33 CEST 2014] Executing as bastian@[hostname redacted] on 
> Linux 3.8.0-29-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_55-b13; 
> Picard version: 1.117(107391d3f3e72b31589868c250262ca79659f577_1405353489) 
> JdkDeflater
> [Sun Jul 20 00:18:57 CEST 2014] picard.sam.CreateSequenceDictionary done. 
> Elapsed time: 21.40 minutes.
> Runtime.totalMemory()=19666042880
> To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
> Exception in thread "main" java.lang.OutOfMemoryError: Requested array size 
> exceeds VM limit
>          at java.util.Arrays.copyOf(Arrays.java:2367)
>          at 
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
>          at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
>          at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:535)
>          at java.lang.StringBuffer.append(StringBuffer.java:322)
>          at java.io.StringWriter.write(StringWriter.java:94)
>          at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
>          at java.io.BufferedWriter.write(BufferedWriter.java:230)
>          at java.io.Writer.write(Writer.java:157)
>          at java.io.Writer.append(Writer.java:227)
>          at 
> htsjdk.samtools.SAMTextHeaderCodec.println(SAMTextHeaderCodec.java:392)
>          at 
> htsjdk.samtools.SAMTextHeaderCodec.writeSQLine(SAMTextHeaderCodec.java:447)
>          at 
> htsjdk.samtools.SAMTextHeaderCodec.encode(SAMTextHeaderCodec.java:371)
>          at 
> htsjdk.samtools.SAMTextHeaderCodec.encode(SAMTextHeaderCodec.java:356)
>          at 
> htsjdk.samtools.SAMFileWriterImpl.setHeader(SAMFileWriterImpl.java:135)
>          at 
> htsjdk.samtools.SAMFileWriterFactory.makeSAMWriter(SAMFileWriterFactory.java:212)
>          at 
> picard.sam.CreateSequenceDictionary.doWork(CreateSequenceDictionary.java:124)
>          at 
> picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:183)
>          at 
> picard.sam.CreateSequenceDictionary.main(CreateSequenceDictionary.java:97)
> ------------------------------------------------------------------------------
> Want fast and easy access to all the code in your enterprise? Index and
> search up to 200,000 lines of code with a free copy of Black Duck
> Code Sight - the same software that powers the world's largest code
> search on Ohloh, the Black Duck Open Hub! Try it now.
> http://p.sf.net/sfu/bds
> _______________________________________________
> Samtools-help mailing list
> Samtools-help@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/samtools-help

