[ 
https://issues.apache.org/jira/browse/LUCENE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089366#comment-14089366
 ] 

Christian Ziech edited comment on LUCENE-5875 at 8/7/14 3:54 PM:
-----------------------------------------------------------------

Oh, there is another OOM we get: at the time the exception was thrown we had 
been indexing for 5-6 hours and had already closed the IndexWriter. Now we only 
want to store the special terms we gathered during indexing in a custom FST. 
At the point in time the exception was thrown, effectively only one thread was 
active in the VM; the last GC attempt printed the following:
Eden: 0B(4021M)->0B(4021M) Survivors: 75M->75M Heap: 9615M(30720M)->9615M(30720M)
Those values are also pretty much in line with the numbers we get from the 
runtime if we add custom debug statements.
{code}
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.packed.Packed64.<init>(Packed64.java:73)
        at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:1034)
        at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:1001)
        at org.apache.lucene.util.packed.GrowableWriter.<init>(GrowableWriter.java:46)
        at org.apache.lucene.util.packed.GrowableWriter.resize(GrowableWriter.java:98)
        at org.apache.lucene.util.fst.FST.addNode(FST.java:845)
        at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:200)
        at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:289)
        at org.apache.lucene.util.fst.Builder.add(Builder.java:394)
        at com.nokia.search.candgen.spelling.AtomicFSTBuilder$FSTWriter.put(AtomicFSTBuilder.java:358)
        at com.nokia.search.candgen.spelling.AtomicFSTBuilder$WriteTask.run(AtomicFSTBuilder.java:156)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
{code}
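The stack trace shows the allocation failing inside GrowableWriter.resize, which allocates a fresh Packed64 (on top of the one it is replacing) while the node hash grows. As a rough back-of-the-envelope sketch of why this hurts even on a 30 GB heap, the following stdlib-only Java snippet estimates the size of the long[] backing a Packed64; the 1<<30 value count and 48 bits per value are illustrative assumptions, not numbers taken from a heap dump:

```java
public class Packed64Footprint {

    // Estimated length of the long[] backing a Packed64 holding
    // `valueCount` entries at `bitsPerValue` bits each: the values are
    // bit-packed back to back into 64-bit words, rounded up.
    static long backingLongs(long valueCount, int bitsPerValue) {
        return (valueCount * bitsPerValue + 63) / 64;
    }

    public static void main(String[] args) {
        // Assumed sizes for illustration: 1<<30 slots at 48 bits/value.
        long longs = backingLongs(1L << 30, 48);
        long bytes = longs * 8;
        System.out.println(longs + " longs = " + bytes / (1024 * 1024) + " MB");
    }
}
```

At those assumed sizes a single resize needs a contiguous ~6 GB long[] while the previous backing array is still live, so the peak demand is well above what the quoted Eden/heap numbers suggest is free in one contiguous chunk.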



> Default page/block sizes in the FST package can cause OOMs
> ----------------------------------------------------------
>
>                 Key: LUCENE-5875
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5875
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 4.9
>            Reporter: Christian Ziech
>            Priority: Minor
>
> We are building some fairly big FSTs (the biggest one having about 500M terms 
> with an average of 20 characters per term) and that works very well so far.
> The problem is just that we can use neither the "doShareSuffix" nor the 
> "doPackFST" option of the builder, since both cause exceptions: one an OOM 
> and the other an IllegalArgumentException for a negative array size in 
> ArrayUtil.
> The thing is that in theory we still have far more than enough memory 
> available, but it seems that Java for some reason cannot allocate byte or 
> long arrays of the size the NodeHash needs (maybe fragmentation?).
> Reducing the constant in the NodeHash from 1<<30 to e.g. 27 seems to fix the 
> issue mostly. Could e.g. the Builder pass through its bytesPageBits to the 
> NodeHash or could we get a custom parameter for that?
> The other problem we ran into was a NegativeArraySizeException when we tried 
> to pack the FST. It seems that we overflowed to 0x80000000. Unfortunately I 
> accidentally overwrote that exception, but I remember it was triggered by the 
> GrowableWriter for the inCounts in line 728 of the FST. If it helps I can try 
> to reproduce it.
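The overflow to 0x80000000 described above is consistent with a plain int capacity-doubling wrap-around: doubling 1<<30 in 32-bit arithmetic lands exactly on Integer.MIN_VALUE, which then blows up if used as an array length. A minimal stdlib-only reproduction of that arithmetic (a sketch of the failure mode, not the actual Lucene code path):

```java
public class ArrayDoublingOverflow {

    // Doubling an int capacity of 1 << 30 overflows: the result wraps to
    // 0x80000000, i.e. Integer.MIN_VALUE, a negative number.
    static int doubled(int capacity) {
        return capacity * 2;
    }

    public static void main(String[] args) {
        int next = doubled(1 << 30);
        System.out.println("0x" + Integer.toHexString(next)); // 0x80000000

        try {
            // Using the wrapped value as an array length fails immediately.
            long[] arr = new long[next];
        } catch (NegativeArraySizeException e) {
            System.out.println("NegativeArraySizeException for length " + next);
        }
    }
}
```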



--
This message was sent by Atlassian JIRA
(v6.2#6252)
