TermAttributeImpl's buffer will never "shrink" if it grows too big
------------------------------------------------------------------

                 Key: LUCENE-1859
                 URL: https://issues.apache.org/jira/browse/LUCENE-1859
             Project: Lucene - Java
          Issue Type: Bug
          Components: Analysis
    Affects Versions: 2.9
            Reporter: Tim Smith


This was previously an issue with Token as well.

If a TermAttributeImpl is populated with a very long term, its internal buffer 
grows to hold it, and that memory is never reclaimed.

Obviously, it can be argued that Tokenizers should never emit "large" tokens. 
Still, it seems that TermAttributeImpl should have a reasonable static 
"MAX_BUFFER_SIZE" such that if the term buffer grows beyond this size, it 
shrinks back down to it as soon as the next token smaller than 
MAX_BUFFER_SIZE is set.

I don't think I have actually encountered issues with this yet, but with 
multiple indexing threads you could end up holding a char[Integer.MAX_VALUE] 
per thread (in the very worst case scenario).


Perhaps growTermBuffer should shrink the buffer whenever it is currently 
larger than MAX_BUFFER_SIZE and the requested size is below MAX_BUFFER_SIZE.
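A minimal sketch of the proposed behavior, using a simplified stand-in class 
(MAX_BUFFER_SIZE, its value, and the shrink-on-set policy here are assumptions 
for illustration, not existing Lucene API):

```java
// Hedged sketch: a simplified term buffer with a shrink-on-reuse policy.
// ShrinkingTermBuffer and MAX_BUFFER_SIZE are hypothetical names, not Lucene API.
public class ShrinkingTermBuffer {
    // Hypothetical cap; a real patch would need to pick a sensible default.
    public static final int MAX_BUFFER_SIZE = 16 * 1024;

    private char[] termBuffer = new char[16];
    private int termLength;

    /** Copies the term in, shrinking an oversized buffer when the new term fits under the cap. */
    public void setTermBuffer(char[] buffer, int offset, int length) {
        if (termBuffer.length < length) {
            // grow (simplified; Lucene uses oversized growth to amortize copies)
            termBuffer = new char[length];
        } else if (termBuffer.length > MAX_BUFFER_SIZE && length <= MAX_BUFFER_SIZE) {
            // shrink back: reclaim memory held over from a previous very long token
            termBuffer = new char[Math.max(length, 16)];
        }
        System.arraycopy(buffer, offset, termBuffer, 0, length);
        termLength = length;
    }

    public int bufferCapacity() { return termBuffer.length; }
    public String term() { return new String(termBuffer, 0, termLength); }
}
```

With this policy, a single pathological token still allocates a huge buffer, 
but the very next normal-sized token releases it, so long-lived per-thread 
attribute instances no longer pin worst-case memory.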

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org