[ 
https://issues.apache.org/jira/browse/LUCENE-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748122#action_12748122
 ] 

Tim Smith commented on LUCENE-1859:
-----------------------------------

On documentation:
Any warnings/precautions should always be called out, with a pointer to the 
external link (wiki/etc.) for in-depth details.
In-depth descriptions of the details can be pushed off to wiki pages or 
external references, as long as a link is provided for the curious, but I 
would still argue that they should exist.

bq. this doesn't prevent the OOM, it just makes it less likely

All you can ever do for OOM issues is make them less likely (short of just 
fixing a bug that holds onto memory like mad).
If you accept arbitrary content, there will always be a possibility of the 
content forcing OOM issues. In general, everything possible should be done to 
reduce the likelihood of such OOM issues (IMO).

> TermAttributeImpl's buffer will never "shrink" if it grows too big
> ------------------------------------------------------------------
>
>                 Key: LUCENE-1859
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1859
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>            Priority: Minor
>
> This was also an issue with Token previously as well.
> If a TermAttributeImpl is populated with a very long buffer, it will never be 
> able to reclaim this memory.
> Obviously, it can be argued that Tokenizers should never emit "large" 
> tokens; however, it seems that TermAttributeImpl should have a reasonable 
> static "MAX_BUFFER_SIZE" such that if the term buffer grows bigger than this, 
> it will shrink back down to this size once the next token smaller than 
> MAX_BUFFER_SIZE is set.
> I don't think I have actually encountered issues with this yet, but it 
> seems like if you have multiple indexing threads, you could end up with a 
> char[Integer.MAX_VALUE] per thread (in the very worst case scenario).
> Perhaps growTermBuffer should have the logic to shrink if the buffer is 
> currently larger than MAX_BUFFER_SIZE and it needs less than MAX_BUFFER_SIZE.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
