[ 
https://issues.apache.org/jira/browse/LUCENE-6814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906601#comment-14906601
 ] 

Uwe Schindler edited comment on LUCENE-6814 at 9/24/15 4:33 PM:
----------------------------------------------------------------

The TokenStream consumer workflow is clearly defined:
https://lucene.apache.org/core/5_3_1/core/org/apache/lucene/analysis/TokenStream.html

The last step is close(). There is nothing confusing, just RTFM.

close() is defined as "Releases resources associated with this stream.", and 
that's what we do here. end() has a different meaning: its sole purpose is to 
expose the "token" information after the very last token (for example, the 
final offset). At that point all resources must still be held, because the 
attributes must remain accessible after the call. Only after close() are the 
resources freed. In practice this makes no difference for this tokenizer, but 
in general attributes such as the term attribute must still be in a valid 
state after end().
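
To make the end() versus close() distinction concrete, here is a minimal sketch in plain Java. It is not Lucene code: ToyPatternTokenizer and every method on it are hypothetical stand-ins that only mimic the documented consumer workflow (reset, incrementToken, end, close), with a reused StringBuilder whose heap is released in close() via trimToSize(), as the issue description suggests. It assumes the delimiter pattern never matches the empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in written for illustration; NOT Lucene's PatternTokenizer.
final class ToyPatternTokenizer implements AutoCloseable {
    private final Pattern delimiter;  // assumed never to match the empty string
    private final StringBuilder buffer = new StringBuilder();  // reused across inputs
    private Matcher matcher;
    private int pos;          // start of the next candidate token in buffer
    private String term;      // current token, valid after incrementToken() returns true
    private int finalOffset;  // set by end()

    ToyPatternTokenizer(String delimiterRegex) {
        delimiter = Pattern.compile(delimiterRegex);
    }

    // Loads new input into the reused buffer.
    void setInput(CharSequence input) {
        buffer.setLength(0);
        buffer.append(input);
    }

    // Step 1: prepare the stream for consumption.
    void reset() {
        matcher = delimiter.matcher(buffer);
        pos = 0;
        term = null;
    }

    // Step 2: advance to the next non-empty token; returns false when exhausted.
    boolean incrementToken() {
        while (pos < buffer.length()) {
            boolean found = matcher.find(pos);
            int tokenEnd = found ? matcher.start() : buffer.length();
            int next = found ? matcher.end() : buffer.length();
            int tokenStart = pos;
            pos = next;
            if (tokenEnd > tokenStart) {
                term = buffer.substring(tokenStart, tokenEnd);
                return true;
            }
        }
        return false;
    }

    // Step 3: record end-of-stream state. The buffer must still be held here.
    void end() {
        finalOffset = buffer.length();
    }

    // Step 4: release resources. Only now is the buffer's heap given back.
    @Override
    public void close() {
        matcher = null;
        buffer.setLength(0);
        buffer.trimToSize();
    }

    String term() { return term; }
    int finalOffset() { return finalOffset; }
    int bufferCapacity() { return buffer.capacity(); }

    // Consumer side: the full workflow, with close() guaranteed by finally.
    static List<String> consume(ToyPatternTokenizer ts, CharSequence input) {
        List<String> terms = new ArrayList<>();
        try {
            ts.setInput(input);
            ts.reset();
            while (ts.incrementToken()) {
                terms.add(ts.term());
            }
            ts.end();
        } finally {
            ts.close();
        }
        return terms;
    }
}
```

If the buffer were trimmed in end() instead, the final offset in this sketch could no longer be computed from it, which is the reason the cleanup belongs in close().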

> PatternTokenizer should free heap after it's done
> -------------------------------------------------
>
>                 Key: LUCENE-6814
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6814
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.4
>
>         Attachments: LUCENE-6814.patch, LUCENE-6814.patch
>
>
> Caught by Alex Chow in this Elasticsearch issue: 
> https://github.com/elastic/elasticsearch/issues/13721
> Today, PatternTokenizer reuses a single StringBuilder, but it doesn't free 
> its heap usage after tokenizing is done.  We can either stop reusing, or ask 
> it to {{.trimToSize}} when we are done ...
