[ 
https://issues.apache.org/jira/browse/LUCENE-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430554#comment-13430554
 ] 

Benjamin Douglas commented on LUCENE-2145:
------------------------------------------

This looks to be intentional. Calling close() on the token stream is designed 
to release the Reader, which should happen as soon as you know you are done 
with it. LUCENE-2387 explains the negative side-effects of holding onto Readers 
too long. Calling analyzer.reusableTokenStream() the next time will provide a 
new Reader. 

If the external resource is tied to the Reader, then it should also be released 
when TokenStream.close() is called. Only data that is independent of the 
current text should survive to the next reusableTokenStream() call.
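
To illustrate the split between per-text and text-independent state, here is a 
minimal sketch in plain Java. It deliberately does not use Lucene's classes: 
ReusableStreamSketch, its reset(Reader) method, the delimiter field, and the 
closeCalls counter are all hypothetical names invented for this example, and 
reset(Reader) only stands in for what the next reusableTokenStream() call 
would do. The point is that close() drops only what is tied to the current 
Reader, tolerates being called more than once, and leaves the instance ready 
for a new Reader.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a reusable Tokenizer; this is NOT Lucene's API.
final class ReusableStreamSketch implements Closeable {
    // Text-independent state: set once, survives close().
    private final String delimiter;
    // Text-dependent state: tied to the current Reader, released in close().
    private Reader input;
    // Counts close() calls, only to show that repeated calls are harmless.
    int closeCalls = 0;

    ReusableStreamSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    // Assumed analogue of the next reusableTokenStream() call:
    // the same instance is handed a new Reader.
    void reset(Reader newInput) {
        this.input = newInput;
    }

    boolean isOpen() {
        return input != null;
    }

    // Reads the whole Reader and splits it on the delimiter.
    List<String> readAllTokens() throws IOException {
        if (input == null) {
            throw new IllegalStateException("no Reader; call reset(Reader) first");
        }
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            text.append((char) c);
        }
        List<String> tokens = new ArrayList<>();
        for (String t : text.toString().split(Pattern.quote(delimiter))) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Releases only the Reader; a second call is a no-op apart from the counter.
    @Override
    public void close() throws IOException {
        closeCalls++;
        if (input != null) {
            input.close();
            input = null;
        }
    }

    public static void main(String[] args) throws IOException {
        ReusableStreamSketch stream = new ReusableStreamSketch(" ");
        for (String text : new String[] {"first text", "second text"}) {
            stream.reset(new StringReader(text));
            try {
                System.out.println(stream.readAllTokens());
            } finally {
                stream.close(); // release the Reader as soon as we are done
            }
        }
    }
}
```

A resource that is expensive to acquire and does not depend on the text would 
sit next to delimiter in this sketch, and would need a separate release path 
outside close().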
                
> TokenStream.close() is called multiple times per TokenStream instance
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-2145
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2145
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index, core/queryparser
>    Affects Versions: 2.9, 2.9.1, 3.0
>         Environment: Solr 1.4.0
>            Reporter: KuroSaka TeruHiko
>
> I have a Tokenizer that uses an external resource.  I wrote this Tokenizer so 
> that the external resource is released in its close() method.
> This should work because close() is supposed to be called when the caller is 
> done with the TokenStream of which Tokenizer is a subclass.  TokenStream's 
> API document 
> <http://lucene.apache.org/java/2_9_1/api/core/org/apache/lucene/analysis/TokenStream.html>
>  states:
> {noformat}
> 6. The consumer calls close() to release any resource when finished using the 
> TokenStream. 
> {noformat}
> When I used my Tokenizer from Solr 1.4.0, it did not work as expected.  An 
> error analysis suggests an instance of my Tokenizer is used even after 
> close() is called and the external resource is released. After a further 
> analysis it seems that it is not Solr but Lucene itself that is breaking the 
> contract.
> This is happening in two places.
> src/java/org/apache/lucene/queryParser/QueryParser.java:
>   protected Query getFieldQuery(String field, String queryText) throws ParseException {
>     // Use the analyzer to get all the tokens, and then build a TermQuery,
>     // PhraseQuery, or nothing based on the term count
>     TokenStream source;
>     try {
>       source = analyzer.reusableTokenStream(field, new StringReader(queryText));
>       source.reset();
> .
> .
> .
>      try {
>       // rewind the buffer stream
>       buffer.reset();
>       // close original stream - all tokens buffered
>       source.close(); // <---- HERE
>     }
> src/java/org/apache/lucene/index/DocInverterPerField.java:
> public void processFields(final Fieldable[] fields,
>                             final int count) throws IOException {
> ...
>           } finally {
>             stream.close();
>           }
> Calling close() would be fine if the TokenStream were not a reusable one. But 
> when it is reusable, it might be used again, so the resource associated with 
> the TokenStream instance should not be released.  close() should be called 
> only when the caller knows the stream is not going to be reused. 

