[ https://issues.apache.org/jira/browse/LUCENE-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854908#action_12854908 ]
Uwe Schindler commented on LUCENE-2384:
---------------------------------------

{quote}
patch to reset the zzBuffer when the input is reset. The code is really taken from https://sourceforge.net/mailarchive/message.php?msg_id=444070.38422...@web38901.mail.mud.yahoo.com so I can't really grant a license to use it, but I think the guy released it as public domain by posting it to the mailing list. I tested it and it seems to work for me. Just including it here in case somebody wants to apply the patch directly to 3.0.1 (although it's better to wait for 3.1).
{quote}

Your fix adds additional complexity. On reset, just shrink the buffer back to the default ZZ_BUFFERSIZE if it has grown. Your patch always reallocates a new buffer. Use this:

{code}
public final void reset(Reader r) {
  // reset to default buffer size, if buffer has grown
  if (zzBuffer.length > ZZ_BUFFERSIZE) {
    zzBuffer = new char[ZZ_BUFFERSIZE];
  }
  yyreset(r);
}
{code}

> Reset zzBuffer in StandardTokenizerImpl* when lexer is reset.
> -------------------------------------------------------------
>
>                 Key: LUCENE-2384
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2384
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: Analysis
>    Affects Versions: 3.0.1
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: reset.diff
>
>
> When indexing large documents, the lexer buffer may stay large forever. This
> sub-issue resets the lexer buffer back to the default on reset(Reader).
> This is done on the enclosing issue.
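
The effect of the suggested reset(Reader) can be tried outside Lucene with a small stand-in class. The following is only a sketch: GrowableScannerSketch, fill() and bufferCapacity() are hypothetical names invented for illustration, the doubling growth in fill() is a simplified model of how a JFlex-generated scanner enlarges zzBuffer, and the value 16384 for ZZ_BUFFERSIZE is an assumption rather than something stated in this issue. Only the body of reset(Reader) follows the snippet from the comment.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical stand-in for a generated scanner; not Lucene's StandardTokenizerImpl. */
final class GrowableScannerSketch {
    /** Assumed default buffer size. */
    static final int ZZ_BUFFERSIZE = 16384;

    private char[] zzBuffer = new char[ZZ_BUFFERSIZE];
    private Reader zzReader;
    private int zzEndRead; // number of valid chars currently in zzBuffer

    GrowableScannerSketch(Reader r) {
        this.zzReader = r;
    }

    /** Simplified growth model: read the whole input, doubling the buffer when it is full. */
    void fill() {
        try {
            while (true) {
                if (zzEndRead == zzBuffer.length) {
                    zzBuffer = Arrays.copyOf(zzBuffer, zzBuffer.length * 2);
                }
                int n = zzReader.read(zzBuffer, zzEndRead, zzBuffer.length - zzEndRead);
                if (n < 0) {
                    break; // end of input
                }
                zzEndRead += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Mirrors the suggestion: allocate a new buffer only if the old one has grown. */
    void reset(Reader r) {
        if (zzBuffer.length > ZZ_BUFFERSIZE) {
            zzBuffer = new char[ZZ_BUFFERSIZE];
        }
        yyreset(r);
    }

    /** Minimal stand-in for the generated yyreset(Reader): swap the input, clear the position. */
    private void yyreset(Reader r) {
        zzReader = r;
        zzEndRead = 0;
    }

    int bufferCapacity() {
        return zzBuffer.length;
    }
}
```

With this shape, a scanner that has only seen small inputs keeps its existing array across resets, while one that grew for a large document drops back to the default size on the next reset.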