[ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791940#action_12791940 ]
Uwe Schindler commented on SOLR-1662: ------------------------------------- +1 Looks good! > BufferedTokenStream incorrect cloning > ------------------------------------- > > Key: SOLR-1662 > URL: https://issues.apache.org/jira/browse/SOLR-1662 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis > Affects Versions: 1.4 > Reporter: Robert Muir > Assignee: Shalin Shekhar Mangar > Attachments: SOLR-1662.patch > > > As part of writing tests for SOLR-1657, I rewrote one of the base classes > (BaseTokenTestCase) to use the new TokenStream API, but also with some > additional safety. > {code} > public static String tsToString(TokenStream in) throws IOException { > StringBuilder out = new StringBuilder(); > TermAttribute termAtt = (TermAttribute) > in.addAttribute(TermAttribute.class); > // extra safety to enforce, that the state is not preserved and also > // assign bogus values > in.clearAttributes(); > termAtt.setTermBuffer("bogusTerm"); > while (in.incrementToken()) { > if (out.length() > 0) > out.append(' '); > out.append(termAtt.term()); > in.clearAttributes(); > termAtt.setTermBuffer("bogusTerm"); > } > in.close(); > return out.toString(); > } > {code} > Setting the term text to bogus values helps find bugs in tokenstreams that do > not clear or clone properly. In this case there is a problem with a > tokenstream AB_AAB_Stream in TestBufferedTokenStream, it converts A B -> A A > B but does not clone, so the values get overwritten. > This can be fixed in two ways: > * BufferedTokenStream does the cloning > * subclasses are responsible for the cloning > The question is which one should it be? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.