Uwe, look at the patch I pasted in haste (I have a delivery guy here, sorry).
The filter had a bug all along (it was using termBuffer.length for some length calculations).

On Thu, Aug 6, 2009 at 11:17 AM, Uwe Schindler<u...@thetaphi.de> wrote:
> I looked into the code of this Filter. It is very simple and should work out
> of the box. There is no cloning done. When the indexer calls incrementToken,
> the delegation to next(Token) does not clone at all. It just uses the
> encapsulated Token instance (inside the AttributeImpl TokenWrapper) as
> reusableToken and calls next(reusable) and then replaces the encapsulated
> instance by the return value of next() -- so no cloning. As you do not
> change the token instance at all and return the reusable token it is all
> done on one Token/Attribute instance.
>
> In my opinion, this is the simplest TokenFilter that could occur, it just
> changes the contents of the buffer. By the way, this one could be easily
> rewritten to use incrementToken() without cloning, just use
> termAtt.setTermBuffer() and so on.
>
> Where do you see a problem, does it simply not work or do you think there
> could be an issue?
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -----Original Message-----
>> From: Mark Miller [mailto:markrmil...@gmail.com]
>> Sent: Thursday, August 06, 2009 4:14 PM
>> To: java-dev@lucene.apache.org
>> Subject: Issue with Solr TokenFilter and the new TokenStream API
>>
>> I think there is an issue here, but I didn't follow the TokenStream
>> improvements very closely.
>>
>> In Solr, CapitalizationFilterFactory has a CharArray set that it loads
>> up with keep words - it then checks (with the old TokenStream API) each
>> token (char array) to see if it should keep it. I think because of the
>> cloning going on in next, this breaks and you can't match anything in
>> the keep set. Does that make sense?
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com

--
Robert Muir
rcm...@gmail.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
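
For anyone reading the thread later, here is a minimal sketch of the kind of bug mentioned above: using the capacity of a reusable term buffer where the length of the current term was meant. It is plain Java and not the actual Solr or Lucene code; the Term class, keepBuggy, and keepFixed are hypothetical names made up for illustration, and a HashSet of strings stands in for the keep set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TermLengthSketch {

    // Hypothetical stand-in for a reusable token: the buffer is allocated
    // once and reused, so its capacity is usually larger than the term.
    static final class Term {
        char[] buffer = new char[16];
        int length; // number of valid chars in buffer

        void set(String s) {
            if (s.length() > buffer.length) {
                buffer = new char[s.length()];
            }
            s.getChars(0, s.length(), buffer, 0);
            length = s.length();
        }
    }

    // Buggy lookup: uses buffer.length (capacity), so the unused tail of
    // the buffer becomes part of the key and nothing in the set matches.
    static boolean keepBuggy(Set<String> keep, Term t) {
        return keep.contains(new String(t.buffer, 0, t.buffer.length));
    }

    // Fixed lookup: uses only the valid term length.
    static boolean keepFixed(Set<String> keep, Term t) {
        return keep.contains(new String(t.buffer, 0, t.length));
    }

    public static void main(String[] args) {
        Set<String> keep = new HashSet<>(Arrays.asList("and", "the"));
        Term t = new Term();
        t.set("the");
        System.out.println(keepBuggy(keep, t)); // prints false
        System.out.println(keepFixed(keep, t)); // prints true
    }
}
```

The symptom is the one described in the original report: no token ever matches the keep set, although no cloning is involved.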