Regardless of that, I really do not understand the call to initTermBuffer()
in termLength(). What is it good for? The method returns the same value,
zero, in both cases, so I see no harm in removing the call.

  /** Return number of valid characters (length of the term)
   *  in the termBuffer array. */
  public int termLength() {
    initTermBuffer();
    return termLength;
  }

----- Original Message ----
> From: Uwe Schindler <u...@thetaphi.de>
> To: java-dev@lucene.apache.org
> Sent: Sunday, 26 April, 2009 23:03:06
> Subject: RE: new TokenStream api Question
>
> There is one problem: if you extend TermAttribute, the class is different
> (which is the key in the attributes list). So when you initialize the
> TokenStream and do a
>
>   YourClass termAtt = (YourClass) addAttribute(YourClass.class)
>
> ...you create a new attribute. So one possibility would be to also specify
> the instance and save the attribute by class (as key), but with your
> instance. If you are the first one that creates the attribute (if it is a
> token stream and not a filter it is ok, you will be the first, adding the
> attribute in the ctor), everything is ok. Register the attribute by yourself
> (maybe we should add a specialized addAttribute that can specify an instance
> as default)?:
>
>   YourClass termAtt = new YourClass();
>   attributes.put(TermAttribute.class, termAtt);
>
> In this case, for the indexer it is a standard TermAttribute, but you can
> do more with it.
>
> Replacing TermAttribute by an own class is not possible, as the indexer will
> get a ClassCastException when using the instance retrieved with
> getAttribute(TermAttribute.class).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: eks dev [mailto:eks...@yahoo.co.uk]
> > Sent: Sunday, April 26, 2009 10:39 PM
> > To: java-dev@lucene.apache.org
> > Subject: new TokenStream api Question
> >
> > I am just looking into new TermAttribute usage and wonder what would be
> > the best way to implement PrefixFilter that would filter out some Terms
> > that have some prefix,
> >
> > something like this, where '-' represents my prefix:
> >
> >   public final boolean incrementToken() throws IOException {
> >     // the first word we found
> >     while (input.incrementToken()) {
> >       int len = termAtt.termLength();
> >
> >       if(len > 0 && termAtt.termBuffer()[0]!='-') //only length > 0 and non LFs
> >         return true;
> >       // note: else we ignore it
> >     }
> >     // reached EOS
> >     return false;
> >   }
> >
> > The question would be:
> >
> > can I extend TermAttribute and add boolean startsWith(char c);
> >
> > The point is speed and my code gets smaller.
> > TermAttribute has one method called in termLength() and termBuffer() I do
> > not understand (back compatibility, I guess)
> >
> >   public int termLength() {
> >     initTermBuffer(); // I'd like to avoid it...
> >     return termLength;
> >   }
> >
> > I'd like to get rid of initTermBuffer(), the first option is to *extend*
> > TermAttribute code (but fields are private, so no help there) or can I
> > implement my own MyTermAttribute (will Indexer know how to deal with it?)
> >
> > Must I extend TermAttribute or I can add my own?
> >
> > thanks,
> > eks
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-dev-h...@lucene.apache.org
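
A note on Uwe's point that the attribute list is keyed by class: below is a
minimal, self-contained Java sketch of that behaviour. It does not use Lucene.
AttrSource, TermAttr, PrefixTermAttr and putAttribute are hypothetical
stand-ins invented for this illustration (the mail itself only shows
attributes.put(TermAttribute.class, termAtt)), and the real classes may differ
in detail. It assumes a plain map from class to instance, which is what the
mail describes.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point. Every class in this file is a hypothetical stand-in,
// not part of Lucene.
final class AttributeKeyDemo {
    public static void main(String[] args) {
        // Case 1: asking for the subclass creates a second, separate attribute.
        AttrSource a = new AttrSource();
        TermAttr base = a.addAttribute(TermAttr.class);
        PrefixTermAttr mine = a.addAttribute(PrefixTermAttr.class);
        System.out.println(base == mine);            // false

        // Case 2: register our own instance under the base-class key first.
        AttrSource b = new AttrSource();
        PrefixTermAttr shared = new PrefixTermAttr();
        b.putAttribute(TermAttr.class, shared);
        TermAttr seenByConsumer = b.addAttribute(TermAttr.class);
        seenByConsumer.setTerm("-foo");
        System.out.println(seenByConsumer == shared); // true
        System.out.println(shared.startsWith('-'));   // true
    }
}

// Stand-in for a term attribute: a char buffer plus a valid length.
class TermAttr {
    private char[] buffer = new char[16];
    private int length;

    void setTerm(String s) {
        if (s.length() > buffer.length) {
            buffer = new char[s.length()];
        }
        s.getChars(0, s.length(), buffer, 0);
        length = s.length();
    }

    char[] termBuffer() { return buffer; }
    int termLength() { return length; }
}

// Subclass adding the convenience method asked about in the thread.
class PrefixTermAttr extends TermAttr {
    boolean startsWith(char c) {
        return termLength() > 0 && termBuffer()[0] == c;
    }
}

// Stand-in attribute registry: one instance per class key.
class AttrSource {
    private final Map<Class<?>, Object> attributes = new HashMap<>();

    // Returns the instance stored under key, creating one if absent.
    <T> T addAttribute(Class<T> key) {
        Object att = attributes.get(key);
        if (att == null) {
            try {
                att = key.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("cannot instantiate " + key, e);
            }
            attributes.put(key, att);
        }
        return key.cast(att);
    }

    // Hypothetical helper: store a caller-supplied instance under key.
    <T> void putAttribute(Class<T> key, T instance) {
        attributes.put(key, instance);
    }
}
```

In the second case a consumer that only knows the base class gets the same
object back, so a filter holding the subclass reference can call
startsWith() on the very term the consumer reads. This only works if the
instance is registered before anyone else adds the base-class attribute,
which is the ordering caveat Uwe mentions.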