--- ilma barbosa <[EMAIL PROTECTED]> wrote:
> --- Erik Hatcher <[EMAIL PROTECTED]> wrote:
> > On Monday, September 29, 2003, at 09:19 AM, Hackl, Rene wrote:
> > > I'm looking for a way to implement simultaneous left and right
> > > truncation.
> > >
> > > The goal is to enable the user to search for e.g. "*hydronaphth*"
> > > and find "hexahydronaphthalene" as well as "heptahydronaphthalin".
> >
> > WildcardQuery, if used directly rather than through QueryParser,
> > allows left and right wildcards like this. Performance is the
> > biggest concern, though, as it will have to enumerate all terms in
> > the index to look for matches.
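
To make the cost concrete, here is a minimal sketch of what "enumerate
all terms" means. It is not Lucene code: the SortedSet is only a
stand-in for the index's term dictionary, and the class and method
names (TermScanSketch, scanForSubstring) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class TermScanSketch {
    // Assumption: 'terms' stands in for the sorted term dictionary of an index.
    // With a wildcard on both sides there is no prefix to seek to, so every
    // term has to be visited and tested.
    static List<String> scanForSubstring(SortedSet<String> terms, String infix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.contains(infix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "hexahydronaphthalene", "heptahydronaphthalin",
                "naphthalene", "hydrogen"));
        // prints [heptahydronaphthalin, hexahydronaphthalene]
        System.out.println(scanForSubstring(terms, "hydronaphth"));
    }
}
```

The loop is linear in the number of distinct terms, which is why this
gets slow on a large index.
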
> >
> > > To achieve that functionality, I'd like to index terms so that
> > > from a token "foobar" the tokens "oobar" and "obar" (e.g. minimum
> > > word length = 4) would be derived and added to the index. I tried
> > > to extend TokenFilter, but all I get is either "oobar" or "obar",
> > > depending on when 'return' is called.
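
The derivation itself can be sketched in plain Java, independent of
any analyzer. The names SuffixSketch and suffixes are hypothetical,
and minLength = 4 is taken from the example above.

```java
import java.util.ArrayList;
import java.util.List;

class SuffixSketch {
    // Returns every proper suffix of 'term' that is at least 'minLength'
    // characters long, longest first. The full term is skipped because the
    // tokenizer already emits it.
    static List<String> suffixes(String term, int minLength) {
        List<String> out = new ArrayList<>();
        for (int start = 1; term.length() - start >= minLength; start++) {
            out.add(term.substring(start));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(suffixes("foobar", 4)); // prints [oobar, obar]
    }
}
```

With these suffixes in the index, a search for "*hydronaphth*" could
be run as the prefix search "hydronaphth*", since "hydronaphthalene"
is one of the indexed suffixes of "hexahydronaphthalene".
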
> >
> > This is an interesting approach and would certainly give you the
> > search results you are looking for, except that you'll be indexing
> > a ton of terms, I'd guess. If there is some other way to split
> > these words, separating out the prefix ("hexa", "hepta") and suffix
> > ("alene", "alin"), it would likely be better. But maybe it's not
> > practical to do so.
> >
> > But, to the main question....
> >
> > > How could I add such extra tokens to the tokenStream? Any
> > > thoughts on this appreciated.
> >
> > Yes, this can be done. Craig Walls did this with an example
> > Analyzer in his December '02 Java Developer's Journal article. I
> > adapted his example to demonstrate this in presentations. Here is
> > my code:
> >
> > import java.io.IOException;
> > import java.util.Enumeration;
> > import java.util.HashMap;
> > import java.util.ResourceBundle;
> > import java.util.Stack;
> > import java.util.StringTokenizer;
> >
> > import org.apache.lucene.analysis.Token;
> > import org.apache.lucene.analysis.TokenFilter;
> > import org.apache.lucene.analysis.TokenStream;
> >
> > public class AliasFilter extends TokenFilter {
> >   // Maps a term to a space-separated list of aliases.
> >   private static HashMap aliasMap = new HashMap();
> >
> >   static {
> >     aliasMap.put("country", "yeehaw");
> >     aliasMap.put("western", "rawhide");
> >   }
> >
> >   // Aliases of the most recent token that have not been returned yet.
> >   private Stack currentTokenAliases;
> >
> >   public AliasFilter(TokenStream in) {
> >     currentTokenAliases = new Stack();
> >     input = in;
> >   }
> >
> >   public Token next() throws IOException {
> >     // Drain pending aliases before reading from the wrapped stream.
> >     if (currentTokenAliases.size() > 0) {
> >       return (Token) currentTokenAliases.pop();
> >     }
> >
> >     Token nextToken = input.next();
> >     addAliasesToStack(nextToken, currentTokenAliases);
> >
> >     return nextToken;
> >   }
> >
> >   private void addAliasesToStack(Token token, Stack aliasStack) {
> >     if (token == null) return;
> >
> >     String aliasString = (String) aliasMap.get(token.termText());
> >
> >     if (aliasString == null || aliasString.length() < 1) return;
> >
> >     StringTokenizer tokenizer = new StringTokenizer(aliasString, " ");
> >     while (tokenizer.hasMoreTokens()) {
> >       String nextAlias = tokenizer.nextToken();
> >       Token nextTokenAlias = new Token(nextAlias, 0, nextAlias.length());
> >       aliasStack.push(nextTokenAlias);
> >     }
> >   }
> > }
> >
> > Note that this is a braindead example of injecting tokens into the
> > stream: when "country" or "western" is encountered, another token
> > is added to the stream. In your case you'll do something with the
> > initial token and push all its 4+ length pieces onto the stack.
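
Applied to the suffix case, the same buffering idea might look roughly
like the sketch below. It deliberately avoids the Lucene classes: a
plain Iterator<String> stands in for the TokenStream, the class name
SuffixInjectingIterator is made up, and a FIFO queue is used instead
of a Stack so the pieces come out longest first. Treat it as an
illustration of the pattern, not as a drop-in TokenFilter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class SuffixInjectingIterator implements Iterator<String> {
    private final Iterator<String> input;   // stand-in for the wrapped token stream
    private final int minLength;            // shortest suffix worth emitting
    private final Deque<String> pending = new ArrayDeque<>();

    SuffixInjectingIterator(Iterator<String> input, int minLength) {
        this.input = input;
        this.minLength = minLength;
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty() || input.hasNext();
    }

    @Override
    public String next() {
        // Buffered suffixes of the previous token go out first.
        if (!pending.isEmpty()) {
            return pending.removeFirst();
        }
        // Read one token, queue all of its suffixes, then return the token
        // itself. Queuing before returning is what keeps the later pieces
        // from being lost.
        String token = input.next();
        for (int start = 1; token.length() - start >= minLength; start++) {
            pending.addLast(token.substring(start));
        }
        return token;
    }

    public static void main(String[] args) {
        Iterator<String> it = new SuffixInjectingIterator(
                Arrays.asList("foobar", "baz").iterator(), 4);
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        System.out.println(out); // prints [foobar, oobar, obar, baz]
    }
}
```

In a real filter the queued items would be Token objects, as in the
AliasFilter above, and next() would return null at end of stream.
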
> >
> > Erik
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]