To anyone curious about the solution I arrived at ...

=====================================================
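  // Runs the given text through the analyzer and returns the resulting terms.
  public static string[] AnalyzeTerms(Lucene.Net.Analysis.Analyzer analyzer, string terms)
  {
      System.Collections.Generic.List<string> returnObj =
          new System.Collections.Generic.List<string>();
      Lucene.Net.Analysis.Token t = null;
      // Note: some Lucene.Net versions expect a field name as the first
      // argument, i.e. analyzer.TokenStream(fieldName, reader).
      Lucene.Net.Analysis.TokenStream ts =
          analyzer.TokenStream(new System.IO.StringReader(terms));
      // Next() returns null once the stream is exhausted.
      while ((t = ts.Next()) != null)
      {
          returnObj.Add(t.TermText());
      }
      return returnObj.ToArray();
  }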
=====================================================
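
In case it helps, here is a rough sketch of how the analyzed terms could then be turned into a query through the API. QueryHelpers and BuildPhraseQuery are hypothetical names of my own, not part of Lucene.Net, and the sketch assumes the terms should match as a phrase in their original order, which is roughly what the QueryParser produced for the quoted string.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class QueryHelpers
{
    // Hypothetical helper (not part of Lucene.Net): builds a query from
    // terms that have already been run through the analyzer.
    public static Query BuildPhraseQuery(string field, string[] analyzedTerms)
    {
        // A single surviving term only needs a plain TermQuery.
        if (analyzedTerms.Length == 1)
        {
            return new TermQuery(new Term(field, analyzedTerms[0]));
        }

        // Otherwise add each term to a PhraseQuery at successive positions.
        // Assumption: the index was built without position gaps for the
        // removed stop words; if it was not, the positions will not line up.
        PhraseQuery phrase = new PhraseQuery();
        foreach (string term in analyzedTerms)
        {
            phrase.Add(new Term(field, term));
        }
        return phrase;
    }
}
```

With that in place, something like BuildPhraseQuery("Title", AnalyzeTerms(analyzer, "cat-walk")) should produce a phrase query on "cat" and "walk", provided the analyzer splits the hyphenated word the way it did at index time.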

It's not completely obvious how to do that kind of thing "out-of-the-box", but
eventually I got there.  :-)

Andy


On 2/15/07, Jokin Cuadrado <[EMAIL PROTECTED]> wrote:

Why don't you use the StandardAnalyzer?

As far as I know, you can create an instance of the analyzer and get
the tokens from the query text.

Look at how it is done in the QueryParser and reproduce it.

--
Jokin.

On 2/14/07, Andy Berryman <[EMAIL PROTECTED]> wrote:
> I noticed that my message appeared in raw text with some "*" in it.  That
> was due to the rich text editor I was using to type the email.  Please
> ignore those.
>
> Andy
>
>
> On 2/14/07, Andy Berryman <[EMAIL PROTECTED]> wrote:
> >
> > In my application, I was previously building queries as a string, and I'm
> > now having to convert over to the API because of the need to use the
> > WildcardQuery.  I'm running into a few searching issues, and they all seem
> > to center around the fact that the field is of *TEXT* type, which means it
> > is analyzed when indexed.
> >
> > Assume that my field name is *Title* and it is of *TEXT* type.  Also
> > assume that I am using the StandardAnalyzer.
> >
> > I have a document stored in the index that had the original text of "I was
> > on the cat-walk".  During the index process, I know that the stop words
> > are removed and that certain characters are stripped.  So basically, the
> > end result was that the terms ... "i", "cat", and "walk" ... were stored
> > in the index.
> >
> > My previous code was doing the simplest case to get the Query by just
> > building ... *Title:"I was on the cat-walk"* ... and passing that into the
> > Parse method.  Since the analyzer is part of that method call, it was doing
> > all of the necessary stripping within the query for me, and thus the search
> > was working just fine.  It was returning the Query ... *Title:"i cat walk"*.
> >
> > With the new code, I'm now building the query like this ... TermQuery tq =
> > new TermQuery(new Term("Title", "I was on the cat-walk")) ... and this is
> > NOT working.  The reason is that no analysis is being done on the string
> > being searched.  I can certainly write a simple loop to strip the stop
> > words, but I don't really know what to do about the special characters.
> >
> > The main problem I'm looking into is that my end-users are unable to
> > search for just "cat-walk" and get results.  But if they search for "cat
> > walk", they get the result you would expect.
> >
> > Hopefully someone out there has tackled this issue before and can show me
> > an example of how to do this without having to re-invent the wheel.
> >
> > Thanks
> > Andy
> >
>
