Re: Question about StandardAnalyzer.cs

Floyd Wu Wed, 04 Mar 2009 21:39:58 -0800

Hi Michael,
I'm sure that I use StandardAnalyzer when indexing. The problem is that I need
to get search results when I query "Z123456" against my index field named
"author_id", and Luke-0.8.1 currently shows this field's value in the index
as "Z123456".


I've been stuck on this for a month. Please help.
Thanks



2009/3/5 Michael Mitiaguin <mitiag...@gmail.com>

> As mentioned in this thread, could you re-check that you explicitly use
> StandardAnalyzer when indexing?
> I must admit, though, that I am still using 2.0.4.
>
>  writer = new IndexWriter(indexdir, new StandardAnalyzer(), true);
>
> In Luke, if you select Plugins > Analyzer Tool > StandardAnalyzer,
> it also lowercases the text:
> Original text : Z123456  tokens found : z123456
>
> On Thu, Mar 5, 2009 at 2:45 PM, Floyd Wu <floyd...@gmail.com> wrote:
>
> > I'm sure the application and Luke use the same analyzer, StandardAnalyzer.
> > But I can't search for "Z123456" and I don't know why. As long as I
> > comment out this line in StandardAnalyzer.cs:
> > result = new LowerCaseFilter(result);
> > the result is what I want.
> >
> >
> >
> > 2009/3/4 Jokin Cuadrado <joki...@gmail.com>
> >
> > > Using Luke you can use other analyzers as well, so try the keyword
> > > analyzer, for example. But as regards your application, you must use
> > > the same analyzer when you build your index and when you query it.
> > >
> > > On Wed, Mar 4, 2009 at 10:50 AM, Floyd Wu <floyd...@gmail.com> wrote:
> > >
> > > > But the current situation is: I can't get any results for "Z123456"
> > > > when I type "Z123456" or "z123456".
> > > >
> > > > I'm using StandardAnalyzer and, according to Luke, the indexed value
> > > > is "Z123456".
> > > > How can I fix this problem?
> > > >
> > > >
> > > >
> > > > 2009/3/4 Jokin Cuadrado <joki...@gmail.com>
> > > >
> > > > > The rationale behind using the lowercase filter is that it will
> > > > > match when you search for either Z123456 or z123456, so searches
> > > > > are case insensitive. However, as with any filter, you must use the
> > > > > same analyzer when indexing your documents. Are you doing that?
> > > > >
> > > > > > On Wed, Mar 4, 2009 at 9:31 AM, Floyd Wu <floyd...@gmail.com> wrote:
> > > > >
> > > > > > Hi all,
> > > > > > > My problem is that I have a field that is set to be Indexed &
> > > > > > > Stored. The indexed value is Z123456.
> > > > > > > But when I use StandardAnalyzer to search this field, it seems
> > > > > > > that StandardAnalyzer transforms my query text "Z123456" to
> > > > > > > "z123456". After walking through the source code, I found the
> > > > > > > following lines:
> > > > > > >  public override TokenStream TokenStream(System.String fieldName, System.IO.TextReader reader)
> > > > > > >  {
> > > > > > >   StandardTokenizer tokenStream = new StandardTokenizer(reader, replaceInvalidAcronym);
> > > > > >   tokenStream.SetMaxTokenLength(maxTokenLength);
> > > > > >   TokenStream result = new StandardFilter(tokenStream);
> > > > > >   result = new LowerCaseFilter(result);
> > > > > >   result = new StopFilter(result, stopSet);
> > > > > >   return result;
> > > > > >  }
> > > > > >
> > > > > > > Why is LowerCaseFilter used here? If I comment out this line,
> > > > > > > will I have any potential problems?
> > > > > > > I think my "Z123456" is transformed to "z123456" by this filter.
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Jokin
> > > > > Sent from: Sant cugat del valles  Spain.
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Jokin
> > >
> >
>
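
A note on the symptom discussed above, offered as a guess rather than a confirmed diagnosis: if Luke shows the indexed term as "Z123456" in upper case, the field may have been indexed untokenized, so LowerCaseFilter never ran at index time while the query parser still lowercases the query to "z123456". The sketch below assumes the Lucene.Net 2.x API (Field.Index.UN_TOKENIZED, the Hits class); the RAMDirectory and the class name are purely illustrative and not taken from the thread. It compares an analyzed query with two ways of keeping the case intact: a direct TermQuery, and the keyword analyzer Jokin mentioned, applied to "author_id" only.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical class name; assumes Lucene.Net 2.x is referenced.
class AuthorIdSearchSketch
{
    static void Main()
    {
        // In-memory index used only for this illustration.
        RAMDirectory dir = new RAMDirectory();

        // ASSUMPTION: the field is indexed untokenized, so the analyzer
        // passed to IndexWriter is not applied to it and the term keeps
        // its original case ("Z123456").
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.Add(new Field("author_id", "Z123456",
                          Field.Store.YES, Field.Index.UN_TOKENIZED));
        writer.AddDocument(doc);
        writer.Close();

        IndexSearcher searcher = new IndexSearcher(dir);

        // 1. Query analyzed by StandardAnalyzer: LowerCaseFilter turns the
        //    query text into the term "z123456".
        Query analyzed =
            new QueryParser("author_id", new StandardAnalyzer()).Parse("Z123456");
        Report(searcher, "StandardAnalyzer query", analyzed);

        // 2. TermQuery bypasses analysis and looks up the exact term.
        Query exact = new TermQuery(new Term("author_id", "Z123456"));
        Report(searcher, "TermQuery", exact);

        // 3. KeywordAnalyzer for this one field; other fields keep
        //    StandardAnalyzer and stay case insensitive.
        PerFieldAnalyzerWrapper perField =
            new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        perField.AddAnalyzer("author_id", new KeywordAnalyzer());
        Query keyword = new QueryParser("author_id", perField).Parse("Z123456");
        Report(searcher, "KeywordAnalyzer query", keyword);

        searcher.Close();
    }

    // Prints the parsed query and the number of matching documents.
    static void Report(IndexSearcher searcher, string label, Query query)
    {
        Hits hits = searcher.Search(query);
        Console.WriteLine(label + ": " + query + " -> " + hits.Length() + " hit(s)");
    }
}
```

If the field really is untokenized, either of the last two approaches avoids editing StandardAnalyzer.cs, which would otherwise make every other field case sensitive as well.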
