Re: [Lucene.Net] How to index/search a file name

Gustavo Poll Tue, 06 Sep 2011 13:07:15 -0700

thanks, I'll do it...

2011/9/6 Digy <digyd...@gmail.com>


> That can be a starting point (just play a little bit with the tokenizers &
> filters):
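>
>    using Lucene.Net.Analysis;
>    using Lucene.Net.Analysis.Standard;
>
>    // StandardAnalyzer-style chain with accent folding added at the end.
>    public class ModifiedStandardAnalyzer : Analyzer
>    {
>        public override TokenStream TokenStream(System.String fieldName, System.IO.TextReader reader)
>        {
>            // Tokenize with the same tokenizer that StandardAnalyzer uses.
>            StandardTokenizer tokenStream = new StandardTokenizer(reader, true);
>            TokenStream result = new StandardFilter(tokenStream);
>            result = new LowerCaseFilter(result);
>            // Fold accented characters to their ASCII equivalents.
>            result = new ASCIIFoldingFilter(result);
>            return result;
>        }
>    }
>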
> DIGY
>
>
>
> -----Original Message-----
> From: Gustavo Poll [mailto:gkp...@gmail.com]
> Sent: Tuesday, September 06, 2011 10:06 PM
> To: lucene-net-user@lucene.apache.org
> Subject: Re: [Lucene.Net] How to index/search a file name
>
> Thanks again... OK, so it is not.
>
> standard analyzer:
>
> [name.surn...@gmail.com] [123.456] [3,5] [at&t] [güsıöç] [güsiöç] [aß?de?] [??????] [ssß]
>
> UnaccentedWordAnalyzer:
>
> [name] [surname] [gmail] [com] [123] [456] [3] [5] [at] [t] [gusioc] [gusioc] [aß?de?] [??????] [ssss]
>
> StandardAnalyzer would be perfect for my application if it were accent
> insensitive. Can anyone please tell me the easiest way to code such an
> analyzer (an accent-insensitive StandardAnalyzer)?
>
> I have heard it is not a good idea to write a class that inherits from
> StandardAnalyzer, because StandardAnalyzer is supposed to be a final class.
> Is that correct?
>
> I'd appreciate any help.
>
> Gustavo Poll
>
> 2011/9/6 Digy <digyd...@gmail.com>
>
> > A function is worth a thousand words :)
> >
> >        // Requires: using System; using System.IO; using Lucene.Net.Analysis;
> >        // using Lucene.Net.Analysis.Standard;
> >        void Test()
> >        {
> >            // Compare the tokens each analyzer produces for the same input.
> >            Analyzer[] analyzers = new Analyzer[] { new StandardAnalyzer(),
> >                new Lucene.Net.Analysis.Ext.UnaccentedWordAnalyzer() };
> >
> >            string input = "name.surn...@gmail.com 123.456 3,5 AT&T ğüşıöç%ĞÜŞİÖÇ$ΑΒΓΔΕΖ#АБВГДЕ SSß";
> >
> >            foreach (Analyzer analyzer in analyzers)
> >            {
> >                TokenStream ts = analyzer.TokenStream("", new StringReader(input));
> >                Lucene.Net.Analysis.Token t = ts.Next();
> >                while (t != null)
> >                {
> >                    // Print each token in brackets, all on one line per analyzer.
> >                    Console.Write("[" + t.TermText() + "] ");
> >                    t = ts.Next();
> >                }
> >                Console.WriteLine(); Console.WriteLine();
> >            }
> >        }
> >
> > DIGY
> >
> > -----Original Message-----
> > From: Gustavo Poll [mailto:gkp...@gmail.com]
> > Sent: Tuesday, September 06, 2011 9:00 PM
> > To: lucene-net-user@lucene.apache.org
> > Subject: Re: [Lucene.Net] How to index/search a file name
> >
> > Thanks DIGY, I am interested in that too. Let me see if I understood:
> >
> > Is UnaccentedWordAnalyzer like StandardAnalyzer, but accent insensitive?
> >
> > Thanks!
> >
> > Gustavo Poll
> >
> > 2011/9/6 digy digy <digyd...@gmail.com>
> >
> > > That may help
> > >
> > > UnaccentedWordAnalyzer @
> > > https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/contrib/Core/Analysis/Ext/Analysis.Ext.cs
> > >
> > > DIGY
> > >
> > > On Tue, Sep 6, 2011 at 12:31 PM, Floyd Wu <floyd...@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I have a problem that has bothered me many times. I need to index file
> > > > names and make them searchable by partial file name.
> > > >
> > > > Example: 2009&2010Q2_ABCD_Report.xls (the file name)
> > > >
> > > > When I run these queries:
> > > >
> > > > filename:ABCD    no match returned
> > > >
> > > > filename:2010Q2_ABCD     match
> > > >
> > > > filename:Report*    match
> > > >
> > > > I'm using StandardAnalyzer and Lucene.Net version 2.9.3. Currently the
> > > > filename field is set to tokenized/indexed/stored.
> > > >
> > > > What I want is for Lucene.Net to match when the user types any part of
> > > > the file name (a string like 2009 or 2010Q2 or ABCD or Report or xls or
> > > > Report.xls).
> > > >
> > > > Please help with this or kindly point me to a way to solve it.
> > > >
> > > > Floyd
> > > >
>
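For the question that started this thread (matching any part of a file name such as 2009&2010Q2_ABCD_Report.xls), one possible approach is to break the name into terms yourself before indexing it. The following is only a minimal sketch in plain C#, not Lucene.Net code. FileNameTerms, StripAccents and Split are hypothetical names introduced for illustration. The accent handling relies on Unicode decomposition, which is an assumption of this sketch and not the same mapping as ASCIIFoldingFilter: characters with no decomposition, such as ß or ı, pass through unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class FileNameTerms
{
    // Decompose each character (ü becomes u + combining diaeresis) and drop
    // the combining marks, which leaves the unaccented base letters.
    public static string StripAccents(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Lower-case the name, strip accents, and cut it into runs of letters
    // and digits. Every other character (&, _, ., space) acts as a separator.
    public static List<string> Split(string fileName)
    {
        var terms = new List<string>();
        var current = new StringBuilder();
        foreach (char c in StripAccents(fileName).ToLowerInvariant())
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(c);
            }
            else if (current.Length > 0)
            {
                terms.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0)
        {
            terms.Add(current.ToString());
        }
        return terms;
    }

    public static void Main()
    {
        // prints: 2009 2010q2 abcd report xls
        Console.WriteLine(string.Join(" ", Split("2009&2010Q2_ABCD_Report.xls")));
    }
}
```

The resulting terms could be joined with spaces and indexed in an additional tokenized field, with the same splitting applied to whatever the user types. That wiring is left out here because it depends on how the index and the query parsing are set up.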
