Doh, I forgot.  Regular expressions worked well for me when I dealt with that 
problem many years ago.
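
For illustration only, here is a minimal sketch of what a regex-based check plus normalization might look like. The pattern, the helper name `normalize_phone`, and the 7-digit lower bound are assumptions made up for this example, not anything taken from Solr or from the approach mentioned above. The 15-digit upper bound follows the E.164 limit.

```python
import re

# Hypothetical pattern (an assumption, not a standard): an optional leading "+",
# then 7 to 20 characters drawn from digits, whitespace, dots, dashes, parentheses.
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")


def normalize_phone(text):
    """Return only the digits of text if it looks like a phone number, else None."""
    candidate = text.strip()
    if not PHONE_RE.match(candidate):
        return None
    # Drop every non-digit character (spaces, dashes, parentheses, the "+").
    digits = re.sub(r"\D", "", candidate)
    # E.164 allows at most 15 digits; the lower bound of 7 is an arbitrary choice.
    if not 7 <= len(digits) <= 15:
        return None
    return digits


if __name__ == "__main__":
    for sample in ["(555) 123-4567", "+44 20 7946 0958", "hello world", "12 34"]:
        print(repr(sample), "->", normalize_phone(sample))
```

Something along these lines could run before indexing, so that only the digit string reaches Solr. Real-world formats vary a lot by country, so the pattern would likely need adjusting.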

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
> From: Otis Gospodnetic <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, June 9, 2008 5:36:34 PM
> Subject: Re: Solr system and numbers
> 
> Not sure.  Perhaps it can be done by training a language model and treating 
> phone numbers as named entities?  Not sure if it would work.  But I know there
> are a few NLP people subscribed, maybe they'll have some good ideas.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> ----- Original Message ----
> > From: Cam Bazz 
> > To: solr-user@lucene.apache.org
> > Sent: Monday, June 9, 2008 4:24:48 PM
> > Subject: Re: Solr system and numbers
> > 
> > I have a similar question:
> > How would one normalize a phone number, or even detect whether a string is one?
> > 
> > On Mon, Jun 9, 2008 at 4:17 PM, dudes dudes wrote:
> > 
> > >
> > > Great info, thanks a lot, all.
> > >
> > >
> > > ----------------------------------------
> > > > Date: Mon, 9 Jun 2008 05:58:50 -0700
> > > > From: [EMAIL PROTECTED]
> > > > Subject: Re: Solr system and numbers
> > > > To: solr-user@lucene.apache.org
> > > >
> > > > Hi,
> > > > > Solr/Lucene can treat phone numbers as strings.  If you want to clean
> > > > > them up and normalize them outside of Solr, you can do that and feed them
> > > > > into Solr as pure numbers.
> > > >
> > > > > How the phone numbers will be treated after you pump them into Solr
> > > > > depends on the analyzer you choose to use for this data.  If you don't need
> > > > > to search on subsets of phone numbers, then just don't tokenize them (i.e.
> > > > > use string type if the phone numbers contain any non-numeric characters,
> > > > > sint otherwise).
> > > >
> > > > Otis
> > > > --
> > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > >
> > > >
> > > > ----- Original Message ----
> > > >> From: dudes dudes
> > > >> To: solr-user@lucene.apache.org
> > > >> Sent: Monday, June 9, 2008 2:10:20 PM
> > > >> Subject: Solr system and numbers
> > > >>
> > > >>
> > > >> Hello experts,
> > > >>
> > > > >> How does Solr deal with numbers or phone numbers? For example, if you have
> > > > >> 1234 and 12 34 or 1 234, with spaces between the digits.
> > > > >> Or is this dealt with by Lucene?
> > > > >>
> > > > >> Is there any documentation or tutorial on this?
> > > >>
> > > >> many thanks,
> > > >> ak
> > > >
> > >
