Cool! I'll take a look. Thanks David.

robert

> -----Original Message-----
> From: Hibbs, David [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 16, 2004 9:20 AM
> To: 'Robert Taylor'; [EMAIL PROTECTED]
> Subject: RE: [OT] Search string tokenizer
> 
> 
> I wrote some code to do this for an open-source project on sf.net an eon or
> two ago, before the regex packages matured.  You can probably enhance what I
> wrote with regex, but it's at least a starting point... 
> 
> http://cvs.sourceforge.net/viewcvs.py/omd/java/coruscant/omd/util/StringSearcher.java?rev=1.2&view=markup
> 
> Anyway, what the code does is split the input into lists (ok, so I used
> vectors, I was still learning Java!) of 3 types: required present (i.e. +),
> required absent (i.e. -), and optional terms.  In short, the Yahoo search
> style.  Usage:
> 1) call setCriteriaString (passing your user search input)
> 2) call compareString (passing the content to search/validate)
> 
> In your case, since you're going to pass the search criteria to SQL, you can
> probably just use the tokenizing logic and add some getters for the criteria
> lists...
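
The three-list split described above could be sketched roughly as follows. This
is not the code from StringSearcher.java (which is not reproduced in this
thread); the class name SearchCriteria, its getters, and matches() are
hypothetical names chosen for illustration. It assumes terms are separated by
whitespace, that a leading + or - marks a term as required or excluded, and
that matching is a case-insensitive substring test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits user search input into required (+),
// excluded (-), and optional terms, and exposes them through getters.
public class SearchCriteria {
    private final List<String> required = new ArrayList<>();
    private final List<String> excluded = new ArrayList<>();
    private final List<String> optional = new ArrayList<>();

    public SearchCriteria(String criteria) {
        // Assumption: terms are separated by runs of whitespace.
        for (String term : criteria.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // empty input yields a single empty string
            }
            if (term.startsWith("+") && term.length() > 1) {
                required.add(term.substring(1));
            } else if (term.startsWith("-") && term.length() > 1) {
                excluded.add(term.substring(1));
            } else {
                // A bare "+" or "-" is kept as a literal optional term.
                optional.add(term);
            }
        }
    }

    public List<String> getRequired() { return required; }
    public List<String> getExcluded() { return excluded; }
    public List<String> getOptional() { return optional; }

    // Assumed semantics: every required term is present, no excluded term
    // is present, and if there are no required terms, at least one optional
    // term must be present (when any optional terms were given).
    public boolean matches(String content) {
        String text = content.toLowerCase();
        for (String t : required) {
            if (!text.contains(t.toLowerCase())) return false;
        }
        for (String t : excluded) {
            if (text.contains(t.toLowerCase())) return false;
        }
        if (!required.isEmpty() || optional.isEmpty()) return true;
        for (String t : optional) {
            if (text.contains(t.toLowerCase())) return true;
        }
        return false;
    }
}
```

For the SQL case, only the constructor and the getters would be needed; the
lists could then be turned into bound parameters for LIKE / NOT LIKE clauses
rather than concatenated into the query string.
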
> 
> David Hibbs, ACS
> Staff Programmer / Analyst
> American National Insurance Company
> 
> > -----Original Message-----
> > From: Robert Taylor [mailto:[EMAIL PROTECTED]
> > Sent: Monday, March 15, 2004 3:20 PM
> > To: [EMAIL PROTECTED]
> > Subject: [OT] Search string tokenizer
> > 
> > 
> > I did a Google search on this and didn't really come up with anything
> > useful. Before I implement this myself, is there an existing
> > implementation that parses a search string into tokens, similar to how
> > Google or other search engines parse search strings?
> > 
> > For example, I would like to parse a search string into tokens, where
> > tokens are separated by blank space and a quoted phrase counts as a
> > single token.
> > 
> > So the string:
> > 
> > 'Struts "web presentation tier"' 
> > 
> > would return 2 tokens:
> >  - Struts
> >  - web presentation tier
> > 
> > but the string:
> > 
> > 'Struts web presentation tier'
> > 
> > would return 4 tokens:
> >  - Struts
> >  - web
> >  - presentation
> >  - tier
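
One way to get the behaviour in the two examples above is a small
regex-based tokenizer. This is only a sketch under a few assumptions: the
class name SearchTokenizer is made up for illustration, only double quotes
delimit phrases, and an unmatched quote character is simply skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a quote-aware search string tokenizer.
public class SearchTokenizer {
    // Group 1: the inside of a double-quoted phrase.
    // Group 2: a run of characters that are neither whitespace nor quotes.
    private static final Pattern TOKEN =
            Pattern.compile("\"([^\"]*)\"|([^\\s\"]+)");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String phrase = m.group(1);
            // Quoted phrases are trimmed; bare words are taken as matched.
            String token = (phrase != null) ? phrase.trim() : m.group(2);
            if (!token.isEmpty()) {
                tokens.add(token); // drops empty phrases such as ""
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Struts, web presentation tier]
        System.out.println(tokenize("Struts \"web presentation tier\""));
        // prints [Struts, web, presentation, tier]
        System.out.println(tokenize("Struts web presentation tier"));
    }
}
```

The regex keeps the logic short; a hand-written character loop would work
just as well if escaped quotes inside phrases ever need to be supported.
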
> > 
> > 
> > Any help is appreciated.
> > 
> > robert
> > 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
