Cool! I'll take a look. Thanks, David.

robert

> -----Original Message-----
> From: Hibbs, David [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 16, 2004 9:20 AM
> To: 'Robert Taylor'; [EMAIL PROTECTED]
> Subject: RE: [OT] Search string tokenizer
>
>
> I wrote some code to do this for an open-source project on sf.net an eon or
> two ago, before the regex packages matured. You can probably enhance what I
> wrote with regex, but it's at least a starting point...
>
> http://cvs.sourceforge.net/viewcvs.py/omd/java/coruscant/omd/util/StringSearcher.java?rev=1.2&view=markup
>
> Anyway, what the code does is split the input into lists (ok, so I used
> vectors, I was still learning Java!) of 3 types: required present (i.e. +),
> required absent (i.e. -), and optional terms. In short, the yahoo search
> style. Usage:
> 1) call setCriteriaString (passing your user search input)
> 2) call compareString (passing the content to search/validate)
>
> In your case, since you're going to pass the search criteria to SQL, you can
> probably just use the tokenizing logic and add some getters for the criteria
> lists...
>
> David Hibbs, ACS
> Staff Programmer / Analyst
> American National Insurance Company
>
> > -----Original Message-----
> > From: Robert Taylor [mailto:[EMAIL PROTECTED]
> > Sent: Monday, March 15, 2004 3:20 PM
> > To: [EMAIL PROTECTED]
> > Subject: [OT] Search string tokenizer
> >
> >
> > I did a google search on this and didn't really come up with
> > anything useful. Before I implement this myself, is there an
> > existing implementation of parsing a search string which would
> > produce tokens similar to how Google or other search engines
> > parse search strings?
> >
> > For example, I would like to parse a search string into tokens
> > where tokens are delimited by either a blank space or a quoted
> > phrase.
> >
> > So the string:
> >
> > 'Struts "web presentation tier"'
> >
> > would return 2 tokens:
> > - Struts
> > - web presentation tier
> >
> > but the string:
> >
> > 'Struts web presentation tier'
> >
> > would return 4 tokens:
> > - Struts
> > - web
> > - presentation
> > - tier
> >
> >
> > Any help is appreciated.
> >
> > robert
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
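
For reference, the tokenizing discussed in this thread can probably be done in a few lines with java.util.regex. What follows is only a rough sketch under some assumptions, not the StringSearcher code linked above: the class name SearchTokens, the parse method, and the four list fields are made up for illustration. It assumes a term is either a double-quoted phrase or a run of non-blank characters, and that a leading + or - marks a term as required or excluded, in the yahoo search style described above. Unbalanced quotes are not handled specially.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
final class SearchTokens {
    // Group 1: optional + or - prefix.
    // Group 2: contents of a double-quoted phrase.
    // Group 3: a run of non-whitespace characters.
    private static final Pattern TOKEN =
            Pattern.compile("([+-]?)(?:\"([^\"]*)\"|(\\S+))");

    final List<String> all = new ArrayList<String>();       // every term, in input order
    final List<String> required = new ArrayList<String>();  // terms prefixed with +
    final List<String> excluded = new ArrayList<String>();  // terms prefixed with -
    final List<String> optional = new ArrayList<String>();  // terms with no prefix

    static SearchTokens parse(String input) {
        SearchTokens result = new SearchTokens();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of groups 2 and 3 is non-null for each match.
            String term = (m.group(2) != null) ? m.group(2) : m.group(3);
            if (term.isEmpty()) {
                continue; // skip an empty quoted phrase such as ""
            }
            result.all.add(term);
            String prefix = m.group(1);
            if (prefix.equals("+")) {
                result.required.add(term);
            } else if (prefix.equals("-")) {
                result.excluded.add(term);
            } else {
                result.optional.add(term);
            }
        }
        return result;
    }
}
```

With this sketch, SearchTokens.parse("Struts \"web presentation tier\"").all holds the 2 tokens from the first example, and the unquoted string yields 4. Since the criteria are going to SQL, the three lists could be bound as query parameters rather than concatenated into the statement.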