It won't do what I need.  I may have something like:

"All-In-One is located in 92226-4446 and has an E-A-R"

I want it to be tokenized as follows:

all
one
located
92226
4446
E-A-R

Right now... it is tokenizing it as this:

all
one
located
92226-4446
E-A-R

That's the type of information you give when you ask the question the first time (not to be a pompous ass or anything <g> ). The problem is that your zip code is matched by the NUM rule:

| <NUM: (<ALPHANUM> <P> <HAS_DIGIT>
      | <HAS_DIGIT> <P> <ALPHANUM>
      | <ALPHANUM> (<P> <HAS_DIGIT> <P> <ALPHANUM>)+
      | <HAS_DIGIT> (<P> <ALPHANUM> <P> <HAS_DIGIT>)+
      | <ALPHANUM> <P> <HAS_DIGIT> (<P> <ALPHANUM> <P> <HAS_DIGIT>)+
      | <HAS_DIGIT> <P> <ALPHANUM> (<P> <HAS_DIGIT> <P> <ALPHANUM>)+
       )
 >

You could try removing the first two OR alternatives. Beyond that, it gets tricky. And if you remove them, then other things they would normally match (besides zip codes) will no longer be matched by NUM.
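
If changing the grammar turns out to break too much, another option is to leave NUM alone and split the offending tokens after tokenizing. What follows is only a rough sketch of that idea, not something taken from the Lucene source: the class name ZipTokenSplitter and the method splitNumericTokens are made up for illustration, and it works on plain strings rather than on Lucene Token objects. In a real analyzer the same logic would presumably live in a custom TokenFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: post-processes tokens that the
// tokenizer has already produced.
public class ZipTokenSplitter {

    // Two or more digit-only groups joined by hyphens, e.g. "92226-4446".
    // Tokens containing letters, such as "E-A-R", do not match.
    private static final Pattern DIGITS_WITH_HYPHENS =
            Pattern.compile("\\d+(-\\d+)+");

    public static List<String> splitNumericTokens(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (DIGITS_WITH_HYPHENS.matcher(token).matches()) {
                // Emit each digit group as its own token.
                out.addAll(Arrays.asList(token.split("-")));
            } else {
                // Leave everything else exactly as the tokenizer produced it.
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens =
                Arrays.asList("all", "one", "located", "92226-4446", "E-A-R");
        // prints [all, one, located, 92226, 4446, E-A-R]
        System.out.println(splitNumericTokens(tokens));
    }
}
```

The trade-off is that every all-digit hyphenated token gets split, not just zip codes, so something like a hyphenated phone number would be broken apart too.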

- Mark
