Doug Cutting writes:
> Morus Walter wrote:
> > Now I think this can be fixed in the query parser alone by simply allowing
> > '-' within words.
> > That is, change
> > <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> ) >
> > to
> > <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> | "-" ) >
> >
> > As a result, the query parser will read '-' within words (such as
> > tft-monitor or Sysh1-1) as part of one word, which will then be tokenized
> > by the analyzer in use and end up as a term query or a phrase query,
> > depending on whether it creates one or more tokens.
>
> Other characters which are also candidates for this sort of treatment
> include "/", "@", ".", "'", and "+".
>
_TERM_START_CHAR is
| <#_TERM_START_CHAR: ( ~[ " ", "\t", "\n", "\r", "+", "-", "!", "(", ")",
":", "^", "[", "]", "\"", "{", "}", "~", "*", "?" ]
so / @ . ' are already allowed in terms.
(:, ^, ~, * and ? cannot be added; parentheses don't make sense.)
So I end up with
<#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> | "-" | "+" ) >
The regression tests show no errors, so I entered the change in Bugzilla.
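
To illustrate the effect of the changed _TERM_CHAR rule, here is a rough Python sketch that emulates the two character classes with regular expressions. It is not the JavaCC lexer: it assumes a term is one _TERM_START_CHAR followed by any number of _TERM_CHARs, it leaves out _ESCAPED_CHAR entirely, and the names term_pattern and first_term are made up for this example.

```python
import re

# Characters excluded from _TERM_START_CHAR, copied from the definition quoted above.
EXCLUDED = [" ", "\t", "\n", "\r", "+", "-", "!", "(", ")",
            ":", "^", "[", "]", "\"", "{", "}", "~", "*", "?"]

# Regex character class standing in for _TERM_START_CHAR (escaped chars omitted).
START = "[^" + "".join(re.escape(c) for c in EXCLUDED) + "]"


def term_pattern(extra_chars=""):
    """Hypothetical helper: regex for START (START | extra_chars)*.

    extra_chars lists the characters additionally allowed inside a term,
    e.g. "-+" for the proposed _TERM_CHAR change.
    """
    if extra_chars:
        extra = "|[" + "".join(re.escape(c) for c in extra_chars) + "]"
    else:
        extra = ""
    return re.compile("{0}(?:{0}{1})*".format(START, extra))


def first_term(pattern, text):
    """Return the term read at the start of text, or None if none starts there."""
    match = pattern.match(text)
    return match.group(0) if match else None


old_rule = term_pattern()      # _TERM_CHAR without "-" and "+"
new_rule = term_pattern("-+")  # _TERM_CHAR with "-" and "+" added

print(first_term(old_rule, "tft-monitor"))  # term stops before the '-'
print(first_term(new_rule, "tft-monitor"))  # whole word is read as one term
print(first_term(new_rule, "-foo"))         # a leading '-' still cannot start a term
```

Since only _TERM_CHAR changes and _TERM_START_CHAR does not, a leading '-' or '+' is still not read as part of a term in this sketch, so the prohibit/require operators at the start of a word are unaffected.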
Morus