I believe that the tokenizer treats a dash as a token separator. Hence, the only way, as I recall, to eliminate this behavior is to modify QueryParser.jj so it doesn't do this. However, doing this can cause some other problems, like hyphenated words at a line break and the like.
(Of course, if you do make such a change, you'll have to go back and reindex after such a change.) I've run into this problem myself and I've 'punted' - on certain fields, when I index, I replace the dash with an underscore. This isn't a real good solution, and it does require me to keep remembering in which fields I have to do this substitution in the search. But, for the moment it works. I'll probably go back and make some kind of change later, when I have more time. HTH, Terry ----- Original Message ----- From: "hermit" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Monday, February 03, 2003 2:39 AM Subject: '-' character not interpreted correctly in field names > Hello! > > I have a problem, a big one. I have successfully indexed 600 MB of XML > data, but the search can't give any results if the field contains any > '-' characters . > For example: compound@cgx-code:[2 - 5] must match at least two results > based on my XML data but it gives nothing. > > Can you advice me a simple solution? Or is it a bug? > > The Hermit > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
