Hi,
I am indexing a number of English articles on Spanish resorts. As a result, there are a number of Spanish characters throughout the text, mostly in place names, which are exactly the kind of words I would like to use as queries. My problem is with the StandardTokenizer class, which cuts a word in two whenever it comes across one of the Spanish characters. I had a look at the source, but the code was generated by JavaCC and so is not very readable. Is there a way around this problem, or can someone tell me which area of the code I would need to change to avoid it?
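To illustrate the symptom, here is a minimal plain-Java sketch. It does not use Lucene and does not reproduce StandardTokenizer's actual grammar; the class name TokenSplitSketch and the tokenize method are made up for illustration, and the ASCII-only rule is only an assumption about the kind of rule that would produce this splitting. The place name is written with a \u escape so the example does not depend on the source file's encoding.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSplitSketch {

    // Hypothetical helper: splits text into runs of "word" characters.
    // With asciiOnly = true, only A-Z and a-z count as word characters;
    // otherwise any character accepted by Character.isLetter does.
    static List<String> tokenize(String text, boolean asciiOnly) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean wordChar = asciiOnly
                    ? (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    : Character.isLetter(c);
            if (wordChar) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-word character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String place = "M\u00e1laga"; // "Malaga" with an accented first 'a'

        // ASCII-only rule: the accented letter acts as a separator,
        // giving two tokens, "M" and "laga".
        System.out.println(tokenize(place, true));

        // Unicode-aware rule: the place name stays a single token.
        System.out.println(tokenize(place, false));
    }
}
```

The first call shows the behaviour I am seeing, and the second is the behaviour I would like to get.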
Thanks,
Hannah Cumming
