We have a large body of documents that contain XML, with OCR text embedded in one of the XML fields.

Searches such as "group effect" are returning hits for documents that contain text like the following:

    ...group of ~a- The effect...

I take it this is because stop words such as 'of' and 'the', along with punctuation, are ignored. Is there anything I can do about this other than write an alternative to the Standard Analyzer?

Thanks,
Bob Mason
UCSF Tobacco Industry Digital Library
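
For illustration, the suspected behaviour can be sketched in plain Java. This is not the Standard Analyzer's actual code. The class name PhraseSketch, the three-word stop list, and the assumption that removed stop words leave no positional gap are all hypothetical, chosen only to show how "group" and "effect" could end up adjacent after analysis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class PhraseSketch {
    // Hypothetical stop list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "of", "the"));

    // Lowercase the text, split on anything that is not a letter or digit,
    // and drop stop words WITHOUT recording a gap where they were (assumption).
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // A phrase "matches" when its analyzed tokens appear consecutively
    // in the analyzed document tokens.
    static boolean matchesPhrase(String document, String phrase) {
        List<String> doc = analyze(document);
        List<String> query = analyze(phrase);
        return !query.isEmpty() && Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        String ocr = "group of ~a- The effect";
        System.out.println(analyze(ocr));                       // [group, effect]
        System.out.println(matchesPhrase(ocr, "group effect")); // true
    }
}
```

Under these assumptions the OCR fragment reduces to the two tokens "group" and "effect", so the phrase matches even though the words are four tokens apart in the original text.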