Thank you, again, for pointing me towards TrueWord, which the dictionary
describes as follows:

"Designates a string as part of a chunk expression, delimited by Unicode
word breaks, as determined by the ICU Library."

Using TrueWord instead of word in my 8.1.4 code fixed the problems that I
was encountering with word chunk identification when the "word" was
prefixed or suffixed with parentheses or punctuation. There appears to
be little performance impact in using "TrueWord" in place of "word" when
searching very large texts (10+ MB).
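
For anyone following along, here is a minimal sketch of the difference as I
understand it. The sample string and the variable name tText are made up for
illustration and are not taken from my actual code; the results in the
comments are what I would expect from the dictionary description, since
"word" splits on whitespace only while "TrueWord" follows the ICU word breaks.

```livecode
on mouseUp
   -- tText is a hypothetical sample string, used only for this illustration
   local tText
   put "Hello (world), again." into tText

   -- "word" is delimited by whitespace, so the parentheses and comma stay attached
   -- "trueWord" uses Unicode word breaks, so the punctuation is left out
   put word 2 of tText & return & trueWord 2 of tText
   -- expected in the message box:
   --   (world),
   --   world
end mouseUp
```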

Interestingly, there is no cross-reference to TrueWord from the Word entry in
the 8.1.4 dictionary, though there is a cross-reference from TrueWord to
Word.

Henry


