A new version of Unicode Technical Report #29: Text Boundaries is available at <http://www.unicode.org/reports/tr29/>, covering grapheme-cluster, word, and sentence boundaries. This version contains significant modifications; for a summary, see <http://www.unicode.org/reports/tr29/#Modifications>.

This is a draft, not a final version, and a number of open issues remain. Feedback is welcome, and would be especially useful from two groups of people:

- those who are aware of the usage of punctuation in different languages
- those who are experienced with regular expressions

Feedback received before the UTC meeting (starting August 20) can be made available for the discussion of TR29 at that meeting.

Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄

