Martin v. Löwis <[EMAIL PROTECTED]> added the comment:

Unicode TR#18 defines \w as a shorthand for

    \p{alpha}
    \p{gc=Mark}
    \p{digit}
    \p{gc=Connector_Punctuation}

which would include all marks. We should recursively check whether we
follow the recommendation (e.g. \p{alpha} refers to all characters having
the Alphabetic derived core property, which is
Lu+Ll+Lt+Lm+Lo+Nl + Other_Alphabetic, where Other_Alphabetic is a selected
list of additional characters - all from Mn/Mc).

----------
nosy: +loewis

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1693050>
_______________________________________
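
The check suggested in the comment could be sketched roughly as follows. This is only an approximation under stated assumptions: it models the TR#18 definition purely by general category (taking \p{digit} as Nd and relying on the comment's observation that Other_Alphabetic is drawn from Mn/Mc, which \p{gc=Mark} already covers), and the names TR18_WORD_CATEGORIES, tr18_word_approx, re_word and disagreements are hypothetical helpers, not part of any library.

```python
import re
import unicodedata

# Assumption: general categories that approximate TR#18's \w.
#   \p{alpha}  -> Lu, Ll, Lt, Lm, Lo, Nl (+ Other_Alphabetic, assumed to be
#                 covered by the mark categories below)
#   \p{gc=Mark} -> Mn, Mc, Me
#   \p{digit}   -> Nd
#   \p{gc=Connector_Punctuation} -> Pc
TR18_WORD_CATEGORIES = {
    "Lu", "Ll", "Lt", "Lm", "Lo", "Nl",
    "Mn", "Mc", "Me",
    "Nd",
    "Pc",
}


def tr18_word_approx(ch):
    """Return True if ch falls in the approximated TR#18 \\w set."""
    return unicodedata.category(ch) in TR18_WORD_CATEGORIES


def re_word(ch):
    """Return True if the re module's \\w matches the single character ch."""
    return re.fullmatch(r"\w", ch) is not None


def disagreements(chars):
    """List the characters on which the two definitions differ."""
    return [ch for ch in chars if tr18_word_approx(ch) != re_word(ch)]
```

Calling disagreements() on a string that contains combining marks, for example disagreements("e\u0301"), shows whether the running interpreter treats those marks as word characters. A full check would need the real Alphabetic derived property from the Unicode data files rather than this category-based stand-in.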