[issue38566] Description of '\w' behavior is vague in `re` documentation

2019-10-24 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan: nosy: +serhiy.storchaka; type: -> behavior

[issue38566] Description of '\w' behavior is vague in `re` documentation

2019-10-23 Thread James Gerity
James Gerity added the comment: Cheers for the additional context. My recommendation would be to change the language to avoid confusion with the Unicode Consortium's formal specifications. Describing what SRE does should be fine: > Matches any alphanumeric Unicode character, as well as '_'. If the
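
To make the proposed wording concrete, here is a minimal sketch that compares the default (Unicode) `\w` against the informal rule "alphanumeric or underscore". The helper `is_word_char` and the sample characters are illustrative assumptions, not part of the tracker discussion, and the agreement shown reflects CPython's behavior rather than a guarantee in the documentation.

```python
import re

# Default str patterns use Unicode matching for \w.
word = re.compile(r"\w")

def is_word_char(ch: str) -> bool:
    """Hypothetical helper: the informal rule 'alphanumeric or underscore'."""
    return ch.isalnum() or ch == "_"

# A small, hand-picked sample: ASCII and non-ASCII letters, ASCII and
# Arabic-Indic digits, the underscore, and some punctuation/space.
samples = ["a", "\u00e9", "\u4e2d", "5", "\u0663", "_", "-", " ", "!"]

for ch in samples:
    print(f"{ch!r}: regex={bool(word.fullmatch(ch))} helper={is_word_char(ch)}")
```

On these samples the two columns should agree, which is the behavior the suggested sentence tries to describe.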

[issue38566] Description of '\w' behavior is vague in `re` documentation

2019-10-23 Thread Josh Rosenberg
Josh Rosenberg added the comment: The definition of \w, historically, has corresponded to the set of characters that can occur in legal variable names in C (alphanumeric ASCII plus underscores, making it equivalent to [a-zA-Z0-9_] for ASCII regex). That's why, on top of the definitely wordy
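
The ASCII equivalence described above can be checked with a short sketch. The variable names and sample characters are assumptions chosen for illustration; the only library features used are `re.compile`, `re.ASCII`, and `fullmatch`.

```python
import re

# \w restricted to ASCII, the explicit class it is said to equal,
# and the default Unicode-aware \w for contrast.
ascii_word = re.compile(r"\w", re.ASCII)
explicit_class = re.compile(r"[a-zA-Z0-9_]")
unicode_word = re.compile(r"\w")

# "\u00e9" is a Latin letter with an acute accent; "\u0663" is an
# Arabic-Indic digit. Both lie outside ASCII.
samples = ["a", "Z", "7", "_", "-", " ", "\u00e9", "\u0663"]

for ch in samples:
    print(
        repr(ch),
        bool(ascii_word.fullmatch(ch)),
        bool(explicit_class.fullmatch(ch)),
        bool(unicode_word.fullmatch(ch)),
    )
```

The first two columns should always match, while the third also accepts the non-ASCII letter and digit, which is where the "Unicode word characters" wording becomes relevant.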

[issue38566] Description of '\w' behavior is vague in `re` documentation

2019-10-23 Thread James Gerity
New submission from James Gerity: The documentation for the `re` library¹ describes the behavior of the specifier '\w' as matching "Unicode word characters," which is very vague. The closest thing I can find that corresponds to this language is the guidance offered in Unicode Technical