New submission from Wellington Fan:
Hello,
It seems that the word boundary sequence -- r'\b' -- is not behaving as
expected using re.split(). The regex docs say:
\b Matches the empty string, but only at the start or end of a word.
My (failing) test:
>>> import re
>>> re.split(r'\b', 'A funky string')
['A funky string']
We get a one-element list returned; I would expect a seven-element list:
['', 'A', ' ', 'funky', ' ', 'string', '']
I have equivalent code in PHP that *does* work:
php > print_r( preg_split('/\b/', 'A funny string') );
Array
(
    [0] =>
    [1] => A
    [2] =>
    [3] => funny
    [4] =>
    [5] => string
    [6] =>
)
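For comparison, here is a minimal workaround sketch that produces the seven-element result described above. It assumes that re.finditer() reports zero-width matches of r'\b' even where re.split() ignores them. The helper name split_on_word_boundaries is hypothetical and not part of the re module.

```python
import re

def split_on_word_boundaries(text):
    """Split text at every position where r'\\b' matches (hypothetical helper)."""
    pieces = []
    start = 0
    # Each match of r'\b' is empty, so match.start() is the boundary position.
    for match in re.finditer(r'\b', text):
        pieces.append(text[start:match.start()])
        start = match.start()
    # Whatever follows the last boundary (possibly the empty string).
    pieces.append(text[start:])
    return pieces

print(split_on_word_boundaries('A funky string'))
# ['', 'A', ' ', 'funky', ' ', 'string', '']
```

Python 3.7 and later do split on empty matches, so re.split(r'\b', 'A funky string') returns this list directly there; the sketch only matters for older versions such as 2.7.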
----------
components: Regular Expressions
messages: 218879
nosy: Wellington.Fan, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: Behavior of word boundaries in regexes unexpected
type: behavior
versions: Python 2.7
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue21551>
_______________________________________