New submission from Evan:
The changes to shlex due to land in 3.6 use a predefined set of characters to
"augment" wordchars; however, this set is incomplete. For example, 'foo,bar'
should be parsed as a single token, as the shell does, but it is split on the comma:
$ echo foo,bar
foo,bar
>>> import shlex
>>> list(shlex.shlex('foo,bar', punctuation_chars=True))
['foo', ',', 'bar']
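To see which characters are currently added, the wordchars of a plain lexer
can be compared with those of one created with punctuation_chars=True. This is
only a rough sketch: the helper name added_wordchars is made up for
illustration and is not part of shlex, and the exact set may differ between
Python versions.

```python
import shlex


def added_wordchars():
    """Return the characters that punctuation_chars=True adds to wordchars.

    Hypothetical helper for illustration; not part of the shlex API.
    """
    # An empty string is used as input because only the attributes matter here.
    plain = set(shlex.shlex('').wordchars)
    augmented = set(shlex.shlex('', punctuation_chars=True).wordchars)
    return augmented - plain


extra = added_wordchars()
print(sorted(extra))
# The comma is not among the added characters, which is why 'foo,bar' is split.
print(',' in extra)
```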
(For context on where this was encountered, see
https://github.com/kislyuk/argcomplete/issues/161)
Instead of trying to enumerate all possible wordchars, I think a more robust
solution is to use whitespace_split to include *all* characters not otherwise
considered special.
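A minimal sketch of that idea, using only attributes shlex already exposes
(punctuation_chars and whitespace_split). The helper name tokenize is
hypothetical, and this is not the attached patch. Only whitespace-separated
operators are shown, since the handling of input such as 'a&&b' may depend on
the Python version.

```python
import shlex


def tokenize(line):
    """Split line into tokens, treating every non-special character as part
    of a word.

    Hypothetical helper for illustration; not part of the shlex API.
    """
    lexer = shlex.shlex(line, punctuation_chars=True)
    # Do not rely on the wordchars list: anything that is not whitespace
    # (or otherwise special to the lexer) stays inside the current word.
    lexer.whitespace_split = True
    return list(lexer)


print(tokenize('foo,bar'))
print(tokenize('echo foo,bar && ls'))
```

With this setting the comma stays inside the word, while '&&' surrounded by
spaces is still returned as its own token.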
Ideally this would be fixed before 3.6 is released to avoid needing to maintain
backwards compatibility with the current behaviour, although I understand the
timeline may make this difficult.
I've attached a patch with proposed changes, including updates to the tests to
demonstrate the effective difference. I can make the corresponding
documentation changes if we want this merged.
(I've added everyone to the nosy list from http://bugs.python.org/issue1521950
where these changes originated.)
----------
components: Library (Lib)
files: without_augmenting_chars.diff
keywords: patch
messages: 279980
nosy: Andrey.Kislyuk, cvrebert, eric.araujo, eric.smith, evan_, ezio.melotti,
python-dev, r.david.murray, robodan, vinay.sajip
priority: normal
severity: normal
status: open
title: shlex.split should not augment wordchars
type: behavior
versions: Python 3.6
Added file: http://bugs.python.org/file45333/without_augmenting_chars.diff
_______________________________________
Python tracker
<http://bugs.python.org/issue28595>
_______________________________________