[issue7089] shlex behaves unexpected if newlines are not whitespace
Jan David Mol jjd...@gmail.com added the comment:

As there seems to be some interest, I've continued working on a patch for this issue. Attached is an improved version of the patch, including additions to test_shlex.py.

It is improved in the sense that a newline after a comment is no longer considered part of the comment (in accordance with POSIX), which makes a difference when newlines are tokens. To accomplish this, I had to add an ungetc buffer to shlex, so that any newline read by the readline() call used when a comment is encountered can be pushed back.

@Gabriel: the test case of a comment at the end of the file with no trailing newline is addressed.

Relevant POSIX sections are Shell Utilities 2.3(10) and Rationale C.2.3.

--
Added file: http://bugs.python.org/file15708/lexer-newline-tokens-patch-2.0.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7089
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
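The push-back idea described in this comment can be illustrated with a minimal sketch. This is not the attached patch: PushbackReader, read_char, unread and skip_comment are hypothetical names chosen for the example, and the sketch scans character by character instead of calling readline(), so that it stays correct even when the buffer already holds pushed-back characters.

```python
import io


class PushbackReader:
    """Hypothetical character reader with an ungetc-style buffer."""

    def __init__(self, stream):
        self._stream = stream
        self._pushed = []  # pushed-back characters; the last item is read first

    def read_char(self):
        """Return the next character, or '' at end of input."""
        if self._pushed:
            return self._pushed.pop()
        return self._stream.read(1)

    def unread(self, text):
        """Push text back so that its first character is read next."""
        self._pushed.extend(reversed(text))

    def skip_comment(self):
        """Discard the rest of a comment but keep its terminating newline."""
        while True:
            ch = self.read_char()
            if ch == "":
                return  # comment ended at end of input, nothing to push back
            if ch == "\n":
                self.unread(ch)  # the newline is not part of the comment
                return


reader = PushbackReader(io.StringIO("a # comment\nb"))
seen = []
while True:
    ch = reader.read_char()
    if ch == "":
        break
    if ch == "#":
        reader.skip_comment()
        continue
    seen.append(ch)
```

After the loop, seen still contains the "\n" that followed the comment, so a lexer built on such a reader could return it as a token. A comment that ends at end of input leaves nothing to push back, which corresponds to the case raised by Gabriel.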
[issue7611] shlex not posix compliant when parsing foo#bar
New submission from Jan David Mol jjd...@gmail.com:

The shlex parser parses foo#bar as foo, discarding the rest as a comment. This is actually one of the test cases, even in POSIX mode. However, POSIX (see below) only allows a comment to start at the beginning of a token, so foo#bar has to result in a single foo#bar token. To see this easily, run echo foo#bar in bash and compare it with echo foo #bar.

Fixing this might break some applications that rely on the current behaviour, even though that behaviour is not strictly POSIX compliant.

POSIX 2008, Rationale C.2.3 (which refers to Shell Utilities 2.3(10)):

    The (10) rule about '#' as the current character is the first in the sequence in which a new token is being assembled. The '#' starts a comment only when it is at the beginning of a token. This rule is also written to indicate that the search for the end-of-comment does not consider escaped newline specially, so that a comment cannot be continued to the next line.

--
components: Library (Lib)
messages: 97081
nosy: jjdmol2
severity: normal
status: open
title: shlex not posix compliant when parsing foo#bar
type: behavior
versions: Python 2.5, Python 2.6, Python 2.7, Python 3.1, Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7611
___
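To make the quoted rule concrete, here is a minimal sketch of a word splitter in which '#' starts a comment only at the beginning of a token. It is an illustration, not shlex and not a proposed patch: posix_words is a hypothetical name, the input is assumed to be a single line, and quoting and escapes are deliberately ignored.

```python
def posix_words(line, whitespace=" \t"):
    """Split one line into words; '#' opens a comment only at a token start."""
    tokens = []
    current = ""
    for ch in line:
        if ch in whitespace:
            # Whitespace ends the token being assembled, if there is one.
            if current:
                tokens.append(current)
                current = ""
        elif ch == "#" and not current:
            # '#' is the first character of a new token: the rest is a comment.
            break
        else:
            # Any other character, including '#' inside a token, is kept.
            current += ch
    if current:
        tokens.append(current)
    return tokens


print(posix_words("foo#bar"))   # ['foo#bar']
print(posix_words("foo #bar"))  # ['foo']
```

The only difference from the behaviour reported above is the `not current` check: a '#' seen while a token is already being assembled is treated as an ordinary character.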
[issue7611] shlex not posix compliant when parsing foo#bar
Jan David Mol jjd...@gmail.com added the comment:

Attached is a program which shows the relevant behaviour:

    import shlex

    # Tokenize each test string in POSIX mode and show the resulting tokens.
    tests = ["foo#bar", "foo #bar"]
    for t in tests:
        print("%s - %s" % (t, [x for x in shlex.shlex(t, posix=True)]))

which results in

    $ python lexer_test.py
    foo#bar - ['foo']
    foo #bar - ['foo']

(the expected result is, of course, ['foo#bar'] on the first line).

--
versions: +Python 2.5
Added file: http://bugs.python.org/file15709/lexer_test.py
[issue7089] shlex behaves unexpected if newlines are not whitespace
New submission from Jan David Mol jjd...@gmail.com:

The shlex module does not behave as expected in the presence of comments when newlines are not whitespace. An example (attached):

    from shlex import shlex

    lexer = shlex("a \n b")
    print(",".join(lexer))
    # prints: a,b

    lexer = shlex("a # comment \n b")
    print(",".join(lexer))
    # prints: a,b

    lexer = shlex("a \n b")
    # Whitespace without "\n"; the exact value was lost in this archive
    # copy, so a single space is assumed here.
    lexer.whitespace = " "
    print(",".join(lexer))
    # prints "a," and ",b" on separate lines: the newline is a token

    lexer = shlex("a # comment \n b")
    lexer.whitespace = " "  # same assumed value as above
    print(",".join(lexer))
    # prints: a,b

Now where did my newline go? The comment ate it! That happens even though the docs seem to indicate that the newline is not part of the comment itself:

    shlex.commenters: The string of characters that are recognized as comment beginners. All characters from the comment beginner to end of line are ignored. Includes just '#' by default.

--
files: lexertest.py
messages: 93776
nosy: jjdmol2
severity: normal
status: open
title: shlex behaves unexpected if newlines are not whitespace
type: behavior
versions: Python 2.4, Python 2.5, Python 2.6, Python 2.7, Python 3.0, Python 3.1, Python 3.2
Added file: http://bugs.python.org/file15087/lexertest.py
[issue7089] shlex behaves unexpected if newlines are not whitespace
Jan David Mol jjd...@gmail.com added the comment:

Attached is a patch which fixes this for me. It basically falls through with '\n' as the current character when a comment is encountered. That may be a bit of a hack (who says '\n' is the only newline character in there, and not '\r'?), but I'll leave the more intricate stuff to you experts.

--
keywords: +patch
Added file: http://bugs.python.org/file15088/lexer-newline-tokens.patch
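On the question of which newline character ended the comment: one possible way to avoid hard-coding '\n' is to inspect the text that was consumed as the comment and recover its actual terminator. The sketch below is only an illustration of that idea and is not taken from the attached patch. split_comment_line is a hypothetical helper, and it assumes the comment text is available as a string, for example as the return value of readline().

```python
def split_comment_line(rest_of_line):
    """Split consumed comment text into (comment_body, terminator).

    The terminator is "\r\n", "\n", "\r", or "" when the comment ran
    into the end of the input without a newline.
    """
    # Check "\r\n" first so that it is not mistaken for a bare "\n".
    for terminator in ("\r\n", "\n", "\r"):
        if rest_of_line.endswith(terminator):
            return rest_of_line[:-len(terminator)], terminator
    return rest_of_line, ""


body, terminator = split_comment_line(" comment \n")
print(repr(body), repr(terminator))  # ' comment ' '\n'
```

A lexer could then continue with the returned terminator as the current input instead of a fixed '\n', and do nothing extra when the terminator is empty.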
[issue7089] shlex behaves unexpected if newlines are not whitespace
Changes by Jan David Mol jjd...@gmail.com:

--
components: +Library (Lib)