On Sunday, November 3, 2019 at 3:20:27 AM UTC-6, Edward K. Ream wrote:

> The new code should put an end to a long series of issues <https://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=untokenize&submit=search&status=-1%2C1%2C2%2C3> against the untokenize code in python's tokenize library module <https://github.com/python/cpython/blob/master/Lib/tokenize.py>.
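To make the failure concrete, here is a minimal, stdlib-only sketch of the round trip that issue 38663 describes. The source string is the same one used in the unit test below. Whether the final comparison fails depends on your Python version, so the sketch prints the result instead of asserting it.

    # Stdlib-only sketch of the issue-38663 round trip.
    # The source uses a backslash line continuation.
    import io
    import tokenize

    source = "print \\\n ('abc')\n"
    readline = io.BytesIO(source.encode('utf-8')).readline
    tokens = list(tokenize.tokenize(readline))

    # untokenize returns bytes because the stream starts with an ENCODING token.
    result = tokenize.untokenize(tokens)
    result_tokens = list(tokenize.tokenize(io.BytesIO(result).readline))

    # A strict test compares full five-tuples, including start/end positions
    # and the physical line.  That is the comparison at issue here.
    print(result)
    print(result_tokens == tokens)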
A few more words about the testing that I have done.

The python tests in test_tokenize.py are quite rightly careful about unicode. My test code takes similar care. leoTest.leo contains unit tests from test_tokenize.py, adapted to run within Leo, including this new one:

    # Test https://bugs.python.org/issue38663.
    import leo.core.leoBeautify as leoBeautify
    check_roundtrip = leoBeautify.check_roundtrip
    check_roundtrip(
        "print \\\n"
        " ('abc')\n",
        expect_failure=True)

Something similar should be added to test_tokenize.py.

Here is the testing code from leoBeautify.py. As you will see, the code runs *stricter* tests than those in test_tokenize.py:

    import io
    import tokenize
    import unittest

    def check_roundtrip(f, expect_failure=False):
        """
        Called from unit tests in unitTest.leo.

        Test python's tokenize.untokenize method and Leo's Untokenize class.
        """
        check_python_roundtrip(f, expect_failure)
        check_leo_roundtrip(f)

    def check_leo_roundtrip(code, trace=False):
        """Check Leo's Untokenize class."""
        # pylint: disable=import-self
        import leo.core.leoBeautify as leoBeautify
        assert isinstance(code, str), repr(code)
        tokens = tokenize.tokenize(io.BytesIO(code.encode('utf-8')).readline)
        u = leoBeautify.Untokenize(code, trace=trace)
        results = u.untokenize(tokens)
        unittest.TestCase().assertEqual(code, results)

    def check_python_roundtrip(f, expect_failure):
        """
        This is tokenize.TestRoundtrip.check_roundtrip, without the wretched fudges.
        """
        if isinstance(f, str):
            code = f.encode('utf-8')
        else:
            code = f.read()
            f.close()
        readline = iter(code.splitlines(keepends=True)).__next__
        tokens = list(tokenize.tokenize(readline))
        result_bytes = tokenize.untokenize(tokens)
        readline5 = iter(result_bytes.splitlines(keepends=True)).__next__
        result_tokens = list(tokenize.tokenize(readline5))
        if expect_failure:
            unittest.TestCase().assertNotEqual(result_tokens, tokens)
        else:
            unittest.TestCase().assertEqual(result_tokens, tokens)
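As a usage sketch, the helpers can be driven directly. The inputs here are my own examples, not tests lifted from test_tokenize.py:

    # Hypothetical driver for the helpers above.
    # A plain unicode-bearing line should round-trip exactly...
    check_roundtrip("s = 'häßlich'  # ünïcödé in source and comment\n")
    # ...while the issue-38663 case is expected to fail python's round trip.
    check_roundtrip(
        "print \\\n"
        " ('abc')\n",
        expect_failure=True)

The point of the strict comparison in check_python_roundtrip is that it includes token positions and the physical line. Comparing only (type, string) pairs, which is essentially what test_tokenize.py's check_roundtrip does, hides the difference.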
*Summary*

I've done the heavy lifting on issue 38663 <https://bugs.python.org/issue38663>. Python devs should handle the details of testing and packaging.

Leo's tokenizing code in leoBeautify.py can use the new code immediately, without waiting for python to improve tokenize.untokenize.

Edward