[Python-Dev] Commit changelog: issue number and merges
Hi,

Commit changelogs are important to understand why the code was changed. I regularly use hg blame to find which commit introduced a particular line of code, and I am always happy if I can find an issue number, because the issue usually contains the whole story. And since the migration to Mercurial, we also have a great tool that adds a comment to an issue if the changelog contains an issue number (e.g. a changelog starting with "Issue #11: ..."). So if someone watches an issue (is in the nosy list), (s)he will be notified that a related commit was pushed. It is not exactly something new: we already did that with Subversion, except that today it is more automatic.

I noticed that some recent commits don't contain the issue number: please try to always prefix your changelog with the issue number. It is not mandatory, but it helps me when I dig through the Python history.

--

For merge commits: many developers just write "merge" or "merge 3.1". I have to go to the parent commit (and sometimes to the grandparent, 3.1->3.2->3.3) to learn more about the commit. Would it be possible to repeat the changelog of the original commit in the merge commits? The svnmerge tool prepared a nice changelog containing the changelog of all pending commits, even when a commit was blocked.

For a merge commit, I copy/paste the changelog of the original commit and I add a "(Merge 3.1)" prefix. I prefer to add a prefix explicitly because it is not easy to notice that it is a merge commit in a python-checkins email or in the history of hg.python.org.

We may need new tools to help the process.

--

Use cases needing better changelogs:

- "All changes" section of a buildbot build
- hg blame (or just hg log)

Victor
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
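The prefix convention asked for above is easy to check mechanically. What follows is only a minimal sketch, not the tracker hook itself: the regular expression and the helper name `issue_number` are assumptions, based solely on the "Issue #11: ..." example in the message.

```python
import re

# Assumed pattern: the changelog starts with "Issue #<digits>:",
# as in the "Issue #11: ..." example from the message.
ISSUE_PREFIX = re.compile(r"^Issue #(\d+):")


def issue_number(changelog):
    """Return the issue number a changelog starts with, or None."""
    match = ISSUE_PREFIX.match(changelog)
    return int(match.group(1)) if match else None


print(issue_number("Issue #11277: Remove useless test from test_zlib."))  # 11277
print(issue_number("merge 3.1"))                                          # None
```

A check like this could flag changelogs such as a bare "merge 3.1", which carry no issue number for tools to pick up.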
Re: [Python-Dev] Commit changelog: issue number and merges
On Mon, 09 May 2011 12:32:48 +0200, Victor Stinner victor.stin...@haypocalc.com wrote:
> For merge commits: many developers just write "merge" or "merge 3.1". I have to go to the parent commit (and sometimes to the grandparent, 3.1->3.2->3.3) to learn more about the commit. Would it be possible to repeat the changelog of the original commit in the merge commits? The svnmerge tool prepared a nice changelog containing the changelog of all pending commits, even when a commit was blocked.
>
> For a merge commit, I copy/paste the changelog of the original commit and I add a "(Merge 3.1)" prefix. I prefer to add a prefix explicitly because it is not easy to notice that it is a merge commit in a python-checkins email or in the history of hg.python.org.

+1. What I do is, in the edit window for the commit message, I pull in .hg/last-message.txt, and just type 'Merge' in front of my previous first line. I don't add the merge-from number, because I figure if you know which branch you are looking at you know which branch the merge came from, given that there is a strict progression.

--
R. David Murray http://www.bitdance.com
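The two prefixing habits described in this exchange can be written down as a small helper. This is a hedged sketch: `merge_message` is a hypothetical name, and the original text is passed in as a string (it could be what was read from .hg/last-message.txt, but no file handling is shown or assumed here).

```python
def merge_message(original, branch=None):
    """Build a merge changelog by prefixing the original commit's changelog.

    With a branch, use the "(Merge 3.1)" style; without one, just "Merge".
    """
    prefix = "(Merge %s) " % branch if branch else "Merge "
    return prefix + original


original = "Issue #11277: Remove useless test from test_zlib."
print(merge_message(original, "3.1"))
# (Merge 3.1) Issue #11277: Remove useless test from test_zlib.
print(merge_message(original))
# Merge Issue #11277: Remove useless test from test_zlib.
```

Either way the issue number stays in the merge changelog, although it is no longer at the very start of the message.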
Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib.
Can you clarify (preferably in the commit message as well) exactly *why* these largefile tests are useless? For example, is there another test that covers this already?

-jJ

On 5/7/11, nadeem.vawda python-check...@python.org wrote:
> http://hg.python.org/cpython/rev/201dcfc56e86
> changeset:   69886:201dcfc56e86
> branch:      2.7
> parent:      69881:a0147a1f1776
> user:        Nadeem Vawda nadeem.va...@gmail.com
> date:        Sat May 07 11:28:03 2011 +0200
> summary:
>   Issue #11277: Remove useless test from test_zlib.
>
> files:
>   Lib/test/test_zlib.py |  42 ---
>   1 files changed, 0 insertions(+), 42 deletions(-)
>
> diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py
> --- a/Lib/test/test_zlib.py
> +++ b/Lib/test/test_zlib.py
> @@ -72,47 +72,6 @@
>                           zlib.crc32('spam', (2**31)))
>
>
> -# Issue #11277 - check that inputs of 2 GB (or 1 GB on 32 bits system) are
> -# handled correctly. Be aware of issues #1202. We cannot test a buffer of 4 GB
> -# or more (#8650, #8651 and #10276), because the zlib stores the buffer size
> -# into an int.
> -class ChecksumBigBufferTestCase(unittest.TestCase):
> -    if sys.maxsize > _4G:
> -        # (64 bits system) crc32() and adler32() stores the buffer size into an
> -        # int, the maximum filesize is INT_MAX (0x7FFFFFFF)
> -        filesize = 0x7FFFFFFF
> -    else:
> -        # (32 bits system) On a 32 bits OS, a process cannot usually address
> -        # more than 2 GB, so test only 1 GB
> -        filesize = _1G
> -
> -    @unittest.skipUnless(mmap, "mmap() is not available.")
> -    def test_big_buffer(self):
> -        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> -            requires('largefile',
> -                     'test requires %s bytes and a long time to run' %
> -                     str(self.filesize))
> -        try:
> -            with open(TESTFN, "wb+") as f:
> -                f.seek(self.filesize-4)
> -                f.write("asdf")
> -                f.flush()
> -                m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> -                try:
> -                    if sys.maxsize > _4G:
> -                        self.assertEqual(zlib.crc32(m), 0x709418e7)
> -                        self.assertEqual(zlib.adler32(m), -2072837729)
> -                    else:
> -                        self.assertEqual(zlib.crc32(m), 722071057)
> -                        self.assertEqual(zlib.adler32(m), -1002962529)
> -                finally:
> -                    m.close()
> -        except (IOError, OverflowError):
> -            raise unittest.SkipTest("filesystem doesn't have largefile support")
> -        finally:
> -            unlink(TESTFN)
> -
> -
>  class ExceptionTestCase(unittest.TestCase):
>      # make sure we generate some expected errors
>      def test_badlevel(self):
> @@ -595,7 +554,6 @@
>  def test_main():
>      run_unittest(
>          ChecksumTestCase,
> -        ChecksumBigBufferTestCase,
>          ExceptionTestCase,
>          CompressTestCase,
>          CompressObjectTestCase
>
> --
> Repository URL: http://hg.python.org/cpython
[Python-Dev] more timely detection of unbound locals
Hi all,

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
    print(x)
    x = 1
    print(x)
foo()

Will throw:

UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact.

IIUC, the reason it behaves this way is that the symbol table logic goes over the code before the code generation runs, sees the assignment 'x = 1' and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST for all loads of 'x' in 'foo', even though 'x' is actually bound locally after the first print. When the bytecode is run, since it's LOAD_FAST and no store was made into the local 'x', ceval.c then throws the exception.

On first sight, it's possible to signal that 'x' truly becomes local only after it's bound in the scope (and before that LOAD_NAME can be generated for it instead of LOAD_FAST). To do this, some modifications to the symbol table creation and usage are required, because we can no longer say "x is local in this block", but rather should attach scope information to each instance of "x". This has some overhead, but it's only at the compilation stage so it shouldn't have a real effect on the runtime of Python code. This is also less convenient and "clean" than the current approach - this is why I'm wondering whether the behavior is an artifact of the implementation.

Would it not be worth making Python's behavior more expected in this case, at the cost of some implementation complexity? What are the cons to making such a change? At least judging by the number of people getting confused by it, maybe it's in line with the zen of Python to behave more explicitly here.

Thanks in advance,
Eli

(*) Variation of this FAQ: http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value
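The compile-time decision described in this message can be observed without reading bytecode. The sketch below inspects `co_varnames` rather than disassembling, because opcode names differ between CPython versions; the exact wording of the exception message also varies, so it is printed rather than assumed.

```python
x = 5


def foo():
    print(x)   # fails here: x is local to foo but not bound yet
    x = 1
    print(x)


# The assignment anywhere in the body made x a local of foo at compile time.
print(foo.__code__.co_varnames)   # ('x',)

try:
    foo()
except UnboundLocalError as exc:
    print("caught:", exc)
```

Removing the `x = 1` line empties `co_varnames`, and the remaining `print(x)` calls then read the global.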
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?)

Is this ASCII restriction (as opposed to even UTF8) really needed?

Or are you just saying that we need to create an ASCII name for passing to C?

-jJ

On 5/7/11, victor.stinner python-check...@python.org wrote:
> http://hg.python.org/cpython/rev/eb003c3d1770
> changeset:   69889:eb003c3d1770
> user:        Victor Stinner victor.stin...@haypocalc.com
> date:        Sat May 07 12:46:05 2011 +0200
> summary:
>   _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
>
> The name must be encodable to ASCII because dynamic module must have a
> function called PyInit_NAME, they are written in C, and the C language
> doesn't accept non-ASCII identifiers.
>
> files:
>   Python/importdl.c |  40 +-
>   1 files changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/Python/importdl.c b/Python/importdl.c
> --- a/Python/importdl.c
> +++ b/Python/importdl.c
> @@ -20,31 +20,36 @@
>                                             const char *pathname, FILE *fp);
>  #endif
>
> -/* name should be ASCII only because the C language doesn't accept non-ASCII
> -   identifiers, and dynamic modules are written in C. */
> -
>  PyObject *
>  _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp)
>  {
> -    PyObject *m;
> +    PyObject *m = NULL;
>  #ifndef MS_WINDOWS
>      PyObject *pathbytes;
>  #endif
> +    PyObject *nameascii;
>      char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext;
>      dl_funcptr p0;
>      PyObject* (*p)(void);
>      struct PyModuleDef *def;
>
> -    namestr = _PyUnicode_AsString(name);
> -    if (namestr == NULL)
> -        return NULL;
> -
>      m = _PyImport_FindExtensionObject(name, path);
>      if (m != NULL) {
>          Py_INCREF(m);
>          return m;
>      }
>
> +    /* name must be encodable to ASCII because dynamic module must have a
> +       function called PyInit_NAME, they are written in C, and the C language
> +       doesn't accept non-ASCII identifiers. */
> +    nameascii = PyUnicode_AsEncodedString(name, "ascii", NULL);
> +    if (nameascii == NULL)
> +        return NULL;
> +
> +    namestr = PyBytes_AS_STRING(nameascii);
> +    if (namestr == NULL)
> +        goto error;
> +
>      lastdot = strrchr(namestr, '.');
>      if (lastdot == NULL) {
>          packagecontext = NULL;
> @@ -60,34 +65,33 @@
>  #else
>      pathbytes = PyUnicode_EncodeFSDefault(path);
>      if (pathbytes == NULL)
> -        return NULL;
> +        goto error;
>      p0 = _PyImport_GetDynLoadFunc(shortname,
>                                    PyBytes_AS_STRING(pathbytes), fp);
>      Py_DECREF(pathbytes);
>  #endif
>      p = (PyObject*(*)(void))p0;
>      if (PyErr_Occurred())
> -        return NULL;
> +        goto error;
>      if (p == NULL) {
>          PyErr_Format(PyExc_ImportError,
>                       "dynamic module does not define init function (PyInit_%s)",
>                       shortname);
> -        return NULL;
> +        goto error;
>      }
>      oldcontext = _Py_PackageContext;
>      _Py_PackageContext = packagecontext;
>      m = (*p)();
>      _Py_PackageContext = oldcontext;
>      if (m == NULL)
> -        return NULL;
> +        goto error;
>
>      if (PyErr_Occurred()) {
> -        Py_DECREF(m);
>          PyErr_Format(PyExc_SystemError,
>                       "initialization of %s raised unreported exception",
>                       shortname);
> -        return NULL;
> +        goto error;
>      }
>
>      /* Remember pointer to module init function. */
> @@ -101,12 +105,18 @@
>      Py_INCREF(path);
>
>      if (_PyImport_FixupExtensionObject(m, name, path) < 0)
> -        return NULL;
> +        goto error;
>      if (Py_VerboseFlag)
>          PySys_FormatStderr(
>              "import %U # dynamically loaded from %R\n",
>              name, path);
> +    Py_DECREF(nameascii);
>      return m;
> +
> +error:
> +    Py_DECREF(nameascii);
> +    Py_XDECREF(m);
> +    return NULL;
>  }
>
>  #endif /* HAVE_DYNAMIC_LOADING */
>
> --
> Repository URL: http://hg.python.org/cpython
Re: [Python-Dev] Commit changelog: issue number and merges
On Mon, May 09, 2011 at 08:40:03AM -0400, R. David Murray wrote:
> +1. What I do is, in the edit window for the commit message, I pull in .hg/last-message.txt, and just type 'Merge' in front of my previous

Thanks for this tip. I shall start following this one too.

--
Senthil
Re: [Python-Dev] more timely detection of unbound locals
On Mon, 9 May 2011, Eli Bendersky wrote:
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
>
> Will throw:
>
> UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact.
>
> IIUC, the reason it behaves this way is that the symbol table logic goes over the code before the code generation runs, sees the assignment 'x = 1' and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST for all loads of 'x' in 'foo', even though 'x' is actually bound locally after the first print. When the bytecode is run, since it's LOAD_FAST and no store was made into the local 'x', ceval.c then throws the exception.
>
> On first sight, it's possible to signal that 'x' truly becomes local only after it's bound in the scope (and before that LOAD_NAME can be generated for it instead of LOAD_FAST). To do this, some modifications to the symbol table creation and usage are required, because we can no longer say "x is local in this block", but rather should attach scope information to each instance of "x". This has some overhead, but it's only at the compilation stage so it shouldn't have a real effect on the runtime of Python code. This is also less convenient and "clean" than the current approach - this is why I'm wondering whether the behavior is an artifact of the implementation.

x = 5

def foo ():
    print (x)
    if bar ():
        x = 1
    print (x)

Isaac Morland                   CSCF Web Guru
DC 2554C, x36650                WWW Software Specialist
Re: [Python-Dev] more timely detection of unbound locals
Eli Bendersky, 09.05.2011 14:56:
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
>
> Will throw:
>
> UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact.

Well, basically any compiler these days can detect that a variable is being used before assignment, or at least that this is possibly the case, depending on prior branching.

ISTM that your suggestion is to let x refer to the outer x up to the assignment and to the inner x from that point on. IMHO, that's much worse than the current behaviour and potentially impractical due to conditional assignments.

However, it's also a semantic change to reject code with unbound locals at compile time, as the specific code in question may actually be unreachable at runtime. This makes me think that it would be best to discuss this on the python-ideas list first.

If nothing else, I'd like to see a discussion on this behaviour being an implementation detail of CPython or a feature of the Python language.

Stefan
Re: [Python-Dev] more timely detection of unbound locals
On May 9, 2011 6:59 AM, Eli Bendersky eli...@gmail.com wrote:
> Hi all,
>
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
>
> Will throw:
>
> UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact.
>
> IIUC, the reason it behaves this way is that the symbol table logic goes over the code before the code generation runs, sees the assignment 'x = 1' and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST for all loads of 'x' in 'foo', even though 'x' is actually bound locally after the first print. When the bytecode is run, since it's LOAD_FAST and no store was made into the local 'x', ceval.c then throws the exception.
>
> On first sight, it's possible to signal that 'x' truly becomes local only after it's bound in the scope (and before that LOAD_NAME can be generated for it instead of LOAD_FAST). To do this, some modifications to the symbol table creation and usage are required, because we can no longer say "x is local in this block", but rather should attach scope information to each instance of "x". This has some overhead, but it's only at the compilation stage so it shouldn't have a real effect on the runtime of Python code. This is also less convenient and "clean" than the current approach - this is why I'm wondering whether the behavior is an artifact of the implementation.
>
> Would it not be worth to make Python's behavior more expected in this case, at the cost of some implementation complexity? What are the cons to making such a change? At least judging by the amount of people getting confused by it, maybe it's in line with the zen of Python to behave more explicitly here.

This is about mixing scopes for the same name in the same block, right? Perhaps a more specific error would be enough, unless there is a good use case for having that mixed scope for the name.

-eric

> Thanks in advance,
> Eli
>
> (*) Variation of this FAQ: http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value
Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib.
On Mon, May 9, 2011 at 2:53 PM, Jim Jewett jimjjew...@gmail.com wrote:
> Can you clarify (preferably in the commit message as well) exactly *why* these largefile tests are useless? For example, is there another test that covers this already?

Ah, sorry about that. It was discussed on the tracker issue, but I guess I can't expect people to read through 90+ messages to figure it out :P

The short version is that it was supposed to test 4GB+ inputs, but in 2.7, the functions being tested don't accept inputs that large.

The details:

The test was originally intended to catch the case where crc32() or adler32() would get a buffer of >=4GB, and then silently truncate the buffer size and produce an incorrect result (issue10276). It had been written for 3.x, and then backported to 2.7. However, in 2.7, zlibmodule.c doesn't define PY_SSIZE_T_CLEAN, so passing in a buffer of >=2GB raises an OverflowError (see issue8651). This means that it is impossible to trigger the bug in question on 2.7, making the test pointless.

Of course, the code that was deleted tests with an input sized 2GB-1 or 1GB, rather than 4GB (the size used in 3.x). When the test was backported, the size of the input was reduced, to avoid triggering an OverflowError. At the time, no-one realized that this also would not trigger the bug being tested for; it only came to light when the test started crashing for unrelated reasons (issue11277).

Cheers,
Nadeem
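To make "silently truncate the buffer size" concrete, here is a small arithmetic sketch. It assumes the length was narrowed to a 32-bit unsigned integer; that width is an assumption for illustration (the thread only says the size was stored into an int), and `truncated_length` is a hypothetical helper, not anything from zlibmodule.c.

```python
def truncated_length(size, bits=32):
    """Length that survives narrowing `size` to an unsigned `bits`-bit integer."""
    return size & ((1 << bits) - 1)


_4G = 1 << 32

# Assumed 32-bit narrowing: a 4 GB + 5 byte buffer would be seen as 5 bytes,
# so the checksum would cover only a tiny part of the input.
print(truncated_length(_4G + 5))        # 5

# An input of 2 GB - 1 (the size used by the deleted 2.7 test) is unchanged,
# which is why that size cannot expose this kind of truncation.
print(truncated_length((1 << 31) - 1) == (1 << 31) - 1)   # True

# Scaled-down model with an 8-bit length field: 300 bytes look like 44.
print(truncated_length(300, bits=8))    # 44
```

Under this assumption the reduced input size passes through intact, matching the observation that the backported test could not trigger the bug.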
Re: [Python-Dev] Commit changelog: issue number and merges
2011/5/9 Victor Stinner victor.stin...@haypocalc.com:
> Hi,
>
> Commit changelogs are important to understand why the code was changed. I regularly use hg blame to search which commit introduced a particular line of code, and I am always happy if I can find an issue number because it usually contains the whole story. And since the migration to Mercurial, we have also a great tool adding a comment to an issue if the changelog contains an issue number (e.g. changelog starting with "Issue #11: ..."). So if someone watches an issue (is in the nosy list), (s)he will be notified that a related commit was pushed. It is not exactly something new: we already do that with Subversion except that today it is more automatic.
>
> I noticed that some recent commits don't contain the issue number: please try to always prefix your changelog with the issue number. It is not mandatory, but it helps me when I dig the Python history.
>
> --
>
> For merge commits: many developers just write "merge" or "merge 3.1". I have to go to the parent commit (and sometimes to the grandparent, 3.1->3.2->3.3) to learn more about the commit.

I thought the whole point of merging was that you brought a changeset from one branch to another. This is why I just write "merge": otherwise you're technically duplicating information that is pulled onto the branch by merging.

It seems like something that should be solved by tools, like displaying a visual graph indicating what is merged (like Bazaar).

--
Regards,
Benjamin
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
On Monday, 09 May 2011 at 09:00 -0400, Jim Jewett wrote:
> Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?)

C and C++ identifiers are restricted to ASCII. I don't know about Fortran or Java. Is it possible to write a CPython extension module in Java or Fortran? (My change doesn't concern Jython: it's an implementation detail of dynamic modules in CPython.)

> Is this ASCII restriction (as opposed to even UTF8) really needed?

I prefer to explicitly limit module names of dynamic modules to ASCII. If we decide to extend the support to something other than ASCII, we will need a working module to test it, and maybe also a test.

> Or are you just saying that we need to create an ASCII name for passing to C?

You pass a Unicode module name to import ("import hé" or "__import__('hé')"), and Python encodes the name to ASCII if it is a dynamic module. It is still possible to use non-ASCII module names, but only for modules written in Python.

Victor
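The rule Victor describes can be modelled in a few lines of Python. This is a sketch of the behaviour as discussed in this thread, not the importer's code: `init_function_name` is a hypothetical helper, and it assumes that `shortname` in the C diff is the part of the name after the last dot, which the diff context suggests but does not show in full.

```python
def init_function_name(module_name):
    """Name of the init function looked up for a dynamic module.

    Follows the logic visible in the diff: encode the name to ASCII,
    then keep what follows the last dot (assumed meaning of `shortname`).
    """
    ascii_name = module_name.encode("ascii")   # UnicodeEncodeError if non-ASCII
    shortname = ascii_name.rpartition(b".")[2]
    return "PyInit_" + shortname.decode("ascii")


print(init_function_name("spam"))        # PyInit_spam
print(init_function_name("pkg.spam"))    # PyInit_spam

try:
    init_function_name("hé")
except UnicodeEncodeError as exc:
    print("rejected:", exc.reason)
```

In this model a non-ASCII name fails at the encoding step, before any symbol lookup would take place.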
Re: [Python-Dev] Commit changelog: issue number and merges
On Monday, 09 May 2011 at 09:08 -0500, Benjamin Peterson wrote:
> It seems like something that should be solved by tools like a display visual graph indicating what is merged. (like Bazaar)

Yeah, we could fix buildbot, the hg.python.org website, improve hg log, and all other tools using Mercurial. But until then, I would prefer to duplicate the information.

Victor
Re: [Python-Dev] Borrowed and Stolen References in API
On Fri, May 6, 2011 at 8:27 PM, Antoine Pitrou solip...@pitrou.net wrote:
> On Fri, 06 May 2011 13:28:11 +1200 Greg Ewing greg.ew...@canterbury.ac.nz wrote:
>> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:
>>> This is not always true, for example when the item is already present in the dict.
>>> It's not important to know what the function does to the object,
>>> Only the action on the reference is relevant.
>>
>> Yes, that's the whole point. When using a function, what you need to know is whether it borrows or steals a reference.
>
> Doesn't borrow mean the same as steal in that context? If an API borrows a reference, I expect it to take it from me.

Input parameter, borrowed or new reference: caller retains ownership and must still decref item

Input parameter, stolen reference: caller transfers ownership and must NOT decref item (or must incref before call to guarantee lifecycle if planning to continue using the object after the call)

Output parameter or return value, borrowed reference: caller does NOT receive ownership and does not need to decref item, but needs to be careful of lifecycle (and promote to a full reference with incref if the borrowed reference may outlive the original)

Output parameter or return value, stolen or new reference: caller receives ownership and must decref item

One interesting aspect is that from the caller's point of view, a *new* reference to the relevant object behaves like a borrowed reference for input parameters, but like a stolen reference for output parameters and return values.

It is typically the converse cases (stolen reference to an input parameter, borrowed reference to an output parameter or return value) that require special attention on the caller's part.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] more timely detection of unbound locals
Eli Bendersky wrote:
> Hi all,
>
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
>
> Will throw:
>
> UnboundLocalError: local variable 'x' referenced before assignment

I think part of the problem is that UnboundLocalError is a jargon name, while its predecessor NameError (used up to Python 1.5) is far more intuitively obvious.

> On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact.
[...]
> Would it not be worth to make Python's behavior more expected in this case, at the cost of some implementation complexity? What are the cons to making such a change? At least judging by the amount of people getting confused by it, maybe it's in line with the zen of Python to behave more explicitly here.

I think you are making an unwarranted assumption about what is "more expected". I presume you are thinking that the expected behaviour is that foo() should:

print global x (5)
assign 1 to local x
print local x (1)

If we implemented this change, there would be no more questions about UnboundLocalError, but instead there would be lots of questions like "why is it that globals revert to their old value after I change them in a function?".

--
Steven
Re: [Python-Dev] Problems with regrtest and with logging
On Sat, May 7, 2011 at 3:51 AM, Éric Araujo mer...@netwok.org wrote:
> regrtest helpfully reports when a test leaves the environment unclean (sys.path, os.environ, logging._handlerList), but I think the implementation is buggy: it compares object identity and then value. Why is comparing identity useful? I'd just use ==. It makes writing cleanup code easier (just use addCleanup(setattr, obj, 'attr', copy(obj.attr))).

Because changing the identity of any of those global state attributes that regrtest monitors is itself suggestive of a bug.

When it comes to containers, identity matters at least as much as value does (and sometimes more so - e.g. sys.modules). Replacing those global containers with new ones isn't guaranteed to work, as they may be cached in various places rather than always retrieved fresh from the relevant module namespace. Modifying them in place, on the other hand, does the right thing even in the presence of cached references.

A comment to that effect may be a useful addition to regrtest, as I expect others may have similar questions about those identity checks in the future. (It may even be a useful addition to the documentation, but I have no idea where it could be sensibly included).

Also, don't be surprised if wholesale cleanup like that isn't completely reliable - it's far, far better if the test case understands the changes it is making (even indirectly) and explicitly reverses them. Save-and-restore should be a last resort technique (although context managers that are designed for more general use, such as warnings.catch_warnings(), use save-and-restore by necessity, since they have no control over the body of the relevant with statements).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
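The point about cached references can be shown with a toy example. This is a sketch only: `mod` is a hypothetical stand-in for a module holding a global container such as sys.path, and nothing here is regrtest's own code.

```python
import types

# Hypothetical stand-in for a module with a global container (like sys.path).
mod = types.SimpleNamespace(handlers=["default"])

cached = mod.handlers          # some other code keeps its own reference
saved = list(mod.handlers)     # a test saves a copy of the state...
mod.handlers.append("temp")    # ...and then modifies the global state

# Restore by rebinding: equal by value, but a different object.
mod.handlers = saved
print(mod.handlers == ["default"])   # True
print(cached)                        # ['default', 'temp'], the cached list is still dirty
print(mod.handlers is cached)        # False, which is what an identity check catches

# Restore in place instead: every holder of the original list sees the cleanup.
mod.handlers = cached
mod.handlers[:] = saved
print(cached)                        # ['default']
print(mod.handlers is cached)        # True
```

A value-only comparison would have accepted the first restore even though the cached reference still carried the leftover entry.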
Re: [Python-Dev] more timely detection of unbound locals
> I think you are making an unwarranted assumption about what is "more expected". I presume you are thinking that the expected behaviour is that foo() should:
>
> print global x (5)
> assign 1 to local x
> print local x (1)
>
> If we implemented this change, there would be no more questions about UnboundLocalError, but instead there would be lots of questions like "why is it that globals revert to their old value after I change them in a function?".

True, but this is less confusing and follows the rules in a more straightforward way. x = 1 without a 'global x' assigns a local x; this makes sense and is similar to what happens in C, where an inner declaration temporarily shadows a global one.

Eli
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
On Mon, May 9, 2011 at 11:00 PM, Jim Jewett jimjjew...@gmail.com wrote:
> Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?)

The extension module that interfaces them to CPython will be written in C, or something that can export a C-compatible library interface (after reading in the Python C API headers).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] more timely detection of unbound locals
> x = 5
>
> def foo ():
>     print (x)
>     if bar ():
>         x = 1
>     print (x)

I wish you'd annotate this code sample; what do you intend it to demonstrate?

It probably shows the original complaint even more strongly. As for being a problem with the suggested solution, I suppose you're right, although it doesn't make it much different. Still, before a *possible* assignment to 'x', it should be loaded as LOAD_NAME since it was surely not bound as local, yet.

Eli
Re: [Python-Dev] more timely detection of unbound locals
On Tue, May 10, 2011 at 1:01 AM, Eli Bendersky eli...@gmail.com wrote:

I think you are making an unwarranted assumption about what is more expected. I presume you are thinking that the expected behaviour is that foo() should:

    print global x (5)
    assign 1 to local x
    print local x (1)

If we implemented this change, there would be no more questions about UnboundLocalError, but instead there would be lots of questions like "why is it that globals revert to their old value after I change them in a function?".

True, but this is less confusing and follows the rules in a more straightforward way. x = 1 without a 'global x' assigns a local x, this makes sense and is similar to what happens in C where an inner declaration temporarily shadows a global one.

However, since flow control constructs in Python don't create new scopes (unlike C/C++), you run into a fundamental problem with cases like the one Isaac posted, or even nastier ones like the following:

    def f():
        if bar():
            fill = 1
        else:
            fiil = 2
        print(fill) # Q: What does this do when bool(bar()) is False?

Since we want to make the decision categorically at compile-time, the simplest, least-confusing option is to say "assignment makes a variable name local, referencing it before the first assignment is now an error". I don't know of anyone that particularly *likes* UnboundLocalError, but it's better than letting errors like the one above pass silently. (It obviously doesn't trap *all* typo-related errors, but it at least lets you reason sanely about name bindings)

On the reasoning-sanely front, closures likely present a more compelling argument:

    def f():
        def g():
            print(x) # We want this to refer to the closure in f(), thanks
        x = 1
        return g

UnboundLocalError is really about aligning the rules for the current scope with those for references from nested scopes (i.e. x is a local variable of f, whether it is referenced from f's local scope, or any nested scope within f)

Cheers, Nick.
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
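The two cases in this message can be tried directly. The following is a minimal sketch, not code from the thread: the names `outcome` and `closure_value` are made up for illustration, and `g` returns `x` instead of printing it so the result can be inspected.

```python
x = 5


def foo():
    # The assignment further down makes x local to foo for the whole body,
    # so this first lookup fails instead of reading the global x.
    print(x)
    x = 1
    print(x)


def f():
    def g():
        # Refers to f's local x, even though x is bound after g is defined.
        return x
    x = 1
    return g


try:
    foo()
except UnboundLocalError as exc:
    outcome = type(exc).__name__
else:
    outcome = "no error"

closure_value = f()()
```

The same rule ("assignment anywhere in the function makes the name local") explains both results: `foo` fails on its first line, and `g` sees the value bound later in `f`.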
Re: [Python-Dev] more timely detection of unbound locals
On Tue, May 10, 2011 at 1:06 AM, Eli Bendersky eli...@gmail.com wrote: It probably shows the original complaint even more strongly. As for being a problem with the suggested solution, I suppose you're right, although it doesn't make it much different. Still, before a *possible* assignment to 'x', it should be loaded as LOAD_NAME since it was surely not bound as local, yet. Yeah, I've decided I'm happier with the closure based arguments than the conditional statement related ones. Assignments create local variables is a relatively simple rule to reason about, and is equally valid for the current scope and for any nested scopes. The symtable analysis for nested scopes is ordering independent (and can't be changed for backwards compatibility reasons if nothing else), and UnboundLocalError is a natural outgrowth of applying those semantics to the current scope as well. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
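The ordering independence mentioned here can be observed with the standard library `symtable` module. This is only an illustrative sketch; the function names inside `SOURCE` are hypothetical, and the source string is analysed, never executed.

```python
import symtable

# Source to analyse only; the function names are made up for this example.
SOURCE = '''
def reads_then_assigns():
    print(x)
    x = 1

def only_reads():
    print(x)
'''

module_table = symtable.symtable(SOURCE, "<example>", "exec")
scopes = {child.get_name(): child for child in module_table.get_children()}

# x is classified per function, regardless of where the assignment appears.
assigned_is_local = scopes["reads_then_assigns"].lookup("x").is_local()
read_only_is_local = scopes["only_reads"].lookup("x").is_local()
```

In the first function `x` is reported as local even though the read comes before the assignment; in the second it is not local at all.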
Re: [Python-Dev] Commit changelog: issue number and merges
On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson benja...@python.org wrote: I thought the whole point of merging was that you brought a changeset from one branch to another. This why I just write merge because otherwise you're technically duplicating information that is pulled onto the branch by merging. No it isn't. The commit message isn't pulled into the new branch. It seems like something that should be solved by tools like a display visual graph indicating what is merged. (like Bazaar) You'd need some extension to hg log that would show the original commit message for the first changeset in the merge line in order to fix this. I doubt that is going to happen. Note that saying just 'merge' makes perfect sense when you are pulling in a whole group of changesets in order to synchronize two branches. But if you are applying a single changeset to multiple branches, as we often do in our workflow, then I think duplicating the commit message is (1) easy to do and (2) very helpful when looking at hg log output. -- R. David Murray http://www.bitdance.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] more timely detection of unbound locals
On Mon, 9 May 2011, Eli Bendersky wrote:

x = 5
def foo ():
    print (x)
    if bar ():
        x = 1
    print (x)

I wish you'd annotate this code sample, what do you intend it to demonstrate?

It probably shows the original complaint even more strongly. As for being a problem with the suggested solution, I suppose you're right, although it doesn't make it much different. Still, before a *possible* assignment to 'x', it should be loaded as LOAD_NAME since it was surely not bound as local, yet.

Extrapolating from your suggestion, you're saying before a *possible* assignment it will be treated as global, and after a *possible* assignment it will be treated as local? But surely:

print (x)
if False:
    x = 1
print (x)

should always print the same thing twice (in the absence of actions taken by other threads)! Replace False by something that is usually (but not always) True, and print (x) by something that actually does something, and you had best put on your helmet because it's going to be a fun ride. But I won't be on it.

The idea that the same name within the same scope always refers to the same value is an idea from functional programming and not part of Python; but surely the same name within the same scope should at least always refer to the same variable!

If something is to be done here, it occurs to me that the same parser that decides that the initial reference to x should use the local x could conceivably issue an error right away - "local variable can never be assigned before use" - rather than waiting until runtime. But even if I haven't confused myself about the possibility of this raising a false positive (and it certainly could in the presence of dead code), it wouldn't catch cases of conditional premature use of a local variable. I think in those cases people would still ask the same questions they do with the existing implementation.
Isaac Morland                CSCF Web Guru
DC 2554C, x36650             WWW Software Specialist
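For reference, current behaviour already treats the name as one variable throughout the scope, even when the assignment sits in a branch that can never run. A minimal sketch, with `show_twice` and `raised` as hypothetical names:

```python
x = 5


def show_twice():
    first = x      # x is local to show_twice, so this lookup fails
    if False:
        x = 1      # never runs, but still makes x local at compile time
    return first, x


try:
    show_twice()
    raised = False
except UnboundLocalError:
    raised = True
```

So under the existing rules the two references to `x` never refer to different variables; the cost is the runtime error on the first one.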
Re: [Python-Dev] Commit changelog: issue number and merges
Hi, Le 09/05/2011 16:08, Benjamin Peterson a écrit : 2011/5/9 Victor Stinner victor.stin...@haypocalc.com: For merge commits: many developers just write merge or merge 3.1. I have to go to the parent commit (and something to the grandparent, 3.1-3.2-3.3) to learn more about the commit. I follow conventions I’ve seen elsewhere (maybe Mercurial itself): I use “Branch merge” when I merge anonymous branches on the same named branch, and “Merge x.y” for forward-porting across named branches. I also tend to do more than one commit before merging. It would not be very easy with my current toolchain to get the commit message(s) to insert into the new message, and I think it’s not necessary. I thought the whole point of merging was that you brought a changeset from one branch to another. This why I just write merge because otherwise you're technically duplicating information that is pulled onto the branch by merging. +1. No interest in manually duplicating available information. Le 09/05/2011 17:44, R. David Murray a écrit : No it isn't. The commit message isn't pulled into the new branch. Sorry, your terminology does not make sense. If you mean that the commit message is not reused in the new commit after the merge, it’s true. However, the commit message with the relevant information is available as part of the changesets that have been pulled and merged. Regards ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Problems with regrtest and with logging
Hi, Thanks for the help. I didn’t know about handler.close. (By which I mean that I used logging without re-reading its documentation, which is a testimony to its usability :) The cases you refer to seem to be _set_logger in packaging/run.py (which appears not to be used at all - there appear to be no other references to it in the code), Yep, probably dead code. I think that an handler should be defined only once, in the “if __name__ == '__main__'” block. Am I right? Just like you don’t call sys.exit from library code (hello optparse!), you don’t set logging handlers in library code, only in the outmost layer of the script. Dispatcher.__init__ in packaging/run.py and This is the new-fangled command line parser, which can run global (Python-wide) commands (search, uninstall, etc.) as well as traditional project-wide commands (build, check, sdist, etc.) Distribution.parse_command_line in packaging/dist.py. This is the older command line parser, that can handle only project-wide commands. I’m not sure the work is finished to integrate both parsers; my smoke test used to be --help-commands, which can be hard to run these days. The problem is that Dispatcher or Distribution get the quiet or verbose options from the command-line deep in the library code, and want to use it to configure the log level on the handler, which I’ve just said should be set up at a much higher level. To solve this, I’m going to add a *logginghandler* argument to Dispatcher/Distribution; that way, the creation of the handler will happen only once and at a high level, but the command-line parsing code will be able to set the log handler from the command-line arguments. :) In the second and third cases, can you be sure that only one of these code paths will be executed, at most once? Gut feeling is yes, but we’ve learned not to trust our instinct with distutils. In the case of the test support code, I'm not really sure that LoggingCatcher is needed. 
There is already a TestHandler class in test.support which captures records in a buffer, and allows flexible matching for assertions, as described in distutils used its own log module; this mixin was used to intercept messages sent with this system. When we migrated to stdlib logging, I added a todo comment to update the code to use something less kludgy :) The post you linked to is already in my bookmarks. Note that this support module also helps with Python 2.4+, so I may have to copy-paste TestHandler. So, I will fix the LoggingCatcher mixin to use the much cleaner addHandler/removeHandler combo (I’ll avoid calling logging._removeHandlerRef if I don’t have to) and try my idea about the handler instantiation in the code. Thanks a lot! Cheers ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
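The addHandler/removeHandler approach described above could look roughly like the sketch below. It is not the packaging code: `RecordingHandler`, the `logger_name` value and `get_messages` are assumptions made for illustration, and a real version might reuse an existing buffering handler instead.

```python
import io
import logging
import unittest


class RecordingHandler(logging.Handler):
    """Hypothetical stand-in for a buffering test handler."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


class LoggingCatcher:
    """Mixin: attach a recording handler in setUp, detach it on cleanup."""

    logger_name = "packaging_example"  # assumed name, for illustration only

    def setUp(self):
        super().setUp()
        logger = logging.getLogger(self.logger_name)
        old_level = logger.level
        self.log_handler = RecordingHandler()
        logger.addHandler(self.log_handler)
        logger.setLevel(logging.DEBUG)
        # Explicitly undo exactly the changes made above.
        self.addCleanup(logger.setLevel, old_level)
        self.addCleanup(logger.removeHandler, self.log_handler)

    def get_messages(self):
        return [record.getMessage() for record in self.log_handler.records]


class ExampleTest(LoggingCatcher, unittest.TestCase):
    def test_message_is_captured(self):
        logging.getLogger(self.logger_name).info("disk %s is full", "sda")
        self.assertEqual(self.get_messages(), ["disk sda is full"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
leftover_handlers = logging.getLogger("packaging_example").handlers
```

Only public logging API is used here, so nothing private such as logging._removeHandlerRef needs to be called.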
Re: [Python-Dev] more timely detection of unbound locals
Eli Bendersky wrote:

I think you are making an unwarranted assumption about what is more expected. I presume you are thinking that the expected behaviour is that foo() should:

    print global x (5)
    assign 1 to local x
    print local x (1)

If we implemented this change, there would be no more questions about UnboundLocalError, but instead there would be lots of questions like "why is it that globals revert to their old value after I change them in a function?".

True, but this is less confusing and follows the rules in a more straightforward way. x = 1 without a 'global x' assigns a local x, this makes sense and is similar to what happens in C where an inner declaration temporarily shadows a global one.

I disagree that it is less confusing. Instead of a nice, straightforward error that you can google, the function will silently do the wrong thing, giving no clue that weirdness is happening.

    def spam():
        if x > 0:  # refers to global x
            x = 1  # now local
        if x > 0:  # could be either global or local
            x = x - 1  # local on the LHS of the equal
                       # sometimes global on the RHS
        else:
            x += 1  # local x, but what value does it have?

Just thinking about debugging the mess that this could make gives me a headache.

-- Steven
Re: [Python-Dev] Problems with regrtest and with logging
Hi, When it comes to containers, identity matters at least as much as value does (and sometimes more so - e.g. sys.modules). Replacing those global containers with new ones isn't guaranteed to work, as they may be cached in various places rather than always retrieved fresh from the relevant module namespace. Modifying them in place, on the other hand, does the right thing even in the presence of cached references. That makes sense, thanks for the explanation! A comment to that effect may be a useful addition to regrtest, as I expect others may have similar questions about those identity checks in the future. (It may even be a useful addition to the documentation, but I have no idea where it could be sensibly included). Somewhere in unittest doc, say in the section about tearDown. Or maybe it’s time for a Python testing best practices howto? Also, don't be surprised if wholesale cleanup like that isn't completely reliable - it's far, far better if the test case understands the changes it is making (even indirectly) and explicitly reverses them. Yep, I was probably bringing out the big guns too early. self.addCleanup(sys.path.remove, path) is better and even shorter than my previous code! Cheers ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
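As a rough illustration of the explicit-cleanup style mentioned above, here is a small sketch. The directory string and the test class are hypothetical; the point is that `sys.path` is modified in place and the exact change is reversed, so the list object itself is never replaced.

```python
import io
import sys
import unittest

EXTRA_PATH = "/hypothetical/test/dir"  # made-up directory for the example


class PathTest(unittest.TestCase):
    def test_extra_path(self):
        sys.path.append(EXTRA_PATH)
        # Reverse exactly the change made above once the test finishes.
        self.addCleanup(sys.path.remove, EXTRA_PATH)
        self.assertIn(EXTRA_PATH, sys.path)


original_list = sys.path
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PathTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
same_object = sys.path is original_list
```

Because the cleanup mutates the existing list, any code holding a cached reference to `sys.path` still sees the restored state.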
Re: [Python-Dev] more timely detection of unbound locals
On 5/9/2011 9:27 AM, Stefan Behnel wrote:

Eli Bendersky, 09.05.2011 14:56:

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
    print(x)
    x = 1
    print(x)
foo()

Will throw:

UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the zillionth question on StackOverflow on some variation of this case, I started thinking whether this behavior is desired or just an implementation artifact.

Well, basically any compiler these days can detect that a variable is being used before assignment, or at least that this is possibly the case, depending on prior branching. ISTM that your suggestion is to let x refer to the outer x up to the assignment and to the inner x from that point on. IMHO, that's much worse than the current behaviour and potentially impractical due to conditional assignments.

However, it's also a semantic change to reject code with unbound locals at compile time, as the specific code in question may actually be unreachable at runtime. This makes me think that it would be best to discuss this on the python-ideas list first. If nothing else, I'd like to see a discussion on this behaviour being an implementation detail of CPython or a feature of the Python language.

Stefan

-- Terry Jan Reedy
[Python-Dev] Commit messages: please avoid temporal ambiguity
A commit (push) partitions time and behavior into before and after (with a short change period in between during which behavior is undefined). Some commit messages have the form 'x does y'. Does 'does' mean before or after? Sometimes that is clear. 'x crashes' means before. 'x returns correct value' means after. But some messages of this type are unclear to me as written. Consider 'x raises exception'. The temporal reference is obvious to the committer but not necessarily to everyone else. It could mean 'x used to segfault and now raises a catchable exception'. There was a fix like this (with a clear message) just today. It could also mean 'x used to raise but now returns an answer'. There have been many fixes like this. Two minimal fixes are 'x raised exception' or 'make x raise exception'.

-- Terry Jan Reedy
Re: [Python-Dev] Problems with regrtest and with logging
Éric Araujo merwok at netwok.org writes: Yep, probably dead code. I think that an handler should be defined only once, in the “if __name__ == '__main__'” block. Am I right? Just like you don’t call sys.exit from library code (hello optparse!), you don’t set logging handlers in library code, only in the outmost layer of the script. That's right, though it's OK to provide a documented convenience API for adding handlers. The problem is that Dispatcher or Distribution get the quiet or verbose options from the command-line deep in the library code, and want to use it to configure the log level on the handler, which I’ve just said should be set up at a much higher level. To solve this, I’m going to add a *logginghandler* argument to Dispatcher/Distribution; that way, the creation of the handler will happen only once and at a high level, but the command-line parsing code will be able to set the log handler from the command-line arguments. :) You don't necessarily need to set the level on the handler - why can you not just set it on the logger? The effect would often be the same: the logger's level is checked first, and then the handler's level. Generally you set levels on handlers when you want specific behaviour, such as all ERROR and above to a particular file, all CRITICAL to an email handler etc. For command-line scripts outputting to the console and nowhere else, usually you could just add a StreamHandler (with no level set on it), and set the level on the logger. Where the functionality may be used in an API, you should perhaps check logger.hasHandlers() and avoid adding handlers if there are already some added by a using library or application. Regards, Vinay Sajip ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
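The advice above (one console handler with no level of its own, the level set on the logger, and a hasHandlers() check before adding anything) might be sketched as follows. The function name, the logger name and the mapping from quiet/verbose to levels are assumptions for illustration, not the packaging API.

```python
import logging
import sys


def configure_logging(name, verbose=False, quiet=False):
    """Attach at most one console handler and set the level on the logger."""
    logger = logging.getLogger(name)
    if not logger.hasHandlers():
        # No level on the handler: the logger's level does the filtering.
        logger.addHandler(logging.StreamHandler(sys.stderr))
    if quiet:
        logger.setLevel(logging.WARNING)   # assumed mapping
    elif verbose:
        logger.setLevel(logging.DEBUG)     # assumed mapping
    else:
        logger.setLevel(logging.INFO)
    return logger


log = configure_logging("packaging_example_cli", verbose=True)
# A second call changes the level but does not add another handler.
log = configure_logging("packaging_example_cli", quiet=True)
```

Note that hasHandlers() also looks at ancestor loggers, so if the application has already configured the root logger, this sketch adds nothing and only adjusts the level.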
Re: [Python-Dev] Commit changelog: issue number and merges
On Mon, 09 May 2011 17:55:42 +0200, Éric Araujo mer...@netwok.org wrote:

Le 09/05/2011 16:08, Benjamin Peterson a écrit : 2011/5/9 Victor Stinner victor.stin...@haypocalc.com: For merge commits: many developers just write merge or merge 3.1. I have to go to the parent commit (and something to the grandparent, 3.1-3.2-3.3) to learn more about the commit.

I follow conventions I’ve seen elsewhere (maybe Mercurial itself): I use “Branch merge” when I merge anonymous branches on the same named branch, and “Merge x.y” for forward-porting across named branches. I also tend to do more than one commit before merging. It would not be very easy with my current toolchain to get the commit message(s) to insert into the new message, and I think it’s not necessary.

I thought the whole point of merging was that you brought a changeset from one branch to another. This why I just write merge because otherwise you're technically duplicating information that is pulled onto the branch by merging.

+1. No interest in manually duplicating available information.

Le 09/05/2011 17:44, R. David Murray a écrit : No it isn't. The commit message isn't pulled into the new branch.

Sorry, your terminology does not make sense. If you mean that the commit message is not reused in the new commit after the merge, it’s true. However, the commit message with the relevant information is available as part of the changesets that have been pulled and merged.

The changesets are in the repository and there are pointers to them from the merge changeset, sure, but the data isn't in the checkout (that's how I understood pulled in to the new branch). If I do 'hg log' and search for a revno (that I got from hg annotate), the commit message describing the change is not attached to that revno, nor as far as I know is there a tool that makes it easy to get from that revno to the explanatory commit message. That's what Victor and I are talking about. Is there a tool that fixes this problem? (svnmerge did a nice job of that from the automate-the-message-generation end of things).

-- R. David Murray http://www.bitdance.com
Re: [Python-Dev] Commit messages: please avoid temporal ambiguity
On 5/9/2011 1:24 PM, Terry Reedy wrote: A commit (push) partition time and behavior into before and after (with a short change period in between during which behavior is undefined). Some commit messages have the form 'x does y'. Does 'does' mean before or after? Sometimes that is clear. 'x crashes' means before. 'x return correct value' means after. But some messages of this type are unclear to me as written. Consider 'x raises exception'? The temporal reference is obvious to the committer but not necessary to everyone else. It could mean 'x used to segfault and now raises a catchable exception'. There was a fix like this (with a clear message) just today. It could also mean 'x used to raise but now return an answer. There have been many fixes like this. Two minimal fixes are 'x raised exception' or 'make x raise exception'. I've always favored X now properly raises an exception. --Ned. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Commit messages: please avoid temporal ambiguity
On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder n...@nedbatchelder.com wrote: On 5/9/2011 1:24 PM, Terry Reedy wrote: A commit (push) partition time and behavior into before and after (with a short change period in between during which behavior is undefined). Some commit messages have the form 'x does y'. Does 'does' mean before or after? Sometimes that is clear. 'x crashes' means before. 'x return correct value' means after. But some messages of this type are unclear to me as written. Consider 'x raises exception'? The temporal reference is obvious to the committer but not necessary to everyone else. It could mean 'x used to segfault and now raises a catchable exception'. There was a fix like this (with a clear message) just today. It could also mean 'x used to raise but now return an answer. There have been many fixes like this. Two minimal fixes are 'x raised exception' or 'make x raise exception'. I've always favored X now properly raises an exception. While my own preference is make X properly raise an exception I'm happy with any of the alternatives proposed here, and grateful to Terry for calling this out. Checkin comments of the form X does Y are ambiguous and confusing. (Same for feature requests in the tracker.) I'm curious where the habit to use the present tense comes from; I wonder if it originates in some agile development practice? -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Commit messages: please avoid temporal ambiguity
On 05/09/2011 03:17 PM, Guido van Rossum wrote: On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder n...@nedbatchelder.com wrote: On 5/9/2011 1:24 PM, Terry Reedy wrote: A commit (push) partition time and behavior into before and after (with a short change period in between during which behavior is undefined). Some commit messages have the form 'x does y'. Does 'does' mean before or after? Sometimes that is clear. 'x crashes' means before. 'x return correct value' means after. But some messages of this type are unclear to me as written. Consider 'x raises exception'? The temporal reference is obvious to the committer but not necessary to everyone else. It could mean 'x used to segfault and now raises a catchable exception'. There was a fix like this (with a clear message) just today. It could also mean 'x used to raise but now return an answer. There have been many fixes like this. Two minimal fixes are 'x raised exception' or 'make x raise exception'. I've always favored X now properly raises an exception. While my own preference is make X properly raise an exception I'm happy with any of the alternatives proposed here, and grateful to Terry for calling this out. Checkin comments of the form X does Y are ambiguous and confusing. (Same for feature requests in the tracker.) I'm curious where the habit to use the present tense comes from; I wonder if it originates in some agile development practice? Thanks indeed for bringing this up, Terry. It's been on my to-do list for a while. I think it comes from just copying the title of a bug report. The bug is X does Y, and that's what's used in the fix. Eric. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Commit messages: please avoid temporal ambiguity
On Mon, May 9, 2011 at 12:36 PM, Eric Smith e...@trueblade.com wrote: On 05/09/2011 03:17 PM, Guido van Rossum wrote: On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder n...@nedbatchelder.com wrote: On 5/9/2011 1:24 PM, Terry Reedy wrote: A commit (push) partition time and behavior into before and after (with a short change period in between during which behavior is undefined). Some commit messages have the form 'x does y'. Does 'does' mean before or after? Sometimes that is clear. 'x crashes' means before. 'x return correct value' means after. But some messages of this type are unclear to me as written. Consider 'x raises exception'? The temporal reference is obvious to the committer but not necessary to everyone else. It could mean 'x used to segfault and now raises a catchable exception'. There was a fix like this (with a clear message) just today. It could also mean 'x used to raise but now return an answer. There have been many fixes like this. Two minimal fixes are 'x raised exception' or 'make x raise exception'. I've always favored X now properly raises an exception. While my own preference is make X properly raise an exception I'm happy with any of the alternatives proposed here, and grateful to Terry for calling this out. Checkin comments of the form X does Y are ambiguous and confusing. (Same for feature requests in the tracker.) I'm curious where the habit to use the present tense comes from; I wonder if it originates in some agile development practice? Thanks indeed for bringing this up, Terry. It's been on my to-do list for a while. I think it comes from just copying the title of a bug report. The bug is X does Y, and that's what's used in the fix. But in bug reports it is also ambiguous, since I've often seen it used meaning X should do Y which is very confusing when it doesn't do Y yet at the time the bug is created. 
:-( -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Commit messages: please avoid temporal ambiguity
On 5/9/2011 4:05 PM, Guido van Rossum wrote: On Mon, May 9, 2011 at 12:36 PM, Eric Smith e...@trueblade.com wrote: On 05/09/2011 03:17 PM, Guido van Rossum wrote:

While my own preference is make X properly raise an exception I'm happy with any of the alternatives proposed here, and grateful to Terry for calling this out.

I am willing to admit that I do not know all corners of Python ;-) I read the commit messages to learn more; in particular what sort of errors exist and how are they fixed.

Checkin comments of the form X does Y are ambiguous and confusing. (Same for feature requests in the tracker.)

I have always assumed that an issue entitled 'x does y' is a bug report about doing y now, before a fix.

Thanks indeed for bringing this up, Terry. It's been on my to-do list for a while. I think it comes from just copying the title of a bug report. The bug is X does Y, and that's what's used in the fix.

I have also seen this type of message for non-tracker-issue commits.

But in bug reports it is also ambiguous, since I've often seen it used meaning X should do Y which is very confusing when it doesn't do Y yet at the time the bug is created. :-(

If I notice a title that bad, I will try to change it.

-- Terry Jan Reedy
Re: [Python-Dev] Commit changelog: issue number and merges
On 5/9/2011 1:54 PM, R. David Murray wrote: If I do 'hg log' and search for a revno (that I got from hg annotate), the commit message describing the change is not attached to that revno, nor as far as I know is there a tool that makes it easy to get from that revno to the explanatory commit message. That's what Victor and I are talking about. Is there a tool that fixes this problem? (svnmerge did a nice job of that from the automate-the-message-generation end of things).

TortoiseSvn, and I presume TortoiseHg also, has a 'recent messages' box that makes it trivial to reuse a message. I used it with svn and will make sure to use it, if it exists, when I get started with hg.

-- Terry Jan Reedy
Re: [Python-Dev] Commit changelog: issue number and merges
2011/5/9 R. David Murray rdmur...@bitdance.com: On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson benja...@python.org wrote: I thought the whole point of merging was that you brought a changeset from one branch to another. This why I just write merge because otherwise you're technically duplicating information that is pulled onto the branch by merging. No it isn't. The commit message isn't pulled into the new branch. It seems like something that should be solved by tools like a display visual graph indicating what is merged. (like Bazaar) You'd need some extension to hg log that would show the original commit message for the first changeset in the merge line in order to fix this. I doubt that is going to happen. *cough* http://mercurial.selenic.com/wiki/GraphlogExtension Note that saying just 'merge' makes perfect sense when you are pulling in a whole group of changesets in order to synchronize two branches. But if you are applying a single changeset to multiple branches, as we often do in our workflow, then I think duplicating the commit message is (1) easy to do and (2) very helpful when looking at hg log output. What's the difference between pulling multiple changesets in and one then? -- Regards, Benjamin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Victor Stinner:
> C and C++ identifiers are restricted to ASCII. I don't know for Fortran or Java.

Some C and C++ implementations currently allow non-ASCII identifiers and the forthcoming C1X and C++0x language standards include non-ASCII identifiers. The allowed characters are specified in Annexes of the respective standards.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

Neil
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
On Mon, 09 May 2011 16:11:15 +0200 Victor Stinner victor.stin...@haypocalc.com wrote:
> On Monday 09 May 2011 at 09:00 -0400, Jim Jewett wrote:
>> Are you asserting that all foreign modules (or at least all handled by this) are in C, as opposed to C++ or even Java or Fortran? (And the C won't change?)
>
> C and C++ identifiers are restricted to ASCII. I don't know for Fortran or Java.

Why is it important, though? What matters is not what C/C++ can produce, but what a shared library can export. So the question is: are shared libraries limited to ASCII symbols?

Regards

Antoine.
Re: [Python-Dev] Borrowed and Stolen References in API
Nick Coghlan wrote:
> One interesting aspect is that from the caller's point of view, a *new* reference to the relevant behaves like a borrowed reference for input parameters, but like a stolen reference for output parameters and return values.

I think it's less confusing to use the term "new" only for output/return values, and "stolen" only for input values.

Inputs are either borrowed or stolen (by the callee). Outputs are either new (to the caller) or borrowed (by the caller).

(Or maybe the terms for outputs should be "given" and "lent"?-)

-- Greg
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
On Tuesday 10 May 2011 at 09:52 +1000, Neil Hodgson wrote:
> Some C and C++ implementations currently allow non-ASCII identifiers and the forthcoming C1X and C++0x language standards include non-ASCII identifiers. The allowed characters are specified in Annexes of the respective standards.
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

I read these documents but they don't explain which encoding is used in libraries and programs. Does it mean that Windows and Linux may use different encodings? At least the surrogate range (U+D800-U+DFFF) is excluded, which is good news (the UTF-8 decoder of Python 3 rejects surrogate characters).

I discovered the -fextended-identifiers option of gcc: using this option, you can use \u and \U in identifiers, but not \xHH. On Linux, identifiers are encoded to UTF-8. Example:

--
#define _ISOC99_SOURCE
#include <stdio.h>
#include <wchar.h>   /* declares wprintf() */

/* \uXXXX takes 4 hex digits, \UXXXXXXXX takes 8 */
void f\u00E9(void) { wprintf(L"U+00E9 = \xE9\n"); }
void g\U000000E8(void) { wprintf(L"U+00E8 = \xE8\n"); }

int main(void) { f\u00E9(); g\U000000E8(); return 0; }
--

It's not very practical; I would prefer to write Unicode characters directly (as I can do in Python 3!). I'm not sure that Chinese developers would prefer to call \u4f60\u597d() instead of hello().

Ok, I now agree, it is possible to use non-ASCII characters in C. But what about the encoding of symbols in a dynamic library: is it always UTF-8?

Victor
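The encoding question can be made concrete from the Python side. The following is only an illustrative sketch, not the code from the commit under discussion: it takes the identifiers from the C example above and shows the bytes a loader would have to look up if symbols are stored as UTF-8, and that a strict ASCII encoding rejects the same names. The helper name `symbol_bytes` is made up for this example.

```python
def symbol_bytes(name, encoding="utf-8"):
    """Encode an identifier to bytes (hypothetical helper, for illustration only)."""
    return name.encode(encoding)


def ascii_encodable(name):
    """Return True if the identifier survives a strict ASCII encoding."""
    try:
        symbol_bytes(name, "ascii")
    except UnicodeEncodeError:
        return False
    return True


# Identifiers taken from the C example in the message above, plus a plain one.
for name in ("hello", "f\u00e9", "g\u00e8", "\u4f60\u597d"):
    print(ascii(name), symbol_bytes(name), "ASCII-encodable:", ascii_encodable(name))
```

Whether a given platform's dynamic loader really stores and compares symbols as UTF-8 is exactly the open question in this thread; the sketch only shows what the two encodings would produce.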
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Victor Stinner:
> I read these documents but they don't explain which encoding is used in libraries and programs. Does it mean that Windows and Linux may use different encodings?

Yes, Windows will use UTF-16 as it does for almost everything. From a user's point of view, these should both just be seen as Unicode.

Neil
Re: [Python-Dev] Borrowed and Stolen References in API
Marvin Humphrey wrote:
> incremented: The caller has to account for an additional refcount.
> decremented: The caller has to account for a lost refcount.

I'm not sure that really clarifies anything. These terms sound like they're talking about the reference count of the object, but if they correspond to borrowed/stolen, they don't necessarily correlate with what actually happens to the reference count.

-- Greg
Re: [Python-Dev] Borrowed and Stolen References in API
On Tue, May 10, 2011 at 12:13:47PM +1200, Greg Ewing wrote:
> Nick Coghlan wrote:
>> One interesting aspect is that from the caller's point of view, a *new* reference to the relevant behaves like a borrowed reference for input parameters, but like a stolen reference for output parameters and return values.
>
> I think it's less confusing to use the term "new" only for output/return values, and "stolen" only for input values.
>
> Inputs are either borrowed or stolen (by the callee). Outputs are either new (to the caller) or borrowed (by the caller).
>
> (Or maybe the terms for outputs should be "given" and "lent"?-)

To solve this problem in a similar system (the Clownfish object system used by Apache Lucy) we used the keywords "incremented" and "decremented". Applied to some Python C API function documentation:

    incremented PyObject* PyTuple_New(Py_ssize_t len)
    int PyTuple_SetItem(PyObject *p, Py_ssize_t pos, decremented PyObject *o)

With "incremented" and "decremented", the perspective is always that of the caller.

    incremented: The caller has to account for an additional refcount.
    decremented: The caller has to account for a lost refcount.

Marvin Humphrey
Re: [Python-Dev] Commit changelog: issue number and merges
On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson benja...@python.org wrote:
> 2011/5/9 R. David Murray rdmur...@bitdance.com:
>> On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson benja...@python.org wrote:
>>> I thought the whole point of merging was that you brought a changeset from one branch to another. This is why I just write "merge", because otherwise you're technically duplicating information that is pulled onto the branch by merging.
>>
>> No it isn't. The commit message isn't pulled into the new branch.
>>
>>> It seems like something that should be solved by tools like a display visual graph indicating what is merged. (like Bazaar)
>>
>> You'd need some extension to hg log that would show the original commit message for the first changeset in the merge line in order to fix this. I doubt that is going to happen.
>
> *cough* http://mercurial.selenic.com/wiki/GraphlogExtension

I'm sorry, but I've looked at the output of that and the mental overhead has so far proven too high for it to be of any use to me. I apologize for not having made the full mental transition to distributed VCS/DAG (apparently), but it sounds like I'm not the only one.

>> Note that saying just 'merge' makes perfect sense when you are pulling in a whole group of changesets in order to synchronize two branches. But if you are applying a single changeset to multiple branches, as we often do in our workflow, then I think duplicating the commit message is (1) easy to do and (2) very helpful when looking at hg log output.
>
> What's the difference between pulling multiple changesets in and one then?

I'm talking about merging trunk to a feature branch, for example. I'd not expect any message other than 'merge' for that.

I'd be satisfied if the commit messages listed the issue numbers involved in the merge, especially if someone (like Éric) is merging more than one change at a time.

But as I think about this, frankly I'd rather see atomic commits, even on merges. That was something I disliked about svnmerge, the fact that often an svnmerge commit involved many changesets from the other branch. That was especially painful in exactly the same situation: trying to backtrack a change starting from 'svn blame'. I limited my own use of multiple-changeset-svnmerge to doc changes and changesets that were actually related, despite the overhead involved in doing it that way.

All that said, I'm not trying to impose my will on the workflow; I'll certainly live with the consensus (though unless there is an outcry against it I'll continue putting the full commit message in my own merges).

-- R. David Murray http://www.bitdance.com
Re: [Python-Dev] Borrowed and Stolen References in API
On Tue, May 10, 2011 at 01:28:04PM +1200, Greg Ewing wrote:
> Marvin Humphrey wrote:
>> incremented: The caller has to account for an additional refcount.
>> decremented: The caller has to account for a lost refcount.
>
> I'm not sure that really clarifies anything. These terms sound like they're talking about the reference count of the object, but if they correspond to borrowed/stolen, they don't necessarily correlate with what actually happens to the reference count.

Hmm, they don't correspond to borrowed/stolen.

    stolen from the caller - decremented
    stolen from the callee - incremented
    borrowed - [no modifier]

We don't have a modifier keyword which is analogous to "borrowed". The user is expected to understand object lifespan issues for borrowed references without explicit guidance.

With regard to what actually happens to the reference count, I would argue that "incremented" and "decremented" are accurate descriptions.

* When a function returns an "incremented" object, that function has added a refcount to it.
* When a function accepts a "decremented" object as an argument, it will consume a refcount from it -- either right away, or at some point in the future.

In my view, it is not desirable to label arguments or return values as "borrowed"; it is only necessary to advise the user when they must take action to account for a refcount, gained or lost.

Cheers,

Marvin Humphrey
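The caller-side bookkeeping that the "incremented"/"decremented" labels describe can be sketched as a toy ledger. This is an assumption-laden model written for illustration, not Clownfish or CPython code: the `Caller` class and the `toy_*` functions are hypothetical names that only mimic the ownership rules of PyTuple_New (returns a new reference), PyTuple_SetItem (steals a reference to the item) and PyTuple_GetItem (returns a borrowed reference).

```python
from collections import Counter


class Caller:
    """Toy ledger of how many references the calling code must still release."""

    def __init__(self):
        self.owned = Counter()


def toy_new(caller, name):
    # "incremented" return value: the caller gains a refcount to account for.
    caller.owned[name] += 1
    return name


def toy_set_item(caller, container, name):
    # "decremented" argument: the callee consumes the caller's refcount.
    caller.owned[name] -= 1
    container.append(name)


def toy_get_item(caller, container, index):
    # No modifier (borrowed): the caller's ledger does not change.
    return container[index]


caller = Caller()
tup = []                                  # stands in for the tuple's storage
item = toy_new(caller, "item")            # caller now owes one release for "item"
owed_after_new = caller.owned["item"]
toy_set_item(caller, tup, item)           # the container took over that reference
owed_after_set = caller.owned["item"]
borrowed = toy_get_item(caller, tup, 0)   # nothing to release for a borrowed value
print(owed_after_new, owed_after_set, caller.owned["item"], borrowed)
```

In this model only the labelled cases touch the ledger, which matches the point that a caller needs advice only when a refcount is gained or lost.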
Re: [Python-Dev] Commit changelog: issue number and merges
R. David Murray writes:
> On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson benja...@python.org wrote:
>> *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
>
> I'm sorry, but I've looked at the output of that and the mental overhead has so far proven too high for it to be of any use to me.

How about the hgk extension, and hg view? http://mercurial.selenic.com/wiki/HgkExtension

> But as I think about this, frankly I'd rather see atomic commits, even on merges. That was something I disliked about svnmerge, the fact that often an svnmerge commit involved many changesets from the other branch. That was especially painful in exactly the same situation: trying to backtrack a change starting from 'svn blame'.

I don't understand the issue. In my experience, hg annotate will point to the commit on the branch, not to the merge, unless there was a conflict, in which case the merge is the right place (although not necessarily the most useful place) to point.
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
On Mon, May 9, 2011 at 20:08, Neil Hodgson nyamaton...@gmail.com wrote:
> Yes, Windows will use UTF-16 as it does for almost everything. From a user's point of view, these should both just be seen as Unicode.

I'm not convinced this is correct for this case. GetProcAddress takes an ANSI string, meaning while it could theoretically use UTF-8, in practice I doubt it uses anything outside of ASCII safely. So while the name of the library would be encoded in UTF-16, the name of the function loaded from the library would not be.

http://msdn.microsoft.com/en-us/library/ms683212(v=vs.85).aspx

-- Michael Urman
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Michael Urman:
> I'm not convinced this is correct for this case. GetProcAddress takes an ANSI string, meaning while it could theoretically use UTF-8, in practice I doubt it uses anything outside of ASCII safely. So while the name of the library would be encoded in UTF-16, the name of the function loaded from the library would not be.

Yes you are right: http://scintilla.org/NarrowName.png

Neil
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
On Mon, May 9, 2011 at 23:09, Neil Hodgson nyamaton...@gmail.com wrote:
> Michael Urman:
>> I'm not convinced this is correct for this case. GetProcAddress takes an ANSI string, meaning while it could theoretically use UTF-8, in practice I doubt it uses anything outside of ASCII safely. So while the name of the library would be encoded in UTF-16, the name of the function loaded from the library would not be.
>
> Yes you are right: http://scintilla.org/NarrowName.png
>
> Neil

That screenshot seems to show UTF-8 is being used. This may just be the literal bytes in the .c file, but could it be something more dependable?

http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=6728
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Michael Urman:
> That screenshot seems to show UTF-8 is being used. This may just be the literal bytes in the .c file, but could it be something more dependable?

The file is in UTF-8 so the compiler may just be copying the bytes. There is a setlocale pragma but that seems to be just for string literals.

Neil
Re: [Python-Dev] more timely detection of unbound locals
On Mon, May 9, 2011 at 18:44, Isaac Morland ijmor...@uwaterloo.ca wrote:
> On Mon, 9 May 2011, Eli Bendersky wrote:
>>> x = 5
>>>
>>> def foo ():
>>>     print (x)
>>>     if bar ():
>>>         x = 1
>>>     print (x)
>>
>> I wish you'd annotate this code sample, what do you intend it to demonstrate?
>>
>> It probably shows the original complaint even more strongly. As for being a problem with the suggested solution, I suppose you're right, although it doesn't make it much different. Still, before a *possible* assignment to 'x', it should be loaded as LOAD_NAME since it was surely not bound as local, yet.
>
> Extrapolating from your suggestion, you're saying before a *possible* assignment it will be treated as global, and after a *possible* assignment it will be treated as local?
>
> But surely:
>
>     print (x)
>     if False:
>         x = 1
>     print (x)
>
> [snip]

Alright, I now understand the problems with the suggestion. Indeed, conditional assignments that are only really resolved at runtime are the big stumbling block here.

However, maybe the error message/reporting can still be improved?

ISTM the UnboundLocalError exception gets raised only in those weird and confusing cases. After all, why would Python decide an access to some name is to a local? Only if it found an assignment to that local in the scope. But that assignment clearly didn't happen yet, so the error is thrown. So cases like these:

    x = 2

    def foo1():
        x += 1

    def foo2():
        print(x)
        x = 10

    def foo3():
        if something_that_didnot_happen:
            x = 10
        print(x)

all belong to the category.

With an unlimited error message length it could make sense to say "Hey, I see 'x' may be assigned in this scope, so I mark it local. But this access to 'x' happens before assignment - so ERROR". This isn't realistic, of course, so I'm wondering:

1. Does this error message (although unrealistic) capture all possible appearances of UnboundLocalError?
2. If the answer to (1) is yes - could it be usefully shortened to be clearer than the current "local variable referenced before assignment"?

This may not be possible, of course, but it doesn't hurt to try :-)

Eli
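The three cases can be run as one script. This is a minimal sketch, assuming the flag `something_that_didnot_happen` is simply a module-level False and adding a hypothetical `reads_global` function for contrast; the exact wording of the error message differs between Python versions, so the script prints whatever the running interpreter reports.

```python
x = 2
something_that_didnot_happen = False   # assumed flag so that foo3 runs as written


def foo1():
    x += 1          # augmented assignment makes x local; it is read before it is bound


def foo2():
    print(x)        # x is local here because of the assignment on the next line
    x = 10


def foo3():
    if something_that_didnot_happen:
        x = 10      # never executed, yet it still makes x local to foo3
    print(x)


def reads_global():
    return x        # no assignment in this scope, so x is the module-level name


results = {}
for func in (foo1, foo2, foo3):
    try:
        func()
    except UnboundLocalError as exc:
        results[func.__name__] = str(exc)

for name, message in results.items():
    print(name, "->", message)
print("reads_global ->", reads_global())
print("locals of foo3:", foo3.__code__.co_varnames)
```

The last line shows that 'x' is recorded as a local of foo3 when the function is compiled, before any branch runs, which is why a conditional assignment is enough to trigger the error.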