[Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Victor Stinner
Hi,

Commit changelogs are important for understanding why the code was changed.
I regularly use hg blame to find which commit introduced a particular
line of code, and I am always happy if I can find an issue number
because the issue usually contains the whole story.

Since the migration to Mercurial, we also have a great tool that adds a
comment to an issue if the changelog contains an issue number (e.g. a
changelog starting with "Issue #11: ..."). So if someone watches an
issue (is in the nosy list), (s)he will be notified that a related commit
was pushed. It is not exactly something new: we already did that with
Subversion, except that today it is more automatic.

I noticed that some recent commits don't contain the issue number:
please try to always prefix your changelog with the issue number. It is
not mandatory, but it helps me when I dig through the Python history.
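
As a rough illustration of the convention, here is a minimal Python sketch
that checks whether the first line of a changelog starts with an issue
number. The helper name issue_number and the exact pattern are assumptions
made for this example; this is not the hook that runs on hg.python.org.

```python
import re

# Hypothetical pattern: a first line such as "Issue #11277: Remove ...".
ISSUE_PREFIX = re.compile(r"^Issue #(\d+):")


def issue_number(changelog):
    """Return the issue number the changelog starts with, or None."""
    lines = changelog.splitlines()
    first_line = lines[0] if lines else ""
    match = ISSUE_PREFIX.match(first_line)
    return int(match.group(1)) if match else None


print(issue_number("Issue #11277: Remove useless test from test_zlib."))
print(issue_number("merge 3.1"))
```

A changelog that only says "merge 3.1" yields None, which is exactly the
case where a reader has to walk back to the parent commit.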

--

For merge commits: many developers just write "merge" or "merge 3.1". I
have to go to the parent commit (and sometimes to the grandparent,
3.1 -> 3.2 -> 3.3) to learn more about the commit.

Would it be possible to repeat the changelog of the original commit in
the merge commits? The svnmerge tool prepared a nice changelog containing
the changelog of all pending commits, even when a commit was blocked.

For a merge commit, I copy/paste the changelog of the original commit
and I add a "(Merge 3.1) " prefix. I prefer to add a prefix explicitly
because it is not easy to notice that it is a merge commit in a
python-checkins email or in the history of hg.python.org.

We may need new tools to help with the process.

--

Use cases needing better changelogs:

 - "All changes" section of a buildbot build
 - hg blame (or just hg log)

Victor

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread R. David Murray
On Mon, 09 May 2011 12:32:48 +0200, Victor Stinner 
victor.stin...@haypocalc.com wrote:
 For merge commits: many developers just write "merge" or "merge 3.1". I
 have to go to the parent commit (and sometimes to the grandparent,
 3.1 -> 3.2 -> 3.3) to learn more about the commit.
 
 Would it be possible to repeat the changelog of the original commit in
 the merge commits? The svnmerge tool prepared a nice changelog containing
 the changelog of all pending commits, even when a commit was blocked.
 
 For a merge commit, I copy/paste the changelog of the original commit
 and I add a "(Merge 3.1) " prefix. I prefer to add a prefix explicitly
 because it is not easy to notice that it is a merge commit in a
 python-checkins email or in the history of hg.python.org.

+1.  What I do is, in the edit window for the commit message, I pull
in .hg/last-message.txt, and just type 'Merge' in front of my previous
first line.  I don't add the merge-from number, because I figure if you
know which branch you are looking at you know which branch the merge
came from, given that there is a strict progression.
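
The tip can also be scripted. The sketch below assumes the previous commit
message is available as a string (for example, read from
.hg/last-message.txt as described above) and puts a prefix in front of its
first line. The function name merge_message and the single-space separator
are choices made for this illustration only.

```python
def merge_message(previous, prefix="Merge"):
    """Put `prefix` in front of the first line of the previous commit message."""
    first_line, newline, rest = previous.partition("\n")
    return prefix + " " + first_line + newline + rest


previous = "Issue #11277: Remove useless test from test_zlib.\n\nLonger explanation."
print(merge_message(previous))
```

Passing prefix="(Merge 3.1)" gives the variant with the explicit source
branch.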

--
R. David Murray   http://www.bitdance.com


Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib.

2011-05-09 Thread Jim Jewett
Can you clarify (preferably in the commit message as well) exactly
*why* these largefile tests are useless?  For example, is there
another test that covers this already?

-jJ

On 5/7/11, nadeem.vawda python-check...@python.org wrote:
 http://hg.python.org/cpython/rev/201dcfc56e86
 changeset:   69886:201dcfc56e86
 branch:  2.7
 parent:  69881:a0147a1f1776
 user:Nadeem Vawda nadeem.va...@gmail.com
 date:Sat May 07 11:28:03 2011 +0200
 summary:
   Issue #11277: Remove useless test from test_zlib.

 files:
   Lib/test/test_zlib.py |  42 ---
   1 files changed, 0 insertions(+), 42 deletions(-)


 diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py
 --- a/Lib/test/test_zlib.py
 +++ b/Lib/test/test_zlib.py
 @@ -72,47 +72,6 @@
   zlib.crc32('spam',  (2**31)))


 -# Issue #11277 - check that inputs of 2 GB (or 1 GB on 32 bits system) are
 -# handled correctly. Be aware of issues #1202. We cannot test a buffer of 4 GB
 -# or more (#8650, #8651 and #10276), because the zlib stores the buffer size
 -# into an int.
 -class ChecksumBigBufferTestCase(unittest.TestCase):
 -    if sys.maxsize > _4G:
 -        # (64 bits system) crc32() and adler32() stores the buffer size into an
 -        # int, the maximum filesize is INT_MAX (0x7FFFFFFF)
 -        filesize = 0x7FFFFFFF
 -    else:
 -        # (32 bits system) On a 32 bits OS, a process cannot usually address
 -        # more than 2 GB, so test only 1 GB
 -        filesize = _1G
 -
 -    @unittest.skipUnless(mmap, "mmap() is not available.")
 -    def test_big_buffer(self):
 -        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
 -            requires('largefile',
 -                     'test requires %s bytes and a long time to run' %
 -                     str(self.filesize))
 -        try:
 -            with open(TESTFN, "wb+") as f:
 -                f.seek(self.filesize-4)
 -                f.write("asdf")
 -                f.flush()
 -                m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
 -                try:
 -                    if sys.maxsize > _4G:
 -                        self.assertEqual(zlib.crc32(m), 0x709418e7)
 -                        self.assertEqual(zlib.adler32(m), -2072837729)
 -                    else:
 -                        self.assertEqual(zlib.crc32(m), 722071057)
 -                        self.assertEqual(zlib.adler32(m), -1002962529)
 -                finally:
 -                    m.close()
 -        except (IOError, OverflowError):
 -            raise unittest.SkipTest("filesystem doesn't have largefile support")
 -        finally:
 -            unlink(TESTFN)
 -
 -
  class ExceptionTestCase(unittest.TestCase):
      # make sure we generate some expected errors
      def test_badlevel(self):
 @@ -595,7 +554,6 @@
  def test_main():
      run_unittest(
          ChecksumTestCase,
 -        ChecksumBigBufferTestCase,
          ExceptionTestCase,
          CompressTestCase,
          CompressObjectTestCase

 --
 Repository URL: http://hg.python.org/cpython



[Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Eli Bendersky
Hi all,

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
    print(x)
    x = 1
    print(x)
foo()

Will throw:

   UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started wondering whether this behavior is desired or just an implementation
artifact.

IIUC, the reason it behaves this way is that the symbol table logic goes
over the code before the code generation runs, sees the assignment 'x = 1'
and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST
for all loads of 'x' in 'foo', even though 'x' is actually bound locally
after the first print. When the bytecode is run, since it's LOAD_FAST and no
store was made into the local 'x', ceval.c then throws the exception.
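
The effect described above can be observed from Python itself. The sketch
below reuses the names from the example; inspecting __code__.co_varnames
and calling dis.dis() are additions for illustration. It shows that the
compiler has classified 'x' as a local of foo before any code runs, and
that the first print then fails.

```python
import dis

x = 5


def foo():
    print(x)  # 'x' is already a local here, but nothing is bound to it yet
    x = 1
    print(x)


# The symbol table pass has marked 'x' as local for the whole function body.
print(foo.__code__.co_varnames)

try:
    foo()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)

# Dump the bytecode; the exact opcode names vary between CPython versions.
dis.dis(foo)
```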

At first sight, it's possible to signal that 'x' truly becomes local only
after it's bound in the scope (and before that LOAD_NAME can be generated
for it instead of LOAD_FAST). To do this, some modifications to the symbol
table creation and usage are required, because we can no longer say "x is
local in this block", but rather should attach scope information to each
instance of x. This has some overhead, but it's only at the compilation
stage so it shouldn't have a real effect on the runtime of Python code. This
is also less convenient and clean than the current approach - this is why
I'm wondering whether the behavior is an artifact of the implementation.

Would it not be worthwhile to make Python's behavior more expected in this
case, at the cost of some implementation complexity? What are the cons of
making such a change? At least judging by the number of people getting
confused by it, maybe it's in line with the zen of Python to behave more
explicitly here.

Thanks in advance,
Eli

(*) Variation of this FAQ:
http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Jim Jewett
Are you asserting that all foreign modules (or at least all handled by
this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
won't change?)

Is this ASCII restriction (as opposed to even UTF8) really needed?

Or are you just saying that we need to create an ASCII name for passing to C?

-jJ

On 5/7/11, victor.stinner python-check...@python.org wrote:
 http://hg.python.org/cpython/rev/eb003c3d1770
 changeset:   69889:eb003c3d1770
 user:Victor Stinner victor.stin...@haypocalc.com
 date:Sat May 07 12:46:05 2011 +0200
 summary:
   _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

 The name must be encodable to ASCII because dynamic module must have a function
 called PyInit_NAME, they are written in C, and the C language doesn't accept
 non-ASCII identifiers.

 files:
   Python/importdl.c |  40 +-
   1 files changed, 25 insertions(+), 15 deletions(-)


 diff --git a/Python/importdl.c b/Python/importdl.c
 --- a/Python/importdl.c
 +++ b/Python/importdl.c
 @@ -20,31 +20,36 @@
 const char *pathname, FILE *fp);
  #endif

 -/* name should be ASCII only because the C language doesn't accept non-ASCII
 -   identifiers, and dynamic modules are written in C. */
 -
  PyObject *
  _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp)
  {
 -PyObject *m;
 +PyObject *m = NULL;
  #ifndef MS_WINDOWS
  PyObject *pathbytes;
  #endif
 +PyObject *nameascii;
  char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext;
  dl_funcptr p0;
  PyObject* (*p)(void);
  struct PyModuleDef *def;

 -namestr = _PyUnicode_AsString(name);
 -if (namestr == NULL)
 -return NULL;
 -
  m = _PyImport_FindExtensionObject(name, path);
  if (m != NULL) {
  Py_INCREF(m);
  return m;
  }

 +/* name must be encodable to ASCII because dynamic module must have a
 +   function called PyInit_NAME, they are written in C, and the C language
 +   doesn't accept non-ASCII identifiers. */
 +nameascii = PyUnicode_AsEncodedString(name, "ascii", NULL);
 +if (nameascii == NULL)
 +return NULL;
 +
 +namestr = PyBytes_AS_STRING(nameascii);
 +if (namestr == NULL)
 +goto error;
 +
  lastdot = strrchr(namestr, '.');
  if (lastdot == NULL) {
  packagecontext = NULL;
 @@ -60,34 +65,33 @@
  #else
  pathbytes = PyUnicode_EncodeFSDefault(path);
  if (pathbytes == NULL)
 -return NULL;
 +goto error;
  p0 = _PyImport_GetDynLoadFunc(shortname,
PyBytes_AS_STRING(pathbytes), fp);
  Py_DECREF(pathbytes);
  #endif
  p = (PyObject*(*)(void))p0;
  if (PyErr_Occurred())
 -return NULL;
 +goto error;
  if (p == NULL) {
  PyErr_Format(PyExc_ImportError,
   "dynamic module does not define init function (PyInit_%s)",
   shortname);
 -return NULL;
 +goto error;
  }
  oldcontext = _Py_PackageContext;
  _Py_PackageContext = packagecontext;
  m = (*p)();
  _Py_PackageContext = oldcontext;
  if (m == NULL)
 -return NULL;
 +goto error;

  if (PyErr_Occurred()) {
 -Py_DECREF(m);
  PyErr_Format(PyExc_SystemError,
   "initialization of %s raised unreported exception",
   shortname);
 -return NULL;
 +goto error;
  }

  /* Remember pointer to module init function. */
 @@ -101,12 +105,18 @@
  Py_INCREF(path);

  if (_PyImport_FixupExtensionObject(m, name, path) < 0)
 -return NULL;
 +goto error;
  if (Py_VerboseFlag)
  PySys_FormatStderr(
  "import %U # dynamically loaded from %R\n",
  name, path);
 +Py_DECREF(nameascii);
  return m;
 +
 +error:
 +Py_DECREF(nameascii);
 +Py_XDECREF(m);
 +return NULL;
  }

  #endif /* HAVE_DYNAMIC_LOADING */

 --
 Repository URL: http://hg.python.org/cpython



Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Senthil Kumaran
On Mon, May 09, 2011 at 08:40:03AM -0400, R. David Murray wrote:
 +1.  What I do is, in the edit window for the commit message, I pull
 in .hg/last-message.txt, and just type 'Merge' in front of my previous

Thanks for this tip. I shall start following this one too.

-- 
Senthil


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Isaac Morland

On Mon, 9 May 2011, Eli Bendersky wrote:


It's a known Python gotcha (*) that the following code:

x = 5
def foo():
   print(x)
   x = 1
   print(x)
foo()

Will throw:

  UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started thinking whether this behavior is desired or just an implementation
artifact.

IIUC, the reason it behaves this way is that the symbol table logic goes
over the code before the code generation runs, sees the assignment 'x = 1`
and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST
for all loads of  'x' in 'foo', even though 'x' is actually bound locally
after the first print. When the bytecode is run, since it's LOAD_FAST and no
store was made into the local 'x', ceval.c then throws the exception.

On first sight, it's possible to signal that 'x' truly becomes local only
after it's bound in the scope (and before that LOAD_NAME can be generated
for it instead of LOAD_FAST). To do this, some modifications to the symbol
table creation and usage are required, because we can no longer say x is
local in this block, but rather should attach scope information to each
instance of x. This has some overhead, but it's only at the compilation
stage so it shouldn't have a real effect on the runtime of Python code. This
is also less convenient and clean than the current approach - this is why
I'm wondering whether the behavior is an artifact of the implementation.


x = 5
def foo ():
    print (x)
    if bar ():
        x = 1
    print (x)

Isaac Morland   CSCF Web Guru
DC 2554C, x36650WWW Software Specialist


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Stefan Behnel

Eli Bendersky, 09.05.2011 14:56:

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
 print(x)
 x = 1
 print(x)
foo()

Will throw:

UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started thinking whether this behavior is desired or just an implementation
artifact.


Well, basically any compiler these days can detect that a variable is being 
used before assignment, or at least that this is possibly the case, 
depending on prior branching.


ISTM that your suggestion is to let x refer to the outer x up to the 
assignment and to the inner x from that point on. IMHO, that's much worse 
than the current behaviour and potentially impractical due to conditional 
assignments.
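
To make the point about conditional assignments concrete, here is a small
sketch (the assign parameter is made up for this example). Under the
current rules 'x' is local to foo whether or not the branch runs, so the
result does not silently depend on which x happens to be visible.

```python
x = 5


def foo(assign):
    if assign:
        x = 1
    return x  # 'x' is local to foo whether or not the branch above ran


print(foo(True))

try:
    foo(False)
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

With the proposed semantics, foo(False) would presumably return the outer
x instead, which is the case that looks hard to reason about.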


However, it's also a semantic change to reject code with unbound locals at 
compile time, as the specific code in question may actually be unreachable 
at runtime. This makes me think that it would be best to discuss this on 
the python-ideas list first.


If nothing else, I'd like to see a discussion on this behaviour being an 
implementation detail of CPython or a feature of the Python language.


Stefan



Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Eric Snow
On May 9, 2011 6:59 AM, Eli Bendersky eli...@gmail.com wrote:

 Hi all,

 It's a known Python gotcha (*) that the following code:

 x = 5
 def foo():
     print(x)
     x = 1
     print(x)
 foo()

 Will throw:

UnboundLocalError: local variable 'x' referenced before assignment

 On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started thinking whether this behavior is desired or just an implementation
artifact.

 IIUC, the reason it behaves this way is that the symbol table logic goes
over the code before the code generation runs, sees the assignment 'x = 1`
and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST
for all loads of  'x' in 'foo', even though 'x' is actually bound locally
after the first print. When the bytecode is run, since it's LOAD_FAST and no
store was made into the local 'x', ceval.c then throws the exception.

 On first sight, it's possible to signal that 'x' truly becomes local only
after it's bound in the scope (and before that LOAD_NAME can be generated
for it instead of LOAD_FAST). To do this, some modifications to the symbol
table creation and usage are required, because we can no longer say x is
local in this block, but rather should attach scope information to each
instance of x. This has some overhead, but it's only at the compilation
stage so it shouldn't have a real effect on the runtime of Python code. This
is also less convenient and clean than the current approach - this is why
I'm wondering whether the behavior is an artifact of the implementation.

 Would it not be worth to make Python's behavior more expected in this
case, at the cost of some implementation complexity? What are the cons to
making such a change? At least judging by the amount of people getting
confused by it, maybe it's in line with the zen of Python to behave more
explicitly here.

This is about mixing scopes for the same name in the same block, right?
Perhaps a more specific error would be enough, unless there is a good use
case for having that mixed scope for the name.

-eric

 Thanks in advance,
 Eli

 (*) Variation of this FAQ:
http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value





Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277: Remove useless test from test_zlib.

2011-05-09 Thread Nadeem Vawda
On Mon, May 9, 2011 at 2:53 PM, Jim Jewett jimjjew...@gmail.com wrote:
 Can you clarify (preferably in the commit message as well) exactly
 *why* these largefile tests are useless?  For example, is there
 another test that covers this already?

Ah, sorry about that. It was discussed on the tracker issue, but I guess I
can't expect people to read through 90+ messages to figure it out :P

The short version is that it was supposed to test 4GB+ inputs, but in 2.7,
the functions being tested don't accept inputs that large.

The details:

The test was originally intended to catch the case where crc32() or adler32()
would get a buffer of >=4GB, and then silently truncate the buffer size and
produce an incorrect result (issue10276). It had been written for 3.x, and then
backported to 2.7. However, in 2.7, zlibmodule.c doesn't define
PY_SSIZE_T_CLEAN, so passing in a buffer of >=2GB raises an OverflowError
(see issue8651). This means that it is impossible to trigger the bug in question
on 2.7, making the test pointless.

Of course, the code that was deleted tests with an input sized 2GB-1 or 1GB,
rather than 4GB (the size used in 3.x). When the test was backported, the size
of the input was reduced, to avoid triggering an OverflowError. At the time,
no-one realized that this also would not trigger the bug being tested for; it
only came to light when the test started crashing for unrelated reasons
(issue11277).

Cheers,
Nadeem


Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Benjamin Peterson
2011/5/9 Victor Stinner victor.stin...@haypocalc.com:
 Hi,

 Commit changelogs are important for understanding why the code was changed.
 I regularly use hg blame to find which commit introduced a particular
 line of code, and I am always happy if I can find an issue number
 because the issue usually contains the whole story.

 Since the migration to Mercurial, we also have a great tool that adds a
 comment to an issue if the changelog contains an issue number (e.g. a
 changelog starting with "Issue #11: ..."). So if someone watches an
 issue (is in the nosy list), (s)he will be notified that a related commit
 was pushed. It is not exactly something new: we already did that with
 Subversion, except that today it is more automatic.

 I noticed that some recent commits don't contain the issue number:
 please try to always prefix your changelog with the issue number. It is
 not mandatory, but it helps me when I dig through the Python history.

 --

 For merge commits: many developers just write "merge" or "merge 3.1". I
 have to go to the parent commit (and sometimes to the grandparent,
 3.1 -> 3.2 -> 3.3) to learn more about the commit.

I thought the whole point of merging was that you brought a changeset
from one branch to another. This is why I just write "merge", because
otherwise you're technically duplicating information that is pulled
onto the branch by merging.

It seems like something that should be solved by tools, such as a
visual graph display indicating what is merged (like Bazaar).



-- 
Regards,
Benjamin


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Victor Stinner
Le lundi 09 mai 2011 à 09:00 -0400, Jim Jewett a écrit :
 Are you asserting that all foreign modules (or at least all handled by
 this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
 won't change?)

C and C++ identifiers are restricted to ASCII. I don't know about Fortran
or Java.

Is it possible to write a CPython extension module in Java or Fortran?

(My change doesn't concern Jython: it's an implementation detail of
dynamic modules in CPython.)

 Is this ASCII restriction (as opposed to even UTF8) really needed?

I prefer to explicitly limit module names of dynamic modules to ASCII.

If we decide to extend the support to something other than ASCII, we will
need a working module to test it, and maybe also a test.

 Or are you just saying that we need to create an ASCII name for passing to C?

You pass a Unicode module name to import ("import hé" or
__import__('hé')), and Python encodes the name to ASCII if it is a
dynamic module. It is still possible to use non-ASCII module names, but
only for modules written in Python.
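
A quick way to see the restriction from Python code: the init function of
a dynamic module is looked up as PyInit_NAME, so NAME has to survive an
ASCII encoding. The helper below is a hypothetical stand-in written for
this message, not the code in importdl.c, and "pkg.spam" is a made-up
module name.

```python
def init_function_name(module_name):
    """Build the PyInit_NAME symbol for the last component of a module name."""
    shortname = module_name.rpartition(".")[2]
    # Raises UnicodeEncodeError for names that cannot be encoded to ASCII.
    return b"PyInit_" + shortname.encode("ascii")


print(init_function_name("zlib"))
print(init_function_name("pkg.spam"))

try:
    init_function_name("hé")
except UnicodeEncodeError as exc:
    print("cannot be a dynamic module name:", exc)
```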

Victor



Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Victor Stinner
Le lundi 09 mai 2011 à 09:08 -0500, Benjamin Peterson a écrit :
 It seems like something that should be solved by tools like a display
 visual graph indicating what is merged. (like Bazaar)

Yeah, we could fix buildbot, the hg.python.org website, improve hg log, and
all other tools using Mercurial. But until then, I would prefer to
duplicate the information.

Victor



Re: [Python-Dev] Borrowed and Stolen References in API

2011-05-09 Thread Nick Coghlan
On Fri, May 6, 2011 at 8:27 PM, Antoine Pitrou solip...@pitrou.net wrote:
 On Fri, 06 May 2011 13:28:11 +1200
 Greg Ewing greg.ew...@canterbury.ac.nz wrote:

 Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:

  This is not always true, for example when the item is already present
  in the dict.
  It's not important to know what the function does to the object,
  Only the action on the reference is relevant.

 Yes, that's the whole point. When using a function,
 what you need to know is whether it borrows or steals
 a reference.

 Doesn't "borrow" mean the same as "steal" in that context?
 If an API borrows a reference, I expect it to take it from me.

Input parameter, borrowed or new reference: caller retains ownership
and must still decref item
Input parameter, stolen reference: caller transfers ownership and must
NOT decref item (or must incref before call to guarantee lifecycle if
planning to continue using the object after the call)

Output parameter or return value, borrowed reference: caller does NOT
receive ownership and does not need to decref item, but needs to be
careful of lifecycle (and promote to a full reference with incref if
the borrowed reference may outlive the original)
Output parameter or return value, stolen or new reference: caller
receives ownership and must decref item

One interesting aspect is that from the caller's point of view, a
*new* reference to the relevant object behaves like a borrowed reference
for input parameters, but like a stolen reference for output parameters
and return values. It is typically the converse cases (stolen
reference to an input parameter, borrowed reference to an output
parameter or return value) that require special attention on the
caller's part.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Steven D'Aprano

Eli Bendersky wrote:

Hi all,

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
    print(x)
    x = 1
    print(x)
foo()

Will throw:

   UnboundLocalError: local variable 'x' referenced before assignment


I think part of the problem is that UnboundLocalError is a jargon name, 
while its predecessor NameError (used up to Python 1.5) is far more 
intuitively obvious.
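
For what it's worth, the two exceptions are still related:
UnboundLocalError is a subclass of NameError, so code written to catch
NameError also catches the newer exception. A minimal check, reusing the
example from the original post:

```python
x = 5


def foo():
    print(x)
    x = 1


print(issubclass(UnboundLocalError, NameError))

try:
    foo()
except NameError as exc:  # also catches UnboundLocalError
    print(type(exc).__name__)
```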




On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started thinking whether this behavior is desired or just an implementation
artifact.

[...]

Would it not be worth to make Python's behavior more expected in this case,
at the cost of some implementation complexity? What are the cons to making
such a change? At least judging by the amount of people getting confused by
it, maybe it's in line with the zen of Python to behave more explicitly
here.


I think you are making an unwarranted assumption about what is "more 
expected". I presume you are thinking that the expected behaviour is 
that foo() should:


print global x (5)
assign 1 to local x
print local x (1)

If we implemented this change, there would be no more questions about 
UnboundLocalError, but instead there would be lots of questions like 
"why is it that globals revert to their old value after I change them in 
a function?".





--
Steven



Re: [Python-Dev] Problems with regrtest and with logging

2011-05-09 Thread Nick Coghlan
On Sat, May 7, 2011 at 3:51 AM, Éric Araujo mer...@netwok.org wrote:
 regrtest helpfully reports when a test leaves the environment unclean
 (sys.path, os.environ, logging._handlerList), but I think the implementation
 is buggy: it compares object identity and then value.  Why is comparing
 identity useful?  I’d just use ==.  It makes writing cleanup code easier
 (just use addCleanup(setattr, obj, 'attr', copy(obj.attr))).

Because changing the identity of any of those global state attributes
that regrtest monitors is itself suggestive of a bug. When it comes to
containers, identity matters at least as much as value does (and
sometimes more so - e.g. sys.modules). Replacing those global
containers with new ones isn't guaranteed to work, as they may be
cached in various places rather than always retrieved fresh from the
relevant module namespace. Modifying them in place, on the other hand,
does the right thing even in the presence of cached references.
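
A small sketch of why identity matters, using a made-up namespace that
stands in for a module and a made-up cached reference to its container
(none of these names come from regrtest):

```python
import types

# Stand-in for a module that owns some global state (like logging._handlerList).
state = types.SimpleNamespace(handlers=["default"])

# Some other code grabbed a reference to the container earlier on.
cached = state.handlers

# "Restoring" by rebinding: equal value at first, but a different object.
state.handlers = list(cached)
state.handlers.append("leaked")
print(cached)  # the cached reference never sees the change

# Restoring in place keeps every cached reference in sync.
state.handlers = cached
state.handlers[:] = ["default", "restored"]
print(cached)
```

After the rebinding step an == comparison against the saved value could
still pass at the moment of restoring, while the cached reference and the
module attribute have already drifted apart.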

A comment to that effect may be a useful addition to regrtest, as I
expect others may have similar questions about those identity checks
in the future. (It may even be a useful addition to the documentation,
but I have no idea where it could be sensibly included).

Also, don't be surprised if wholesale cleanup like that isn't
completely reliable - it's far, far better if the test case
understands the changes it is making (even indirectly) and explicitly
reverses them. Save-and-restore should be a last resort technique
(although context managers that are designed for more general use,
such as warnings.catch_warnings(), use save-and-restore by necessity,
since they have no control over the body of the relevant with
statements).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Eli Bendersky
 I think you are making an unwarranted assumption about what is more
 expected. I presume you are thinking that the expected behaviour is that
 foo() should:

 print global x (5)
 assign 1 to local x
 print local x (1)

 If we implemented this change, there would be no more questions about
 UnboundLocalError, but instead there would be lots of questions like why is
 it that globals revert to their old value after I change them in a
 function?.


True, but this is less confusing and follows the rules in a more
straightforward way. "x = 1" without a 'global x' assigns a local x; this
makes sense and is similar to what happens in C where an inner declaration
temporarily shadows a global one.

Eli


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Nick Coghlan
On Mon, May 9, 2011 at 11:00 PM, Jim Jewett jimjjew...@gmail.com wrote:
 Are you asserting that all foreign modules (or at least all handled by
 this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
 won't change?)

The extension module that interfaces them to CPython will be written
in C, or something that can export a C-compatible library interface
(after reading in the Python C API headers).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Eli Bendersky
 x = 5
 def foo ():
     print (x)
     if bar ():
         x = 1
     print (x)


I wish you'd annotate this code sample, what do you intend it to
demonstrate?

It probably shows the original complaint even more strongly. As for being a
problem with the suggested solution, I suppose you're right, although it
doesn't make it much different. Still, before a *possible* assignment to
'x', it should be loaded as LOAD_NAME since it was surely not bound as
local, yet.
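
For reference, a minimal sketch of how current CPython treats the name in the original example (the one without the `if`): every access to x inside foo uses the fast-locals opcodes, including the one before the assignment. Exact opcode names differ between CPython versions, so this only collects them rather than assuming a particular spelling.

```python
import dis

x = 5

def foo():
    print(x)
    x = 1
    print(x)

# Collect the opcode names of every instruction that touches the name 'x'.
ops_on_x = [ins.opname for ins in dis.get_instructions(foo) if ins.argval == "x"]
print(ops_on_x)
```

On the CPython versions I am aware of, the list contains only LOAD_FAST-style and STORE_FAST opcodes and no LOAD_GLOBAL or LOAD_NAME for x.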

Eli


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Nick Coghlan
On Tue, May 10, 2011 at 1:01 AM, Eli Bendersky eli...@gmail.com wrote:

 I think you are making an unwarranted assumption about what is more
 expected. I presume you are thinking that the expected behaviour is that
 foo() should:

 print global x (5)
 assign 1 to local x
 print local x (1)

 If we implemented this change, there would be no more questions about
 UnboundLocalError, but instead there would be lots of questions like why is
 it that globals revert to their old value after I change them in a
 function?.

 True, but this is less confusing and follows the rules in a more
 straightforward way. x = 1 without a 'global x' assigns a local x, this make
 sense and is similar to what happens in C where an inner declaration
 temporarily shadows a global one.

However, since flow control constructs in Python don't create new
scopes (unlike C/C++), you run into a fundamental problem with cases
like the one Isaac posted, or even nastier ones like the following:

def f():
  if bar():
    fill = 1
  else:
    fiil = 2
  print(fill)  # Q: What does this do when bool(bar()) is False?

Since we want to make the decision categorically at compile-time, the
simplest, least-confusing option is to say "assignment makes a
variable name local; referencing it before the first assignment is now
an error". I don't know of anyone that particularly *likes*
UnboundLocalError, but it's better than letting errors like the one
above pass silently. (It obviously doesn't trap *all* typo-related
errors, but it at least lets you reason sanely about name bindings.)
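
A runnable version of the example above, as a sketch: bar() here is a hypothetical stand-in that always returns False, and the function returns the value instead of printing it so the outcome can be inspected.

```python
def bar():
    return False  # hypothetical stand-in for the condition in the example

def f():
    if bar():
        fill = 1
    else:
        fiil = 2  # deliberate typo, as in the example above
    return fill

try:
    f()
except UnboundLocalError as exc:
    outcome = type(exc).__name__
else:
    outcome = "no error"

# Both spellings were made local at compile time by their assignments.
local_names = set(f.__code__.co_varnames)
print(outcome, sorted(local_names))
```

The typo surfaces as an UnboundLocalError instead of silently picking up some global named fill.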

On the reasoning-sanely front, closures likely present a more
compelling argument:

def f():
  def g():
    print(x) # We want this to refer to the closure in f(), thanks
  x = 1
  return g

UnboundLocalError is really about aligning the rules for the current
scope with those for references from nested scopes (i.e. x is a local
variable of f, whether it is referenced from f's local scope, or any
nested scope within f)
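
The closure case can be checked directly. This is a small sketch in which g returns x instead of printing it, so the result can be compared:

```python
def f():
    def g():
        return x  # refers to f's local x, even though x is assigned later
    x = 1
    return g

g = f()
value = g()

# The compiler records x as a cell variable of f and a free variable of g.
cell_names = f.__code__.co_cellvars
free_names = g.__code__.co_freevars
print(value, cell_names, free_names)
```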

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Nick Coghlan
On Tue, May 10, 2011 at 1:06 AM, Eli Bendersky eli...@gmail.com wrote:
 It probably shows the original complaint even more strongly. As for being a
 problem with the suggested solution, I suppose you're right, although it
 doesn't make it much different. Still, before a *possible* assignment to
 'x', it should be loaded as LOAD_NAME since it was surely not bound as
 local, yet.

Yeah, I've decided I'm happier with the closure based arguments than
the conditional statement related ones. "Assignments create local
variables" is a relatively simple rule to reason about, and is equally
valid for the current scope and for any nested scopes. The symtable
analysis for nested scopes is ordering independent (and can't be
changed for backwards compatibility reasons if nothing else), and
UnboundLocalError is a natural outgrowth of applying those semantics
to the current scope as well.
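
The ordering-independent analysis can be observed with the standard symtable module. This is a minimal sketch; the source string is the example from earlier in the thread and "<example>" is an arbitrary file name.

```python
import symtable

source = """
x = 5
def foo():
    print(x)
    x = 1
"""

module_table = symtable.symtable(source, "<example>", "exec")
# Pick out the nested table for the function foo.
foo_table = next(t for t in module_table.get_children() if t.get_name() == "foo")
x_in_foo = foo_table.lookup("x")

# x is classified as local for the whole function body,
# regardless of where in the body the assignment appears.
print(x_in_foo.is_local(), x_in_foo.is_global())
```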

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread R. David Murray
On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson benja...@python.org 
wrote:
 I thought the whole point of merging was that you brought a changeset
 from one branch to another. This is why I just write "merge" because
 otherwise you're technically duplicating information that is pulled
 onto the branch by merging.

No it isn't.  The commit message isn't pulled into the new branch.

 It seems like something that should be solved by tools like a display
 visual graph indicating what is merged. (like Bazaar)

You'd need some extension to hg log that would show the original commit
message for the first changeset in the merge line in order to fix
this.  I doubt that is going to happen.

Note that saying just 'merge' makes perfect sense when you are pulling
in a whole group of changesets in order to synchronize two branches.
But if you are applying a single changeset to multiple branches,
as we often do in our workflow, then I think duplicating the commit
message is (1) easy to do and (2) very helpful when looking at
hg log output.
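
A sketch of what duplicating the message could look like, using the "(Merge 3.1)" prefix convention Victor described at the start of the thread. The helper name and the sample message are hypothetical, and nothing here calls Mercurial itself.

```python
def merge_message(original_message, source_branch):
    """Build a merge commit message that repeats the original changelog.

    Hypothetical helper: it only formats text and does not talk to hg.
    """
    return "(Merge {}) {}".format(source_branch, original_message.strip())

message = merge_message("Issue #11: fix the frobnicator\n", "3.1")
print(message)
```

The resulting string would then be passed as the commit message when committing the merge.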

--
R. David Murray   http://www.bitdance.com


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Isaac Morland

On Mon, 9 May 2011, Eli Bendersky wrote:


x = 5
def foo ():
    print (x)
    if bar ():
        x = 1
    print (x)



I wish you'd annotate this code sample, what do you intend it to
demonstrate?

It probably shows the original complaint even more strongly. As for being a
problem with the suggested solution, I suppose you're right, although it
doesn't make it much different. Still, before a *possible* assignment to
'x', it should be loaded as LOAD_NAME since it was surely not bound as
local, yet.


Extrapolating from your suggestion, you're saying before a *possible* 
assignment it will be treated as global, and after a *possible* assignment 
it will be treated as local?


But surely:

print (x)
if False:
    x = 1
print (x)

should always print the same thing twice (in the absence of actions taken 
by other threads)!


Replace False by something that is usually (but not always) True, and 
print (x) by something that actually does something, and you had best 
put on your helmet because it's going to be a fun ride.


But I won't be on it.

The idea that the same name within the same scope always refers to the 
same value is an idea from functional programming and not part of Python; 
but surely the same name within the same scope should at least always 
refer to the same variable!
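
For comparison, a sketch of what the existing rules do with the "if False" case when it is placed inside a function (foo is a hypothetical wrapper, and it returns values rather than printing them): both uses of x refer to the same local variable, so the first one fails.

```python
x = 5

def foo():
    first = x      # x is local to foo because of the assignment below
    if False:
        x = 1      # never executed, but still makes x a local name
    return first, x

try:
    foo()
except UnboundLocalError as exc:
    outcome = type(exc).__name__
else:
    outcome = "no error"
print(outcome)
```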


If something is to be done here, it occurs to me that the same parser that 
decides that the initial reference to x should use the local x could 
conceivably issue an error right away ("local variable can never be 
assigned before use") rather than waiting until runtime.  But even if I 
haven't confused myself about the possibility of this raising a false 
positive (and it certainly could in the presence of dead code), it 
wouldn't catch cases of conditional premature use of a local variable. I 
think in those cases people would still ask the same questions they do 
with the existing implementation.


Isaac Morland   CSCF Web Guru
DC 2554C, x36650WWW Software Specialist


Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Éric Araujo

Hi,

Le 09/05/2011 16:08, Benjamin Peterson a écrit :

2011/5/9 Victor Stinner victor.stin...@haypocalc.com:
For merge commits: many developers just write "merge" or "merge 3.1". I
have to go to the parent commit (and sometimes to the grandparent,
3.1-3.2-3.3) to learn more about the commit.


I follow conventions I’ve seen elsewhere (maybe Mercurial itself): I 
use “Branch merge” when I merge anonymous branches on the same named 
branch, and “Merge x.y” for forward-porting across named branches.


I also tend to do more than one commit before merging.  It would not be 
very easy with my current toolchain to get the commit message(s) to 
insert into the new message, and I think it’s not necessary.



I thought the whole point of merging was that you brought a changeset
from one branch to another. This is why I just write "merge" because
otherwise you're technically duplicating information that is pulled
onto the branch by merging.


+1.  No interest in manually duplicating available information.

Le 09/05/2011 17:44, R. David Murray a écrit :

No it isn't.  The commit message isn't pulled into the new branch.


Sorry, your terminology does not make sense.  If you mean that the 
commit message is not reused in the new commit after the merge, it’s 
true.  However, the commit message with the relevant information is 
available as part of the changesets that have been pulled and merged.


Regards


Re: [Python-Dev] Problems with regrtest and with logging

2011-05-09 Thread Éric Araujo

Hi,

Thanks for the help.  I didn’t know about handler.close.  (By which I 
mean that I used logging without re-reading its documentation, which is 
a testimony to its usability :)


The cases you refer to seem to be _set_logger in packaging/run.py (which
appears not to be used at all - there appear to be no other references to
it in the code),

Yep, probably dead code.  I think that a handler should be defined
only once, in the “if __name__ == '__main__'” block.  Am I right?  Just
like you don’t call sys.exit from library code (hello optparse!), you
don’t set logging handlers in library code, only in the outermost layer of
the script.



Dispatcher.__init__ in packaging/run.py and

This is the new-fangled command line parser, which can run global 
(Python-wide) commands (search, uninstall, etc.) as well as traditional 
project-wide commands (build, check, sdist, etc.)



Distribution.parse_command_line in packaging/dist.py.

This is the older command line parser, that can handle only 
project-wide commands.  I’m not sure the work is finished to integrate 
both parsers; my smoke test used to be --help-commands, which can be 
hard to run these days.


The problem is that Dispatcher or Distribution get the quiet or verbose 
options from the command-line deep in the library code, and want to use 
it to configure the log level on the handler, which I’ve just said 
should be set up at a much higher level.  To solve this, I’m going to 
add a *logginghandler* argument to Dispatcher/Distribution; that way, 
the creation of the handler will happen only once and at a high level, 
but the command-line parsing code will be able to set the log handler 
from the command-line arguments. :)


In the second and third cases, can you be sure that only one of these
code paths will be executed, at most once?


Gut feeling is yes, but we’ve learned not to trust our instinct with 
distutils.


In the case of the test support code, I'm not really sure that
LoggingCatcher is needed. There is already a TestHandler class in
test.support which captures records in a buffer, and allows flexible
matching for assertions, as described in

distutils used its own log module; this mixin was used to intercept 
messages sent with this system.  When we migrated to stdlib logging, I 
added a todo comment to update the code to use something less kludgy :)  
The post you linked to is already in my bookmarks.  Note that this 
support module also helps with Python 2.4+, so I may have to copy-paste 
TestHandler.


So, I will fix the LoggingCatcher mixin to use the much cleaner 
addHandler/removeHandler combo (I’ll avoid calling 
logging._removeHandlerRef if I don’t have to) and try my idea about the 
handler instantiation in the code.  Thanks a lot!


Cheers


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Steven D'Aprano

Eli Bendersky wrote:

I think you are making an unwarranted assumption about what is more
expected. I presume you are thinking that the expected behaviour is that
foo() should:

print global x (5)
assign 1 to local x
print local x (1)

If we implemented this change, there would be no more questions about
UnboundLocalError, but instead there would be lots of questions like why is
it that globals revert to their old value after I change them in a
function?.



True, but this is less confusing and follows the rules in a more
straightforward way. x = 1 without a 'global x' assigns a local x, this make
sense and is similar to what happens in C where an inner declaration
temporarily shadows a global one.


I disagree that it is less confusing. Instead of a nice, straightforward 
error that you can google, the function will silently do the wrong 
thing, giving no clue that weirdness is happening.


def spam():
    if x < 0:  # refers to global x
        x = 1  # now local
    if x > 0:  # could be either global or local
        x = x - 1  # local on the LHS of the equal
                   # sometimes global on the RHS
    else:
        x += 1  # local x, but what value does it have?


Just thinking about debugging the mess that this could make gives me a 
headache.




--
Steven



Re: [Python-Dev] Problems with regrtest and with logging

2011-05-09 Thread Éric Araujo

Hi,


When it comes to
containers, identity matters at least as much as value does (and
sometimes more so - e.g. sys.modules). Replacing those global
containers with new ones isn't guaranteed to work, as they may be
cached in various places rather than always retrieved fresh from the
relevant module namespace. Modifying them in place, on the other
hand, does the right thing even in the presence of cached references.


That makes sense, thanks for the explanation!
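
A small sketch of the difference described above. The names registry and cached are made up for illustration; registry stands in for a global container such as sys.modules.

```python
registry = {"a": 1}          # stands in for a global container
cached = registry            # some other module keeps its own reference

saved = dict(registry)       # snapshot taken before the test runs
registry["b"] = 2            # the test mutates the container

# Restoring by building a replacement object does not help holders of the
# old reference: `cached` still sees the mutated dict.
replacement = dict(saved)
rebinding_fixes_cache = (cached == replacement)

# Restoring in place repairs every reference, cached or not.
registry.clear()
registry.update(saved)
in_place_fixes_cache = (cached is registry and cached == saved)
print(rebinding_fixes_cache, in_place_fixes_cache)
```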


A comment to that effect may be a useful addition to regrtest, as I
expect others may have similar questions about those identity checks
in the future. (It may even be a useful addition to the documentation,
but I have no idea where it could be sensibly included).


Somewhere in unittest doc, say in the section about tearDown.  Or maybe 
it’s time for a Python testing best practices howto?



Also, don't be surprised if wholesale cleanup like that isn't
completely reliable - it's far, far better if the test case
understands the changes it is making (even indirectly) and explicitly
reverses them.


Yep, I was probably bringing out the big guns too early.  
self.addCleanup(sys.path.remove, path) is better and even shorter than 
my previous code!
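
A self-contained sketch of that pattern. The test class and the path are hypothetical; only the addCleanup call is taken from the message above.

```python
import sys
import unittest

EXTRA_PATH = "/hypothetical/test/dir"  # illustrative path, not from the thread

class PathCleanupTest(unittest.TestCase):
    def test_with_extra_path(self):
        sys.path.append(EXTRA_PATH)
        # Explicitly undo exactly the change this test made.
        self.addCleanup(sys.path.remove, EXTRA_PATH)
        self.assertIn(EXTRA_PATH, sys.path)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(PathCleanupTest)
result = unittest.TestResult()
suite.run(result)
print(result.wasSuccessful(), EXTRA_PATH in sys.path)
```

The cleanup runs whether the test passes or fails, and sys.path is modified in place, so cached references to it stay valid.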


Cheers


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Terry Reedy

On 5/9/2011 9:27 AM, Stefan Behnel wrote:

Eli Bendersky, 09.05.2011 14:56:

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
    print(x)
    x = 1
    print(x)
foo()

Will throw:

UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started thinking whether this behavior is desired or just an
implementation artifact.


Well, basically any compiler these days can detect that a variable is
being used before assignment, or at least that this is possibly the
case, depending on prior branching.

ISTM that your suggestion is to let x refer to the outer x up to the
assignment and to the inner x from that point on. IMHO, that's much
worse than the current behaviour and potentially impractical due to
conditional assignments.

However, it's also a semantic change to reject code with unbound locals
at compile time, as the specific code in question may actually be
unreachable at runtime. This makes me think that it would be best to
discuss this on the python-ideas list first.

If nothing else, I'd like to see a discussion on this behaviour being an
implementation detail of CPython or a feature of the Python language.

Stefan




--
Terry Jan Reedy



[Python-Dev] Commit messages: please avoid temporal ambiguity

2011-05-09 Thread Terry Reedy
A commit (push) partitions time and behavior into before and after (with
a short change period in between during which behavior is undefined).

Some commit messages have the form 'x does y'. Does 'does' mean before
or after? Sometimes that is clear. 'x crashes' means before. 'x returns
correct value' means after. But some messages of this type are unclear
to me as written.

Consider 'x raises exception'. The temporal reference is obvious to the
committer but not necessarily to everyone else. It could mean 'x used to
segfault and now raises a catchable exception'. There was a fix like
this (with a clear message) just today. It could also mean 'x used to
raise but now returns an answer'. There have been many fixes like this.

Two minimal fixes are 'x raised exception' or 'make x raise exception'.

--
Terry Jan Reedy



Re: [Python-Dev] Problems with regrtest and with logging

2011-05-09 Thread Vinay Sajip
Éric Araujo merwok at netwok.org writes:


  Yep, probably dead code.  I think that an handler should be defined 
  only once, in the “if __name__ == '__main__'” block.  Am I right?  Just 
  like you don’t call sys.exit from library code (hello optparse!), you 
  don’t set logging handlers in library code, only in the outmost layer of 
  the script.

That's right, though it's OK to provide a documented convenience API for adding
handlers.
 
  The problem is that Dispatcher or Distribution get the quiet or verbose 
  options from the command-line deep in the library code, and want to use 
  it to configure the log level on the handler, which I’ve just said 
  should be set up at a much higher level.  To solve this, I’m going to 
  add a *logginghandler* argument to Dispatcher/Distribution; that way, 
  the creation of the handler will happen only once and at a high level, 
  but the command-line parsing code will be able to set the log handler 
  from the command-line arguments. :)

You don't necessarily need to set the level on the handler - why can you not
just set it on the logger? The effect would often be the same: the logger's
level is checked first, and then the handler's level. Generally you set levels
on handlers when you want specific behaviour, such as all ERROR and above to a
particular file, all CRITICAL to an email handler etc.

For command-line scripts outputting to the console and nowhere else, usually you
could just add a StreamHandler (with no level set on it), and set the level on
the logger. Where the functionality may be used in an API, you should perhaps
check logger.hasHandlers() and avoid adding handlers if there are already some
added by a using library or application.
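
A minimal sketch of that advice. The function name, the logger name "packaging_demo" and the mapping of quiet/verbose to ERROR/DEBUG are assumptions made for illustration, not the packaging code.

```python
import logging
import sys

def configure_logging(verbose=False, quiet=False, name="packaging_demo"):
    """Set the level on the logger; add a console handler only if needed."""
    logger = logging.getLogger(name)
    # hasHandlers() also looks at ancestor loggers, so an application that
    # has already configured logging is left alone.
    if not logger.hasHandlers():
        logger.addHandler(logging.StreamHandler(sys.stderr))
    if verbose:
        level = logging.DEBUG
    elif quiet:
        level = logging.ERROR
    else:
        level = logging.INFO
    # No level is set on the handler; the logger's level does the filtering.
    logger.setLevel(level)
    return logger

logger = configure_logging(verbose=True)
logger.debug("visible because the logger level is DEBUG")
```

Calling configure_logging() again only changes the level; it does not stack a second handler on the same logger.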

Regards,

Vinay Sajip




Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread R. David Murray
On Mon, 09 May 2011 17:55:42 +0200, Éric Araujo 
mer...@netwok.org wrote:
  Le 09/05/2011 16:08, Benjamin Peterson a écrit :
  2011/5/9 Victor Stinner victor.stin...@haypocalc.com:
  For merge commits: many developers just write merge or merge
  3.1. I
  have to go to the parent commit (and something to the grandparent,
  3.1-3.2-3.3) to learn more about the commit.
 
  I follow conventions I’ve seen elsewhere (maybe Mercurial itself): I
  use “Branch merge” when I merge anonymous branches on the same named
  branch, and “Merge x.y” for forward-porting across named branches.
 
  I also tend to do more than one commit before merging.  It would not be
  very easy with my current toolchain to get the commit message(s) to
  insert into the new message, and I think it’s not necessary.
 
  I thought the whole point of merging was that you brought a changeset
  from one branch to another. This why I just write merge because
  otherwise you're technically duplicating information that is pulled
  onto the branch by merging.
 
  +1.  No interest in manually duplicating available information.
 
  Le 09/05/2011 17:44, R. David Murray a écrit :
  No it isn't.  The commit message isn't pulled into the new branch.
 
  Sorry, your terminology does not make sense.  If you mean that the
  commit message is not reused in the new commit after the merge, it’s
  true.  However, the commit message with the relevant information is
  available as part of the changesets that have been pulled and merged.

The changesets are in the repository and there are pointers to them
from the merge changeset, sure, but the data isn't in the checkout
(that's how I understood "pulled into the new branch").

If I do 'hg log' and search for a revno (that I got from hg annotate),
the commit message describing the change is not attached to that revno,
nor as far as I know is there a tool that makes it easy to get from that
revno to the explanatory commit message.  That's what Victor and I are
talking about.  Is there a tool that fixes this problem?  (svnmerge did a
nice job of that from the automate-the-message-generation end of things).

--
R. David Murray   http://www.bitdance.com


Re: [Python-Dev] Commit messages: please avoid temporal ambiguity

2011-05-09 Thread Ned Batchelder

On 5/9/2011 1:24 PM, Terry Reedy wrote:
A commit (push) partition time and behavior into before and after 
(with a short change period in between during which behavior is 
undefined).


Some commit messages have the form 'x does y'. Does 'does' mean before 
or after? Sometimes that is clear. 'x crashes' means before. 'x return 
correct value' means after. But some messages of this type are unclear 
to me as written.


Consider 'x raises exception'? The temporal reference is obvious to 
the committer but not necessary to everyone else. It could mean 'x 
used to segfault and now raises a catchable exception'. There was a 
fix like this (with a clear message) just today. It could also mean 'x 
used to raise but now return an answer. There have been many fixes 
like this.


Two minimal fixes are 'x raised exception' or 'make x raise exception'.


I've always favored "X now properly raises an exception".

--Ned.


Re: [Python-Dev] Commit messages: please avoid temporal ambiguity

2011-05-09 Thread Guido van Rossum
On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder n...@nedbatchelder.com wrote:
 On 5/9/2011 1:24 PM, Terry Reedy wrote:

 A commit (push) partition time and behavior into before and after (with a
 short change period in between during which behavior is undefined).

 Some commit messages have the form 'x does y'. Does 'does' mean before or
 after? Sometimes that is clear. 'x crashes' means before. 'x return correct
 value' means after. But some messages of this type are unclear to me as
 written.

 Consider 'x raises exception'? The temporal reference is obvious to the
 committer but not necessary to everyone else. It could mean 'x used to
 segfault and now raises a catchable exception'. There was a fix like this
 (with a clear message) just today. It could also mean 'x used to raise but
 now return an answer. There have been many fixes like this.

 Two minimal fixes are 'x raised exception' or 'make x raise exception'.

 I've always favored X now properly raises an exception.

While my own preference is "make X properly raise an exception", I'm
happy with any of the alternatives proposed here, and grateful to
Terry for calling this out. Checkin comments of the form "X does Y"
are ambiguous and confusing. (Same for feature requests in the
tracker.)

I'm curious where the habit to use the present tense comes from; I
wonder if it originates in some agile development practice?

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Commit messages: please avoid temporal ambiguity

2011-05-09 Thread Eric Smith
On 05/09/2011 03:17 PM, Guido van Rossum wrote:
 On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder n...@nedbatchelder.com 
 wrote:
 On 5/9/2011 1:24 PM, Terry Reedy wrote:

 A commit (push) partition time and behavior into before and after (with a
 short change period in between during which behavior is undefined).

 Some commit messages have the form 'x does y'. Does 'does' mean before or
 after? Sometimes that is clear. 'x crashes' means before. 'x return correct
 value' means after. But some messages of this type are unclear to me as
 written.

 Consider 'x raises exception'? The temporal reference is obvious to the
 committer but not necessary to everyone else. It could mean 'x used to
 segfault and now raises a catchable exception'. There was a fix like this
 (with a clear message) just today. It could also mean 'x used to raise but
 now return an answer. There have been many fixes like this.

 Two minimal fixes are 'x raised exception' or 'make x raise exception'.

 I've always favored X now properly raises an exception.
 
 While my own preference is make X properly raise an exception I'm
 happy with any of the alternatives proposed here, and grateful to
 Terry for calling this out. Checkin comments of the form X does Y
 are ambiguous and confusing. (Same for feature requests in the
 tracker.)
 
 I'm curious where the habit to use the present tense comes from; I
 wonder if it originates in some agile development practice?
 

Thanks indeed for bringing this up, Terry. It's been on my to-do list
for a while. I think it comes from just copying the title of a bug
report. The bug is "X does Y", and that's what's used in the fix.

Eric.



Re: [Python-Dev] Commit messages: please avoid temporal ambiguity

2011-05-09 Thread Guido van Rossum
On Mon, May 9, 2011 at 12:36 PM, Eric Smith e...@trueblade.com wrote:
 On 05/09/2011 03:17 PM, Guido van Rossum wrote:
 On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder n...@nedbatchelder.com 
 wrote:
 On 5/9/2011 1:24 PM, Terry Reedy wrote:

 A commit (push) partition time and behavior into before and after (with a
 short change period in between during which behavior is undefined).

 Some commit messages have the form 'x does y'. Does 'does' mean before or
 after? Sometimes that is clear. 'x crashes' means before. 'x return correct
 value' means after. But some messages of this type are unclear to me as
 written.

 Consider 'x raises exception'? The temporal reference is obvious to the
 committer but not necessary to everyone else. It could mean 'x used to
 segfault and now raises a catchable exception'. There was a fix like this
 (with a clear message) just today. It could also mean 'x used to raise but
 now return an answer. There have been many fixes like this.

 Two minimal fixes are 'x raised exception' or 'make x raise exception'.

 I've always favored X now properly raises an exception.

 While my own preference is make X properly raise an exception I'm
 happy with any of the alternatives proposed here, and grateful to
 Terry for calling this out. Checkin comments of the form X does Y
 are ambiguous and confusing. (Same for feature requests in the
 tracker.)

 I'm curious where the habit to use the present tense comes from; I
 wonder if it originates in some agile development practice?


 Thanks indeed for bringing this up, Terry. It's been on my to-do list
 for a while. I think it comes from just copying the title of a bug
 report. The bug is X does Y, and that's what's used in the fix.

But in bug reports it is also ambiguous, since I've often seen it used
meaning "X should do Y", which is very confusing when it doesn't do Y
yet at the time the bug is created. :-(

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Commit messages: please avoid temporal ambiguity

2011-05-09 Thread Terry Reedy

On 5/9/2011 4:05 PM, Guido van Rossum wrote:

On Mon, May 9, 2011 at 12:36 PM, Eric Smithe...@trueblade.com  wrote:

On 05/09/2011 03:17 PM, Guido van Rossum wrote:



While my own preference is make X properly raise an exception I'm
happy with any of the alternatives proposed here, and grateful to
Terry for calling this out.


I am willing to admit that I do not know all corners of Python ;-)
I read the commit messages to learn more; in particular, what sorts of
errors exist and how they are fixed.


 Checkin comments of the form X does Y

are ambiguous and confusing. (Same for feature requests in the
tracker.)


I have always assumed that an issue entitled 'x does y' is a bug report 
about doing y now, before a fix.



Thanks indeed for bringing this up, Terry. It's been on my to-do list
for a while. I think it comes from just copying the title of a bug
report. The bug is X does Y, and that's what's used in the fix.


I have also seen this type of message for non-tracker-issue commits.


But in bug reports it is also ambiguous, since I've often seen it used
meaning X should do Y which is very confusing when it doesn't do Y
yet at the time the bug is created. :-(


If I notice a title that bad, I will try to change it.

--
Terry Jan Reedy



Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Terry Reedy

On 5/9/2011 1:54 PM, R. David Murray wrote:


If I do 'hg log' and search for a revno (that I got from hg annotate),
the commit message describing the change is not attached to that revno,
nor as far as I know is there a tool that makes it easy to get from that
revno to the explanatory commit message.  That's what Victor and I are
talking about.  Is there a tool that fixes this problem?  (svnmerge did a
nice job of that from the automate-the-message-generation end of things).


TortoiseSvn, and I presume TortoiseHg also, has a 'recent messages' box
that makes it trivial to reuse a message. I used it with svn and will
make sure to use it, if it exists, when I get started with hg.

--
Terry Jan Reedy



Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Benjamin Peterson
2011/5/9 R. David Murray rdmur...@bitdance.com:
 On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson benja...@python.org 
 wrote:
 I thought the whole point of merging was that you brought a changeset
 from one branch to another. This is why I just write "merge", because
 otherwise you're technically duplicating information that is pulled
 onto the branch by merging.

 No it isn't.  The commit message isn't pulled into the new branch.

 It seems like something that should be solved by tools like a display
 visual graph indicating what is merged. (like Bazaar)

 You'd need some extension to hg log that would show the original commit
 message for the first changeset in the merge line in order to fix
 this.  I doubt that is going to happen.

*cough* http://mercurial.selenic.com/wiki/GraphlogExtension


 Note that saying just 'merge' makes perfect sense when you are pulling
 in a whole group of changesets in order to synchronize two branches.
 But if you are applying a single changeset to multiple branches,
 as we often do in our workflow, then I think duplicating the commit
 message is (1) easy to do and (2) very helpful when looking at
 hg log output.

What's the difference between pulling multiple changesets in and one then?


-- 
Regards,
Benjamin


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Neil Hodgson
Victor Stinner:

 C and C++ identifiers are restricted to ASCII. I don't know about Fortran
 or Java.

   Some C and C++ implementations currently allow non-ASCII
identifiers and the forthcoming C1X and C++0x language standards
include non-ASCII identifiers. The allowed characters are specified in
Annexes of the respective standards.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

   Neil


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Antoine Pitrou
On Mon, 09 May 2011 16:11:15 +0200
Victor Stinner victor.stin...@haypocalc.com wrote:
 On Monday 09 May 2011 at 09:00 -0400, Jim Jewett wrote:
  Are you asserting that all foreign modules (or at least all handled by
  this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
  won't change?)
 
 C and C++ identifiers are restricted to ASCII. I don't know about Fortran
 or Java.

Why is it important, though?
What matters is not what C/C++ can produce, but what a shared library
can export. So the question is: are shared libraries limited to ASCII
symbols?

Regards

Antoine.




Re: [Python-Dev] Borrowed and Stolen References in API

2011-05-09 Thread Greg Ewing

Nick Coghlan wrote:


One interesting aspect is that from the caller's point of view, a
*new* reference to the relevant behaves like a borrowed reference for
input parameters, but like a stolen reference for output parameters
and return values.


I think it's less confusing to use the term "new" only for
output/return values, and "stolen" only for input values.

Inputs are either borrowed or stolen (by the callee).

Outputs are either new (to the caller) or borrowed
(by the caller).

(Or maybe the terms for outputs should be "given" and "lent"?-)

--
Greg


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Victor Stinner
On Tuesday 10 May 2011 at 09:52 +1000, Neil Hodgson wrote:
Some C and C++ implementations currently allow non-ASCII
 identifiers and the forthcoming C1X and C++0x language standards
 include non-ASCII identifiers. The allowed characters are specified in
 Annexes of the respective standards.
 
 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

I read these documents but they don't explain which encoding is used in
libraries and programs. Does it mean that Windows and Linux may use
different encodings? At least the surrogate range (U+D800-U+DFFF) is
excluded, which is good news (the UTF-8 decoder of Python 3 rejects
surrogate characters).

I discovered the -fextended-identifiers option of gcc: with this option,
you can use \uXXXX and \UXXXXXXXX in identifiers, but not \xHH. On
Linux, identifiers are encoded to UTF-8.

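Example:
--
#define _ISOC99_SOURCE
#include <locale.h>
#include <stdio.h>
#include <wchar.h>  /* declares wprintf() */

/* Build with: gcc -fextended-identifiers */
void f\u00E9(void) { wprintf(L"U+00E9 = \xE9\n"); }

/* \U takes exactly eight hex digits */
void g\U000000E8(void) { wprintf(L"U+00E8 = \xE8\n"); }

int main(void) {
    setlocale(LC_ALL, "");  /* let wprintf() encode non-ASCII output */
    f\u00E9(); g\U000000E8();
    return 0;
}
--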

It's not very practical; I would prefer to write Unicode characters
directly (as I can do in Python 3!). I'm not sure that Chinese
developers would prefer to call \u4f60\u597d() instead of hello().

Ok, I now agree, it is possible to use non-ASCII characters in C. But
what about the encoding of symbols in a dynamic library: is it always
UTF-8?
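
A minimal Python 3 sketch of the encoding question, for illustration only. The
identifier is the one from the C example above; the helper name symbol_bytes
is hypothetical and does not correspond to anything in CPython's import code.
It shows the bytes a UTF-8 symbol would contain, that an explicit ASCII
encoding rejects the same name, and that Python 3's UTF-8 codec refuses a
lone surrogate.

```python
name = "f\u00e9"  # the identifier from the C example above


def symbol_bytes(identifier, encoding):
    """Hypothetical helper: bytes a symbol table would hold for this name."""
    return identifier.encode(encoding)


utf8_symbol = symbol_bytes(name, "utf-8")

# An explicit ASCII encoding cannot represent the name at all.
try:
    symbol_bytes(name, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Python 3's strict UTF-8 codec rejects surrogate code points.
try:
    "\udc00".encode("utf-8")
    surrogate_ok = True
except UnicodeEncodeError:
    surrogate_ok = False

print(utf8_symbol, ascii_ok, surrogate_ok)
```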

Victor



Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Neil Hodgson
Victor Stinner:

 I read these documents but they don't explain which encoding is used in
 libraries and programs. Does it mean that Windows and Linux may use
 different encodings?

   Yes, Windows will use UTF-16 as it does for almost everything. From
a user's point of view, these should both just be seen as Unicode.

   Neil


Re: [Python-Dev] Borrowed and Stolen References in API

2011-05-09 Thread Greg Ewing

Marvin Humphrey wrote:


  incremented: The caller has to account for an additional refcount.
  decremented: The caller has to account for a lost refcount.


I'm not sure that really clarifies anything. These terms
sound like they're talking about the reference count of the
object, but if they correspond to borrowed/stolen, they
don't necessarily correlate with what actually happens to
the reference count.

--
Greg


Re: [Python-Dev] Borrowed and Stolen References in API

2011-05-09 Thread Marvin Humphrey
On Tue, May 10, 2011 at 12:13:47PM +1200, Greg Ewing wrote:
 Nick Coghlan wrote:

 One interesting aspect is that from the caller's point of view, a
 *new* reference to the relevant behaves like a borrowed reference for
 input parameters, but like a stolen reference for output parameters
 and return values.

 I think it's less confusing to use the term "new" only for
 output/return values, and "stolen" only for input values.

 Inputs are either borrowed or stolen (by the callee).

 Outputs are either new (to the caller) or borrowed
 (by the caller).

 (Or maybe the terms for outputs should be "given" and "lent"?-)

To solve this problem in a similar system (the Clownfish object system used by
Apache Lucy) we used the keywords "incremented" and "decremented".  Applied to
some Python C API function documentation:

  incremented PyObject* PyTuple_New(Py_ssize_t len)

  int PyTuple_SetItem(PyObject *p, Py_ssize_t pos,
                      decremented PyObject *o)

With "incremented" and "decremented", the perspective is always that of the
caller.

  incremented: The caller has to account for an additional refcount.
  decremented: The caller has to account for a lost refcount.

Marvin Humphrey



Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread R. David Murray
On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson benja...@python.org 
wrote:
 2011/5/9 R. David Murray rdmur...@bitdance.com:
  On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson benja...@python.org wrote:
  I thought the whole point of merging was that you brought a changeset
  from one branch to another. This is why I just write "merge", because
  otherwise you're technically duplicating information that is pulled
  onto the branch by merging.
 
  No it isn't.  The commit message isn't pulled into the new branch.
 
  It seems like something that should be solved by tools like a display
  visual graph indicating what is merged. (like Bazaar)
 
  You'd need some extension to hg log that would show the original commit
  message for the first changeset in the merge line in order to fix
  this.  I doubt that is going to happen.
 
 *cough* http://mercurial.selenic.com/wiki/GraphlogExtension

I'm sorry, but I've looked at the output of that and the mental overhead
has so far proven too high for it to be of any use to me.  I apologize for
not having made the full mental transition to distributed VCS/DAG
(apparently), but it sounds like I'm not the only one.

  Note that saying just 'merge' makes perfect sense when you are pulling
  in a whole group of changesets in order to synchronize two branches.
  But if you are applying a single changeset to multiple branches,
  as we often do in our workflow, then I think duplicating the commit
  message is (1) easy to do and (2) very helpful when looking at
  hg log output.
 
 What's the difference between pulling multiple changesets in and one then?

I'm talking about merging trunk to a feature branch, for example.
I'd not expect any message other than 'merge' for that.

I'd be satisfied if the commit messages listed the issue numbers involved
in the merge, especially if someone (like Éric) is merging more than
one change at a time.

But as I think about this, frankly I'd rather see atomic commits, even
on merges.  That was something I disliked about svnmerge, the fact that
often an svnmerge commit involved many changesets from the other branch.
That was especially painful in exactly the same situation:  trying to
backtrack a change starting from 'svn blame'.  I limited my own use
of multiple-changeset-svnmerge to doc changes and changesets that were
actually related, despite the overhead involved in doing it that way.

All that said, I'm not trying to impose my will on the workflow, I'll
certainly live with the consensus (though unless there is an outcry
against it I'll continue putting the full commit message in my own
merges).
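
As a rough illustration of the two conventions discussed in this thread
(repeating the original message behind a "(Merge 3.1)" prefix for a single
changeset, or at least listing the issue numbers when several changesets are
merged), here is a small Python sketch. The function name merge_message, the
regular expression and the sample messages are all hypothetical; this is not
an existing Mercurial hook or extension.

```python
import re

# Assumed convention: messages mention issues as "Issue #NNN: ...".
ISSUE_RE = re.compile(r"#(\d+)")


def merge_message(branch, originals):
    """Build a merge commit message from the original commit messages."""
    if len(originals) == 1:
        # Single changeset: repeat the original message after the prefix.
        return f"(Merge {branch}) {originals[0]}"
    # Several changesets: list every issue number that was mentioned.
    issues = sorted({int(n) for msg in originals for n in ISSUE_RE.findall(msg)})
    listed = ", ".join(f"#{n}" for n in issues) or "no issue numbers found"
    return f"(Merge {branch}) Issues: {listed}"


single = merge_message("3.1", ["Issue #11: fix the frobnicator"])
several = merge_message("3.2", ["Issue #11: fix A", "Issue #7: fix B", "typo"])
print(single)
print(several)
```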

--
R. David Murray   http://www.bitdance.com


Re: [Python-Dev] Borrowed and Stolen References in API

2011-05-09 Thread Marvin Humphrey
On Tue, May 10, 2011 at 01:28:04PM +1200, Greg Ewing wrote:
 Marvin Humphrey wrote:

   incremented: The caller has to account for an additional refcount.
   decremented: The caller has to account for a lost refcount.

 I'm not sure that really clarifies anything. These terms
 sound like they're talking about the reference count of the
 object, but if they correspond to borrowed/stolen, they
 don't necessarily correlate with what actually happens to
 the reference count.

Hmm, they don't correspond to borrowed/stolen.

stolen from the caller -> decremented
stolen from the callee -> incremented
borrowed               -> [no modifier]

We don't have a modifier keyword which is analogous to borrowed.  The user
is expected to understand object lifespan issues for borrowed references
without explicit guidance.

With regard to what actually happens to the reference count, I would argue
that "incremented" and "decremented" are accurate descriptions.

  * When a function returns an incremented object, that function has added
a refcount to it.
  * When a function accepts a decremented object as an argument, it will
consume a refcount from it -- either right away, or at some point in the
future.

In my view, it is not desirable to label arguments or return values as
borrowed; it is only necessary to advise the user when they must take action
to account for a refcount, gained or lost.
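
The accounting described above can be modelled with a toy Python sketch. This
is only an illustration under stated assumptions: the classes Obj and
Container and the functions new_obj and release are hypothetical, the
reference count is tracked by hand, and nothing here is how CPython or
Clownfish is actually implemented.

```python
class Obj:
    """Toy object with a manually tracked reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0


def new_obj(name):
    # "incremented" return value: the caller now owns one refcount.
    obj = Obj(name)
    obj.refcount += 1
    return obj


def release(obj):
    # Whoever owns a refcount gives it up here.
    obj.refcount -= 1


class Container:
    def __init__(self):
        self.items = []

    def set_item(self, obj):
        # "decremented" argument: the container takes over the caller's
        # refcount, so the caller must not release it afterwards.
        self.items.append(obj)

    def get_item(self, index):
        # No modifier (borrowed): the caller gains no refcount of its own.
        return self.items[index]

    def destroy(self):
        # The consumed refcounts are released later, not at set_item() time.
        for obj in self.items:
            release(obj)
        self.items.clear()


item = new_obj("item")              # caller owns one refcount
box = Container()
box.set_item(item)                  # ownership moves; the count itself is unchanged
count_after_set = item.refcount
peek = box.get_item(0)              # borrowed reference
box.destroy()                       # the container consumes the refcount
count_after_destroy = item.refcount
```

Note that the count does not change at the moment set_item() is called; only
the responsibility for releasing it moves, which is the distinction raised
earlier in the thread.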

Cheers,

Marvin Humphrey



Re: [Python-Dev] Commit changelog: issue number and merges

2011-05-09 Thread Stephen J. Turnbull
R. David Murray writes:
  On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson benja...@python.org 
  wrote:

   *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
  
  I'm sorry, but I've looked at the output of that and the mental overhead
  has so far proven too high for it to be of any use to me.

How about the hgk extension, and hg view?

http://mercurial.selenic.com/wiki/HgkExtension

  But as I think about this, frankly I'd rather see atomic commits, even
  on merges.  That was something I disliked about svnmerge, the fact that
  often an svnmerge commit involved many changesets from the other branch.
  That was especially painful in exactly the same situation:  trying to
  backtrack a change starting from 'svn blame'.

I don't understand the issue.  In my experience, hg annotate will
point to the commit on the branch, not to the merge, unless there was
a conflict, in which case the merge is the right place (although not
necessarily the most useful place) to point.



Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Michael Urman
On Mon, May 9, 2011 at 20:08, Neil Hodgson nyamaton...@gmail.com wrote:
   Yes, Windows will use UTF-16 as it does for almost everything. From
 a user's point of view, these should both just be seen as Unicode.

I'm not convinced this is correct for this case. GetProcAddress takes
an ANSI string, meaning while it could theoretically use UTF-8, in
practice I doubt it uses anything outside of ASCII safely. So while
the name of the library would be encoded in UTF-16, the name of the
function loaded from the library would not be.
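
A small Python sketch of why a UTF-16 name cannot simply be passed through a
narrow, NUL-terminated string interface. The helper as_c_string and the symbol
name PyInit_spam are hypothetical; the helper only models how a char* API
would read a byte buffer and does not call any Windows function.

```python
def as_c_string(raw):
    """Model what a NUL-terminated char* API sees in a byte buffer."""
    return raw.split(b"\x00", 1)[0]


symbol = "PyInit_spam"  # hypothetical exported function name

narrow = symbol.encode("ascii")
wide = symbol.encode("utf-16-le")

seen_narrow = as_c_string(narrow)
seen_wide = as_c_string(wide)  # stops at the first zero byte
print(seen_narrow, seen_wide)
```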

http://msdn.microsoft.com/en-us/library/ms683212(v=vs.85).aspx

-- 
Michael Urman


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Neil Hodgson
Michael Urman:

 I'm not convinced this is correct for this case. GetProcAddress takes
 an ANSI string, meaning while it could theoretically use UTF-8, in
 practice I doubt it uses anything outside of ASCII safely. So while
 the name of the library would be encoded in UTF-16, the name of the
 function loaded from the library would not be.

   Yes you are right:
http://scintilla.org/NarrowName.png

   Neil


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Michael Urman
On Mon, May 9, 2011 at 23:09, Neil Hodgson nyamaton...@gmail.com wrote:
 Michael Urman:

 I'm not convinced this is correct for this case. GetProcAddress takes
 an ANSI string, meaning while it could theoretically use UTF-8, in
 practice I doubt it uses anything outside of ASCII safely. So while
 the name of the library would be encoded in UTF-16, the name of the
 function loaded from the library would not be.

   Yes you are right:
 http://scintilla.org/NarrowName.png

   Neil


That screenshot seems to show UTF-8 is being used. This may just be
the literal bytes in the .c file, but could it be something more
dependable?

http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=6728


Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII

2011-05-09 Thread Neil Hodgson
Michael Urman:

 That screenshot seems to show UTF-8 is being used. This may just be
 the literal bytes in the .c file, but could it be something more
 dependable?

   The file is in UTF-8 so the compiler may just be copying the bytes.
There is a setlocale pragma but that seems to be just for string
literals.

   Neil


Re: [Python-Dev] more timely detection of unbound locals

2011-05-09 Thread Eli Bendersky
On Mon, May 9, 2011 at 18:44, Isaac Morland ijmor...@uwaterloo.ca wrote:

 On Mon, 9 May 2011, Eli Bendersky wrote:

  x = 5
  def foo ():
    print (x)
    if bar ():
      x = 1
    print (x)


 I wish you'd annotate this code sample, what do you intend it to
 demonstrate?

 It probably shows the original complaint even more strongly. As for being a
 problem with the suggested solution, I suppose you're right, although it
 doesn't make it much different. Still, before a *possible* assignment to
 'x', it should be loaded as LOAD_NAME since it was surely not bound as
 local, yet.


 Extrapolating from your suggestion, you're saying before a *possible*
 assignment it will be treated as global, and after a *possible* assignment
 it will be treated as local?

 But surely:

 print (x)
 if False:
     x = 1
 print (x)

 [snip]

Alright, I now understand the problems with the suggestion. Indeed,
conditional assignments that are only really resolved at runtime are the big
stumbling block here.
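
A short sketch of that stumbling block, assuming CPython: whether a name is
local is fixed when the function is compiled, while whether it is bound is
only known when the code runs. The function foo and its assign parameter are
made up for this illustration.

```python
x = 5  # module-level binding


def foo(assign):
    if assign:
        x = 1  # any assignment in the body makes 'x' local to foo
    return x


# The decision is visible before foo is ever called.
is_local = "x" in foo.__code__.co_varnames


def call(assign):
    try:
        return foo(assign)
    except UnboundLocalError:
        return "UnboundLocalError"


results = (call(True), call(False))
print(is_local, results)
```

The global x is never consulted inside foo, even on the path where the local
assignment did not run.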

However, maybe the error message/reporting can still be improved?

ISTM the UnboundLocalError exception gets raised only in those weird and
confusing cases. After all, why would Python decide an access to some name
is to a local? Only if it found an assignment to that local in the scope.
But that assignment clearly didn't happen yet, so the error is thrown. So
cases like these:

x = 2

def foo1():
  x += 1

def foo2():
  print(x)
  x = 10

def foo3():
  if something_that_didnot_happen:
    x = 10
  print(x)

All belong to the category.
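
For reference, a runnable version of the three cases, written as a sketch:
the module-level flag stands in for the condition that did not happen, and
the outcomes dictionary is only there to record what each call does. The
exact wording of the error message differs between Python versions, so only
the exception type is checked.

```python
x = 2
something_that_did_not_happen = False  # stand-in for the runtime condition


def foo1():
    x += 1  # augmented assignment reads 'x' before binding it


def foo2():
    print(x)  # the read comes before the assignment below
    x = 10


def foo3():
    if something_that_did_not_happen:
        x = 10
    print(x)


outcomes = {}
for func in (foo1, foo2, foo3):
    try:
        func()
        outcomes[func.__name__] = "ok"
    except UnboundLocalError:
        outcomes[func.__name__] = "UnboundLocalError"

print(outcomes)
```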

With an unlimited error message length it could make sense to say "Hey, I
see 'x' may be assigned in this scope, so I mark it local. But this access
to 'x' happens before assignment - so ERROR". This isn't realistic, of
course, so I'm wondering:

1. Does this error message (although unrealistic) capture all possible
appearances of UnboundLocalError?
2. If the answer to (1) is yes - could it be usefully shortened to be
clearer than the current "local variable referenced before assignment"?

This may not be possible, of course, but it doesn't hurt to try :-)
Eli