Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Martin v. Löwis
 He keeps leaving them out, I occasionally tell him they should always
 be included (most recently this came up when we gave conflicting
 advice to a patch contributor). He says what he's doing is OK, because
 he doesn't consider the example in PEP 7 as explicitly disallowing it,
 I think it's a recipe for future maintenance hassles when someone adds
 a second statement to one of the clauses but doesn't add the braces.
 (The only time I consider it reasonable to leave out the braces is for
 one liner if statements, where there's no else clause at all)

While this appears to be settled, I'd like to add that I sided with
Benjamin here all along.

With Python, I accepted a style of minimal punctuation. Examples
of extra punctuation are:
- parens around expression in Python's if (and while):

if (x  10):
  foo ()

- parens around return expression (C and Python)

return(*p);

- braces around single-statement blocks in C

In all these cases, punctuation can be left out without changing
the meaning of the program.

I personally think that a policy requiring braces would be (mildly)
harmful, as it decreases readability of the code. When I read code,
I read every character: not just the identifiers, but also every
punctuation character. If there is extra punctuation, I stop and wonder
what the motivation for the punctuation is - is there any hidden
meaning that required the author to put the punctuation?

There is a single case where I can accept extra punctuation in C:
to make the operator precedence explicit. Many people (including
myself) don't know how

   a | b  *c * *d

would group, so I readily accept extra parens as a clarification.
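
As a hedged illustration of the precedence point: the expression below is an assumed stand-in, not necessarily the one from this message, and both helper names are hypothetical. In C, `*` binds tighter than `<<`, which binds tighter than `|`, so the two functions compute the same value; the second one only spells the grouping out.

```c
/* Hypothetical helpers; the expression is illustrative only. */

/* Relies on C precedence: '*' before '<<' before '|'. */
unsigned int grouped_implicit(unsigned int a, unsigned int b,
                              unsigned int c, unsigned int d)
{
    return a | b << c * d;
}

/* Same computation with the grouping made explicit by extra parens. */
unsigned int grouped_explicit(unsigned int a, unsigned int b,
                              unsigned int c, unsigned int d)
{
    return a | (b << (c * d));
}
```

A reader who guessed the grouping `((a | b) << c) * d` would get a different result for most inputs, which is the case for the clarifying parens.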

Wrt. braces, I don't share the concern that there is a risk of
somebody being confused when adding a second statement to a braceless
block. An actual risk is stuff like

   if (cond)
 MACRO(argument);

when MACRO expands to multiple statements. However, we should
accept that this is a bug in MACRO (which should have used the
do-while(0)-idiom), not in the application of the macro.
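
A minimal sketch of the macro hazard described above, assuming two hypothetical macros (ADD_BAD, ADD_GOOD) and two counters invented purely for this illustration:

```c
/* Hypothetical counters used only for this demonstration. */
static int calls = 0;
static int total = 0;

/* Buggy: expands to two statements, so a braceless `if` guards only the first. */
#define ADD_BAD(x)  calls++; total += (x)

/* Fixed: the do-while(0) idiom makes the expansion a single statement. */
#define ADD_GOOD(x) do { calls++; total += (x); } while (0)

int run_bad(int cond)
{
    calls = 0;
    total = 0;
    if (cond)
        ADD_BAD(5);   /* expands to: calls++; total += (5); */
    return total;
}

int run_good(int cond)
{
    calls = 0;
    total = 0;
    if (cond)
        ADD_GOOD(5);  /* the whole expansion is guarded by the if */
    return total;
}
```

With cond equal to 0, run_bad() still returns 5 because only the first expanded statement is guarded, while run_good() returns 0.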

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] RNG in the core

2012-01-03 Thread Christian Heimes
Hello,

all proposed fixes for a randomized hashing function rise and fall with
a good random number generator to feed the random seed. The seed must be
created very early in the startup phase of the interpreter, preferably
before the basic types are initialized. CPython already has multiple
sources for random data (win32_urandom in Modules/posixmodule.c, urandom
in Lib/os.py, Mersenne twister in Modules/_randommodule.c). However, we
can't use them because they are wrapped inside Python modules which
require infrastructure like initialized base types.

I propose an addition to the current Python C API:

int PyOS_URandom(char *buf, Py_ssize_t len)

Read len chars from the OS's RNG into the pre-allocated buffer buf.
The RNG should be suitable for cryptography. In case of an error the
function returns -1 and sets an exception, otherwise it returns 0.
On Windows I can re-use most of the code of win32_urandom(). For POSIX I
have to implement os.urandom() in C in order to read data from
/dev/urandom. That's simple and straightforward.
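
A rough sketch of what the POSIX side could look like, under stated assumptions: read_urandom() is a hypothetical name, not the proposed PyOS_URandom(), and unlike the proposed API it only returns -1 on failure and sets no Python exception.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with len bytes from /dev/urandom.
   Returns 0 on success, -1 on failure (no Python exception is set). */
int read_urandom(unsigned char *buf, size_t len)
{
    int fd;
    size_t done = 0;

    if (len == 0)
        return 0;                 /* nothing to read */
    fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1;                /* device missing or not readable */
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n < 0 && errno == EINTR)
            continue;             /* interrupted by a signal: retry */
        if (n <= 0) {             /* real error or unexpected EOF */
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    close(fd);
    return 0;
}
```

The loop is there because read() may return fewer bytes than requested; a caller that gets -1 back would then move on to whatever fallback is chosen.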


Since some platforms may not have /dev/urandom, we need a PRNG in the
core, too. I therefore propose to move the Mersenne twister from
randommodule.c into the core, too.

typedef struct {
    unsigned long state[N];
    int index;
} _Py_MT_RandomState;

// genrand_int32()
unsigned long _Py_MT_GenRand_Int32(_Py_MT_RandomState *state);
// random_random()
double _Py_MT_GenRand_Res53(_Py_MT_RandomState *state);
// init_genrand()
void _Py_MT_GenRand_Init(_Py_MT_RandomState *state, unsigned long seed);
// init_by_array()
void _Py_MT_GenRand_InitArray(_Py_MT_RandomState *state,
                              unsigned long init_key[],
                              unsigned long key_length);


I suggest Python/random.c as source file and Python/pyrandom.h as header
file. Comments?

Christian


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Matt Joiner
FWIW I'm against forcing braces to be used. Readability is the highest
concern, and this should be at the discretion of the contributor. A
code formatting tool or compiler extension is the only proper way to
handle this, and neither is in use or available.

On Tue, Jan 3, 2012 at 7:44 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 He keeps leaving them out, I occasionally tell him they should always
 be included (most recently this came up when we gave conflicting
 advice to a patch contributor). He says what he's doing is OK, because
 he doesn't consider the example in PEP 7 as explicitly disallowing it,
 I think it's a recipe for future maintenance hassles when someone adds
 a second statement to one of the clauses but doesn't add the braces.
 (The only time I consider it reasonable to leave out the braces is for
 one liner if statements, where there's no else clause at all)

 While this appears to be settled, I'd like to add that I sided with
 Benjamin here all along.

 With Python, I accepted a style of minimal punctuation. Examples
 of extra punctuation are:
 - parens around expression in Python's if (and while):

    if (x  10):
      foo ()

 - parens around return expression (C and Python)

    return(*p);

 - braces around single-statement blocks in C

 In all these cases, punctuation can be left out without changing
 the meaning of the program.

 I personally think that a policy requiring braces would be (mildly)
 harmful, as it decreases readability of the code. When I read code,
 I read every character: not just the identifiers, but also every
 punctuation character. If there is extra punctuation, I stop and wonder
 what the motivation for the punctuation is - is there any hidden
 meaning that required the author to put the punctuation?

 There is a single case where I can accept extra punctuation in C:
 to make the operator precedence explicit. Many people (including
 myself) don't know how

   a | b  *c * *d

 would group, so I readily accept extra parens as a clarification.

 Wrt. braces, I don't share the concern that there is a risk of
 somebody being confused when adding a second statement to a braceless
 block. An actual risk is stuff like

   if (cond)
     MACRO(argument);

 when MACRO expands to multiple statements. However, we should
 accept that this is a bug in MACRO (which should have used the
 do-while(0)-idiom), not in the application of the macro.

 Regards,
 Martin



-- 
ಠ_ಠ


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Stephen J. Turnbull
Matt Joiner writes:

  Readability is the highest concern, and this should be at the
  discretion of the contributor.

That's quite backwards.  Readability is community property, and has
as much, if not more, to do with common convention as with some
absolute metric.  The contributor's discretion must yield.

That doesn't mean the contributor has to do all the work; as several
people have pointed out, it makes a lot of sense for experienced
reviewers to make such trivial changes themselves before committing,
especially for new contributors.



Re: [Python-Dev] RNG in the core

2012-01-03 Thread Matthieu Brucher
Hi,

I'm not a core Python developer, but it may be interesting to use a real
Crush-resistant RNG, such as one from Random123 (a parallel random generator
that is Crush-resistant, unlike the Mersenne Twister, and without a
state).

Cheers,

Matthieu Brucher

2012/1/3 Christian Heimes li...@cheimes.de

 Hello,

 all proposed fixes for a randomized hashing function rise and fall with
 a good random number generator to feed the random seed. The seed must be
 created very early in the startup phase of the interpreter, preferably
 before the basic types are initialized. CPython already has multiple
 sources for random data (win32_urandom in Modules/posixmodule.c, urandom
 in Lib/os.py, Mersenne twister in Modules/_randommodule.c). However, we
 can't use them because they are wrapped inside Python modules which
 require infrastructure like initialized base types.

 I propose an addition to the current Python C API:

 int PyOS_URandom(char *buf, Py_ssize_t len)

 Read len chars from the OS's RNG into the pre-allocated buffer buf.
 The RNG should be suitable for cryptography. In case of an error the
 function returns -1 and sets an exception, otherwise it returns 0.
 On Windows I can re-use most of the code of win32_urandom(). For POSIX I
 have to implement os.urandom() in C in order to read data from
 /dev/urandom. That's simple and straightforward.


 Since some platforms may not have /dev/urandom, we need a PRNG in the
 core, too. I therefore propose to move the Mersenne twister from
 randommodule.c into the core, too.

 typedef struct {
unsigned long state[N];
int index;
 } _Py_MT_RandomState;

 unsigned long _Py_MT_GenRand_Int32(_Py_MT_RandomState *state); //
 genrand_int32()
 double _Py_MT_GenRand_Res53(_Py_MT_RandomState *state); // random_random()
 void _Py_MT_GenRand_Init(_Py_MT_RandomState *state, unsigned long seed);
 // init_genrand()
 void _Py_MT_GenRand_InitArray(_Py_MT_RandomState *state, unsigned long
 init_key[], unsigned long key_length); // init_by_array


 I suggest Python/random.c as source file and Python/pyrandom.h as header
 file. Comments?

 Christian




-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Antoine Pitrou
On Tue, 03 Jan 2012 14:18:34 +0100
Christian Heimes li...@cheimes.de wrote:
 
 I suggest Python/random.c as source file and Python/pyrandom.h as header
 file. Comments?

Looks good in principle. The API names for MT are a bit ugly.

 The RNG should be suitable for cryptography.

Sounds like too strong a requirement. For cryptography, we have the ssl
module (and third-party libraries).
(also, "suitable for cryptography" is somewhat vague; for example, the
Linux man pages insist that /dev/urandom is ok for session keys
but /dev/random is needed for long-lived private keys)

Regards

Antoine.




Re: [Python-Dev] RNG in the core

2012-01-03 Thread Christian Heimes
Am 03.01.2012 18:23, schrieb Matthieu Brucher:
 Hi,
 
 I'm not a core Python developer, but it may be interesting to use a real
 Crush-resistant RNG, such as one from Random123 (a parallel random generator
 that is Crush-resistant, unlike the Mersenne Twister, and without a
 state).

Hello Matthieu,

thanks for your input!

The core RNG is going to be part of the randomized hashing function
patch. The patch will be applied to all Python versions from 2.6 to 3.3.
Some people may want to apply it to 2.4 and 2.5, too. As the patch is
going to affect six to eight Python versions, it should introduce as
little new code as possible. Any new code might be a source of new bugs.
The Mersenne Twister code is mature and works sufficiently as a backup.

Any new RNG should go through a PEP process, too. You are welcome to
write a PEP and implement an additional RNG for the random module. New
developers and new ideas are well received.

Regards,
Christian


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Ethan Furman

Stephen J. Turnbull wrote:

Matt Joiner writes:

  Readability is the highest concern, and this should be at the
  discretion of the contributor.

That's quite backwards.  Readability is community property, and has
as much, if not more, to do with common convention as with some
absolute metric.  The contributor's discretion must yield.


Readability also includes more than just the source code; as has already 
been stated:


 if(cond) {
   stmt1;
+  stmt2;
 }

vs.

-if(cond)
+if(cond) {
   stmt1;
+  stmt2;
+}

I find the diff version that already had braces in place much more readable.

~Ethan~


Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Barry Warsaw
On Dec 31, 2011, at 04:56 PM, Guido van Rossum wrote:

Is there a tracker issue yet? The discussion should probably move there.

I think the answer to this was no... until now.

http://bugs.python.org/issue13703

Proposed patches should be linked to this issue now.  Please nosy yourself if
you want to follow the progress.

Cheers,
-Barry




Re: [Python-Dev] That depends on what the meaning of is is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)

2012-01-03 Thread Jim Jewett
On Mon, Jan 2, 2012 at 7:16 PM, PJ Eby p...@telecommunity.com wrote:
 On Mon, Jan 2, 2012 at 4:07 PM, Jim Jewett jimjjew...@gmail.com wrote:

 But the public header file 
 http://hg.python.org/cpython/file/3ed5a6030c9b/Include/dictobject.h 
 defines the typedef structs for PyDictEntry and _dictobject.

 What is the purpose of requiring a real dict without also
 promising what the header file promises?

 Er, just because it's in the .h doesn't mean it's in the public API.  But in
 any event, if you're actually serious about this, I'd just point out that:

 1. The struct layout doesn't guarantee anything about insertion or lookup
 algorithms,

My concern was about your suggestion of changing the data structure to
accommodate some other algorithm -- particularly if it meant that the
data would no longer be stored entirely in an array of PyDictEntry.

That shouldn't be done lightly even between major versions, and
certainly should not be done in a bugfix (or security-only) release.

 Are you seriously writing code that relies on the C structure layout of
 dicts?

The first page of search results for PyDictEntry suggested that others
are.  (The code I found did seem to be for getting data from a python
dict into some other language, rather than for wsgi.)

  Because really, that was SO not the point of the dict type
 requirement.  It was so that you could use Python's low-level *API* calls,
 not muck about with the data structure directly.

Would it be too late to clarify that in the PEP itself?

-jJ


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Steven D'Aprano

Christian Heimes wrote:
[...]

I propose an addition to the current Python C API:

int PyOS_URandom(char *buf, Py_ssize_t len)

Read len chars from the OS's RNG into the pre-allocated buffer buf.
The RNG should be suitable for cryptography.



Since some platforms may not have /dev/urandom, we need a PRNG in the
core, too. I therefore propose to move the Mersenne twister from
randommodule.c into the core, too.


Mersenne twister is not suitable for cryptography.

http://en.wikipedia.org/wiki/Mersenne_twister



--
Steven


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Matthieu Brucher
 The core RNG is going to be part of the randomized hashing function
 patch. The patch will be applied to all Python versions from 2.6 to 3.3.
 Some people may want to apply it to 2.4 and 2.5, too. As the patch is
 going to affect six to eight Python versions, it should introduce as
 little new code as possible. Any new code might be a source of new bugs.
 The Mersenne Twister code is mature and works sufficiently as a backup.

 Any new RNG should go through a PEP process, too. You are welcome to
 write a PEP and implement an additional RNG for the random module. New
 developers and new ideas are well received.


Good point.
In fact, these RNGs are 100% based on the hash functions provided, for
instance, by OpenSSL. But I think this library is not a dependency, so my
proposal still has the same impact.
The Random123 library is a reimplementation of some cryptographic functions
with two arguments, the key and the counter, and that's it. So if such a
cryptographic function exists somewhere in the Python C code, it can be
reused to create Crush-resistant random numbers with no new lines of code.

Cheers,

Matthieu
-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Victor Stinner
A randomized hash doesn't need a cryptographic RNG (which is slow and
needs a lot of new code), and the new hash function should maybe not be
cryptographic. We need to make the DoS more expensive for the
attacker, but we don't need to add too much security for that.

Mersenne Twister is useless here: it is only needed when you need a
fast RNG to generate megabytes of random data, whereas we
will not need more than 4 KB. The OS RNG is just fine (fast enough and
not blocking).

So we can use the Windows CryptGenRandom API (which is already implemented
in Python, win32_urandom) and /dev/urandom on UNIX/BSD. /dev/urandom
never blocks. We also need a fallback if /dev/urandom is not available.
Because this case should not occur on a modern OS, the fallback can be a
weak function like something combining getpid(), gettimeofday(),
the address of the stack, etc. To generate 4 KB from a few words, we can
use a very simple LCG (x(n+1) = (x(n) * a + c) mod k).
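
The fallback described above could be sketched roughly as follows. This is an illustration under assumptions, not a proposed patch: the function names are hypothetical, the modulus k is taken to be 2^32, and the constants a = 1664525 and c = 1013904223 are the well-known Numerical Recipes pair, picked here only as an example.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <unistd.h>

/* One LCG step: x(n+1) = (x(n) * a + c) mod 2^32.
   The result is reduced mod 2^32 by the conversion to uint32_t. */
uint32_t lcg_next(uint32_t x)
{
    return x * 1664525u + 1013904223u;
}

/* Weak, non-cryptographic seed from the pid, the time of day and a
   stack address. Hypothetical helper for platforms without an OS RNG. */
uint32_t weak_seed(void)
{
    struct timeval tv;
    int marker = 0;  /* only its address is used */
    uint32_t seed = (uint32_t)getpid();

    if (gettimeofday(&tv, NULL) == 0)
        seed ^= (uint32_t)tv.tv_sec ^ ((uint32_t)tv.tv_usec << 12);
    seed ^= (uint32_t)(uintptr_t)&marker;
    return seed;
}

/* Expand a seed into len bytes. The top byte of each state is used
   because the low bits of a power-of-two LCG have short periods. */
void lcg_fill(uint32_t seed, unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        seed = lcg_next(seed);
        buf[i] = (unsigned char)(seed >> 24);
    }
}
```

A caller would do something like lcg_fill(weak_seed(), buf, 4096). Anyone who can guess the seed inputs can reproduce the output, which is why this only makes sense as a last resort.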


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Antoine Pitrou
On Tue, 3 Jan 2012 22:17:06 +0100
Victor Stinner victor.stin...@gmail.com wrote:
 A randomized hash doesn't need a cryptographic RNG (which is slow and
 needs a lot of new code), and the new hash function should maybe not be
 cryptographic. We need to make the DoS more expensive for the
 attacker, but we don't need to add too much security for that.

Agreed.

 Mersenne Twister is useless here: it is only needed when you need a
 fast RNG to generate megabytes of random data, whereas we
 will not need more than 4 KB. The OS RNG is just fine (fast enough and
 not blocking).

Have you read the following sentence:

“Since some platforms may not have /dev/urandom, we need a PRNG in the
core, too. I therefore propose to move the Mersenne twister from
randommodule.c into the core, too.”

Regards

Antoine.




Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Bill Janssen
Christian Heimes li...@cheimes.de wrote:

 Am 29.12.2011 12:13, schrieb Mark Shannon:
  The attack relies on being able to predict the hash value for a given
  string. Randomising the string hash function is quite straightforward.
  There is no need to change the dictionary code.
  
  A possible (*untested*) patch is attached. I'll leave it for those more 
  familiar with unicodeobject.c to do properly.
 
 I'm worried that hash randomization of str is going to break 3rd party
 software that rely on a stable hash across multiple Python instances.
 Persistence layers like ZODB and cross interpreter communication
 channels used by multiprocessing may (!) rely on the fact that the hash
 of a string is fixed.

Software that depends on an undefined hash function for synchronization
and persistence deserves to break, IMO.  There are plenty of
well-defined hash functions available for this purpose.
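
To make the point concrete, here is a small sketch of a seeded string hash. It is an FNV-1a-style function chosen for illustration, not CPython's string hash and not the attached patch; seeded_hash() is a hypothetical name, and the per-process seed simply replaces the usual fixed starting value.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative seeded FNV-1a-style hash; NOT CPython's string hash.
   The seed takes the place of the usual fixed offset basis. */
uint32_t seeded_hash(uint32_t seed, const char *s, size_t len)
{
    uint32_t h = seed;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;  /* 32-bit FNV prime */
    }
    return h;
}
```

The same string hashes to different values under different seeds, so any value stored or sent by one process is meaningless to another process that started with a different seed. Code that needs a stable digest has to use a function whose output is defined independently of the interpreter.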

Bill


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Martin v. Löwis
 Have you read the following sentence:
 
 “Since some platforms may not have /dev/urandom, we need a PRNG in the
 core, too. I therefore propose to move the Mersenne twister from
 randommodule.c into the core, too.”

I disagree. We don't need a PRNG on platforms without /dev/urandom or
any other native RNG.
Initializing the string-hash seed to 0 is perfectly fine on those
platforms; we can do slightly better by using, say, the current
time (in ms or µs if available) and the current pid (if available).

People concerned with the security on those systems either need to
switch to a different system, or provide a patch to access the
platform's native random number generator.

Regards,
Martin


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Ben Finney
Stephen J. Turnbull step...@xemacs.org writes:

 Matt Joiner writes:

   Readability is the highest concern, and this should be at the
   discretion of the contributor.

 That's quite backwards.  Readability is community property, and has
 as much, if not more, to do with common convention as with some
 absolute metric.  The contributor's discretion must yield.

+1

-- 
 \  “Those who write software only for pay should go hurt some |
  `\ other field.” —Erik Naggum, in _gnu.misc.discuss_ |
_o__)  |
Ben Finney



Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Martin v. Löwis
 Readability also includes more than just the source code; as has already
 been stated:
 
  if(cond) {
    stmt1;
 +  stmt2;
  }
 
 vs.
 
 -if(cond)
 +if(cond) {
    stmt1;
 +  stmt2;
 +}
 
 I find the diff version that already had braces in place much more
 readable.

Is it really *much* more readable? I have no difficulties reading either
(although I would have preferred a space after the if; this worries me
more than the double if line).

Regards,
Martin


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Benjamin Peterson
Ethan Furman ethan at stoneleaf.us writes:
 
 Readability also includes more than just the source code; as has already 
 been stated:
 
  if(cond) {
    stmt1;
 +  stmt2;
  }
 
 vs.
 
 -if(cond)
 +if(cond) {
    stmt1;
 +  stmt2;
 +}
 
 I find the diff version that already had braces in place much more readable.

There are much larger problems facing diff readability. On your basis, we might
as well decree that code should never be rearranged or reindented.

Regards,
Benjamin






[Python-Dev] Proposed PEP on concurrent programming support

2012-01-03 Thread Mike Meyer
PEP: XXX
Title: Interpreter support for concurrent programming
Version: $Revision$
Last-Modified: $Date$
Author: Mike Meyer m...@mired.org
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 11-Nov-2011
Post-History: 


Abstract
========

The purpose of this PEP is to explore strategies for making concurrent
programming in Python easier by allowing the interpreter to detect and
notify the user about possible bugs in concurrent access. The reason
for doing so is that "Errors should never pass silently."

Such bugs are caused by allowing objects to be accessed simultaneously
from another thread of execution while they are being modified.
Currently, Python systems provide no support for detecting such bugs,
falling back on the underlying platform facilities and some tools built
on top of those.  While these tools allow prevention of such modification
if the programmer is aware of the need for them, there are no facilities
to detect that such a need might exist and warn the programmer of it.

The goal is not to prevent such bugs, as that depends on the
programmer getting the logic of the interactions correct, which the
interpreter can't judge.  Nor is the goal to warn the programmer about
any such modifications - the goal is to catch standard idioms making
unsafe modifications.  If the programmer starts tinkering with
Python's internals, it's assumed they are aware of these issues.


Rationale
=========

Concurrency bugs are among the hardest bugs to locate and fix.  They
result in corrupt data being generated or used in a computation.  Like
most such bugs, the corruption may not become evident until much later
and far away in the program.  Minor changes in the code can cause the
bugs to fail to manifest.  They may even fail to manifest from run to
run, depending on external factors beyond the control of the
programmer.

Therefore any help in locating and dealing with such bugs is valuable.
If the interpreter is to provide such help, it must be aware of when
things are safe to modify and when they are not. This means it will
almost certainly cause incompatible changes in Python, and may impose
costs so high for non-concurrent operations as to make it untenable.
As such, the final options discussed are destined for Python version 4
or later, and may never be implemented in any mainstream
implementation of Python.

Terminology
===========

The word "thread" is used throughout to mean "concurrent thread of
execution".  Nominally, this means a platform thread.  However, it is
intended to include any threading mechanism that allows the
interpreter to change threads between or in the middle of a statement
without the programmer specifically allowing this to happen.

Similarly, the word "interpreter" means any system that processes and
executes Python language files.  While this normally means CPython,
the changes discussed here should be amenable to other
implementations.


Concept
=======

Locking object
--------------

The idea is that the interpreter should indicate an error anytime an
unlocked object is mutated.  For mutable types, this would mean
changing the value of the type. For Python class instances, this would
mean changing the binding of an attribute.  Mutating an object bound
to such an attribute isn't a change in the object the attribute
belongs to, and so wouldn't indicate an error unless the object bound
to the attribute was unlocked.

Locking by name
---------------

It's also been suggested that locking names would be useful.  That
is, to prevent a specific attribute of an object from being rebound,
or a key/index entry in a mapping object. This provides a finer
grained locking than just locking the object, as you could lock a
specific attribute or set of attributes of an object, without locking
all of them.

Unfortunately, this isn't sufficient: a set may need to be locked to
prevent deletions for some period, or a dictionary to prevent adding a
key, or a list to prevent changing a slice, etc.

So some other locking mechanism is required.  If that needs to specify
objects, some way of distinguishing between locking a name and locking
the object bound to the name needs to be invented, or there needs to
be two different locking mechanisms.  It's not clear that the finer
grained locking is worth adding yet another language mechanism.


Alternatives
============


Explicit locking
----------------

These alternatives require that the programmer explicitly name
anything that is going to be changed to lock it before changing it.
This lets the interpreter get involved, but makes a number of errors
possible based on the order in which locks are applied.

Platform locks
''''''''''''''

The current tool set uses platform locks via a C extension.  The
problem with these is that the interpreter has no knowledge of them,
and so can't do anything about detecting the mutation of unlocked
objects.


A ``locking`` keyword
'''''''''''''''''''''

Adding a statement to tell the interpreter to lock objects for the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Terry Reedy

On 1/3/2012 5:02 PM, Bill Janssen wrote:


Software that depends on an undefined hash function for synchronization
and persistence deserves to break, IMO.  There are plenty of
well-defined hash functions available for this purpose.


The doc for id() now says "This is an integer which is guaranteed to be
unique and constant for this object during its lifetime." Since the
default 3.2.2 hash for my win7 64bit CPython is id-address // 16, it can
carry no longer-lasting guarantee. I suggest that the hash() doc say
something similar: http://bugs.python.org/issue13707


--
Terry Jan Reedy



Re: [Python-Dev] cpython: Add a new PyUnicode_Fill() function

2012-01-03 Thread Antoine Pitrou

 +.. c:function:: int PyUnicode_Fill(PyObject *unicode, Py_ssize_t start, \
 +Py_ssize_t length, Py_UCS4 fill_char)
 +
 +   Fill a string with a character: write *fill_char* into
 +   ``unicode[start:start+length]``.
 +
 +   Fail if *fill_char* is bigger than the string maximum character, or if the
 +   string has more than 1 reference.
 +
 +   Return the number of written character, or return ``-1`` and raise an
 +   exception on error.

The return type should then be Py_ssize_t, not int.
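
A small sketch of why the narrower type matters, under stated assumptions: ssize_like_t is a hypothetical stand-in for Py_ssize_t (so the snippet compiles without Python.h), and count_fits_in_int() is an invented helper, not part of any API.

```c
#include <limits.h>

/* Hypothetical stand-in for Py_ssize_t so the sketch needs no Python.h. */
typedef long long ssize_like_t;

/* Returns 1 if a character count can be returned through an int
   without loss, 0 otherwise. */
int count_fits_in_int(ssize_like_t count)
{
    return count >= INT_MIN && count <= INT_MAX;
}
```

On platforms where lengths are 64-bit and int is 32-bit, a fill of more than INT_MAX characters could not report its count through an int return value.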

Regards

Antoine.




Re: [Python-Dev] RNG in the core

2012-01-03 Thread Nick Coghlan
On Wed, Jan 4, 2012 at 8:21 AM, Martin v. Löwis mar...@v.loewis.de wrote:
 Have you read the following sentence:

 “Since some platforms may not have /dev/urandom, we need a PRNG in the
 core, too. I therefore propose to move the Mersenne twister from
 randommodule.c into the core, too.”

 I disagree. We don't need a PRNG on platforms without /dev/urandom or
 any other native RNG.
 Initializing the string-hash seed to 0 is perfectly fine on those
 platforms; we can do slightly better by using, say, the current
 time (in ms or µs if available) and the current pid (if available).

 People concerned with the security on those systems either need to
 switch to a different system, or provide a patch to access the
 platform's native random number generator.

+1 (especially given how far back this is going to be ported)
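
The fallback described in the quoted text can be sketched in Python,
although the real change would be in C. Everything named here is an
assumption for illustration: the function name hash_seed, the injectable
urandom parameter, and the way time and pid are mixed (a plain XOR).

```python
import os
import time


def hash_seed(nbytes=8, urandom=os.urandom):
    """Return an integer seed, preferring the platform's native RNG."""
    mask = (1 << (8 * nbytes)) - 1
    try:
        # os.urandom raises NotImplementedError when no randomness
        # source is found on the platform.
        return int.from_bytes(urandom(nbytes), "little") & mask
    except (NotImplementedError, OSError):
        # Assumed fallback: current time in microseconds mixed with the
        # pid. This is "slightly better" than 0, not secure.
        micros = int(time.time() * 1_000_000)
        return (micros ^ (os.getpid() << 32)) & mask
```

The fallback branch needs no PRNG in the core, which is the position
taken above.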

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] RNG in the core

2012-01-03 Thread Antoine Pitrou
On Tue, 03 Jan 2012 23:21:30 +0100
Martin v. Löwis mar...@v.loewis.de wrote:
  Have you read the following sentence:
  
  “Since some platforms may not have /dev/urandom, we need a PRNG in the
  core, too. I therefore propose to move the Mersenne twister from
  randommodule.c into the core, too.”
 
 I disagree. We don't need a PRNG on platforms without /dev/urandom or
 any other native RNG.

Well, what if /dev/urandom is unavailable because the program is run,
e.g., in a chroot?
(Or is /dev/urandom still available in a chroot?)

Regards

Antoine.


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Stephen J. Turnbull
Benjamin Peterson writes:
  Ethan Furman ethan at stoneleaf.us writes:
   
   Readability also includes more than just the source code; as has already 
   been stated:

[diffs elided]

   I find the diff version that already had braces in place much more 
   readable.
  
  There are much larger problems facing diff readability. On your basis, we
  might as well decree that code should never be arranged or reindented.

That's a reasonable approach sometimes used, but it would be hard in
Python.  Specifically, I often produce two patches when substantial
rearrangement is involved.  The first isolates the actual changes, the
second does the reformatting.

In Python, the first patch might be syntactically erroneous, which
would be both annoying for automatic testing and less readable.  A
Python-friendly alternative is to provide both a machine-appliable
diff and a diff ignoring whitespace changes.  This could be a toggle
in web interfaces to the VCS.  I've also sometimes found doing word
diffs to be useful.
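
A rough sketch of the "diff ignoring whitespace changes" idea, using
Python's difflib. The function name changed_lines_ignoring_indent and
the +/- output format are invented for illustration; stripping each
line is one assumed way to ignore reindentation, and it would also hide
whitespace changes inside string literals that start or end a line.

```python
import difflib


def changed_lines_ignoring_indent(old, new):
    """List added/removed lines after stripping leading/trailing whitespace."""
    old_lines = [line.strip() for line in old.splitlines()]
    new_lines = [line.strip() for line in new.splitlines()]
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes.extend("-" + line for line in old_lines[i1:i2])
        if tag in ("insert", "replace"):
            changes.extend("+" + line for line in new_lines[j1:j2])
    return changes


old_src = "x = 1\ny = 2\n"
new_src = "if cond:\n    x = 1\n    y = 2\n"
# Only the new 'if' line is reported; the reindented lines are not.
real_changes = changed_lines_ignoring_indent(old_src, new_src)
```

The machine-appliable diff would still be the ordinary one; this view is
only for the reader.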

Most developers resist such procedures passionately, though.  *shrug*



Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Benjamin Peterson
2012/1/3 Stephen J. Turnbull step...@xemacs.org:
 Benjamin Peterson writes:
   Ethan Furman ethan at stoneleaf.us writes:
   
    Readability also includes more than just the source code; as has already
    been stated:

 [diffs elided]

    I find the diff version that already had braces in place much more
    readable.
  
   There are much larger problems facing diff readability. On your basis, we
   might as well decree that code should never be arranged or reindented.

 That's a reasonable approach sometimes used

My goodness, I was trying to make a ridiculous-sounding proposition.


-- 
Regards,
Benjamin


Re: [Python-Dev] Proposed PEP on concurrent programming support

2012-01-03 Thread PJ Eby
On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer m...@mired.org wrote:

 STM is a relatively new technology being experimented with in newer
 languages, and in a number of 3rd party libraries (both Peak [#Peak]_
 and Kamaelia [#Kamaelia]_ provide STM facilities).


I don't know about Kamaelia, but PEAK's STM (part of the Trellis
event-driven library) is *not* an inter-thread concurrency solution: it's
actually used to sort out the order of events in a co-operative
multitasking scenario.  So, it should not be considered evidence for the
practicality of doing inter-thread co-ordination that way in pure Python.


A suite is marked
 as a `transaction`, and then when an unlocked object is modified,
 instead of indicating an error, a locked copy of it is created to be
 used through the rest of the transaction. If any of the originals are
 modified during the execution of the suite, the suite is rerun from
 the beginning. If it completes, the locked copies are copied back to
 the originals in an atomic manner.


I'm not sure if "locked" is really the right word here.  A private copy
isn't locked because it's not shared.
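
For concreteness, the quoted transaction scheme might be sketched as
below. This is only an illustration under stated assumptions, not the
PEP's design or Trellis's implementation: VersionedBox, run_transaction,
and the version counter are hypothetical names, and a single module-level
lock is assumed purely to make the snapshot and the commit step atomic.
The suite itself runs on a private copy without holding any lock.

```python
import copy
import threading


class VersionedBox:
    """A shared value plus a counter bumped on every committed change."""

    def __init__(self, value):
        self.value = value
        self.version = 0


_commit_lock = threading.Lock()  # guards snapshot and commit only


def run_transaction(box, suite, max_retries=100):
    """Run suite on a private copy; rerun if box changed in the meantime."""
    for _ in range(max_retries):
        with _commit_lock:
            seen = box.version
            private = copy.deepcopy(box.value)  # private, so not shared
        result = suite(private)  # must be safe to run more than once
        with _commit_lock:
            if box.version == seen:  # original untouched: copy back
                box.value = private
                box.version += 1
                return result
        # Original was modified during the suite: start over.
    raise RuntimeError("transaction kept conflicting")


box = VersionedBox([1, 2])
run_transaction(box, lambda items: items.append(3))
```

Note that the private copy is never locked at any point, which is why
"locked copy" reads oddly.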


The disadvantage is that any code in a transaction must be safe to run
 multiple times.  This forbids any kind of I/O.


More precisely, code in a transaction must be *reversible*, so it doesn't
forbid any I/O that can be undone.  If you can seek backward in an input
file, for example, or delete queued output data, then it can still be done.
Even I/O like re-drawing a screen can be made STM safe by making the
redraw occur after a transaction that reads and empties a buffer written by
other transactions.
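
As a small illustration of reversible input, here is a sketch in which a
read is undone by seeking backward before each retry. The function name
read_with_retries and the attempts parameter are hypothetical, and the
retry loop merely stands in for an STM rerunning the suite.

```python
import io


def read_with_retries(stream, nbytes, attempts):
    """Read nbytes, simulating reruns by rewinding before each attempt."""
    start = stream.tell()
    data = b""
    for _ in range(attempts):
        stream.seek(start)  # undo whatever the previous attempt consumed
        data = stream.read(nbytes)
    return data


source = io.BytesIO(b"abcdef")
first_chunk = read_with_retries(source, 3, attempts=4)
# After four attempts the stream has advanced only once.
position = source.tell()
```

The same trick does not work for a pipe or socket, where seek() fails,
so such input would have to go through a buffer instead.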


For
 instance, combining STM with explicit locking would allow explicit
 locking when IO was required,


I don't think this idea makes any sense, since STMs don't really lock,
and to control I/O in an STM system you just STM-ize the queues.
(Generally speaking.)


Re: [Python-Dev] PEP 7 clarification request: braces

2012-01-03 Thread Stephen J. Turnbull
Benjamin Peterson writes:

  My goodness, I was trying to make a ridiculous-sounding proposition.

In this kind of discussion, that's in the same class as "be careful
what you wish for -- because you might just get it."