Re: [Python-Dev] py34 makes it harder to read all of a pty

2014-11-12 Thread Charles-François Natali
2014-11-12 22:16 GMT+00:00 Buck Golemon buck.2...@gmail.com:
 This is due to the fix for issue21090, which aimed to un-silence errors
 which previously went unheard. The fix is for me, as a user, to write a loop
 that uses os.read and interprets EIO as EOF. This is what I had hoped
 file.read() would do for me, however, and what it used to do in previous
 pythons.


There's no reason for read() to interpret EIO as EOF in the general
case: it was masked in previous versions because of a mere bug. The
behavior is now correct, although being able to retrieve the data read
so far in case of a buffered read could be useful.
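For reference, the user-side loop described above could be sketched like this (a hypothetical helper, not stdlib code):

```python
import errno
import os

def read_all_from_pty(fd, bufsize=1024):
    """Read from a pty master fd until EOF, treating EIO as EOF.

    On Linux, reading from the master after the slave side has been
    closed fails with EIO rather than returning b''.
    """
    chunks = []
    while True:
        try:
            data = os.read(fd, bufsize)
        except OSError as e:
            if e.errno == errno.EIO:   # slave closed: treat as EOF
                break
            raise
        if not data:                   # genuine EOF
            break
        chunks.append(data)
    return b''.join(chunks)
```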
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] RFC: PEP 475, Retry system calls failing with EINTR

2014-09-01 Thread Charles-François Natali
There's no return value, a KeyboardInterrupt exception is raised.
The PEP wouldn't change this behavior.

As for the general behavior: all programming languages/platforms
handle EINTR transparently.
It's high time for Python to have a sensible behavior in this regard.



2014-09-01 8:38 GMT+01:00 Marko Rauhamaa ma...@pacujo.net:
 Victor Stinner victor.stin...@gmail.com:

 No, it's the opposite. The PEP doesn't change the default behaviour of
 SIGINT: CTRL+C always interrupt the program.

 Which raises an interesting question: what happens to the os.read()
 return value if SIGINT is received?


 Marko


Re: [Python-Dev] RFC: PEP 475, Retry system calls failing with EINTR

2014-09-01 Thread Charles-François Natali
2014-09-01 12:15 GMT+01:00 Marko Rauhamaa ma...@pacujo.net:
 Charles-François Natali cf.nat...@gmail.com:

 Which raises an interesting question: what happens to the os.read()
 return value if SIGINT is received?

 There's no return value, a KeyboardInterrupt exception is raised.
 The PEP wouldn't change this behavior.

 Slightly disconcerting... but I'm sure overriding SIGINT would cure
 that. You don't want to lose data if you want to continue running.

 As for the general behavior: all programming languages/platforms
 handle EINTR transparently.

 C doesn't. EINTR is there for a purpose.

Python is slightly higher level than C, right? I was referring to
Java, Go, Haskell...

Furthermore, that's not true: many operating systems actually restart
syscalls by default (including Linux, man 7 signal):


   Interruption of system calls and library functions by signal handlers
       If a signal handler is invoked while a system call or library
       function call is blocked, then either:

       * the call is automatically restarted after the signal handler
         returns; or

       * the call fails with the error EINTR.

       Which of these two behaviors occurs depends on the interface and
       whether or not the signal handler was established using the
       SA_RESTART flag (see sigaction(2)). The details vary across UNIX
       systems; below, the details for Linux.


The reason the interpreter is subject to so many EINTR failures is that
we *explicitly* clear SA_RESTART: the C-level signal handler must hand
control back to the interpreter so that the Python-level handlers get a
chance to run from the main loop.

There are many aspects of signal handling in Python that make it
different from C: if you want C semantics, stick to C.

I do not want to have to put all blocking syscalls within a try/except
retry loop: have a look at the stdlib code, you'll see it's really a pain
and ugly. And look at the number of EINTR-related bugs we've had.
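The kind of boilerplate being complained about looks like this (a hypothetical wrapper; PEP 475 makes it unnecessary by retrying inside the interpreter):

```python
import os

def retry_on_eintr(func, *args):
    """Call func(*args), retrying when a signal interrupts the syscall.

    Before PEP 475, every blocking call made from Python code needed a
    guard like this; since Python 3.3, EINTR surfaces as the OSError
    subclass InterruptedError.
    """
    while True:
        try:
            return func(*args)
        except InterruptedError:
            # Interrupted by a signal: the handler already ran, retry.
            continue
```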


Re: [Python-Dev] Exposing the Android platform existence to Python modules

2014-08-01 Thread Charles-François Natali
2014-08-01 13:23 GMT+01:00 Shiz h...@shiz.me:

 Is your P.S. suggestive that you would not be willing to support your port 
 for use by others?  Of course, until it is somewhat complete, it is hard to 
 know how complete and compatible it can be.

 Oh, no, nothing like that. It's just that I'm not sure, as goes for anything, 
 that it would be accepted into mainline CPython. Better safe than sorry in 
 that aspect: maybe the maintainers don't want to support Android in the first 
 place. :)

Well, Android is so popular that supporting it would definitely be interesting.
There are a couple of questions, however (I'm not familiar at all with
Android, I don't have a smartphone ;-):
- Do you have an idea of the amount of work/patch size required? Do
you have an example of a patch (even if it's a work-in-progress)?
- Is there really a common Android platform? I've heard a lot about
fragmentation, so would we have to support several Android flavours
(like #ifdef __ANDROID_VENDOR_A__, #elif defined
__ANDROID_VENDOR_B__)?


Re: [Python-Dev] PEP 471: scandir(fd) and pathlib.Path(name, dir_fd=None)

2014-07-02 Thread Charles-François Natali
2014-07-01 8:44 GMT+01:00 Victor Stinner victor.stin...@gmail.com:

 IMO we must decide if scandir() must support or not file descriptor.
 It's an important decision which has an important impact on the API.

I don't think we should support it: it's way too complicated to use,
error-prone, and leads to messy APIs.


Re: [Python-Dev] PEP 471: scandir(fd) and pathlib.Path(name, dir_fd=None)

2014-07-02 Thread Charles-François Natali
 2014-07-02 12:51 GMT+02:00 Charles-François Natali cf.nat...@gmail.com:
 I don't think we should support it: it's way too complicated to use,
 error-prone, and leads to messy APIs.

 Can you please elaborate? What kind of issue do you see? Handling the
 lifetime of the directory file descriptor?

Yes, among other things. You can e.g. have a look at os.fwalk() or
shutil._rmtree_safe_fd() to see that using those *properly* is far
from trivial.
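For illustration, even a minimal dir_fd-based directory listing has to manage the descriptor's lifetime by hand (a hypothetical helper; Linux-flavoured since it relies on O_DIRECTORY):

```python
import os

def list_dir_stats(path):
    """stat() every entry of a directory relative to an open dir fd.

    The dir_fd approach is resistant to symlink races, but note the
    bookkeeping needed just to avoid leaking the descriptor.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        names = os.listdir(fd)   # os.listdir() accepts an open fd
        return [os.stat(name, dir_fd=fd, follow_symlinks=False)
                for name in names]
    finally:
        os.close(fd)             # easy to forget on every error path
```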

 You don't like the dir_fd parameter of os functions?

Exactly, I think it complicates the API for little benefit (FWIW, no
other language I know of exposes them).


Re: [Python-Dev] should tests be thread-safe?

2014-05-10 Thread Charles-François Natali
 You might have forgotten to include Python-dev in the reply.

Indeed, adding it back!

 Thank you for the reply. I might have expressed the question poorly. I
 meant: I have a script that I know is not thread-safe but it doesn't matter
 because the test itself doesn't run any threads and the current tests are
 never(?) run in multiple threads (-j uses processes). Should this *new* test
 be fixed if e.g., there is a desire to be able to run (at least some) tests
 in multiple threads concurrently in the future?

The short answer is: no, you don't have to make your test thread-safe,
as long as it can reliably run even in the presence of background
threads (like the tkinter threads Victor mentions).


Re: [Python-Dev] API and process questions (sparked by Claudiu Popa on 16104

2014-04-29 Thread Charles-François Natali
2014-04-28 21:24 GMT+01:00 Claudiu Popa pcmantic...@gmail.com:
 [...]

 If anyone agrees with the above, then I'll modify the patch. This will
 be its last iteration, any other bikeshedding
 should be addressed by the core dev who'll apply it.

I'm perfectly happy with those proposals.


Re: [Python-Dev] [issue6839] zipfile can't extract file

2014-04-29 Thread Charles-François Natali
2014-04-30 3:58 GMT+01:00 Steven D'Aprano st...@pearwood.info:
 On Tue, Apr 29, 2014 at 07:48:00PM -0700, Jessica McKellar wrote:
 Hi Adam,

 Gentlemen,
 

 Thanks for contributing to Python! But not everyone on this list is a guy.

 And not all of the guys are gentlemen :-)

And I thought "guys" could be used to address mixed-gender groups (I'm
pretty sure I've heard some ladies use it in this setting), but I'm
not a native speaker.

The idea being that one should not infer too much from a salutation
from someone who might not be a native speaker (some languages default
to masculine for a mixed audience), although in this case "Ladies and
gentlemen" is really famous.

In any case, I'm sure he'd like to have his code reviewed by someone,
regardless of their gender!


Re: [Python-Dev] API and process questions (sparked by Claudiu Popa on 16104

2014-04-28 Thread Charles-François Natali
 (2)  The patch adds new functionality to use multiple processes in
 parallel.  The normal parameter values are integers indicating how
 many processes to use.  The parameter also needs two special values --
 one to indicate use os.cpu_count, and the other to indicate don't
 use multiprocessing at all.

 (A)  Is there a Best Practices for this situation, with two odd cases?


 No. In this situation I would consider 0 or -1 for "use os.cpu_count()" and
 None for "don't use multi-processing".

Why would the user care if multiprocessing is used behind the scenes?
It would be strange for processes=1 to fail if multiprocessing is not
available.

If you set a default value of 1, then compileall() will work
regardless of whether multiprocessing is available.

In short:
processes == 0: use os.cpu_count()
processes == 1 (default): just use normal sequential compiling
processes > 1: use multiprocessing

There's no reason to introduce None. Or am I missing something?
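The proposed semantics can be sketched as follows (illustrative only; for the record, compileall's parameter eventually shipped under the name `workers`):

```python
import os

def compile_one(f):
    # Stand-in for py_compile.compile(f), so the sketch is self-contained.
    return ('compiled', f)

def compile_all(files, processes=1):
    """Dispatch on the parallelism parameter as proposed in the thread."""
    if processes == 0:
        processes = os.cpu_count() or 1     # 0 means "pick for me"
    if processes > 1:
        from concurrent.futures import ProcessPoolExecutor
        with ProcessPoolExecutor(max_workers=processes) as pool:
            return list(pool.map(compile_one, files))
    # processes == 1 (the default): plain sequential compilation,
    # no multiprocessing machinery involved at all.
    return [compile_one(f) for f in files]
```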


Re: [Python-Dev] API and process questions (sparked by Claudiu Popa on 16104

2014-04-28 Thread Charles-François Natali
And incidentally, I think that the argument *processes* should be
renamed to *workers*, or *jobs* (like in make), and any mention of
multiprocessing in the documentation should be removed (if any):
multiprocessing is an implementation detail.
When I type:
make -jN

I don't really care that make is using fork() ;-)


[Python-Dev] file objects guarantees

2014-04-28 Thread Charles-François Natali
Hi,

What's meant exactly by a "file object"?

Let me be more specific: for example, pickle.dump() accepts a file object.

Looking at the code, it doesn't check the return value of its write() method.

So it assumes that write() always writes the whole data (no partial
writes).

Same thing for read: it assumes there won't be short reads.

A sample use case would be passing a socket.makefile() to pickle: it
works, because makefile() returns a BufferedReader/Writer which takes
care of short read/write.
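A minimal sketch of that use case (using socketpair() so the example is self-contained):

```python
import pickle
import socket

# A connected pair of sockets standing in for a real network connection.
a, b = socket.socketpair()

# makefile() wraps the socket in buffered file objects; the buffering
# layer absorbs short reads/writes, so pickle's assumption holds.
wfile = a.makefile('wb')
rfile = b.makefile('rb')

pickle.dump({'answer': 42}, wfile)
wfile.flush()                 # push the buffered bytes onto the socket
obj = pickle.load(rfile)      # reads exactly one pickled object
```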

But the documentation just says "file object". And if you have a look at
the file object definition in the glossary:
https://docs.python.org/3.5/glossary.html#term-file-object


There are actually three categories of file objects: raw binary files,
buffered binary files and text files. Their interfaces are defined in
the io module. The canonical way to create a file object is by using
the open() function.


So someone passing e.g. a raw binary file - which doesn't handle short
reads/writes - would run into trouble.

It's the same thing for e.g. GzipFile, and probably many others.

Would it make sense to add a note somewhere?


Re: [Python-Dev] API and process questions (sparked by Claudiu Popa on 16104

2014-04-28 Thread Charles-François Natali
2014-04-28 18:29 GMT+01:00 Jim J. Jewett jimjjew...@gmail.com:
 On Mon, Apr 28, 2014 at 12:56 PM, Charles-François Natali
 cf.nat...@gmail.com wrote:
 Why would the user care if multiprocessing is used behind the scenes?

 Err ... that was another set of questions that I forgot to ask.

 (A)  Why bother raising an error if multiprocessing is unavailable?
 After all, there is a perfectly fine fallback...

 On other hand, errors should not pass silently.  If a user has
 explicitly asked for multiprocessing, there should be some notice that
 it didn't happen.  And builds are presumably something that a
 developer will monitor to respond to the Exception.

The point I'm making is that he's not asking for multiprocessing, he's
asking for a parallel build.
If you pass 1 (or keep the default value), there's no fallback
involved: the code should simply proceed sequentially.

 (A1)  What sort of Error?  I'm inclined to raise the original
 ImportError, but the patch prefers a ValueError.

NotImplementedError would maybe make sense.

 As Claudiu pointed out, processes=1 should really mean 1 worker
 process, which is still different from do everything in the main
 process.  I'm not sure that level of control is really worth the
 complexity, but I'm not certain it isn't.

I disagree. If you pass jobs=1 (and not processes=1), then you don't
care whether multiprocessing is available or not: you just do
everything in your main process. It would be quite wasteful to create
a single child process!

 processes = 0: use os.cpu_count()

 I could understand doing that for 0 or -1; what is the purpose of
 doing it for both, let alone for -4?

 Are we at the point where the parameter should just take positive
 integers or one of a set of specified string values?

Honestly, as long as it accepts 0, I'm happy.


Re: [Python-Dev] [numpy wishlist] PyMem_*Calloc

2014-04-15 Thread Charles-François Natali
Indeed, that's very reasonable.

Please open an issue on the tracker!


[Python-Dev] pickle self-delimiting

2014-04-01 Thread Charles-François Natali
Hi,

Unless I'm mistaken, pickle's documentation doesn't mention that the pickle
wire-format is self-delimiting. Is there any reason why it's not documented?

The reason I'm asking is because I've seen some code out there doing its
own ad-hoc length-prefix framing.

Cheers,

cf


Re: [Python-Dev] pickle self-delimiting

2014-04-01 Thread Charles-François Natali
 No reason AFAIK. However, the fact that it is self-delimited is implicit
 in the fact that Bytes past the pickled object's representation are
 ignored: https://docs.python.org/dev/library/pickle.html#pickle.load

I find this sentence worrying: it could lead one to think that load() could
read more bytes than the expected object representation size: this would
make pickle actually non self-delimiting, and could lead to problems when
reading e.g. from a socket, since an extraneous read() could block.

I think it's worth making it clear in the doc, I'll open an issue on the
tracker.
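A quick check that load() consumes exactly one object's worth of bytes, which is what makes ad-hoc length-prefix framing unnecessary:

```python
import io
import pickle

buf = io.BytesIO()
for obj in ('first', 'second', [1, 2, 3]):
    pickle.dump(obj, buf)      # back-to-back pickles, no framing

buf.seek(0)
# Each load() stops at the end of one pickled object, leaving the
# stream positioned at the start of the next one.
assert pickle.load(buf) == 'first'
assert pickle.load(buf) == 'second'
assert pickle.load(buf) == [1, 2, 3]
```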


Re: [Python-Dev] Confirming status of new modules in 3.4

2014-03-16 Thread Charles-François Natali
2014-03-15 21:44 GMT+00:00 Nikolaus Rath nikol...@rath.org:

 Guido van Rossum gu...@python.org writes:
  This downside of using subclassing as an API should be well known by now
  and widely warned against.

 It wasn't known to me until now. Are these downsides described in some
 more detail somewhere?


The short version is: inheritance breaks encapsulation.

As a trivial and stupid example, let's say you need a list object which
counts the number of items inserted/removed (it's completely stupid, but
that's not the point :-):

So you might do something like:

class CountingList(list):

[...]

def append(self, e):
self.inserted += 1
return super().append(e)

def extend(self, l):
self.inserted += len(l)
return super().extend(l)



Looks fine, it would probably work.

Now, it's actually very fragile: imagine what would happen if list.extend()
was internally implemented by calling list.append() for each element: you'd
end up counting each element twice (since the subclass append() method
would be called).

And that's the problem: by deriving from a class, you become dependent on
its implementation, even though you're only using its public API. Which means
that it could work with e.g. CPython but not PyPy, or break with a new
version of Python.

Another related problem is, as Guido explained, that if you add a new
method in the subclass, and the parent class gains a method with the same
name in a new version, you're in trouble.

That's why advising inheritance as a silver bullet for code reuse is IMO
one of the biggest mistakes in OOP, simply because, although attractive,
inheritance breaks encapsulation.

As a rule of thumb, you should only use inheritance within a
module/package, or in other words only if you're in control of the
implementation.

The alternative is to use composition.
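A sketch of the composition-based version of the same counting list, which depends only on the public API it calls explicitly:

```python
class CountingList:
    """Wraps a list instead of subclassing it.

    extend() here can never be affected by how list.extend() happens
    to be implemented internally, because the wrapped list never calls
    back into this class.
    """

    def __init__(self):
        self._items = []
        self.inserted = 0

    def append(self, e):
        self.inserted += 1
        self._items.append(e)

    def extend(self, iterable):
        items = list(iterable)
        self.inserted += len(items)
        self._items.extend(items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)
```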

For more details, I highly encourage anyone interested to look at the
book "Effective Java" by Joshua Bloch (the example above is inspired by
his book). Although Java-centric, it's packed with advice, patterns and
anti-patterns that are relevant to OOP and programming in general
(it's in my top-5 books).

cf


Re: [Python-Dev] Confirming status of new modules in 3.4

2014-03-15 Thread Charles-François Natali
2014-03-15 11:02 GMT+00:00 Giampaolo Rodola' g.rod...@gmail.com:

 One part which can be improved is that right now the selectors module
doesn't take advantage of e/poll()'s modify() method: instead it just
calls unregister() and register() on the fd every time, which is of course
considerably slower (there's also a TODO in the code about this).
 I guess that can be fixed later in a safe manner.

Sure, it can be fixed easily, but I'd like to see the gain of this on a
non-trivial benchmark (I mean a realistic workload, not just calling
modify() in a tight loop).


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-25 Thread Charles-François Natali
2013/11/25 Greg Ewing greg.ew...@canterbury.ac.nz:
 Ben Hoyt wrote:

 However, it seems there was no further discussion about why not
 extension and extensions? I have never heard a filename extension
 being called a suffix.


 You can't have read many unix man pages, then! I just
 searched for suffix in the gcc man page, and found
 this:

 For any given input file, the file name suffix determines what kind of
 compilation is done:


 I know it is a suffix in the sense of the
 English word, but I've never heard it called that in this context, and
 I think context is important.


 This probably depends on your background. In my experience,
 the term extension arose in OSes where it was a formal
 part of the filename syntax, often highly constrained.
 E.g. RT11, CP/M, early MS-DOS.

 Unix has never had a formal notion of extensions like that,
 only informal conventions, and has called them suffixes at
 least some of the time for as long as I can remember.

Indeed.
Just for reference, here's an extract from the POSIX basename(1) man page [1]:

SYNOPSIS

basename string [suffix]

DESCRIPTION

The string operand shall be treated as a pathname, as defined in XBD
Pathname. The string string shall be converted to the filename
corresponding to the last pathname component in string and then the
suffix string suffix, if present, shall be removed.


[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html
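For what it's worth, the API under discussion did end up shipping with the suffix terminology:

```python
from pathlib import PurePath

p = PurePath('archive.tar.gz')
assert p.suffix == '.gz'              # the last suffix only
assert p.suffixes == ['.tar', '.gz']  # all suffixes, in order
assert p.stem == 'archive.tar'        # name with the last suffix removed
```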


cf


[Python-Dev] PEP 454 - tracemalloc - accepted

2013-11-21 Thread Charles-François Natali
Hi,

I'm happy to officially accept PEP 454, aka tracemalloc.
The API has substantially improved over the past weeks, and is now
both easy to use and suitable as a foundation for high-level
memory-profiling tools.

Thanks to Victor for his work!


Charles-François


[Python-Dev] PEP 454 (tracemalloc) close to pronouncement

2013-11-10 Thread Charles-François Natali
Hi,

After several exchanges with Victor, PEP 454 has reached a status
which I consider ready for pronouncement [1]: so if you have any last
minute comment, now is the time!

Cheers,

cf


[1] http://www.python.org/dev/peps/pep-0454/


Re: [Python-Dev] Updated PEP 454 (tracemalloc): no more metrics!

2013-10-24 Thread Charles-François Natali
2013/10/24 Kristján Valur Jónsson krist...@ccpgames.com:

 Now, I would personally not truncate the stack, because I can afford the
 memory, but even if I would, for example to hide a bunch of detail, I would
 want to throw away the _lower_ details of the stack. It is unimportant to
 me to know if memory was allocated in
 ...;itertools.py;logging.py;stringutil.py
 but more important to know that it was allocated in
 main.py;databaseengine.py;enginesettings.py;...

Well, maybe to you, but if you look at valgrind for example, it keeps
the top of the stack: and it makes a lot of sense to me, since
otherwise you won't be able to find where the leak occurred.

Anyway, since the stack depth is a tunable parameter, this shouldn't
be an issue in practice: just save the whole stack.


2013/10/24 MRAB pyt...@mrabarnett.plus.com:

 When I was looking for memory leaks in the regex module I simply wrote all
 of the allocations, reallocations and deallocations to a log file and then
 parsed it afterwards using a Python script. Simple, but effective.

We've all done that ;-)

 1) really, all that is required in terms of data is the 
 traceback.get_traces() function.  Further, it _need_ not return addresses 
 since they are not required for analysis.  It is sufficient for it to return 
 a list of (traceback, size, count) tuples.

Sure. Since the beginning, I'm also leaning towards a minimal API, and
let third-party tools do the analysis.

It makes a lot of sense, since some people will want just basic
snapshot information, some others will want to compute various
statistics, some others will want to display the result in a GUI...

But OTOH, it would be too bad not to ship the stdlib with a basic tool
to process the data, so as to make it usable out of the box.

And in this regard, we should probably mimic what's done for CPU
profiling: there is low-level profiling data gathering
infrastructure (profile and cProfile), but there's also a pstats.Stats
class allowing basic operations/display on this raw data.

That's IMO a reasonable balance.

cf


Re: [Python-Dev] [Python-checkins] cpython: Switch subprocess stdin to a socketpair, attempting to fix issue #19293 (AIX

2013-10-23 Thread Charles-François Natali
 For the record, pipe I/O seems a little faster than socket I/O under
 Linux:

 $ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); r=a.fileno(); w=b.fileno(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
 100 loops, best of 3: 1.1 usec per loop

 $ ./python -m timeit -s "import os, socket; a,b = socket.socketpair(); x=b'x'*1000" "a.sendall(x); b.recv(1000)"
 100 loops, best of 3: 1.02 usec per loop

 $ ./python -m timeit -s "import os; r, w = os.pipe(); x=b'x'*1000" "os.write(w, x); os.read(r, 1000)"
 100 loops, best of 3: 0.82 usec per loop

That's a raw write()/read() benchmark, but it's not taking something
important into account: pipes/sockets are usually used to communicate
between concurrently running processes. And in this case, an important
factor is the pipe/socket buffer size: the smaller it is, the more
context switches (due to blocking writes/reads) you'll get, which
greatly decreases throughput.
And by default, Unix sockets have larger buffers than pipes (pipes are
between 4K and 64K depending on the OS):

I wrote a quick benchmark forking a child process, with the parent
writing data through the pipe, and waiting for the child to read it
all. Here are the results (on Linux):

# time python /tmp/test.py pipe

real    0m2.479s
user    0m1.344s
sys     0m1.860s

# time python /tmp/test.py socketpair

real    0m1.454s
user    0m1.242s
sys     0m1.234s

So socketpair is actually faster.
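The script itself wasn't posted; a rough reconstruction of its shape (an assumption, with a deliberately small payload so it runs quickly) could look like:

```python
import os
import socket
import time

def pipe_pair():
    return os.pipe()

def socket_pair():
    a, b = socket.socketpair()
    # detach() transfers fd ownership, so plain os.close() below is safe
    return a.detach(), b.detach()

def transfer(r, w, total=1024 * 1024, chunk=b'x' * 4096):
    """Parent writes `total` bytes; a forked child drains them until EOF."""
    pid = os.fork()
    if pid == 0:                    # child: read everything, then exit
        os.close(w)
        while os.read(r, 65536):
            pass
        os._exit(0)
    os.close(r)
    sent = 0
    while sent < total:             # parent: blocking writes
        sent += os.write(w, chunk)
    os.close(w)
    os.waitpid(pid, 0)
    return sent

for name, make in [('pipe', pipe_pair), ('socketpair', socket_pair)]:
    t0 = time.perf_counter()
    transfer(*make())
    print(name, time.perf_counter() - t0)
```

Timing each variant this way exercises the buffer-size/context-switch effect described above, unlike the raw timeit loop.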

But as noted by Victor, there are slight differences between pipes and
sockets I can think of:
- pipes guarantee write atomicity if less than PIPE_BUF bytes are written,
which is not the case for sockets
- more annoying: in subprocess, the pipes are not set non-blocking:
after a select()/poll() returns a FD write-ready, we write less than
PIPE_BUF at a time to avoid blocking: this likely wouldn't work with a
socketpair

But this patch doesn't touch subprocess itself, and the FDs are only
used by asyncio, which sets them non-blocking: so this could only be
an issue for the spawned process, if it relies on the two
pipe-specific behaviors above.

OTOH, having a unique implementation on all platforms makes sense, and
I don't know if it'll actually be a problem in practice, so we could
ship it as-is and wait until someone complains ;-)

cf


[Python-Dev] pathlib (PEP 428) status

2013-10-23 Thread Charles-François Natali
Hi,

What's the current status of pathlib? Is it targeted for 3.4?

It would be a really nice addition, and AFAICT it has already been
maturing a while on pypi, and discussed several times here.
If I remember correctly, the only remaining issue was stat()'s result caching.

cf


Re: [Python-Dev] PEP 454 (tracemalloc): new minimalist version

2013-10-19 Thread Charles-François Natali
 ``get_tracemalloc_memory()`` function:

 Get the memory usage in bytes of the ``tracemalloc`` module as a
 tuple: ``(size: int, free: int)``.

 * *size*: total size of bytes allocated by the module,
   including *free* bytes
 * *free*: number of free bytes available to store data

 What's *free* exactly? I assume it's linked to the internal storage
 area used by tracemalloc itself, but that's not clear at all.

 Also, is the tracemalloc overhead included in the above stats (I'm
 mainly thinking about get_stats() and get_traced_memory()?
 If yes, I find it somewhat confusing: for example, AFAICT, valgrind's
 memcheck doesn't report the memory overhead, although it can be quite
 large, simply because it's not interesting.

 My goal is to able to explain how *every* byte is allocated in Python.
 If you enable tracemalloc, your RSS memory will double, or something
 like that. You can use get_tracemalloc_memory() to add metrics to a
 snapshot. It helps to understand how the RSS memory evolves.

 Basically, get_tracemalloc_size() is the memory used to store traces.
 It's something internal to the C module (_tracemalloc). This memory is
 not traced because it *is* the traces... and so is not counted in
 get_traced_memory().

 The issue is probably the name (or maybe also the doc): would you
 prefer get_python_memory() / get_traces_memory() names, instead of
 get_traced_memory() / get_tracemalloc_memory()?

No, the names are fine as-is.

 FYI Objects allocated in tracemalloc.py (real objects, not traces) are
 not counted in get_traced_memory() because of a filter set up by
 default (it was not the case in previous versions of the PEP). You can
 remove the filter using tracemalloc.clear_filters() to see this
 memory. There are two exceptions: Python objects created for the
 result of get_traces() and get_stats() are never traced for
 efficiency. It *is* possible to trace these objects, but it's really
 too slow. get_traces() and get_stats() may be called outside
 tracemalloc.py, so another filter would be needed. Well, it's easier
 to never trace these objects. Anyway, they are not interesting to
 understand where your application leaks memory.

Perfect, that's all I wanted to know.

 get_object_trace(obj) is a shortcut for
 get_trace(get_object_address(obj)). I agree that the wrong size
 information can be surprising.

 I can delete get_object_trace(), or rename the function to
 get_object_traceback() and modify it to only return the traceback.

 I prefer to keep the function (modified for get_object_traceback).
 tracemalloc can be combined with other tools like Melia, Heapy or
 objgraph to combine information. When you find an interesting object
 with these tools, you may be interested to know where it was
 allocated.

If you mean modify it to return only the trace, then that's fine.
As for the name, traceback does indeed sound less confusing than
trace, but we should just make sure that the names are consistent
across the API (i.e. always use trace or always use traceback,
not both).

 ``get_trace(address)`` function:

 Get the trace of a memory block as a ``(size: int, traceback)``
 tuple where *traceback* is a tuple of ``(filename: str, lineno:
 int)`` tuples, *filename* and *lineno* can be ``None``.

 Return ``None`` if the ``tracemalloc`` module did not trace the
 allocation of the memory block.

 See also ``get_object_trace()``, ``get_stats()`` and
 ``get_traces()`` functions.

 Do you have example use cases where you want to work with a raw addresses?

 An address is the unique key to identify a memory block. In Python,
 you don't manipulate memory blocks directly, that's why you have a
 get_object_address() function (link objects to traces).

 I added get_trace() because get_traces() is very slow. It would be
 stupid to call it if you only need one trace of a memory block.

 I'm not sure that this function is really useful. I added it to
 workaround the performance issue, and because I believe that someone
 will need it later :-)

 What do you suggest for this function?

Well, I can certainly find a use-case for get_object_trace(): even if
it uses get_trace() internally, I'm not convinced that the latter is
useful.
If we cannot come up with a use case for working with raw addresses,
I'm tempted to just keep get_object_trace() public, and make
get_object_address() and get_trace() private.
In short, don't make any address-manipulating function public.

 Are those ``match`` methods really necessary for the end user, i.e.
 are they worth being exposed as part of the public API?

 (Oh, I just realized that match_lineno() may lead to bugs, so I removed it.)

 Initially, I exposed the methods for unit tests. Later, I used them in
 Snapshot.apply_filters() to factorize the code (before I added 2
 implementations to match a filter, one in C, another in Python).

 I see tracemalloc more as a library, I don't know yet how it will be
 used by new tools based on it. 

Re: [Python-Dev] PEP 454 (tracemalloc): new minimalist version

2013-10-19 Thread Charles-François Natali
2013/10/19 Nick Coghlan ncogh...@gmail.com:

 Speaking of which... Charles-François, would you be willing to act as
 BDFL-Delegate for this PEP? This will be a very useful new analysis tool,
 and between yourself and Victor it looks like you'll be able to come up with
 a solid API.

 I just suggested that approach to Guido and he also liked the idea :)

Well, I'd be happy to help get this merged.

There's still the deadline problem: do we have to get this PEP
approved and merged within 24 hours?

cf
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 454 (tracemalloc): new minimalist version

2013-10-18 Thread Charles-François Natali
Hi,

I'm happy to see this move forward!

 API
 ===

 Main Functions
 --------------

 ``clear_traces()`` function:

 Clear traces and statistics on Python memory allocations, and reset
 the ``get_traced_memory()`` counter.

That's nitpicking, but how about just ``reset()`` (I'm probably biased
by oprofile's opcontrol --reset)?

 ``get_stats()`` function:

 Get statistics on traced Python memory blocks as a dictionary
 ``{filename (str): {line_number (int): stats}}`` where *stats* is a
 ``(size: int, count: int)`` tuple, *filename* and *line_number* can
 be ``None``.

It's probably obvious, but you might want to say once what *size* and
*count* represent (and the unit for *size*).

 ``get_tracemalloc_memory()`` function:

 Get the memory usage in bytes of the ``tracemalloc`` module as a
 tuple: ``(size: int, free: int)``.

 * *size*: total size of bytes allocated by the module,
   including *free* bytes
 * *free*: number of free bytes available to store data

What's *free* exactly? I assume it's linked to the internal storage
area used by tracemalloc itself, but that's not clear at all.

Also, is the tracemalloc overhead included in the above stats (I'm
mainly thinking about get_stats() and get_traced_memory())?
If yes, I find it somewhat confusing: for example, AFAICT, valgrind's
memcheck doesn't report the memory overhead, although it can be quite
large, simply because it's not interesting.

 Trace Functions
 ---------------

 ``get_traceback_limit()`` function:

 Get the maximum number of frames stored in the traceback of a trace
 of a memory block.

 Use the ``set_traceback_limit()`` function to change the limit.

I didn't see anywhere the default value for this setting: it would be
nice to write it somewhere, and also explain the rationale (memory/CPU
overhead...).

 ``get_object_address(obj)`` function:

 Get the address of the main memory block of the specified Python object.

 A Python object can be composed of multiple memory blocks; the
 function only returns the address of the main memory block.

IOW, this should return the same as id() on CPython? If yes, it could
be an interesting note.

 ``get_object_trace(obj)`` function:

 Get the trace of a Python object *obj* as a ``(size: int,
 traceback)`` tuple where *traceback* is a tuple of ``(filename: str,
 lineno: int)`` tuples, *filename* and *lineno* can be ``None``.

I find the trace word confusing, so it might be interesting to add a
note somewhere explaining what it is (callstack leading to the object
allocation, or whatever).

Also, this function leaves me a mixed feeling: it's called
get_object_trace(), but you also return the object size - well, a
vague estimate thereof. I wonder if the size really belongs here,
especially if the information returned isn't really accurate: it will
be for an integer, but not for e.g. a list, right? How about just
using sys.getsizeof(), which would give a more accurate result?
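For illustration, sys.getsizeof() is a shallow measure: it covers only the object's own memory block, not the elements it references (a sketch, independent of tracemalloc):

```python
import sys

# sys.getsizeof() is shallow: it reports the size of the object's
# own memory block (for a list, the header plus the pointer array),
# not the sizes of the objects it references.
a = [0] * 100
b = [10**100] * 100          # same length, much bigger elements
assert sys.getsizeof(a) == sys.getsizeof(b)

# A deep measure must add the elements explicitly; here every
# element of b is the same object, so one addition suffices.
deep_b = sys.getsizeof(b) + sys.getsizeof(b[0])
assert deep_b > sys.getsizeof(b)
```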

 ``get_trace(address)`` function:

 Get the trace of a memory block as a ``(size: int, traceback)``
 tuple where *traceback* is a tuple of ``(filename: str, lineno:
 int)`` tuples, *filename* and *lineno* can be ``None``.

 Return ``None`` if the ``tracemalloc`` module did not trace the
 allocation of the memory block.

 See also ``get_object_trace()``, ``get_stats()`` and
 ``get_traces()`` functions.

Do you have example use cases where you want to work with a raw addresses?

 Filter
 ------

 ``Filter(include: bool, pattern: str, lineno: int=None, traceback:
 bool=False)`` class:

 Filter to select which memory allocations are traced. Filters can be
 used to reduce the memory usage of the ``tracemalloc`` module, which
 can be read using the ``get_tracemalloc_memory()`` function.

 ``match(filename: str, lineno: int)`` method:

 Return ``True`` if the filter matches the filename and line number,
 ``False`` otherwise.

 ``match_filename(filename: str)`` method:

 Return ``True`` if the filter matches the filename, ``False`` otherwise.

 ``match_lineno(lineno: int)`` method:

 Return ``True`` if the filter matches the line number, ``False``
 otherwise.

 ``match_traceback(traceback)`` method:

 Return ``True`` if the filter matches the *traceback*, ``False``
 otherwise.

 *traceback* is a tuple of ``(filename: str, lineno: int)`` tuples.

Are those ``match`` methods really necessary for the end user, i.e.
are they worth being exposed as part of the public API?

 StatsDiff
 ---------

 ``StatsDiff(differences, old_stats, new_stats)`` class:

 Differences between two ``GroupedStats`` instances.

 The ``GroupedStats.compare_to()`` method creates a ``StatsDiff``
 instance.

 ``sort()`` method:

 Sort the ``differences`` list from the biggest difference to the
 smallest difference. Sort by ``abs(size_diff)``, *size*,
 ``abs(count_diff)``, *count* and then by *key*.

 

Re: [Python-Dev] cpython: Try doing a raw test of os.fork()/os.kill().

2013-10-17 Thread Charles-François Natali
2013/10/17 Antoine Pitrou solip...@pitrou.net:
 On Thu, 17 Oct 2013 15:33:02 +0200 (CEST)
 richard.oudkerk python-check...@python.org wrote:
 http://hg.python.org/cpython/rev/9558e9360afc
 changeset:   86401:9558e9360afc
 parent:  86399:9cd88b39ef62
 user:Richard Oudkerk shibt...@gmail.com
 date:Thu Oct 17 14:24:06 2013 +0100
 summary:
   Try doing a raw test of os.fork()/os.kill().

 For this kind of ad-hoc testing, you can also use a custom builder to
 avoid disrupting the main source tree:

AFAICT, the problem he's trying to debug (issue #19227) only occurs on
two specific - stable - buildbots.

cf
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 428: Pathlib

2013-09-16 Thread Charles-François Natali
2013/9/16 Antoine Pitrou solip...@pitrou.net:
 Le Sun, 15 Sep 2013 06:46:08 -0700,
 Ethan Furman et...@stoneleaf.us a écrit :
 I see PEP 428 is both targeted at 3.4 and still in draft status.

 What remains to be done to ask for pronouncement?

 I think I have a couple of items left to integrate in the PEP.
 Mostly it needs me to take a bit of time and finalize the PEP, and
 then have a PEP delegate (or Guido) pronounce on it.

IIRC, during the last discussion round, we were still debating between
implicit stat() result caching - which requires an explicit restat()
method - vs a mapping between the stat() method and a stat() syscall.

What was the conclusion?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] DTRACE support

2013-09-06 Thread Charles-François Natali
 As far as I know, Erlang, Ruby, PHP, Perl, etc., support Dtrace.
 Python is embarrasingly missing from this list.

 Some examples:

 http://crypt.codemancers.com/posts/2013-04-16-profile-ruby-apps-dtrace-part1/
 http://www.phpdeveloper.org/news/18859
 http://www.erlang.org/doc/apps/runtime_tools/DTRACE.html

 I have spend a very long time on a patch for Dtrace support in most
 platforms with dtrace available. Currently working under Solaris and
 derivatives, and MacOS X. Last time I checked, it would crash FreeBSD
 because bugs in the dtrace port, but that was a long time ago.

 I would like to push this to Python 3.4, and the window is going to be
 closed soon, so I think this is the time to ask for opinions and
 support here.

 Does Python-Dev have any opinion or interest in this project?. Should
 I push for it?

IMO, that's a large, intrusive patch, which distracts the reader from
the main code and logic.

Here's an extract from Modules/gcmodule.c:

static void
dtrace_gc_done(Py_ssize_t value)
{
    PYTHON_GC_DONE((long) value);
    /*
     * Currently a USDT tail-call will not receive the correct arguments.
     * Disable the tail call here.
     */
#if defined(__sparc)
    asm("nop");
#endif
}

Also have a look at Python/ceval.c:
http://bugs.python.org/review/13405/diff/6152/Python/ceval.c


IMO it's not worth it (personally strace/gdb/valgrind are more than
enough for me, and we're about to gain memory tracing with Victor's
tracemalloc).

cf
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] DTRACE support

2013-09-06 Thread Charles-François Natali
 The main value of DTrace is systemwide observability. You can see
 something strange at kernel level and trace it to a particular line
 of code in a random Python script. There is no other tool that can do
 that. You have complete transversal observability of ALL the code
 running in your computer, kernel or usermode, clean reports with
 threads, etc.

Don't get me wrong, I'm not saying DTrace is useless.
I'm just saying that, as far as I'm concerned, I've never had any
trouble debugging/tunning a Python script with non-intrusive tools
(strace, gdb, valgrind, and oprofile for profiling). Of course, this
includes analysing bug reports.

 Maybe the biggest objection would be that most python-devs are running
 Linux, and you don't have dtrace support on linux unless you are
 running Oracle distribution. But world is larger than linux, and there
 are some efforts to port DTrace to Linux itself. DTrace is available
 on Solaris and derivatives, MacOS X and FreeBSD.

That's true, I might have a different opinion if I used Solaris. But
that's not the case, so to me, the cognitive overhead incurred by this
large patch isn't worth it.

So I'm -1, but that's a personal opinion :-)

cf
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Add a new tracemalloc module to trace memory allocations

2013-08-31 Thread Charles-François Natali
2013/8/29 Victor Stinner victor.stin...@gmail.com:
 Charles-François Natali and Serhiy Storchaka asked me to add this
 module somewhere in Python 3.4: how about adding pyfailmalloc to the
 main repo (maybe under Tools), with a script making it easy to run the
 tests suite with it enabled?

There are two reasons I think it would be a great addition:
- since OOM conditions are - almost - never tested, the OOM handling
code is - almost - always incorrect: indeed, Victor has found and
fixed several dozens crashes thanks to this module
- this module is actually really simple (~150 LOC)

I have two comments on the API:
1)
failmalloc.enable(range: int=1000): schedule a memory allocation
failure in random.randint(1, range) allocations.

That's one shot, i.e. only one failure will be triggered. So if this
failure occurs in a place where the code is prepared to handle
MemoryError (e.g. bigmem tests), no failure will occur in the
remaining test. It would be better IMO to repeat this (i.e. reset the
next failure counter), to increase the coverage.
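The re-arming behaviour suggested above can be sketched in pure Python (pyfailmalloc itself hooks the C-level allocators; the class name and API here are hypothetical, for illustration only):

```python
import random

class FailureInjector:
    """Sketch of the proposed behaviour: after each triggered
    failure, re-arm the countdown instead of firing only once,
    so OOM-handling paths keep being exercised."""

    def __init__(self, rng=1000):
        self.rng = rng
        self._rearm()

    def _rearm(self):
        # Next failure in random.randint(1, rng) allocations.
        self.countdown = random.randint(1, self.rng)

    def should_fail(self):
        self.countdown -= 1
        if self.countdown == 0:
            self._rearm()       # repeat, to increase coverage
            return True
        return False
```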

2)
It's a consequence of 1): since only one malloc() failure is
triggered, it doesn't really reflect how a OOM condition would appear
in real life: usually, it's either because you've exhausted your
address space or the machine is under memory pressure, which means
that once you've hit OOM, you're likely to encounter it again on
subsequent allocations, for example if your OOM handling code
allocates new memory (that's why it's so complicated to properly
handle OOM, and one might want to use memory parachutes).
It might be interesting to be able to pass an absolute maximum memory
usage, or an option where once you've triggered a malloc() failure,
you record the current memory usage, and use it as ceiling for
subsequent allocations.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] EINTR handling...

2013-08-30 Thread Charles-François Natali
Hello,

This has been bothering me for years: why don't we properly handle
EINTR, by running registered signal handlers and restarting the
interrupted syscall (or eventually returning early e.g. for sleep)?

EINTR is really a nuisance, and exposing it to Python code is just pointless.

Now some people might argue that some code relies on EINTR to
interrupt a syscall on purpose, but I don't really buy it: it's highly
non-portable (depends on the syscall, SA_RESTART flag...) and subject
to race conditions (if it comes before the syscall or if you get a
partial read/write you'll deadlock).

Furthermore, the stdlib code base is not consistent: some code paths
handle EINTR, e.g. subprocess, multiprocessing, sock_sendall() does
but not sock_send()...
Just grep for EINTR and InterruptedError and you'll be amazed.
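At the Python level, every call site currently has to wrap the syscall itself; a minimal sketch (the helper name is made up) of what transparent handling would replace looks like:

```python
import os

def retrying_read(fd, n):
    """Retry os.read() when it is interrupted by a signal.

    If the Python-level signal handler raises (e.g. KeyboardInterrupt
    for SIGINT), the exception propagates; otherwise the read is
    simply restarted -- which is what interpreter-level EINTR
    handling would do automatically for every syscall.
    """
    while True:
        try:
            return os.read(fd, n)
        except InterruptedError:   # EINTR
            continue
```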

GHC, the JVM and probably other platforms handle EINTR, maybe it's
time for us too?

Just for reference, here are some issues due to EINTR popping up:
http://bugs.python.org/issue17097
http://bugs.python.org/issue12268
http://bugs.python.org/issue9867
http://bugs.python.org/issue7978
http://bugs.python.org/issue12493
http://bugs.python.org/issue3771


cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] EINTR handling...

2013-08-30 Thread Charles-François Natali
2013/8/30 Amaury Forgeot d'Arc amaur...@gmail.com:
 I agree.
 Is there a way to see in C code where EINTR is not handled?

EINTR can be returned on slow syscalls, so a good heuristic would be
to start with code that releases the GIL.
But I don't see a generic way apart from grepping for syscalls that
are documented to return EINTR.

 Or a method to handle this systematically?

The glibc defines this macro:

# define TEMP_FAILURE_RETRY(expression) \
  (__extension__ \
    ({ long int __result; \
       do __result = (long int) (expression); \
       while (__result == -1L && errno == EINTR); \
       __result; }))
#endif

which you can then use as:
pid = TEMP_FAILURE_RETRY(waitpid(pid, &status, options));

Unfortunately, it's not as easy for us, since we must release the GIL
around the syscall, try again if it failed with EINTR, only after
having called PyErr_CheckSignals() to run signal handlers.

e.g. waitpid():


Py_BEGIN_ALLOW_THREADS
pid = waitpid(pid, &status, options);
Py_END_ALLOW_THREADS


should become (conceptually):


begin_handle_eintr:
Py_BEGIN_ALLOW_THREADS
pid = waitpid(pid, &status, options);
Py_END_ALLOW_THREADS

if (pid < 0 && errno == EINTR) {
    if (PyErr_CheckSignals())
        return NULL;
    goto begin_handle_eintr;
}


We might want to go for a clever macro (like BEGIN_SELECT_LOOP in
socketmodule.c).

2013/8/30 Nick Coghlan ncogh...@gmail.com:
 Sounds good to me. I don't believe there's been a conscious decision
 that we *shouldn't* handle it, it just hasn't annoyed anyone enough
 for them to propose a systematic fix in CPython. If that latter part
 is no longer true, great ;)

Great, I'll open a bug report then :)

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] hg.python.org is slow

2013-08-27 Thread Charles-François Natali
Hi,

I'm trying to checkout a pristine clone from
ssh://h...@hg.python.org/cpython, and it's taking forever:

07:45:35.605941 IP 192.168.0.23.43098 >
virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22081460, win 14225,
options [nop,nop,TS val 368519 ecr 2401783356], length 0
07:45:38.558348 IP virt-7yvsjn.psf.osuosl.org.ssh >
192.168.0.23.43098: Flags [.], seq 22081460:22082908, ack 53985, win
501, options [nop,nop,TS val 2401784064 ecr 368519], length 1448
07:45:38.558404 IP 192.168.0.23.43098 >
virt-7yvsjn.psf.osuosl.org.ssh: Flags [.], ack 22082908, win 14225,
options [nop,nop,TS val 369257 ecr 2401784064], length 0
07:45:39.649995 IP virt-7yvsjn.psf.osuosl.org.ssh >
192.168.0.23.43098: Flags [.], seq 22082908:22084356, ack 53985, win
501, options [nop,nop,TS val 2401784367 ecr 369257], length 1448


See the time to just get an ACK?

Am I the only one experiencing this?

Cheers,

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] hg.python.org is slow

2013-08-27 Thread Charles-François Natali
2013/8/27 Antoine Pitrou solip...@pitrou.net:
 Sounds a lot like a network problem, then?

If I'm the only one, it's likely, although these pathological timeouts
are transient, and I don't have any problem with other servers (my
line sustains 8Mb/s without problem).

 Have you tried a traceroute?

I'll try tonight if this persists, and keep you posted.

2013/8/27 Ned Deily n...@acm.org:
 BTW, do you have ssh compression enabled for that host?

Yep.

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review

2013-08-23 Thread Charles-François Natali
Hello,

A couple remarks:

 The following functions are modified to make newly created file descriptors 
 non-inheritable by default:
 [...]
 os.dup()

then

 os.dup2() has a new optional inheritable parameter: os.dup2(fd, fd2, 
 inheritable=True). fd2 is created inheritable by default, but non-inheritable 
 if inheritable is False.

Why does dup2() create inheritable FD, and not dup()?

I think a hint is given a little later:

 Applications using the subprocess module with the pass_fds parameter or using 
 os.dup2() to redirect standard streams should not be affected.

But that's overly-optimistic.

For example, a lot of code uses the guarantee that dup()/open()...
returns the lowest numbered file descriptor available, so code like
this:

r, w = os.pipe()
if os.fork() == 0:
    # child
    os.close(r)
    os.close(1)
    os.dup(w)

*will break*

And that's a lot of code (e.g. that's what _posixsubprocess.c uses,
but since it's implemented in C it wouldn't be affected).
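The lowest-available-fd guarantee that this pattern relies on is easy to check directly (a POSIX-only sketch):

```python
import os

# POSIX guarantees that dup() -- like open() -- returns the lowest
# numbered file descriptor currently available, and that pipe() uses
# the two lowest free descriptors.
r, w = os.pipe()
os.close(r)        # free r's slot...
fd = os.dup(w)     # ...which dup() immediately reuses
assert fd == r
os.close(fd)
os.close(w)
```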

We've already had this discussion, and I stand by my claim that
changing the default *will break* user code.
Furthermore, many people use Python for system programming, and this
change would be highly surprising.

So no matter what the final decision on this PEP is, it must be kept in mind.

 The programming languages Go, Perl and Ruby make newly created file 
 descriptors non-inheritable by default: since Go 1.0 (2009), Perl 1.0 (1987) 
 and Ruby 2.0 (2013).

OK, but do they expose OS file descriptors?
I'm sure such a change would be fine for Java, which doesn't expose
FDs and fork(), but Python's another story.

Last time, I said that to me, the FD inheritance issue is solved on
POSIX by the subprocess module which passes close_fds. In my own code,
I use subprocess, which is the official, portable and safe way to
create child processes in Python. Someone using fork() + exec() should
know what he's doing, and be able to deal with the consequences: I'm
not only talking about FD inheritance, but also about
async-signal/multi-threaded safety ;-)

As for Windows, since it doesn't have fork(), it would make sense to
make its FDs non-inheritable by default. And then use what you describe
here to selectively inherit FDs (i.e. implement keep_fds):

Since Windows Vista, CreateProcess() supports an extension of the
STARTUPINFO structure: the STARTUPINFOEX structure. Using this new
structure, it is possible to specify a list of handles to inherit:
PROC_THREAD_ATTRIBUTE_HANDLE_LIST. Read Programmatically controlling
which handles are inherited by new processes in Win32 (Raymond Chen,
Dec 2011) for more information.


cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446 (make FD non inheritable) ready for a final review

2013-08-23 Thread Charles-François Natali
 About your example: I'm not sure that it is reliable/portable. I saw
 daemon libraries closing *all* file descriptors and then expecting new
 file descriptors to become 0, 1 and 2. Your example is different
 because w is still open. On Windows, I have seen cases with only fd 0,
 1, 2 open, and the next open() call gives the fd 10 or 13...

Well, my example uses fork(), so obviously doesn't apply to Windows.
It's perfectly safe on Unix.

 I'm optimistic and I expect that most Python applications and
 libraries already use the subprocess module. The subprocess module
 closes all file descriptors (except 0, 1, 2) since Python 3.2.
 Developers relying on the FD inheritance and using the subprocess with
 Python 3.2 or later already had to use the pass_fds parameter.

As long as the PEP makes it clear that this breaks backward
compatibility, that's fine. IMO the risk of breakage outweights the
modicum benefit.

 The subprocess module has still a (minor?) race condition in the child
 process. Another C thread can create a new file descriptor after the
 subprocess module closed all file descriptors and before exec(). I
 hope that it is very unlikely, but it can happen.

No it can't, because after fork(), there's only one thread.
It's perfectly safe.

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446: Open issues/questions

2013-08-02 Thread Charles-François Natali
2013/8/2 Victor Stinner victor.stin...@gmail.com:
 2013/7/28 Antoine Pitrou solip...@pitrou.net:
 (A) How should we support support where os.set_inheritable() is not
 supported? Can we announce that os.set_inheritable() is always
 available or not? Does such platform exist?

 FD_CLOEXEC is POSIX:
 http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html

 Ok, but this information does not help me. Does Python support
 non-POSIX platforms? (Windows has HANDLE_FLAG_INHERIT.)

 If we cannot answer to my question, it's safer to leave
 os.get/set_inheritable() optional (need hasattr in tests for example).

On Unix platforms, you should always have FD_CLOEXEC.
If there were such a platform without FD inheritance support, then it
would probably make sense to make it a no-op anyway.

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446: Open issues/questions

2013-08-02 Thread Charles-François Natali
2013/8/2 Victor Stinner victor.stin...@gmail.com:
 On Windows, inheritable handles (including open files) are still
 inherited when a standard stream is overriden in the subprocess module
 (default value of close_fds is set to False in this case). This issue
 cannot be solved (at least, I don't see how): it is a limitation of
 Windows. bInheritHandles must be set to TRUE (inherit *all*
 inheritable handles) when handles of standard streams are specified in
 the startup information of CreateProcess().

Then how about changing the default to creating file descriptors
non-inheritable on Windows (which is apparently the default)?
Then you can implement keep_fds by setting them inheritable right
before creation, and resetting them right after: sure there's a race
in a multi-threaded program, but AFAICT that's already the case right
now, and Windows API doesn't leave us any other choice.
Amusingly, they address this case by recommending putting process
creation in a critical section:
http://support.microsoft.com/kb/315939/en-us
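The mark-inheritable-around-creation idea can be sketched in Python (the helper name and lock are hypothetical, and os.set_inheritable() is the function this PEP proposes; the lock only narrows the race with other threads, it cannot remove it):

```python
import os
import threading

_spawn_lock = threading.Lock()

def spawn_with_inherited(argv, keep_fds=()):
    """Hypothetical sketch: temporarily mark *keep_fds* inheritable
    around process creation, then reset them, holding a lock so that
    concurrent spawns in this process don't interleave."""
    with _spawn_lock:
        for fd in keep_fds:
            os.set_inheritable(fd, True)
        try:
            return os.spawnv(os.P_NOWAIT, argv[0], argv)
        finally:
            for fd in keep_fds:
                os.set_inheritable(fd, False)
```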

This way, we keep default platform behavior on Unix and on Windows (so
user using low-level syscalls/APIs won't be surprised), and we have a
clean way to selectively inherit FD in child processes through
subprocess.

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446: Open issues/questions

2013-07-30 Thread Charles-François Natali
 Having stdin/stdout/stderr cloexec (e.g. after a dup() to redirect to
 a log file, a socket...) will likely break a lot of code, e.g. code
 using os.system(), or code calling exec manually (and I'm sure there's
 a bunch of it).

 Hmm. os.exec*() could easily make standard streams non-CLOEXEC before
 calling the underlying C library function. Things are more annoying for
 os.system(), though.

 Also, it'll be puzzling to have syscalls automatically set the cloexec
 flag. I guess a lot of people doing system programming with Python
 will get bitten, but that's a discussion we already had months ago...

 Perhaps this advocates for a global flag, e.g.
 sys.set_default_fd_inheritance(), with False (non-inheritable) being
 the default for sanity and security.

This looks more and more like PEP 433 :-)

And honestly, when I think about it, I think that this whole mess is a
solution looking for a problem.
If we don't want to inherit file descriptors in child processes, the
answer is simple: the subprocess module (this fact is not even
mentioned in the PEP).
If a user wants to use the execve() syscall directly, then he should
be aware of the implications. I don't think that patching half the
stdlib and complicating the API of many functions is the right way to
do this.

The stdlib should be updated to replace the handful of places where
exec() is called explicitly by subprocess (the only one I can think on
top of my head is http.server.CGIHTTPRequestHandler (issue #16945)),
otherwise that's about it.

cf




 Regards

 Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446: Open issues/questions

2013-07-28 Thread Charles-François Natali
2013/7/28 Antoine Pitrou solip...@pitrou.net:
 (C) Should we handle standard streams (0: stdin, 1: stdout, 2: stderr)
 differently? For example, os.dup2(fd, 0) should make the file
 descriptor 0 (stdin) inheritable or non-inheritable? On Windows,
 os.set_inheritable(fd, False) fails (error 87, invalid argument) on
 standard streams (0, 1, 2) and copies of the standard streams (created
 by os.dup()).

 I have been advocating for that, but I now realize that special-casing
 these three descriptors in a myriad of fd-creating functions isn't very
 attractive.
 (if a standard stream fd has been closed, any fd-creating function can
 re-create that fd: including socket.socket(), etc.)

 So perhaps only the *original* standard streams should be left
 inheritable?

Having stdin/stdout/stderr cloexec (e.g. after a dup() to redirect to
a log file, a socket...) will likely break a lot of code, e.g. code
using os.system(), or code calling exec manually (and I'm sure there's
a bunch of it).
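The kind of code that would break is the classic redirection idiom, which assumes fd 1 stays inheritable across os.system() (a sketch; the helper name is made up, and a temp file stands in for a real log file):

```python
import os
import tempfile

def run_with_stdout_redirected(cmd):
    """Redirect fd 1 to a temp file around os.system(cmd) and return
    what the child wrote -- this only works because the standard
    streams are inherited by the child."""
    with tempfile.TemporaryFile() as log:
        saved = os.dup(1)             # keep the real stdout
        os.dup2(log.fileno(), 1)      # point fd 1 at the log file
        try:
            os.system(cmd)            # child inherits fd 1
        finally:
            os.dup2(saved, 1)         # restore stdout
            os.close(saved)
        log.seek(0)
        return log.read()
```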

Also, it'll be puzzling to have syscalls automatically set the cloexec
flag. I guess a lot of people doing system programming with Python
will get bitten, but that's a discussion we already had months ago...

cf
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 446: Add new parameters to configure the inherance of files and for non-blocking sockets

2013-07-07 Thread Charles-François Natali
2013/7/7 Cameron Simpson c...@zip.com.au:
 On 06Jul2013 11:23, Charles-François Natali cf.nat...@gmail.com wrote:
 |  I've read your Rejected Alternatives more closely and Ulrich
 |  Drepper's article, though I think the article also supports adding
 |  a blocking (default True) parameter to open() and os.open(). If you
 |  try to change that default on a platform where it doesn't work, an
 |  exception should be raised.
 |
  | Contrary to close-on-exec, non-blocking only applies to a limited
 | type of files (e.g. it doesn't work for regular files, which represent
 | 90% of open() use cases).

 sockets, pipes, serial devices, ...

How do you use open() on a socket (which is already covered by
socket(blocking=...))? Also, I said *regular files* - for which
O_NONBLOCK doesn't make sense - represent 90% of io.open() use cases,
and stand by this claim. Nothing prevents you from setting the FD
non-blocking manually.

 And you can set it on anything. Just because some things don't block
 anyway isn't really a counter argument.

Well, it complicates the signature and implementation.
If we go the same way, why stop there and not expose O_DSYNC, O_SYNC,
O_DIRECT...

When using a high-level API like io.open(), I think we should only
expose portable flags, which are supported on all operating systems
(like the 'x' O_EXCL flag added in 3.3) and all file types.

If you want precise control over the open() sementics, os.open() is
the way to go (that's also the rationale behind io.open() opener
argument, see http://bugs.python.org/issue12105)
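The opener argument makes that kind of precise control composable with io.open(); a sketch (assuming a POSIX system and os.O_CLOEXEC, available since Python 3.3):

```python
import os

def cloexec_opener(path, flags):
    """Opener for open()/io.open() that ORs in O_CLOEXEC, so the
    close-on-exec flag is set atomically at open() time rather than
    with a separate, racy fcntl() call afterwards."""
    return os.open(path, flags | os.O_CLOEXEC)
```

Usage: `open("settings.cfg", opener=cloexec_opener)` behaves like a normal text-mode open, but the underlying descriptor never leaks across exec().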

cf


Re: [Python-Dev] PEP 446: Add new parameters to configure the inherance of files and for non-blocking sockets

2013-07-06 Thread Charles-François Natali
 I've read your Rejected Alternatives more closely and Ulrich
 Drepper's article, though I think the article also supports adding
 a blocking (default True) parameter to open() and os.open(). If you
 try to change that default on a platform where it doesn't work, an
 exception should be raised.

Contrarily to close-on-exec, non-blocking only applies to a limited
type of files (e.g. it doesn't work for regular files, which represent
90% of open() use cases).

Also, one of the main reasons for exposing close-on-exec in
open()/socket() etc is to make it possible to create file descriptors
with the close-on-exec flag atomically, to prevent unwanted FD
inheritance especially in multi-threaded code. And that's not
necessary for the non-blocking parameter.

Those are two reasons why IMO blocking doesn't have to receive the
same treatment as close-on-exec (there's also the Windows issue but
I'm not familiar with it).

cf


Re: [Python-Dev] PEP 446: Add new parameters to configure the inherance of files and for non-blocking sockets

2013-07-05 Thread Charles-François Natali
2013/7/4 Victor Stinner victor.stin...@gmail.com:
 Even if the PEP 433 was not explicitly rejected, no consensus could be
 reached. I didn't want to lose all my work on this PEP and so I'm
 proposing something new which should make everybody agree :-)

Thanks Victor, I think this one is perfectly fine!

cf


Re: [Python-Dev] stat module in C -- what to do with stat.py?

2013-06-20 Thread Charles-François Natali
2013/6/20 Thomas Wouters tho...@python.org:
 If the .py file is going to be wrong or incomplete, why would we want to
 keep it -- or use it as fallback -- at all? If we're dead set on having a
 .py file instead of requiring it to be part of the interpreter (whichever
 that is, however it was built), it should be generated as part of the build
 process. Personally, I don't see the value in it; other implementations will
 need to do *something* special to use it anyway.

That's exactly my rationale for pushing for removal.

cf


Re: [Python-Dev] pyparallel and new memory API discussions...

2013-06-19 Thread Charles-François Natali
2013/6/19 Trent Nelson tr...@snakebite.org:

 The new memory API discussions (and PEP) warrant a quick pyparallel
 update: a couple of weeks after PyCon, I came up with a solution for
 the biggest show-stopper that has been plaguing pyparallel since its
 inception: being able to detect the modification of main thread
 Python objects from within a parallel context.

 For example, `data.append(4)` in the example below will generate an
 AssignmentError exception, because data is a main thread object, and
 `data.append(4)` gets executed from within a parallel context::

 data = [ 1, 2, 3 ]

 def work():
 data.append(4)

 async.submit_work(work)

 The solution turned out to be deceptively simple:

   1.  Prior to running parallel threads, lock all main thread
   memory pages as read-only (via VirtualProtect on Windows,
   mprotect on POSIX).

   2.  Detect attempts to write to main thread pages during parallel
   thread execution (via SEH on Windows or a SIGSEGV trap on POSIX),
   and raise an exception instead (detection is done in the ceval
   frame exec loop).

Quick stupid question: because of refcounts, the pages will be written
to even in case of read-only access. How do you deal with this?

cf


Re: [Python-Dev] HAVE_FSTAT?

2013-05-19 Thread Charles-François Natali
2013/5/17 Antoine Pitrou solip...@pitrou.net:

 Hello,

 Some pieces of code are still guarded by:
 #ifdef HAVE_FSTAT
   ...
 #endif

 I would expect all systems to have fstat() these days. It's pretty
 basic POSIX, and even Windows has had it for ages. Shouldn't we simply
 make those code blocks unconditional? It would avoid having to maintain
 unused fallback paths.

I was sure I'd seen a post/bug report about this:
http://bugs.python.org/issue12082

The OP was trying to build Python on an embedded platform without fstat().

cf


Re: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2

2013-05-16 Thread Charles-François Natali
2013/5/16 Serhiy Storchaka storch...@gmail.com:
 On 16.05.13 08:20, Georg Brandl wrote:

 On behalf of the Python development team, I am pleased to announce the
 releases of Python 3.2.5 and 3.3.2.

 The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip
 and xml.sax modules.  Details can be found in the changelogs:


 It seems that I'm the main culprit of these releases.

Well, when I look at the changelogs, what strikes me more is that
you're the author of *many* fixes, and also a lot of new
features/improvements.

So I wouldn't feel bad if I were you, this kind of thing happens (and
it certainly did to me).

Cheers,

Charles


Re: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info

2013-05-14 Thread Charles-François Natali
 I wonder how sshfs compared to nfs.

(I've modified your benchmark to also test the case where data isn't
in the page cache).

Local ext3:
cached:
os.walk took 0.096s, scandir.walk took 0.030s -- 3.2x as fast
uncached:
os.walk took 0.320s, scandir.walk took 0.130s -- 2.5x as fast

NFSv3, 1Gb/s network:
cached:
os.walk took 0.220s, scandir.walk took 0.078s -- 2.8x as fast
uncached:
os.walk took 0.269s, scandir.walk took 0.139s -- 1.9x as fast
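For readers who want to see where the gap comes from, here is a minimal scandir-based walk (a sketch against the os.scandir() API that later landed in the stdlib; `fast_walk` is my own name, not the benchmarked implementation). The DirEntry objects carry the d_type information from the directory entry itself, so no extra stat() call is needed per file:

```python
import os

def fast_walk(top):
    """os.walk() lookalike built on os.scandir(): classifying each
    entry via DirEntry.is_dir() usually avoids a stat() syscall."""
    dirs, files = [], []
    with os.scandir(top) as it:
        for entry in it:
            if entry.is_dir(follow_symlinks=False):
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    yield top, dirs, files
    for name in dirs:
        yield from fast_walk(os.path.join(top, name))
```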


Re: [Python-Dev] PEP 435 - requesting pronouncement

2013-05-05 Thread Charles-François Natali
I'm chiming in late, but am I the only one who's really bothered by the syntax?

class Color(Enum):
red = 1
green = 2
blue = 3

I really don't see why one has to provide values, since an enum
constant *is* the value.
In many cases, there's no natural mapping between an enum constant and
a value, e.g. there's no reason why Color.red should be mapped to 1
and Color.blue to 3.

Furthermore, the PEP makes it possible to do something like:

class Color(Enum):
red = 1
green = 2
blue = 3
red_alias = 1


which is IMO really confusing, since enum instances are supposed to be distinct.

All the languages I can think of that support explicit values (Java
being particular in the sense that it's really a full-fledged object
which can have attributes, methods, etc) make it optional by default.

Finally, I think 99% of users won't care about the assigned value
(which is just an implementation detail), so explicit values will be
just noise annoying users (well, me at least :-).
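For the record, the enum module as it eventually shipped resolves the duplicate-value case described above by making the second name an alias of the first rather than a distinct member. A quick illustration against the final stdlib API (not the PEP draft):

```python
from enum import Enum

class Color(Enum):
    red = 1
    green = 2
    blue = 3
    red_alias = 1   # same value => alias of Color.red, not a new member

# Iteration yields only the three canonical members;
# Color.red_alias is the very same object as Color.red.
```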

cf



2013/5/5 Eli Bendersky eli...@gmail.com:
 Hello pydev,

 PEP 435 is ready for final review. A lot of the feedback from the last few
 weeks of discussions has been incorporated. Naturally, not everything could
 go in because some minor (mostly preference-based) issues did not reach a
 consensus. We do feel, however, that the end result is better than in the
 beginning and that Python can finally have a useful enumeration type in the
 standard library.

 I'm attaching the latest version of the PEP for convenience. If you've read
 previous versions, the easiest way to get acquainted with the recent changes
 is to go through the revision log at http://hg.python.org/peps

 A reference implementation for PEP 435 is available at
 https://bitbucket.org/stoneleaf/ref435

 Kind regards and happy weekend.




Re: [Python-Dev] PEP 428: stat caching undesirable?

2013-05-02 Thread Charles-François Natali
 Yes, definitely. This is exactly what my os.walk() replacement,
 Betterwalk, does:
 https://github.com/benhoyt/betterwalk#readme

 On Windows you get *all* stat information from iterating the directory
 entries (FindFirstFile etc). And on Linux most of the time you get enough
 for os.walk() not to need an extra stat (though it does depend on the file
 system).

 I still hope to clean up Betterwalk and make a C version so we can use it in
 the standard library. In many cases it speeds up os.walk() by several times,
 even an order of magnitude in some cases. I intend for it to be a drop-in
 replacement for os.walk(), just faster.

Actually, there's Gregory's scandir() implementation (returning a
generator to be able to cope with large directories) on its way:

http://bugs.python.org/issue11406

It's already been suggested to make it return a tuple (with d_type).
I'm sure a review of the code (especially the Windows implementation)
will be welcome.


Re: [Python-Dev] PEP 428: stat caching undesirable?

2013-05-01 Thread Charles-François Natali
 3) Leave it up to performance critical code, such as the import
 machinery, or walkdirs that Nick mentioned, to do their own caching, and
 simplify the filepath API for the simple case.

 But one can still make life easier for code like that, by adding
 is_file() and friends on the stat result object as I suggested.

+1 from me.
PEP 428 goes in the right direction with a distinction between pure
paths and concrete paths. Pure paths support syntactic operations,
whereas I would expect concrete paths to actually access the file
system. Having a method like restat() is a hint that something's
wrong; I'm convinced this will bite some people.

I'd also be in favor of having a wrapper class around an os.stat()
result which would export utility methods such as is_file()/is_directory()
and owner/group etc. attributes.

That way, the default behavior would be correct, and this helper class
would make it easier for users like walkdir() to implement their own
caching.

As an added benefit, this would make path objects actually immutable,
which is always a good thing (simpler, and you get thread-safety for
free).
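A sketch of such a wrapper (class and method names here are my own illustration, not a proposed API; the owner property is POSIX-only):

```python
import pwd
import stat

class StatResult:
    """Immutable convenience wrapper around an os.stat() result."""

    def __init__(self, st):
        self._st = st

    def is_file(self):
        return stat.S_ISREG(self._st.st_mode)

    def is_directory(self):
        return stat.S_ISDIR(self._st.st_mode)

    @property
    def owner(self):
        # Resolve st_uid to a user name (POSIX-only)
        return pwd.getpwuid(self._st.st_uid).pw_name
```

Callers that want caching can hold on to one StatResult; callers that want fresh data just build a new one from a new os.stat() call.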


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-04-05 Thread Charles-François Natali
Hello,

 async.submit_work(func, args, kwds, callback=None, errback=None)

 How do you implement arguments passing and return value?

 e.g. let's say I pass a list as argument: how do you iterate on the
 list from the worker thread without modifying the backing objects for
 refcounts (IIUC you use a per-thread heap and don't do any
 refcounting).

 Correct, nothing special is done for the arguments (apart from
 incref'ing them in the main thread before kicking off the parallel
 thread (then decref'ing them in the main thread once we're sure the
 parallel thread has finished)).

IIUC you incref the argument from the main thread before publishing it
to the worker thread: but what about containers like lists? How do you
make sure the elements don't get deallocated while the worker thread
iterates? More generally, how do you deal with
non-local objects?

BTW I don't know if you did, but you could probably have a look at
Go's goroutines and Erlang processes.

cf


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-04-04 Thread Charles-François Natali
Just a quick implementation question (didn't have time to read through
all your emails :-)

async.submit_work(func, args, kwds, callback=None, errback=None)

How do you implement arguments passing and return value?

e.g. let's say I pass a list as argument: how do you iterate on the
list from the worker thread without modifying the backing objects for
refcounts (IIUC you use a per-thread heap and don't do any
refcounting). Same thing for return value, how do you pass it to the
callback?

cf


Re: [Python-Dev] [Announcement] New mailing list for code quality tools including Flake8, Pyflakes and Pep8

2013-04-03 Thread Charles-François Natali
 Are you planning to cover the code quality of the interpreter itself
 too? I've been recently reading through the cert.org secure coding
 practice recommendations and was wondering if there is any ongoing
 effort to perform static analysis on the cpython codebase.

AFAICT CPython already benefits from Coverity scans (I guess the
Python-security guys receive those notifications). Note that this only
covers the C codebase.

cf


Re: [Python-Dev] Release or not release the GIL

2013-02-01 Thread Charles-François Natali
 dup2(oldfd, newfd) closes oldfd.

 No, it doesn't close oldfd.

 It may close newfd if it was already open.

(I guess that's what he meant).

Anyway, only dup2() should probably release the GIL.

One reasonable heuristic is to check the man page: if the syscall can
return EINTR, then the GIL should be released.
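As a side note, once the GIL is released around such a call, a signal can interrupt it and the call fails with EINTR, so Python-level callers end up writing retry loops by hand. A minimal sketch of that manual retry (using the InterruptedError exception of Python >= 3.3):

```python
import os

def read_retry(fd, n):
    """os.read() with the manual EINTR retry that callers of
    interruptible syscalls otherwise have to write themselves."""
    while True:
        try:
            return os.read(fd, n)
        except InterruptedError:  # EINTR: a signal arrived, just retry
            continue
```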


Re: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter

2013-01-28 Thread Charles-François Natali
 Library code should not be relying on globals settings that can change.
 Library code should be explicit in its calls so that the current value of a
 global setting is irrelevant.

That's one of the problems I've raised with this global flag since the
beginning: it's useless for libraries, including the stdlib (and, as a
reminder, this PEP started out of a bug report against socket
inheritance in socketserver).

And once again, it's a hidden global variable, so you won't be able
any more to tell what this code does:

r, w = os.pipe()
if os.fork() == 0:
os.close(w)
os.execv('myprog', ['myprog'])


Furthermore, if the above code is part of a library, and relies upon
'r' FD inheritance, it will break if the user sets the global cloexec
flag. And the fact that a library relies upon FD inheritance is an
implementation detail, the users shouldn't have to wonder whether
enabling a global flag (in their code, not in a library) will break a
given library: the only alternative for such code to continue working
would be to pass cloexec=False explicitly to os.pipe()...
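For comparison, the atomic per-call spelling that libraries would have to use anyway already exists at the os level; a sketch (POSIX; pipe2() is guarded because it is absent on some platforms):

```python
import os

# Creating the descriptors with the flag already set avoids both the
# global-flag question and the fcntl() race entirely:
if hasattr(os, "pipe2"):
    r, w = os.pipe2(os.O_CLOEXEC)   # atomic close-on-exec at creation
else:
    r, w = os.pipe()  # fallback: the flag must be set afterwards, non-atomically
```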

The global socket.setdefaulttimeout() is IMO a bad idea, and shouldn't be emulated.

So I'm definitely -1 against any form of tunable value (be it a
sys.setdefaultcloexec(), an environment variable or command-line
flag), and still against changing the default value.

But I promise that's the last time I'm bringing those arguments up,
and I perfectly admit that some people want it as much as I don't want
it :-)

cf


[Python-Dev] usefulness of extension modules section in Misc/NEWS

2013-01-27 Thread Charles-François Natali
Hi,

What's exactly the guideline for choosing between the Library and
Extension modules section when updating Misc/NEWS?
Is it just the fact that the modified files live under Lib/ or Modules/?

I've frequently made a mistake when updating Misc/NEWS, and when
looking at it, I'm not the only one.

Is there really a good reason for having distinct sections?

If the intended audience for this file are end users, ISTM that the
only things that matters is that it's a library change, the fact that
the modification impacted Python/C code isn't really relevant.

Also, for example if you're rewriting a library from Python to C (or
vice versa), should it appear under both sections?

FWIW, the What's new documents don't have such a distinction.

Cheers,

cf


Re: [Python-Dev] PEP 433: Choose the default value of the new cloexec parameter

2013-01-25 Thread Charles-François Natali
Hello,

 I tried to list in the PEP 433 advantages and drawbacks of each option.

 If I recorded correctly opinions, the different options have the
 following supporters:

  a) cloexec=False by default
  b) cloexec=True by default: Charles-François Natali
  c) configurable default value: Antoine Pitrou, Nick Coghlan, Guido van Rossum

You can actually count me in the cloexec=False camp, and against the
idea of a configurable default value. Here's why:

Why cloexec shouldn't be set by default:
- While it's really tempting to fix one of Unix historical worst
decisions, I don't think we can set file descriptors cloexec by
default: this would break some applications (I don't think there would
be too many of them, but still), but most notably, this would break
POSIX semantics. If Python didn't expose POSIX syscalls and file
descriptors, but only high-level file streams/sockets/etc, then we
could probably go ahead, but now it's too late. Someone said earlier
on python-dev that many people use Python for prototyping, and indeed,
when using POSIX API, you expect POSIX semantics.

Why the default value shouldn't be tunable:
- I think it's useless: if the default cloexec behavior can be altered
(either by a command-line flag, an environment variable or a sys
module function), then libraries cannot rely on it and have to make
file descriptors cloexec on an individual basis, since the default
flag can be disabled. So it would basically be useless for the Python
standard library, and any third-party library. So the only use case is
for application writers that use raw exec() (since subprocess already
closes file descriptors >= 3, and AFAICT we don't expose a way to
create processes manually on Windows), but there I think they fall
into two categories: those who are aware of the problem of file
descriptor inheritance, and who therefore set their FDs cloexec
manually, and those who are not familiar with this issue, and who'll
never look up a sys.setdefaultcloexec() tunable (and if they do, they
might think: "Hey, if that's so nice, why isn't it on by default?
Wait, it might break applications? I'll just leave the default
then.").
- But most importantly, I think such a tunable flag is a really wrong
idea because it's a global tunable that alters the underlying
operating system semantics. Consider this code:

r, w = os.pipe()
if os.fork() == 0:
os.execv('myprog', ['myprog'])


With a tunable flag, just by looking at this code, you have no way to
know whether the file descriptor will be inherited by the child
process. That would be introducing a hidden global variable silently
changing the semantics of the underlying operating system, and that's
just so wrong.

Sure, we do have global tunables:

sys.setcheckinterval()
sys.setrecursionlimit()
sys.setswitchinterval()

hash_randomization


But those alter extralinguistic behavior, i.e. they don't affect the
semantics of the language or underlying operating system in a way that
would break or change the behavior of a conforming program.

Although it's not as bad, just to belabor the point, imagine we
introduced a new method:

sys.enable_integer_division(boolean)
Depending on the value of this flag, the division of two integers will
either yield a floating point or truncated integer value.


Global variables are bad, hidden global variables are worse, and
hidden global variables altering language/operating system semantics
are evil :-)

What I'd like to see:
- Adding a cloexec parameter to file descriptor creating
functions/classes is fine, it will make it easier for a
library/application writer to create file descriptors cloexec,
especially in an atomic way.
- We should go over the standard library, and create FDs cloexec if
they're not handed over to the caller, either because they're
opened/closed before returning, or because the underlying file
descriptor is kept private (no fileno() method, although it's
relatively rare). That's the approach chosen by glibc, and it makes
sense: if another thread forks() while a thread is in the middle of
getpwnam(), you don't want to leak an open file descriptor to
/etc/passwd (or /etc/shadow).
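A sketch of the helper mentioned in the first point (the name follows the PEP's proposed os.set_cloexec(); note that the fcntl round-trip is inherently non-atomic, another thread may fork()+exec() between the open() and this call):

```python
import fcntl

def set_cloexec(fd):
    """Set FD_CLOEXEC on an existing descriptor (non-atomic fallback
    for platforms/calls without an O_CLOEXEC-style creation flag)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```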

cf


Re: [Python-Dev] PEP 433: Add cloexec argument to functions creating file descriptors

2013-01-13 Thread Charles-François Natali
Hello,

 PEP: 433
 Title: Add cloexec argument to functions creating file descriptors

I'm not a native English speaker, but it seems to me that the correct
wording should be parameter (part of the function
definition/prototype, whereas argument refers to the actual value
supplied).

 This PEP proposes to add a new optional argument ``cloexec`` on
 functions creating file descriptors in the Python standard library. If
 the argument is ``True``, the close-on-exec flag will be set on the
 new file descriptor.

It would probably be useful to recap briefly what the close-on-exec flag does.

Also, ISTM that Windows also supports this flag. If it does, then
cloexec might not be the best name, because it refers to the
execve() Unix system call. Maybe something like noinherit would be
clearer (although coming from a Unix background cloexec is
crystal-clear to me :-).

 On UNIX, subprocess closes file descriptors greater than 2 by default
 since Python 3.2 [#subprocess_close]_. All file descriptors created by
 the parent process are automatically closed.

(in the child process)

 ``xmlrpc.server.SimpleXMLRPCServer`` sets the close-on-exec flag of
 the listening socket, the parent class ``socketserver.BaseServer``
 does not set this flag.

As has been discussed earlier, the real issue is that the server
socket is not closed in the child process. Setting it cloexec would
only add extra safety for multi-threaded programs.

 Inherited file descriptors issues
 -

 Closing the file descriptor in the parent process does not close the
 related resource (file, socket, ...) because it is still open in the
 child process.

You might want to go through the bug tracker to find examples of such
issues, and list them:
http://bugs.python.org/issue7213
http://bugs.python.org/issue12786
http://bugs.python.org/issue2320
http://bugs.python.org/issue3006

The list goes on.
Some of those examples resulted in deadlocks.

 The listening socket of TCPServer is not closed on ``exec()``: the
 child process is able to get connections from new clients; if the
 parent closes the listening socket and creates a new listening socket
 on the same address, it would get an "address already in use" error.

See above for the real cause.

 Not closing file descriptors can lead to resource exhaustion: even if
 the parent closes all files, creating a new file descriptor may fail
 with "too many files" because files are still open in the child
 process.

You might want to detail the course of events (a child is forked
before the parent gets a chance to close the file descriptors...
EMFILE).

 Leaking file descriptors is a major security vulnerability. An
 untrusted child process can read sensitive data like passwords and
 take control of the parent process though leaked file descriptors. It
 is for example a known vulnerability to escape from a chroot.

You might add a link to this:
https://www.securecoding.cert.org/confluence/display/seccode/FIO42-C.+Ensure+files+are+properly+closed+when+they+are+no+longer+needed

It can also result in DoS (if the child process hijacks the server
socket and accepts connections).

Example of vulnerabilities:
http://www.openssh.com/txt/portable-keysign-rand-helper.adv
http://www.securityfocus.com/archive/1/348368
http://cwe.mitre.org/data/definitions/403.html

 The problem is that these flags and functions are not portable: only
 recent versions of operating systems support them. ``O_CLOEXEC`` and
 ``SOCK_CLOEXEC`` flags are ignored by old Linux versions and so
 ``FD_CLOEXEC`` flag must be checked using ``fcntl(fd, F_GETFD)``.  If
 the kernel ignores ``O_CLOEXEC`` or ``SOCK_CLOEXEC`` flag, a call to
 ``fcntl(fd, F_SETFD, flags)`` is required to set close-on-exec flag.

 .. note::
OpenBSD older than 5.2 does not close the file descriptor with
close-on-exec flag set if ``fork()`` is used before ``exec()``, but
it works correctly if ``exec()`` is called without ``fork()``.

That would be *really* surprising, are you sure your test case is correct?
Otherwise it could be a compilation issue, because I simply can't
believe OpenBSD would ignore the close-on-exec flag.

 This PEP only changes the close-on-exec flag of file descriptors
 created by the Python standard library, or by modules using the
 standard library.  Third party modules not using the standard library
 should be modified to conform to this PEP. The new
 ``os.set_cloexec()`` function can be used for example.

 Impacted functions:

  * ``os.forkpty()``
  * ``http.server.CGIHTTPRequestHandler.run_cgi()``

I've opened http://bugs.python.org/issue16945 to rewrite this to use subprocess.

 Impacted modules:

  * ``multiprocessing``
  * ``socketserver``
  * ``subprocess``
  * ``tempfile``

Hum, I thought temporary files are already created with the close-on-exec flag.

  * ``xmlrpc.server``
  * Maybe: ``signal``, ``threading``

 XXX Should ``subprocess.Popen`` set the close-on-exec flag on file XXX
 XXX descriptors of the 

Re: [Python-Dev] fork or exec?

2013-01-11 Thread Charles-François Natali
 *Lots* of applications make use of POSIX semantics for fork() / exec().

This doesn't mean much. We're talking about inheritance of FDs > 2
upon exec, which is a very limited subset of POSIX semantics for
fork() / exec().

I personally think that there's been enough feedback to show that we
should stick with the default POSIX behavior, however broken it is...

 Can someone please point to a writeop of the security issues involved?

I've posted sample codes earlier in this thread, but here's a writeup
by Ulrich Drepper:
http://udrepper.livejournal.com/20407.html


Re: [Python-Dev] Set close-on-exec flag by default in SocketServer

2013-01-10 Thread Charles-François Natali
 The SocketServer class creates a socket to listen on clients, and a
 new socket per client (only for stream server like TCPServer, not for
 UDPServer).

 Until recently (2011-05-24, issue #5715), the listening socket was not
 closed after fork for the ForkingMixIn flavor. This caused two issues:
 it's a security leak, and it causes address already in use error if
 the server is restarted (see the first message of #12107 for an
 example with Django).

Note that the server socket is actually still not closed in the child
process: once this gets fixed, setting FD_CLOEXEC will not be useful
anymore (but it would be an extra safety if it could be done
atomically, especially against race conditions in multi-threaded
applications). (Same thing for the client socket, which is actually
already closed in the parent process).

As for the backward compatibility issue, here's a thought: subprocess
was changed in 3.2 to close all FDs > 2 in the child process by
default. AFAICT, we didn't get a single report complaining about this
behavior change. OTOH, we did get numerous bug reports due to FDs
inherited by subprocesses before that change. (I know that Python >=
3.2 is less widespread than its predecessors, but still).
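That subprocess precedent is easy to demonstrate (a sketch; subprocess.run is the later 3.5 spelling, and pass_fds is the explicit opt-in for descriptors the child genuinely needs):

```python
import os
import subprocess
import sys

r, w = os.pipe()
# A child that probes whether fd `r` is still open there:
probe = "import os; os.fstat(%d)" % r

# Default close_fds=True (since 3.2): the probe fails, the fd was closed.
closed = subprocess.run([sys.executable, "-c", probe],
                        stderr=subprocess.DEVNULL).returncode != 0

# Opting back in with pass_fds keeps the descriptor open in the child.
passed = subprocess.run([sys.executable, "-c", probe],
                        pass_fds=(r,)).returncode == 0
```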


Re: [Python-Dev] fork or exec?

2013-01-10 Thread Charles-François Natali
 So, I read your e-mail again and I'm wondering if you're making a logic
 error, or if I'm misunderstanding something:

 1. first you're talking about duplicate file or socket objects after
 *fork()* (which is an issue I agree is quite annoying)

 2. the solution you're proposing doesn't close the file descriptors
 after fork() but after *exec()*.

 Basically the solution doesn't address the problem. Many fork() calls
 aren't followed by an exec() call (multiprocessing comes to mind).

Yes.
In this specific case, the proper solution is to close the server
socket right after fork() in the child process.

We can't do anything about file descriptors inherited upon fork() (and
shouldn't do anything of course, except on an individual basis like
this socket server example).

On the other hand, setting file descriptors close-on-exec has the
advantage of avoiding file descriptor inheritance to spawned
(fork()+exec()) child processes, which, in 99% of cases, don't need
them (apart from stdin/stdout/stderr). Not only can this cause subtle
bugs (socket/file not being closed when the parent closes the file
descriptor, deadlocks, there are several such examples in the bug
tracker), but also a security issue, because contrarily to a fork()ed
process which runs code controlled by the library/user, after exec()
you might be running arbitrary code.

Let's take the example of CGIHTTPServer:

  # Child
  try:
  try:
  os.setuid(nobody)
  except os.error:
  pass
  os.dup2(self.rfile.fileno(), 0)
  os.dup2(self.wfile.fileno(), 1)
  os.execve(scriptfile, args, env)



The code tries to execute a CGI script as user nobody to minimize
privilege, but if the current process has a sensitive file opened,
the file descriptor will be leaked to the CGI script, which can do
anything with it.

In short, close-on-exec can solve a whole class of problems (but does
not really apply to this specific case).
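For reference (this is not from the thread itself): marking an existing descriptor close-on-exec is a read-modify-write `fcntl()` dance. A minimal sketch, where `set_cloexec` is an invented helper name:

```python
import fcntl
import os

def set_cloexec(fd):
    """Mark fd close-on-exec: it survives fork(), but is closed on exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

r, w = os.pipe()
set_cloexec(r)
print(bool(fcntl.fcntl(r, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # True
```

Note that between `pipe()` and the `fcntl()` call another thread may `fork()`+`exec()`, which is exactly the race the atomic flags discussed below avoid.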

 On the other hand, the one widespread user of exec() after fork() in the
 stdlib, namely subprocess, *already* closes file descriptors by
 default, so the exec() issue doesn't really exist anymore for us (or
 is at least quite exotic).

See the above example. There can be valid reasons to use fork()+exec()
instead of subprocess.

Disclaimer: I'm not saying we should be changing all FDs to
close-on-exec by default like Ruby did, I'm just saying that there's a
real problem.


Re: [Python-Dev] fork or exec?

2013-01-10 Thread Charles-François Natali
 Network servers like inetd or apache MPM (prefork) uses a process
 listening on a socket, and then fork to execute a request in a child
 process. I don't know how it works exactly, but I guess that the child
 process need a socket from the parent to send the answer to the
 client. If the socket is closed on execute (ex: Apache with CGI), it
 does not work :-)

Yes, but the above (setting close-on-exec by default) would *not*
apply to stdin, stdout and stderr. inetd servers use dup2(socket, 0);
dup2(socket, 1); dup2(socket, 2) before forking, so it would still work.

 Example with CGIHTTPRequestHandler.run_cgi(), self.connection is the
 socket coming from accept():

 self.rfile = self.connection.makefile('rb', self.rbufsize)
 self.wfile = self.connection.makefile('wb', self.wbufsize)
 ...
  try:
      os.setuid(nobody)
  except OSError:
      pass
  os.dup2(self.rfile.fileno(), 0)
  os.dup2(self.wfile.fileno(), 1)
  os.execve(scriptfile, args, env)

Same thing here.

And the same thing holds for shell-type pipelines: you're always using
stdin, stdout or stderr.

 Do you have an example of what that something may be?
 Apart from standard streams, I can't think of any inherited file
 descriptor an external program would want to rely on.

Indeed, it should be really rare.

There are far more programs that are bitten by FD inheritance upon
exec than programs relying on it, and whereas failures and security
issues in the first category are hard to debug and unpredictable
(especially in a multi-threaded program), a program relying on a FD
that would be closed will fail immediately with EBADF, and so could be
updated quickly and easily.

 In other words, I think close-on-exec by default is probably a
 reasonable decision.

close-on-exec should probably have been the default in Unix, and is a
much saner option.

The only question is whether we're willing to take the risk of
breaking - admittedly a handful of - applications to avoid a whole
class of difficult-to-debug and potential security issues.

Note that if we do choose to set all file descriptors close-on-exec by
default, there are several questions open:
- This would hold for open(), Socket() and other high-level
file-descriptor wrappers. Should it be enabled also for low-level
syscall wrappers like os.open(), os.pipe(), etc?
- On platforms that don't support atomic close-on-exec (e.g. open()
with O_CLOEXEC, socket() with SOCK_CLOEXEC, pipe2(), etc), this would
require extra fcntl()/ioctl() syscalls. The cost is probably
negligible, but we'd have to check the impact on some benchmarks.
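To make the atomic-versus-fallback distinction concrete, here is a hedged sketch of what a high-level wrapper could do. The helper name `open_cloexec` is invented for illustration; `os.O_CLOEXEC` is only exposed where the platform provides it (Linux 2.6.23+, Python 3.3+):

```python
import fcntl
import os

def open_cloexec(path, flags):
    """Open path with close-on-exec set, atomically where possible."""
    if hasattr(os, "O_CLOEXEC"):
        # one syscall, no race window
        return os.open(path, flags | os.O_CLOEXEC)
    # fallback: another thread could fork()+exec() between these two
    # calls and leak the descriptor
    fd = os.open(path, flags)
    fcntl.fcntl(fd, fcntl.F_SETFD,
                fcntl.fcntl(fd, fcntl.F_GETFD) | fcntl.FD_CLOEXEC)
    return fd

fd = open_cloexec("/dev/null", os.O_RDONLY)
print(bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))  # True
os.close(fd)
```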


Re: [Python-Dev] fork or exec?

2013-01-10 Thread Charles-François Natali
 That could always be overcome by passing close_fds=False explicitly to
 subprocess from my code, though, right? I'm not doing that now, but then
 I'm not using the esoteric options in python-gnupg code, either.

You could do that, or better explicitly support this option, and only
specify this file descriptor in subprocess.Popen keep_fds argument.

 My point was that the GnuPG usage looked like an example where fds other than
 0, 1 and 2 might be used by design in not-uncommonly-used programs. From a
 discussion I had with Barry Warsaw a while ago, I seem to remember that there
 was other software which relied on these features. See [1] for details.

Yes, it might be used. But I maintain that it should be really rare,
and even if it's not, since the official way to launch subprocesses
is through the subprocess module, FDs > 2 are already closed by
default since Python 3.2. And the failure will be immediate and
obvious (EBADF).

Note that I admit I may be completely wrong, that's why I suggested to
Victor to bring this up on python-dev to gather as much feedback as
possible. Something like "we never ever break backward
compatibility intentionally, even in corner cases" or "this would
break POSIX semantics" would be enough (but OTOH, the subprocess
change did break those hypothetical rules).

Another pet peeve of mine is the non-handling of EINTR by low-level
syscall wrappers, which results in code like this spread all over the
stdlib and user code:
while True:
    try:
        return syscall(...)
    except OSError as e:
        if e.errno != errno.EINTR:
            raise

(and if it's select()/poll()/etc, the code doesn't update the timeout
in 90% of cases).
It gets a little better since the exception hierarchy rework
(InterruptedError), but it's still a nuisance.


Re: [Python-Dev] Set close-on-exec flag by default in SocketServer

2013-01-09 Thread Charles-François Natali
 My question is: would you accept to break backward compatibility (in
 Python 3.4) to fix a potential security vulnerability?

Although obvious, the security implications are not restricted to
sockets (yes, it's a contrived example):

# cat test_inherit.py
import fcntl
import os
import pwd
import sys


f = open('/tmp/passwd', 'w+')

#fcntl.fcntl(f.fileno(), fcntl.F_SETFD, fcntl.FD_CLOEXEC)

if os.fork() == 0:
    os.setuid(pwd.getpwnam('nobody').pw_uid)
    os.execv(sys.executable, ['python', '-c',
                              'import os; os.write(3, "owned")'])
else:
    os.waitpid(-1, 0)
    f.seek(0)
    print(f.read())
    f.close()
# python test_inherit.py
owned


 I'm not sure that close-on-exec flag must be set on the listening
 socket *and* on the client sockets. What do you think?

If the listening socket is inherited, it can lead to EADDRINUSE, or
the child process hijacking new connections (by accept()ing on the
same socket).
As for the client sockets, there's at least one reason to set them
close-on-exec: if a second forked process inherits the first process'
client socket, even when the first client closes its file descriptor
(and exits), the socket won't be closed until the second process
exits too: so one long-running child process can delay other child
processes connection shutdown for arbitrarily long.


Re: [Python-Dev] Bumping autoconf from 2.68 to 2.69

2012-10-16 Thread Charles-François Natali
 My understanding is that we use a specific version of autoconf.
 The reason is that otherwise we end up with useless churn in the repo
 as the generated file changes when different committers use different
 versions.  In the past we have had issues with a new autoconf version
 actually breaking the Python build, so we also need to test a new version
 before switching to it.

Well, so I guess all committers will have to use the same
Linux/FreeBSD/whatever distribution then?
AFAICT there's no requirement regarding the mercurial version used by
committers either.

Charles


Re: [Python-Dev] Bumping autoconf from 2.68 to 2.69

2012-10-16 Thread Charles-François Natali
 It should be sufficient to install autoconf-x.y into /home/user/bin or
 something similar. Installing autoconf from source really takes about
 3 minutes.

Well, maybe, maybe not.
autoconf depends on a least m4 and Perl, and you may very well have a
compatibility issue here.
That's why most distributions have package managers, and in 2012 we're
past the './configure && make && make install'.

 It doesn't matter which OS or Mercurial version a developer uses as
 they don't implicitly affect any versioned resources; autoconf does.

If you're worried about the noise in diff, it's never been a problem
at least to me (just don't post a configure diff for review, the
configure.ac is enough).

If you're worried about runtime compatibility, then autoconf is not
your only worry. Proper build also depends on the target shell, target
toolchain (gcc, libc, etc).


Re: [Python-Dev] Checking if unsigned int less then zero.

2012-06-23 Thread Charles-François Natali
 Playing with cpython source, I found some strange strings in
 socketmodule.c:

 ---
  if (flowinfo < 0 || flowinfo > 0xfffff) {
      PyErr_SetString(
          PyExc_OverflowError,
          "getsockaddrarg: flowinfo must be 0-1048575.");
      return 0;
  }
 ---

 ---
  if (flowinfo < 0 || flowinfo > 0xfffff) {
      PyErr_SetString(PyExc_OverflowError,
                      "getsockaddrarg: flowinfo must be 0-1048575.");
      return NULL;
  }
 ---

 The flowinfo variable is declared a few lines above as unsigned int. Is
 there any practical sense in this check? Seems like gcc just removes
 this check. I think any compiler will generate code that checks as
 unsigned; for example on x86 it's JAE/JGE. Maybe this code is for bad
 compilers or an exotic arch?

Removed.

Thanks,

cf


[Python-Dev] [help wanted] - IrDA sockets support

2012-04-25 Thread Charles-François Natali
Hi,

Issue #1522400 (http://bugs.python.org/issue1522400) has a patch
adding IrDA socket support.
It builds under Linux and Windows, however it cannot go any further
because no developer involved in the issue has access to IrDA capable
devices, which makes testing impossible.
So, if you have access to such devices and are interested, feel free
to chime in and help get this merged.

Cheers,

cf


Re: [Python-Dev] cpython: Closes Issue #14661: posix module: add O_EXEC, O_SEARCH, O_TTY_INIT (I add some

2012-04-24 Thread Charles-François Natali
 jesus.cea python-check...@python.org wrote:
 http://hg.python.org/cpython/rev/2023f48b32b6
 changeset:   76537:2023f48b32b6
 user:Jesus Cea j...@jcea.es
 date:Tue Apr 24 20:59:17 2012 +0200
 summary:
   Closes Issue #14661: posix module: add O_EXEC, O_SEARCH, O_TTY_INIT (I
 add some Solaris constants too)

 Could you please add a Misc/NEWS entry for all this?

I also tend to always update Misc/ACKS too, even for trivial patches.


Re: [Python-Dev] Experimenting with STM on CPython

2012-04-11 Thread Charles-François Natali
 Yes, that's using STM on my regular laptop.  How HTM would help
 remains unclear at this point, because in this approach transactions
 are typically rather large --- likely much larger than what the
 first-generation HTM-capable processors will support next year.

 Ok. I guess once the code is there, the hardware will eventually catch up.

 However, I'm not sure what you consider large. A lot of manipulation
 operations for the builtin types are not all that involved, at least in the
 normal cases (read: fast paths) that involve no memory reallocation etc.,
 and anything that can be called by and doesn't call into the interpreter
 would be a complete and independent transaction all by itself, as the GIL
 is allowed to be released between any two ticks.

Large as in "L2-cache large", and as in "you won't get a page fault or
an interrupt, you won't make any syscall, any I/O"... ;-)


Re: [Python-Dev] PEP 418: Add monotonic clock

2012-03-28 Thread Charles-François Natali
 What's wrong with time.time() again?  As documented in
 http://docs.python.org/py3k/library/time.html it makes no guarantees,
 and specifically there is *no* guarantee that it will ever behave
 *badly* <wink/>.  Of course, we'll have to guarantee that, if a
 badly-behaved clock is available, users can get access to it, so call
 that time._time().

I'm not sure I understand your suggestion correctly, but replacing
time.time() by time.monotonic() with fallback won't work, because
time.monotonic() isn't wall-clock time: it can very well use an
arbitrary reference point (most likely system start-up time).

As for the hires() function, since there's no guarantee whatsoever
that it does provide a better resolution than time.time(), this would
be really misleading IMHO.
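A short illustration of why the two clocks aren't interchangeable, using the `time.monotonic()` name from the PEP as it eventually shipped in Python 3.3:

```python
import time

# time.time() returns wall-clock time (seconds since the epoch) and can
# jump backwards if the system clock is adjusted.  time.monotonic() only
# promises never to go backwards: its reference point is arbitrary
# (often system start-up), so only the difference of two calls is
# meaningful -- its absolute value says nothing about the current date.
start = time.monotonic()
time.sleep(0.1)
elapsed = time.monotonic() - start
# with a monotonic clock, elapsed can never be negative
print(elapsed > 0)  # True
```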


Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3

2012-02-16 Thread Charles-François Natali
I personally don't see any reason to drop a module that isn't
terminally broken or unmaintainable, apart from scaring users away by
making them think that we don't care about backward compatibility.


[Python-Dev] best place for an atomic file API

2012-02-15 Thread Charles-François Natali
Hi,

Issue #8604 aims at adding an atomic file API to make it easier to
create/update files atomically, using rename() on POSIX systems and
MoveFileEx() on Windows (which are now available through
os.replace()). It would also use fsync() on POSIX to make sure data is
committed to disk.
For example, it could be used by importlib to avoid races when
writting bytecode files (issues #13392, #13003, #13146), or more
generally by any application that wants to make sure to end up with a
consistent file even in face of crash (e.g. it seems that mercurial
implemented their own version).

Basically the usage would be, e.g.:

with AtomicFile('foo') as f:
    pickle.dump(obj, f)

or

with AtomicFile('foo') as f:
    chunk = heavyCrunch()
    f.write(chunk)
    chunk = CrunchSomeMore()
    f.write(chunk)


What would be the best place for such a class?
_pyio, tempfile, or a new 'atomicfile' module?
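For illustration only, a minimal sketch of the pattern under discussion; the actual issue #8604 API may well differ, and details like preserving the target's permissions (mkstemp creates the file 0600) are ignored here:

```python
import os
import tempfile

class AtomicFile:
    """Write to a temp file in the target directory, fsync() it, then
    rename it over the destination: readers see either the old or the
    new content, never a partial write."""

    def __init__(self, path):
        self._path = path
        fd, self._tmppath = tempfile.mkstemp(
            dir=os.path.dirname(path) or '.')
        self._file = os.fdopen(fd, 'wb')

    def write(self, data):
        self._file.write(data)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self._file.flush()
            os.fsync(self._file.fileno())          # commit data to disk
            self._file.close()
            os.replace(self._tmppath, self._path)  # atomic rename
        else:
            self._file.close()
            os.unlink(self._tmppath)  # on error, leave the target untouched
        return False

target = os.path.join(tempfile.mkdtemp(), 'out.bin')
with AtomicFile(target) as f:
    f.write(b'consistent')
```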

Cheers,

cf


Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)

2012-02-12 Thread Charles-François Natali
 There actually *is* an easy way, in regular ls: look at the link count.
 It comes out of ls -l by default, and if it's 1, there will be an
 identical file.

 This doesn't tell me which file it is, which is practically useless if I
 have both python3.3 and python3.2 in that directory.

You can use 'ls -i' to print the inode, or you could use find's
'samefile' option.
But this is definitely not as straightforward as it would be for a
symlink, and I'm also curious to know the reason behind this choice.
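A quick illustration of why hard links are harder to introspect than symlinks: the two names are equal peers sharing one inode, so neither "points at" the other.

```python
import os
import tempfile

# create a file and a second hard link to it (names are hypothetical)
d = tempfile.mkdtemp()
a = os.path.join(d, 'python3')
b = os.path.join(d, 'python3.3')
open(a, 'w').close()
os.link(a, b)

# ls -i would show the same inode for both; in Python:
print(os.path.samefile(a, b))  # True
print(os.stat(a).st_nlink)     # 2
```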


Re: [Python-Dev] [Python-checkins] cpython: Backed out changeset 36f2e236c601: For some reason, rewinddir() doesn't work as

2012-01-09 Thread Charles-François Natali
 Can rewinddir() end up touching the filesystem to retrieve data? I
 noticed that your previous change (the one this checkin reverted)
 moved it outside the GIL release macros.

 It just resets a position count. (in glibc).

Actually, it also calls lseek() on the directory FD:
http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/rewinddir.c;hb=HEAD

But lseek() doesn't (normally) perform I/O, it just sets an offset in
the kernel file structure:
http://lxr.free-electrons.com/source/fs/read_write.c#L38

For example, it's not documented to return EINTR.

Now, one could imagine that the kernel could do some read-ahead or
some other magic things when passed SEEK_DATA or SEEK_HOLE, but
seeking at the beginning of a directory FD should be fast.

Anyway, I ended up reverting this change, because for some reason this
broke OpenIndiana buildbots (maybe rewinddir() is a no-op before
readdir() has been called?).

Cheers,

cf


[Python-Dev] svn.python.org certificate expired

2012-01-09 Thread Charles-François Natali
Hi,

All the buildbots are turning red because of test_ssl:

======================================================================
ERROR: test_connect (test.test_ssl.NetworkedTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/test_ssl.py",
line 616, in test_connect
    s.connect(("svn.python.org", 443))
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/ssl.py",
line 519, in connect
    self._real_connect(addr, False)
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/ssl.py",
line 509, in _real_connect
    self.do_handshake()
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/ssl.py",
line 489, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [Errno 1] _ssl.c:420: error:14090086:SSL
routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed


It seems that svn.python.org certificate expired today (09/01/2012).

Cheers,

cf


Re: [Python-Dev] usefulness of Python version of threading.RLock

2012-01-08 Thread Charles-François Natali
 The yes/no answer is No, we can't drop it.

Thanks, that's a clear answer :-)

 I'm not convinced of the benefits of removing the pure Python RLock
 implementation

Indeed.
As noted, this issue with signal handlers is more general, so this
wouldn't solve the problem at hand. I just wanted to know whether we
could remove this duplicate code, but since it might be used by some
implementations, it's best to keep it.


Re: [Python-Dev] usefulness of Python version of threading.RLock

2012-01-06 Thread Charles-François Natali
Thanks for those precisions, but I must admit it doesn't help me much...
Can we drop it? A yes/no answer will do it ;-)

 I'm pretty sure the Python version of RLock is in use in several alternative
 implementations that provide an alternative _thread.lock. I think gevent
 would fall into this camp, as well as a personal project of mine in a
 similar vein that operates on python3.

Sorry, I'm not sure I understand. Do those projects use _PyRLock directly?
If yes, then aliasing it to _CRLock should do the trick, no?


[Python-Dev] usefulness of Python version of threading.RLock

2012-01-05 Thread Charles-François Natali
Hi,

Issue #13697 (http://bugs.python.org/issue13697) deals with a problem
with the Python version of threading.RLock (a signal handler which
tries to acquire the same RLock is called right at the wrong time)
which doesn't affect the C version.
Whether such a use case can be considered good practise or the best
way to fix this is not settled yet, but the question that arose to me
is: "why do we have both a C and a Python version?"
Here's Antoine's answer (he suggested I bring this up on python-dev):

The C version is quite recent, and there's a school of thought that we
should always provide fallback Python implementations.
(also, arguably a Python implementation makes things easier to
prototype, although I don't think it's the case for an RLock)


So, what do you guys think?
Would it be okay to nuke the Python version?
Do you have more details on this school of thought?

Also, while we're at it, Victor created #13550 to try to rewrite the
logging hack of the threading module: there again, I think we could
just remove this logging altogether. What do you think?

Cheers,

cf


Re: [Python-Dev] Fwd: Anyone still using Python 2.5?

2011-12-21 Thread Charles-François Natali
 Do people still have to use this in commercial environments or is
 everyone on 2.6+ nowadays?

RHEL 5.7 ships with Python 2.4.3. So no, not everybody is on 2.6+
today, and this won't happen for another couple of years.

cf


Re: [Python-Dev] STM and python

2011-11-30 Thread Charles-François Natali
 However given advances in locking and garbage collection in the last
 decade, what attempts have been made recently to try these new ideas
 out? In particular, how unlikely is it that all the thread safe
 primitives, global contexts, and reference counting functions be made
 __transaction_atomic, and magical parallelism performance boosts
 ensue?


I'd say that given that the current libitm implementation uses a
single global lock, you're more likely to see a performance loss.
TM is useful to synchronize non-trivial operations: an increment or
decrement of a reference count is highly trivial (and expensive when
performed atomically, as noted), and TM's never going to help if you
put each refcount operation in its own transaction: see Armin's
http://morepypy.blogspot.com/2011/08/we-need-software-transactional-memory.html
for more realistic use cases.


Re: [Python-Dev] Unexpected behaviour in compileall

2011-11-02 Thread Charles-François Natali
2011/11/2 Vinay Sajip vinay_sa...@yahoo.co.uk:
 I just started getting errors in my PEP 404 / pythonv branch, but they don't
 at first glance appear related to the functionality of this branch. What I'm
 seeing is that during installation, some of the .pyc/.pyo files written by
 compileall have mode 600 rather than the expected 644, with the result that
 test_compileall fails when run from the installed Python as an unprivileged
 user. If I manually do

It's a consequence of http://hg.python.org/cpython/rev/740baff4f169.
I'll fix that.


Re: [Python-Dev] socket module build failure

2011-10-07 Thread Charles-François Natali
Hello,

2011/10/7 Vinay Sajip vinay_sa...@yahoo.co.uk:
 I work on Ubuntu Jaunty for my cpython development work - an old version, I
 know, but still quite serviceable and has worked well for me over many months.
 With the latest default cpython repository, however, I can't run the
 regression suite because the socket module now fails to build:


It's due to the recent inclusion of PF_CAN support:
http://hg.python.org/cpython/rev/e767318baccd

It looks like your header files are different from what's found in
other distributions.
Please reopen issue #10141, we'll try to go from there.

Cheers,

cf


Re: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as

2011-10-06 Thread Charles-François Natali
 I'd have expected this test to fail on _any_ UNIX system if run as root.
 Root's allowed to write to stuff! Any stuff! About the only permission
 with any effect on root is the eXecute bit for the exec call, to prevent
 blindly running random data files.

You're right, here's another test on Linux (I must have screwed up
when I tested this on my box):

# mkdir /tmp/foo
# chmod -w /tmp/foo
# touch /tmp/foo/bar
# ls /tmp/foo
bar

You can still set the directory immutable if you really want to deny
write to root:

# chattr +i /tmp/foo
# touch /tmp/foo/spam
touch: cannot touch `/tmp/foo/spam': Permission denied

 Equally, why on earth are you running tests as root!?!?!?!?! Madness.
 It's as bad as compiling stuff as root etc etc. A bad idea all around,
 securitywise.

Agreed, I would personally never run a buildbot as root.
I just changed this because I was tired of seeing the same buildbots
always red (thus masking real failures).


Re: [Python-Dev] Using PEP384 Stable ABI for the lzma extension module

2011-10-06 Thread Charles-François Natali
That's not a given. Depending on the memory allocator, a copy can be
avoided. That's why the str += str hack is much more efficient under
Linux than Windows, AFAIK.
  
   Even Linux will have to copy a block on realloc in certain cases, no?
 
  Probably so. How often is totally unknown to me :)
 
 http://www.gnu.org/software/libc/manual/html_node/Changing-Block-Size.html

 It depends on whether there's enough free memory after the buffer you
 currently have allocated.  I suppose that this becomes a question of what
 people consider the general case :-)

 But under certain circumstances (if a large block is requested), the
 allocator uses mmap(), no?

That's right, if the block requested is bigger than mmap_threshold
(256K by default with glibc, forgetting the sliding window algorithm):
I'm not sure of what percentage of strings/buffers are concerned in a
typical program.

 In which case mremap() should allow to resize without copying anything.

Yes, there's no copying. Note however that it doesn't come for free,
the kernel will still zero-fill the pages before handing them to
user-space. It is still way faster than on, let's say, Solaris.

cf


Re: [Python-Dev] cpython (3.2): Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as

2011-10-04 Thread Charles-François Natali
 summary:
   Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD
   when run as root (directory permissions are ignored).

 The same directory permission semantics apply to other (all?)
 BSD-derived systems, not just FreeBSD.  For example, the test still
 fails in the same way on OS X when run via sudo.


Thanks, I didn't know: I only noticed this on the FreeBSD buildbots (I
guess OS-X buildbots don't run as root). Note that it does behave "as
expected" on Linux (note the use of quotation marks: I'm not sure
whether this behavior is authorized by POSIX).
I changed the test to skip when the effective UID is 0, regardless of
the OS, to stay on the safe side.


Re: [Python-Dev] [Python-checkins] cpython: Issue #12981: rewrite multiprocessing_{sendfd, recvfd} in Python.

2011-09-26 Thread Charles-François Natali
 On Sun, Sep 25, 2011 at 4:04 AM, charles-francois.natali
 python-check...@python.org wrote:
 +if not(sys.platform == 'win32' or (hasattr(socket, 'CMSG_LEN') and
 +                                   hasattr(socket, 'SCM_RIGHTS'))):
     raise ImportError('pickling of connections not supported')

 I'm pretty sure the functionality checks for CMSG_LEN and SCM_RIGHTS
 mean the platform check for Windows is now redundant.


I'm not sure I understand what you mean.
FD passing is supported on Unix with sendmsg/SCM_RIGHTS, and on
Windows using whatever Windows uses for that purpose (see
http://hg.python.org/cpython/file/2b47f0146639/Lib/multiprocessing/reduction.py#l63).
If we remove the check for Windows, an ImportError will be raised
systematically, unless you suggest that Windows does support
sendmsg/SCM_RIGHTS (I somehow doubt Windows supports Unix domain
sockets, but I don't know Windows at all).
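For the curious, a minimal single-process sketch of the Unix side of the mechanism (sendmsg()/recvmsg() with SCM_RIGHTS, available in Python 3.3+); `send_fd` and `recv_fd` are invented helper names, not the multiprocessing API:

```python
import array
import os
import socket

def send_fd(sock, fd):
    """Send one file descriptor over a Unix-domain socket."""
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                           array.array("i", [fd]))])

def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    level, ctype, cdata = ancdata[0]
    assert level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS
    fds.frombytes(cdata[:fds.itemsize])
    return fds[0]

parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()
send_fd(parent, r)           # the kernel dup()s r for the receiver
received = recv_fd(child)
os.write(w, b"ping")
print(os.read(received, 4))  # b'ping'
```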

cf


Re: [Python-Dev] issue 6721 Locks in python standard library should be sanitized on fork

2011-08-29 Thread Charles-François Natali
 +3 (agreed to Jesse, Antoine and Ask here).
  The http://bugs.python.org/issue8713 described non-fork implementation
 that always uses subprocesses rather than plain forked processes is the
 right way forward for multiprocessing.

I see two drawbacks:
- it will be slower, since the interpreter startup time is
non-negligible (well, normally you shouldn't spawn a new process for
every item, but it should be noted)
- it'll consume more memory, since we lose the COW advantage (even
though it's already limited by the fact that even treating a variable
read-only can trigger an incref, as was noted in a previous thread)

cf


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-27 Thread Charles-François Natali
Hi Armin,

 This is basically dangerous, because it corresponds to taking lock
 GIL and lock L, in that order, whereas the thread B takes lock L and
 plays around with lock GIL in the opposite order.  I think a
 reasonable solution to avoid deadlocks is simply not to use explicit
 locks inside with atomic blocks.

The problem is that many locks are actually acquired implicitely.
For example, `print` to a buffered stream will acquire the fileobject's mutex.
Also, even if the code inside the with atomic block doesn't directly
or indirectly acquire a lock, there's still the possibility of
asynchronous code that acquires locks being executed in the middle of
this block: for example, signal handlers are run on behalf of the main
thread from the main eval loop and in certain other places, and the GC
might kick in at any time.
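A minimal illustration of the underlying lock-ordering hazard (plain threading, not STM; the names are mine, and timeouts are used so the inversion shows up as failed acquisitions rather than a hang):

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
both_held = threading.Barrier(2)
results = {}

def worker(name, first, second):
    with first:
        both_held.wait()      # both threads now hold their first lock
        # Each thread now needs the other's lock: classic A->B vs B->A
        # ordering inversion, so neither acquisition can succeed.
        results[name] = second.acquire(timeout=0.2)
        both_held.wait()      # keep `first` held until both attempts finish
        if results[name]:
            second.release()

t1 = threading.Thread(target=worker, args=("a_then_b", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("b_then_a", lock_b, lock_a))
t1.start(); t2.start(); t1.join(); t2.join()
print(results)   # {'a_then_b': False, 'b_then_a': False}
```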

 Generally speaking it can be regarded as wrong to do any action that
 causes an unbounded wait in a with atomic block,

Indeed.

cf


Re: [Python-Dev] sendmsg/recvmsg on Mac OS X

2011-08-24 Thread Charles-François Natali
 The buildbots are complaining about some of tests for the new
 socket.sendmsg/recvmsg added by issue #6560 for *nix platforms that
 provide CMSG_LEN.

Looks like kernel bugs:
http://developer.apple.com/library/mac/#qa/qa1541/_index.html


Yes. Mac OS X 10.5 fixes a number of kernel bugs related to descriptor passing
[...]
Avoid passing two or more descriptors back-to-back.


We should probably add
@requires_mac_ver(10, 5)

for testFDPassSeparate and testFDPassSeparateMinSpace.

As for InterruptedSendTimeoutTest and testInterruptedSendmsgTimeout,
it also looks like a kernel bug: the syscall should fail with EINTR
once the socket buffer is full. I guess one should skip those on OS X.
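A sketch of such a version-based skip; test.support provides requires_mac_ver for this in the CPython test suite, but an equivalent can be written with plain unittest and platform.mac_ver() (the test body below is a placeholder):

```python
import platform
import sys
import unittest

def mac_ver_tuple():
    release = platform.mac_ver()[0]        # "" on non-Mac platforms
    return tuple(int(p) for p in release.split(".")) if release else ()

class FDPassTest(unittest.TestCase):
    # test.support.requires_mac_ver(10, 5) does the same job in the
    # CPython test suite; this is a self-contained equivalent.
    @unittest.skipIf(sys.platform == "darwin" and mac_ver_tuple() < (10, 5),
                     "FD passing is broken before Mac OS X 10.5")
    def test_fd_pass_separate(self):
        self.assertTrue(True)              # placeholder for the real test

suite = unittest.TestLoader().loadTestsFromTestCase(FDPassTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```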


Re: [Python-Dev] sendmsg/recvmsg on Mac OS X

2011-08-24 Thread Charles-François Natali
 But Snow Leopard, where these failures occur, is OS X 10.6.

*sighs*
It still looks like a kernel/libc bug to me: AFAICT, both the code and
the tests are correct.
And apparently, there are still issues pertaining to FD passing on
10.5 (and maybe later, I couldn't find a public access to their bug
tracker):
http://lists.apple.com/archives/Darwin-dev/2008/Feb/msg00033.html

Anyway, if someone with a recent OS X release could run test_socket,
it would probably help. Follow-ups to http://bugs.python.org/issue6560


Re: [Python-Dev] issue 6721 Locks in python standard library should be sanitized on fork

2011-08-23 Thread Charles-François Natali
2011/8/23, Nir Aides n...@winpdb.org:
 Hi all,

Hello Nir,

 Please consider this invitation to stick your head into an interesting
 problem:
 http://bugs.python.org/issue6721

Just for the record, I'm now in favor of the atfork mechanism. It
won't solve the problem for I/O locks, but it'll at least make room
for a clean and cross-library way to setup atfork handlers. I just
skimmed over it, but it seemed Gregory's atfork module could be a good
starting point.

cf


Re: [Python-Dev] issue 6721 Locks in python standard library should be sanitized on fork

2011-08-23 Thread Charles-François Natali
2011/8/23 Antoine Pitrou solip...@pitrou.net:
 Well, I would consider the I/O locks the most glaring problem. Right
 now, your program can freeze if you happen to do a fork() while e.g.
 the stderr lock is taken by another thread (which is quite common when
 debugging).

Indeed.
To solve this, a similar mechanism could be used: after fork(), in the
child process:
- just reset each I/O lock (destroy/re-create the lock) if we can
guarantee that the file object is in a consistent state (i.e. that all
the invariants hold). That's the approach I used in my initial patch.
- call a fileobject method which resets the I/O lock and sets the file
object to a consistent state (in other words, an atfork handler)
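A sketch of the first option using os.register_at_fork() (the stdlib hook that eventually implemented this kind of atfork handler; Unix-only, and the lock name here is mine):

```python
import os
import threading

log_lock = threading.Lock()   # stands in for some stream's internal lock

def _reset_lock():
    # Runs in the child immediately after fork(): discard the parent's
    # lock object (which may be held by a thread that no longer exists
    # in the child) and re-create it in a known-consistent state.
    global log_lock
    log_lock = threading.Lock()

os.register_at_fork(after_in_child=_reset_lock)

log_lock.acquire()            # simulate another thread holding the lock
pid = os.fork()
if pid == 0:
    with log_lock:            # fresh lock in the child: no deadlock
        os._exit(0)
_, status = os.waitpid(pid, 0)
log_lock.release()
print("child exit status:", status)   # 0
```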


Re: [Python-Dev] Comments of the PEP 3151

2011-07-27 Thread Charles-François Natali
 I assume that ESHUTDOWN is the errno in question?  (This is also already 
 mentioned in the PEP.)

 Indeed, I mentioned it in the PEP, as it appears in asyncore.py.
 But I can't find it on www.opengroup.org, and no man page on my Linux
 system (except the errno man page) seems to mention it.

It's not POSIX, but it's defined on Linux and FreeBSD (at least):
http://lxr.free-electrons.com/source/include/asm-generic/errno.h#L81
http://fxr.watson.org/fxr/source/sys/errno.h?v=FREEBSD53#L122

 The description from errnomodule.c says "Cannot send after transport
 endpoint shutdown", but send() actually returns EPIPE, not ESHUTDOWN,
 when the socket has been shut down:

Indeed, as required by POSIX.
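This is easy to check with a socketpair (Unix sketch; CPython ignores SIGPIPE at startup, so the error surfaces as an exception):

```python
import errno
import socket

a, b = socket.socketpair()
a.shutdown(socket.SHUT_WR)      # shut down the sending side
caught = None
try:
    a.send(b"x")
except OSError as e:            # raised as BrokenPipeError on CPython
    caught = e.errno
print(caught == errno.EPIPE)    # True: EPIPE, not ESHUTDOWN
```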

But grepping through the Linux kernel source code, it seems to be used
extensively for USB devices, see
http://lxr.free-electrons.com/ident?i=ESHUTDOWN
So the transport endpoint doesn't necessarily refer to a socket.
It's also documented in
http://lxr.free-electrons.com/source/Documentation/usb/error-codes.txt

Finally, I found one place in the networking stack where ESHUTDOWN is
used, in the SCTP code:
http://lxr.free-electrons.com/source/net/sctp/outqueue.c#L329


Re: [Python-Dev] cpython (2.7): - Issue #12603: Fix pydoc.synopsis() on files with non-negative st_mtime.

2011-07-27 Thread Charles-François Natali
 +- Issue #12603: Fix pydoc.synopsis() on files with non-negative
 st_mtime.
 +

 Surely you mean non-positive? Non-negative st_mtime being the common
 case.

Of course (st_mtime <= 0).


Re: [Python-Dev] [Python-checkins] cpython: Issue #11784: Improve multiprocessing.Process.join() documentation. Patch by

2011-07-26 Thread Charles-François Natali
 There’s a dedicated file to thank doc contributors: Doc/ACKS.rst

I didn't know about this file, thanks.
In my defense, there's this comment at the top of Misc/ACKS:

This list is not complete and not in any useful order, but I would
like to thank everybody who contributed in any way, with code, hints,
bug reports, ideas, moral support, endorsement, or even complaints.
Without you, I would've stopped working on Python long ago!

--Guido


What's the rationale for having a dedicated file?


Re: [Python-Dev] cpython: Merge - Issue #12592: Make Python build on OpenBSD 5 (and future major

2011-07-23 Thread Charles-François Natali
 Note that this commit wasn't actually a merge -- you'll have to use the
 hg merge command for that.

You're right.
I guess that's what happens when I try to work past my usual bedtime ;-)

By the way, I'm still getting errors upon push, and it looks like when
I push a patch, this doesn't trigger any build on the buildbots. It
used to work, any idea what's going on?

