Re: [Python-Dev] Status of packaging in 3.3

2012-06-21 Thread Zooko Wilcox-O'Hearn
On Thu, Jun 21, 2012 at 12:57 AM, Nick Coghlan ncogh...@gmail.com wrote:

 Standard assumptions about the behaviour of site and distutils cease to be 
 valid once setuptools is installed
…
 - advocacy for the egg format and the associated sys.path changes that 
 result for all Python programs running on a system
…
 System administrators (and developers that think like system administrators 
 when it comes to configuration management) *hate* what setuptools (and 
 setuptools based installers) can do to their systems.

I have extensive experience with this, including quite a few bug
reports and a few patches in setuptools and distribute, plus
maintaining my own fork of setuptools to build and deploy my own
projects, plus interviewing quite a few Python developers about why
they hated setuptools, plus supporting one of them who hates
setuptools even though he and I use it in a build system
(https://tahoe-lafs.org).

I believe that 80% to 90% of the hatred alluded to above is due to a
single issue: the fact that setuptools causes your Python interpreter
to disrespect the PYTHONPATH, in violation of the documentation in
http://docs.python.org/release/2.7.2/install/index.html#inst-search-path
, which says:


The PYTHONPATH variable can be set to a list of paths that will be
added to the beginning of sys.path. For example, if PYTHONPATH is set
to /www/python:/opt/py, the search path will begin with
['/www/python', '/opt/py']. (Note that directories must exist in order
to be added to sys.path; the site module removes paths that don’t
exist.)


Fortunately, this issue is fixable! I opened a bug report, and a few
others and I have provided patches that make setuptools stop doing
this. That makes the above documentation true again. The negative
impact on features or backwards-compatibility doesn't seem to be
great.

http://bugs.python.org/setuptools/issue53
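
A minimal sketch of how one might check for the symptom described above,
assuming it shows up as .egg directories sitting ahead of the PYTHONPATH
entries in sys.path. The function name and the ".egg" suffix test are
illustrative assumptions, not part of setuptools or of the patches in the
bug report.

```python
import os


def eggs_ahead_of_pythonpath(sys_path, pythonpath, sep=os.pathsep):
    """Return .egg entries in sys_path that precede the first PYTHONPATH entry."""
    wanted = [p for p in pythonpath.split(sep) if p]
    # Positions of the PYTHONPATH entries that actually made it into sys_path.
    positions = [sys_path.index(p) for p in wanted if p in sys_path]
    if not positions:
        return []
    first = min(positions)
    # Anything egg-like before that point shadows the user's PYTHONPATH.
    return [p for p in sys_path[:first] if p.endswith(".egg")]
```

On a live interpreter this could be called as
eggs_ahead_of_pythonpath(sys.path, os.environ.get("PYTHONPATH", "")); an
empty list means no egg directory is shadowing PYTHONPATH.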

Philip J. Eby provisionally approved of one of the patches, except for
some specific requirement that I didn't really understand how to fix
and that now I don't exactly remember:

http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html

Regards,

Zooko
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 418 is too divisive and confusing and should be postponed

2012-04-05 Thread Zooko Wilcox-O'Hearn
Folks:

Good job, Victor Stinner on baking the accumulated knowledge of this
thread into PEP 418. Even though I'm very interested in the topic, I
haven't been able to digest the whole thread(s) on the list and
understand what the current collective understanding is. The detailed
PEP document helps a lot.

I think there are still some mistakes, either in our collective
understanding as reflected by the PEP, or in my own head.

For starters, I still don't understand the first, most basic thing:
what do people mean when they say "monotonic clock"? I don't
understand the current text of PEP 418 with regard to the definition
of that word.

Allow me to resort to an analogy. There is an infinitely long,
perfectly straight and flat racetrack. There is a flag that gets
dragged along it at a constant rate, with the label REAL TIME on the
flag. There are some runners, each with a different label on their
chest:

Runner A: a helicopter hovers over Runner A. Occasionally it picks him
up and plops him down right next to the flag. Also, he wears a headset
and listens to instructions from his coach to run a little faster or
slower, as necessary, to remain abreast of the flag.

Runner B: a helicopter hovers over Runner B. If he is behind the flag,
it will pick him up and plop him down right next to the flag. However,
if he is ahead of the flag it will not pick him up.

Runner C: no helicopter ever picks up Runner C, but he does wear a
headset and listens to instructions from his coach to run a little
faster or a little slower. His coach tells him to run a little faster
if he is behind the flag or run a little slower if he is in front of
the flag, with the goal of eventually having him right next to the
flag.

Runner D: like Runner C, he never gets picked up, but he listens to
instructions to run a little faster or a little slower. However,
instead of telling him to run faster in order to catch up to the flag,
or to run slower in order to catch down to the flag, his coach
instead tells him to run a little faster if he is moving slower than
the flag is moving, and to run a little slower if he is moving faster
than the flag is moving. Note that this is very different from Runner
C, in that it is not intended to cause him to eventually be right next
to the flag, and indeed if it is done right it guarantees that he will
*never* be right next to the flag, although he will be moving just as
fast as the flag is moving.

Runner E: no helicopter, no headset. He just proceeds at his own pace,
blissfully unaware of the exhortations of others.

Now: which ones of these five runners do you call "monotonic"? Which
ones do you call "steady"?

Regards,

Zooko


[Python-Dev] this is why we shouldn't call it a monotonic clock (was: PEP 418 is too divisive and confusing and should be postponed)

2012-04-05 Thread Zooko Wilcox-O'Hearn
On Thu, Apr 5, 2012 at 7:14 PM, Greg Ewing greg.ew...@canterbury.ac.nz wrote:

 This is the strict mathematical meaning of the word monotonic, but the way 
 it's used in relation to OS clocks, it seems to mean rather more than that.

Yep. As far as I can tell, nobody has a use for an unsteady, monotonic clock.

There seem to be two groups of people:

1. Those who think that "monotonic clock" means a clock that never
goes backwards. These people are in the majority. After all, that's
what the word "monotonic" means ¹. However, a clock which guarantees
*only* this is useless.

2. Those who think that "monotonic clock" means a clock that never
jumps, and that runs at a rate approximating the rate of real time.
This is a very useful kind of clock to have! It is what C++ now calls
a "steady clock". It is what all the major operating systems provide.

The people in class 1 are more correct, technically, and far more
numerous, but the concept from 1 is a useless concept that should be
forgotten.
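
The difference between the two senses can be sketched in a few lines. This
is only an illustration: the function names, the (real_time, reading)
sample format, and the 1% rate tolerance are assumptions chosen for the
example, not definitions taken from PEP 418.

```python
def is_monotonic(readings):
    """Sense 1: the clock readings never go backwards."""
    return all(b >= a for a, b in zip(readings, readings[1:]))


def is_steady(samples, rate_tolerance=0.01):
    """Sense 2: the clock never jumps and runs at about the rate of real time.

    samples is a list of (real_time, clock_reading) pairs.
    """
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        real, shown = t1 - t0, c1 - c0
        if real <= 0:
            continue  # no real time elapsed, so no rate can be computed
        if abs(shown / real - 1.0) > rate_tolerance:
            return False
    return True
```

A clock that leaps forward by an hour passes is_monotonic() and fails
is_steady(), which is the case that only a sense 1 guarantee lets through.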

So before proceeding, we should mutually agree that we have no
interest in implementing a clock of type 1. It wouldn't serve anyone's
use case (correct me if I'm wrong!) and the major operating systems
don't offer such a thing anyway.

Then, if we all agree to stop thinking about that first concept, we
need to agree whether we're all going to use the term "monotonic
clock" to refer to the second concept, or whether we're going to use a
different term (such as "steady clock") to refer to the second
concept. I would prefer the latter, as it will relieve us of the need
to repeatedly explain to newcomers: "That word doesn't mean what you
think it means."

The main reason to use the term "monotonic clock" to refer to the
second concept is that POSIX does so, but since Mac OS X, Solaris,
Windows, and C++ have all avoided following POSIX's mistake, I think
Python should too.

Regards,

Zooko

¹ http://mathworld.wolfram.com/MonotonicSequence.html


Re: [Python-Dev] Drop the new time.wallclock() function?

2012-03-26 Thread Zooko Wilcox-O'Hearn
On Fri, Mar 23, 2012 at 11:27 AM, Victor Stinner
victor.stin...@gmail.com wrote:

 time.steady(strict=False) is what you need to implement timeout.

No, that doesn't fit my requirements, which are about event
scheduling, profiling, and timeouts. See below for more about my
requirements.

I didn't say this explicitly enough in my previous post:

Some use cases (timeouts, event scheduling, profiling, sensing)
require a steady clock. Others (calendaring, communicating times to
users, generating times for comparison to remote hosts) require a wall
clock.

Now here's the kicker: each use case incurs significant risks if it
uses the wrong kind of clock.

If you're implementing event scheduling or sensing and control, and
you accidentally get a wall clock when you thought you had a steady
clock, then your program may go seriously wrong -- events may fire in
the wrong order, measurements of your sensors may be wildly incorrect.
This can lead to serious accidents. On the other hand, if you're
implementing calendaring or display of real local time of day to a
user, and you are using a steady clock for some reason, then you risk
displaying incorrect results to the user.
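
As a sketch of keeping the two kinds of clock apart, assuming Python 3.3
or later where time.monotonic() exists: the timeout below is measured on a
steady clock, and only the human-readable timestamp touches the wall
clock. The helper names wait_for and log_line are made up for this
example.

```python
import time


def wait_for(predicate, timeout, clock=time.monotonic, sleep=time.sleep, poll=0.01):
    """Poll predicate until it is true or `timeout` seconds pass on `clock`."""
    deadline = clock() + timeout
    while not predicate():
        if clock() >= deadline:
            return False  # timed out; unaffected by wall clock adjustments
        sleep(poll)
    return True


def log_line(message, wall_clock=time.time):
    """Prefix message with a UTC calendar timestamp meant for humans."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(wall_clock()))
    return f"{stamp}Z {message}"
```

Passing the clock in as a parameter makes the choice explicit at the call
site and lets tests substitute a fake clock.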

So using one kind of clock and then falling back to the other kind
is a choice that should be rare, explicit, and discouraged. The
provision of such a function in the standard library is an attractive
nuisance -- a thing that people naturally think that they want when
they haven't thought about it very carefully, but that is actually
dangerous.

If someone has a use case which fits the "steady or else fall back to
wall clock" pattern, I would like to learn about it.

Regards,

Zooko


Re: [Python-Dev] PEP 418: Add monotonic clock

2012-03-26 Thread Zooko Wilcox-O'Hearn
  system_clock = wall clock time
  monotonic_clock = always goes forward but can be adjusted
  steady_clock = always goes forward and cannot be adjusted
  high_resolution_clock = steady_clock || system_clock

Note that the C++ standard deprecated monotonic_clock once they
realized that there is absolutely no point in having a clock that
jumps forward but not back, and that none of the operating systems
implement such a thing -- instead they all implement a clock which
doesn't jump in either direction.

http://stackoverflow.com/questions/6777278/what-is-the-rationale-for-renaming-monotonic-clock-to-steady-clock-in-chrono

In other words, yes! +1! The C++ standards folks just went through the
process that we're now going through, and if we do it right we'll end
up at the same place they are:

http://en.cppreference.com/w/cpp/chrono/system_clock


system_clock represents the system-wide real time wall clock. It may
not be monotonic: on most systems, the system time can be adjusted at
any moment. It is the only clock that has the ability to map its time
points to C time, and, therefore, to be displayed.

steady_clock: monotonic clock that will never be adjusted

high_resolution_clock: the clock with the shortest tick period available


Note that we don't really have the option of providing a clock which
is "monotonic but not steady" in the sense of "can jump forward but
not back". It is a misunderstanding (doubtless due to the confusing
name "monotonic") to think that such a thing is offered by the
underlying platforms. We can choose to *call* it "monotonic",
following POSIX, instead of calling it "steady", following C++.
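
For comparison, Python 3.3 and later expose similar properties through
time.get_clock_info(). The sketch below only reads the flags Python
reports for a few clock names; the helper name describe_clocks is an
assumption made for this example.

```python
import time


def describe_clocks(names=("time", "monotonic", "perf_counter")):
    """Map each clock name to the (monotonic, adjustable) flags Python reports."""
    report = {}
    for name in names:
        info = time.get_clock_info(name)
        report[name] = (info.monotonic, info.adjustable)
    return report
```

In this report the "time" clock plays the role of system_clock (it is not
flagged monotonic), while "monotonic" corresponds to what C++ calls
steady_clock.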

Regards,

Zooko


Re: [Python-Dev] Drop the new time.wallclock() function?

2012-03-26 Thread Zooko Wilcox-O'Hearn
On Mon, Mar 26, 2012 at 5:07 PM, Victor Stinner
victor.stin...@gmail.com wrote:

 If someone has a use case which fits the steady or else fall back to wall 
 clock pattern, I would like to learn about it.

 Python 3.2 doesn't provide a monotonic clock, so most program uses 
 time.time() even if a monotonic clock would be better in some functions. For 
 these programs, you can replace time.time() by time.steady() where you need 
 to compute a time delta (e.g. compute a timeout) to avoid issues with the 
 system clock update. The idea is to improve the program without refusing to 
 start if no monotonic clock is available.

I agree that this is a reasonable use case. I think of it as basically
being a kind of backward-compatibility, for situations where an
unsteady clock is okay, and a steady clock isn't available. Twisted
faces a similar issue:

http://twistedmatrix.com/trac/ticket/2424

It might be good for use cases like this to explicitly implement the
try-and-fallback, since they might have specific needs about how it is
done. For one thing, some such uses may need to emit a warning, or
even to require the caller to explicitly override, such as refusing to
start if a steady clock isn't available unless the user specifies
--unsteady-clock-ok.
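
One way such an explicit try-and-fallback might look, as a sketch only:
choose_clock and its unsteady_ok parameter are hypothetical names
mirroring the --unsteady-clock-ok idea above, and the steady clock is
assumed to be time.monotonic() where it exists.

```python
import time
import warnings


def choose_clock(unsteady_ok=False, time_module=time):
    """Return a steady clock, or the wall clock only if the caller opted in."""
    steady = getattr(time_module, "monotonic", None)
    if steady is not None:
        return steady
    if not unsteady_ok:
        # Refuse to start rather than silently depend on an adjustable clock.
        raise RuntimeError(
            "no steady clock available; pass unsteady_ok=True to use the wall clock"
        )
    warnings.warn("falling back to the adjustable wall clock", RuntimeWarning)
    return time_module.time
```

The fallback happens only when the operator asked for it, and it is
announced with a warning instead of being chosen quietly by the library.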

For motivating examples, consider software written using Twisted 
12.0 or Python  3.2 which is using a clock to drive real world
sensing and control -- measuring the position of a machine and using
time deltas to calculate the machine's velocity, in order to
automatically control the motion of the machine. For some uses, it is
okay if the measurement could, in rare cases, be drastically wrong.
For other uses, that is not an acceptable risk.

One reason I'm sensitive to this issue is that I work in the field of
security, and making the behavior dependent on the system clock
extends the reliance set, i.e. the set of things that an attacker
could use against you. For example, if your robot depends on the
system clock for its sensing and control, and if your system clock
obeys NTP, then the set of things that an attacker could use against
you includes your NTP servers. If your robot depends instead on a
steady clock, then NTP servers are not in the reliance set.

Now, if your control platform doesn't have a steady clock, you may
choose to go ahead, while making sure that the NTP servers are
authenticated, or you may choose to disable NTP on the control
platform, etc., but that choice might need to be made explicitly by
the operator, rather than automatically by the library.

Regards,

Zooko


Re: [Python-Dev] Drop the new time.wallclock() function?

2012-03-23 Thread Zooko Wilcox-O'Hearn
 I merged the two functions into one function: time.steady(strict=False).

 time.steady() should be monotonic most of the time, but may use a fallback.

 time.steady(strict=True) fails with OSError or NotImplementedError if
 reading the monotonic clock failed or if no monotonic clock is available.

If someone wants time.steady(strict=False), then why don't they just
continue to use time.time()?

I want time.steady(strict=True), and I'm glad you're providing it and
I'm willing to use it this way, although it is slightly annoying
because time.steady(strict=True) really means
time.steady(i_really_mean_it=True). Else, I would have used
time.time().

I am aware of a large number of use cases for a steady clock (event
scheduling, profiling, timeouts), and a large number of use cases for
an NTP-respecting wall clock (calendaring, displaying to a
user, timestamping). I'm not aware of any use case for "steady if
implemented, else wall-clock", and it sounds like a mistake to me.

Regards,

Zooko


[Python-Dev] ctime: I don't think that word means what you think it means.

2009-06-13 Thread Zooko Wilcox-O'Hearn

The stat module uses the st_ctime slot to hold two kinds of values
which are semantically different and which are frequently
confused with one another.  It chooses which kind of value to put in
there based on platform -- Windows gets the file creation time and all
other platforms get the ctime.  The only sane way to use this API is
then to switch on platform:

    if platform.system() == "Windows":
        metadata["creation time"] = s.st_ctime
    else:
        metadata["unix ctime"] = s.st_ctime

(That is an actual code snippet from the Allmydata-Tahoe project.)

Many or even most programmers incorrectly think that unix ctime is file
creation time, so instead of using the sane idiom above, they write the
following:

    metadata["ctime"] = s.st_ctime

thus passing on the confusion to the users of their metadata, who may
not be able to tell on which platform this metadata was created.
This is the situation we have found ourselves in for the
Allmydata-Tahoe project -- we now have a bunch of "ctime" values
stored in our filesystem and no way to tell which kind they were.

More and more filesystems such as ZFS and Mac HFS+ apparently offer
creation time nowadays.

I propose the following changes:

1.  Add a st_crtime field which gets populated on filesystems
(Windows, ZFS, Mac) which can do so.

That is hopefully not too controversial and we could proceed to do so
even if the next proposal gets bogged down:

2.  Add a st_unixctime field which gets populated *only* by the unix
ctime and never by any other value (even on Windows, where the unix
ctime *is* available even though nobody cares about it), and deprecate
the hopelessly ambiguous st_ctime field.
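
Until fields like these exist, a caller can at least label the values
unambiguously. The sketch below is an assumption-laden illustration:
st_crtime and st_unixctime are only proposals in this message, so the code
instead looks for st_birthtime, which os.stat_result provides on some
platforms, and otherwise falls back to the platform switch shown earlier.
The function name label_times is made up.

```python
def label_times(stat_result, system):
    """Return file times under names that say what they actually mean.

    stat_result is an os.stat() result; system is platform.system().
    """
    meta = {"mtime": stat_result.st_mtime}
    birth = getattr(stat_result, "st_birthtime", None)
    if birth is not None:
        meta["creation time"] = birth
    elif system == "Windows":
        # On Windows, st_ctime has historically held the creation time.
        meta["creation time"] = stat_result.st_ctime
    if system != "Windows":
        # Elsewhere, st_ctime is the unix inode-change time.
        meta["unix ctime"] = stat_result.st_ctime
    return meta
```

A typical call would be label_times(os.stat(path), platform.system()), so
that stored metadata never carries a bare "ctime" key.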

You may be interested in http://allmydata.org/trac/tahoe/ticket/628
(mtime and ctime: I don't think that word means what you think it
means.) where the Allmydata-Tahoe project is carefully unpicking the
mess we made for ourselves by confusing ctime with file-creation time.

This is ticket http://bugs.python.org/issue5720 .

Regards,

Zooko



Re: [Python-Dev] .pth files are evil

2009-05-10 Thread Zooko Wilcox-O'Hearn

On May 9, 2009, at 9:39 AM, P.J. Eby wrote:

It would be really straightforward, though, for someone to  
implement an easy_install variant that does this.  Just invoke  
easy_install -Zmaxd /some/tmpdir packagelist to get a full set of  
unpacked .egg directories in /some/tmpdir, and then move the  
contents of the resulting .egg subdirs to the target location,  
renaming EGG-INFO subdirs to projectname-version.egg-info subdirs.


Except for the renaming part, this is exactly what GNU stow does.

(Of course, this ignores the issue of uninstalling previous  
versions, or overwriting of conflicting files in the target -- does  
pip handle these?)


GNU stow does handle these issues.

Regards,

Zooko


[Python-Dev] how GNU stow is complementary rather than alternative to distutils

2009-05-10 Thread Zooko Wilcox-O'Hearn

On May 10, 2009, at 11:18 AM, Martin v. Löwis wrote:

If GNU stow solves all your problems, why do you want to use  
easy_install in the first place?


That's a good question.  The answer is that there are two separate  
jobs: building executables and putting them in a directory structure  
of the appropriate shape for your system is one job, and installing  
or uninstalling that tree into your system is another.  GNU stow does  
only the latter.


The input to GNU stow is a set of executables, library files, etc.,
in a directory tree that is of the right shape for your system.  For
example, if you are on a Linux system, then your scripts all need to
be in $prefix/bin/, your shared libs should be in $prefix/lib, your
Python packages ought to be in
$prefix/lib/python$x.$y/site-packages/, etc.  GNU stow is blissfully
ignorant about all issues of building binaries, and choosing where to
place files, etc. -- that's the job of the build system of the
package, e.g. the "./configure --prefix=foo && make && make install"
for most C packages, or the "python ./setup.py install --prefix=foo"
for Python packages using distutils (footnote 1).


Once GNU stow has the well-shaped directory which is the output of  
the build process, then it follows a very dumb, completely reversible  
(uninstallable) process of symlinking those files into the system  
directory structure.
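
The symlinking process can be sketched roughly as follows. This is a
simplified illustration of the idea, not GNU stow's actual algorithm (the
real tool does more, and these function names are invented): it links
every file of one package tree into the target tree, refuses to proceed if
any target name is already taken, and can undo exactly what it created.

```python
import os


def plan(package_dir, target_dir):
    """List (source, link) pairs for every file under package_dir."""
    pairs = []
    for root, _dirs, files in os.walk(package_dir):
        rel = os.path.relpath(root, package_dir)
        for name in files:
            source = os.path.abspath(os.path.join(root, name))
            link = os.path.normpath(os.path.join(target_dir, rel, name))
            pairs.append((source, link))
    return pairs


def stow(package_dir, target_dir):
    """Symlink a package tree into target_dir, refusing on any conflict."""
    pairs = plan(package_dir, target_dir)
    conflicts = [link for _source, link in pairs if os.path.lexists(link)]
    if conflicts:
        raise FileExistsError("conflicting files: " + ", ".join(conflicts))
    for source, link in pairs:
        os.makedirs(os.path.dirname(link), exist_ok=True)
        os.symlink(source, link)
    return [link for _source, link in pairs]


def unstow(links):
    """Reverse stow() by removing exactly the symlinks it created."""
    for link in links:
        if os.path.islink(link):
            os.unlink(link)
```

Checking for conflicts before creating any link keeps the operation
all-or-nothing, which is what makes the install trivially reversible.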


It is a beautiful, elegant hack because it is sooo dumb.  It is also  
very nice to use the same tool to manage packages written in any  
programming language, provided only that they can build a directory  
tree of the right shape and content.


However, there are lots of things that it doesn't do, such as
automatically acquiring and building dependencies, or producing
executables for the target platform for each of your console
scripts.  Not to mention creating a directory named
"$prefix/lib/python$x.$y/site-packages" and cp'ing your Python files
into it.  That's why you still need a build system even if you use
GNU stow for an install-and-uninstall system.


The thing that prevents this from working with setuptools is that
setuptools creates a file named easy_install.pth during the
"python ./setup.py install --prefix=foo".  If you build two different
Python packages this way, they will each create an easy_install.pth
file, and then when you ask GNU stow to link the two resulting
packages into your system, it will say "You are asking me to install
two different packages which both claim that they need to write a
file named '/usr/local/lib/python2.5/site-packages/easy_install.pth'.
I'm too dumb to deal with this conflict, so I give up."  If I
understand correctly, your (MvL's) suggestion that easy_install
create a .pth file named easy_install-$PACKAGE-$VERSION.pth instead
of easy_install.pth would indeed make it work with GNU stow.
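
The effect of the per-package naming can be shown without touching the
filesystem. In this sketch the file names follow the spelling used in this
message, and pth_name and stow_conflicts are hypothetical helpers, not
setuptools or GNU stow functions.

```python
from collections import Counter


def pth_name(package, version, per_package=True):
    """Name of the .pth file one package install would write."""
    if per_package:
        return f"easy_install-{package}-{version}.pth"
    return "easy_install.pth"


def stow_conflicts(file_lists):
    """Return relative paths claimed by more than one package tree."""
    counts = Counter(path for files in file_lists for path in set(files))
    return sorted(path for path, n in counts.items() if n > 1)
```

With the shared name, two package trees both claim the same path and a
stow-style installer has to give up; with per-package names the list of
conflicts is empty.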


Regards,

Zooko

footnote 1: Aside from the .pth file issue, the other reason that
setuptools doesn't work for this use while distutils does is that
setuptools tries too hard to save you from making a mistake: maybe you
don't know what you are doing if you ask it to install into a
previously non-existent prefix dir foo.  This one is easier to fix:
http://bugs.python.org/setuptools/issue54 # be more like distutils
with regard to --prefix= .



Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-06 Thread Zooko Wilcox-O'Hearn

On May 6, 2009, at 7:33 AM, Stephen J. Turnbull wrote:


You have convinced me that the PEP should wait as well.

In its current form it is incomplete and dangerous.


+1 on delaying PEP 383

I think PEP 383 is a good idea in principle, but I'm still struggling  
to understand it myself, and it seems to offer new hazards for the  
unwary programmer.


On the other hand, maybe the wary programmers are waiting for Python
3.2 anyway <wink>.


On the gripping hand, if PEP 383 is released in Python 3.1, will that
obligate python-dev to support it indefinitely, at least in
backwards-compatibility mode?  I'm not thinking of API compatibility
as much as data compatibility -- someone used Python 3.1 to write down
some filenames, and now a few years later they are trying to use the
latest and greatest Python release to read those filenames...


Regards,

Zooko


Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

2009-05-06 Thread Zooko Wilcox-O'Hearn

On May 6, 2009, at 10:54 AM, Antoine Pitrou wrote:


Zooko Wilcox-O'Hearn zooko at zooko.com writes:


I'm not thinking of API compatibility as much as data  
compatibility -- someone used Python 3.1 to write down some  
filenames, and now a few years later they are trying to use the  
latest and greatest Python release to read those filenames...


Well, if the filenames are generated by Python (as opposed to read  
from an existing directory on disk), they should be regular unicode  
objects without any lone surrogates, so I don't see the  
compatibility problem.


I meant that the application reads filenames from an existing  
directory on disk, saves those filenames, and then later, using a  
future version of Python, wants to read them and use them.


I'm not saying that I know this would be a problem.  I'm saying that  
I personally can't tell whether it would be a problem or not, and the  
extensive discussions so far have not convinced me that there is  
anyone who both understands PEP 383 and considers this use case.


Many people who apparently understand encoding issues well have said
something to the effect that there is no problem, but those people
haven't yet managed to get through my thick skull how I would use PEP
383 safely for this sort of use case -- the one where data generated
by os.listdir() travels forward in time or the one where that data
travels sideways to other systems, including Windows or other systems
that validate incoming unicode.
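
The hazard can be made concrete with the error handler as it is spelled in
released Python 3, "surrogateescape" (the thread calls it utf8b). This is
a sketch with a made-up filename and a hypothetical is_portable helper: a
name that is not valid UTF-8 round-trips on the same machine, but the
decoded string carries a lone surrogate that a strict UTF-8 encoder
rejects.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes on disk, not valid UTF-8

# Decoding as PEP 383 describes keeps the undecodable byte as a lone surrogate.
name = raw.decode("utf-8", "surrogateescape")

# The same handler restores the original bytes, so local round trips work.
restored = name.encode("utf-8", "surrogateescape")


def is_portable(text):
    """True if text encodes as strict UTF-8, i.e. has no lone surrogates."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True
```

An application that stores such names, or sends them to a system that
validates incoming unicode, could call is_portable() first and decide
explicitly what to do with the names that fail.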


That's why I am a bit uncomfortable about PEP 383 being quickly  
implemented and deployed in Python 3.1.


By the way, much of the detailed discussion about what Tahoe requires
and how that may or may not benefit from PEP 383 has now moved to the
tahoe-dev mailing list:
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev .


Regards,

Zooko
