Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On 15.02.2012 21:06, Antoine Pitrou wrote:
> On Wed, 15 Feb 2012 20:56:26 +0100, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
>> With the quartz in Victor's machine, a single clock tick takes 0.3 ns,
>> so three of them make a nanosecond. As the quartz may not be entirely
>> accurate (and also as the CPU frequency may change), you have to measure
>> the clock rate against an external time source, but Linux has
>> implemented algorithms for that. On my system, dmesg shows
>>
>>     [    2.236894] Refined TSC clocksource calibration: 2793.000 MHz.
>>     [    2.236900] Switching to clocksource tsc
>
> But that's still not meaningful. By the time clock_gettime() returns, an
> unpredictable number of nanoseconds have elapsed, and even more when
> returning to the Python evaluation loop.

This is not exactly true: while the current time won't be what was returned
by the time you use it, it is certainly possible to predict how long it
takes to return from a system call. So the result is not accurate, but it
is meaningful.

If you are formally arguing that uncertain events may happen, such as the
scheduler interrupting the thread: this is true for any clock reading; the
actual time may be many milliseconds off by the time it is used. That is no
reason to fall back to second resolution.

> So the nanosecond precision is just an illusion, and a float should
> really be enough to represent durations for any task where Python is
> suitable as a language.

I agree with that statement - I was just refuting your claim that Linux
cannot do nanosecond measurements.

Please do recognize the point I made to Guido: despite us three agreeing
that a float is good enough for time stamps, people will continue to submit
patches and ask for new features until we give in. One way to delay that by
several years could be to reject the PEP in a way that makes it clear that
not only the specific approach is rejected, but any approach using anything
other than floats.

Regards,
Martin
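[For a rough sense of the "syscall return time is predictable" point, one
can time a batch of clock reads. A minimal sketch, assuming Python 3.3's
time.perf_counter(); any high-resolution timer would do:]

    import time

    # Estimate the average cost of one clock read by timing a batch.
    # The per-call overhead is fairly stable, which is what makes a
    # high-resolution reading meaningful even if it is not accurate.
    N = 10**6
    start = time.perf_counter()
    for _ in range(N):
        time.time()
    elapsed = time.perf_counter() - start
    print("per-call overhead: %.0f ns" % (elapsed / N * 1e9))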
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
> Maybe an alternative PEP could be written that supports the filesystem
> copying use case only, using some specialized ns APIs? I really think
> that all you need is st_{a,c,m}time_ns fields and os.utime_ns().

I'm -1 on that, because it will make people write complicated code.

Regards,
Martin
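[A sketch of what the proposed API might look like for the copy use case:
st_*time_ns as integer nanosecond counts, mirrored by an os.utime_ns()
setter. These names are the *proposal* under discussion here, not an
existing API:]

    import os

    # Copy atime/mtime from src to dst without going through float.
    # st_atime_ns, st_mtime_ns and os.utime_ns() are hypothetical.
    st = os.stat("src")
    os.utime_ns("dst", (st.st_atime_ns, st.st_mtime_ns))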
Re: [Python-Dev] best place for an atomic file API
> (MvL complained in the tracker issue about a lack of concrete use cases,
> but I think fixing race conditions when overwriting bytecode files in
> importlib and the existing distutils/packaging use cases cover that)

I certainly agree that there are applications of atomic replace, and that
the os module should expose the relevant platform APIs where available.

I'm not so sure that "atomic writes" is a useful concept. I haven't seen a
proposed implementation yet, but I'm doubtful that truly ACID writes are
possible unless the operating system supports transactions (which only
Windows 7 does). Even if you ignore Isolation, Atomicity alone is a
challenge: if you first write to a tempfile, then rename it, you may end up
with a stale tempfile (e.g. if the process is killed), and no rollback
operation. So "atomic write" to me promises something that it likely can't
deliver. OTOH, I still think that the promise isn't actually asked for in
practice (not even when overwriting bytecode files).

Regards,
Martin
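[For reference, the write-to-temp-then-rename idiom under discussion looks
roughly like this. A best-effort sketch, not ACID: a crash between the
write and the cleanup is exactly the "stale tempfile" case described
above:]

    import os
    import tempfile

    def replace_contents(path, data):
        # Write the new contents to a temp file in the same directory
        # (rename is only atomic within one filesystem), flush to disk,
        # then rename over the target.
        dirname = os.path.dirname(path) or "."
        fd, tmp = tempfile.mkstemp(dir=dirname)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.rename(tmp, path)  # os.replace() on 3.3+ also covers Windows
        except BaseException:
            os.unlink(tmp)
            raise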
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
> So, getting back to the topic again, is there any reason why you would
> oppose backing the ElementTree module in the stdlib by cElementTree's
> accelerator module? Or can we just consider this part of the discussion
> settled and start getting work done?

I'd still like to know who is in charge of the etree package now. I know
that I'm not, so I just don't have any opinion on the technical question of
using the accelerator module (it sounds like a reasonable idea, but it also
sounds like something that may break existing code). If the maintainer of
the etree package were to pronounce that it is ok to make this change, I'd
have no objection at all.

Lacking a maintainer, I feel responsible for any bad consequences of that
change, which makes me feel uneasy about it.

Regards,
Martin
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
> Does this imply that each and every package in the stdlib currently has a
> dedicated maintainer who promised to be dedicated to it? Or otherwise,
> should those packages that *don't* have a maintainer be removed from the
> standard library?

That is my opinion, yes. Some people (including myself) are willing to act
as maintainers for large sets of modules, covering even code that they
don't ever use themselves.

> Isn't that a bit harsh? ElementTree is an overall functional library and
> AFAIK the preferred stdlib tool for processing XML for many developers.
> It currently needs some attention to fix a few issues, expose the fast C
> implementation by default when ElementTree is imported, and improve the
> documentation. At this point, I'm interested enough to work on these -
> given that the political issue with Fredrik Lundh is resolved. However, I
> can't *honestly* say I promise to maintain the package until 2017. So,
> what's next?

If you feel qualified to make changes, go ahead and make them. Take the
praise if they are good changes, take the blame if they backfire. Please do
try to stay around until either has happened.

It would also be good if you would declare "I will maintain the etree
package".

Regards,
Martin
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
> I'd still like to know who is in charge of the etree package now. I know
> that I'm not, so I just don't have any opinion on the technical question
> of using the accelerator module (it sounds like a reasonable idea, but it
> also sounds like something that may break existing code). If the
> maintainer of the etree package were to pronounce that it is ok to make
> this change, I'd have no objection at all. Lacking a maintainer, I feel
> responsible for any bad consequences of that change, which makes me feel
> uneasy about it.

Martin, as you've seen, Fredrik Lundh finally officially ceded the
maintenance of the ElementTree code to the Python developers:
http://mail.python.org/pipermail/python-dev/2012-February/116389.html

The change of backing ElementTree by cElementTree has already been
implemented in the default branch (3.3) by Florent Xicluna, with careful
review from me and others. etree has an extensive (albeit a bit clumsy) set
of tests which keep passing successfully after the change. The bots are
also happy.

In the past couple of years Florent has been the de-facto maintainer of
etree in the standard library, although I don't think he ever committed to
keep maintaining it for years to come. Neither can I make this commitment;
however, I do declare that I will do my best to keep the library
functional, and I also plan to work on improving its documentation and
cleaning up some of the accumulated cruft in its implementation. I also
have every intention of taking the blame if something breaks.

That said, Florent is probably the one most familiar with the code at this
point, and although his help will be most appreciated, I can't expect or
demand that he stick around for a few years. We're all volunteers here,
after all.

Eli
Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
In article <cadisq7fg3vgxd39teuvbcvhhmpkuwss0qcksrfpkn5ye0dv...@mail.gmail.com>,
Nick Coghlan <ncogh...@gmail.com> wrote:
> On Thu, Feb 16, 2012 at 12:06 PM, Guido van Rossum <gu...@python.org> wrote:
>> Anyway, I don't think anyone is objecting against the PEP allowing
>> symlinks now.
>
> Yeah, the onus is just back on me to do the final updates to the PEP and
> patch based on the discussion in this thread. Unless life unexpectedly
> intervenes, I expect that to happen on Saturday (my time). After that,
> the only further work is for Ned to supply whatever updates he needs to
> bring the 2.7 Mac OS X installers into line with the new naming scheme.

There are two issues that I know of for OS X. One is just getting a python2
symlink into the bin directory of a framework build. That's easy. The other
is managing symlinks (python, python2, and python3) across framework bin
directories; currently there's no infrastructure for that. That part will
probably have to wait until PyCon.

--
 Ned Deily, n...@acm.org
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
2012/2/16 "Martin v. Löwis" <mar...@v.loewis.de>:
>> Maybe an alternative PEP could be written that supports the filesystem
>> copying use case only, using some specialized ns APIs? I really think
>> that all you need is st_{a,c,m}time_ns fields and os.utime_ns().
>
> I'm -1 on that, because it will make people write complicated code.

Python 3.3 *already has* APIs for nanosecond timestamps: os.utimensat(),
os.futimens(), signal.sigtimedwait(), etc. These functions expect a
(seconds: int, nanoseconds: int) tuple.

We have to decide before the Python 3.3 release if this API is just fine,
or if it should be changed. After the release, it will be more difficult to
change the API.

If os.utimensat() expects a tuple, it would be nice to have a function that
gets the time as a tuple, like the C language has the clock_gettime()
function to get a timestamp as a timespec structure.

During the discussion, many developers wanted a type that allows arithmetic
operations like t2-t1 to compute a delta, or t+delta to apply an offset
(e.g. for a timezone). It is possible to do arithmetic on a tuple, but it
is not practical, and I don't like a type with a fixed resolution (in some
cases you need millisecond, microsecond or 100 ns resolution).

If you consider that the float loss of precision is not an issue for
nanoseconds, we should use float for os.utimensat(), os.futimens() and
signal.sigtimedwait(), just for consistency.

Victor
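[To illustrate why tuple arithmetic is impractical: even a subtraction has
to handle the two fields separately, with manual borrowing. A sketch:]

    # Subtract two (seconds, nanoseconds) timespec-style tuples.
    def timespec_sub(t2, t1):
        sec = t2[0] - t1[0]
        nsec = t2[1] - t1[1]
        if nsec < 0:          # borrow one second
            sec -= 1
            nsec += 10**9
        return (sec, nsec)

    print(timespec_sub((10, 100), (9, 900000000)))  # (0, 100000100)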
Re: [Python-Dev] best place for an atomic file API
Most users don't need a truly ACID write, but implement their own
best-effort function. Instead of having a different implementation in each
project, Python can provide something better, especially when the OS
provides low-level functions to implement such a feature.

Victor

2012/2/16 "Martin v. Löwis" <mar...@v.loewis.de>:
> I'm not so sure that "atomic writes" is a useful concept. I haven't seen
> a proposed implementation yet, but I'm doubtful that truly ACID writes
> are possible unless the operating system supports transactions (which
> only Windows 7 does). Even if you ignore Isolation, Atomicity alone is a
> challenge: if you first write to a tempfile, then rename it, you may end
> up with a stale tempfile (e.g. if the process is killed), and no rollback
> operation. So "atomic write" to me promises something that it likely
> can't deliver. OTOH, I still think that the promise isn't actually asked
> for in practice (not even when overwriting bytecode files).
Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
> There are two issues that I know of for OS X. One is just getting a
> python2 symlink into the bin directory of a framework build. That's easy.

Where exactly in the Makefile is that reflected? ISTM that the current
patch already covers that, since the framework* targets are not concerned
with the bin directory.

> The other is managing symlinks (python, python2, and python3) across
> framework bin directories; currently there's no infrastructure for that.
> That part will probably have to wait until PyCon.

What is the framework bin directory? The links are proposed for
/usr/local/bin and /usr/bin, respectively. The proposed patch already
manages these links across releases (the most recent install wins). If you
are concerned about multiple feature releases: this is not an issue, since
the links are only proposed for Python 2.7 (distributions may also add them
for 2.6 and earlier, but we are not going to make a release in that
direction).

It may be that the PEP becomes irrelevant before it is widely accepted: if
the sole remaining Python 2 version is 2.7, users may just as well refer to
python2.7 instead of python2.

Regards,
Martin
Re: [Python-Dev] best place for an atomic file API
On 16.02.2012 10:54, Victor Stinner wrote:
> Most users don't need a truly ACID write, but implement their own
> best-effort function. Instead of having a different implementation in
> each project, Python can provide something better, especially when the
> OS provides low-level functions to implement such a feature.

It's then critical how this is named, IMO (and exactly what semantics it
comprises). Calling it "atomic" when it is not is a mistake.

Also notice that one user commented that he already implemented something
like this, and left out the issue of *permissions*. I found that
interesting, since preserving permissions might indeed be a requirement in
a lot of in-place update use cases, but it hasn't been considered in this
discussion yet.

So rather than providing a mechanism for atomic writes, I think providing a
mechanism to update a file is what people might need. One way of providing
this might be a "u" mode for open, which updates an existing file on close
(unlike "a", which appends, and unlike "w", which truncates first).

Regards,
Martin
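[A sketch of what such update semantics could look like as a helper; the
class name and behavior are illustrative only, no such API exists:]

    import os, shutil, tempfile

    class UpdatedFile:
        # Collect the replacement contents in a temp file, copy the
        # original file's permission bits over, and swap it in on close.
        def __init__(self, path):
            self._path = path
            dirname = os.path.dirname(path) or "."
            fd, self._tmp = tempfile.mkstemp(dir=dirname)
            self._file = os.fdopen(fd, "wb")
            shutil.copymode(path, self._tmp)  # preserve permissions

        def write(self, data):
            self._file.write(data)

        def close(self):
            self._file.close()
            os.rename(self._tmp, self._path)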
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On 16.02.2012 10:51, Victor Stinner wrote:
>>> Maybe an alternative PEP could be written that supports the filesystem
>>> copying use case only, using some specialized ns APIs? I really think
>>> that all you need is st_{a,c,m}time_ns fields and os.utime_ns().
>>
>> I'm -1 on that, because it will make people write complicated code.
>
> Python 3.3 *already has* APIs for nanosecond timestamps: os.utimensat(),
> os.futimens(), signal.sigtimedwait(), etc. These functions expect a
> (seconds: int, nanoseconds: int) tuple.

I'm -1 on adding these APIs, also. Since Python 3.3 is not released yet,
it's not too late to revert them.

> If you consider that the float loss of precision is not an issue for
> nanoseconds, we should use float for os.utimensat(), os.futimens() and
> signal.sigtimedwait(), just for consistency.

I'm wondering what use cases utimensat and futimens have that are not
covered by utime/utimes (except for the higher resolution). Keeping the
"ns" in the name but not doing nanoseconds would be bad, IMO. For
sigtimedwait, accepting float is indeed the right thing to do.

In the long run, we should see whether using 128-bit floats is feasible.

Regards,
Martin
Re: [Python-Dev] best place for an atomic file API
"Martin v. Löwis" <martin at v.loewis.de> writes:
> One way of providing this might be a "u" mode for open, which updates an
> existing file on close (unlike "a", which appends, and unlike "w", which
> truncates first).

Doesn't "r+" cover this?

Regards,
Vinay Sajip
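[For comparison, "r+" opens an existing file for reading and writing
without truncating it, so an in-place update looks like this. Note that it
edits the file directly, with none of the swap-on-close semantics discussed
above:]

    with open("settings.conf", "r+") as f:
        data = f.read()
        f.seek(0)
        f.write(data.replace("old", "new"))
        f.truncate()  # drop the leftover tail if the new content is shorter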
Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
On Thu, Feb 16, 2012 at 8:01 PM, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
> It may be that the PEP becomes irrelevant before it is widely accepted:
> if the sole remaining Python 2 version is 2.7, users may just as well
> refer to python2.7 instead of python2.

My hope is that a clear signal from us supporting a python2 symlink for
cross-distro compatibility will encourage the commercial distros to add
such a link to their 2.6 based variants (e.g. anything with an explicit
python2.7 reference won't run by default on RHEL 6, or rebuilds based on
it).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
On Wed, Feb 15, 2012 at 12:44 AM, Barry Warsaw <ba...@python.org> wrote:
> On Feb 14, 2012, at 12:38 PM, Nick Coghlan wrote:
>> I have no idea, and I'm not going to open that can of worms for this
>> PEP. We need to say something about the executable aliases so that
>> people can eventually write cross-platform python2 shebang lines, but
>> how particular distros actually manage the transition is going to
>> depend more on their infrastructure and community than it is anything
>> to do with us.
>
> Then I think all the PEP needs to say is that it is explicitly up to the
> distros to determine if, when, where, and how they transition. I.e. take
> it off of python-dev's plate.

It turns out I'd forgotten what was in the PEP - the Notes section already
contained a lot of suggestions along those lines. I changed the title of
the section to "Migration Notes", but tried to make it clear that those
*aren't* consensus recommendations, just ideas distros may want to think
about when considering making the switch.

The updated version is live on python.org:
http://www.python.org/dev/peps/pep-0394/

I didn't end up giving an explicit rationale for the choice to use a
symlink chain, since it really isn't that important to the main purpose of
the PEP (i.e. encouraging distros to make sure python2 is on the system
path somewhere).

Once MvL or Guido gives the nod to the latest version, I'll bump it up to
approved.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
2012/2/15 Guido van Rossum <gu...@python.org>:
> So using floats we can match 100ns precision, right?

Nope, not to store an Epoch timestamp newer than January 1987:

>>> x = 2**29; (x + 1e-7) != x   # no loss of precision
True
>>> x = 2**30; (x + 1e-7) != x   # loses precision
False
>>> print(datetime.timedelta(seconds=2**29))
6213 days, 18:48:32
>>> print(datetime.datetime.fromtimestamp(2**29))
1987-01-05 19:48:32

Victor
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
> A data point on this specific use case. The following code throws its
> assert ~90% of the time in Python 3.2.2 on a modern Linux machine
> (assuming "foo" exists and "bar" does not):
>
>     import shutil
>     import os
>
>     shutil.copy2("foo", "bar")
>     assert os.stat("foo").st_mtime == os.stat("bar").st_mtime

It works because Python uses float for utime() and for stat(). But this
assertion may fail if another program checks the file timestamps without
losing precision (to float), e.g. a program written in C that compares the
st_*time and st_*time_ns fields.

> I fixed this in trunk last September (issue 12904); os.utime now
> preserves all the precision that Python currently conveys.

Let's try on an ext4 filesystem:

$ ~/prog/python/timestamp/python
Python 3.3.0a0 (default:35d6cc531800+, Feb 16 2012, 13:32:56)
>>> import decimal, os, shutil, time
>>> open("test", "x").close()
>>> shutil.copy2("test", "test2")
>>> os.stat("test", timestamp=decimal.Decimal).st_mtime
Decimal('1329395871.874886224')
>>> os.stat("test2", timestamp=decimal.Decimal).st_mtime
Decimal('1329395871.873350282')
>>> os.stat("test2", timestamp=decimal.Decimal).st_mtime - os.stat("test", timestamp=decimal.Decimal).st_mtime
Decimal('-0.001535942')

So shutil.copy2() failed to copy the timestamp: test2 is 1 ms older than
test...

Let's try with a program not written in Python: GNU make. The Makefile:

    test2: test
            @echo "Copy test into test2"
            @~/prog/python/default/python -c 'import shutil; shutil.copy2("test", "test2")'

    test:
            @echo "Create test"
            @touch test

    clean:
            rm -f test test2

First try:

$ make clean
rm -f test test2
$ make
Create test
Copy test into test2
$ make
Copy test into test2

=> test2 is always older than test, and so is always regenerated.

Second try:

$ make clean
rm -f test test2
$ make
Create test
Copy test into test2
$ make
make: `test2' is up to date.

=> oh, here test2 is newer or has the exact same modification time, so
there is no need to rebuild it.

Victor
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
> PEP author Victor asked (in
> http://mail.python.org/pipermail/python-dev/2012-February/116499.html):
>> Maybe I missed the answer, but how do you handle timestamps with an
>> unspecified starting point like os.times() or time.clock()? Should we
>> leave these functions unchanged?
>
> If *all* you know is that it is monotonic, then you can't -- but then you
> don't really have resolution either, as the clock may well speed up or
> slow down. If you do have resolution, and the only problem is that you
> don't know what the epoch was, then you can figure that out well enough
> by (once per type per process) comparing it to something that does have
> an epoch, like time.gmtime().

Hum, I suppose that you can expect that time.time() - time.monotonic() is
constant or evolves very slowly. time.monotonic() should return a number of
seconds.

But you are right, monotonic clocks are usually less accurate. On Windows,
QueryPerformanceCounter() is less accurate than GetSystemTimeAsFileTime(),
for example:
http://msdn.microsoft.com/en-us/magazine/cc163996.aspx
(read the "The Issue of Frequency" section)

The documentation of time.monotonic() (a function added to Python 3.3)
should maybe mention the second unit and the accuracy issue.

Victor
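[A sketch of the epoch-recovery trick described above, assuming Python
3.3's time.monotonic(); the offset drifts slowly, so a long-running process
would want to refresh it occasionally:]

    import time

    # Computed once per process: maps the monotonic clock onto the
    # wall-clock epoch.
    _offset = time.time() - time.monotonic()

    def monotonic_as_epoch():
        return time.monotonic() + _offset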
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On Thu, 16 Feb 2012 13:46:18 +0100, Victor Stinner
<victor.stin...@gmail.com> wrote:
> Let's try on an ext4 filesystem:
>
> $ ~/prog/python/timestamp/python
> Python 3.3.0a0 (default:35d6cc531800+, Feb 16 2012, 13:32:56)
> >>> import decimal, os, shutil, time
> >>> open("test", "x").close()
> >>> shutil.copy2("test", "test2")
> >>> os.stat("test", timestamp=decimal.Decimal).st_mtime
> Decimal('1329395871.874886224')
> >>> os.stat("test2", timestamp=decimal.Decimal).st_mtime
> Decimal('1329395871.873350282')

This looks fishy. Floating-point numbers are precise enough to represent
the difference between these two numbers:

>>> f = 1329395871.874886224
>>> f.hex()
'0x1.3cf3e27f7fe23p+30'
>>> g = 1329395871.873350282
>>> g.hex()
'0x1.3cf3e27f7e4f9p+30'

If I run your snippet and inspect the modification times using `stat`, the
difference is much smaller (around 10 ns, not 1 ms):

$ stat test | \grep Modify
Modify: 2012-02-16 13:51:25.643597139 +0100
$ stat test2 | \grep Modify
Modify: 2012-02-16 13:51:25.643597126 +0100

In other words, you should check your PEP implementation for bugs.

Regards

Antoine.
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
> The way Linux does that is to use the time-stamping counter of the
> processor (the rdtsc instruction), which (originally) counts one unit per
> CPU clock. I believe current processors use slightly different countings
> (e.g. through the APIC), but still: you get a resolution within the clock
> frequency of the CPU quartz.

Linux has an internal clocksource API supporting different hardware:

 - PIT (Intel 8253 chipset): configurable frequency between 18.2 Hz and
   1.2 MHz
 - PMTMR (power management timer): ACPI clock with a frequency of 3.5 MHz
 - TSC (Time Stamp Counter): frequency of your CPU
 - HPET (High Precision Event Timer): frequency of at least 10 MHz
   (14.3 MHz on my computer)

Linux has an algorithm to choose the best clock depending on its
performance and accuracy. Most clocks have a frequency higher than 1 MHz,
and so a resolution smaller than 1 µs, even if the clock is not really
accurate. I suppose that you can plug in specialized hardware, like an
atomic clock or a GPS receiver, for better accuracy.

Victor
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
> If I run your snippet and inspect the modification times using `stat`,
> the difference is much smaller (around 10 ns, not 1 ms):
>
> $ stat test | \grep Modify
> Modify: 2012-02-16 13:51:25.643597139 +0100
> $ stat test2 | \grep Modify
> Modify: 2012-02-16 13:51:25.643597126 +0100

The loss of precision is not constant: it depends on the timestamp value.
Another example using the stat program:

    import decimal, os, shutil, time

    try:
        os.unlink("test")
    except OSError:
        pass
    try:
        os.unlink("test2")
    except OSError:
        pass

    open("test", "x").close()
    shutil.copy2("test", "test2")
    print(os.stat("test", timestamp=decimal.Decimal).st_mtime)
    print(os.stat("test2", timestamp=decimal.Decimal).st_mtime)
    print(os.stat("test2", timestamp=decimal.Decimal).st_mtime
          - os.stat("test", timestamp=decimal.Decimal).st_mtime)
    os.system("stat test|grep ^Mod")
    os.system("stat test2|grep ^Mod")

Outputs:

$ ./python x.py
1329398229.918858600
1329398229.918208829
-0.000649771
Modify: 2012-02-16 14:17:09.918858600 +0100
Modify: 2012-02-16 14:17:09.918208829 +0100
$ ./python x.py
1329398230.862858588
1329398230.861343658
-0.001514930
Modify: 2012-02-16 14:17:10.862858588 +0100
Modify: 2012-02-16 14:17:10.861343658 +0100
$ ./python x.py
1329398232.450858570
1329398232.450067044
-0.000791526
Modify: 2012-02-16 14:17:12.450858570 +0100
Modify: 2012-02-16 14:17:12.450067044 +0100
$ ./python x.py
1329398233.090858561
1329398233.090853761
-0.000004800
Modify: 2012-02-16 14:17:13.090858561 +0100
Modify: 2012-02-16 14:17:13.090853761 +0100

The loss of precision is between 1 ms and 4 µs. The Decimal timestamps
display exactly the same values as the stat program: I don't see any bug in
this example.

Victor

PS: Don't try os.utime(Decimal) with my patch: the conversion from Decimal
to _PyTime_t still uses float internally (I know about this issue; it
should be fixed in my patch) and so loses precision ;-)
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On Thursday, 16 February 2012, at 14:20 +0100, Victor Stinner wrote:
>> If I run your snippet and inspect the modification times using `stat`,
>> the difference is much smaller (around 10 ns, not 1 ms):
>>
>> $ stat test | \grep Modify
>> Modify: 2012-02-16 13:51:25.643597139 +0100
>> $ stat test2 | \grep Modify
>> Modify: 2012-02-16 13:51:25.643597126 +0100
>
> The loss of precision is not constant: it depends on the timestamp value.

Well, I've tried several times and I can't reproduce a 1 ms difference.

> The loss of precision is between 1 ms and 4 µs.

It still looks fishy to me. IEEE doubles have a 52-bit mantissa. Since the
integral part of a timestamp takes 32 bits or less, there are still 20 bits
left for the fractional part, which allows for at least 1 µs precision
(2**20 ~= 10**6). A 1 ms precision loss looks like a bug.

Regards

Antoine.
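[One can check the actual spacing of doubles near a 2012 Unix timestamp
directly; a quick sketch:]

    import math

    # Distance to the next representable double above t (the "ulp"):
    # with t in [2**30, 2**31), this is 2**-22, i.e. about 0.24 µs -
    # well below 1 ms.
    t = 1329398400.0
    print(math.ldexp(1.0, math.frexp(t)[1] - 53))  # 2.384185791015625e-07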
Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
I'm away from the source for the next 36 hours. I'll reply with patches by
Saturday morning.

On Thu, 16 Feb 2012 11:01:39 +0100, "Martin v. Löwis" <mar...@v.loewis.de>
wrote:
> [... full quote of Martin's message, above, trimmed ...]

--
 Ned Deily
 n...@acm.org
Re: [Python-Dev] best place for an atomic file API
On 15.02.12 23:16, Charles-François Natali wrote:
> Issue #8604 aims at adding an atomic file API to make it easier to
> create/update files atomically, using rename() on POSIX systems and
> MoveFileEx() on Windows (which are now available through os.replace()).
> It would also use fsync() on POSIX to make sure data is committed to
> disk. For example, it could be used by importlib to avoid races when
> writing bytecode files (issues #13392, #13003, #13146), or more generally
> by any application that wants to make sure it ends up with a consistent
> file even in the face of a crash (e.g. it seems that mercurial
> implemented their own version).

What if the target file is a symlink?
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On 02/15/2012 08:12 PM, Guido van Rossum wrote:
> On Wed, Feb 15, 2012 at 7:28 PM, Larry Hastings <la...@hastings.org> wrote:
>> I fixed this in trunk last September (issue 12904); os.utime now
>> preserves all the precision that Python currently conveys.
>
> So, essentially you fixed this particular issue without having to do
> anything as drastic as the proposed PEP...

I wouldn't say that. The underlying representation is still nanoseconds,
and Python only preserves roughly hundred-nanosecond precision. My patch
only ensures that reading and writing atime/mtime looks consistent to
Python programs using the os module. Any code that examined the
nanosecond-precise values from stat() -- written in Python or any other
language -- would notice that the values didn't match.

I'm definitely +1 on extending Python to represent nanosecond-precision
ctime/atime/mtime, but doing so in a way that permits seamlessly adding
more precision down the road, when the Linux kernel hackers get bored again
and add femtosecond resolution. (And then presumably attosecond resolution
four years later.) I haven't read PEP 410 yet, so I have no opinion on it.

I wrote a patch last year that adds new Decimal ctime/mtime/atime fields to
the output of stat, but it's a horrific performance regression (os.stat is
10x slower) and the reviewers were ambivalent, so I've let it rot.

Anyway, I now agree that we should improve the precision of datetime
objects and use those instead of Decimal. (But not timedeltas --
ctime/mtime/atime are absolute times, not deltas.)

/arry
Re: [Python-Dev] PEP 394 request for pronouncement (python2 symlink in *nix systems)
On Feb 16, 2012, at 09:54 PM, Nick Coghlan wrote:
> It turns out I'd forgotten what was in the PEP - the Notes section
> already contained a lot of suggestions along those lines. I changed the
> title of the section to "Migration Notes", but tried to make it clear
> that those *aren't* consensus recommendations, just ideas distros may
> want to think about when considering making the switch. The updated
> version is live on python.org:
> http://www.python.org/dev/peps/pep-0394/

That section looks great, Nick, thanks. I have one very minor quibble left.
In many places the PEP says something like:

    For the time being, it is recommended that python should refer to
    python2 (however, some distributions have already chosen otherwise;
    see the Rationale and Migration Notes below).

which implies that we may change our recommendation, but never quite says
what the mechanism is for us to do that. You could change the status of
this PEP from Draft to Active, which perhaps implies a little more strongly
that this PEP will be updated should our recommendation ever change. I
suspect it won't though (or at least won't any time soon). If you mark the
PEP as Final, we still have the option of updating the PEP some time later
to reflect new recommendations. It might be worth a quick sentence to that
effect in the PEP.

As I say though, this is a very minor quibble, so just DTRT. +1 and thanks
for your great work on it.

Cheers,
-Barry
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On 14/02/2012 9.58, Stefan Behnel wrote:
> Nick Coghlan, 14.02.2012 05:44:
>> On Tue, Feb 14, 2012 at 2:25 PM, Eli Bendersky wrote:
>>> With the deprecation warning being silent, is there much to lose,
>>> though?
>>
>> Yes, it creates problems for anyone that deliberately converts all
>> warnings to errors when running their test suites. This forces them to
>> spend time switching over to a Python version dependent import of
>> either cElementTree or ElementTree that could have been spent doing
>> something actually productive instead of mere busywork.

If I'm writing code that imports cElementTree on 3.3+, and I explicitly
turn on DeprecationWarnings (that would otherwise be silenced) to check if
I'm doing something wrong, I would like Python to tell me "You don't need
to import that anymore, just use ElementTree." If I'm also converting all
the warnings to errors, it's probably because I really want my code to do
the right thing, and spending 1 minute to add/change two lines of code to
fix this probably won't bother me too much.

Regular users won't even notice the warning, unless they stumble upon the
note in the doc or enable the warnings (and eventually when the module is
removed).

>> And, of course, even people that *don't* convert warnings to errors
>> when running tests will have to make the same switch when the module is
>> eventually removed.

When the module is eventually removed and you didn't warn them in advance,
the situation is going to turn much worse, because their code will suddenly
stop working once they upgrade to the newer version. I don't mind keeping
the module and the warning around for a few versions and giving everyone
enough time to update their imports, but if the module is eventually
removed, I don't want all these developers to come and say "why did you
remove cElementTree without saying anything and break all my code?".

> I'm -1 on emitting a deprecation warning just because cElementTree is
> being replaced by a bare import. That's an implementation detail, just
> like cElementTree should have been an implementation detail in the first
> place. In all currently maintained CPython releases, importing
> cElementTree is the right thing to do for users. From 3.3 the right
> thing will be importing ElementTree, and at some point in the future
> that will be the only way to do it. These days, other Python
> implementations already provide the cElementTree module as a bare alias
> for ElementTree.py anyway, without emitting any warnings. Why should
> CPython be the only one that shouts at users for importing it?

I would look at this from the opposite point of view. Why should the other
Python implementations have to keep around a dummy module due to a CPython
implementation detail? If we all go through a deprecation process, we will
eventually be able to get rid of this.

Best Regards,
Ezio Melotti
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
I personally don't see any reason to drop a module that isn't terminally
broken or unmaintainable, apart from scaring users away by making them
think that we don't care about backward compatibility.
[Python-Dev] PEP for new dictionary implementation
PEP author Mark Shannon wrote (in
http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt):

> ... allows ... (the ``__dict__`` attribute of an object) to share keys
> with other attribute dictionaries of instances of the same class.

Is "the same class" a deliberate restriction, or just a convenience of
implementation? I have often created subclasses (or even families of
subclasses) where instances (as opposed to the type) aren't likely to have
additional attributes. These would benefit from key-sharing across classes,
but I grant that it is a minority use case that isn't worth optimizing for
if it complicates the implementation.

> By separating the keys (and hashes) from the values it is possible to
> share the keys between multiple dictionaries and improve memory use.

Have you timed not storing the hash (in the dict) at all, at least for
(unicode) str-only dicts? Going to the string for its own cached hash
breaks locality a bit more, but saves 1/3 of the memory for combined
tables, and may make a big difference for classes that have relatively few
instances.

> Reduction in memory use is directly related to the number of dictionaries
> with shared keys in existence at any time. These dictionaries are
> typically half the size of the current dictionary implementation.

How do you measure that? The limit for huge N across huge numbers of dicts
should be 1/3 (because both hashes and keys are shared); I assume that gets
swamped by object overhead in typical small dicts.

> If a table is split the values in the keys table are ignored, instead the
> values are held in a separate array.

If they're just dead weight, then why not use them to hold indices into the
array, so that values arrays only have to be as long as the number of keys,
rather than rounding them up to a large-enough power of two? (On average,
this should save half the slots.)

> A combined-table dictionary never becomes a split-table dictionary.

I thought it did (at least temporarily) as part of resizing; are you saying
that it will be re-split by the time another thread is allowed to see it,
so that it is never observed as combined?

Given that this optimization is limited to class instances, I think there
should be some explanation of why you didn't just automatically add slots
for each variable assigned (by hard-coded name) within a method; the keys
would still be stored on the type, and array storage could still be used
for the values; the __dict__ slot could initially be a NULL pointer, and
instance dicts could be added exactly when they were needed, covering only
the oddball keys.

I would reword (or at least reformat) the Cons section; at the moment, it
looks like there are four separate objections, and it seems to be a bit
dismissive towards backwards compatibility. Perhaps something like:

    While this PEP does not change any documented APIs or invariants, it
    does break some de facto invariants.

    C extension modules may be relying on the current physical layout of a
    dictionary. That said, extensions which rely on internals may already
    need to be recompiled with each feature release; there are already
    changes planned for both Unicode (for efficiency) and dicts (for
    security) that would require authors of these extensions to at least
    review their code.

    Because iteration (and repr) order can depend on the order in which
    keys are inserted, it will be possible to construct instances that
    iterate in a different order than they would under the current
    implementation. Note, however, that this will happen very rarely in
    code which does not deliberately trigger the differences, and that
    test cases which rely on a particular iteration order will already
    need to be corrected in order to take advantage of the security
    enhancements being discussed under hash randomization, or for use with
    Jython and PyPy.

-jJ
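[A rough way to observe the per-instance effect of key-sharing from Python
code is to watch the reported size of instance dicts; exact numbers vary by
version and build, so this is only a probe, not a measurement of the shared
keys table itself:]

    import sys

    class Point:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    pts = [Point(i, i) for i in range(3)]
    # With key-sharing dicts, each instance __dict__ only needs a values
    # array; the keys and hashes live once, on the class's shared keys
    # table, so the per-instance size shrinks.
    print([sys.getsizeof(p.__dict__) for p in pts])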
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On 16/02/2012 19.55, Antoine Pitrou wrote:
> On Thu, 16 Feb 2012 19:32:24 +0200, Ezio Melotti
> <ezio.melo...@gmail.com> wrote:
>> If I'm writing code that imports cElementTree on 3.3+, and I explicitly
>> turn on DeprecationWarnings (that would otherwise be silenced) to check
>> if I'm doing something wrong, I would like Python to tell me "You don't
>> need to import that anymore, just use ElementTree." If I'm also
>> converting all the warnings to errors, it's probably because I really
>> want my code to do the right thing, and spending 1 minute to add/change
>> two lines of code to fix this probably won't bother me too much.
>
> But then you're going from a cumbersome situation (where you have to
> import cElementTree and then fall back on regular ElementTree) to an
> even more cumbersome one (where you have to first check the Python
> version, then conditionally import cElementTree, then fall back on
> regular ElementTree).

This is true if you need to support Python <= 3.2, but in the long run this
won't be needed anymore and a plain "import ElementTree" will be enough.

>> When the module is eventually removed and you didn't warn them in
>> advance, the situation is going to turn much worse, because their code
>> will suddenly stop working once they upgrade to the newer version.
>
> Why would we remove the module? It seems supporting it should be mostly
> trivial (it's an alias).

I'm assuming that eventually the module will be removed (maybe for Python
4?), and I don't expect nor want to see it removed in the near future. If
something gets removed, it should be deprecated first, and it's usually
better to deprecate it sooner, so that developers have more time to update
their code. As I proposed on the tracker, though, we could even delay the
deprecation to 3.4 (by that time they might not need to support 3.2
anymore).

>> I would look at this from the opposite point of view. Why should the
>> other Python implementations have to keep around a dummy module due to
>> a CPython implementation detail?
>
> I don't know, but they already have this module, and it certainly costs
> them nothing to keep it.

There will also be a cost if people keep importing cElementTree and falling
back on ElementTree on failure even when this isn't necessary anymore. This
also means that more people will have to fix their code if/when the module
is removed, if they kept using cElementTree. They may also find
cElementTree in old code and tutorials, figure that it's better to use the
C one because it's faster, and keep doing so, because the only warning that
would stop them is hidden in the doc.

I think the problem with DeprecationWarnings being too noisy was fixed by
silencing them; if they are still too noisy, then we need a better
mechanism to warn people who care (and going to check the doc every once in
a while to see if some new doc warning has been added doesn't strike me as
a valid solution).

Best Regards,
Ezio Melotti
Re: [Python-Dev] folding cElementTree behind ElementTree in 3.3
On 17 February 2012 04:55, Antoine Pitrou <solip...@pitrou.net> wrote:
> But then you're going from a cumbersome situation (where you have to
> import cElementTree and then fall back on regular ElementTree) to an
> even more cumbersome one (where you have to first check the Python
> version, then conditionally import cElementTree, then fall back on
> regular ElementTree).

Well, you can reverse the import so you're not relying on version numbers:

    import xml.etree.ElementTree as ElementTree

    try:
        import xml.etree.cElementTree as ElementTree
    except ImportError:
        pass

There is a slight cost compared to previously (always importing the Python
version), and you'll still be using cElementTree directly until it's
removed, but if/when it is removed you won't notice it.

Tim Delaney
[Python-Dev] Store timestamps as decimal.Decimal objects
In http://mail.python.org/pipermail/python-dev/2012-February/116073.html
Nick Coghlan wrote:
> Besides, float128 is a bad example - such a type could just be returned
> directly where we return float64 now. (The only reason we can't do that
> with Decimal is because we deliberately don't allow implicit conversion
> of float values to Decimal values in binary operations.)

If we could really replace float with another type, then there is no reason
that type couldn't be a nearly trivial Decimal subclass which simply flips
the default value of the (never used by any caller) allow_float parameter
to the internal function _convert_other.

Since decimal inherits straight from object, this subtype could even be
made to inherit from float as well, and to store the lower-precision value
there. It could even produce the decimal version lazily, so as to minimize
the slowdown in cases that do not need the greater precision.

Of course, that still doesn't answer questions on whether the higher
precision is a good idea ...

-jJ
Re: [Python-Dev] PEP for new dictionary implementation
On Wed, 08 Feb 2012 19:18:14 +0000, Mark Shannon <m...@hotpy.org> wrote:
> Proposed PEP for new dictionary implementation, PEP 410?, is attached.

So, I'm running a few benchmarks using Twisted's test suite (see
https://bitbucket.org/pitrou/t3k/wiki/Home).

At the end of `python -i bin/trial twisted.internet.test`:
- vanilla 3.3: RSS = 94 MB
- new dict:    RSS = 91 MB

At the end of `python -i bin/trial twisted.python.test`:
- vanilla 3.3: RSS = 31.5 MB
- new dict:    RSS = 30 MB

At the end of `python -i bin/trial twisted.conch.test`:
- vanilla 3.3: RSS = 68 MB
- new dict:    RSS = 42 MB (!)

At the end of `python -i bin/trial twisted.trial.test`:
- vanilla 3.3: RSS = 32 MB
- new dict:    RSS = 30 MB

At the end of `python -i bin/trial twisted.test`:
- vanilla 3.3: RSS = 62 MB
- new dict:    RSS = 78 MB (!)

Runtimes were mostly similar in these test runs.

Perspective broker benchmark (doc/core/benchmarks/tpclient.py and
doc/core/benchmarks/tpserver.py):
- vanilla 3.3: 422 MB/sec
- new dict:    402 MB/sec

Regards

Antoine.
[Python-Dev] plugging the hash attack
In http://mail.python.org/pipermail/python-dev/2012-January/116003.html
Benjamin Peterson wrote:
> 2. It will be off by default in stable releases ... This will prevent
> code breakage ...

2012/1/27 Steven D'Aprano <steve at pearwood.info>:
> ... it will become on by default in some future release?

On Fri, Jan 27, 2012, Benjamin Peterson <benjamin at python.org> wrote:
> Yes, 3.3. The solution in 3.3 could even be one of the more sophisticated
> proposals we have today.

Brett Cannon (Mon Jan 30) wrote:
> I think that would be good. And I would even argue we remove support for
> turning it off to force people to no longer lean on dict ordering as a
> crutch (in 3.3 obviously).

Turning it on by default is fine. Removing the ability to turn it off is
bad. If regression tests fail with Python 3, the easiest thing to do is
just not to migrate to Python 3. Some decisions (certainly around unittest,
but I think even around hash codes) were settled precisely because tests
shouldn't break unless the functionality has really changed. Python 3 isn't
yet so dominant as to change that tradeoff.

I would go so far as to add an extra step to the porting recommendations:
before porting to Python 3.x, run your test suite several times with hash
randomization turned on; any failures at this point are relying on formally
undefined behavior and should be fixed -- but can *probably* be fixed just
by wrapping the results in sorted(). (I would offer a patch to the
porting-to-py3 recommendation, except that I couldn't find any not
associated specifically with 3.0.)

-jJ
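[The typical sorted() fix looks like this - a sketch of a test that relies
on dict iteration order, and its order-independent replacement:]

    d = {"a": 1, "b": 2, "c": 3}

    # Fragile: dict iteration order is not a documented guarantee, and
    # differs under hash randomization (and on Jython/PyPy):
    #     assert list(d) == ["a", "b", "c"]

    # Robust: compare against a canonical ordering instead.
    assert sorted(d) == ["a", "b", "c"]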
Re: [Python-Dev] PEP for new dictionary implementation
On 11.02.2012 22:22, Mark Shannon wrote:
> Antoine Pitrou wrote:
>> Hello Mark, I think the PEP should explain what happens when a keys
>> table needs resizing when setting an object's attribute.
>
> If the object is the only instance of a class, it remains split;
> otherwise the table is combined.

Hi Mark,

Answering on-list is fine, but please do add such answers to the PEP when
requested.

I have such a question also: why does it provide storage for the value slot
in the keys array, where this slot is actually not used?

Regards,
Martin
Re: [Python-Dev] PEP for new dictionary implementation
On 13.02.2012 13:46, Mark Shannon wrote:
> Revised PEP for new dictionary implementation, PEP 412?, is attached.

Committed as PEP 412.

Regards,
Martin
Re: [Python-Dev] PEP for new dictionary implementation
On 16.02.2012 19:24, Jim J. Jewett wrote:
> PEP author Mark Shannon wrote (in
> http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt):
>> ... allows ... (the ``__dict__`` attribute of an object) to share keys
>> with other attribute dictionaries of instances of the same class.
>
> Is "the same class" a deliberate restriction, or just a convenience of
> implementation?

It's about the implementation: the class keeps a pointer to the key set. A
subclass has a separate pointer for that.

> I have often created subclasses (or even families of subclasses) where
> instances (as opposed to the type) aren't likely to have additional
> attributes. These would benefit from key-sharing across classes, but I
> grant that it is a minority use case that isn't worth optimizing for if
> it complicates the implementation.

In particular, the potential savings are small: the instances of the
subclass will share the key sets per class. So if you have S subclasses,
you could save up to S keysets, whereas you are already saving N-S-1
keysets (assuming you have a total of N objects across all classes).

> Have you timed not storing the hash (in the dict) at all, at least for
> (unicode) str-only dicts? Going to the string for its own cached hash
> breaks locality a bit more, but saves 1/3 of the memory for combined
> tables, and may make a big difference for classes that have relatively
> few instances.

I'd be in favor of that, but it is actually an unrelated change: whether or
not you share key sets is unrelated to whether or not str-only dicts drop
the cached hash. Given a dict, it may be tricky to determine whether or not
it is str-only, i.e. what layout to use.

>> Reduction in memory use is directly related to the number of
>> dictionaries with shared keys in existence at any time. These
>> dictionaries are typically half the size of the current dictionary
>> implementation.
>
> How do you measure that? The limit for huge N across huge numbers of
> dicts should be 1/3 (because both hashes and keys are shared); I assume
> that gets swamped by object overhead in typical small dicts.

It's more difficult than that. He also drops the smalltable (which I think
is a good idea), so accounting for how this all plays together is tricky.

>> If a table is split the values in the keys table are ignored, instead
>> the values are held in a separate array.
>
> If they're just dead weight, then why not use them to hold indices into
> the array, so that values arrays only have to be as long as the number of
> keys, rather than rounding them up to a large-enough power of two? (On
> average, this should save half the slots.)

Good idea. However, how do you track per-dict how large the table is?

Regards,
Martin
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
>>> $ stat test | \grep Modify
>>> Modify: 2012-02-16 13:51:25.643597139 +0100
>>> $ stat test2 | \grep Modify
>>> Modify: 2012-02-16 13:51:25.643597126 +0100
>>
>> The loss of precision is not constant: it depends on the timestamp
>> value.
>
> Well, I've tried several times and I can't reproduce a 1 ms difference.
>
>> The loss of precision is between 1 ms and 4 µs.
>
> It still looks fishy to me. IEEE doubles have a 52-bit mantissa. Since
> the integral part of a timestamp takes 32 bits or less, there are still
> 20 bits left for the fractional part, which allows for at least 1 µs
> precision (2**20 ~= 10**6). A 1 ms precision loss looks like a bug.

Oh... there was an important bug in my function used to change the
denominator of a timestamp. I tried to work around integer overflow, but I
introduced a bug in doing so. I changed my patch to use PyLong, which has
no integer overflow issue. Fixed example:

>>> open("test", "x").close()
>>> import shutil
>>> shutil.copy2("test", "test2")
[94386 refs]
>>> print(os.stat("test", datetime.datetime).st_mtime)
2012-02-16 21:58:30.835062+00:00
>>> print(os.stat("test2", datetime.datetime).st_mtime)
2012-02-16 21:58:30.835062+00:00
>>> print(os.stat("test", decimal.Decimal).st_mtime)
1329429510.835061686
>>> print(os.stat("test2", decimal.Decimal).st_mtime)
1329429510.835061789
>>> os.stat("test2", decimal.Decimal).st_mtime - os.stat("test", decimal.Decimal).st_mtime
Decimal('1.03E-7')

So the difference is only 0.1 µs (100 ns). It doesn't change anything about
the Makefile issue: if timestamps differ by a single nanosecond, they are
seen as different by make (or by any other program comparing the timestamps
of two files at nanosecond precision).

Victor
[Python-Dev] Counting collisions for the win
In http://mail.python.org/pipermail/python-dev/2012-January/115715.html Frank Sievertsen wrote:

Am 20.01.2012 13:08, schrieb Victor Stinner: I'm surprised we haven't seen bug reports about it from users of 64-bit Pythons long ago. A Python dictionary only uses the lower bits of a hash value. If your dictionary has fewer than 2**32 items, the dictionary order is exactly the same on 32- and 64-bit systems: hash32(str) & mask == hash64(str) & mask for mask = 2**32-1.

No, that's not true. Whenever a collision happens, other bits are mixed in very fast. Frank

Bits are mixed in quickly from a denial-of-service standpoint, but Victor is correct from a "Why don't the tests already fail?" standpoint. A dict with 2**12 slots, holding over 2700 entries, will be far larger than most test cases -- particularly those with visible output. In a dict that size, 32-bit and 64-bit machines will still probe the same first, second, third, fourth, fifth, and sixth slots. Even in the rare cases when there are at least 6 collisions, the next slots may well be either the same, or close enough that it doesn't show up in a changed iteration order.

-jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
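Jim's "same first six slots" claim is easy to check against a simplified model of the collision-resolution loop in Objects/dictobject.c (the real code differs in details, e.g. it applies the mask only when indexing; the hash values below are arbitrary, chosen so the 64-bit one shares its low 32 bits with the 32-bit one):

    def probes(h, mask, n=6):
        # simplified model of CPython's probing: i = i*5 + perturb + 1,
        # with perturb starting as the full hash and losing 5 bits per step
        i = h & mask
        perturb = h
        out = [i]
        for _ in range(n - 1):
            i = (i * 5 + perturb + 1) & mask
            perturb >>= 5
            out.append(i)
        return out

    h32 = 0x9e3779b9                  # a hash as a 32-bit build sees it
    h64 = 0xdeadbeef9e3779b9          # same low 32 bits on a 64-bit build
    print(probes(h32, 2**12 - 1, 8))
    print(probes(h64, 2**12 - 1, 8))  # first six probes agree; with a
                                      # 2**12 mask the seventh diverges,
                                      # once perturb >> 25 exposes high bits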
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On Thu, Feb 16, 2012 at 2:04 PM, Victor Stinner victor.stin...@gmail.com wrote: It doesn't change anything to the Makefile issue: if timestamps are different in a single nanosecond, they are seen as different by make (or by another program comparing the timestamps of two files using nanosecond precision). But make doesn't compare timestamps for equality -- it compares for newer. That shouldn't be so critical, since if there is an *actual* causal link between file A and B, the difference in timestamps should always be much larger than 100 ns. -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
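Guido's distinction in executable form -- a sketch of the decision a make-like tool performs (the helper name is invented; this is not make's actual source):

    import os

    def needs_rebuild(target, dependency):
        # make rebuilds when the target is missing or strictly older than a
        # dependency; it never asks whether the two mtimes are exactly equal.
        try:
            target_mtime = os.stat(target).st_mtime
        except OSError:
            return True
        return target_mtime < os.stat(dependency).st_mtime

Under this rule a copy whose timestamp came out 100 ns *older* than its source does look out of date, which is exactly the failure mode Victor describes in the next message.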
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
2012/2/16 Guido van Rossum gu...@python.org: On Thu, Feb 16, 2012 at 2:04 PM, Victor Stinner victor.stin...@gmail.com wrote: It doesn't change anything to the Makefile issue: if timestamps are different in a single nanosecond, they are seen as different by make (or by another program comparing the timestamps of two files using nanosecond precision). But make doesn't compare timestamps for equality -- it compares for newer. That shouldn't be so critical, since if there is an *actual* causal link between file A and B, the difference in timestamps should always be much larger than 100 ns.

The problem is that shutil.copy2() sometimes produces an *older* timestamp :-/ As shown in my previous email: in such a case, make will always rebuild the second file instead of building it only once. Example with two consecutive runs:

$ ./python diff.py
1329432426.650957952
1329432426.650958061
1.09E-7
$ ./python diff.py
1329432427.854957910
1329432427.854957819
-9.1E-8

Victor ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
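Victor's diff.py is not shown in the thread; a plausible reconstruction, assuming the patched interpreter from the PEP 410 implementation (where os.stat() accepts a timestamp type argument, as in the interactive session quoted earlier), would be:

    # Hypothetical reconstruction -- stock Python's os.stat() takes no
    # timestamp type argument; this only runs on the PEP 410 patched build.
    import decimal, os, shutil

    open("test", "w").close()
    shutil.copy2("test", "test2")
    m1 = os.stat("test", decimal.Decimal).st_mtime
    m2 = os.stat("test2", decimal.Decimal).st_mtime
    print(m1)
    print(m2)
    print(m2 - m1)   # the sign flips from run to run, as the output shows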
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On Thu, Feb 16, 2012 at 2:48 PM, Victor Stinner victor.stin...@gmail.com wrote: 2012/2/16 Guido van Rossum gu...@python.org: On Thu, Feb 16, 2012 at 2:04 PM, Victor Stinner victor.stin...@gmail.com wrote: It doesn't change anything to the Makefile issue: if timestamps are different in a single nanosecond, they are seen as different by make (or by another program comparing the timestamps of two files using nanosecond precision). But make doesn't compare timestamps for equality -- it compares for newer. That shouldn't be so critical, since if there is an *actual* causal link between file A and B, the difference in timestamps should always be much larger than 100 ns. The problem is that shutil.copy2() sometimes produces an *older* timestamp :-/ As shown in my previous email: in such a case, make will always rebuild the second file instead of building it only once. Example with two consecutive runs: $ ./python diff.py 1329432426.650957952 1329432426.650958061 1.09E-7 $ ./python diff.py 1329432427.854957910 1329432427.854957819 -9.1E-8

Have you been able to reproduce this with an actual Makefile? What's the scenario? I'm thinking of a Makefile like this:

a:
        cp /dev/null a

b: a
        cp a b

Now say a doesn't exist and we run make b. This will create a and then b. I can't believe that the difference between the mtimes of a and b is so small that if you copy the directory containing Makefile, a and b using a Python tool that reproduces mtimes only with usec accuracy you'll end up with a directory where a is newer than b. What am I missing? -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
The problem is that shutil.copy2() sometimes produces an *older* timestamp :-/ (...) Have you been able to reproduce this with an actual Makefile? What's the scenario?

Hum. I asked the Internet who uses shutil.copy2() and I found an old issue (Decimal('43462967.173053') seconds ago):

Python issue #10148: st_mtime differs after shutil.copy2 (October 2010). When copying a file with shutil.copy2() between two ext4 filesystems on 64-bit Linux, the mtime of the destination file is different after the copy. It appears as if the resolution is slightly different, so the mtime is truncated slightly. (...) I don't know if it is a theoretical or practical issue.

Then I found:

Python issue #11941: Support st_atim, st_mtim and st_ctim attributes in os.stat_result. They would expose relevant functionality from libc's stat() and provide better precision than floating-point-based st_atime, st_mtime and st_ctime attributes.

Which is connected to the issue that motivated me to write the PEP:

Python issue #11457: os.stat(): add new fields to get timestamps as Decimal objects with nanosecond resolution. Support for such precision is available at the least on 2.6 Linux kernels. This is important for example with the tarfile module with the pax tar format. The POSIX tar standard[3] mandates storing the mtime in the extended header (if it is not an integer) with as much precision as is available in the underlying file system, and likewise to restore this time properly upon extraction. Currently this is not possible. The mailbox module would benefit from having this precision available.

For the tarfile use case, we need at least a way to get the modification time with a nanosecond resolution *and* to set the modification time with a nanosecond resolution. We just need to decide which type is the best for this use case, which is the purpose of PEP 410 :-)

Another use case of nanosecond timestamps is profilers (and maybe benchmark tools). The profiler itself may be implemented in a different language than Python. For example, DTrace uses nanosecond timestamps.

-- Other examples.

Debian bug #627460: (gcp) Expose nanoseconds in python (15 May 2011) http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=627460

Debian bug #626787: (gcp) gcp: timestamp is not always copied exact http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=626787 When copying a (large) file from HDD to USB the file's timestamp is not copied exactly. It seems to work fine with smaller files (up to 1 Gig); I couldn't spot the time-diff on these files. (gcp is a grid-enabled version of the scp copy command.)

fuse-python supports nanosecond resolution: they chose to mimic the C API using:

    class Timespec(FuseStruct):
        """Cf. struct timespec in time.h:
        http://www.opengroup.org/onlinepubs/009695399/basedefs/time.h.html
        """
        def __init__(self, name=None, **kw):
            self.tv_sec = None
            self.tv_nsec = None
            kw['name'] = name
            FuseStruct.__init__(self, **kw)

Python issue #9079: Make gettimeofday available in time module ... exposes gettimeofday as time.gettimeofday() returning a (sec, usec) pair.

The Oracle database supports timestamps with a nanosecond resolution.

A related article about Ruby: http://marcricblog.blogspot.com/2010/04/who-cares-about-nanosecond.html Files are uploaded in groups (fifteen maximum). It was important to know the order in which files have been uploaded. Depending on the size of the files and users’ internet broadband capacity, some files could be uploaded in the same second.

And a last one, just for fun: This Week in Python Stupidity: os.stat, os.utime and Sub-Second Timestamps (November 15, 2009) http://ciaranm.wordpress.com/2009/11/15/this-week-in-python-stupidity-os-stat-os-utime-and-sub-second-timestamps/ Yup, that’s right, Python’s underlying type for floats is an IEEE 754 double, which is only good for about sixteen decimal digits. With ten digits before the decimal point, that leaves six for sub-second resolutions, which is three short of the range required to preserve POSIX nanosecond-resolution timestamps. With dates after the year 2300 or so, that leaves only five accurate digits, which isn’t even enough to deal with microseconds correctly. Brilliant. Python does have a half-assed fixed point type. Not sure why they don’t use it more.

Victor ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
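The blog post's arithmetic is easy to check directly (assuming IEEE 754 doubles, which CPython floats are on all mainstream platforms; the exact printed digits may vary slightly by build):

    t = 10413792000 + 0.123456789   # roughly a year-2300 POSIX timestamp
    print("%.9f" % t)               # e.g. 10413792000.123456955 -- the
                                    # trailing nanosecond digits are gone;
                                    # resolution here is about a microsecond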
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
So, make is unaffected. In my first post on this subject I already noted that the only real use case is making a directory or filesystem copy and then verifying that the copy is identical using native tools that compare times with nsec precision. At least one of the bugs you quote is about the current 1-second granularity, which is already addressed by using floats (up to ~usec precision). The fs copy use case should be pretty rare, and I would be okay with a separate lower-level API that uses a long to represent nanoseconds (though MvL doesn't like that either). Using (seconds, nsec) tuples is silly though. --Guido

On Thu, Feb 16, 2012 at 4:04 PM, Victor Stinner victor.stin...@gmail.com wrote: (...)
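The "long to represent nanoseconds" alternative Guido mentions is trivially lossless in Python, since ints are unbounded (a sketch only; the variable names are hypothetical, not an existing API):

    mtime_ns = 1329432426650957952        # one integer: no rounding anywhere
    sec, nsec = divmod(mtime_ns, 10**9)
    assert (sec, nsec) == (1329432426, 650957952)
    assert sec * 10**9 + nsec == mtime_ns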
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
On Wed, Feb 15, 2012 at 11:39 AM, Guido van Rossum gu...@python.org wrote: Maybe it's okay to wait a few years on this, until either 128-bit floats are more common or cDecimal becomes the default floating point type? +1 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for new dictionary implementation
On Thu, Feb 16, 2012 at 4:34 PM, Martin v. Löwis mar...@v.loewis.de wrote: Am 16.02.2012 19:24, schrieb Jim J. Jewett: PEP author Mark Shannon wrote (in http://mail.python.org/pipermail/python-dev/attachments/20120208/05be469a/attachment.txt): ... allows ... (the ``__dict__`` attribute of an object) to share keys with other attribute dictionaries of instances of the same class. Is "the same class" a deliberate restriction, or just a convenience of implementation? It's about the implementation: the class keeps a pointer to the key set. A subclass has a separate pointer for that.

I would prefer to see that reason in the PEP; after a few years, I have trouble finding email, even when I remember reading the conversation.

Have you timed not storing the hash (in the dict) at all, at least for (unicode) str-only dicts? Going to the string for its own cached hash breaks locality a bit more, but saves 1/3 of the memory for combined tables, and may make a big difference for classes that have relatively few instances. I'd be in favor of that, but it is actually an unrelated change: whether or not you share key sets is unrelated to whether or not str-only dicts drop the cached hash.

Except that the biggest arguments against it are that it breaks cache locality, and it changes the dictentry struct -- which this patch already does anyway.

Given a dict, it may be tricky to determine whether or not it is str-only, i.e. what layout to use.

Isn't that exactly the same determination needed when deciding whether or not to use lookdict_unicode? (It would make the switch to the more general lookdict more expensive, as that would involve a new allocation.)

Reduction in memory use is directly related to the number of dictionaries with shared keys in existence at any time. These dictionaries are typically half the size of the current dictionary implementation. How do you measure that? The limit for huge N across huge numbers of dicts should be 1/3 (because both hashes and keys are shared); I assume that gets swamped by object overhead in typical small dicts. It's more difficult than that. He also drops the smalltable (which I think is a good idea), so accounting for how this all plays together is tricky.

All the more reason to explain in the PEP how he measured or approximated it.

If a table is split the values in the keys table are ignored; instead the values are held in a separate array. If they're just dead weight, then why not use them to hold indices into the array, so that values arrays only have to be as long as the number of keys, rather than rounding them up to a large-enough power-of-two? (On average, this should save half the slots.) Good idea. However, how do you track per-dict how large the table is?

Why would you want to? The per-instance array needs to be at least as large as the highest index used by any key for which it has a value; if the keys table gets far larger (or even shrinks), that doesn't really matter to the instance. What does matter to the instance is getting a value of its own for a new (to it) key -- and then the keys table can tell it which index to use, which in turn tells it whether or not it needs to grow the array.

Or are you thinking of len(o.__dict__), which will indeed be a bit slower? That will happen with split dicts and potentially missing values, regardless of how much memory is set aside (or not) for the missing values.
-jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
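A toy sketch of Jim's suggestion (invented names, not the PEP's design): the shared keys table maps key -> index, and each instance keeps a values list exactly as long as it needs. In Python the list's len() answers Martin's "how large is the array?" question; a C implementation would have to store that size in the dict or derive it from the keys table, which is exactly the bookkeeping the follow-up message asks about.

    class SharedKeys:
        def __init__(self):
            self.index = {}                    # key -> slot number

        def slot_for(self, key):
            # assign the next free slot the first time any instance uses key
            return self.index.setdefault(key, len(self.index))

    class SplitDict:
        def __init__(self, shared):
            self.shared = shared
            self.values = []                   # grows lazily, per instance

        def __setitem__(self, key, value):
            i = self.shared.slot_for(key)
            while len(self.values) <= i:       # grow only as far as needed
                self.values.append(None)
            self.values[i] = value

        def __getitem__(self, key):
            i = self.shared.index[key]         # KeyError if no class uses key
            if i >= len(self.values) or self.values[i] is None:
                raise KeyError(key)
            return self.values[i]

    # Note: None doubles as the "missing" sentinel here; a real
    # implementation would need a distinct sentinel, since None is a
    # perfectly legal attribute value.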
Re: [Python-Dev] [Python-checkins] cpython: Disabling a test that fails on some bots. Will investigate the failure soon
On Fri, Feb 17, 2012 at 2:09 AM, eli.bendersky python-check...@python.org wrote:

diff --git a/Lib/test/test_xml_etree_c.py b/Lib/test/test_xml_etree_c.py
--- a/Lib/test/test_xml_etree_c.py
+++ b/Lib/test/test_xml_etree_c.py
@@ -53,8 +53,8 @@
         # actual class. In the Python version it's a class.
         self.assertNotIsInstance(cET.Element, type)

-    def test_correct_import_cET_alias(self):
-        self.assertNotIsInstance(cET_alias.Element, type)
+    #def test_correct_import_cET_alias(self):
+    #    self.assertNotIsInstance(cET_alias.Element, type)

While this one was fixed quickly, *please* don't comment tests out without some kind of explanation in the code (not just in the checkin message). Even better is to use the expected_failure() decorator or the skip() decorator. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
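For reference, the pattern Nick is asking for looks like this (the issue number is a placeholder, and cET_alias is imported elsewhere in the real test module; the second test is purely illustrative):

    import unittest

    class CETAliasTests(unittest.TestCase):

        @unittest.skip("fails on some bots; see tracker issue NNNN")
        def test_correct_import_cET_alias(self):
            # the decorator keeps this from running, but the reason stays
            # visible both in the code and in the test report
            self.assertNotIsInstance(cET_alias.Element, type)

        @unittest.expectedFailure
        def test_known_bug(self):
            self.assertEqual(1, 2)   # recorded as an expected failure

    if __name__ == "__main__":
        unittest.main()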
Re: [Python-Dev] [Python-checkins] cpython: Disabling a test that fails on some bots. Will investigate the failure soon
On Fri, Feb 17, 2012 at 05:50, Nick Coghlan ncogh...@gmail.com wrote: On Fri, Feb 17, 2012 at 2:09 AM, eli.bendersky python-check...@python.org wrote:

diff --git a/Lib/test/test_xml_etree_c.py b/Lib/test/test_xml_etree_c.py
--- a/Lib/test/test_xml_etree_c.py
+++ b/Lib/test/test_xml_etree_c.py
@@ -53,8 +53,8 @@
         # actual class. In the Python version it's a class.
         self.assertNotIsInstance(cET.Element, type)

-    def test_correct_import_cET_alias(self):
-        self.assertNotIsInstance(cET_alias.Element, type)
+    #def test_correct_import_cET_alias(self):
+    #    self.assertNotIsInstance(cET_alias.Element, type)

While this one was fixed quickly, *please* don't comment tests out without some kind of explanation in the code (not just in the checkin message). Even better is to use the expected_failure() decorator or the skip() decorator.

I just saw this test failing in some bots and wanted to fix it ASAP, without spending time on a real investigation. The follow-up fix came less than 2 hours later. But yes, I agree that commenting out wasn't a good choice - I should've just deleted it while I was working on a fix. By the way, I later discussed the failing test with Florent and http://bugs.python.org/issue14035 is the result. That failure had made no sense until Florent got deeper into import_fresh_module. Eli ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for new dictionary implementation
Good idea. However, how do you track per-dict how large the table is? Why would you want to? The per-instance array needs to be at least as large as the highest index used by any key for which it has a value; if the keys table gets far larger (or even shrinks), that doesn't really matter to the instance. What does matter to the instance is getting a value of its own for a new (to it) key -- and then the keys table can tell it which index to use, which in turn tells it whether or not it needs to grow the array. To determine whether it needs to grow the array, it needs to find out how large the array is, no? So: how do you do that? Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PEP 394 accepted
As the PEP czar for PEP 394, I have reviewed it and am happy to say that I can accept it. I suppose that Nick will keep track of actually implementing it in Python 2.7. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 410 (Decimal timestamp): the implementation is ready for a review
Am 16.02.2012 11:14, schrieb Martin v. Löwis: Am 16.02.2012 10:51, schrieb Victor Stinner: 2012/2/16 Martin v. Löwis mar...@v.loewis.de: Maybe an alternative PEP could be written that supports the filesystem copying use case only, using some specialized ns APIs? I really think that all you need is st_{a,c,m}time_ns fields and os.utime_ns(). I'm -1 on that, because it will make people write complicated code. Python 3.3 *already has* APIs for nanosecond timestamps: os.utimensat(), os.futimens(), signal.sigtimedwait(), etc. These functions expect a (seconds: int, nanoseconds: int) tuple. I'm -1 on adding these APIs, also. Since Python 3.3 is not released yet, it's not too late to revert them. +1. Georg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
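A sketch of why the tuple-based representation invites the "complicated code" Martin objects to: any arithmetic on a (seconds, nanoseconds) pair forces callers to handle the carry by hand (the helper below is hypothetical, not part of the stdlib):

    def add_ns(ts, delta_ns):
        # ts is a (seconds, nanoseconds) pair of ints, as expected by the
        # 3.3-era tuple APIs under discussion
        sec, nsec = ts
        return divmod(sec * 10**9 + nsec + delta_ns, 10**9)

    print(add_ns((1329429510, 999999900), 200))
    # (1329429511, 100) -- forget the carry and the result is nonsense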