Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Nick Craig-Wood
On Sun, Oct 01, 2006 at 02:01:51PM -0400, Jean-Paul Calderone wrote:
 Each line in an interactive session is compiled separately, like modules
 are compiled separately.  With the current implementation, literals in a
 single compilation unit have a chance to be cached like this.  Literals
 in different compilation units, even for the same value, don't.

That makes sense - thanks for the explanation!
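For anyone following along, the effect is easy to see interactively (a minimal
illustration; whether equal literals within one code object share a float
object is a CPython implementation detail, not a language guarantee):

    # One compilation unit: the two 0.0 literals are typically folded into
    # a single constant object.
    x = 0.0; y = 0.0
    print(x is y)        # usually True on CPython

    # Entered as two separate interactive lines, each line is compiled on
    # its own, so the literals become distinct objects:
    #   >>> x = 0.0
    #   >>> y = 0.0
    #   >>> x is y
    #   False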

-- 
Nick Craig-Wood [EMAIL PROTECTED] -- http://www.craig-wood.com/nick
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 355 status

2006-10-02 Thread Jason Orendorff
On 9/30/06, Giovanni Bajo [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  OK. Pronouncement: PEP 355 is dead. [...]

 It would be terrific if you gave us some clue about what is
 wrong in PEP355, [...]

Here are my guesses.  I believe Guido rejected this PEP for a lot of reasons.

By the way, what I'm about to do is known as channeling Guido
(badly) and I'm pretty sure it annoys him.  Sorry, Guido.  Please
don't treat the following as authoritative; I have never met Guido and
obviously I cannot speak for him.

- I don't think Guido ever saw much benefit from path objects.  That
is, the Motivation was not compelling.  I think the main motivation is
to eliminate some clutter and add a handful of useful methods to the
stdlib, so it's easy to see how this could be the case.

- Guido just flat-out didn't like the looks of the PEP.  Too much
weirdness.  (path.py contains more weirdness, including some stuff
Guido particularly disliked, and I think it's fair to say that PEP355
suffered somewhat by association.)

- Any proposal to add a Second Way To Do It has to meet a very high
standard.  PEP355 was too big to be considered an incremental change.
Yet it didn't even attempt to fix all the perceived problems with the
existing APIs.  A more thorough job would have had a better chance.

- Nobody liked the API design--too many methods.

- Now we're hearing rumors of better ideas out there, which comes as a relief.

I suspect any one of these could have scuttled the proposal.

-j
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 315 - do while

2006-10-02 Thread Nick Coghlan
Hans Polak wrote:
 Hi Nick,
 
 Yep, PEP 315. Sorry about that.
 
 Now, about your suggestion 
   do:
       <setup code>
   while <condition>
       <loop body>
   else:
       <loop completion code>
 
 This is pythonic, but not logical. The 'do' will execute at least once, so
 the else clause is not needed, nor is the loop completion code.  The loop
 body should go before the while terminator.

This objection is based on a misunderstanding of what the else clause is for 
in a Python loop. The else clause is only executed if the loop terminated 
naturally (the exit condition became false) rather than being explicitly 
terminated using a break statement.

This behaviour is most commonly useful when using a for loop to search through 
an iterable (breaking when the object is found, and using the else clause to 
handle the 'not found' case), but it is also defined for while loops.
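A minimal sketch of that search idiom (a hypothetical helper, purely for
illustration):

    def find_index(seq, target):
        # The else clause runs only if the loop finishes without a break.
        for i, value in enumerate(seq):
            if value == target:
                break           # found: skip the else clause
        else:
            return -1           # iterable exhausted: the 'not found' case
        return i

The same pattern works with a while loop, which is why the do/while variants
above keep the else clause.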

Regards,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Martin v. Löwis
Nick Coghlan schrieb:
  Right. Although I do wonder what kind of software people write to run
  into this problem. As Guido points out, the numbers must be the result
  from some computation, or created by an extension module by different
  means. If people have many *simultaneous* copies of 0.0, I would expect
  there is something else really wrong with the data structures or
  algorithms they use.
 
 I suspect the problem would typically stem from floating point values
 that are read in from a human-readable file rather than being the result
 of a 'calculation' as such:

That's how you can end up with 100 different copies of 0.0. But
apparently, people are creating millions of them, and keeping them in
memory simultaneously. Unless the text file *only* consists of floating
point numbers, I would expect they have bigger problems than that.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Martin v. Löwis
Kristján V. Jónsson schrieb:
 Well, a lot of extension code, like ours use PyFloat_FromDouble(foo);
 This can be from vectors and stuff.

Hmm. If you get a lot of 0.0 values from vectors and stuff, I would
expect that memory usage is already high.

In any case, a module that creates a lot of copies of 0.0 that way
could do its own caching, right?

 Very often these are values from a database.  Integral float values
 are very common in such cases and it didn't occur to me that they
 weren't being reused, at least for small values.

Sure - but why are people keeping them in memory all the time?
Also, isn't it a mis-design of the database if you have many float
values in it that represent natural numbers? Shouldn't you use
a more appropriate data type, then?

 Also, a lot of arithmetic involving floats is expected to end in
 integers, like computing some index from a float value.  Integers get
 promoted to floats when touched by them, as you know.

Again, sounds like a programming error to me.

 Anyway, I now precreate integral values from -10 to 10 with great
 effect.  The cost is minimal, the benefit great.

In an extension module, the knowledge about the application domain
is larger, so it may be reasonable to do the caching there. I would
still expect that in the typical application where this is an issue,
there is some kind of larger design bug.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Martin v. Löwis
Kristján V. Jónsson schrieb:
 I can't see how this situation is any different from the re-use of
 low ints.  There is no fundamental law that says that ints below 100
 are more common than other, yet experience shows that  this is so,
 and so they are reused.

There are two important differences:
1. it is possible to determine whether the value is special in
   constant time, and also fetch the singleton value in constant
   time for ints; the same isn't possible for floats.
2. it may be that there is a loss of precision in reusing an existing
   value (although I'm not certain that this could really happen).
   For example, could it be that two values compare successful in
   ==, yet are different values? I know this can't happen for
   integers, so I feel much more comfortable with that cache.

 Rather than to view this as a programming error, why not simply
 accept that this is a recurring pattern and adjust python to be more
 efficient when faced by it?  Surely a lot of karma lies that way?

I'm worried about the penalty that this causes in terms of run-time
cost. Also, how do you choose what values to cache?

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Kristján V . Jónsson
I see, you are thinking of the general fractional case.
My point was that whole numbers seem to pop up often and to reuse those is
easy.  I did a test of tracking actual floating point numbers, and the
majority of heavy usage comes from integral values.  It would indeed be
strange if some fractional number were heavily used, but it can be argued
that integral ones are special in many ways.
Anyway, Skip noted that 50% of all floats are whole numbers between -10 and 10
inclusive, and this is the code that I employ in our python build today:

PyObject *
PyFloat_FromDouble(double fval)
{
    register PyFloatObject *op;
    int ival;
    if (free_list == NULL) {
        if ((free_list = fill_free_list()) == NULL)
            return NULL;
        /* CCP addition, cache common values */
        if (!f_reuse[0]) {
            int i;
            for (i = 0; i < 21; i++)
                f_reuse[i] = PyFloat_FromDouble((double)(i - 10));
        }
    }
    /* CCP addition, check for recycling */
    ival = (int)fval;
    if ((double)ival == fval && ival >= -10 && ival <= 10) {
        ival += 10;
        if (f_reuse[ival]) {
            Py_INCREF(f_reuse[ival]);
            return f_reuse[ival];
        }
    }
    ...


Cheers,

Kristján

 -Original Message-
 From: Martin v. Löwis [mailto:[EMAIL PROTECTED] 
 Sent: 2. október 2006 14:37
 To: Kristján V. Jónsson
 Cc: Bob Ippolito; python-dev@python.org
 Subject: Re: [Python-Dev] Caching float(0.0)
 
 Kristján V. Jónsson schrieb:
  I can't see how this situation is any different from the re-use of low
  ints.  There is no fundamental law that says that ints below 100 are
  more common than other, yet experience shows that this is so, and so
  they are reused.
 
 There are two important differences:
 1. it is possible to determine whether the value is special in
constant time, and also fetch the singleton value in constant
time for ints; the same isn't possible for floats.
 2. it may be that there is a loss of precision in reusing an existing
value (although I'm not certain that this could really happen).
For example, could it be that two values compare successful in
==, yet are different values? I know this can't happen for
integers, so I feel much more comfortable with that cache.
 
  Rather than to view this as a programming error, why not simply accept
  that this is a recurring pattern and adjust python to be more
  efficient when faced by it?  Surely a lot of karma lies that way?
 
 I'm worried about the penalty that this causes in terms of
 run-time cost. Also, how do you choose what values to cache?
 
 Regards,
 Martin
 
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Michael Hudson
Martin v. Löwis [EMAIL PROTECTED] writes:

 Kristján V. Jónsson schrieb:
 I can't see how this situation is any different from the re-use of
 low ints.  There is no fundamental law that says that ints below 100
 are more common than other, yet experience shows that  this is so,
 and so they are reused.

 There are two important differences:
 1. it is possible to determine whether the value is special in
constant time, and also fetch the singleton value in constant
time for ints; the same isn't possible for floats.

I don't think you mean constant time here, do you?  I think most of
the code posted so far has been constant time, at least in terms of
instruction count, though some might indeed be fairly slow on some
processors -- conversion from double to integer on the PowerPC
involves a trip off to memory for example.  Even so, everything should
be fairly efficient compared to allocation, even with PyMalloc.

 2. it may be that there is a loss of precision in reusing an existing
value (although I'm not certain that this could really happen).
For example, could it be that two values compare successful in
==, yet are different values? I know this can't happen for
integers, so I feel much more comfortable with that cache.

I think the only case is that the two zeros compare equal, which is
unfortunate given that it's the most compelling value to cache...

I don't know a reliable and fast way to distinguish +0.0 and -0.0.
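(For what it's worth, the sign of a zero is at least observable from Python
code -- nowhere near fast enough for the allocation path, and assuming the
platform has IEEE-754 signed zeros -- e.g.:)

    import math, struct

    def is_negative_zero(x):
        # atan2 is sign-aware: atan2(0.0, -0.0) == pi, atan2(0.0, +0.0) == 0.0
        return x == 0.0 and math.atan2(0.0, x) != 0.0

    def is_negative_zero_bits(x):
        # Alternatively, compare the raw IEEE-754 bytes; only the sign bit
        # differs between +0.0 and -0.0.
        return x == 0.0 and struct.pack('<d', x) != struct.pack('<d', 0.0)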

Cheers,
mwh

-- 
  The bottom tier is what a certain class of wanker would call
  business objects ...  -- Greg Ward, 9 Dec 1999
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Martin v. Löwis
Michael Hudson schrieb:
 1. it is possible to determine whether the value is special in
constant time, and also fetch the singleton value in constant
time for ints; the same isn't possible for floats.
 
 I don't think you mean constant time here do you?  

Right; I really wondered whether the code was dependent or independent
of the number of special-case numbers.

 I think most of
 the code posted so far has been constant time, at least in terms of
 instruction count, though some might indeed be fairly slow on some
 processors -- conversion from double to integer on the PowerPC
 involves a trip off to memory for example.

Kristján's code testing only for integers in a range would be of
that kind. Code that tests for a list of literals determined
at compile time typically needs time linear in the number of
special-cased constants (of course, as there is a fixed
number of constants, this is O(1)).

 2. it may be that there is a loss of precision in reusing an existing
value (although I'm not certain that this could really happen).
For example, could it be that two values compare successful in
==, yet are different values? I know this can't happen for
integers, so I feel much more comfortable with that cache.
 
 I think the only case is that the two zeros compare equal, which is
 unfortunate given that it's the most compelling value to cache...

Thanks for pointing that out. I can believe this is the only case
in IEEE-754; I also wonder whether alternative implementations
could cause problems (although I don't really worry too much
about VMS).

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Aahz
On Mon, Oct 02, 2006, Martin v. Löwis wrote:
 Michael Hudson schrieb:

 I think most of
 the code posted so far has been constant time, at least in terms of
 instruction count, though some might indeed be fairly slow on some
 processors -- conversion from double to integer on the PowerPC
 involves a trip off to memory for example.
 
 Kristján's code testing only for integers in a range would be of
 that kind. Code that tests for a list of literals determined
 at compile time typically needs time linear in the number of
 special-cased constants (of course, as there is a fixed
 number of constants, this is O(1)).

What if we do this work only on float()?
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

LL YR VWL R BLNG T S  -- www.nancybuttons.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Tim Hochberg
[EMAIL PROTECTED] wrote:
 Steve> By these statistics I think the answer to the original question
 Steve> is clearly no in the general case.
 
 As someone else (Guido?) pointed out, the literal case isn't all that
 interesting.  I modified floatobject.c to track a few interesting
 floating point values:
 
[...code...]
 
 So for a largely non-floating point application, a fair number of floats
 are allocated, a bit over 25% of them are -1.0, 0.0 or +1.0, and nearly 50%
 of them are whole numbers between -10.0 and 10.0, inclusive.
 
 Seems like it at least deserves a serious look.  It would be nice to have
 the numeric crowd contribute to this subject as well.

As a representative of the numeric crowd, I'll say that I've never 
noticed this to be a problem. I suspect that it's a non-issue since we 
generally store our numbers in arrays, not big piles of Python floats, 
so there's no opportunity for identical floats to pile up.

-tim

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread Brett Cannon
In the interest of time I have decided to go ahead and do the PEP 302 phase 2
work in C.  I fully expect to tackle rewriting import in Python in my spare
time after I finish this work, since I will be much more familiar with how the
whole import machinery works and it sounds like a fun challenge.

The branch for the work is in pep302_phase2.  Any help would be appreciated in
this work.  I plan on keeping a BRANCH_PLANS file that outlines the
what/why/how of the whole thing.

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread Phillip J. Eby
At 01:01 PM 10/2/2006 -0700, Brett Cannon wrote:
In the interest of time I have decided to go ahead and do the PEP 302 
phase 2 work in C.

Just FYI, it's not possible (so far as I know) to implement phase 2 while 
maintaining backward compatibility with existing 2.x code.  So this work 
shouldn't go back to the 2.x trunk without discussion of those issues.

Essentially, I abandoned trying to do the phase 2 work for Python 2.5 
because there's too much code in the field that depends on the current 
order of when special/built-in imports are processed vs. when PEP 302 
imports are processed.  Thus, instead of adding new PEP 302 APIs (like 
get_loader) to 'imp', I added them to 'pkgutil'.  There are, I believe, 
some notes in that module's source regarding what the ordering issues are 
w/meta_path vs. the way import works now.

That having been said, we could possibly have a transition for 2.6, but 
everybody who's written any PEP 302 emulation code (outside of pkgutil 
itself) would have to adapt their code somewhat.

I'm surprised, however, that you think working on this in C is going to be 
*less* time than it would take to simply replace __import__ with a Python 
function that reimplements PEP 302...  especially since pkgutil contains a 
whole lot of the code you'd need, e.g.:


 def __import__(...):
 ...
 loader = pkgutil.find_loader(fullname)
 if loader is not None:
 module = loader.load_module(fullname)
 ...

And much of the rest of the above can probably be filled out by swiping 
code from ihooks, imputil, or other Python __import__ implementations.
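Fleshed out just a little (a rough sketch only -- absolute, top-level imports,
no package/fromlist/relative handling, so nothing like a complete
implementation):

    import sys
    import pkgutil

    def pep302_import(name, globals=None, locals=None, fromlist=None):
        # Fast path: the module is already in sys.modules.
        try:
            return sys.modules[name]
        except KeyError:
            pass
        # Ask the PEP 302 machinery (via pkgutil) for a loader.
        loader = pkgutil.find_loader(name)
        if loader is None:
            raise ImportError("No module named %s" % name)
        return loader.load_module(name)

    # For experimentation only:  __builtin__.__import__ = pep302_import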

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread Paul Moore
On 10/2/06, Phillip J. Eby [EMAIL PROTECTED] wrote:
 Just FYI, it's not possible (so far as I know) to implement phase 2 while
 maintaining backward compatibility with existing 2.x code.  So this work
 shouldn't go back to the 2.x trunk without discussion of those issues.

While that's a fair point, we need to be clear what compatibility
issues there are. The built-in import mechanisms aren't well
documented, so it's not a black-and-white situation. An unqualified
statement that there are issues isn't much help on its own...

 Essentially, I abandoned trying to do the phase 2 work for Python 2.5
 because there's too much code in the field that depends on the current
 order of when special/built-in imports are processed vs. when PEP 302
 imports are processed.

Can you say what that code is, and who we should be talking to to
understand their issues? If not, how do we find such code? Presumably,
you've got a lot of feedback through your work on setuptools/eggs - do
you have a record of who might participate in a discussion?

 Thus, instead of adding new PEP 302 APIs (like
 get_loader) to 'imp', I added them to 'pkgutil'.

How does that help? Where the code goes doesn't seem likely to make
much difference...

 There are, I believe,
 some notes in that module's source regarding what the ordering issues are
 w/meta_path vs. the way import works now.

The only notes I could see in pkgutil.py refer to special locations
like the Windows registry, and refer to the fact that they will be
searched after path entries, not before (for reasons I couldn't quite
follow, but that's likely because I only read the comments fairly
quickly). But if the whole mechanism is moved to sys.meta_path (which
is what Phase 2 is about) surely it's possible to choose the ordering
just by the order the importers go on sys.meta_path?
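(For concreteness, a meta_path hook is just an object with a find_module()
method, and hooks are consulted in list order before the regular machinery,
so something like this -- a do-nothing finder, purely illustrative -- controls
its own priority simply by where it is inserted:)

    import sys

    class NoisyFinder(object):
        # Minimal PEP 302 meta_path hook: reports lookups, handles nothing.
        def find_module(self, fullname, path=None):
            print('meta_path asked about %s' % fullname)
            return None   # None means "not mine, keep searching"

    sys.meta_path.insert(0, NoisyFinder())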

 That having been said, we could possibly have a transition for 2.6, but
 everybody who's written any PEP 302 emulation code (outside of pkgutil
 itself) would have to adapt their code somewhat.

I don't really see how we're going to address that other than by
implementing it, and waiting for people with issues to speak up.
Highlighting the changes early is good, as it avoids a mid-beta rush
of people suddenly finding issues, but I doubt we'll do much better
than that.

 I'm surprised, however, that you think working on this in C is going to be
 *less* time than it would take to simply replace __import__ with a Python
 function that reimplements PEP 302...

That I do agree with. There's a bootstrapping issue (you can't import
the Python module that does all this without using a C-coded import
mechanism) but that should be resolvable.

  especially since pkgutil contains a
 whole lot of the code you'd need, e.g.:

Yes, I'm quite surprised at how much has appeared in pkgutil. The
what's new entry is very terse, and the module documentation itself
hasn't been updated to mention the new stuff. That's a shame, as it
looks very useful (and as you say, could form a substantial part of
this change if we were coding it in Python).

Paul.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread Brett Cannon
On 10/2/06, Paul Moore [EMAIL PROTECTED] wrote:
 On 10/2/06, Phillip J. Eby [EMAIL PROTECTED] wrote:
 [SNIP]
  I'm surprised, however, that you think working on this in C is going to be
  *less* time than it would take to simply replace __import__ with a Python
  function that reimplements PEP 302...

 That I do agree with. There's a bootstrapping issue (you can't import
 the Python module that does all this without using a C-coded import
 mechanism) but that should be resolvable.
This is why I asked for input from people on which would take less time.
Almost all the answers I got were that the C code was delicate but that it was
workable.  Several people said they wished for a Python implementation, but
hardly anyone said flat-out, don't waste your time, the Python version will be
faster to do.

As for the bootstrapping, I am sure it is resolvable as well.  There are
several ways to go about it that are all tractable.

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Delaney, Timothy (Tim)
[EMAIL PROTECTED] wrote:

 Steve> By these statistics I think the answer to the original question
 Steve> is clearly no in the general case.
 
 As someone else (Guido?) pointed out, the literal case isn't all that
 interesting.  I modified floatobject.c to track a few interesting
 floating point values:
 
 static unsigned int nfloats[5] = {
 0, /* -1.0 */
 0, /*  0.0 */
 0, /* +1.0 */
 0, /* everything else */
 0, /* whole numbers from -10.0 ... 10.0 */
 };
 
 PyObject *
 PyFloat_FromDouble(double fval)
 {
 register PyFloatObject *op;
 if (free_list == NULL) {
 if ((free_list = fill_free_list()) == NULL)
 return NULL;
 }
 
 if (fval == 0.0) nfloats[1]++;
 else if (fval == 1.0) nfloats[2]++;
 else if (fval == -1.0) nfloats[0]++;
 else nfloats[3]++;
 
 if (fval >= -10.0 && fval <= 10.0 && (int)fval == fval) {
 nfloats[4]++;
 }

This doesn't actually give us a very useful indication of potential
memory savings. What I think would be more useful is tracking the
maximum simultaneous count of each value i.e. what the maximum refcount
would have been if they were shared.
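Something like the following bookkeeping, sketched in Python for clarity (the
real instrumentation would of course live in PyFloat_FromDouble and
float_dealloc):

    # value -> [currently live count, high-water mark]
    counts = {}

    def track_alloc(value):
        entry = counts.setdefault(value, [0, 0])
        entry[0] += 1
        if entry[0] > entry[1]:
            entry[1] = entry[0]    # new simultaneous maximum

    def track_dealloc(value):
        counts[value][0] -= 1

    def high_water(value):
        return counts.get(value, [0, 0])[1]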

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread Phillip J. Eby
At 03:48 PM 10/2/2006 -0700, Brett Cannon wrote:


On 10/2/06, Paul Moore [EMAIL PROTECTED] wrote:
On 10/2/06, Phillip J. Eby [EMAIL PROTECTED] wrote:
[SNIP]
  I'm surprised, however, that you think working on this in C is going to be
  *less* time than it would take to simply replace __import__ with a Python
  function that reimplements PEP 302...

That I do agree with. There's a bootstrapping issue (you can't import
the Python module that does all this without using a C-coded import
mechanism) but that should be resolvable.

This is why I asked for input from people on which would take less 
time.  Almost all the answers I got were that the C code was delicate 
but that it was workable.  Several people said they wished for a Python 
implementation, but hardly anyone said flat-out, don't waste your time, 
the Python version will be faster to do.

As for the bootstrapping, I am sure it is resolvable as well.  There are 
several ways to go about it that are all tractable.

When I implemented the PEP 302 fix for the import speedups, I basically 
prototyped it using Python code that got loaded prior to 'site.py'.  Once I 
had the Python version solid, I converted it to a C type via 
straightforward code transcription.  That's pretty much the route I would 
follow for this too, although of course freezing the Python version into 
C code is also an option, since there's not much performance benefit to be 
had from a C translation, except for two parts of __import__: the part that 
checks sys.modules to shortcut the process, and the part that runs after 
the target module has been loaded or found.  Aside from this fast path 
part of __import__, any additional interpretation overhead will probably be 
dwarfed by I/O considerations.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PSF Infrastructure Committee's recommendation for a new issue tracker

2006-10-02 Thread Brett Cannon
On behalf of the PSF Infrastructure committee, I am happy to report that we
have reached a recommendation for a new issue tracker for Python!

But first, I want to extend our thanks to all who stepped forward to provide
the committee with a test installation of an issue tracker to use as a basis
for our evaluations. Having several trackers to compare may have made this
more time-consuming, but it helped us realize what people did and did not like
about the various issue trackers and solidify what we thought python-dev would
want. Thank you!

The Infrastructure committee (Andrew Kuchling, Thomas Wouters, Barry Warsaw,
Martin v. Loewis, and myself; Richard Jones excused himself from the
discussion because of personal bias) met and discussed the four trackers being
considered to replace SourceForge: Launchpad, JIRA, Roundup, and Trac. After
evaluating the trackers on several points (issue creation, querying, etc.), we
reached a tie between JIRA and Roundup in terms of pure tracker features.

For JIRA, members found it to be a very powerful, polished issue tracker, but
some found it a little more complex than they would like in an issue tracker.
Roundup, on the other hand, had the exact opposite points: while not as
polished as JIRA, it is the simpler tracker, which some committee members
preferred.

As for Trac and Launchpad, both had fundamental issues that led to them not
being chosen in the end. Most of the considerations had to do with
customization or UI problems.

With JIRA and Roundup being considered equal overall in terms of the trackers
themselves, there is the tie-breaking issue of hosting. Atlassian, the company
that created JIRA, has offered us free hosting of a JIRA installation. This
cannot be overlooked, as keeping an issue tracker running is not easy and
requires supervision at various hours of the day to make sure possible
downtime is minimized. There are also always issues of upgrading, etc. that
come with any major software installation.

Details on the hosting are pasted in at the end of this email as provided by
Jonathan Nolen of Atlassian. He has also been cc:ed on this email so as to
allow him to answer any questions directly.

In order for Roundup to be considered equivalent in terms of an overall
tracker package, there needs to be a sufficient number of volunteer admins
(roughly 6 - 10 people) who can help set up and maintain the Roundup
installation. If enough people can be gathered, then Roundup will become the
recommendation of the committee, based on the fact that the trackers are
roughly equal but that Roundup is implemented in Python and is FLOSS. If not
enough support can be gathered, the committee's recommendation of going with
JIRA will stand.

If people want Roundup to be considered the tracker we go with by volunteering
to be an admin, please email infrastructure at python.org and state your time
commitment, the timezone you would be working from, and your level of Roundup
knowledge. Please email the committee by October 16. If enough people step
forward we will notify python-dev that Roundup should be considered the
recommendation of the committee and graciously turn down Atlassian's offer.

-Brett Cannon
Chairman, PSF Infrastructure Committee

---
[email from Jonathan, unedited, with details about hosting]

Hosting is with http://contegix.com. They host all of our servers, as well as
those of Cenqua, Codehaus, Jive (I think), and a bunch of other folks in the
Java community.

They have engineers online 24x7x365. I've contacted them at all hours of the
night and weekend and never failed to get a response within 5 minutes, though
they guarantee 30 minutes. The engineers I've worked with have been
universally top-notch. They've been able to help with every kind of question
I've thrown at them. It's hard to describe how great they are, but it's like
having a full-time sysadmin on staff who knows everything about your systems,
who never goes to sleep, and who always seems chipper at the very thought of
making any change you might ask.

Ideally, we'd set it up so that the appropriate members of the Python team
could contact Contegix directly for any requests you may have. You'll also
have direct access yourself if you need to do any work on your own.

As far as the export, they will set it up any way you like. The two obvious
ways that come to mind are copying the XML backup or a database dump each
night (or whatever frequency you specify). Either option would allow you to
fully restore a JIRA instance to the point of the backup with full history.

They will pro-actively keep your apps up to date as well. They usually know as
soon as we release new versions and will contact you to arrange upgrades
almost immediately. They also perform things like OS upgrades and patches on a
regular basis without having to be prompted.

Contegix will set up monitoring on your server(s) to watch things like
disk-space, memory, CPU and networking usage. If any of those resources 

Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread Terry Reedy

Kristján V. Jónsson [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
Anyway, Skip noted that 50% of all floats are whole numbers between -10 
and 10 inclusive,

Please, no.  He said something like this about *non-floating-point 
applications* (evidence unspecified, that I remember).  But such 
applications, by definition, usually don't have enough floats for caching 
(or conversion time) to matter too much.

For true floating point measurements (of temperature, for instance), 
'integral' measurements (which are an artifact of the scale used (degrees F 
versus C versus K)) should generally be no more common than other realized 
measurements.

Thirty years ago, a major stat package written in Fortran (BMDP) required 
that all data be stored as (Fortran 4-byte) floats for analysis.  So a 
column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps 
1.0/2.0.  That skewed the distribution of floats.  But Python and, I hope, 
Python apps, are more modern than that.

and this is the code that I employ in our python build today:

[snip]

For the analysis of typical floating point data, this is all pointless and 
a complete waste of time.  After a billion conversions or so, I expect the 
extra time might add up to something noticeable.

 From: Martin v. Löwis [mailto:[EMAIL PROTECTED]
 I'm worried about the penalty that this causes in terms of
 run-time cost.

Me too.

 Also, how do you choose what values to cache?

At one time (don't know about today), it was mandatory in some Fortran 
circles to name the small float constants used in a particular program with 
the equivalent of C #defines.  In Python,
zero = 0.0, half = 0.5, one = 1.0, twopi = 6.28..., eee = 2.7..., phi = 
.618..., etc. (Note that naming is not restricted to integral or otherwise 
'nice' values.)

The purpose then was to allow easy conversion from float to double to 
extended double.  And in some cases, it also made the code clearer.  With 
Python, the same procedure would guarantee only one copy (caching) of the 
same floats for constructed data structures.

Float caching strikes me as a good subject for cookbook recipes, but not, 
without real data and a willingness to slightly screw some users, for the 
default core code.
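A recipe could be as small as an interning helper that an application applies
only where it knows duplication hurts (note that a dict key maps 0.0 and -0.0
to the same entry, which is exactly the precision concern raised elsewhere in
this thread):

    _float_cache = {}

    def cached_float(value):
        # Return a shared float object for values that compare equal.
        value = float(value)
        try:
            return _float_cache[value]
        except KeyError:
            return _float_cache.setdefault(value, value)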

Terry Jan Reedy



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread A.M. Kuchling
On Mon, Oct 02, 2006 at 11:27:07PM +0100, Paul Moore wrote:
 Yes, I'm quite surprised at how much has appeared in pkgutil. The
 what's new entry is very terse, and the module documentation itself
 hasn't been updated to mention the new stuff. 

These two things are related, of course; I couldn't figure out which
bits of pkgutil.py are intended to be publicly used and which weren't.
There's an __all__ in the module, but some things such as read_code()
don't look like they're intended for external use.

--amk

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread skip

Tim> This doesn't actually give us a very useful indication of potential
Tim> memory savings. What I think would be more useful is tracking the
Tim> maximum simultaneous count of each value i.e. what the maximum
Tim> refcount would have been if they were shared.

Most definitely.  I just posted what I came up with in about two minutes.
I'll add some code to track the high water mark as well and report back.

Skip
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread skip

Terry> Kristján V. Jónsson [EMAIL PROTECTED] wrote:
 Anyway, Skip noted that 50% of all floats are whole numbers between
 -10 and 10 inclusive,

Terry> Please, no.  He said something like this about
Terry> *non-floating-point applications* (evidence unspecified, that I
Terry> remember).  But such applications, by definition, usually don't
Terry> have enough floats for caching (or conversion time) to matter too
Terry> much.

Correct.  The non-floating-point application I chose was the one that was
most immediately available, make test.  Note that I have no proof that
regrtest.py isn't terribly floating point intensive.  I just sort of guessed
that it was.

Skip
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Created branch for PEP 302 phase 2 work (in C)

2006-10-02 Thread Phillip J. Eby
At 08:21 PM 10/2/2006 -0400, A.M. Kuchling wrote:
On Mon, Oct 02, 2006 at 11:27:07PM +0100, Paul Moore wrote:
  Yes, I'm quite surprised at how much has appeared in pkgutil. The
  what's new entry is very terse, and the module documentation itself
  hasn't been updated to mention the new stuff.

These two things are related, of course; I couldn't figure out which
bits of pkgutil.py are intended to be publicly used and which weren't.
There's an __all__ in the module, but some things such as read_code()
don't look like they're intended for external use.

The __all__ listing is correct; I intended to expose read_code() for the 
benefit of other importer implementations and Python utilities.  Over the 
years, I've found myself writing the equivalent of read_code() several 
times, so it seemed to me to make sense to expose it as a utility function, 
since it already needed to be there for the ImpLoader class to work.

In general, the idea behind the additions to pkgutil was to make life 
easier for people doing import-related operations, by being a Python 
reference implementation of commonly-reinvented parts of the import 
process.  The '-m' machinery in 2.5 had a bunch of this stuff in it, and so 
did setuptools, so I yanked the code from both and refactored it to allow 
reuse by both, then fleshed it out to support all the optional PEP 302 
loader protocols, and additional protocols needed to support tools like 
pydoc being able to run against arbitrary importers (esp. zip files).
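For instance, code that wants to poke at a module without importing it can now
stay importer-agnostic; a small (illustrative) example, keeping in mind that
get_source() is one of the *optional* loader protocols:

    import pkgutil

    # Works the same whether the module lives on the filesystem or in a zip.
    loader = pkgutil.get_loader('xml.dom')
    if loader is not None and hasattr(loader, 'get_source'):
        source = loader.get_source('xml.dom')

    # Enumerate importable top-level modules through the same machinery.
    for importer, name, ispkg in pkgutil.iter_modules():
        pass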

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Caching float(0.0)

2006-10-02 Thread skip

skip> Most definitely.  I just posted what I came up with in about two
skip> minutes.  I'll add some code to track the high water mark as well
skip> and report back.

Using the smallest change I could get away with, I came up with these
allocation figures (same as before):

-1.0: 29048
 0.0: 524340
+1.0: 91560
rest: 1753479
whole numbers -10.0 to 10.0: 1151543

and these max ref counts:

-1.0: 16
 0.0: 136
+1.0: 161
rest: 1
whole numbers -10.0 to 10.0: 161

When I have a couple more minutes I'll just implement a cache for whole
numbers between -10.0 and 10.0 and test that whole range of values right.

Skip
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com