Re: [Python-Dev] Releasing 2.5.4

2008-12-23 Thread Martin v. Löwis
> My understanding of the problem is that clearerr() needs to be called
> before any FILE read operations on *some* platforms. The only platform I
> saw mentioned was OS X. Towards that end, I have attached a much simpler
> patch onto the tracker issue, which maybe somebody can verify solves the
> problem because I do not have access to a platform which fails the test
> that was originally given.

Thanks. I won't then reject the patch outright, only revert it from 2.5.
I can't give this a second try, as 2.5.3 was already supposed to be the
last release - I don't want to find myself reverting your patch two
weeks from now.

Is the approach that a clearerr call is added for each read
operation?

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Problems compiling 2.6.1 on Solaris 10

2008-12-23 Thread Martin v. Löwis
> I am hoping someone can assist me. I normally don’t care if the _ctypes
> module builds or not, but I now need to have it build.
> 
> I am running Solaris 10 with Sun’s C compiler under SunStudio 11.

I don't think ctypes (rather, libffi) supports Sun C. You will need to
port it (as you have already ruled out the other options, such as using
gcc, or not using ctypes).

Regards,
Martin


Re: [Python-Dev] Problems compiling 2.6.1 on Solaris 10

2008-12-23 Thread Nick Coghlan
Martin v. Löwis wrote:
>> I am hoping someone can assist me. I normally don’t care if the _ctypes
>> module builds or not, but I now need to have it build.
>>
>> I am running Solaris 10 with Sun’s C compiler under SunStudio 11.
> 
> I don't think ctypes (rather, libffi) supports Sun C. You will need to
> port it (as you have already ruled out the other options, such as using
> gcc, or not using ctypes).

There is also an existing issue relating to this:

http://bugs.python.org/issue2552

(although it doesn't add much beyond what Martin already said)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


[Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Rocky Bernstein
Now that there is a package mechanism (are package mechanisms?) like
zipimporter that bundle source code into a single file, should the
notion of a "file" location be adjusted to include the package
and/or importer?

Is there a standard API or routine which can extract this information
given a code object?

A use case I am thinking of here is in a stack trace or a
debugger, or a tool which wants to show in great detail information
from a code object possibly via a frame. For example does this come
from a zipped egg? And if so, which one?

For concreteness, here is what I did and here's what I saw.  Select
one of the zipimporter eggs at http://code.google.com/p/pytracer and
install one of these.

I did this on GNU/Linux and Python 2.5 and I look at the co_filename
of one of the methods:

>>> import tracer
>>> tracer.__dict__['size'].func_code.co_filename
'build/bdist.linux-i686/egg/tracer.py'

But there is no file called "build/bdist.linux-i686/egg/tracer.py" in
the filesystem. Instead there is a member "tracer.py" inside
/usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg.

It's possible I caused this egg to get built incorrectly or that
setuptools has a bug which entered that misleading information.
However, shouldn't there be a standard way to untangle package
location, loader and member inside the package?

As best as I can tell, PEP 302, which discusses importer hooks,
suggests a standard way to get file data. But it doesn't address a
standard way to get container package and/or loader information.
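
To make the question concrete, here is a minimal sketch of the kind of
helper being asked about. It assumes a current Python 3 interpreter
rather than the 2.5 used above, and describe_origin is a hypothetical
name, not an existing API. It relies on zipimporter objects exposing the
containing ZIP file as their "archive" attribute and assumes other
loaders do not have that attribute.

```python
import sys


def describe_origin(func):
    """Collect what the interpreter records about where func's code came from.

    Hypothetical helper, not a standard API.
    """
    code = func.__code__
    module = sys.modules.get(func.__module__)
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None) or getattr(module, "__loader__", None)
    return {
        "co_filename": code.co_filename,
        "module_file": getattr(module, "__file__", None),
        "loader": type(loader).__name__ if loader is not None else None,
        # zipimporter objects expose the containing ZIP file as .archive;
        # ordinary filesystem loaders are assumed to have no such attribute.
        "archive": getattr(loader, "archive", None),
    }


def sample():
    return 42


info = describe_origin(sample)
print(info)
```

Note that this only works for objects that still know their module; a
bare code object carries nothing but co_filename.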

Also I'm not sure there *is* a standard print string way to show
member inside a package. zipimporter may insert co_filename strings
like:

  /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/tracer.py

but the trouble with this is that it means file routines have to scan
the path and notice say that
/usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg is a *file*,
not a directory. And a file stat/reading routine needs to understand
what kind of packager that is in order to get tracer.py information.

(Are there any file routines in place for doing this?)

Thanks.


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread M.-A. Lemburg
On 2008-12-22 22:45, Steven D'Aprano wrote:
> On Mon, 22 Dec 2008 11:20:59 pm M.-A. Lemburg wrote:
>> On 2008-12-20 23:16, Martin v. Löwis wrote:
>>>>> I will try next week to see if I can come up with a smaller,
>>>>> submittable example.  Thanks.
>>>> These long exit times are usually caused by the garbage collection
>>>> of objects. This can be a very time consuming task.
>>> I doubt that. The long exit times are usually caused by a bad
>>> malloc implementation.
>> With "garbage collection" I meant the process of Py_DECREF'ing the
>> objects in large containers or deeply nested structures, not the GC
>> mechanism for breaking circular references in Python.
>>
>> This will usually also involve free() calls, so the malloc
>> implementation affects this as well. However, I've seen such long
>> exit times on Linux and Windows, which both have rather good
>> malloc implementations.
>>
>> I don't think there's anything much we can do about it at the
>> interpreter level. Deleting millions of objects takes time and that's
>> not really surprising at all. It takes even longer if you have
>> instances with .__del__() methods written in Python.
> 
> 
> This behaviour appears to be specific to deleting dicts, not deleting 
> random objects. I haven't yet confirmed that the problem still exists 
> in trunk (I hope to have time tonight or tomorrow), but in my previous 
> tests deleting millions of items stored in a list of tuples completed 
> in a minute or two, while deleting the same items stored as key:item 
> pairs in a dict took 30+ minutes. I say plus because I never had the 
> patience to let it run to completion, it could have been hours for all 
> I know.

That's interesting. The dictionary dealloc routine doesn't give
any hint as to why this should take longer than deallocating
a list of tuples.

However, due to the way dictionary tables are allocated, it is
possible that you create a table that is nearly twice the size
of the actual number of items needed by the dictionary. At those
dictionary sizes, this can result in a lot of extra memory being
allocated, certainly more than the corresponding list of tuples
would use.

>> Applications can choose other mechanisms for speeding up the
>> exit process in various (less clean) ways, if they have a need for
>> this.
>>
>> BTW: Rather than using a huge in-memory dict, I'd suggest to either
>> use an on-disk dictionary such as the ones found in mxBeeBase or
>> a database.
> 
> The original poster's application uses 45GB of data. In my earlier 
> tests, I've experienced the problem with ~ 300 *megabytes* of data: 
> hardly what I would call "huge".

Times have changed, that's true :-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Releasing 2.5.4

2008-12-23 Thread Scott Dial
Martin v. Löwis wrote:
>> My understanding of the problem is that clearerr() needs to be called
>> before any FILE read operations on *some* platforms. The only platform I
>> saw mentioned was OS X. Towards that end, I have attached a much simpler
>> patch onto the tracker issue, which maybe somebody can verify solves the
>> problem because I do not have access to a platform which fails the test
>> that was originally given.
> 
> Thanks. I won't then reject the patch outright, only revert it from 2.5.
> I can't give this a second try, as 2.5.3 was already supposed to be the
> last release - I don't want to find myself reverting your patch two
> weeks from now.

I agree, and as far as I can tell, the bug (assuming the report is
accurate) only occurs on a few platforms and since it's received little
attention over the life of the issue on the tracker, I imagine it's not
very important to many people. And since I don't have an affected
platform to test, I can't even be sure that it really solves the bug.
So, I agree: leave it out.

> Is the approach that a clearerr call is added for each read
> operation?

Yes, I merely added clearerr() calls just prior to the first fread,
fgets, and getc calls in each of the read methods for files. I'll make a
clean patch against the trunk and update the issue on the tracker, then
maybe the reporter or someone else with an affected platform can verify
my patch.

-Scott

-- 
Scott Dial
[email protected]
[email protected]


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Paul Moore
2008/12/23 Rocky Bernstein :
> Now that there is a package mechanism (are package mechanisms?) like
> zipimporter that bundle source code into a single file, should the
> notion of a "file" location be adjusted to include the package
> and/or importer?

Check PEP 302 (http://www.python.org/dev/peps/pep-0302/) specifically
the get_source (optional) method. It's not exactly what you describe,
but it may help. Please note that it's optional - if you loaded the
code from a zipfile containing only bytecode files, there is no source
to get, so you have to be prepared for that case. But if the source is
available, this should give you a way of getting to it.
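
As an illustration of the optional get_source method, here is a minimal
sketch assuming a current Python 3 interpreter and the standard
zipimport module. The archive and member names are made up for the
example. It builds a ZIP holding one source-only module and one
bytecode-only module and asks the importer for the source of each.

```python
import os
import py_compile
import tempfile
import zipfile
import zipimport

SOURCE = "def size():\n    return 3\n"


def sources_from_zip():
    """Return get_source() results for a source member and a bytecode-only member."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "zipdemo_plain.py")
        pyc_path = os.path.join(tmp, "zipdemo_plain.pyc")
        with open(src_path, "w", newline="\n") as fh:
            fh.write(SOURCE)
        py_compile.compile(src_path, cfile=pyc_path, doraise=True)

        archive = os.path.join(tmp, "bundle.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.write(src_path, "with_source.py")     # source only
            zf.write(pyc_path, "bytecode_only.pyc")  # no .py alongside

        importer = zipimport.zipimporter(archive)
        return (importer.get_source("with_source"),
                importer.get_source("bytecode_only"))


with_source, bytecode_only = sources_from_zip()
print(repr(with_source))
print(repr(bytecode_only))
```

The second result is None, which is the "no source to get" case a
caller has to be prepared for.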

Paul.


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Nick Coghlan
Rocky Bernstein wrote:
> As best as I can tell, PEP 302 which discussed importer hooks and
> suggests a standard way to get file data. But it doesn't address a
> standard way to get container package and/or loader information.

If a "filename" may not be an actual filename, but instead a
pseudo-filename created based on the __file__ attribute of a Python
module, then there are a few mechanisms for accessing it:

1. Use the package/module name and the relative path from that location,
then use pkgutil.get_data to retrieve it. This has the advantage of
correctly handling the case where no __loader__ attribute is present (or
it is None), which can happen for standard filesystem imports. However,
it only works in Python 2.6 and above (since get_data() is a new
addition to pkgutil).

2. Implement your own version of pkgutil.get_data - more work, but it is
the only way to get something along those lines that works for versions
prior to Python 2.6

3. Do what a number of standard library APIs (e.g. linecache) that
accept filenames do and also accept an optional "module globals"
argument. If the globals argument is passed in and contains a
"__loader__" entry, use the appropriate loader method when processing
the "filename" that was passed in.
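
Mechanisms 1 and 3 can be sketched as follows. This is only an
illustration, assuming a current Python 3 interpreter; the package
name zipdemo_pkg and its contents are invented for the example. The
package is imported from a temporary ZIP file, a data file is read with
pkgutil.get_data, and a source line is fetched through linecache by
passing the module globals.

```python
import importlib
import linecache
import os
import pkgutil
import sys
import tempfile
import zipfile


def read_via_loader():
    """Read package data and a source line from a module imported out of a ZIP."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "bundle.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("zipdemo_pkg/__init__.py", "ANSWER = 42\n")
            zf.writestr("zipdemo_pkg/notes.txt", "hello from the archive\n")

        sys.path.insert(0, archive)
        try:
            module = importlib.import_module("zipdemo_pkg")
            # Mechanism 1: package name plus a path relative to the package.
            data = pkgutil.get_data("zipdemo_pkg", "notes.txt")
            # Mechanism 3: pass the module globals so linecache can reach
            # the loader when the "filename" is not a real file.
            line = linecache.getline(module.__file__, 1, vars(module))
            return data, line
        finally:
            sys.path.remove(archive)
            sys.modules.pop("zipdemo_pkg", None)
            linecache.clearcache()


data, first_line = read_via_loader()
print(data)
print(repr(first_line))
```

Mechanism 2 is not shown; it would amount to reimplementing what
pkgutil.get_data does with the module's loader.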

> Also I'm not sure there *is* a standard print string way to show
> member inside a package. zipimporter may insert co_filename strings
> like:
> 
>   /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/tracer.py
> 
> but the trouble with this is that it means file routines have to scan
> the path and notice say that
> /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg is a *file*,
> not a directory. And a file stat/reading routine needs to understand
> what kind of packager that is in order to get tracer.py information.
> 
> (Are there any file routines in place for doing this?)

Finding a loader given only a pseudo-filename and no module is actually
possible in the specific case of zipimport, but is still pretty obscure
at this point in time:

1. Scan sys.path looking for an entry that matches the start of the
pseudo-filename (remembering to use os.path.normpath).

2. Once such a path entry has been found, use PEP 302 to find the
associated importer object (the undocumented pkgutil.get_importer
function does exactly that - although, as with any undocumented feature,
the promises of API compatibility across major version changes aren't as
strong as they would be for an officially documented and supported
interface).

3. Hope that the importer is one like zipimport that allows get_data()
to be invoked directly on the importer object, rather than only
providing it on a separate loader object after the module has been
loaded. If it needs a real loader instead of just the importer, then
you're back to the original problem of needing a module or package name
(or globals dictionary) in addition to the pseudo filename.
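
The three steps above can be sketched roughly like this, assuming a
current Python 3 interpreter. read_pseudo_file is a hypothetical name,
and the sketch only succeeds for importers that, like zipimport, offer
get_data() on the importer object itself.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile


def read_pseudo_file(pseudo_filename):
    """Return the bytes behind a zip-style pseudo-filename, or None.

    Hypothetical helper following the three steps described above.
    """
    target = os.path.normpath(pseudo_filename)
    for entry in sys.path:
        prefix = os.path.normpath(entry)
        # Step 1: find a sys.path entry that is a proper prefix of the path.
        if not target.startswith(prefix + os.sep):
            continue
        # Step 2: ask the PEP 302 machinery for that entry's importer.
        importer = pkgutil.get_importer(entry)
        # Step 3: hope the importer itself offers get_data().
        get_data = getattr(importer, "get_data", None)
        if get_data is None:
            continue
        member = target[len(prefix) + 1:]
        try:
            return get_data(member)
        except OSError:
            continue
    return None


with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipdemo_member.py", "VALUE = 1\n")
    sys.path.insert(0, archive)
    try:
        found = read_pseudo_file(os.path.join(archive, "zipdemo_member.py"))
        missing = read_pseudo_file(os.path.join(tmp, "no_such_dir", "x.py"))
    finally:
        sys.path.remove(archive)

print(found)
print(missing)
```

Ordinary directory entries on sys.path are skipped here because their
path finders do not provide get_data().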

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Paul Moore
2008/12/23  :
> What is wanted is a uniform way to get and describe a file location
> from a code object that takes into account the file might be a member
> of an archive.

But a code object may not have come from a file. Ignoring the
interactive prompt (not because it's unimportant, just because people
have a tendency to assume it's the only special case :-)) you need to
consider code loaded via a PEP302 importer from (say) a sqlite
database, or code created using compile(), or possibly even more
esoteric means.

So I'm not sure your request is clearly specified.

> Are there even guidelines for saying what string goes into a code
> object's co_filename? Clearly it should be related to the source code
> that generated the code, and there are various conventions that seem
> to exist when the code comes from an "eval" or an "exec".

I'm not aware of guidelines - the documentation for compile() says
"The filename argument should give the file from which the code was
read; pass some recognizable value if it wasn't read from a file
('<string>' is commonly used)", which is pretty non-committal.
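
A small sketch of how little the interpreter constrains this, assuming
a current Python 3 interpreter. The second filename is an arbitrary
label invented for the example; whatever is passed to compile() is
stored as co_filename and reported in tracebacks.

```python
import traceback

# The second argument to compile() becomes co_filename verbatim.
anonymous = compile("1 / 0", "<string>", "eval")
labelled = compile("1 / 0", "<config expression: 1 / 0>", "eval")

print(anonymous.co_filename)
print(labelled.co_filename)

try:
    eval(labelled)
except ZeroDivisionError:
    # The label chosen above is what shows up in the traceback.
    report = traceback.format_exc()

print(report)
```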

> But empirically it seems as though there's some variation. It could be
> an absolute file or a file with no root directory specified. (But is
> it possible to have things like "." and ".."?). And in the case of a
> member of a package what happens? Should it be just the member without
> the package? Or should it include the package name like
>   /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/tracer.py ?
>
> Or be unspecified? If left unspecified as I gather it is now, it makes
> it more important to have some sort of common routine to be able to
> pick out the archive part in a filesystem from the member name inside
> the archive.

I think you need to be clear on *why* you want to know this
information. Once it's clear what you're trying to achieve, it will be
easier to say what the options are.

It sounds like you're trying to propose a stronger convention, to be
enforced in the future. (At least, your suggestion of producing stack
traces implies that you want stack trace code not to have to deal with
the current situation). When PEP 302 was being developed, we were
looking at similar issues. That's why I pointed you at get_source() -
it was the best we could do with all the various conflicting
requirements, and the fact that it's optional is because we had to
cater for cases where there simply wasn't a meaningful answer.
Frankly, backward compatibility requirements kill a lot of the options
here.

Maybe what you want is a *pair* of linked conventions:

- co_filename (or a replacement) returns a (notionally opaque, but
in practice a filename for file-based cases) token representing "the
file or other object the code came from"
-  xxx.get_source_code(token) is a function (I don't know where,
xxx is a placeholder for some "suitable" module) which, given such a
token, returns the source, or None if there's no viable concept of
"the source".

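A rough sketch of what such a pair might look like. Everything here is
hypothetical: register_source, get_source_code and the token format are
invented for illustration and do not exist in any module. Tokens that
have a registered provider use it; any other token is tried as a plain
filename, and None is returned when there is no viable source.

```python
_providers = {}  # token -> zero-argument callable returning source text


def register_source(token, provider):
    """Associate an opaque origin token with a way to fetch its source."""
    _providers[token] = provider


def get_source_code(token):
    """Return the source for token, or None if there is no viable source."""
    provider = _providers.get(token)
    if provider is not None:
        return provider()
    try:
        # File-based case: the token is simply a filename.
        with open(token, encoding="utf-8") as fh:
            return fh.read()
    except (OSError, ValueError):
        return None


register_source("sqlite:modules/42", lambda: "x = 1\n")
from_database = get_source_code("sqlite:modules/42")
from_prompt = get_source_code("<interactive prompt>")
print(repr(from_database))
print(repr(from_prompt))
```
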
Or maybe you want a (possibly separate) attribute of a code object,
which holds a string containing a human-readable (but quite possibly
not machine-parseable) value representing the "place the code came
from" - co_filename is essentially this at the moment, and maybe your
complaint is merely that you don't find its contents sufficiently
human-readable in the case of the zipimport module (in which case you
might want to search some of the archives for the discussions on the
constraints imposed on zipimport, because objects on sys.path must be
strings and cannot be arbitrary objects...)

I'm sorry if this is a little rambling. I can appreciate that there's
some sort of issue that you see here, but I don't yet see any
practical way of changing things that would help. And as always,
there's backward compatibility to consider - existing code isn't going
to change, so new code has to be prepared to handle that.

I hope this is of some help,
Paul.


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread Nick Coghlan
M.-A. Lemburg wrote:
> On 2008-12-22 22:45, Steven D'Aprano wrote:
>> This behaviour appears to be specific to deleting dicts, not deleting 
>> random objects. I haven't yet confirmed that the problem still exists 
>> in trunk (I hope to have time tonight or tomorrow), but in my previous 
>> tests deleting millions of items stored in a list of tuples completed 
>> in a minute or two, while deleting the same items stored as key:item 
>> pairs in a dict took 30+ minutes. I say plus because I never had the 
>> patience to let it run to completion, it could have been hours for all 
>> I know.
> 
> That's interesting. The dictionary dealloc routine doesn't give
> any hint as to why this should take longer than deallocating
> a list of tuples.

Shuffling the list with random.shuffle before deleting it makes a
*massive* difference to how long the deallocation takes.

Not only that, but after the shuffled list has been deallocated,
deleting an unshuffled list subsequently takes significantly longer.

(I posted numbers and a test script showing these effects elsewhere in
the thread).

The important factor seems to be deallocation order relative to
allocation order.

A simple list deletes objects in the reverse of the order of creation,
while a reversed list deletes them in order of creation. Both of these
seem to scale fairly linearly.

A dict with a hash order that I believe is a fair approximation of
creation order also didn't appear to exhibit particularly poor scaling
(at least not within the 20 million objects I could test).

The shuffled list, on the other hand, was pretty atrocious, taking
nearly twice as long to be destroyed as an unshuffled list of the same size.

I'd like to add another dict to the test which eliminates the current
coupling between hash order and creation order, and see if it exhibits
poor behaviour which is similar to that of the shuffled list, but I'm
not sure when I'll get to that (probably post-Christmas).

Note that I think these results are consistent with the theory that the
problem lies in the way partially allocated memory pools are tracked in
the obmalloc code - it makes sense that deallocating in creation order
or in reverse of creation order would tend to clean up each arena in
order and keep the obmalloc internal state neat and tidy, while
deallocating objects effectively at random would lead to a lot of
additional bookkeeping as the "most used" and "least used" arenas change
over the course of the deallocation.
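
For anyone wanting to try this, here is a minimal sketch of the kind of
measurement involved. It is not the test script posted elsewhere in the
thread; the function name, object shape and sizes are chosen only for
illustration, and it assumes a current Python 3 interpreter. Absolute
timings will depend heavily on platform, allocator and object count.

```python
import gc
import random
import time


def time_dealloc(n, shuffle, seed=0):
    """Build n small tuples in a list and time how long dropping the list takes."""
    items = [(i, str(i)) for i in range(n)]
    if shuffle:
        random.Random(seed).shuffle(items)
    gc.collect()
    was_enabled = gc.isenabled()
    gc.disable()  # measure reference-count deallocation only
    try:
        start = time.perf_counter()
        del items  # last reference: the list and its items are freed here
        return time.perf_counter() - start
    finally:
        if was_enabled:
            gc.enable()


unshuffled = time_dealloc(200_000, shuffle=False)
shuffled = time_dealloc(200_000, shuffle=True)
print("unshuffled list: %.4fs" % unshuffled)
print("shuffled list:   %.4fs" % shuffled)
```

Much larger values of n are needed before any difference between the
two cases becomes pronounced.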

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Paul Moore
2008/12/23 Nick Coghlan :
> Finding a loader given only a pseudo-filename and no module is actually
> possible in the specific case of zipimport, but is still pretty obscure
> at this point in time:
>
> 1. Scan sys.path looking for an entry that matches the start of the
> pseudo-filename (remembering to use os.path.normpath).
>
> 2. Once such a path entry has been found, use PEP 302 to find the
> associated importer object (the undocumented pkgutil.get_importer
> function does exactly that - although, as with any undocumented feature,
> the promises of API compatibility across major version changes aren't as
> strong as they would be for an officially documented and supported
> interface).
>
> 3. Hope that the importer is one like zipimport that allows get_data()
> to be invoked directly on the importer object, rather than only
> providing it on a separate loader object after the module has been
> loaded. If it needs a real loader instead of just the importer, then
> you're back to the original problem of needing a module or package name
> (or globals dictionary) in addition to the pseudo filename.

There were lots of proposals tossed around on python-dev at the time
PEP 302 was being developed, which might have made all this easier.
Most, if not all, were killed by backward compatibility requirements.

I have some hopes that when Brett completes his "import in Python"
work, that will add sufficient flexibility to allow people to
experiment with all of this machinery, and ultimately maybe move
forward with a more modular import mechanism. But the timescales for
Brett's changes won't be until at least Python 3.1, and it'll be a
release or two after that before any significant change can be eased
in in a compatible manner. That's going to take a lot of energy on
someone's part.

Paul.

PS One of these days, I'm going to write an insanely useful importer
which takes the least-convenient option wherever PEP 302 allows
flexibility. It'll be adopted by everyone because it's so great, and
all the software that currently makes unwarranted assumptions about
importers will break and get fixed to support it because otherwise its
users will rebel, and we'll live in a paradise where everything
follows the specs to the letter. Oh, yes, and I'm going to win the
lottery every week for the next month :-)

PPS Seriously, setuptools and the adoption of eggs have pushed a lot
of code to be much more careful about unwarranted assumptions that
code lives in the filesystem. That's an incredibly good thing, and
very hard to do right (witness the setuptools "zip_safe" parameter
which acts as a get-out clause). Much kudos to setuptools for getting
as far as it has.


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread rocky
Paul Moore writes:
 > 2008/12/23 Rocky Bernstein :
 > > Now that there is a package mechanism (are package mechanisms?) like
 > > zipimporter that bundle source code into a single file, should the
 > > notion of a "file" location be adjusted to include the package
 > > and/or importer?
 > 
 > Check PEP 302 (http://www.python.org/dev/peps/pep-0302/) specifically
 > the get_source (optional) method. 

Yes, that's one of the things I was thinking when I wrote:

  As best as I can tell, PEP 302 which discussed importer hooks and
  suggests a standard way to get file data.

And by "suggests" I was implying that, yes, I know this is
optional.


 > It's not exactly what you describe,
 > but it may help. 

Yes, it's not exactly what is desired. 

 > Please note that it's optional - if you loaded the
 > code from a zipfile containing only bytecode files, there is no source
 > to get, so you have to be prepared for that case. But if the source is
 > available, this should give you a way of getting to it.

What is wanted is a uniform way to get and describe a file location
from a code object that takes into account the file might be a member
of an archive. 

Are there even guidelines for saying what string goes into a code
object's co_filename? Clearly it should be related to the source code
that generated the code, and there are various conventions that seem
to exist when the code comes from an "eval" or an "exec". 

But empirically it seems as though there's some variation. It could be
an absolute file or a file with no root directory specified. (But is
it possible to have things like "." and ".."?). And in the case of a
member of a package what happens? Should it be just the member without
the package? Or should it include the package name like
   /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/tracer.py ? 

Or be unspecified? If left unspecified as I gather it is now, it makes
it more important to have some sort of common routine to be able to
pick out the archive part in a filesystem from the member name inside 
the archive.
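
A minimal sketch of such a routine, assuming a current Python 3
interpreter. split_archive_path is a hypothetical name, and only ZIP
archives (which includes zipped eggs) are recognised. It walks up the
path until it meets something that exists on disk, and does not check
that the member is actually present inside the archive.

```python
import os
import tempfile
import zipfile


def split_archive_path(pseudo_path):
    """Split "archive/member" into (archive, member), or return None.

    Hypothetical helper; only ZIP archives are recognised.
    """
    head = os.path.normpath(pseudo_path)
    member_parts = []
    while head and head != os.path.dirname(head):
        if os.path.isdir(head):
            return None  # reached a real directory without meeting an archive
        if os.path.isfile(head):
            if member_parts and zipfile.is_zipfile(head):
                return head, "/".join(reversed(member_parts))
            return None  # an ordinary file, or the archive itself
        head, tail = os.path.split(head)
        member_parts.append(tail)
    return None


with tempfile.TemporaryDirectory() as tmp:
    egg = os.path.join(tmp, "tracer-0.1.0-py2.5.egg")
    with zipfile.ZipFile(egg, "w") as zf:
        zf.writestr("pkg/tracer.py", "def size():\n    return 0\n")
    egg_path = os.path.normpath(egg)
    inside = split_archive_path(os.path.join(egg, "pkg", "tracer.py"))
    outside = split_archive_path(os.path.join(tmp, "missing", "tracer.py"))

print(inside)
print(outside)
```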


 > 
 > Paul.
 > 


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Phillip J. Eby

At 06:55 AM 12/23/2008 -0500, Rocky Bernstein wrote:
> Now that there is a package mechanism (are package mechanisms?) like
> zipimporter that bundle source code into a single file, should the
> notion of a "file" location be adjusted to include the package
> and/or importer?
>
> Is there a standard API or routine which can extract this information
> given a code object?


The inspect module (in 2.5 and up) supports retrieving the source 
lines for any object that has module globals.  So you could do it for 
a class, a function, a method, module-level code, or even a frame, 
but not for a standalone code object.


I believe there are also certain inspect module APIs that will return 
a pseudo-filename, i.e. the zipfile name followed by the path within 
the zipfile.
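
A minimal sketch of both points, assuming a current Python 3
interpreter; the module name zipdemo_inspect and its contents are
invented for the example. A function imported from a ZIP file is
handed to inspect.getsource and inspect.getsourcefile.

```python
import importlib
import inspect
import linecache
import os
import sys
import tempfile
import zipfile

MODULE_SOURCE = "def size():\n    return 3\n"


def inspect_zipped_function():
    """Import a function from a ZIP and ask inspect for its source and file."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "bundle.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("zipdemo_inspect.py", MODULE_SOURCE)

        sys.path.insert(0, archive)
        try:
            module = importlib.import_module("zipdemo_inspect")
            # Works because the function can be traced back to its module,
            # and the module globals give access to the loader.
            source = inspect.getsource(module.size)
            pseudo_file = inspect.getsourcefile(module.size)
            return source, pseudo_file, archive
        finally:
            sys.path.remove(archive)
            sys.modules.pop("zipdemo_inspect", None)
            linecache.clearcache()


source, pseudo_file, archive = inspect_zipped_function()
print(source)
print(pseudo_file)
```

The second value printed is the pseudo-filename: the ZIP file name
followed by the member path inside it.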




> Also I'm not sure there *is* a standard print string way to show
> member inside a package. zipimporter may insert co_filename strings
> like:
>
>   /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/tracer.py


AFAIK, it'll only do this if the zipfile doesn't contain a usable 
.pyc or .pyo.  Ordinarily, co_filename will be the name of the 
original source file before the zipfile was created.




Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Phillip J. Eby

At 04:00 PM 12/23/2008 +, Paul Moore wrote:
> PPS Seriously, setuptools and the adoption of eggs have pushed a lot
> of code to be much more careful about unwarranted assumptions that
> code lives in the filesystem. That's an incredibly good thing, and
> very hard to do right (witness the setuptools "zip_safe" parameter
> which acts as a get-out clause). Much kudos to setuptools for getting
> as far as it has.


And ironically, if I ever get the time to actually work on a new 
version of easy_install (as opposed to perpetually tweaking the old 
one), the default zipping and default sys.path munging will be among 
the first things to go.  ;-)


Ironically, my choice of isolated directories and zipfiles for 
quick-and-dirty uninstall support has ended up costing far too much, 
compared to if I'd just taken the time to design a decent uninstall 
feature.  Of course, hindsight is 20-20; in order to fully understand 
the requirements of a problem, you sometimes have to get a rather 
long way towards solving it the simple, obvious...  and wrong way.


(And, it didn't help that I had significant time constraints pushing 
me in the direction of the Seemingly-Simplest-At-The-Moment Thing 
That Could Possibly Work.)




Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread R. Bernstein
Paul Moore writes:
 > 2008/12/23  :
 > > What is wanted is a uniform way to get and describe a file location
 > > from a code object that takes into account the file might be a member
 > > of an archive.
 > 
 > But a code object may not have come from a file. 

Right. That's why I mentioned for example "eval" and "exec" that you
cite below. So remove the "file" in what is cited above. Replace with:
"a uniform way to get information (not necessarily just the source
text) about the location/origin of code from a code object".

 > Ignoring the
 > interactive prompt (not because it's unimportant, just because people
 > have a tendency to assume it's the only special case :-)) you need to
 > consider code loaded via a PEP302 importer from (say) a sqlite
 > database, or code created using compile(), or possibly even more
 > esoteric means.
 > 
 > So I'm not sure your request is clearly specified.

Is the above any more clear? 

 > 
 > > Are there even guidelines for saying what string goes into a code
 > > object's co_filename? Clearly it should be related to the source code
 > > that generated the code, and there are various conventions that seem
 > > to exist when the code comes from an "eval" or an "exec".
 > 
 > I'm not aware of guidelines - the documentation for compile() says
 > "The filename argument should give the file from which the code was
 > read; pass some recognizable value if it wasn't read from a file
 > ('<string>' is commonly used)" which is pretty non-committal.
 > 
 > > But empirically it seems as though there's some variation. It could be
 > > an absolute file or a file with no root directory specified. (But is
 > > it possible to have things like "." and ".."?). And in the case of a
 > > member of a package what happens? Should it be just the member without
 > > the package? Or should it include the package name like
 > >   /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/tracer.py ?
 > >
 > > Or be unspecified? If left unspecified as I gather it is now, it makes
 > > it more important to have some sort of common routine to be able to
 > > pick out the archive part in a filesystem from the member name inside
 > > the archive.
 > 
 > I think you need to be clear on *why* you want to know this
 > information. Once it's clear what you're trying to achieve, it will be
 > easier to say what the options are.

This is what I wrote originally (slightly modified):
  
  A use case I am thinking of here is a stack trace, a debugger, or a
  tool that wants to show, in great detail, information from a code
  object obtained possibly via a frame object.

I find it kind of sucky to see in a traceback: "<string>" as opposed
to the text (or prefix of the text) of the actual string that was
passed. Or something that has been referred to as a "pseudo-file" like
/usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/foo/bar.py
when it is really member foo/bar.py of zipped egg
/usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg.

(As a separate issue, it seems that zipimporter file locations inside
setuptools may have a problem.)

Inside a debugger or an IDE, it is conceivable a person might want
loader, and module information, and if the code is part of an archive
file, then member information. (If part of an eval string then, the
eval string.)
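
To make the two cases concrete, here is a minimal sketch, written for a
current Python 3 interpreter (an assumption; this thread concerns 2.5).
The helper split_archive_member and its ".egg"/".zip" suffix check are
hypothetical names invented for illustration, not an existing API.

```python
def split_archive_member(pseudo_filename, suffixes=(".egg", ".zip")):
    """Split 'archive.egg/pkg/mod.py' into (archive, member).

    Returns (None, pseudo_filename) when no leading path component ends
    in a known archive suffix. The check is purely textual: it never
    touches the filesystem, so a plain directory named 'x.egg' would be
    misreported as an archive.
    """
    parts = pseudo_filename.split("/")
    for i, part in enumerate(parts[:-1]):
        if part.endswith(suffixes):
            return "/".join(parts[:i + 1]), "/".join(parts[i + 1:])
    return None, pseudo_filename


# Code compiled from a string records only the name the caller passed in;
# the evaluated text itself is not kept on the code object.
code = compile("1 + 1", "<string>", "eval")
print(code.co_filename)

archive, member = split_archive_member(
    "/usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/foo/bar.py")
print(archive)
print(member)
```

The textual split is exactly the kind of guesswork a common routine would
replace; it is shown only to illustrate what information is missing.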

 > 
 > It sounds like you're trying to propose a stronger convention, to be
 > enforced in the future. 

Well, I wasn't sure if there was one. But I gather from what you write,
there isn't. :-)

Yes, I would suggest a stronger convention. Or a more up-front
statement that none is desired/forthcoming.

 > (At least, your suggestion of producing stack
 > traces implies that you want stack trace code not to have to deal with
 > the current situation). When PEP 302 was being developed, we were
 > looking at similar issues. That's why I pointed you at get_source() -
 > it was the best we could do with all the various conflicting
 > requirements, and the fact that it's optional is because we had to
 > cater for cases where there simply wasn't a meaningful answer.
 > Frankly, backward compatibility requirements kill a lot of the options
 > here.
 > 
 > Maybe what you want is a *pair* of linked conventions:
 > 
 > - co_filename (or a replacement) returns a (notionally opaque, but
 > in practice a filename for file-based cases) token representing "the
 > file or other object the code came from"

This would be nice.

 > -  xxx.get_source_code(token) is a function (I don't know where,
 > xxx is a placeholder for some "suitable" module) which, given such a
 > token, returns the source, or None if there's no viable concept of
 > "the source".

There always is a viable concept of a source. It's whatever was done
to get the code. For example, if it was via an eval then the source
was the eval function and a string, same for exec. If it's via
database access, well that then and some summary info about what's
known about that. 

 > 
 > Or maybe you want a (possibly separate) a

Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Paul Moore
2008/12/23 R. Bernstein :
>  A use case here I am thinking of here is in a stack trace or a
>  debugger, or a tool which wants to show in great detail, information
>  from a code object obtained possibly via a frame object.

Thanks for the clarifications. I see what you're after much better now.

> I find it kind of sucky to see in a traceback: "<string>" as opposed
> to the text (or prefix of the text) of the actual string that was
> passed. Or something that has been referred to as a "pseudo-file" like
> /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/foo/bar.py
> when it is really member foo/bar.py of zipped egg
> /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg.

Fair comment. That points to a "human readable" type of string. It's
not available at the moment, but I guess it could be.

But see below.

>  > -  xxx.get_source_code(token) is a function (I don't know where,
>  > xxx is a placeholder for some "suitable" module) which, given such a
>  > token, returns the source, or None if there's no viable concept of
>  > "the source".
>
> There always is a viable concept of a source. It's whatever was done
> to get the code. For example, if it was via an eval then the source
> was the eval function and a string, same for exec. If it's via
> database access, well that then and some summary info about what's
> known about that.

Hmm, "source" colloquially, yes "bytecode loaded from \xxx.pyc",
for example. But not "source" in the sense of "source code". Some
applications run with only bytecode shipped, no source code available
at all.

> There are two problems. One is displaying location information in an
> unambiguous way -- the pseudo-file above is ambiguous and so is
> "<string>", since there's no guarantee that an OS will not name a file
> that. The second problem is programmatically getting information such
> as a debugger or an IDE might do so that the information can be
> conveyed back to a user who might want to inspect surrounding source
> code or modules.

This is more than you were asking for above.

The first problem is addressed with a "human readable" (narrative)
description, as above.

The second, however, requires machine-readable access to source code
(if it exists). That's what the loader get_source() call does for you.
But you have to be prepared for the fact that it may not be possible
to get source code, and decide what you want to happen in that case.
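
A minimal sketch of that "be prepared" logic, assuming a current Python 3
interpreter where the loader is reachable through module.__spec__.loader
(with the PEP 302 __loader__ attribute as a fallback). The function name
source_or_description is hypothetical.

```python
import sys
import textwrap  # imported only so there is a pure-Python module to query


def source_or_description(module_name):
    """Return (source_text, note) for an already-imported module.

    source_text is None when the loader cannot supply source, for
    example for built-in modules or bytecode-only installs.
    """
    module = sys.modules[module_name]
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None) or getattr(module, "__loader__", None)
    get_source = getattr(loader, "get_source", None)
    if get_source is None:
        # get_source() is an optional part of the loader protocol.
        return None, "loader offers no get_source()"
    try:
        source = get_source(module_name)
    except (ImportError, OSError):
        source = None
    if source is None:
        return None, "no source available from %r" % (loader,)
    return source, "source retrieved via %r" % (loader,)


text, note = source_or_description("textwrap")
builtin_text, builtin_note = source_or_description("sys")
print(note)
print(builtin_note)
```

What to show the user in the None case (a narrative description, a
disassembly, nothing at all) is the decision left to the tool.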

>  > I hope this is of some help,
>
> Yes, thanks. At least I now have a clearer idea of the state of
> where things stand.

Good. Sorry it's not better news :-)

Paul


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread R. Bernstein
Nick Coghlan writes:
 > 3. Do what a number of standard library APIs (e.g. linecache) that
 > accept filenames do and also accept an optional "module globals"
 > argument. 

Actually, I did this and committed a change (to pydb) before posting
any of these queries. ;-)

If "a number of standard library APIs" are doing the *same* thing,
then shouldn't this be exposed as a common routine?

If on the other hand, by "a number" you mean "one" as in linecache --
1 *is* a number too! -- then perhaps the relevant code that is buried
inside the "updatecache" should be exposed on its own.  (As a side
benefit that code can be tested separately too!)

Should I file a feature request for this? 


Re: [Python-Dev] Problems compiling 2.6.1 on Solaris 10

2008-12-23 Thread Ellinghaus, Lance
Martin,
Thank you very much. At least I know what I need to do now. 

> From: "Martin v. Löwis" [mailto:[email protected]] 
> I don't think ctypes (rather, libffi) supports Sun C. You will need to
> port it (as you have already ruled out the other options, such as using
> gcc, or not using ctypes).

Lance



Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread Mike Coleman
On Sat, Dec 20, 2008 at 6:22 PM, Mike Coleman  wrote:
> Re "held" and "intern_it":  Haha!  That's evil and extremely evil,
> respectively.  :-)

P.S.  I tried the "held" idea out (interning integers in a list), and
unfortunately it didn't make that much difference.  In the example I
tried, there were 104465178 instances of integers from range(33467).
I guess if ints are 12 bytes (per Beazley's book, but not sure if that
still holds), then that would correspond to a 1GB reduction.  Judging
by 'top', it might have been 2 or 3GB instead, from a total of 45G.
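
The estimate can be checked with a few lines of arithmetic. This is a
sketch that takes the 12-bytes-per-int figure at face value; the variable
names are only for illustration.

```python
instances = 104465178   # integer objects held, from the figures above
distinct = 33467        # distinct values, i.e. range(33467)
bytes_per_int = 12      # assumed object size; depends on the build

# Interning keeps one object per distinct value and drops the rest.
saved_bytes = (instances - distinct) * bytes_per_int
saved_gib = saved_bytes / 2.0 ** 30
print("%d bytes, about %.2f GiB" % (saved_bytes, saved_gib))
```

If the interpreter is a 64-bit build, where a Python 2 int object
typically occupies 24 bytes rather than 12, the same computation roughly
doubles, which would be in line with the 2 or 3GB seen in 'top'. That the
build is 64-bit is an assumption here, though a 45G process suggests it.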

Mike


[Python-Dev] suggest change to "Failed to find the necessary bits to build these modules" message

2008-12-23 Thread Mike Coleman
I was thrown by the "Failed to find the necessary bits to build these
modules" message at the end of newer Python builds, and thought that
this indicated that the Python executable itself was not built.
That's arguably stupidity on my part, but I wonder if others will not
trip on this, too.

Would it be possible to change this wording slightly, to something like

Python built, but failed to find the necessary bits to build these modules

?


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Brett Cannon
On Tue, Dec 23, 2008 at 08:00, Paul Moore  wrote:
> 2008/12/23 Nick Coghlan :
>> Finding a loader given only a pseudo-filename and no module is actually
>> possible in the specific case of zipimport, but is still pretty obscure
>> at this point in time:
>>
>> 1. Scan sys.path looking for an entry that matches the start of the
>> pseudo-filename (remembering to use os.path.normpath).
>>
>> 2. Once such a path entry has been found, use PEP 302 to find the
>> associated importer object (the undocumented pkgutil.get_importer
>> function does exactly that - although, as with any undocumented feature,
>> the promises of API compatibility across major version changes aren't as
>> strong as they would be for an officially documented and supported
>> interface).
>>
>> 3. Hope that the importer is one like zipimport that allows get_data()
>> to be invoked directly on the importer object, rather than only
>> providing it on a separate loader object after the module has been
>> loaded. If it needs a real loader instead of just the importer, then
>> you're back to the original problem of needing a module or package name
>> (or globals dictionary) in addition to the pseudo filename.
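
The three steps quoted above can be sketched roughly as follows. This is
an illustration under assumptions, not a supported recipe: it is written
for a current Python 3 interpreter, the function names find_importer_for
and read_member are hypothetical, and the demo builds a throwaway zip
archive rather than using a real sys.path entry.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile


def find_importer_for(pseudo_filename, path=None):
    """Return (path_entry, importer, member) for a pseudo-filename, or None.

    Step 1: find the path entry that prefixes the pseudo-filename.
    Step 2: ask pkgutil.get_importer() for that entry's PEP 302 importer.
    """
    target = os.path.normpath(pseudo_filename)
    for entry in (sys.path if path is None else path):
        if not entry:
            continue
        prefix = os.path.normpath(entry)
        if target.startswith(prefix + os.sep):
            importer = pkgutil.get_importer(entry)
            return entry, importer, target[len(prefix) + 1:]
    return None


def read_member(pseudo_filename, path=None):
    """Step 3: hope the importer itself offers get_data(), as zipimport does."""
    found = find_importer_for(pseudo_filename, path)
    if found is None:
        return None
    entry, importer, member = found
    get_data = getattr(importer, "get_data", None)
    if get_data is None:
        return None  # a real loader (and hence a module name) is needed
    return get_data(member)


# Demonstration against a temporary archive.
tmp = tempfile.mkdtemp()
archive = os.path.join(tmp, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("foo/bar.py", "x = 1\n")
data = read_member(os.path.join(archive, "foo", "bar.py"), path=[archive])
print(data)
```

For an ordinary directory entry the importer has no get_data(), so the
sketch returns None, which is the "back to the original problem" case.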
>
> There were lots of proposals tossed around on python-dev at the time
> PEP 302 was being developed, which might have made all this easier.
> Most, if not all, were killed by backward compatibility requirements.
>
> I have some hopes that when Brett completes his "import in Python"
> work, that will add sufficient flexibility to allow people to
> experiment with all of this machinery, and ultimately maybe move
> forward with a more modular import mechanism.

I have actually made a good amount of progress as of late. It's a New
Year's resolution to get importlib done, but I am actually aiming for
before January 1 (sans the damn compile() problem I am having). This
goal does ignore everything but a compatible __import__, though.

> But the timescales for
> Brett's changes won't be until at least Python 3.1, and it'll be a
> release or two after that before any significant change can be eased
> in in a compatible manner.

I suspect that any import work will be a Pending/DeprecationWarning
deal, so 3.3 would be the first version that could have any real
changes as the default.

> That's going to take a lot of energy on
> someone's part.

That would be me. =) After importlib is finished I have a couple of
PEPs planned plus properly documenting how the import machinery works
in the language spec. And I suspect this will lead to some discussions
about things, e.g. requirements of the format for __file__ and
__path__ in regards to when they point inside of an archive, etc.

-Brett


Re: [Python-Dev] suggest change to "Failed to find the necessary bits to build these modules" message

2008-12-23 Thread Brett Cannon
On Tue, Dec 23, 2008 at 09:59, Mike Coleman  wrote:
> I was thrown by the "Failed to find the necessary bits to build these
> modules" message at the end of newer Python builds, and thought that
> this indicated that the Python executable itself was not built.
> That's arguably stupidity on my part, but I wonder if others will not
> trip on this, too.
>
> Would it be possible to change this wording slightly, to something like
>
>Python built, but failed to find the necessary bits to build these modules
>
> ?

Sounds reasonable to me. Can you file a bug report at bugs.python.org,
Mike, so this doesn't get lost?

-Brett


Re: [Python-Dev] suggest change to "Failed to find the necessary bits to build these modules" message

2008-12-23 Thread Mike Coleman
Done: http://bugs.python.org/issue4731


On Tue, Dec 23, 2008 at 12:13 PM, Brett Cannon  wrote:
> On Tue, Dec 23, 2008 at 09:59, Mike Coleman  wrote:
>> I was thrown by the "Failed to find the necessary bits to build these
>> modules" message at the end of newer Python builds, and thought that
>> this indicated that the Python executable itself was not built.
>> That's arguably stupidity on my part, but I wonder if others will not
>> trip on this, too.
>>
>> Would it be possible to change this wording slightly, to something like
>>
>>Python built, but failed to find the necessary bits to build these modules
>>
>> ?
>
> Sounds reasonable to me. Can you file a bug report at bugs.python.org,
> Mike, so this doesn't get lost?
>
> -Brett
>


[Python-Dev] [ANN] Python 2.5.4 (final)

2008-12-23 Thread Martin v. Löwis
On behalf of the Python development team and the Python community, I'm
happy to announce the release of Python 2.5.4 (final).

Python 2.5.3 unfortunately contained an incorrect patch that could
cause interpreter crashes; the only change in Python 2.5.4 relative
to 2.5.3 is the reversal of this patch.

2.5.4 is the last bug fix release of Python 2.5. Future 2.5.x releases
will only include security fixes. According to the release notes, about
80 bugs and patches have been addressed since Python 2.5.2, many of
them improving the stability of the interpreter, and improving its
portability.

See the release notes at the website (also available as Misc/NEWS in
the source distribution) for details of bugs fixed; most of them prevent
interpreter crashes (and now cause proper Python exceptions in cases
where the interpreter may have crashed before).

For more information on Python 2.5.4, including download
links for various platforms, release notes, and known issues, please
see:

http://www.python.org/2.5.4

Highlights of the previous major Python releases are available
from the Python 2.5 page, at

http://www.python.org/2.5/highlights.html

Enjoy this release,
Martin

Martin v. Loewis
[email protected]
Python Release Manager
(on behalf of the entire python-dev team)


Re: [Python-Dev] Hello everyone + little question around Cpython/stackless

2008-12-23 Thread Pascal Chambon

All right then, I understand the problem...

Thanks a lot,
regards,
Pascal


  





Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread Kristján Valur Jónsson
I'd like to suggest here, if you are giving this code a facelift, that on 
Windows you use VirtualAlloc and friends to allocate the arenas.  This gives 
you the most direct access to the VM manager and makes sure that a released 
arena is immediately available to the rest of the system.  It also makes sure 
that you don't mess with the regular heap and fragment it.
Kristján

-Original Message-
From: [email protected] 
[mailto:[email protected]] On Behalf Of 
"Martin v. Löwis"
Sent: 22. desember 2008 22:56
To: Antoine Pitrou
Cc: [email protected]
Subject: Re: [Python-Dev] extremely slow exit for program having huge (45G) 
dict (python 2.5.2)

>> Allocation of a new pool would have to do a linear search in these
>> pointers (finding the arena with the least number of pools);
> 
> You mean the least number of free pools, right?

Correct.
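
As a toy illustration of the linear search discussed above (plain Python
rather than the allocator's C code), assume each arena is represented
only by its count of free pools. That list representation and the name
pick_arena are hypothetical.

```python
def pick_arena(free_pools_per_arena):
    """Return the index of the arena with the fewest free pools.

    Arenas with no free pool left are skipped; None is returned when
    every arena is full and a new one would have to be allocated.
    """
    best = None
    for index, nfree in enumerate(free_pools_per_arena):
        if nfree > 0 and (best is None or nfree < free_pools_per_arena[best]):
            best = index
    return best


print(pick_arena([3, 0, 1, 5]))
```

Preferring the fullest arena that still has room presumably gives
sparsely used arenas a chance to empty out and be released.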



Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread Martin v. Löwis
> I'd like to suggest here, if you are giving this code a facelift,
> that on Windows you use VirtualAlloc and friends to allocate the
> arenas.  This gives you the most direct access to the VM manager and
> makes sure that a released arena is immediately available to the rest
> of the system.  It also makes sure that you don't mess with the
> regular heap and fragment it.

While I'd like to see this done myself, I believe it is independent
from the problem at hand. Contributions are welcome.

Regards,
Martin


Re: [Python-Dev] [ANN] Python 2.5.4 (final)

2008-12-23 Thread Terry Reedy

Martin v. Löwis wrote:

> For more information on Python 2.5.4, including download
> links for various platforms, release notes, and known issues, please
> see:
>
> http://www.python.org/2.5.4

http://www.python.org/download/releases/2.5.4/



Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Nick Coghlan
R. Bernstein wrote:
> Nick Coghlan writes:
>  > 3. Do what a number of standard library APIs (e.g. linecache) that
>  > accept filenames do and also accept an optional "module globals"
>  > argument. 
> 
> Actually, I did this and committed a change (to pydb) before posting
> any of these queries. ;-)
> 
> If "a number of standard library APIs" are doing the *same* thing,
> then shouldn't this be exposed as a common routine?
> 
> If on the other hand, by "a number" you mean "one" as in linecache --
> 1 *is* a number too! -- then perhaps the relevant code that is buried
> inside the "updatecache" should be exposed on its own.  (As a side
> benefit that code can be tested separately too!)
> 
> Should I file a feature request for this? 

The reason for my slightly odd phrasing is that all of the examples I
was originally going to mention (traceback, pdb, doctest, inspect)
actually all end up calling linecache to do the heavy lifting.

So it is possible that linecache.getlines() actually *is* the common
routine you're looking for - it just needs to be added to the
documentation and the __all__ attribute for linecache to be officially
supported. Currently, only the single line getline() function is
documented and exposed via __all__, but I don't see any reason for that
restriction - linecache.getlines() has been there with a stable API
since at least Python 2.5.

For cases where you have an appropriate Python object (i.e. a module,
function, method, class, traceback, frame or code object) rather than a
pseudo-filename, then inspect.getsource() actually jumps through a lot
of hoops to try to find the actual source code for that object - in
those cases, using the appropriate inspect function is generally a much
better idea than trying to interpret __file__ yourself.
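
A short sketch of both routes, assuming a current Python 3 interpreter
and using the standard textwrap module purely as a convenient target:

```python
import inspect
import linecache
import textwrap

# linecache.getlines() accepts an optional module-globals dict. When the
# filename cannot be read from disk, linecache can fall back on the
# loader recorded in those globals (PEP 302) to obtain the source.
lines = linecache.getlines(textwrap.__file__, vars(textwrap))

# With a real object in hand, let inspect locate the source rather than
# interpreting __file__ by hand.
source = inspect.getsource(textwrap.dedent)

print(len(lines))
print(source.splitlines()[0])
```

Both calls return source text only; neither reports which loader or
archive the code came from.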

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---


Re: [Python-Dev] [ANN] Python 2.5.4 (final)

2008-12-23 Thread Martin v. Löwis
>> For more information on Python 2.5.4, including download
>> links for various platforms, release notes, and known issues, please
>> see:
>>
>> http://www.python.org/2.5.4
> 
> http://www.python.org/download/releases/2.5.4/

Thanks for pointing that out; the original URL now works as well
(as it does for all other releases).

Regards,
Martin


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread Steven D'Aprano
On Sun, 21 Dec 2008 06:45:11 am Antoine Pitrou wrote:
> Steven D'Aprano writes:
> > In November 2007, a similar problem was reported on the
> > comp.lang.python newsgroup. 370MB was large enough to demonstrate
> > the problem. I don't know if a bug was ever reported.
>
> Do you still reproduce it on trunk?
> I've tried your scripts on my machine and they work fine, even if I
> leave garbage collecting enabled during the process.
> (dual core 64-bit machine but in 32-bit mode)

I'm afraid that sometime over the last year, I replaced my computer's 
motherboard, and now I can't reproduce the behaviour at all. I've tried 
two different boxes, with both Python 2.6.1 and 2.5.1.


-- 
Steven D'Aprano


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread R. Bernstein
Nick Coghlan writes:
 > R. Bernstein wrote:
 > > Nick Coghlan writes:
 > >  > 3. Do what a number of standard library APIs (e.g. linecache) that
 > >  > accept filenames do and also accept an optional "module globals"
 > >  > argument. 
 > > 
 > > Actually, I did this and committed a change (to pydb) before posting
 > > any of these queries. ;-)
 > > 
 > > If "a number of standard library APIs" are doing the *same* thing,
 > > then shouldn't this be exposed as a common routine?
 > > 
 > > If on the other hand, by "a number" you mean "one" as in linecache --
 > > 1 *is* a number too! -- then perhaps the relevant code that is buried
 > > inside the "updatecache" should be exposed on its own.  (As a side
 > > benefit that code can be tested separately too!)
 > > 
 > > Should I file a feature request for this? 
 > 
 > The reason for my slightly odd phrasing is that all of the examples I
 > was originally going to mention (traceback, pdb, doctest, inspect)
 > actually all end up calling linecache to do the heavy lifting.
 > 
 > So it is possible that linecache.getlines() actually *is* the common
 > routine you're looking for 

I never asked about getting the text lines for the source code, no
matter how many times people suggest that as an alternative. :-)

Instead, I was asking about a common way to get information about the
source location for say a frame or traceback object (which might
include package name and type) and suggest that there should be a more
unambiguous way to display this information than seems to be in use at
present.

Part of the work to retrieve or display that information has to do
some of the same things that are done inside linecache.updatecache()
*before* it retrieves the lines of the source code (when
possible). And possibly it includes parts of what's done in
pieces of the inspect module.

 > - it just needs to be added to the
 > documentation and the __all__ attribute for linecache to be officially
 > supported. Currently, only the single line getline() function is
 > documented and exposed via __all__, but I don't see any reason for that
 > restriction - linecache.getlines() has been there with a stable API
 > since at least Python 2.5.
 > 
 > For cases where you have an appropriate Python object (i.e. a module,
 > function, method, class, traceback, frame or code object) rather than a
 > pseudo-filename, then inspect.getsource() actually jumps through a lot
 > of hoops to try to find the actual source code for that object - in
 > those cases, using the appropriate inspect function is generally a much
 > better idea than trying to interpret __file__ yourself.
 > 
 > Cheers,
 > Nick.

Thanks for the information. I will keep in mind those inspect routines. 

They will probably be helpful for another problem I had been
wondering about -- how one can determine if there is no code
associated with a given line and file. (In other words, an invalid
location for a debugger line breakpoint, such as because the line is
part of a comment or the interior line of a string that spans many
lines.)
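
One possible approach to that second problem, sketched under assumptions:
compile the source and collect the line numbers at which some bytecode
instruction starts, using the standard dis module. The name
lines_with_code is hypothetical, and whether this set matches exactly
what a given debugger accepts as a breakpoint is not something the sketch
establishes.

```python
import dis


def lines_with_code(code):
    """Return the set of line numbers that start at least one bytecode
    instruction in `code` or in any code object nested inside it."""
    # Skip entries with no real line (None or 0 on newer interpreters).
    lines = {line for _, line in dis.findlinestarts(code) if line}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested function or class body
            lines |= lines_with_code(const)
    return lines


SOURCE = '''\
# a comment line
x = """first
interior line of a string
last"""
y = 2
'''

valid = lines_with_code(compile(SOURCE, "<example>", "exec"))
print(sorted(valid))
```

Lines absent from the set, such as the comment and the interior of the
multi-line string here, carry no code of their own.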

 > 
 > -- 
 > Nick Coghlan   |   [email protected]   |   Brisbane, Australia
 > ---


Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread Steve Holden
R. Bernstein wrote:
> Nick Coghlan writes:
>  > R. Bernstein wrote:
>  > > Nick Coghlan writes:
>  > >  > 3. Do what a number of standard library APIs (e.g. linecache) that
>  > >  > accept filenames do and also accept an optional "module globals"
>  > >  > argument. 
>  > > 
>  > > Actually, I did this and committed a change (to pydb) before posting
>  > > any of these queries. ;-)
>  > > 
>  > > If "a number of standard library APIs" are doing the *same* thing,
>  > > then shouldn't this be exposed as a common routine?
>  > > 
>  > > If on the other hand, by "a number" you mean "one" as in linecache --
>  > > 1 *is* a number too! -- then perhaps the relevant code that is buried
>  > > inside the "updatecache" should be exposed on its own.  (As a side
>  > > benefit that code can be tested separately too!)
>  > > 
>  > > Should I file a feature request for this? 
>  > 
>  > The reason for my slightly odd phrasing is that all of the examples I
>  > was originally going to mention (traceback, pdb, doctest, inspect)
>  > actually all end up calling linecache to do the heavy lifting.
>  > 
>  > So it is possible that linecache.getlines() actually *is* the common
>  > routine you're looking for 
> 
> I never asked about getting the text lines for the source code, no
> matter how many times people suggest that as an alternative. :-)
> 
> Instead, I was asking about a common way to get information about the
> source location for say a frame or traceback object (which might
> include package name and type) and suggest that there should be a more
> unambiguous way to display this information than seems to be in use at
> present.
> 
I agree. Since PEP 302 many parts of Python are rather too file-centric
for my liking. I noted almost four years ago, for example, that the
interpreter assumes that the os module will be imported from filestore
in order to set the prefix. This issue appears to have received no
attention since, and I'm certainly not the one with the best skills or
knowledge to solve this problem.

  http://bugs.python.org/issue1116520

> Part of the work to retrieve or display that information has to do
> some of the same things that are done inside linecache.updatecache()
> *before* it retrieves the lines of the source code (when
> possible). And possibly it includes parts of what's done in
> pieces of the inspect module.
> 
>  > - it just needs to be added to the
>  > documentation and the __all__ attribute for linecache to be officially
>  > supported. Currently, only the single line getline() function is
>  > documented and exposed via __all__, but I don't see any reason for that
>  > restriction - linecache.getlines() has been there with a stable API
>  > since at least Python 2.5.
>  > 
>  > For cases where you have an appropriate Python object (i.e. a module,
>  > function, method, class, traceback, frame or code object) rather than a
>  > pseudo-filename, then inspect.getsource() actually jumps through a lot
>  > of hoops to try to find the actual source code for that object - in
>  > those cases, using the appropriate inspect function is generally a much
>  > better idea than trying to interpret __file__ yourself.
>  > 
>  > Cheers,
>  > Nick.
> 
> Thanks for the information. I will keep in mind those inspect routines. 
> 
> They will probably be helpful for another problem I had been
> wondering about -- how one can determine if there is no code
> associated with a given line and file. (In other words, an invalid
> location for a debugger line breakpoint, such as because the line is
> part of a comment or the interior line of a string that spans many
> lines.)
> 
Looks like the start of some necessary attention to this issue. The
inspect module might indeed offer the right facilities. I'm still
wondering what we do about the various prefix settings in an environment
where there are no filestore imports at all.

In the event I can assist feel free to rope me in.

regards
 Steve
-- 
Steve Holden    +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: [Python-Dev] Should there be a way or API for retrieving from a code object a loader method and package file where the code comes from?

2008-12-23 Thread rocky
Paul Moore writes:
 > 2008/12/23 R. Bernstein :
 > >  A use case here I am thinking of here is in a stack trace or a
 > >  debugger, or a tool which wants to show in great detail, information
 > >  from a code object obtained possibly via a frame object.
 > 
 > Thanks for the clarifications. I see what you're after much better now.
 > 
 > > I find it kind of sucky to see in a traceback: "<string>" as opposed
 > > to the text (or prefix of the text) of the actual string that was
 > > passed. Or something that has been referred to as a "pseudo-file" like
 > > /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg/foo/bar.py
 > > when it is really member foo/bar.py of zipped egg
 > > /usr/lib/python2.5/site-packages/tracer-0.1.0-py2.5.egg.
 > 
 > Fair comment. That points to a "human readable" type of string. It's
 > not available at the moment, but I guess it could be.
 > 
 > But see below.
 > 
 > >  > -  xxx.get_source_code(token) is a function (I don't know where,
 > >  > xxx is a placeholder for some "suitable" module) which, given such a
 > >  > token, returns the source, or None if there's no viable concept of
 > >  > "the source".
 > >
 > > There always is a viable concept of a source. It's whatever was done
 > > to get the code. For example, if it was via an eval then the source
 > > was the eval function and a string, same for exec. If it's via
 > > database access, well that then and some summary info about what's
 > > known about that.
 > 
 > Hmm, "source" colloquially, yes "bytecode loaded from \xxx.pyc",
 > for example. But not "source" in the sense of "source code". Some
 > applications run with only bytecode shipped, no source code available
 > at all.
 > 
 > > There are two problems. One is displaying location information in an
 > > unambiguous way -- the pseudo-file above is ambiguous and so is
 > > "<string>", since OSes make no guarantee that a file cannot be named
 > > that. The second problem is programmatically getting information, as
 > > a debugger or an IDE might, so that it can be conveyed back to a
 > > user who might want to inspect surrounding source code or modules.
 > 
 > This is more than you were asking for above.
 > 
 > The first problem is addressed with a "human readable" (narrative)
 > description, as above.
 > 
 > The second, however, requires machine-readable access to source code
 > (if it exists). That's what the loader get_source() call does for you.
 > But you have to be prepared for the fact that it may not be possible
 > to get source code, and decide what you want to happen in that case.

I'm missing your point here. 

When one uses information from a traceback, or is in a debugger, or is
in an IDE, it is assumed that in order to use the information given
you'll need access to the source code. And IDEs and debuggers have had
to deal, from day one, with the fact that source code may not be
available, even before there was zipimporter.

In order to get the strings of source text that linecache.getlines()
gives, it has to prowl around for other information, possibly looking
for a loader following the protocol defined in PEP 302 and/or others.
And it's that information that a debugger, IDE or some tool of that ilk
might need.
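
To make that concrete, here is a minimal sketch of the two routes to the
source text of a module that lives inside a zip archive: asking the PEP
302 loader directly via get_source(), and letting linecache.getlines()
find the loader through the module's globals. The module name
zipped_demo_mod, the archive name demo.zip and both helper functions are
invented for the example.

```python
import importlib
import linecache
import os
import sys
import tempfile
import zipfile

def source_via_loader(module):
    """Ask the module's loader for source text; None if it cannot say."""
    loader = getattr(module, "__loader__", None)
    get_source = getattr(loader, "get_source", None)
    if get_source is None:
        return None  # get_source() is an optional part of the protocol
    try:
        return get_source(module.__name__)
    except ImportError:
        return None

def make_zipped_module(name="zipped_demo_mod"):
    """Write a tiny module into a fresh zip archive and import it from there."""
    archive = os.path.join(tempfile.mkdtemp(), "demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(name + ".py", "def hello():\n    return 'hi'\n")
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(name)
    finally:
        sys.path.remove(archive)

module = make_zipped_module()

# Route 1: talk to the loader directly.
text = source_via_loader(module)

# Route 2: linecache cannot stat the pseudo-filename, so it falls back to
# the loader it finds in the module globals passed as the second argument.
lines = linecache.getlines(module.__file__, module.__dict__)

print(module.__file__)
print(text)
print(lines)
```

Both routes give back text only. Neither says which archive the text
came from or what the member is called inside it, which is the
information I am after.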

Many IDEs and debuggers nowadays open a socket and pass information
back and forth over that. An obvious advantage is that it means you
can debug remotely. But in order for this to work, some information is
generally passed back and forth regarding the location of the source
text. In the Java world, and Eclipse for example, it is possible for
the jar to be in a different location from the one on the machine
which you might be debugging on. And probably too often that jar isn't
the same one. So it is helpful in this kind of scenario to break out a
location into the name of a jar and the member inside the jar. Perhaps
also some information about that jar.
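
For the Python equivalent (a zipped egg rather than a jar), a sketch of
breaking a pseudo-filename into archive and member might look like the
following. It assumes the archive is a zip file reachable on the local
filesystem; split_archive_member is a made-up name, not an existing API,
and demo.egg is just a scratch archive.

```python
import os
import tempfile
import zipfile

def split_archive_member(pseudo_path):
    """Split 'dir/x.egg/foo/bar.py' into ('dir/x.egg', 'foo/bar.py').

    Returns (None, pseudo_path) when no leading part of the path is a
    zip archive, e.g. for an ordinary file or a name like '<string>'.
    """
    head, parts = pseudo_path, []
    # Strip trailing components until what is left exists on disk.
    while head and not os.path.exists(head):
        head, tail = os.path.split(head)
        if not tail:
            break
        parts.append(tail)
    if parts and os.path.isfile(head) and zipfile.is_zipfile(head):
        return head, "/".join(reversed(parts))
    return None, pseudo_path

# Build a scratch archive so the example can run anywhere.
archive = os.path.join(tempfile.mkdtemp(), "demo.egg")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("foo/bar.py", "x = 1\n")

pseudo = os.path.join(archive, "foo", "bar.py")
print(split_archive_member(pseudo))
print(split_archive_member(archive))
```

This is guesswork after the fact, which is the point of my complaint: a
front end on another machine cannot probe the filesystem this way, so it
would be better if the pair were available from the code object or its
loader in the first place.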

It is possible that, instead of passing around locations, debuggers and
such tools use get_source(), because that's what Python has to
offer.  :-)

I jest here, but honestly I've been surprised that there is no IDE
that I know of that in fact works this way. The machine running the
code clearly may have more accurate access to the source than a
front-end IDE. Undeterred by the harsh facts of reality, I have hope
that someday there *might* be an IDE that has provision for this. So
in a Ruby debugger (ruby-debug) one can request checksum information
on the files the debugger thinks are loaded, in order to facilitate
checking that the source an IDE might be showing in fact matches
the source for the part of the code that is currently under
investigation.
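
A Python analogue of that check could be sketched as below. This is not
how ruby-debug does it; the choice of SHA-1 and the function name
source_checksum are my own assumptions for the example, and the scratch
file exists only to have something to hash.

```python
import hashlib
import linecache
import os
import tempfile

def source_checksum(filename, module_globals=None):
    """Return a hex SHA-1 of the source text linecache sees, or None.

    Hypothetical helper: a front end would compute the same digest over
    its own copy of the file and compare the two values.
    """
    lines = linecache.getlines(filename, module_globals)
    if not lines:
        return None  # no source text could be found for this name
    return hashlib.sha1("".join(lines).encode("utf-8")).hexdigest()

# Scratch file standing in for a module the debugged program has loaded.
content = b"x = 1\ny = 2\n"
path = os.path.join(tempfile.mkdtemp(), "checked.py")
with open(path, "wb") as fh:
    fh.write(content)

digest = source_checksum(path)
missing = source_checksum(os.path.join(os.path.dirname(path), "absent.py"))
print(digest, missing)
```

Passing the module's globals as the second argument lets the same check
cover code that came through a loader rather than from a plain file.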


 > 
 > >  > I hope this is of some help,
 > >
 > > Yes, thanks. At least I now have a clearer idea of where things
 > > stand.
 > 
 > Good. Sorry it's not better news :-)
 > 
 > Paul
 > 
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/li