Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-23 Thread Thomas Lee
Neil Schemenauer wrote:

> Fredrik Lundh [EMAIL PROTECTED] wrote:
>
>> Thomas Lee wrote:
>>
>>> Even if it meant we had just one function call - one, safe function call
>>> that deallocated all the memory allocated within a function - that we
>>> had to put before each and every return, that's better than what we
>>> have.
>>
>> alloca?
>
> Perhaps we should use the memory management technique that the rest
> of Python uses: reference counting.  I don't see why the AST
> structures couldn't be PyObjects.
>
>   Neil


I'm +1 for reference counting. It's going to be a little error prone 
initially (certainly much less error prone than the current system in 
the long run), but the pooling/arena idea is going to screw with all 
sorts of stuff within the AST and possibly in bits of Python/compile.c 
too. At least, all my attempts wound up looking that way :)

Cheers,
Tom

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/krumms%40gmail.com

  




[Python-Dev] urlparse brokenness

2005-11-23 Thread Paul Jimenez

It is my assertion that urlparse is currently broken.  Specifically, I 
think that urlparse breaks an abstraction boundary with ill effect.

In writing a mailclient, I wished to allow my users to specify their
imap server as a url, such as 'imap://user:[EMAIL PROTECTED]:port/'. Which
worked fine. I then thought that the natural extension to support
configuration of imapssl would be 'imaps://user:[EMAIL PROTECTED]:port/'
which failed - user:[EMAIL PROTECTED]:port got parsed as the *path* of
the URL instead of the network location. It turns out that urlparse
keeps a table of url schemes that 'use netloc'... that is to say,
that have a 'user:[EMAIL PROTECTED]:port' part to their URL. I think this
'special knowledge' about particular schemes 1) breaks an abstraction
boundary by having a function whose charter is to pull apart a
particularly-formatted string behave differently based on the meaning of
the string instead of the structure of it and 2) fails to be extensible
or forward compatible due to hardcoded 'magic' strings - if schemes were
somehow 'registerable' as 'netloc using' or not, then this objection
might be nullified, but the previous objection would still stand.
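
To make the objection concrete, here is a minimal sketch (not the actual
urlparse source) contrasting a scheme-table approach with a purely
structural one. The names USES_NETLOC, split_with_table and
split_by_structure, the contents of the table, and the sample URLs are all
assumptions made up for illustration.

```python
# Hypothetical illustration only; this is not the real urlparse code.
USES_NETLOC = {"http", "ftp", "imap"}  # assumed sample of a hardcoded table


def _take_netloc(rest):
    """Split '//netloc/path...' into (netloc, remainder)."""
    end = rest.find("/", 2)
    if end == -1:
        end = len(rest)
    return rest[2:end], rest[end:]


def split_with_table(url):
    """Recognise a netloc only when the scheme is listed in the table."""
    scheme, _, rest = url.partition(":")
    netloc = ""
    if scheme in USES_NETLOC and rest.startswith("//"):
        netloc, rest = _take_netloc(rest)
    return scheme, netloc, rest


def split_by_structure(url):
    """Recognise a netloc for any scheme, based on the '//' prefix alone."""
    scheme, _, rest = url.partition(":")
    netloc = ""
    if rest.startswith("//"):
        netloc, rest = _take_netloc(rest)
    return scheme, netloc, rest


print(split_with_table("imap://user:pw@host:143/"))
print(split_with_table("imaps://user:pw@host:993/"))
print(split_by_structure("imaps://user:pw@host:993/"))
```

With the table, the unknown 'imaps' scheme leaves the whole
'//user:pw@host:993/' in the path; the structural version treats both
schemes alike.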

So I propose that urlsplit, the main offender, be replaced with something
that looks like:

def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):
    """Parse a URL into 5 components:
    scheme://netloc/path?query#fragment
    Return a 5-tuple: (scheme, netloc, path, query, fragment).
    Note that we don't break the components up in smaller bits
    (e.g. netloc is a single string) and we don't expand % escapes."""
    key = url, scheme, allow_fragments, default
    cached = _parse_cache.get(key, None)
    if cached:
        return cached
    if len(_parse_cache) >= MAX_CACHE_SIZE: # avoid runaway growth
        clear_cache()

    if "://" in url:
        uscheme, npqf = url.split("://", 1)
    else:
        uscheme = scheme
        if not uscheme:
            uscheme = default[0]
        npqf = url
    pathidx = npqf.find('/')
    if pathidx == -1:  # not found
        netloc = npqf
        path, query, fragment = default[1:4]
    else:
        netloc = npqf[:pathidx]
        pqf = npqf[pathidx:]
        if '?' in pqf:
            path, qf = pqf.split('?',1)
        else:
            path, qf = pqf, ''.join(default[3:5])
        if ('#' in qf) and allow_fragments:
            query, fragment = qf.split('#',1)
        else:
            query, fragment = default[3:5]
    tuple = (uscheme, netloc, path, query, fragment)
    _parse_cache[key] = tuple
    return tuple

Note that I'm not sold on the _parse_cache, but I'm assuming it was there
for a reason so I'm leaving that functionality as-is.

If this isn't the right forum for this discussion, or the right place to 
submit code, please let me know.  Also, please cc: me directly on responses
as I'm not subscribed to the firehose that is python-dev.

  --pj



Re: [Python-Dev] urlparse brokenness

2005-11-23 Thread Aahz
On Tue, Nov 22, 2005, Paul Jimenez wrote:

> If this isn't the right forum for this discussion, or the right place
> to submit code, please let me know.  Also, please cc: me directly on
> responses as I'm not subscribed to the firehose that is python-dev.

This is the right forum for discussion.  You should post your patch to
SourceForge *before* starting a discussion on python-dev, including a
link to the patch in your post.  It is not essential, but it is certainly
a courtesy to subscribe to python-dev for the duration of the discussion;
you can feel free to filter threads you're not interested in.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur.  --Red Adair


Re: [Python-Dev] PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for the AST (WAS: DRAFT: python-dev...)

2005-11-23 Thread Phillip J. Eby
At 11:51 PM 11/23/2005 +1000, Nick Coghlan wrote:
> The key thing that is missing is the imp.getloader functionality discussed
> at the end of PEP 302.

This isn't hard to implement per se; setuptools for example has a 
'get_importer' function, and going from importer to loader is simple:

def get_importer(path_item):
    """Retrieve a PEP 302 importer for the given path item

    If there is no importer, this returns a wrapper around the builtin import
    machinery.  The returned importer is only cached if it was created by a
    path hook.
    """
    try:
        importer = sys.path_importer_cache[path_item]
    except KeyError:
        for hook in sys.path_hooks:
            try:
                importer = hook(path_item)
            except ImportError:
                pass
            else:
                break
        else:
            importer = None

    sys.path_importer_cache.setdefault(path_item,importer)
    if importer is None:
        try:
            importer = ImpWrapper(path_item)
        except ImportError:
            pass
    return importer

So with the above function you could do something like:

def get_loader(fullname, path):
    for path_item in path:
        try:
            loader = get_importer(path_item).find_module(fullname)
            if loader is not None:
                return loader
        except ImportError:
            continue
    else:
        return None

in order to implement the rest.
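
For comparison, here is a rough sketch of the same path walk in a modern
Python, assuming Python 3.10 or later, where pkgutil.get_importer is part of
the standard library and path entry finders expose find_spec instead of
find_module. The function name find_loader_on_path is hypothetical, and the
sketch only handles top-level module names.

```python
import pkgutil
import sys


def find_loader_on_path(fullname, path=None):
    """Return the loader for a top-level module, or None if not found."""
    for path_item in (sys.path if path is None else path):
        # get_importer consults sys.path_importer_cache and sys.path_hooks;
        # it returns None when no hook accepts the path item.
        importer = pkgutil.get_importer(path_item)
        if importer is None:
            continue
        # Assumption: the finder follows the find_spec protocol.
        find_spec = getattr(importer, "find_spec", None)
        if find_spec is None:
            continue
        spec = find_spec(fullname)
        if spec is not None:
            return spec.loader
    return None


print(find_loader_on_path("json"))
print(find_loader_on_path("no_such_module_xyz"))
```

The second call prints None, because no finder on sys.path can locate a
module with that name.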


> ** I'm open to suggestions on how to deal with argv[0] and __file__. They
> should be set to whatever __file__ would be set to by the module loader, but
> the Importer Protocol in PEP 302 doesn't seem to expose that information. The
> current proposal is a compromise that matches the existing behaviour of -m
> (which supports scripts like regrtest.py) while still giving a meaningful
> value for scripts which are not part of the normal filesystem.

Ugh.  Those are tricky, no question.  I can think of several simple answers 
for each, all of which are wrong in some way.  :)



Re: [Python-Dev] urlparse brokenness

2005-11-23 Thread Mike Brown
Paul Jimenez wrote:
> So I propose that urlsplit, the main offender, be replaced with something
> that looks like:
>
> def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):

+1 in principle.

You should probably do a
global _parse_cache

and add 'is not None' after 'if cached'.


Re: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...)

2005-11-23 Thread Brett Cannon
On 11/23/05, Greg Ewing [EMAIL PROTECTED] wrote:
> Brett Cannon wrote:
>
> > There are two problems to this topic; how to
> > get the AST structs into Python objects and how to allow Python code
> > to modify the AST before bytecode emission
>
> I'm astounded to hear that the AST isn't made from
> Python objects in the first place. Is there a particular
> reason it wasn't done that way?


I honestly don't know, Greg.  All of the structs are generated by
Parser/asdl_c.py which reads in the AST definition from
Parser/Python.asdl .  The code that is used to allocate and initialize
the structs is in Python/Python-ast.c and is also auto-generated by
Parser/asdl_c.py .

I am guessing here, but it might have to do with type safety.  Some
nodes can be different kinds of subnodes (like the stmt node) and thus
are created using a single struct and a bunch of unions internally.  So
there is some added security that stuff is being done correctly.

Otherwise memory is the only other reason I can think of.  Or Jeremy
just didn't think of doing it that way when this was all started years
ago.  =)  But since it is all auto-generated it should be doable to
make them Python objects.
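
As a point of reference for what "Python objects" could look like from the
Python side, later releases expose the tree through the standard ast module.
The sketch below assumes Python 3.8 or later; the class name DoubleIntegers
and the sample source string are made up for illustration, and it says
nothing about how the C structs discussed here are laid out.

```python
import ast


class DoubleIntegers(ast.NodeTransformer):
    """Rewrite every integer literal n as 2 * n before compilation."""

    def visit_Constant(self, node):
        value = node.value
        if isinstance(value, int) and not isinstance(value, bool):
            return ast.copy_location(ast.Constant(value=value * 2), node)
        return node


tree = ast.parse("result = 1 + 2")              # source text -> AST objects
tree = DoubleIntegers().visit(tree)             # modify the AST in Python
tree = ast.fix_missing_locations(tree)          # fill in line/column info
namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)  # AST -> bytecode -> run
print(namespace["result"])
```

The modified tree computes 2 + 4, so the script prints 6.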

-Brett