Re: [Python-Dev] Memory management in the AST parser compiler
Neil Schemenauer wrote:
> Fredrik Lundh [EMAIL PROTECTED] wrote:
>> Thomas Lee wrote:
>>> Even if it meant we had just one function call - one, safe function call that deallocated all the memory allocated within a function - that we had to put before each and every return, that's better than what we have.
>>
>> alloca?
>
> Perhaps we should use the memory management technique that the rest of Python uses: reference counting. I don't see why the AST structures couldn't be PyObjects.
>
> Neil

I'm +1 for reference counting. It's going to be a little error prone initially (certainly much less error prone than the current system in the long run), but the pooling/arena idea is going to screw with all sorts of stuff within the AST and possibly in bits of Python/compile.c too. At least, all my attempts wound up looking that way :)

Cheers,
Tom
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/krumms%40gmail.com
[Python-Dev] urlparse brokenness
It is my assertion that urlparse is currently broken. Specifically, I think that urlparse breaks an abstraction boundary with ill effect.

In writing a mailclient, I wished to allow my users to specify their imap server as a url, such as 'imap://user:[EMAIL PROTECTED]:port/'. Which worked fine. I then thought that the natural extension to support configuration of imapssl would be 'imaps://user:[EMAIL PROTECTED]:port/' which failed - user:[EMAIL PROTECTED]:port got parsed as the *path* of the URL instead of the network location. It turns out that urlparse keeps a table of url schemes that 'use netloc'... that is to say, that have a 'user:[EMAIL PROTECTED]:port' part to their URL.

I think this 'special knowledge' about particular schemes 1) breaks an abstraction boundary by having a function whose charter is to pull apart a particularly-formatted string behave differently based on the meaning of the string instead of the structure of it and 2) fails to be extensible or forward compatible due to hardcoded 'magic' strings - if schemes were somehow 'registerable' as 'netloc using' or not, then this objection might be nullified, but the previous objection would still stand.

So I propose that urlsplit, the main offender, be replaced with something that looks like:

    def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):
        """Parse a URL into 5 components:
        scheme://netloc/path?query#fragment
        Return a 5-tuple: (scheme, netloc, path, query, fragment).
        Note that we don't break the components up in smaller bits
        (e.g. netloc is a single string) and we don't expand % escapes."""
        key = url, scheme, allow_fragments, default
        cached = _parse_cache.get(key, None)
        if cached:
            return cached
        if len(_parse_cache) >= MAX_CACHE_SIZE: # avoid runaway growth
            clear_cache()
        if "://" in url:
            uscheme, npqf = url.split("://", 1)
        else:
            uscheme = scheme
            if not uscheme:
                uscheme = default[0]
            npqf = url
        pathidx = npqf.find('/')
        if pathidx == -1: # not found
            netloc = npqf
            path, query, fragment = default[1:4]
        else:
            netloc = npqf[:pathidx]
            pqf = npqf[pathidx:]
            if '?' in pqf:
                path, qf = pqf.split('?',1)
            else:
                path, qf = pqf, ''.join(default[3:5])
            if ('#' in qf) and allow_fragments:
                query, fragment = qf.split('#',1)
            else:
                query, fragment = default[3:5]
        tuple = (uscheme, netloc, path, query, fragment)
        _parse_cache[key] = tuple
        return tuple

Note that I'm not sold on the _parse_cache, but I'm assuming it was there for a reason so I'm leaving that functionality as-is.

If this isn't the right forum for this discussion, or the right place to submit code, please let me know. Also, please cc: me directly on responses as I'm not subscribed to the firehose that is python-dev.

--pj
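For readers who want to try the structure-only idea in isolation, here is a minimal sketch. It is not the code posted above and not what urlparse does: `split_by_structure` and `default_scheme` are hypothetical names, the cache and the `allow_fragments` switch are left out, and the host name in the usage lines is made up. Unlike the posted draft as I read it, this version also keeps a query string that has no fragment after it.

```python
def split_by_structure(url, default_scheme=""):
    """Split a URL purely on its delimiters, ignoring what the scheme means.

    Returns (scheme, netloc, path, query, fragment). Hypothetical helper,
    written only to illustrate scheme-independent splitting.
    """
    scheme, sep, rest = url.partition("://")
    if not sep:
        # No "://" at all: fall back to the caller-supplied scheme.
        scheme, rest = default_scheme, url
    # Peel off the fragment first, then the query, so that a "#" that
    # follows a "?" ends up in the fragment rather than the query.
    rest, _, fragment = rest.partition("#")
    rest, _, query = rest.partition("?")
    # Everything before the first "/" is the network location.
    slash = rest.find("/")
    if slash == -1:
        netloc, path = rest, ""
    else:
        netloc, path = rest[:slash], rest[slash:]
    return scheme, netloc, path, query, fragment


# The same rule applies to every scheme, known or not.
print(split_by_structure("imap://user:secret@mail.example.com:143/INBOX"))
print(split_by_structure("imaps://user:secret@mail.example.com:993/INBOX"))
```

Because the function never looks the scheme up in a table, 'imaps' needs no registration step; the cost is that the caller has to decide whether a netloc is meaningful for a given scheme.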
Re: [Python-Dev] urlparse brokenness
On Tue, Nov 22, 2005, Paul Jimenez wrote:
> If this isn't the right forum for this discussion, or the right place to submit code, please let me know. Also, please cc: me directly on responses as I'm not subscribed to the firehose that is python-dev.

This is the right forum for discussion. You should post your patch to SourceForge *before* starting a discussion on python-dev, including a link to the patch in your post. It is not essential, but it is certainly a courtesy to subscribe to python-dev for the duration of the discussion; you can feel free to filter threads you're not interested in.
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/

If you think it's expensive to hire a professional to do the job, wait until you hire an amateur. --Red Adair
Re: [Python-Dev] PEP 302, PEP 338 and imp.getloader (was Re: a Python interface for the AST (WAS: DRAFT: python-dev...)
At 11:51 PM 11/23/2005 +1000, Nick Coghlan wrote:
> The key thing that is missing is the imp.getloader functionality discussed at the end of PEP 302.

This isn't hard to implement per se; setuptools for example has a 'get_importer' function, and going from importer to loader is simple:

    def get_importer(path_item):
        """Retrieve a PEP 302 importer for the given path item

        If there is no importer, this returns a wrapper around the builtin import
        machinery. The returned importer is only cached if it was created by a
        path hook.
        """
        try:
            importer = sys.path_importer_cache[path_item]
        except KeyError:
            for hook in sys.path_hooks:
                try:
                    importer = hook(path_item)
                except ImportError:
                    pass
                else:
                    break
            else:
                importer = None

        sys.path_importer_cache.setdefault(path_item,importer)
        if importer is None:
            try:
                importer = ImpWrapper(path_item)
            except ImportError:
                pass
        return importer

So with the above function you could do something like:

    def get_loader(fullname, path):
        for path_item in path:
            try:
                loader = get_importer(path_item).find_module(fullname)
                if loader is not None:
                    return loader
            except ImportError:
                continue
        else:
            return None

in order to implement the rest.

> ** I'm open to suggestions on how to deal with argv[0] and __file__. They should be set to whatever __file__ would be set to by the module loader, but the Importer Protocol in PEP 302 doesn't seem to expose that information. The current proposal is a compromise that matches the existing behaviour of -m (which supports scripts like regrtest.py) while still giving a meaningful value for scripts which are not part of the normal filesystem.

Ugh. Those are tricky, no question. I can think of several simple answers for each, all of which are wrong in some way. :)
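The hook-then-cache lookup described above can be tried without touching the real import machinery. The following is only a sketch under stated assumptions: `path_hooks` and `importer_cache` are local stand-ins for sys.path_hooks and sys.path_importer_cache, `ZipLikeImporter` is a made-up hook that accepts any path ending in ".zip", and the ImpWrapper fallback from the message is deliberately omitted, so a path item that no hook claims is simply skipped.

```python
class ZipLikeImporter:
    """Toy path hook: accepts path items ending in '.zip' (hypothetical)."""

    def __init__(self, path_item):
        if not path_item.endswith(".zip"):
            # PEP 302 path hooks signal "not mine" by raising ImportError.
            raise ImportError(path_item)
        self.path_item = path_item

    def find_module(self, fullname):
        # Pretend the archive contains exactly one module.
        return self if fullname == "inside_zip" else None


path_hooks = [ZipLikeImporter]   # stand-in for sys.path_hooks
importer_cache = {}              # stand-in for sys.path_importer_cache


def lookup_importer(path_item):
    """Return the cached importer for path_item, asking each hook once."""
    try:
        return importer_cache[path_item]
    except KeyError:
        pass
    for hook in path_hooks:
        try:
            importer = hook(path_item)
        except ImportError:
            continue
        break
    else:
        importer = None  # no hook claimed this path item
    importer_cache[path_item] = importer
    return importer


def lookup_loader(fullname, path):
    """Walk the path and return the first loader that knows fullname."""
    for path_item in path:
        importer = lookup_importer(path_item)
        if importer is None:
            continue  # guard: nothing to ask for this path item
        loader = importer.find_module(fullname)
        if loader is not None:
            return loader
    return None


search_path = ["/plain/dir", "/tmp/bundle.zip"]
print(lookup_loader("inside_zip", search_path))
print(lookup_loader("missing", search_path))
```

Caching the None result for "/plain/dir" means the hooks are not consulted again for that entry, which is the same trade-off the real importer cache makes.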
Re: [Python-Dev] urlparse brokenness
Paul Jimenez wrote:
> So I propose that urlsplit, the main offender, be replaced with something that looks like:
>
> def urlsplit(url, scheme='', allow_fragments=1, default=('','','','','')):

+1 in principle.

You should probably do a global _parse_cache and add 'is not None' after 'if cached'.
Re: [Python-Dev] a Python interface for the AST (WAS: DRAFT: python-dev...)
On 11/23/05, Greg Ewing [EMAIL PROTECTED] wrote:
> Brett Cannon wrote:
>> There are two problems to this topic; how to get the AST structs into Python objects and how to allow Python code to modify the AST before bytecode emission
>
> I'm astounded to hear that the AST isn't made from Python objects in the first place. Is there a particular reason it wasn't done that way?

I honestly don't know, Greg. All of the structs are generated by Parser/asdl_c.py, which reads in the AST definition from Parser/Python.asdl. The code that is used to allocate and initialize the structs is in Python/Python-ast.c and is also auto-generated by Parser/asdl_c.py.

I am guessing here, but it might have to do with type safety. Some nodes can be different kinds of subnodes (like the stmt node) and thus are created using a single struct and a bunch of unions internally. So there is some added security that stuff is being done correctly. Otherwise memory is the only other reason I can think of. Or Jeremy just didn't think of doing it that way when this was all started years ago. =)

But since it is all auto-generated it should be doable to make them Python objects.

-Brett