Re: [pypy-dev] Re: Mixed modules for both PyPy and CPython
Hi VanL,

On Wed, Apr 12, 2006 at 13:45 -0600, VanL wrote:
> Carl Friedrich Bolz wrote:
> > Careful. Restricted Python (or RPython) means something very specific in PyPy lingo. Restricted Python means that you stick to some restrictions in your coding style to make type inference possible. See http://codespeak.net/pypy/dist/pypy/doc/coding-guide.html#restricted-python
>
> Sorry - I knew that. I just didn't make the connection from one side of my brain to another. Maybe secure python (spy)?

just as a side note: "secure" is a very vague term, technically (and politically). The professor i learned with warned against talking about security without specifying which kinds of attacks and which scenarios one is talking about. Did you see my security-related posting a few days ago, btw?

> What made me think this would work are a couple of comments made here and on the py3k list. First, Armin (about a year ago, I think) made some comments about the objectspace abstraction allowing a Remote Object Space and allowing different object spaces to participate in the evaluation of code.

Right, that is still an open topic, touched on a bit by the parallel discussion on this list about integrating PyPy (and RPython) with CPython.

> Second, comments on py3k list indicated that secure python is difficult because of a) introspection, b) type inference, and c) GIL acquisition.

Hum, this list looks a bit weird to me. Could you state what the actual attacks are for which security measures are discussed? Or which use cases do people on py3k have in mind?

cheers, thanks,

holger
___
pypy-dev@codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev
Re: [pypy-dev] Summer of Code 2006
Niklaus Haldimann wrote:
> Hi there
>
> Google is doing Summer of Code again this year: http://code.google.com/soc/
>
> It would be possible to enter PyPy directly as a mentoring organization this time, instead of going through the PSF. Last year, student slots were given to mentoring organizations proportional to the number of applications. If there are enough applications for PyPy proper that might bring more students on board than taking slots from the PSF pool as last year (but maybe PyPy's influence in the PSF is big enough, and this doesn't really matter).

That's good news! I'd like to participate in SoC to work on PyPy; I hope I'll be able to apply (and to win :-)).

ciao
Anto
Re: [pypy-dev] Avoiding code duplication
Armin Rigo wrote:
> I think this is kind-of-reasonable. The ADT method approach of the lltypesystem was introduced late during the development of the rtyper; by now, it would be reasonable to define common method names between the ADT methods of the lltypesystem and the GENERIC_METHODS of the ootypesystem.
>
> I am unsure about the performance penalty. The current version of many ll helpers, for example, read the 'items' pointer only once and reuse it; if this gets replaced by ADT methods like 'getitem_nonneg()', it means that although the call is probably inlined there is still the overhead of reading 'items' on each iteration over the list. Who knows, maybe C compilers will notice and move the read out of the loop. Just give it a try on a small example like ll_listindex(), I guess...

Well, as we decided on #pypy I've changed the ADT interface. As I wrote in the commit log:

    The interface of ListRepr and FixedSizeListRepr has changed: two accessor methods have been added: ll_getitem_fast and ll_setitem_fast. They should be used instead of the ll_items()[index] idiom: that way, when ootypesystem's list supports that interface, we will be able to write functions usable with both typesystems with no modification.

    The various ll_* helper functions have been adapted to use the new interface. Moreover, functions that directly accessed the l.length field have been changed to call the ll_length() method instead, for the same reasons as above.

The next step is to rename ootypesystem's list _GENERIC_METHODS to match the ADT methods in lltypesystem's list; then we could try to share most of the ll_* functions that currently belong only to lltypesystem/rlist.py. I hope I will do it tomorrow.

> A different comment: as you mentioned on IRC it would be nice if the back-end could choose which methods it implements natively. At one point there was the idea that maybe the 'oopspec' attributes that started to show up in lltypesystem/rlist.py (used by the JIT only) could be useful in this respect. If I remember correctly, the idea didn't work out because of the different 'lowleveltype' needed, and the difference in the interface. Merging the ADT method names of lltyped lists and the GENERIC_METHODS of ootyped lists could be a step in this direction again. The interesting point is that each oo back-end could then choose to special-case the ll_xxx() functions with the oopspecs that they recognize, and just translate the other ones normally. (The ll back-ends always translate them all.)

I saw those 'oopspec' attributes, but I didn't understand their exact semantics; your proposal sounds reasonable to me: if I understand it correctly, this way the typesystem-specific code would be reduced to the minimum, which will help to port other Reprs such as rdict to ootypesystem, too. I'll investigate a bit in this direction as soon as I can.

good Easter to all,
ciao
Anto
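To make the accessor idea above concrete, here is a minimal sketch in plain Python (not RPython, and not the actual rlist.py code). The two list classes and the simplified signatures of ll_listindex() and ll_reverse() are assumptions made for illustration; only the method names ll_length, ll_getitem_fast and ll_setitem_fast come from the commit log quoted above.

```python
class ArrayBackedList:
    """Hypothetical stand-in for an lltypesystem-style list: 'items' plus 'length'."""

    def __init__(self, items):
        self.items = list(items)
        self.length = len(self.items)

    def ll_length(self):
        return self.length

    def ll_getitem_fast(self, index):
        return self.items[index]

    def ll_setitem_fast(self, index, value):
        self.items[index] = value


class OpaqueList:
    """Hypothetical stand-in for an ootypesystem-style list that hides its storage."""

    def __init__(self, items):
        self._cells = dict(enumerate(items))

    def ll_length(self):
        return len(self._cells)

    def ll_getitem_fast(self, index):
        return self._cells[index]

    def ll_setitem_fast(self, index, value):
        self._cells[index] = value


def ll_listindex(lst, value):
    """Shared helper: touches the list only through the accessor methods."""
    i = 0
    length = lst.ll_length()
    while i < length:
        if lst.ll_getitem_fast(i) == value:
            return i
        i += 1
    raise ValueError("value not in list")


def ll_reverse(lst):
    """Shared in-place reversal, again written only against the accessors."""
    i = 0
    j = lst.ll_length() - 1
    while i < j:
        tmp = lst.ll_getitem_fast(i)
        lst.ll_setitem_fast(i, lst.ll_getitem_fast(j))
        lst.ll_setitem_fast(j, tmp)
        i += 1
        j -= 1
```

Because neither helper reads 'items' or 'length' directly, the same source works for both list flavours; whether the extra method calls cost anything after inlining is exactly the open performance question raised in the quoted text.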
Re: [pypy-dev] Avoiding code duplication
Hi Antonio,

On Sat, Apr 15, 2006 at 12:59:07PM +0200, Antonio Cuni wrote:
> I saw that 'oopspec' attributes, but I didn't understand the exact semantic;

The oopspec string tells which abstract list operation this particular ll_*() function implements. For example:

    def ll_prepend(l, newitem):
        ...
    ll_prepend.oopspec = 'list.insert(l, 0, newitem)'

means that ll_prepend() is equivalent to an insert with the index set to zero. In the string, the pseudo-arguments between the ( ) are either real argument names of the ll_ function, or constants. So for example, if a backend has got its own way to implement the insert() calls in general, it could figure out from the oopspec that the ll_prepend() helper can be replaced by a custom stub invoking the backend's own version of insertion with an index of 0. That's essentially what the JIT does -- see handle_highlevel_operation() in jit/hintannotator/model.py.

A bientot,

Armin.
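As a rough illustration of how a backend might use an oopspec string, here is a small self-contained sketch. It is not the code in jit/hintannotator/model.py: parse_oopspec(), make_stub() and the backend_ops table are hypothetical names, the toy ll_prepend() body is invented, and constants are assumed to be integers only.

```python
import inspect
import re


def ll_prepend(l, newitem):
    # Toy body for illustration; the real helper works on low-level lists.
    l.insert(0, newitem)
ll_prepend.oopspec = 'list.insert(l, 0, newitem)'


def parse_oopspec(func):
    """Split an oopspec string into (operation name, list of pseudo-arguments)."""
    match = re.fullmatch(r'\s*([\w.]+)\((.*)\)\s*', func.oopspec)
    if match is None:
        raise ValueError("malformed oopspec: %r" % (func.oopspec,))
    opname, argtext = match.groups()
    pseudo_args = [a.strip() for a in argtext.split(',') if a.strip()]
    return opname, pseudo_args


def make_stub(func, backend_ops):
    """Return a stub calling the backend's native operation, if it has one.

    backend_ops maps operation names such as 'list.insert' to callables.
    Helpers whose oopspec the backend does not recognize are returned
    unchanged, i.e. they would simply be translated normally.
    """
    opname, pseudo_args = parse_oopspec(func)
    native = backend_ops.get(opname)
    if native is None:
        return func
    argnames = list(inspect.signature(func).parameters)

    def stub(*args):
        env = dict(zip(argnames, args))
        # A pseudo-argument is either a real argument name or a constant
        # (assumed here to be an integer literal).
        actual = [env[a] if a in env else int(a) for a in pseudo_args]
        return native(*actual)

    return stub
```

A backend with its own insertion primitive would pass something like `{'list.insert': native_insert}`; calling the resulting stub with `(lst, item)` then invokes `native_insert(lst, 0, item)`.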
Re: [pypy-dev] Mixed modules for both PyPy and CPython
Hi Holger,

On Fri, Apr 14, 2006 at 07:44:12PM +0200, holger krekel wrote:
> Hum, i wonder how strongly opposed these explicit versus implicit level separation models need to be.

Yes, you're right here. It's mostly about what we need to do next: we must choose one of the two models and develop it enough, until it becomes useful for PyPy and CPython alike. We could possibly do both models in parallel, but I'm not sure it's the best way forward at this point.

> Is it not possible to support a programming model that can mostly avoid knowing about interpreter versus application level distinction without extending/refactoring the annotator?

Sure, no-one thinks about refactoring the annotator. The implicit model already works, by using the existing support for SomeObjects, completed by Christian over time. It's a bit hackish, though, and we'll definitely need ways to control where SomeObjects are expected or not. At the moment, what makes me reluctant to continue with the implicit model are two other issues: on the one hand, it's unclear how it would work for PyPy (it works for CPython only); and on the other hand, there are many language design issues ahead that I'd rather avoid for the time being.

> explicit approach nicely working before experimenting with where we can go from there, right?

Yes, exactly my opinion.

> (Btw, i wouldn't mind if such glue code would not allow all possible interactions - our primary goal is not to provide seamless integration with CPython here).

I'm not too worried about this. Our mixed-module model already supports mostly any kind of interaction, including defining new types with properties and overridden operations. The path to support the same for CPython extension modules is more or less clear, and very incremental.

A bientot,

Armin.
[pypy-dev] ext module testing modes
Hi Armin,

On Sat, Apr 15, 2006 at 13:02 +0200, [EMAIL PROTECTED] wrote:
> Author: arigo
> Date: Sat Apr 15 13:02:14 2006
> New Revision: 25852
>
> Added:
>    pypy/dist/pypy/rpython/rctypes/socketmodule/test_addr.py (contents, props changed)
> Modified:
>    pypy/dist/pypy/rpython/rctypes/socketmodule/_socket.py
>    pypy/dist/pypy/rpython/rctypes/socketmodule/ctypes_socket.py
> Log:
> Very incomplete implementation of getaddrinfo(), with a test (only works on on-line machines so far). The idea is that rctypes should now support all ctypes constructions that were necessary. I will start a regular mixed-module _socket based on this, but first we need to figure out how to best test mixed-modules based on ctypes -- ideally, they should be testable and compilable without the rest of the PyPy interpreter...

regarding py.test support: I think eventually we may have the following testing distinctions regarding ext modules:

- test mixed module with std objspace (running on top of cpython) (current default)
- test mixed module with cpy-objspace connected to CPython runtime via rctypes
- test mixed module on top of pypy-c

I guess the second testing mode could be specified by --objspace=cpy and for the third we may simply allow specifying --appdirect, which would trigger application level tests to run directly on the executable instead of through pypy interpreter indirection. (with --exec=/path/to/executable you can already point to pypy-c but PyPy does not support enough for py.test to run this way).

makes sense?

holger
Re: [pypy-dev] Mixed modules for both PyPy and CPython
Hi Armin,

On Sat, Apr 15, 2006 at 15:09 +0200, Armin Rigo wrote:
> On Fri, Apr 14, 2006 at 07:44:12PM +0200, holger krekel wrote:
> > Hum, i wonder how strongly opposed these explicit versus implicit level separation models need to be.
>
> Yes, you're right here. It's mostly about what we need to do next: we must choose one of the two models and develop it enough, until it becomes useful for PyPy and CPython alike. We could possibly do both models in parallel, but I'm not sure it's the best way forward at this point.
>
> > Is it not possible to support a programming model that can mostly avoid knowing about interpreter versus application level distinction without extending/refactoring the annotator?
>
> Sure, no-one thinks about refactoring the annotator. The implicit model already works, by using the existing support for SomeObjects, completed by Christian over the time. It's a bit hackish, though, and we'll definitely need ways to control where SomeObjects are expected or not. At the moment, what makes me reluctant to continue with the implicit model are two other issues: on the one hand, it's unclear how it would work for PyPy (it works for CPython only); and there are many language design issues ahead that I'd rather avoid for the time being.

I agree but may have a somewhat different idea in mind when talking about a more implicit model: namely, assuming that all objects live within the current RPython model (no SomeObjects whatsoever) and providing explicit interactions (like gateway.interp2app), exposing type definitions, etc.

> > explicit approach nicely working before experimenting with where we can go from there, right?
>
> Yes, exactly my opinion.
>
> > (Btw, i wouldn't mind if such glue code would not allow all possible interactions - our primary goal is not to provide seamless integration with CPython here).
>
> I'm not too worried about this. Our mixed-module model already supports mostly any kind of interaction, including defining new types with properties and overridden operations. The path to support the same for CPython extension modules is more or less clear, and very incremental.

Yes, the mixed modules (and interpreter/gateway's, typedef's) support interaction, but only by rather explicitly programming the machinery. By glue code i mean code where the user does not need to know so much about such machinery. IOW, the question is which implicit models (as seen from the ext module programmer) are possible without having SomeObjects around?

holger
Re: [pypy-dev] Mixed modules for both PyPy and CPython
On Sat, Apr 15, 2006 at 18:23 +0200, Armin Rigo wrote:
> On Sat, Apr 15, 2006 at 03:58:09PM +0200, holger krekel wrote:
> > I agree but may have a somewhat different idea in mind when talking about a more implicit model: (...)
>
> Ok, then we agree everywhere -- short of a confusing terminology: we gave the names "implicit levels" and "explicit levels" to very precise things and now you're calling an "implicit model" something that is in between :-)

indeed, i wasn't quite explicit enough, i guess :) IMO "implicit" and "explicit" do not denote a binary property; there can rather be degrees of implicit or explicit, hence the "more" in "more implicit model".

holger
[pypy-dev] Re: Mixed modules for both PyPy and CPython
Hello,

holger krekel wrote:
> > Second, comments on py3k list indicated that secure python is difficult because of a) introspection, b) type inference, and c) GIL acquisition.
>
> Hum, this list looks a bit weird to me. Could you state what the actual attacks are for which security measures are discussed? Or which use cases are people on py3k having in mind?

This is an amalgam of several different posts (and maybe different threads) but here goes:

In the thread "Will we have a true restricted exec environment for python 3000", Vineet Jain asked for a restricted mode which would:

1. Limit the memory consumed by the script
2. Limit access to file system and other system resources
3. Limit cpu time that the script will take
4. Be able to specify which modules are available for import.

In responses to that request, various people commented on the difficulties of implementing such a restricted mode. On that thread, several people had the same idea I had, to try to use PyPy for this purpose - however, it didn't look like many people were up-to-date reading both lists (and thus familiar-ish with PyPy's execution model).

A) Introspection

Nick Coghlan stated that:

> I'm interested, but I'm also aware of how much work it would be. I'm disinclined to trust any mechanism which allows the untrusted code to run in the same process, as the implications of being able to do:
>
>     self.__class__.__mro__[-1].__subtypes__()
>
> are somewhat staggering, and designing an in-process sandbox to cope with that is a big ask (and demonstrating that the sandbox actually *achieves* that goal is even tougher).

Vineet volunteered with a proposal to start a light python subinterpreter, which would be controlled by the main interpreter. Nick countered:

> But will it allow you to use numbers or strings? If yes, then you can get to object(), and hence to pretty much whatever C builtins you want. So it's not enough to try to hide dangerous builtins like file(), you want to remove them from the light version entirely (routing all file system and network access requests through the main application). But if the file objects are gone, what happens to the Python machinery that relies on them (like import)? Python's powerful introspection is a severe drawback from a security POV - it is *really* hard to make a user stay in a box you put them in without crippling some part of the language as a side effect.

Thus, in CPy, allowing someone to access a C type effectively opens up all the C types. In PyPy, however, each type is effectively in its own box. Further, PyPy already has a structure that can deal with these sorts of accesses: the flowgraph. Operations in PyPy come about because of traversals of the graph - certain branches of the graph could be restricted or proxied out to a trusted interpreter.

B) GIL Acquisition

Another person suggested leveraging the multiple subinterpreter code which already exists in CPython to create a restricted-exec interpreter. MvL noted that GIL acquisition made that difficult:

> Part of the problem is that it doesn't really work. Some objects *are* shared across interpreters, such as global objects in extension modules (extension modules are initialized only once). I believe that the GIL management code (for acquiring the GIL out of nowhere) breaks if there are multiple interpreters.

C) Type inference

I tried to find the thread for this one - it's not from the Py3K list - but I recall that a couple of years ago someone attempted to make an rexec version of python. One of the comments that I recall from that discussion had to do with understanding what types were being manipulated. I believe there was an example somewhat like:

    # operator.add is trusted
    class A:
        def __add__(self, other):
            ...  # something evil here

    a, b = A(), 1
    a + b  # something evil happens

However, this is a foggy memory that I have so far been unable to substantiate.

Thanks,
VanL
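The introspection and operator-dispatch concerns summarized above can be reproduced with a few lines of ordinary CPython. This is only a sketch: reachable_builtin_types() and the Evil class are hypothetical names, and it uses `__subclasses__()`, which is the method CPython actually provides on classes for listing direct subclasses.

```python
import operator


def reachable_builtin_types(value):
    """Walk from any value to `object`, then list object's direct subclasses.

    The last entry of a new-style class's MRO is always `object`, so even
    a plain number or string is enough of a starting point.
    """
    root = type(value).__mro__[-1]
    return root, root.__subclasses__()


class Evil:
    """A user-defined class whose __add__ stands in for 'something evil'."""

    log = []

    def __add__(self, other):
        # A trusted caller that only sees `a + b` or operator.add(a, b)
        # still ends up executing this untrusted method.
        Evil.log.append(("__add__ ran with", other))
        return 0


root, subclasses = reachable_builtin_types(42)
```

Starting from the integer 42, `root` is `object` and `subclasses` includes built-in types such as `type` and `dict`; similarly, `operator.add(Evil(), 1)` runs `Evil.__add__` even though `operator.add` itself is a trusted builtin.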