Re: [pypy-dev] Re: Mixed modules for both PyPy and CPython

2006-04-15 Thread holger krekel
Hi VanL, 

On Wed, Apr 12, 2006 at 13:45 -0600, VanL wrote:
 Carl Friedrich Bolz wrote:
 
  Careful. Restricted Python (or RPython) means something very
  specific in PyPy lingo. Restricted Python means that you stick to some
  restrictions in your coding style to make type inference possible. See
 
  
 http://codespeak.net/pypy/dist/pypy/doc/coding-guide.html#restricted-python
 
 
 
 Sorry - I knew that.  I just didn't make the connection from one side of 
 my brain to another.  Maybe secure python (spy)?

just as a side note: "secure" is a very vague term, technically (and 
politically).  The professor i learned with warned against talking 
about security without specifying which kinds of attacks and 
which scenarios one is talking about. 

Did you see my security related posting a few days ago, btw? 

 What made me think this would work are a couple of comments made here 
 and on the py3k list.
 First, Armin (about a year ago, I think) made some comments about the 
 objectspace abstraction allowing a Remote Object Space and allowing 
 different object spaces to participate in the evaluation of code.

Right, that is still an open topic, a bit touched by the
parallel discussion on this list of integrating PyPy (and
RPython) with CPython. 

 Second, comments on py3k list indicated that secure python is difficult 
 because of a) introspection, b) type inference, and c) GIL acquisition. 

Hum, this list looks a bit weird to me.  Could you state what
the actual attacks are for which security measures are discussed? 
Or which use cases are people on py3k having in mind? 

cheers & thanks, 

holger
___
pypy-dev@codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev


Re: [pypy-dev] Summer of Code 2006

2006-04-15 Thread Antonio Cuni

Niklaus Haldimann wrote:

Hi there

Google is doing Summer of Code again this year: http://code.google.com/soc/

It would be possible to enter PyPy directly as a mentoring organization
this time, instead of going through the PSF. Last year, student slots
were given to mentoring organizations proportional to the number of
applications. If there are enough applications for PyPy proper that
might bring more students on board than taking slots from the PSF pool
as last year (but maybe PyPy's influence in the PSF is big enough, and
this doesn't really matter).


That's good news! I'd like to participate in SoC to work on PyPy; 
I hope I'll be able to apply (and to win :-)).


ciao Anto


Re: [pypy-dev] Avoiding code duplication

2006-04-15 Thread Antonio Cuni

Armin Rigo wrote:


I think this is kind-of-reasonable.  The ADT method approach of the
lltypesystem was introduced late during the development of the rtyper;
by now, it would be reasonable to define common method names between the
ADT methods of the lltypesystem and the GENERIC_METHODS of the
ootypesystem.

I am unsure about the performance penalty.  The current version of many
ll helpers, for example, read the 'items' pointer only once and reuse
it; if this gets replaced by ADT methods like 'getitem_nonneg()', it
means that although the call is probably inlined there is still the
overhead of reading 'items' through each iteration in the list.  Who
knows, maybe C compilers will notice and move the read out of the loop.
Just give it a try on a small example like ll_listindex(), I guess...


Well,
as we decided on #pypy I've changed the ADT interface. As I wrote in the 
commit log:



The interface of ListRepr and FixedSizeListRepr has changed: two
accessor methods have been added: ll_getitem_fast and
ll_setitem_fast. They should be used instead of the ll_items()[index]
idiom: that way, when ootypesystem's list supports that interface,
we will be able to write functions usable with both typesystems with no
modification.

The various ll_* helper functions have been adapted to use the new
interface. Moreover, functions that directly accessed the l.length
field have been changed to call the ll_length() method instead, for
the same reasons as above.


The next step is to rename ootypesystem's list _GENERIC_METHODS to match 
the ADT methods in lltypesystem's list; then we could try to share most 
of the ll_* functions that currently belong only to lltypesystem/rlist.py.
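
To make the idea concrete, here is a minimal sketch in plain Python (not RPython, and not the real rlist.py code). The two classes are hypothetical stand-ins for an lltypesystem-style and an ootypesystem-style list; only the method names ll_length, ll_getitem_fast and ll_setitem_fast come from the commit log above. The point is that a helper written purely against those accessors never touches 'items' or 'length' and so can be shared:

```python
# Hypothetical stand-ins: these are NOT PyPy's real list representations,
# only an illustration of a common accessor interface.

class LLStyleList:
    """Mimics an lltypesystem list: an explicit length field plus an items array."""

    def __init__(self, values):
        self.items = list(values)
        self.length = len(self.items)

    def ll_length(self):
        return self.length

    def ll_getitem_fast(self, index):
        return self.items[index]

    def ll_setitem_fast(self, index, value):
        self.items[index] = value


class OOStyleList:
    """Mimics an ootypesystem list backed by an opaque native container."""

    def __init__(self, values):
        self._native = dict(enumerate(values))

    def ll_length(self):
        return len(self._native)

    def ll_getitem_fast(self, index):
        return self._native[index]

    def ll_setitem_fast(self, index, value):
        self._native[index] = value


def ll_listindex(lst, obj):
    """Shared helper: uses only the accessor methods, so it works for both lists."""
    for i in range(lst.ll_length()):
        if lst.ll_getitem_fast(i) == obj:
            return i
    raise ValueError("list.index(x): x not in list")
```

Whether the extra method call per iteration costs anything after inlining is exactly the open performance question raised earlier in the thread; the sketch says nothing about that.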

I hope I will do it tomorrow.


A different comment: as you mentioned on IRC it would be nice if the
back-end could choose which methods it implements natively.  At one
point there was the idea that maybe the 'oopspec' attributes that
started to show up in lltypesystem/rlist.py (used by the JIT only) could
be useful in this respect.  If I remember correctly, the idea didn't
work out because of the different 'lowleveltype' needed, and the
difference in the interface.  Merging the ADT method names of lltyped
lists and the GENERIC_METHODS of ootyped lists could be a step in this
direction again.  The interesting point is that each oo back-end could
then choose to special-case the ll_xxx() functions with the oopspecs
that they recognize, and just translate the other ones normally.  (The
ll back-ends always translate them all.)


I saw those 'oopspec' attributes, but I didn't understand their exact 
semantics; your proposal sounds reasonable to me: if I understand 
correctly, this way the typesystem-specific code would be reduced to the 
minimum, which would also help to port other Reprs such as rdict to 
ootypesystem. I'll investigate a bit in this direction as soon as I can.


Happy Easter to all,
ciao Anto


Re: [pypy-dev] Avoiding code duplication

2006-04-15 Thread Armin Rigo
Hi Antonio,

On Sat, Apr 15, 2006 at 12:59:07PM +0200, Antonio Cuni wrote:
 I saw that 'oopspec' attributes, but I didn't understand the exact 
 semantic;

The oopspec string tells which abstract list operation this
particular ll_*() function implements.  For example:

    def ll_prepend(l, newitem):
        ...
    ll_prepend.oopspec = 'list.insert(l, 0, newitem)'

means that ll_prepend() is equivalent to an insert with the index set to
zero.  In the string, the pseudo-arguments between the ( ) are either
real argument names of the ll_ function, or constants.

So for example, if a backend has got its own way to implement the
insert() calls in general, it could figure out from the oopspec that the
ll_prepend() helper can be replaced by a custom stub invoking the
backend's own version of insertion with an index of 0.  That's
essentially what the JIT does -- see handle_highlevel_operation() in
jit/hintannotator/model.py.
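
As a rough illustration of that dispatch (a sketch only, not the code in jit/hintannotator/model.py), a backend could parse the oopspec string and route the call to its own implementation. Everything here except the oopspec string format and the ll_prepend name is hypothetical: parse_oopspec, resolve_args, NATIVE_OPS and call_helper are made-up names, and the sketch assumes that constants in the string are integers.

```python
import re


def parse_oopspec(spec):
    """Split 'list.insert(l, 0, newitem)' into an operation name and pseudo-arguments."""
    match = re.match(r"^\s*([\w.]+)\s*\((.*)\)\s*$", spec)
    if match is None:
        raise ValueError("malformed oopspec: %r" % (spec,))
    opname, argstring = match.groups()
    args = [a.strip() for a in argstring.split(",") if a.strip()]
    return opname, args


def resolve_args(pseudo_args, argnames, actual_args):
    """Map each pseudo-argument to a real argument (by name) or to a constant."""
    bound = dict(zip(argnames, actual_args))
    resolved = []
    for pseudo in pseudo_args:
        if pseudo in bound:
            resolved.append(bound[pseudo])
        else:
            resolved.append(int(pseudo))  # assumption: constants are integers
    return resolved


def ll_prepend(l, newitem):
    l.insert(0, newitem)  # stand-in body for the real low-level helper
ll_prepend.oopspec = 'list.insert(l, 0, newitem)'


# Hypothetical table of operations this backend implements natively.
NATIVE_OPS = {
    "list.insert": lambda lst, index, item: lst.insert(index, item),
}


def call_helper(helper, *actual_args):
    """Use the backend's native operation if the oopspec names one, else the helper."""
    spec = getattr(helper, "oopspec", None)
    if spec is not None:
        opname, pseudo_args = parse_oopspec(spec)
        native = NATIVE_OPS.get(opname)
        if native is not None:
            code = helper.__code__
            argnames = code.co_varnames[:code.co_argcount]
            return native(*resolve_args(pseudo_args, argnames, actual_args))
    return helper(*actual_args)
```

A helper without an oopspec, or with one the backend does not recognize, simply falls through to the normal call, which matches the "translate the other ones normally" behaviour discussed earlier in the thread.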


A bientot,

Armin.


Re: [pypy-dev] Mixed modules for both PyPy and CPython

2006-04-15 Thread Armin Rigo
Hi Holger,

On Fri, Apr 14, 2006 at 07:44:12PM +0200, holger krekel wrote:
 Hum, i wonder how strongly opposed these explicit versus implicit
 level separation models need to be.

Yes, you're right here.  It's mostly about what we need to do next: we
must choose one of the two models and develop it enough, until it
becomes useful for PyPy and CPython alike.  We could possibly do both
models in parallel, but I'm not sure it's the best way forward at this
point.

 Is it not possible to support a 
 programming model that can mostly avoid knowing about interpreter versus 
 application level distinction without extending/refactoring the annotator?

Sure, no-one thinks about refactoring the annotator.  The implicit model
already works, by using the existing support for SomeObjects, completed
by Christian over the time.  It's a bit hackish, though, and we'll
definitely need ways to control where SomeObjects are expected or not.
At the moment, what makes me reluctant to continue with the implicit
model are two other issues: on the one hand, it's unclear how it would
work for PyPy (it works for CPython only); and there are many language
design issues ahead that I'd rather avoid for the time being.

 explicit approach nicely working before experimenting with where 
 we can go from there, right?

Yes, exactly my opinion.

 (Btw, i wouldn't
 mind if such glue code would not allow all possible interactions - 
 our primary goal is not to provide seamless integration with CPython here).

I'm not too worried about this.  Our mixed-module model already supports
mostly any kind of interaction, including defining new types with
properties and overridden operations.  The path to support the same for
CPython extension modules is more or less clear, and very incremental.


A bientot,

Armin.


[pypy-dev] ext module testing modes

2006-04-15 Thread holger krekel
Hi Armin, 

On Sat, Apr 15, 2006 at 13:02 +0200, [EMAIL PROTECTED] wrote:
 Author: arigo
 Date: Sat Apr 15 13:02:14 2006
 New Revision: 25852
 
 Added:
    pypy/dist/pypy/rpython/rctypes/socketmodule/test_addr.py   (contents, props changed)
 Modified:
    pypy/dist/pypy/rpython/rctypes/socketmodule/_socket.py
    pypy/dist/pypy/rpython/rctypes/socketmodule/ctypes_socket.py
 Log:
 Very incomplete implementation of getaddrinfo(), with a test
 (only works on on-line machines so far).  The idea is that rctypes
 should now support all ctypes constructions that were necessary.
 I will start a regular mixed-module _socket based on this, but
 first we need to figure out how to best test mixed-modules based
 on ctypes -- ideally, they should be testable and compilable
 without the rest of the PyPy interpreter...

regarding py.test support: I think eventually we may have the
following testing distinctions regarding ext modules: 

- test mixed module with std objspace (running on top of cpython)
  (current default)

- test mixed module with cpy-objspace connected to CPython
  runtime via rctypes 

- test mixed module on top of pypy-c 

I guess the second testing mode could be specified by --objspace=cpy 
and for the third we may simply allow specifying --appdirect, 
which would trigger application-level tests to run directly on the 
executable instead of through the indirection of the PyPy interpreter.  (With 
--exec=/path/to/executable you can already point to pypy-c, but
PyPy does not yet support enough for py.test to run this way.) 

makes sense? 

holger


Re: [pypy-dev] Mixed modules for both PyPy and CPython

2006-04-15 Thread holger krekel
Hi Armin, 

On Sat, Apr 15, 2006 at 15:09 +0200, Armin Rigo wrote:
 On Fri, Apr 14, 2006 at 07:44:12PM +0200, holger krekel wrote:
  Hum, i wonder how strongly opposed these explicit versus implicit
  level separation models need to be.
 
 Yes, you're right here.  It's mostly about what we need to do next: we
 must choose one of the two models and develop it enough, until it
 becomes useful for PyPy and CPython alike.  We could possibly do both
 models in parallel, but I'm not sure it's the best way forward at this
 point.
 
  Is it not possible to support a 
  programming model that can mostly avoid knowing about interpreter versus 
  application level distinction without extending/refactoring the annotator?
 
 Sure, no-one thinks about refactoring the annotator.  The implicit model
 already works, by using the existing support for SomeObjects, completed
 by Christian over the time.  It's a bit hackish, though, and we'll
 definitely need ways to control where SomeObjects are expected or not.
 At the moment, what makes me reluctant to continue with the implicit
 model are two other issues: on the one hand, it's unclear how it would
 work for PyPy (it works for CPython only); and there are many language
 design issues ahead that I'd rather avoid for the time being.

I agree but may have a somewhat different idea in mind when 
talking about a more implicit model: namely assuming that all objects live 
within the current RPython model (no SomeObject's whatsoever) and providing 
explicit interactions (like gateway.interp2app), exposing of type definitions
etc. 

  explicit approach nicely working before experimenting with where 
  we can go from there, right?
 
 Yes, exactly my opinion.
 
  (Btw, i wouldn't
  mind if such glue code would not allow all possible interactions - 
  our primary goal is not to provide seamless integration with CPython here).
 
 I'm not too worried about this.  Our mixed-module model already supports
 mostly any kind of interaction, including defining new types with
 properties and overridden operations.  The path to support the same for
 CPython extension modules is more or less clear, and very incremental.

Yes, the mixed modules (and interpreter/gateway's, typedef's) support
interaction, but by rather explicitly programming the machinery. 
By glue code i mean code where the user does not need to
know so much about such machinery.  IOW, the question is: 
which implicit models (as seen from the ext module programmer) 
are possible without having SomeObjects around? 

holger


Re: [pypy-dev] Mixed modules for both PyPy and CPython

2006-04-15 Thread holger krekel
On Sat, Apr 15, 2006 at 18:23 +0200, Armin Rigo wrote:
 On Sat, Apr 15, 2006 at 03:58:09PM +0200, holger krekel wrote:
  I agree but may have a somewhat different idea in mind when 
  talking about a more implicit model: (...)
 
 Ok, then we agree everywhere -- short of a confusing terminology: we
 gave the names "implicit levels" and "explicit levels" to very precise
 things and now you're calling an "implicit model" something that is
 inbetween :-)

indeed, i wasn't quite explicit enough, i guess :) 
IMO "implicit" and "explicit" do not denote a binary property; 
rather, there can be degrees of implicit or explicit, 
hence the "more" in "more implicit model". 

holger


[pypy-dev] Re: Mixed modules for both PyPy and CPython

2006-04-15 Thread VanL

Hello,

holger krekel wrote:

Second, comments on py3k list indicated that secure python is difficult 
because of a) introspection, b) type inference, and c) GIL acquisition. 


Hum, this list looks a bit weird to me.  Could you state what
the actual attacks are for which security measures are discussed? 
Or which use cases are people on py3k having in mind? 


This is an amalgam of several different posts (and maybe different 
threads) but here goes:


In the thread "Will we have a true restricted exec environment for 
python 3000", Vineet Jain asked for a restricted mode which would


1. Limit the memory consumed by the script
2. Limit access to file system and other system resources
3. Limit cpu time that the script will take
4. Be able to specify which modules are available for import.

In responses to that request, various people commented on the 
difficulties of implementing such a restricted mode.  On that thread, 
several people had the same idea I had, to try to use PyPy for this 
purpose - however, it didn't look like many people were up-to-date 
reading both lists (and thus familiar-ish with PyPy's execution model).



A) Introspection


Nick Coghlan stated that:

I'm interested, but I'm also aware of how much work it would be. I'm 
disinclined to trust any mechanism which allows the untrusted code to 
run in  the same process, as the implications of being able to do:


self.__class__.__mro__[-1].__subclasses__()

are somewhat staggering, and designing an in-process sandbox to cope 
with that is a big ask (and demonstrating that the sandbox actually 
*achieves* that goal is even tougher).


Vineet volunteered with a proposal to start a light python 
subinterpreter, which would be controlled by the main interpreter.


Nick countered:

But will it allow you to use numbers or strings?

If yes, then you can get to object(), and hence to pretty much whatever 
C builtins you want. So it's not enough to try to hide dangerous builtins 
like file(), you want to remove them from the light version entirely 
(routing all file system and network access requests through the main 
application). But if the file objects are gone, what happens to the 
Python machinery that relies on them (like import)?


Python's powerful introspection is a severe drawback from a security POV 
- it is *really* hard to make a user stay in a box you put them in 
without crippling some part of the language as a side effect.
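
The reachability Nick describes is easy to demonstrate. The following is a small sketch for a current CPython, where the method is spelled `__subclasses__()`; the helper name reachable_types is made up for illustration. Starting from nothing more than an integer, untrusted code can walk up to `object` and from there enumerate every type that directly derives from it:

```python
def reachable_types(value):
    """Walk from any value to the root of its MRO, then list the root's direct subclasses."""
    root = type(value).__mro__[-1]       # always `object` for ordinary values
    return root, root.__subclasses__()   # every class deriving directly from it


root, subclasses = reachable_types(42)

# No builtin name was looked up to get here; the integer alone was enough.
names = sorted(cls.__name__ for cls in subclasses)
print(root.__name__, len(names))
```

Hiding names such as file() from the builtins namespace therefore does not help on its own: any live object is a path back to the full set of loaded types.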


Thus, in CPy, allowing someone to access a C type effectively opens up 
all the C types.  In PyPy, however, each type is effectively in its own 
box.  Further, PyPy already has a structure that can deal with these 
sorts of accesses: the flowgraph.  Operations in PyPy come about because 
of traversals of the graph - certain branches of the graph could be 
restricted or proxied out to a trusted interpreter.



B) GIL Acquisition


Another person suggested leveraging the multiple subinterpreter code 
which already exists in CPython to create a restricted-exec interpreter. 
 MvL noted that GIL acquisition made that difficult:


Part of the problem is that it doesn't really work. Some objects *are* 
shared across interpreters, such as global objects in extension modules 
(extension modules are initialized only once). I believe that the GIL 
management code (for acquiring the GIL out of nowhere) breaks if there 
are multiple interpreters.



C) Type inference

I tried to find the thread for this one - it's not from the Py3K list - 
but I recall that a couple of years ago someone attempted to make an rexec 
version of python.  One of the comments that I recall from that 
discussion had to do with understanding what types were being 
manipulated.  I believe there was an example somewhat like


# operator.add is trusted

class A:
    def __add__(self, other):
        ...  # something evil here

a, b = A(), 1

a + b  # something evil happens

However, this is a foggy memory that I have so far been unable to 
substantiate.


Thanks,

VanL
