Re: RFC: (ice-9 sandbox)

2017-04-18 Thread Andy Wingo
On Fri 31 Mar 2017 11:27, Andy Wingo  writes:

> Attached is a module that can evaluate an expression within a sandbox.

Pushed to master.  See NEWS here, where I include a couple more entries
of note:

* Notable changes

** New sandboxed evaluation facility

Guile now has a way to execute untrusted code in a safe way.  See
"Sandboxed Evaluation" in the manual for full details, including some
important notes on limitations on the sandbox's ability to prevent
resource exhaustion.

** All literal constants are read-only

According to the Scheme language definition, it is an error to attempt
to mutate a "constant literal".  A constant literal is data that is a
literal quoted part of a program.  For example, all of these are errors:

  (set-car! '(1 . 2) 42)
  (append! '(1 2 3) '(4 5 6))
  (vector-set! '#(a b c) 1 'B)

Guile takes advantage of this provision of Scheme to deduplicate shared
structure in constant literals within a compilation unit, and to
allocate constant data directly in the compiled object file.  If the
data needs no relocation at run-time, as is the case for pairs or
vectors that only contain immediate values, then the data can actually
be shared between different Guile processes, using the operating
system's virtual memory facilities.

However, in Guile 2.2.0, constants that needed relocation were actually
mutable -- though (vector-set! '#(a b c) 1 'B) was an error, Guile
wouldn't actually cause an exception to be raised, silently allowing the
mutation.  This could affect future users of this constant, or indeed of
any constant in the compilation unit that shared structure with the
original vector.

Additionally, attempting to mutate constant literals mapped in the
read-only section of files would actually cause a segmentation fault, as
the operating system prohibits writes to read-only memory.  "Don't do
that" isn't a very nice solution :)

Both of these problems have been fixed.  Any attempt to mutate a
constant literal will now raise an exception, whether the constant needs
relocation or not.
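
A minimal sketch of the practical consequence, assuming Guile 2.2.1 or later: code that wants to mutate data that started life as a literal should take a fresh copy first.  `list-copy' and `vector-copy' are core Guile procedures; the variable names below are purely illustrative.

```scheme
;; Literal constants are read-only, so copy them before mutating.
(define template-list '(1 2 3))
(define template-vector '#(a b c))

(define my-list (list-copy template-list))        ; fresh, mutable pairs
(define my-vector (vector-copy template-vector))  ; fresh, mutable vector

;; Mutating the copies is fine; the literals themselves are untouched.
(set-car! my-list 42)
(vector-set! my-vector 1 'B)

(display my-list) (newline)
(display my-vector) (newline)
```

Copying at the point of use keeps the shared, deduplicated constant intact for every other user of the compilation unit.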

** Syntax objects are now a distinct type

It used to be that syntax objects were represented as a tagged vector.
These values could be forged by users to break scoping abstractions,
preventing the implementation of sandboxing facilities in Guile.  We are
as embarrassed about the previous situation as we are pleased about the
fact that we've fixed it.

Unfortunately, during the 2.2 stable series (or at least during part of
it), we need to support files compiled with Guile 2.2.0.  These files
may contain macros that contain legacy syntax object constants.  See the
discussion of "allow-legacy-syntax-objects?" in "Syntax Transformer
Helpers" in the manual for full details.

And the documentation formatted as text is below.  I guess a 2.2.1 is
coming soon.  Thanks all for the review!

Andy



1.12 Sandboxed Evaluation
-------------------------

Sometimes you would like to evaluate code that comes from an untrusted
party.  The safest way to do this is to buy a new computer, evaluate the
code on that computer, then throw the machine away.  However if you are
unwilling to take this simple approach, Guile does include a limited
"sandbox" facility that can allow untrusted code to be evaluated with
some confidence.

   To use the sandboxed evaluator, load its module:

 (use-modules (ice-9 sandbox))

   Guile's sandboxing facility starts with the ability to restrict the
time and space used by a piece of code.

 -- Scheme Procedure: call-with-time-limit limit thunk limit-reached
 Call THUNK, but cancel it if LIMIT seconds of wall-clock time have
 elapsed.  If the computation is cancelled, call LIMIT-REACHED in
 tail position.  THUNK must not disable interrupts or prevent an
 abort via a 'dynamic-wind' unwind handler.
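
 As an illustration only (the thunk name `spin-forever' and the 0.1 second budget are assumptions chosen for the sketch), a runaway loop could be cut off like this:

```scheme
(use-modules (ice-9 sandbox))

;; A thunk that never returns on its own.
(define (spin-forever)
  (let loop () (loop)))

;; Allow 0.1 seconds of wall-clock time; if the limit is reached,
;; the third argument is called and its value becomes the result.
(define result
  (call-with-time-limit 0.1
                        spin-forever
                        (lambda () 'timed-out)))
```

 Here `result' ends up bound to whatever LIMIT-REACHED returns, because the loop cannot finish by itself.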

 -- Scheme Procedure: call-with-allocation-limit limit thunk
  limit-reached
 Call THUNK, but cancel it if LIMIT bytes have been allocated.  If
 the computation is cancelled, call LIMIT-REACHED in tail position.
 THUNK must not disable interrupts or prevent an abort via a
 'dynamic-wind' unwind handler.

 This limit applies to both stack and heap allocation.  The
 computation will not be aborted before LIMIT bytes have been
 allocated, but for the heap allocation limit, the check may be
 postponed until the next garbage collection.

 Note that as a current shortcoming, the heap size limit applies to
 all threads; concurrent allocation by other unrelated threads
 counts towards the allocation limit.
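
 A hedged sketch of how this might be used; the thunk name `cons-forever' and the 10 MiB budget are illustrative assumptions, not recommendations:

```scheme
(use-modules (ice-9 sandbox))

;; A thunk that allocates without bound and keeps everything live.
(define (cons-forever)
  (let loop ((acc '()))
    (loop (cons (make-vector 1000 #f) acc))))

;; Allow roughly 10 MiB of allocation.  The heap check may only run
;; after a garbage collection, so some overshoot is to be expected.
(define result
  (call-with-allocation-limit (* 10 1024 1024)
                              cons-forever
                              (lambda () 'allocation-limit-reached)))
```

 Because the check is tied to garbage collection, treat LIMIT as a lower bound on when the abort can happen rather than a hard ceiling.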

 -- Scheme Procedure: call-with-time-and-allocation-limits time-limit
  allocation-limit thunk
 Invoke THUNK in a dynamic extent in which its execution is limited
 to TIME-LIMIT seconds of wall-clock time, and its allocation 

Re: RFC: (ice-9 sandbox)

2017-04-17 Thread Nala Ginrut
Hmm... I hadn't thought about this security issue. Even if we did some
verification at the IR level (say, CPS or lower), that would be
insufficient to prevent security issues, since a front-end implementation
may use cross-module functions to mimic the primitives of other languages.
Now I think front-end writers may have to write their own sandbox on top
of (ice-9 sandbox) if necessary. :-)

Best regards.


On 2017-04-17 at 16:07, "Andy Wingo" wrote:

> On Sat 15 Apr 2017 19:23, Nala Ginrut  writes:
>
> > Could you please add a #:from keyword to eval-in-sandbox to indicate
> > the language front-end? Don't forget there's a multi-lang plan. :-)
>
> In theory yes, but I don't know how to make safe sandboxes in other
> languages.  ice-9 sandbox relies on the Scheme characteristic that the
> only capabilities granted to a program are those that are in scope.
> Other languages often have ambient capabilities -- like Bash for example
> where there's no way to not provide the pipe ("|") operator.  I think
> adding other languages should be an exercise for the reader :)
>
> Andy
>


Re: RFC: (ice-9 sandbox)

2017-04-17 Thread Andy Wingo
On Sat 15 Apr 2017 19:23, Nala Ginrut  writes:

> Could you please add a #:from keyword to eval-in-sandbox to indicate
> the language front-end? Don't forget there's a multi-lang plan. :-)

In theory yes, but I don't know how to make safe sandboxes in other
languages.  ice-9 sandbox relies on the Scheme characteristic that the
only capabilities granted to a program are those that are in scope.
Other languages often have ambient capabilities -- like Bash for example
where there's no way to not provide the pipe ("|") operator.  I think
adding other languages should be an exercise for the reader :)

Andy
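
The "only what is in scope" property described above can be sketched with `make-sandbox-module', whose docstring appears elsewhere in this thread.  This is an illustration under stated assumptions, not the module's canonical usage: the import set, the variable names, and the file name passed to `open-input-file' are all made up for the example.

```scheme
(use-modules (ice-9 sandbox))

;; Grant only three arithmetic procedures from the (guile) module.
(define arithmetic-only
  (make-sandbox-module '(((guile) + - *))))

;; These bindings are in scope, so evaluation succeeds.
(define ok (eval '(+ 1 (* 2 3)) arithmetic-only))

;; `open-input-file' was never granted.  Evaluating a reference to it
;; signals an error before anything touches the filesystem.
(define denied
  (catch #t
    (lambda ()
      (eval '(open-input-file "/etc/passwd") arithmetic-only))
    (lambda (key . args) 'not-in-scope)))
```

There is no ambient way for the evaluated expression to reach a binding that the module does not import, which is the property a Bash-like language cannot offer.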



Re: RFC: (ice-9 sandbox)

2017-04-15 Thread Nala Ginrut
Hi Andy!
It's pretty cool!
Could you please add a #:from keyword to eval-in-sandbox to indicate the
language front-end? Don't forget there's a multi-lang plan. :-)

Best regards.

Andy Wingo wrote on Fri, 31 Mar 2017 at 17:28:

> Hi,
>
> Attached is a module that can evaluate an expression within a sandbox.
> If the evaluation takes too long or allocates too much, it will be
> cancelled.  The evaluation will take place with respect to a module with
> a "safe" set of imports.  Those imports include most of the bindings
> available in a default Guile environment.  See the file below for full
> details and a number of caveats.
>
> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.
>
> Andy
>
>


Re: RFC: (ice-9 sandbox)

2017-04-14 Thread Ludovic Courtès
Hi!

Andy Wingo  skribis:

> On Mon 03 Apr 2017 17:35, l...@gnu.org (Ludovic Courtès) writes:
>
>> Riastradh’s document at 
>> has this:
>>
>>   Affix asterisks to the beginning and end of a globally mutable
>>   variable.  This allows the reader of the program to recognize very
>>   easily that it is badly written!
>>
>> … but it doesn’t say anything about constants nor about %.
>>
>> It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
>> ‘%all-pure-bindings’.  So, dunno, as you see fit!
>
> I feel like I would have less of a need for name sigils like *earmuffs*
> or %preficentiles if we had more reliably immutable data.

[...]

> However, it is possible to do a more expensive check to see if a pair
> is embedded in an ELF image (or the converse, that it is allocated on
> the GC heap).  I just looked in Guile and there are only a few dozen
> instances of set-car! in Guile's source and a bit more of set-cdr!, so
> it's conceivable to think of this being a check that we can make.
>
> If we are able to do this, we can avoid the whole discussion about
> SIGSEGV handlers.
>
> It would be nice of course to be able to cons an immutable pair on the
> heap -- so a simple GC_is_heap_ptr(x) check wouldn't suffice to prove
> immutability.  Not sure quite what the right solution would be there.
>
> FWIW, Racket uses four words for pairs: the type tag, the hash code, and
> the two fields.  Four words is I think the logical progression after 2
> given GC's object size granularity.  It would be nice to avoid having
> the extra words, but if we ever switched to a moving GC we would need
> space for a hash code I think.
>
> Thoughts on the plan for immutable literals?

My feeling is that using GC_is_heap_ptr or similar would be nicer than
adding bits to the type tags, because we’d need to add this read-only
bit for every type, and we could have bugs where we forget to check them
in some cases.

GC_is_heap_ptr is probably enough until we support immutable objects
allocated on the heap.

> Concretely for this use case, assuming that we can solve the immutable
> literal problem, I propose to remove sigils entirely.  Thoughts welcome
> here.

In practice I guess the funny characters will stay for a while.  :-)

But I agree that it’d be nice to have a generic way to represent
immutable objects.

Ludo’.



Re: RFC: (ice-9 sandbox)

2017-04-14 Thread tomas
On Fri, Apr 14, 2017 at 12:52:19PM +0200, Andy Wingo wrote:

[...]

> Concretely for this use case, assuming that we can solve the immutable
> literal problem, I propose to remove sigils entirely.  Thoughts welcome
> here.

There's still the "cultural value" of such sigils, which eases the
communication between humans. That'll depend on what other Schemes
do, and how current pedagogical literature is set up. Readability
and all that. Cultures are bound to change, though.

Of course, really marking things as immutable (the "technical" bit)
is still very cool.

regards
-- tomás



Re: RFC: (ice-9 sandbox)

2017-04-14 Thread Andy Wingo
On Thu 06 Apr 2017 23:41, Freja Nordsiek  writes:

> On the subject of ports and i/o, I have a few ideas. R6RS i/o in the
> (rnrs io ports) module generally requires the port to be explicitly
> given, rather than assuming current in or out if not given (though
> rnrs io simple does make those assumptions). For many, it would be
> impossible because they put the port as the first argument and a
> required second argument afterwards. Looking at module/io/ports.scm in
> Guile 2.2.x, it looks like the reading and writing procedures there
> should be safe. Obviously, nothing that opens a file should be used,
> nor the procedures to get current input, output, and error; but the
> rest can be used. And this includes string and bytevector ports, which
> could be very useful in the sandbox (I don't know about anyone else,
> but I use string ports all the time).
>
> One question, is there a particular reason that guard is not exported?
> It doesn't seem like it is as nasty as dynamic-wind with trying to
> terminate, though maybe I am just not seeing how it could be used to
> prevent the sandbox terminating the process. Having at least one
> exception handling binding might be very helpful in a sandbox.

These questions are related.  There is nothing unsafe about "guard"
specifically.  Indeed the sandbox environment has "catch" and similar
things.  "guard" isn't in this default set because currently the set of
bindings that (ice-9 sandbox) offers in *all-pure-and-impure-bindings*
is a subset of the bindings that are available by default.  "guard" has to
be imported via srfi-34.  Likewise for r6rs port procedures.  I think
it's reasonable to have this limitation -- otherwise there's no point at
which to stop.  Other binding sets are of course possible.

I would of course like I/O in the sandbox :) We could have versions of
"display" et al that require their port argument; that would be
consistent with the strict-subset criterion.
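
A rough sketch of that idea, assuming nothing beyond core Guile: the name `display/port' is hypothetical and is not something (ice-9 sandbox) provides.  The wrapper makes the port mandatory, so it cannot fall back to the ambient current output port, and a string port collects the output.

```scheme
;; A `display' variant whose port argument is required.
(define (display/port obj port)
  (display obj port))

;; Output goes to a string port handed in by the caller, never to the
;; process-wide current-output-port.
(define captured
  (call-with-output-string
    (lambda (port)
      (display/port "hello from the sandbox" port))))
```

The caller decides which port the untrusted code may write to, which fits the capability style of the rest of the sandbox.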

Andy



Re: RFC: (ice-9 sandbox)

2017-04-14 Thread Andy Wingo
On Mon 03 Apr 2017 17:35, l...@gnu.org (Ludovic Courtès) writes:

> Riastradh’s document at 
> has this:
>
>   Affix asterisks to the beginning and end of a globally mutable
>   variable.  This allows the reader of the program to recognize very
>   easily that it is badly written!
>
> … but it doesn’t say anything about constants nor about %.
>
> It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
> ‘%all-pure-bindings’.  So, dunno, as you see fit!

I feel like I would have less of a need for name sigils like *earmuffs*
or %preficentiles if we had more reliably immutable data.

Right now one of the functions of these sigils is to tell the reader,
"Don't use append! on this data structure or you will cause spooky
action-at-a-distance!"

It sure would be nice to be able to use these values without worries of
this kind.  We don't have this immutability problem with strings because
our compiled string literals are marked as immutable, and string
mutators assert that the strings are mutable.  We should do the same for
all literal constants.

We currently can't add an immutable bit to pairs due to our tagging
scheme -- pairs are just two words.  But we can do this easily with
other data types: vectors, arrays, bytevectors, etc.  (If we want to do
this, anyway.)

However, it is possible to do a more expensive check to see if a pair
is embedded in an ELF image (or the converse, that it is allocated on
the GC heap).  I just looked in Guile and there are only a few dozen
instances of set-car! in Guile's source and a bit more of set-cdr!, so
it's conceivable to think of this being a check that we can make.

If we are able to do this, we can avoid the whole discussion about
SIGSEGV handlers.

It would be nice of course to be able to cons an immutable pair on the
heap -- so a simple GC_is_heap_ptr(x) check wouldn't suffice to prove
immutability.  Not sure quite what the right solution would be there.

FWIW, Racket uses four words for pairs: the type tag, the hash code, and
the two fields.  Four words is I think the logical progression after 2
given GC's object size granularity.  It would be nice to avoid having
the extra words, but if we ever switched to a moving GC we would need
space for a hash code I think.

Thoughts on the plan for immutable literals?

Concretely for this use case, assuming that we can solve the immutable
literal problem, I propose to remove sigils entirely.  Thoughts welcome
here.

Andy



Re: RFC: (ice-9 sandbox)

2017-04-06 Thread Freja Nordsiek
I took a look at the specific bindings the sandbox makes available and
have a few thoughts.

I didn't see any problems with any of the pure bindings made
available, but I am only very familiar with basic R5RS, R6RS, and R7RS
bindings, not Guile extensions (yet, at least), so I can't comment on
many of them.

On the subject of ports and i/o, I have a few ideas. R6RS i/o in the
(rnrs io ports) module generally requires the port to be explicitly
given, rather than assuming current in or out if not given (though
rnrs io simple does make those assumptions). For many, it would be
impossible because they put the port as the first argument and a
required second argument afterwards. Looking at module/io/ports.scm in
Guile 2.2.x, it looks like the reading and writing procedures there
should be safe. Obviously, nothing that opens a file should be used,
nor the procedures to get current input, output, and error; but the
rest can be used. And this includes string and bytevector ports, which
could be very useful in the sandbox (I don't know about anyone else,
but I use string ports all the time).

One question, is there a particular reason that guard is not exported?
It doesn't seem like it is as nasty as dynamic-wind with trying to
terminate, though maybe I am just not seeing how it could be used to
prevent the sandbox terminating the process. Having at least one
exception handling binding might be very helpful in a sandbox.



Freja Nordsiek

On Fri, Mar 31, 2017 at 11:27 AM, Andy Wingo  wrote:
> Hi,
>
> Attached is a module that can evaluate an expression within a sandbox.
> If the evaluation takes too long or allocates too much, it will be
> cancelled.  The evaluation will take place with respect to a module with
> a "safe" set of imports.  Those imports include most of the bindings
> available in a default Guile environment.  See the file below for full
> details and a number of caveats.
>
> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.
>
> Andy
>
>

Re: RFC: (ice-9 sandbox)

2017-04-03 Thread Ludovic Courtès
Andy Wingo  skribis:

> On Fri 31 Mar 2017 23:41, l...@gnu.org (Ludovic Courtès) writes:
>
>> Andy Wingo  skribis:
>>
>>> On Fri 31 Mar 2017 13:33, l...@gnu.org (Ludovic Courtès) writes:
>>
>> [...]
>>
>>>>> ;; These can only form part of a safe binding set if no mutable
>>>>> ;; pair is exposed to the sandbox.
>>>>> (define *mutating-pair-bindings*
>>>>>   '(((guile)
>>>>>      set-car!
>>>>>      set-cdr!)))
>>>>
>>>> When used on a literal pair (mapped read-only), these can cause a
>>>> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
>>>> see are those passed by the caller I suppose, so this may be safe?
>>>
>>> Who knows.  I mean vector-set! can also cause segfaults.  I think we
>>> should fix that situation to throw an exception.
>>
>> Yes, that would be nice, though I suppose it’s currently tricky to
>> achieve no?  Maybe that newfangled ‘userfaultfd’ will save us all.
>
> Maybe :)  I mean it's possible now to catch SIGSEGV.  I just sent a
> patch to guile-devel; wdyt?  Needs docs & tests of course.

Neat! I’ll look into it.

>>>>> (define *all-pure-and-impure-bindings*
>>>>>   (append *all-pure-bindings*
>>>>
>>>> Last but not least: why all the stars?  :-)
>>>> I’m used to ‘%something’.
>>>
>>> For me I read % as being pronounced "sys" and indicating internal
>>> bindings.  Why do you use it for globals?  Is it your proposal that we
>>> use it for globals?
>>
>> I tend to do that but I realize I must be a minority here.  Let it be
>> stars then.  :-)
>
> I think that like you, I learned Scheme conventions in an ad-hoc way,
> apeing conventions from many sources (Guile's own code, Common Lisp,
> random Scheme).  I would be happy if we could be a bit more purposeful
> about our conventions and I would be happy to change mine :)  %
> can work fine :)

I grepped Guile and it seems that stars are actually more common for
globals than % (I thought it was the opposite but as you say, I kind of
discovered/invented the conventions.)

Riastradh’s document at 
has this:

  Affix asterisks to the beginning and end of a globally mutable
  variable.  This allows the reader of the program to recognize very
  easily that it is badly written!

… but it doesn’t say anything about constants nor about %.

It could be ‘all-pure-bindings’, or ‘*all-pure-bindings*’, or
‘%all-pure-bindings’.  So, dunno, as you see fit!

Ludo’.



Re: RFC: (ice-9 sandbox)

2017-04-02 Thread Andy Wingo
On Fri 31 Mar 2017 23:41, l...@gnu.org (Ludovic Courtès) writes:

> Andy Wingo  skribis:
>
>> On Fri 31 Mar 2017 13:33, l...@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>>> ;; These can only form part of a safe binding set if no mutable
>>>> ;; pair is exposed to the sandbox.
>>>> (define *mutating-pair-bindings*
>>>>   '(((guile)
>>>>      set-car!
>>>>      set-cdr!)))
>>>
>>> When used on a literal pair (mapped read-only), these can cause a
>>> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
>>> see are those passed by the caller I suppose, so this may be safe?
>>
>> Who knows.  I mean vector-set! can also cause segfaults.  I think we
>> should fix that situation to throw an exception.
>
> Yes, that would be nice, though I suppose it’s currently tricky to
> achieve no?  Maybe that newfangled ‘userfaultfd’ will save us all.

Maybe :)  I mean it's possible now to catch SIGSEGV.  I just sent a
patch to guile-devel; wdyt?  Needs docs & tests of course.

>>>> (define *all-pure-and-impure-bindings*
>>>>   (append *all-pure-bindings*
>>>
>>> Last but not least: why all the stars?  :-)
>>> I’m used to ‘%something’.
>>
>> For me I read % as being pronounced "sys" and indicating internal
>> bindings.  Why do you use it for globals?  Is it your proposal that we
>> use it for globals?
>
> I tend to do that but I realize I must be a minority here.  Let it be
> stars then.  :-)

I think that like you, I learned Scheme conventions in an ad-hoc way,
apeing conventions from many sources (Guile's own code, Common Lisp,
random Scheme).  I would be happy if we could be a bit more purposeful
about our conventions and I would be happy to change mine :)  %
can work fine :)

Andy



Re: RFC: (ice-9 sandbox)

2017-04-01 Thread Christopher Allan Webber
Wow!  With this I suppose we could implement something like 
  http://mumble.net/~jar/pubs/secureos/secureos.html
?



Re: RFC: (ice-9 sandbox)

2017-03-31 Thread Ludovic Courtès
Andy Wingo  skribis:

> On Fri 31 Mar 2017 13:33, l...@gnu.org (Ludovic Courtès) writes:

[...]

>>> ;; These can only form part of a safe binding set if no mutable
>>> ;; pair is exposed to the sandbox.
>>> (define *mutating-pair-bindings*
>>>   '(((guile)
>>>      set-car!
>>>      set-cdr!)))
>>
>> When used on a literal pair (mapped read-only), these can cause a
>> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
>> see are those passed by the caller I suppose, so this may be safe?
>
> Who knows.  I mean vector-set! can also cause segfaults.  I think we
> should fix that situation to throw an exception.

Yes, that would be nice, though I suppose it’s currently tricky to
achieve no?  Maybe that newfangled ‘userfaultfd’ will save us all.

>>> (define *all-pure-and-impure-bindings*
>>>   (append *all-pure-bindings*
>>
>> Last but not least: why all the stars?  :-)
>> I’m used to ‘%something’.
>
> For me I read % as being pronounced "sys" and indicating internal
> bindings.  Why do you use it for globals?  Is it your proposal that we
> use it for globals?

I tend to do that but I realize I must be a minority here.  Let it be
stars then.  :-)

Thanks for working on this!

Ludo’.



Re: RFC: (ice-9 sandbox)

2017-03-31 Thread Andy Wingo
On Fri 31 Mar 2017 13:33, l...@gnu.org (Ludovic Courtès) writes:

> Andy Wingo  skribis:
>
> The allocations that trigger ‘after-gc-hook’ could be caused by a
> separate thread, right?  That’s probably an acceptable limitation, but
> one to be aware of.

Ah yes, we should document this.  Sadly we just don't have very good
metrics here.

> Also, if the code does:
>
>   (make-bytevector (expt 2 32))
>
> then ‘after-gc-hook’ runs too late, as the comment notes.

Yep.

> IIUC ‘@@’ is unavailable in the returned module, right?

Correct.  You could put it there but that's a bad idea.

> Isn’t make-fresh-user-module + purify-module! equivalent to just
> (make-module)?

No, beautify-user-module! does a few more things too.  I was thinking
that we would want to be able to work on the public interface of the
module so I wanted to make sure it was there but in retrospect we don't
need it and can probably simplify things I guess.

>> ;; These can only form part of a safe binding set if no mutable
>> ;; pair is exposed to the sandbox.
>> (define *mutating-pair-bindings*
>>   '(((guile)
>>      set-car!
>>      set-cdr!)))
>
> When used on a literal pair (mapped read-only), these can cause a
> segfault.  Now since the code is ‘eval’d, the only literal pairs it can
> see are those passed by the caller I suppose, so this may be safe?

Who knows.  I mean vector-set! can also cause segfaults.  I think we
should fix that situation to throw an exception.

>> (define *all-pure-and-impure-bindings*
>>   (append *all-pure-bindings*
>
> Last but not least: why all the stars?  :-)
> I’m used to ‘%something’.

For me I read % as being pronounced "sys" and indicating internal
bindings.  Why do you use it for globals?  Is it your proposal that we
use it for globals?

Andy



Re: RFC: (ice-9 sandbox)

2017-03-31 Thread Ludovic Courtès
Hello!

Andy Wingo  skribis:

> Any thoughts?  I would like something like this for a web service that
> has to evaluate untrusted code.

Would be nice!

> (define (call-with-allocation-limit limit thunk limit-reached)
>   "Call @var{thunk}, but cancel it if @var{limit} bytes have been
> allocated.  If the computation is cancelled, call @var{limit-reached} in
> tail position.  @var{thunk} must not disable interrupts or prevent an
> abort via a @code{dynamic-wind} unwind handler.
>
> This limit applies to both stack and heap allocation.  The computation
> will not be aborted before @var{limit} bytes have been allocated, but
> for the heap allocation limit, the check may be postponed until the next 
> garbage collection."
>   (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
>   (let ((zero (bytes-allocated))
>         (tag (make-prompt-tag)))
>     (define (check-allocation)
>       (when (< limit (- (bytes-allocated) zero))
>         (abort-to-prompt tag)))
>     (call-with-prompt tag
>       (lambda ()
>         (dynamic-wind
>           (lambda ()
>             (add-hook! after-gc-hook check-allocation))
>           (lambda ()
>             (call-with-stack-overflow-handler
>              ;; The limit is in "words", which used to be 4 or 8 but now
>              ;; is always 8 bytes.
>              (floor/ limit 8)
>              thunk
>              (lambda () (abort-to-prompt tag))))
>           (lambda ()
>             (remove-hook! after-gc-hook check-allocation))))
>       (lambda (k)
>         (limit-reached)))))
The allocations that trigger ‘after-gc-hook’ could be caused by a
separate thread, right?  That’s probably an acceptable limitation, but
one to be aware of.

Also, if the code does:

  (make-bytevector (expt 2 32))

then ‘after-gc-hook’ runs too late, as the comment notes.

> (define (make-sandbox-module bindings)
>   "Return a fresh module that only contains @var{bindings}.
>
> The @var{bindings} should be given as a list of import sets.  One import
> set is a list whose car names an interface, like @code{(ice-9 q)}, and
> whose cdr is a list of imports.  An import is either a bare symbol or a
> pair of @code{(@var{out} . @var{in})}, where @var{out} and @var{in} are
> both symbols and denote the name under which a binding is exported from
> the module, and the name under which to make the binding available,
> respectively."
>   (let ((m (make-fresh-user-module)))
>     (purify-module! m)
>     ;; FIXME: We want to have a module that will be collectable by GC.
>     ;; Currently in Guile all modules are part of a single tree, and
>     ;; once a module is part of that tree it will never be collected.
>     ;; So we want to sever the module off from that tree.  However the
>     ;; psyntax syntax expander currently needs to be able to look up
>     ;; modules by name; being severed from the name tree prevents that
>     ;; from happening.  So for now, each evaluation leaks memory :/
>     ;;
>     ;; (sever-module! m)
>     (module-use-interfaces! m
>                             (map (match-lambda
>                                    ((mod-name . bindings)
>                                     (resolve-interface mod-name
>                                                        #:select bindings)))
>                                  bindings))
>     m))

IIUC ‘@@’ is unavailable in the returned module, right?

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (eval '(@@ (guile) resolve-interface)
                           (let ((m (make-fresh-user-module)))
                             (purify-module! m)
                             m))
ERROR: In procedure %resolve-variable:
ERROR: Unbound variable: @@
--8<---------------cut here---------------end--------------->8---

Isn’t make-fresh-user-module + purify-module! equivalent to just
(make-module)?


> ;; These can only form part of a safe binding set if no mutable
> ;; pair is exposed to the sandbox.
> (define *mutating-pair-bindings*
>   '(((guile)
>      set-car!
>      set-cdr!)))

When used on a literal pair (mapped read-only), these can cause a
segfault.  Now since the code is ‘eval’d, the only literal pairs it can
see are those passed by the caller I suppose, so this may be safe?

> (define *all-pure-and-impure-bindings*
>   (append *all-pure-bindings*

Last but not least: why all the stars?  :-)
I’m used to ‘%something’.

Thank you!

Ludo’.




RFC: (ice-9 sandbox)

2017-03-31 Thread Andy Wingo
Hi,

Attached is a module that can evaluate an expression within a sandbox.
If the evaluation takes too long or allocates too much, it will be
cancelled.  The evaluation will take place with respect to a module with
a "safe" set of imports.  Those imports include most of the bindings
available in a default Guile environment.  See the file below for full
details and a number of caveats.

Any thoughts?  I would like something like this for a web service that
has to evaluate untrusted code.

Andy

;;; Sandboxed evaluation of Scheme code

;;; Copyright (C) 2017 Free Software Foundation, Inc.

;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
;;;; License as published by the Free Software Foundation; either
;;;; version 3 of the License, or (at your option) any later version.
;;;;
;;;; This library is distributed in the hope that it will be useful,
;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
;;;; Lesser General Public License for more details.
;;;;
;;;; You should have received a copy of the GNU Lesser General Public
;;;; License along with this library; if not, write to the Free Software
;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

;;; Commentary:
;;; 
;;; Code:

(define-module (ice-9 sandbox)
  #:use-module (ice-9 control)
  #:use-module (ice-9 match)
  #:use-module (system vm vm)
  #:export (call-with-time-limit
            call-with-allocation-limit
            call-with-time-and-allocation-limits

            eval-in-sandbox
            make-sandbox-module

            *alist-bindings*
            *array-bindings*
            *bit-bindings*
            *bitvector-bindings*
            *char-bindings*
            *char-set-bindings*
            *clock-bindings*
            *core-bindings*
            *error-bindings*
            *fluid-bindings*
            *hash-bindings*
            *iteration-bindings*
            *keyword-bindings*
            *list-bindings*
            *macro-bindings*
            *nil-bindings*
            *number-bindings*
            *pair-bindings*
            *predicate-bindings*
            *procedure-bindings*
            *promise-bindings*
            *prompt-bindings*
            *regexp-bindings*
            *sort-bindings*
            *srfi-4-bindings*
            *string-bindings*
            *symbol-bindings*
            *unspecified-bindings*
            *variable-bindings*
            *vector-bindings*
            *version-bindings*

            *mutating-alist-bindings*
            *mutating-array-bindings*
            *mutating-bitvector-bindings*
            *mutating-fluid-bindings*
            *mutating-hash-bindings*
            *mutating-list-bindings*
            *mutating-pair-bindings*
            *mutating-sort-bindings*
            *mutating-srfi-4-bindings*
            *mutating-string-bindings*
            *mutating-variable-bindings*
            *mutating-vector-bindings*

            *all-pure-bindings*
            *all-pure-and-impure-bindings*))


(define (call-with-time-limit limit thunk limit-reached)
  "Call @var{thunk}, but cancel it if @var{limit} seconds of wall-clock
time have elapsed.  If the computation is cancelled, call
@var{limit-reached} in tail position.  @var{thunk} must not disable
interrupts or prevent an abort via a @code{dynamic-wind} unwind
handler."
  ;; FIXME: use separate thread instead of sigalrm.
  (let ((limit-usecs (inexact->exact (round (* limit 1e6))))
        (prev-sigalarm-handler #f)
        (tag (make-prompt-tag)))
    (call-with-prompt tag
      (lambda ()
        (dynamic-wind
          (lambda ()
            (set! prev-sigalarm-handler
              (sigaction SIGALRM (lambda (sig) (abort-to-prompt tag))))
            (setitimer ITIMER_REAL 0 0 0 limit-usecs))
          thunk
          (lambda ()
            (setitimer ITIMER_REAL 0 0 0 0)
            (match prev-sigalarm-handler
              ((handler . flags)
               (sigaction SIGALRM handler flags))))))
      (lambda (k)
        (limit-reached)))))

(define (call-with-allocation-limit limit thunk limit-reached)
  "Call @var{thunk}, but cancel it if @var{limit} bytes have been
allocated.  If the computation is cancelled, call @var{limit-reached} in
tail position.  @var{thunk} must not disable interrupts or prevent an
abort via a @code{dynamic-wind} unwind handler.

This limit applies to both stack and heap allocation.  The computation
will not be aborted before @var{limit} bytes have been allocated, but
for the heap allocation limit, the check may be postponed until the next 
garbage collection."
  (define (bytes-allocated) (assq-ref (gc-stats) 'heap-total-allocated))
  (let ((zero (bytes-allocated))
        (tag (make-prompt-tag)))
    (define (check-allocation)
      (when (< limit (- (bytes-allocated) zero))
        (abort-to-prompt tag)))