Re: [Haskell-cafe] Compiling arbitrary Haskell code

2013-10-14 Thread Daniil Frumin
For those who are interested (and I already chatted with Chris on IRC),
I've implemented a pastebin that is able (among other things) to
run arbitrary Haskell code: http://paste.hskll.org/
I've also developed a 'restricted-workers' library for managing processes
that should run in a secured environment. I've described some of my endeavors
in a blog post:
http://parenz.wordpress.com/2013/07/15/interactive-diagrams-gsoc-progress-report/

Bottom line: proper restrictions are hard, the necessary tools operate on a
low level, and there are some caveats too.


On Sat, Oct 12, 2013 at 12:30 AM, Christopher Done chrisd...@gmail.com wrote:

 Is there a definitive list of things in GHC that are unsafe to
 _compile_ if I were to take an arbitrary module and compile it?

 E.g. off the top of my head, things that might be dangerous:

 * TemplateHaskell/QuasiQuotes -- obviously
 * Are rules safe?
 * #includes — I presume there's some security risk with including any old
 file?
 * FFI -- speaks for itself

 I'm interested in the idea of compiling Haskell code on lpaste.org,
 for core, rule firings, maybe even Th expansion, etc. When sandboxing
 code that I'm running, it's really easy if I whitelist what code is
 available (parsing with HSE, whitelisting imports, extensions). The
 problem of infinite loops or too much allocation is fairly
 straight-forwardly solved by similar techniques applied in mueval.

 SafeHaskell helps a lot here, but suppose that I want to also allow
 TemplateHaskell, GeneralizedNewtypeDeriving and stuff like that,
 because a lot of real code uses those. They only seem to be restricted
 to prevent cheeky messing with APIs in ways the authors of the APIs
 didn't want -- but that shouldn't necessarily be a security—in terms
 of my system—problem, should it? Ideally I'd very strictly whitelist
 which modules are allowed to be used (e.g. a version of TH that
 doesn't have runIO), and extensions, and then compile any code that
 uses them.

 I'd rather not have to setup a VM just to compile Haskell code safely.
 I'm willing to put some time in to investigate it, but if there's
 already previous work done for this, I'd appreciate any links.

 At the end of the day, there's always just supporting a subset of
 Haskell using SafeHaskell. I'm just curious about the more general
 case, for use-cases similar to my own.
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




-- 
Sincerely yours,
-- Daniil


[Haskell-cafe] Compiling arbitrary Haskell code

2013-10-11 Thread Christopher Done
Is there a definitive list of things in GHC that are unsafe to
_compile_ if I were to take an arbitrary module and compile it?

E.g. off the top of my head, things that might be dangerous:

* TemplateHaskell/QuasiQuotes -- obviously
* Are rules safe?
* #includes — I presume there's some security risk with including any old file?
* FFI -- speaks for itself

I'm interested in the idea of compiling Haskell code on lpaste.org,
for core, rule firings, maybe even TH expansion, etc. When sandboxing
code that I'm running, it's really easy if I whitelist what code is
available (parsing with HSE, whitelisting imports, extensions). The
problem of infinite loops or too much allocation is fairly
straightforwardly solved by techniques similar to those applied in mueval.
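
To make the whitelisting idea concrete, here is a minimal sketch of the
check itself. It assumes the extensions and imports have already been
extracted from the submitted module by a parser such as HSE; the type
`Submission`, the function `check` and both whitelists are hypothetical
names and values chosen for illustration, not anything lpaste or mueval
actually uses.

```haskell
module Whitelist where

-- What a parser such as HSE would hand over after reading the submitted
-- module (assumption: extracting these two lists happens elsewhere).
data Submission = Submission
  { subExtensions :: [String]  -- names from LANGUAGE pragmas
  , subImports    :: [String]  -- imported module names
  }

-- Illustrative whitelists only; a real deployment would choose its own.
allowedExtensions :: [String]
allowedExtensions = ["BangPatterns", "TupleSections", "LambdaCase"]

allowedImports :: [String]
allowedImports = ["Prelude", "Data.List", "Data.Char", "Data.Maybe"]

data Verdict = Accepted | Rejected [String]
  deriving (Eq, Show)

-- Anything not explicitly whitelisted is reported as a problem.
check :: Submission -> Verdict
check sub
  | null problems = Accepted
  | otherwise     = Rejected problems
  where
    problems =
      [ "extension " ++ e | e <- subExtensions sub, e `notElem` allowedExtensions ] ++
      [ "import " ++ m    | m <- subImports sub,    m `notElem` allowedImports ]
```

The check is default-deny: a new extension or module is refused until
someone adds it to a list, which is the easier direction to reason about.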

SafeHaskell helps a lot here, but suppose that I want to also allow
TemplateHaskell, GeneralizedNewtypeDeriving and stuff like that,
because a lot of real code uses those. They only seem to be restricted
to prevent cheeky messing with APIs in ways the authors of the APIs
didn't want, but that shouldn't necessarily be a security problem for
my system, should it? Ideally I'd very strictly whitelist
which modules are allowed to be used (e.g. a version of TH that
doesn't have runIO), and extensions, and then compile any code that
uses them.
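
For anyone wondering why runIO is the part of TH to worry about, here is a
small sketch. The splice runs while GHC is compiling the module, so the IO
happens on the compiling machine before any sandbox around the finished
program comes into play. The name `leakedHome` and the choice of the HOME
variable are made up for the example; `runIO`, `stringE` and `lookupEnv`
are the real functions from template-haskell and base.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Data.Maybe (fromMaybe)
import Language.Haskell.TH (runIO, stringE)
import System.Environment (lookupEnv)

-- The splice below is evaluated by GHC at compile time. runIO lifts an
-- arbitrary IO action into the Q monad, so this reads the environment of
-- the machine doing the compiling and bakes the result into the program
-- as a string literal.
leakedHome :: String
leakedHome = $(runIO (lookupEnv "HOME") >>= stringE . fromMaybe "<HOME unset>")

main :: IO ()
main = putStrLn ("HOME at compile time was: " ++ leakedHome)
```

Reading an environment variable is harmless, but the same splice could
just as well read or write files, or open network connections.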

I'd rather not have to set up a VM just to compile Haskell code safely.
I'm willing to put some time in to investigate it, but if there's
already previous work done for this, I'd appreciate any links.

At the end of the day, there's always just supporting a subset of
Haskell using SafeHaskell. I'm just curious about the more general
case, for use-cases similar to my own.


Re: [Haskell-cafe] Compiling arbitrary Haskell code

2013-10-11 Thread Jason Dagit
On Fri, Oct 11, 2013 at 1:30 PM, Christopher Done chrisd...@gmail.com wrote:

 Is there a definitive list of things in GHC that are unsafe to
 _compile_ if I were to take an arbitrary module and compile it?

 E.g. off the top of my head, things that might be dangerous:

 * TemplateHaskell/QuasiQuotes -- obviously
 * Are rules safe?
 * #includes — I presume there's some security risk with including any old
 file?
 * FFI -- speaks for itself


It really depends on the security properties you want to maintain. That
should inform your policy. For example, denial of service vs. leaking
information (like a password db) vs. allowing yourself to become part of a
botnet. There are lots of things to consider here.

For example, lambdabot has always disallowed IO and thus needs to disallow
unsafeCoerce/unsafePerformIO/unsafeInterleaveIO and anything else that
introduces a backdoor in the type system. I think the list you have above
is a good start, but wouldn't be complete for lambdabot.
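
As a small illustration of the backdoor I mean, the sketch below has a
function whose type says it is pure, yet it performs a side effect every
time it is evaluated. The names `callLog`, `innocent` and `recordedCalls`
are invented for the example, and the effect is only an in-memory log so
that it is harmless to run; it could equally be a file read.

```haskell
module Backdoor where

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A global mutable log, created with the usual unsafePerformIO idiom.
-- NOINLINE keeps GHC from duplicating the IORef.
{-# NOINLINE callLog #-}
callLog :: IORef [Int]
callLog = unsafePerformIO (newIORef [])

-- The type claims purity, but evaluating the result records the argument
-- in the global log behind the caller's back.
{-# NOINLINE innocent #-}
innocent :: Int -> Int
innocent n = unsafePerformIO $ do
  modifyIORef' callLog (n :)
  pure (n * 2)

-- Helper for inspecting what the "pure" function did behind the scenes.
recordedCalls :: IO [Int]
recordedCalls = readIORef callLog
```

A bot that only checks that the expression's type is not IO would happily
evaluate `innocent 21`, which is why those names have to be unavailable.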



 I'm interested in the idea of compiling Haskell code on lpaste.org,
 for core, rule firings, maybe even Th expansion, etc. When sandboxing
 code that I'm running, it's really easy if I whitelist what code is
 available (parsing with HSE, whitelisting imports, extensions). The
 problem of infinite loops or too much allocation is fairly
 straight-forwardly solved by similar techniques applied in mueval.


What type of sandboxing do you plan to use and what limitations does it
have? For example, chroot jails can be defeated.



 SafeHaskell helps a lot here, but suppose that I want to also allow
 TemplateHaskell, GeneralizedNewtypeDeriving and stuff like that,
 because a lot of real code uses those. They only seem to be restricted
 to prevent cheeky messing with APIs in ways the authors of the APIs
 didn't want -- but that shouldn't necessarily be a security—in terms
 of my system—problem, should it? Ideally I'd very strictly whitelist
 which modules are allowed to be used (e.g. a version of TH that
 doesn't have runIO), and extensions, and then compile any code that
 uses them.


GND can be used to cause a segfault. I don't know if it can be used to
cause a more serious exploit, but I would be concerned that it can. Then
again, if you're already allowing TH or arbitrary IO then those are
probably much easier places to attack so it may not matter.



 I'd rather not have to setup a VM just to compile Haskell code safely.
 I'm willing to put some time in to investigate it, but if there's
 already previous work done for this, I'd appreciate any links.


I don't know how well it's documented, but lambdabot has a long history of
restricting the Haskell it accepts to make it safe. Other things to look
at: Google Native Client (to see how they approach sandboxing) and geordi,
the C++ IRC bot.

In the Native Client case they do fancy tricks with segment registers (to
control where the sandboxed process can write to memory) and intercept
system calls in the outer part of the process. They handle the case where
everything runs in one process in one address space. You could imagine
porting the GHC RTS to run in Native Client (didn't someone start on that?)
and then using that to sandbox all your Haskell evaluation.



 At the end of the day, there's always just supporting a subset of
 Haskell using SafeHaskell. I'm just curious about the more general
 case, for use-cases similar to my own.


I think SafeHaskell is a reasonable starting place, but I don't think it
gives you a really strong guarantee yet. Everything that is inferred safe
probably is (I don't know of any exploits with that part of SafeHaskell).
In practice, you'll probably also want to use some trusted packages, but
that requires that none of the stuff you trust is exploitable.
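
To show what a trusted module looks like, here is a sketch in the style of
the restricted IO monad pattern from the Safe Haskell documentation. The
module is marked Trustworthy and hides its constructor, so code compiled
as Safe can only build `RIO` actions from the operations exported here.
The `/sandbox/` prefix and the `pathAllowed` helper are assumptions made
for the example, and the path check is deliberately crude (it does not
deal with symlinks, for instance).

```haskell
{-# LANGUAGE Trustworthy #-}
module RIO (RIO, runRIO, rioReadFile, pathAllowed) where

import Data.List (isInfixOf, isPrefixOf)

-- The constructor UnsafeRIO is deliberately not exported, so importing
-- modules cannot wrap arbitrary IO actions.
newtype RIO a = UnsafeRIO (IO a)

-- Run a restricted action; safe to export because every RIO value was
-- built from the whitelisted operations below.
runRIO :: RIO a -> IO a
runRIO (UnsafeRIO io) = io

instance Functor RIO where
  fmap f (UnsafeRIO io) = UnsafeRIO (fmap f io)

instance Applicative RIO where
  pure = UnsafeRIO . pure
  UnsafeRIO f <*> UnsafeRIO x = UnsafeRIO (f <*> x)

instance Monad RIO where
  UnsafeRIO m >>= k = UnsafeRIO (m >>= runRIO . k)

-- Hypothetical policy: only paths under /sandbox/, and nothing that
-- contains ".." anywhere (conservative, rejects some harmless names too).
pathAllowed :: FilePath -> Bool
pathAllowed p = "/sandbox/" `isPrefixOf` p && not (".." `isInfixOf` p)

-- The only file operation untrusted code gets.
rioReadFile :: FilePath -> RIO String
rioReadFile p
  | pathAllowed p = UnsafeRIO (readFile p)
  | otherwise     = UnsafeRIO (ioError (userError ("path not whitelisted: " ++ p)))
```

The guarantee is only as good as this module: a bug in `pathAllowed` is
exactly the kind of exploitable trusted code I mean.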

I hope that helps,
Jason


Re: [Haskell-cafe] Compiling arbitrary Haskell code

2013-10-11 Thread Aleksey Khudyakov

On 12.10.2013 00:30, Christopher Done wrote:

 Is there a definitive list of things in GHC that are unsafe to
 _compile_ if I were to take an arbitrary module and compile it?

 E.g. off the top of my head, things that might be dangerous:

 * TemplateHaskell/QuasiQuotes -- obviously
 * Are rules safe?
 * #includes -- I presume there's some security risk with including any old file?
 * FFI -- speaks for itself

 I'm interested in the idea of compiling Haskell code on lpaste.org,
 for core, rule firings, maybe even Th expansion, etc. When sandboxing
 code that I'm running, it's really easy if I whitelist what code is
 available (parsing with HSE, whitelisting imports, extensions). The
 problem of infinite loops or too much allocation is fairly
 straight-forwardly solved by similar techniques applied in mueval.

The OPTIONS_GHC pragma. With it you can set a custom preprocessor, for example
bash, and the program is then interpreted as a bash script. I think sandboxing
the compiler is a must. There are just too many handles and hooks to cater to
all possible uses. Some of them must be exploitable.
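
The flags in question are -F together with -pgmF, which GHC accepts from
an OPTIONS_GHC pragma inside the source file. If you still want a first
line of defence in front of the sandbox, a default-deny filter over pragma
names is one option. The sketch below is only that: `pragmaNames`,
`allowedPragmas` and `pragmasAllowed` are hypothetical names, the scan is
a plain text search rather than a real lexer, and it does not replace
sandboxing the compiler.

```haskell
module PragmaFilter where

import Data.Char (isAlphaNum, isSpace, toLower)
import Data.List (stripPrefix, tails)

-- Lower-cased names of everything that looks like a pragma: each "{-#"
-- opener followed by optional whitespace and a name. Being a text search,
-- it also picks up openers inside string literals or comments, which errs
-- on the side of rejecting.
pragmaNames :: String -> [String]
pragmaNames src =
  [ map toLower (takeWhile isNameChar (dropWhile isSpace rest))
  | t <- tails src
  , Just rest <- [stripPrefix "{-#" t]
  ]
  where
    isNameChar c = isAlphaNum c || c == '_'

-- Illustrative whitelist: only LANGUAGE pragmas get through.
allowedPragmas :: [String]
allowedPragmas = ["language"]

-- Accept the source only if every pragma name is whitelisted.
pragmasAllowed :: String -> Bool
pragmasAllowed = all (`elem` allowedPragmas) . pragmaNames
```

Note that a LANGUAGE pragma that passes this filter can still switch on
extensions such as CPP or TemplateHaskell, so its contents need their own
check.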


Re: [Haskell-cafe] Compiling arbitrary Haskell code

2013-10-11 Thread Johan Tibell
Whatever guarantees GHC offers (e.g. using Safe Haskell), I would always
run things like these in a sandbox. It's much better for security to
disallow everything and then whitelist some things (e.g. let the sandbox
communicate with the rest of the world in some limited way) than the other
way around.

Same goes for running untrusted code.




Re: [Haskell-cafe] Compiling arbitrary Haskell code

2013-10-11 Thread Christopher Done
On 12 October 2013 01:19, Johan Tibell johan.tib...@gmail.com wrote:
 Whatever guarantees GHC offers (e.g. using Safe Haskell), I would always run
 things like these in a sandbox. It's much better for security to disallow
 everything and then whitelist some things (e.g. let the sandbox communicate
 with the rest of the world in some limited way) than the other way around.

Yeah, the impression I'm getting is that when compiling pretty much
anything other than simple expressions (a la lambdabot), all
bets are off.