Re: [fonc] misc: code security model
On Friday 12 August 2011 21:23:23 BGB wrote:
> newer Linux distros also seem to do similar to Windows, by default running everything under a default user account, but requiring authorization to elevate the rights of applications (to root), albeit with considerably more retyping of passwords...

Just thought I'd point out that, although Linux and Windows both seem to prompt the user in the same way, there's a distinction in *why* the user is prompted. With Windows the prompt is "Do you really want to do this?"; with Linux the prompt is "Prove that you are userX" (with sudo at least; some distros still prefer su, in which case it's "Prove that you are root").

Also, from working on Web sites with a lot of user-generated content, I thought I'd point out that the permission-checking approach of BGB ends up full of guards: either if (has_permission(...)), and an equal number of else blocks to recover in case of failure; or throw PermissionDeniedException(...) and an equivalent number of catch blocks (or a smaller number of catches *if* the cleanup is straightforward, but this smells of GOTO). Either way, there are a lot of code paths to worry about, and rolling back in the case of failure. Worlds would be useful here (except for I/O), and the if (has_permission(...)) pattern could be represented by the Maybe monad (where foo(Nothing) = Nothing).

The object capability model wouldn't require as many checks, as the calls are always made, even if they're to dummy objects. This is similar to the Maybe monad in that foo(Nothing) = Nothing and dummy.foo() { return; }.

Cheers,
Chris

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
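Chris's contrast can be sketched in JavaScript (hypothetical names, not code from the thread): the permission-check style carries a guard plus a recovery branch at every call site, while the capability style always makes the call, possibly on a do-nothing dummy object, mirroring foo(Nothing) = Nothing.

```javascript
// Permission-check style: every call site has a guard plus a failure path.
function hasPermission(user, action) {
  return user.rights.includes(action); // hypothetical rights model
}

function deleteFilesChecked(user, files) {
  if (hasPermission(user, "delete")) {
    return files.length;              // pretend we deleted them all
  } else {
    return 0;                         // recovery branch, repeated at each site
  }
}

// Capability style: the caller is handed either a real object or a dummy;
// the call is always made, like foo(Nothing) = Nothing.
const realDeleter = { remove: (files) => files.length };
const dummyDeleter = { remove: () => 0 };   // dummy.foo() { return }

function deleteFilesCap(deleter, files) {
  return deleter.remove(files);       // no guard, no else branch
}

const admin = { rights: ["delete"] };
const guest = { rights: [] };

console.log(deleteFilesChecked(admin, ["a", "b"]));   // 2
console.log(deleteFilesChecked(guest, ["a", "b"]));   // 0
console.log(deleteFilesCap(realDeleter, ["a", "b"])); // 2
console.log(deleteFilesCap(dummyDeleter, ["a", "b"])); // 0
```

Note how only the first style forces the failure-handling logic into the caller; in the second, the decision of which object to hand out was made once, by whoever granted the capability.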
Re: [fonc] misc: code security model
On 8/15/2011 3:06 AM, Chris Warburton wrote:
>> newer Linux distros also seem to do similar to Windows, by default running everything under a default user account, but requiring authorization to elevate the rights of applications (to root), albeit with considerably more retyping of passwords...
>
> Just thought I'd point out that, although Linux and Windows both seem to prompt the user in the same way, there's a distinction in *why* the user is prompted. With Windows the prompt is "Do you really want to do this?"; with Linux the prompt is "Prove that you are userX" (with sudo at least; some distros still prefer su, in which case it's "Prove that you are root").

yep.

> Also, from working on Web sites with a lot of user-generated content, I thought I'd point out that the permission-checking approach of BGB ends up full of guards: either if (has_permission(...)), and an equal number of else blocks to recover in case of failure; or throw PermissionDeniedException(...) and an equivalent number of catch blocks (or a smaller number of catches *if* the cleanup is straightforward, but this smells of GOTO).

I added it in right at the VM level, so most of the checks are in strategic locations: when accessing an object; when applying a function object; ... so, no real need to place permission checks all over the place. it actually operates underneath the language, so there's no real need for explicit access checks in the language.

while I was at it, I noticed/fixed a few issues relating to class/instance objects and delegation, and added support for the new idea of get* and set* properties. they differ from normal properties in that they also accept the variable name, so:

function get*(name) { ... }
function set*(name, value) { ... }

also, properties are now handled before delegating, which, although potentially not as good for performance, does have the advantage that things will be handled in the correct order (naive but slightly faster would be to handle them when unwinding, essentially calling any property methods in reverse order).

also had an idea for:

function call*(name, args) { ... }

but, looking over the object system code, it seems I already have a method named doesNotUnderstand which does this. checking the code (extrapolating the signature):

function doesNotUnderstand(name, ...) { ... }

I could either do nothing, or a lazy option would be to make call* a shorthand for doesNotUnderstand, or add the doesNotUnderstand special method to the language spec.

> Either way, there's a lot of code paths to worry about, and rolling back in the case of failure. Worlds would be useful here (except for I/O) and the if (has_permission(...)) pattern could be represented by the Maybe monad (where foo(Nothing) = Nothing).

it is not such an issue, as the way it is implemented doesn't really give a crap about code paths. technically, the VM generally just behaves as if the operation failed with an accessDenied status code (vs a noSlot or noMethod status). the logic is actually not all that much different from an unknown variable/method access (actually, most of the logic is shared in common between these cases). there was, after all, already logic in place for dealing with the case when a given object does not contain a given field or method (usually handled by delegating when possible, or giving up with an error status). potentially later, the VM will throw an exception in the failure + status-code case, but for now it doesn't really bother (work is still needed here).

> The object capability model wouldn't require as many checks, as the calls are always made, even if they're to dummy objects. This is similar to the Maybe monad in that foo(Nothing) = Nothing and dummy.foo() { return; }.
possibly, but one would have to write the code into the HLL, rather than building it into the VM. so, I guess a tradeoff: object-capability is better built into the HLL code, but would be more of a pain to address at the VM/interpreter level; permissions checking is easier to handle at the VM/interpreter level, but could turn into an ugly mess at the HLL level.

much like threading: threading makes far more sense in the VM, especially when it sees the code mostly in terms of a CPS (Continuation Passing Style)-like form; trying to implement threading within the HLL would similarly turn into a very nasty mess (watch as I try to build a multi-threaded program by using lots of call/cc...). so, things may have different cost/benefit tradeoffs depending on the level at which they are implemented. in a sense, security is sort of like the memory-access protection features in x86.

granted, my whole point was about adding security at the VM level to help protect against malevolent code, and not within the HLL code itself. or such...
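The get*/set*/doesNotUnderstand hooks discussed above are features of BGB's own VM, but a rough JavaScript analogue (a sketch of the idea, not his implementation) is a Proxy whose traps also receive the accessed name, just as get*(name) and set*(name, value) do:

```javascript
// Sketch: a Proxy plays the role of get*/set* and doesNotUnderstand.
// The traps receive the property name, like get*(name) / set*(name, value).
function makeDynamicObject() {
  const store = {};
  return new Proxy({}, {
    get(_target, name) {
      if (name in store) return store[name];
      // doesNotUnderstand-style fallback: unknown methods become stubs
      return (...args) =>
        `doesNotUnderstand: ${String(name)}(${args.join(", ")})`;
    },
    set(_target, name, value) {
      store[name] = value;            // the set*(name, value) case
      return true;
    }
  });
}

const obj = makeDynamicObject();
obj.x = 42;                           // routed through the set trap
console.log(obj.x);                   // 42
console.log(obj.frobnicate(1, 2));    // doesNotUnderstand: frobnicate(1, 2)
```

The ordering question in the thread (properties before delegation) maps to the trap checking its own store before falling back, so the nearest handler always wins.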
Re: [fonc] misc: code security model
On Thu, Aug 11, 2011 at 10:22 PM, BGB <cr88...@gmail.com> wrote:
> if the alteration would make the language unfamiliar to people;

It is true that some people would rather work around a familiar, flawed language than accept an improvement. But here's the gotcha, BGB: *those people are not part of your audience*. For example, they will never try BGB script, no matter what improvements it purports.

> fundamental design changes could impact any non-trivial amount of said code. for example, for a single developer, a fundamental redesign in a 750 kloc project is not a small task, and much easier is to find more quick and dirty ways to patch up problems as they arise

One should also account for accumulated technical debt, from all those 'quick and dirty ways to patch up problems as they arise', and the resulting need to globally refactor anyway.

> find a few good (strategic and/or centralized) locations to add security checks, rather than a strategy which would essentially require lots of changes all over the place.

When getting started with capabilities, there are often a few strategic places one can introduce caps - i.e. at module boundaries, and to replace direct access to the toplevel. I've converted a lot of C++ projects to a more capability-oriented style for simple reasons: it's easier to test a program when you can mock up the environment with a set of objects. No security checks are required.

> a simple example would be a login style system: malware author has, say, GoodApp they want to get the login key from; they make a dummy or hacked version of the VM (BadVM), and run the good app with this; GoodApp does its thing, and authorizes itself; malware author takes this key, and puts it into BadApp; BadApp, when run on GoodVM, gives out GoodApp's key, and so can do whatever GoodApp can do.

You've developed some bad security for GoodApp. If GoodApp does authorize itself, it would first authenticate the VM. I.e. BadVM would be unable to run the app. This is a DRM-like model you've envisioned. More traditionally, GoodApp just has a signed hash, i.e. a 'certificate', and the VM can authenticate GoodApp. Either way, BadVM cannot extract the key.

In my own vision, code distributed to your machine would have capabilities. But they still couldn't be stolen because the distributed code (and capabilities) would be specialized to each client - i.e. a lot more code generation tends to undermine use of certificates. The supposed malware developer would only be able to 'steal' keys to his own device, which is somewhat pointless since he doesn't need to steal them.

> just 'passing the blame' to the user is a poor justification for computer security. this is, however, how it is commonly handled with things like Windows.

It's a poor justification there, too.

> the only real alternative is to assume that the user is too stupid for their own good, and essentially disallow them from using the software outright

There are *many* other real alternatives, ranging from running the app in a hardware-virtualized sandbox to use of a powerbox for authority.

> optional features are very common though in most systems

They're still bad for languages.

> code which doesn't need to use pointers, shouldn't use pointers, and if people choose not to use them

Electing not to use pointers is not the same as turning off support for pointers in the compiler (which, for example, would break all the libraries that use pointers).

> in an ideal world, it would be able to be usable in a similar abstraction domain roughly between C++ and JavaScript.

The elegant simplicity of C++ and the blazing speed of JavaScript?

> the permissions are currently intended as the underlying model by which a lot of the above can be achieved.

They aren't necessary.
Re: [fonc] misc: code security model
On 8/12/2011 12:26 AM, David Barbour wrote:
>> if the alteration would make the language unfamiliar to people;
>
> It is true that some people would rather work around a familiar, flawed language than accept an improvement. But here's the gotcha, BGB: *those people are not part of your audience*. For example, they will never try BGB script, no matter what improvements it purports.

but, whether or not they use it, or care that it exists, is irrelevant...

but, anyways, FWIW, I am myself a part of the audience of people who would use my language, so I am free to design it how I want it designed, which happens to largely coincide with common practices.

>> fundamental design changes could impact any non-trivial amount of said code. for example, for a single developer, a fundamental redesign in a 750 kloc project is not a small task, and much easier is to find more quick and dirty ways to patch up problems as they arise
>
> One should also account for accumulated technical debt, from all those 'quick and dirty ways to patch up problems as they arise', and the resulting need to globally refactor anyway.

well, but for now it works... fixing these sorts of issues is a task for another day...

it would be like: I could also go and patch up all the internal cruft currently residing within my object system (more cleanly merging the implementations of Prototype and Class/Instance objects, developing a more orthogonal/unified object model, ...). however, this itself doesn't really matter for now, and so I can put it off until later and work on things of more immediate relevance (and rely more on the fact that most of the details are glossed over by the API).

>> find a few good (strategic and/or centralized) locations to add security checks, rather than a strategy which would essentially require lots of changes all over the place.
>
> When getting started with capabilities, there are often a few strategic places one can introduce caps - i.e. at module boundaries, and to replace direct access to the toplevel. I've converted a lot of C++ projects to a more capability-oriented style for simple reasons: it's easier to test a program when you can mock up the environment with a set of objects. No security checks are required.

but, security checks seem like less up-front effort to bolt onto the VM... in this case, the HLL looks and behaves almost exactly the same as it did before, but now can reject code which violates security and similar. for the most part, one doesn't have to care whether or not the security model exists, much like when using an OS like Linux or Windows: yes, theoretically the security model has hair, and changing file ownership/permissions/... would be a pain, ... but for the most part, the OS makes it all work fairly transparently, so that the user can largely ignore that it all exists. it is not clear that users can so easily ignore the existence of a capability-oriented system.

also, the toplevel is very convenient when entering code interactively from the console, or for use with target_eval entities in my 3D engine (when triggered by something, they evaluate an expression), ...

>> a simple example would be a login style system: malware author has, say, GoodApp they want to get the login key from; they make a dummy or hacked version of the VM (BadVM), and run the good app with this; GoodApp does its thing, and authorizes itself; malware author takes this key, and puts it into BadApp; BadApp, when run on GoodVM, gives out GoodApp's key, and so can do whatever GoodApp can do.
>
> You've developed some bad security for GoodApp. If GoodApp does authorize itself, it would first authenticate the VM. I.e. BadVM would be unable to run the app. This is a DRM-like model you've envisioned. More traditionally, GoodApp just has a signed hash, i.e. a 'certificate', and the VM can authenticate GoodApp. Either way, BadVM cannot extract the key.

I have seen stuff like this before (along with apps which stick magic numbers into the registry, ...). but, anyways, DRM and security are sort of interrelated anyways, just the intended purpose of the validation, and who is friend or enemy, differs some.

> In my own vision, code distributed to your machine would have capabilities. But they still couldn't be stolen because the distributed code (and capabilities) would be specialized to each client - i.e. a lot more code generation tends to undermine use of certificates. The supposed malware developer would only be able to 'steal' keys to his own device, which is somewhat pointless since he doesn't need to steal them.

there are systems in existence based on the above system, and I was basically pointing out a weakness of doing app authentication in the way I was describing there,
Re: [fonc] misc: code security model
On 8/12/2011 9:23 AM, David Barbour wrote:
>> but, whether or not they use it, or care that it exists, is irrelevant...
>
> Then so is the language.

by this criterion, pretty much everything is irrelevant. my concept of relevance is structured differently, and is not based on authority, or on gaining authority over others' choice of which languages/software/... to use, but rather on the ability of a piece of software to be used to complete tasks. in this sense, my stuff has some relevance (to myself) because I make use of it. this is because relevance is a property essentially subject to each individual, their relations to things, and their relations to others, ...

>> but, anyways, FWIW, I am myself a part of the audience of people who would use my language, so I am free to design it how I want it designed, which happens to largely coincide with common practices.
>
> A language is a very large investment, as you are aware. You say 'FWIW' (for what it's worth), but how much is your language worth, really? How much time and effort have you wasted on your language that would have been better spent maintaining a project in an existing language?

most of my projects' codebase is actually in C, but there is a lot of shared architecture as well. in around 2007-2010 I had largely almost forgotten about the existence of my language, but it gained more weight with myself once some random changes (elsewhere in the VM subsystem) caused it to become less lame... 2007 was mostly spent with me trying to migrate my scripting to dynamically-compiled C code, but this wasn't really working out. 2008-2010 was mostly me looking at trying to migrate my efforts to Java, but despite my efforts, I couldn't make Java into something I wanted to use (and using JNI is a big pile of suck...).

also, because in the areas I care about, it does a better job than V8 or SpiderMonkey... also, because unlike Python or Lua, it doesn't seem totally weird... also, because it builds on the same infrastructure as most of the rest of my code, so it is the path of least resistance; ...

like, say, one has an issue: they can either go fix it in their own code, or abandon their code and try to switch over to a different piece of technology. which is less work?... typically, fixing a bug or adding a feature is relatively little effort (minutes or maybe hours). switching to a different piece of VM technology may be a much larger investment, potentially requiring days or weeks of effort, or more (say, requiring a large-scale rewrite of the project from the ground up), and having to climb the learning-curve treadmill (like, for example, familiarizing oneself with all of the Python APIs or similar, ...).

>> but, security checks seem like less up-front effort to bolt on to the VM...
>
> The investment is about the same.

investment == ease of use + implementation costs. the overall cost of capabilities would seem to be a little higher. it would be along the lines of: "well, I am just going to write this big pile of stuff really quick and dirty which looks like it escaped from C", followed by "oh crap, it doesn't work"; whereas, with privilege checks, it will work if it is running as root, or throw an exception otherwise.

now, the question becomes: what about security (assuming the whole system is designed well) and performance? capabilities could potentially be better on this front.

also, security-check models are well proven in systems like Windows and Linux... for example, in Linux, a plain user over SSH tries to execute "shutdown -h 0", but it fails to do anything. this is because it is a matter of rights (via a check), and not because shutdown is nowhere in the path (depends some on the distro though, as some distros tend to leave /bin and /sbin out of ordinary user paths, forcing them to fully qualify the path for system tools after doing an "su root"). however, the ability of a user to type /sbin/shutdown does not compromise the system.

>> it is not clear that users can so easily ignore the existence of a capability-oriented system.
>
> Capability-based security is quite transparent to most users. To those unaware of the security implications, it just looks like a very well designed API - one suitable for concurrent or distributed computing or testing, or flexible composition (e.g. ability to wrap access to the 'console', or have different 'consoles' for different libraries).

I mean transparent in the sense that it looks just like code without it, namely: a big global namespace and the ability to declare and import packages wherever, ...

not that it is really one or the other though, as my current planned strategy actually consists of making use of *both* sets of strategies (in a sort of half-assed way). like, use
Re: [fonc] misc: code security model
On Fri, Aug 12, 2011 at 1:23 PM, BGB <cr88...@gmail.com> wrote:
> also, security-check models are well proven in systems like Windows and Linux...

It is true that there are success stories using checked permissions. But, for security, the successes aren't what you should be counting.

> my current planned strategy actually consists of making use of *both* sets of strategies (in a sort of half-assed way).

I'm sure that strategy will lead you to a sort of half-assed security.

> ideally, IMO, the user retains roughly the same level of security as before, but applications run under their own virtual users with considerably less rights.

Look into PLASH and Polaris.
Re: [fonc] misc: code security model
On 8/12/2011 4:58 PM, David Barbour wrote:
> It is true that there are success stories using checked permissions. But, for security, the successes aren't what you should be counting.

if one counts the number of deployed production systems, they also have this...

> I'm sure that strategy will lead you to a sort of half-assed security.

make it work now, fix problems later... this seems to generally work ok in many real-world environments... in this case, the exploits help tell what needs to be fixed (much like crashes help with finding bugs, ...). better than waiting around with a product forever stuck in beta testing until it is perfect. anyway, if one throws multiple strategies at a problem, probably at least one of them will work.

also, "perfect" systems have a bad habit of turning sour in production environments, and many of the things which tend to work well are those things which have been beaten with hammers, as it were, and gain their reputation by holding up effectively to whatever sorts of challenges come their way, sort of like a race or gauntlet... then one faces setbacks and failures, patches them up, and continues on one's way, ...

> Look into PLASH and Polaris.

looked at Polaris earlier, seemed interesting... PLASH doesn't pull up anything on Wikipedia, but found a link to it using Google, yep...
Re: [fonc] misc: code security model
A huge amount of work has been done in this area in the capability-security world. See for instance the reference to Mark Miller's thesis in the footnotes of http://en.wikipedia.org/wiki/Object-capability_model

A short summary of capability security is that checking permissions is error-prone. Instead, only give out objects you want other pieces to use. In essence there is no global namespace; instead, your supervisor environment only hands out the capabilities that you've decided the program that will be run needs. The canonical example is:

cp a.txt b.txt

vs.

cat < a.txt > b.txt

cp needs access to the global namespace to find a.txt and b.txt, and permissions are checked, etc. That is giving cp the authority to actually modify files other than a.txt and b.txt. The cat example is simple to reason about: the only objects cat needs are the input and output file descriptors, and the shell in this case is the supervisor which hands these capabilities to cat. It doesn't need any access to the filesystem, and so should not be granted the ability to get directory listings or open other files that the user running the program might have access to.

At a language level, one of my favorite papers is http://bracha.org/newspeak-modules.pdf because it addresses the issue of the top-level namespace of a language without making it globally accessible.

Monty

On Wed, Aug 10, 2011 at 7:35 PM, BGB <cr88...@gmail.com> wrote:
> well, ok, this is currently mostly about my own language, but I figured it might be relevant/interesting.
>
> the basic idea is this: not all code may be from trusted sources. consider, say, code comes from the internet. what is a good way of enforcing security in such a case?
>
> the first obvious thing seems to be to disallow any features which could directly circumvent security. say, the code is marked untrusted, and the first things likely to happen would be to disable access to things like raw pointers and to the C FFI. the second thing seems to be the option of moving the code to a local toplevel where its ability to see certain things is severely limited.
>
> both of these pose problems: simply disabling compiler features may not be sufficient, since there may be ways of using the language which may be insecure and which go beyond simply enabling/disabling certain features in the compiler. anything still visible may be tampered with; for example, suppose a global package is made visible in the new toplevel, and the untrusted code decides to define functions in a system package, essentially overwriting the existing functions. this is useful, say, for writing program mods, but may be a bad thing from a security perspective. a partial option is to give untrusted code its own shadowed packages, but this poses other problems.
>
> similarly, an exposed API function may indirectly give untrusted code unexpected levels of power if it, by default, has unhindered access to the system, placing additional burden on library code not to perform operations which may be exploitable. consider something trivial like:
>
> function getTopVar(name) { return top[name]; }
>
> which, if exposed under a visibility-based security scheme, and the function was part of a library package (with full system access), means the whole security model is suddenly dead. essentially it would amount to trying to create water-tight code to avoid potential security leaks.
>
> another security worry is created by, among other things, the semantics of object delegation (at least in my language), where assigning through a delegated object may in turn move up and assign the variable in a delegated-to object (at the VM level there are multiple assignment operators to address these different cases, namely which object will have a variable set in...). this in turn compromises the ability to simply use delegation to give each module its own local toplevel (effectively, the toplevel, and any referenced scopes, would need to be cloned, so as to avoid them being modified), ...
>
> so, I am left idly wondering: could a variation of, say, the Unix security model be applied at the VM level? in this case, any running code would have, say, a UID (UserID, which refers more to the origin of the code than to the actual user) and GID (GroupID). VM objects, variables, and methods would themselves (often implicitly) have access rights assigned to them (for example: Read/Write/Execute/Special).
>
> possible advantages: could be reasonably secure without going through contortions; lessens the problem of unintended levels of power; reasonably transparent at the language level (for the most part); ...
>
> disadvantages: have to implement VM-level security checking in many places; there are many cases where static validation will not work, and where runtime checks would be needed (possible performance issue); may add additional memory costs (now, several types of memory objects will have to remember their owner and access rights,
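The Unix-style VM model BGB floats above — every object carrying an owner UID and rwx-style mode bits, checked at slot-access time — might look roughly like the following sketch (hypothetical structure, not BGB's actual VM; status strings stand in for his accessDenied/noSlot status codes):

```javascript
// Sketch of a Unix-like permission check at the VM's slot-access level.
const READ = 4, WRITE = 2, EXEC = 1;

function makeVmObject(ownerUid, mode, slots) {
  return { ownerUid, mode, slots };
}

// Returns the slot value, or a status string, mimicking a VM that fails
// with a status code rather than throwing.
function vmGetSlot(obj, name, callerUid) {
  const bits = (callerUid === 0 || callerUid === obj.ownerUid)
    ? (obj.mode >> 6)     // root or owner: use the owner bits
    : (obj.mode & 7);     // everyone else: the "other" bits
  if (!(bits & READ)) return "accessDenied";
  if (!(name in obj.slots)) return "noSlot";
  return obj.slots[name];
}

// A system object owned by "root" (UID 0), mode 0700: owner-only access.
const sysObj = makeVmObject(0, 0o700, { shutdown: "function-object" });

console.log(vmGetSlot(sysObj, "shutdown", 0));    // "function-object"
console.log(vmGetSlot(sysObj, "shutdown", 1000)); // "accessDenied"
console.log(vmGetSlot(sysObj, "reboot", 0));      // "noSlot"
```

Note how the accessDenied and noSlot failure paths share one code path, which matches the point later in the thread that most of this logic can be shared with ordinary unknown-field handling.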
Re: [fonc] misc: code security model
On Wed, Aug 10, 2011 at 7:35 PM, BGB <cr88...@gmail.com> wrote:
> not all code may be from trusted sources. consider, say, code comes from the internet. what is a good way of enforcing security in such a case?

Object capability security is probably the very best approach available today - in terms of a wide variety of criteria such as flexibility, performance, precision, visibility, awareness, simplicity, and usability. In this model, the ability to send a message to an object is sufficient proof that you have rights to use it - there are no passwords, no permission checks, etc. The security discipline involves controlling who has access to which objects - i.e. there are a number of patterns, such as 'revocable forwarders', where you'll provide an intermediate object that allows you to audit and control access to another object. You can read about several of these patterns on the erights wiki [1].

Access to FFI and such would be regulated through objects. This leaves the issue of deciding: which objects should untrusted code get access to? Disabling all of FFI is often too extreme.

My current design: FFI is a network of registries. Plugins and services publish FFI objects (modules) to these registries. Different registries are associated with different security levels, and there might be connections between them based on relative trust and security. A single FFI plugin might provide similar objects at multiple security levels - e.g. access to an HTTP service might be provided at a low security level for remote addresses, but at a high security level that allows for local (127, 192.168, 10.0.0, etc.) addresses. One reason to favor plugin-based FFI is that it is easy to develop security policy for high-level features compared to low-level capabilities. (E.g. access to generic 'local storage' is a lower security level than access to 'filesystem'.)

Other than security, my design is meant to solve other difficult problems involving code migration [2], multi-process and distributed extensibility (easy to publish modules to registries even from other processes or servers; similar to web-server CGI), smooth transitions from legacy, extreme resilience and self-healing (multiple fallbacks per FFI dependency), and policy/configuration management [3].

[1] http://wiki.erights.org/wiki/Walnut/Secure_Distributed_Computing
[2] http://wiki.erights.org/wiki/Unum
[3] http://c2.com/cgi/wiki?PolicyInjection

> the second thing seems to be the option of moving the code to a local toplevel where its ability to see certain things is severely limited.

Yes, this is equivalent to controlling which 'capabilities' are available in a given context. Unfortunately, developers lack 'awareness' - i.e. it is not explicit in code that certain capabilities are needed by a given library, so failures occur much later, when the library is actually loaded. This is part of why I eventually abandoned dynamic scopes (where 'dynamic scope' would include the toplevel [4]).

[4] http://c2.com/cgi/wiki?ExplicitManagementOfImplicitContext

> simply disabling compiler features may not be sufficient

It is also a bad idea. You end up with 2^N languages for N switches. That's hell to test and verify. Libraries developed for different sets of switches will consequently prove buggy when people try to compose them. This is even more documentation to manage.

> anything still visible may be tampered with; for example, suppose a global package is made visible in the new toplevel, and the untrusted code decides to define functions in a system package, essentially overwriting the existing functions

Indeed. Almost every language built for security makes heavy use of immutable objects. They're easier to reason about. For example, rather than replacing the function in the package, you would be forced to create a new record that is the same as the old one but replaces one of the functions. Access to mutable state is more tightly controlled - i.e. an explicit capability to inject a new stage in a pipeline, rather than implicit access to a variable. We don't lose any flexibility, but the 'path of least resistance' is much more secure.

> an exposed API function may indirectly give untrusted code unexpected levels of power if it, by default, has unhindered access to the system, placing additional burden on library code not to perform operations which may be exploitable

This is why whitelisting, rather than blacklisting, should be the rule for security.

> assigning through a delegated object may in turn move up and assign the variable in a delegated-to object (at the VM level there are multiple assignment operators to address these different cases, namely which object will have a variable set in...).

The security problem isn't delegation, but rather the fact that this chaining is 'implicit', so developers easily forget about it and thus leave security holes. A library of security patterns could help out. E.g. you could ensure your
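The 'revocable forwarder' pattern mentioned above can be sketched in a few lines of JavaScript (an illustration of the idea, not the erights implementation): the holder is handed a forwarder, and the grantor keeps a revoke switch that cuts off access after the fact.

```javascript
// Revocable forwarder: wraps a capability so the grantor can revoke it later.
function makeRevocable(target) {
  let current = target;
  const forwarder = new Proxy({}, {
    get(_t, name) {
      if (current === null) throw new Error("capability revoked");
      return (...args) => current[name](...args);  // forward the call
    }
  });
  return { forwarder, revoke: () => { current = null; } };
}

// A file capability; the untrusted party only ever sees the forwarder.
const file = { read: () => "secret contents" };
const { forwarder, revoke } = makeRevocable(file);

console.log(forwarder.read());  // "secret contents"
revoke();
// any further forwarder.read() now throws "capability revoked"
```

The same wrapper is also a natural place for the auditing David mentions: the get trap sees every access, so logging or rate-limiting drops in without touching the target object.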
Re: [fonc] misc: code security model
On 8/11/2011 10:08 AM, Monty Zukowski wrote: A huge amount of work has been done in this area in the capability security world. See for instance the reference to Mark Miller's thesis in the footnotes of http://en.wikipedia.org/wiki/Object-capability_model A short summary of capability security is that checking permissions is error prone. Instead, only give out objects you want other pieces to use. In essence there is no global namespace; instead your supervisor environment only hands out the capabilities that you've decided the program that will be run needs.

I ran across some of this while looking to see if anyone else had done similar. there are some merits, but also a few drawbacks: if there is a way around the wall of isolation, then it is effectively broken (the article mentions this as well); this would have a notable impact on the design of an HLL (and couldn't just be retrofitted onto an existing traditional OO language such as ActionScript or C#).

a model based on permissions/... potentially offers a bigger safety net, since even if one can get a reference to something, they can't likely use it. granted, the main way of breaking a Unix-style model would be by getting root, whereby if code can manage to somehow gain root privileges, then they have full access to the system.

a hypothetical hole would be, for example:

_setuid public function fakesudo(str) { eval(str); }
fakesudo("load(\"bad_code.bs\");");

which basically means some care would be needed as to what sorts of functions can be given setuid.

The canonical example is cp a.txt b.txt vs. cat < a.txt > b.txt. cp needs access to the global namespace to find a.txt and b.txt and permissions are checked, etc. That is giving cp the authority to actually modify files other than a.txt and b.txt. The cat example is simple to reason about. The only objects cat needs are the input and output file descriptors, and the shell in this case is the supervisor which hands these capabilities to cat.
It doesn't need any access to the filesystem and so should not be granted the ability to get directory listings or open other files that the user running the program might have access to.

yes, but it can also be noted that having to use 'cat' instead of 'cp', in a general sense, would impact the system in very fundamental ways. although, effectively, what I am imagining is sort of a hybrid strategy (where user code is basically given its own toplevel, probably with a new toplevel per-user and linked to a shared group-specific toplevel). potentially, some features (such as access to the C FFI) would be omitted from the user toplevel (in addition to being restricted to being root-only).

I am debating some whether this should only apply to the FFI access object (IOW: the object that allows seeing into C land), or if it should also apply to retrieved functions. I am slightly leaning on allowing looser access to retrieved functions, mostly as otherwise it would mean piles of setuid wrapper functions to be able to expose native APIs to user code, whereas if they can be re-exposed as --x, then user code can still access exposed native functions.

hypothetical example (as root):

native import C.math;
var mathobj={sin: sin, cos: cos, tan: tan, ...}; //re-expose math functions
...
var newtop={#math: mathobj, ...}; //where (#name: value) defines a delegate slot
//possible future load with an attribute list:
load("userscript.bs", top: newtop, user: someapp, group: apps);

(could probably do similar with eval, vs the current fixed 1- or 2-arg forms...).

At a language level one of my favorite papers is http://bracha.org/newspeak-modules.pdf because it addresses the issue of the top level namespace of a language without making it globally accessible.

may have to read this... could be interesting.

Monty

On Wed, Aug 10, 2011 at 7:35 PM, BGB cr88...@gmail.com wrote: well, ok, this is currently mostly about my own language, but I figured it might be relevant/interesting.
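As a sketch of the whitelisted-toplevel idea above, in plain JavaScript standing in for the script language (the names `newtop` and `runUser` are illustrative, and this is only a sketch of the intent, not a hardened sandbox):

```javascript
// Re-expose only approved math functions, frozen so untrusted code
// cannot overwrite them in place.
const mathobj = Object.freeze({ sin: Math.sin, cos: Math.cos, tan: Math.tan });

// The user "toplevel" delegates to the shared frozen object; it holds no
// FFI access object and no reference to the real global environment.
const newtop = Object.freeze({ math: mathobj });

// Evaluate untrusted source with only 'top' in scope. (Illustrative only:
// a real VM needs stronger isolation than new Function provides.)
function runUser(src, top) {
  return new Function("top", `"use strict"; return (${src});`)(top);
}

console.log(runUser("top.math.sin(0)", newtop)); // 0
```

Because `newtop` is frozen, an untrusted script that tries to redefine `top.math.sin` simply fails rather than poisoning the shared package, which is the tampering scenario raised earlier in the thread.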
the basic idea is this: not all code may be from trusted sources. consider, say, code comes from the internet. what is a good way of enforcing security in such a case? first obvious thing seems to be to disallow any features which could directly circumvent security. say, the code is marked untrusted, and the first things likely to happen would be to disable access to things like raw pointers and to the C FFI. the second thing seems to be the option of moving the code to a local toplevel where its ability to see certain things is severely limited. both of these pose problems: simply disabling compiler features may not be sufficient, since there may be ways of using the language which may be insecure and which go beyond simply enabling/disabling certain features in the compiler. anything still visible may be tampered with, for example, suppose a global package is made visible in the new toplevel, and the untrusted code decides to define
Re: [fonc] misc: code security model
On Thu, Aug 11, 2011 at 1:07 PM, BGB cr88...@gmail.com wrote: this would have a notable impact on the design of an HLL (and couldn't just be retrofitted onto an existing traditional OO language such as ActionScript or C#).

That's a fair point. Some projects such as Joe-E [1] achieve something similar to a retrofit, requiring developers to write code in a safe subset of a language. JavaScript is also becoming more capability oriented, with the ES5 'strict' mode supporting a transition phase [2].

[1] http://en.wikipedia.org/wiki/Joe-E
[2] http://wiki.ecmascript.org/doku.php?id=harmony:harmony

a model based on permissions/... potentially offers a bigger safety net, since even if one can get a reference to something, they can't likely use it.

Ah, but like most nets, you're likely to have a lot of holes. Managing permissions is even more painful than managing explicit static types and explicitly propagating error codes. It is not first on the mind of most developers. One of the big advantages of ocaps is that it tends to put security on the path of least resistance - i.e. developer laziness and security are coupled [3].

[3] http://www.youtube.com/watch?v=eL5o4PFuxTY

yes, but it can also be noted that having to use 'cat' instead of 'cp', in a general sense, would impact the system in very fundamental ways.

Security will impact a system in very fundamental ways. ;-)

Regards, Dave

___ fonc mailing list fonc@vpri.org http://vpri.org/mailman/listinfo/fonc
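The contrast between permission management and the ocap path-of-least-resistance can be made concrete with a small sketch (plain JavaScript; all names here are hypothetical, not from any real system):

```javascript
// Permission-check style: every use site needs a guard and a failure path.
function deleteFilePermStyle(user, path, fs) {
  if (!user.permissions.includes("fs:delete")) {
    return { ok: false, error: "permission denied" }; // extra path to test
  }
  fs.remove(path);
  return { ok: true };
}

// Capability style: the "check" happened earlier, when (and if) the caller
// was handed a deleter scoped to one directory. The reference IS the
// authority, so the call site needs no guard at all.
function makeDeleter(fs, dir) {
  return (name) => fs.remove(dir + "/" + name);
}

// Toy filesystem object standing in for real file authority.
const fakeFs = { removed: [], remove(p) { this.removed.push(p); } };

const del = makeDeleter(fakeFs, "/tmp/sandbox"); // supervisor grants this
del("junk.txt");
console.log(fakeFs.removed); // [ '/tmp/sandbox/junk.txt' ]
```

The first style accumulates `else`/`catch` paths as Chris described earlier in the thread; the second has nothing to forget to check, which is the "laziness and security are coupled" point.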
Re: [fonc] misc: code security model
On 8/11/2011 12:55 PM, David Barbour wrote: On Wed, Aug 10, 2011 at 7:35 PM, BGB cr88...@gmail.com wrote: not all code may be from trusted sources. consider, say, code comes from the internet. what is a good way of enforcing security in such a case?

Object capability security is probably the very best approach available today - in terms of a wide variety of criteria such as flexibility, performance, precision, visibility, awareness, simplicity, and usability. In this model, the ability to send a message to an object is sufficient proof that you have rights to use it - there are no passwords, no permissions checks, etc. The security discipline involves controlling who has access to which objects - i.e. there are a number of patterns, such as 'revocable forwarders', where you'll provide an intermediate object that allows you to audit and control access to another object. You can read about several of these patterns on the erights wiki [1].

the big problem though: to try to implement this as a sole security model, and expecting it to be effective, would likely impact language design and programming strategy, and possibly lead to a fair amount of effort WRT hole plugging in an existing project.

granted, code will probably not use logins/passwords for authority, as this would likely be horridly ineffective for code (about as soon as a piece of malware knows the login used by a piece of trusted code, it can spoof as the code and do whatever it wants). digital signing is another possible strategy, but poses a similar problem: how to effectively prevent spoofing (say, one manages to extract the key from a trusted app, and then signs a piece of malware with it). AFAICT, the usual strategy used with SSL certificates is that they may expire and are checked against a certificate authority.
although maybe reasonably effective for the internet, this seems to be a fairly complex and heavy-weight approach (not ideal for software, especially not FOSS, as most such authorities want money and require signing individual binaries, ...). my current thinking is roughly along the line that each piece of code will be given a fingerprint (possibly an MD5 or SHA hash), and this fingerprint is either known good to the VM itself (for example, its own code, or code that is part of the host application), or may be confirmed as trusted by the user (if it requires special access, ...). it is a little harder to spoof a hash, and tampering with a piece of code will change its hash (although with simpler hashes, such as checksums and CRC's, it is often possible to use a glob of garbage bytes to trick the checksum algorithm into giving the desired value). yes, there is still always the risk of a naive user confirming a piece of malware, but this is their own problem at this point. Access to FFI and such would be regulated through objects. This leaves the issue of deciding: how do we decide which objects untrusted code should get access to? Disabling all of FFI is often too extreme. potentially. my current thinking is, granted, that it will disable access to the FFI access object (internally called ctop in my VM), which would disable the ability to fetch new functions/... from the FFI (or perform native import operations with the current implementation). however, if retrieved functions are still accessible, it might be possible to retrieve them indirectly and then make them visible this way. as noted in another message: native import C.math; var mathobj={sin: sin, cos: cos, tan: tan, ...}; giving access to mathobj will still allow access to these functions, without necessarily giving access to the entire C toplevel, which poses a much bigger security risk. sadly, there is no real good way to safely streamline this in the current implementation. 
My current design: FFI is a network of registries. Plugins and services publish FFI objects (modules) to these registries. Different registries are associated with different security levels, and there might be connections between them based on relative trust and security. A single FFI plugin might provide similar objects at multiple security levels - e.g. access to HTTP service might be provided at a low security level for remote addresses, but at a high security level that allows for local (127, 192.168, 10.0.0, etc.) addresses. One reason to favor plugin-based FFI is that it is easy to develop security policy for high-level features compared to low-level capabilities. (E.g. access to generic 'local storage' is lower security level than access to 'filesystem'.) my FFI is based on bulk importing the contents of C headers. although fairly powerful and convenient, securing such a beast is likely to be a bit of a problem. easier just to be like code which isn't trusted can't directly use the FFI Other than security, my design is to solve other difficult problems
Re: [fonc] misc: code security model
I feel obligated to comment on usage of MD5 for any security purpose: http://www.codeproject.com/KB/security/HackingMd5.aspx

On Thu, Aug 11, 2011 at 19:06, BGB cr88...@gmail.com wrote: [snip]
Re: [fonc] misc: code security model
On Thu, Aug 11, 2011 at 5:06 PM, BGB cr88...@gmail.com wrote: the big problem though: to try to implement this as a sole security model, and expecting it to be effective, would likely impact language design and programming strategy, and possibly lead to a fair amount of effort WRT hole plugging in an existing project.

A problem with language design is only a big problem if a lot of projects are using the language. Security is a big problem today because a lot of projects use languages that were not designed with effective security as a requirement.

how to effectively prevent spoofing (say, one manages to extract the key from a trusted app, and then signs a piece of malware with it).

Reason about security *inductively*. Assume the piece holding the key is secure up to its API. If you start with assumptions like: well, let's assume the malware has backdoor access to your keys and such, you're assuming insecurity - you'll never reach security from there. Phrases such as 'trusted app' or 'trusted code' smell vaguely of brimstone - like a road built of good intentions. What is the app trusted with? How do we answer this question with a suitably fine-grained executable policy?

yes, there is still always the risk of a naive user confirming a piece of malware, but this is their own problem at this point.

I disagree. Computer security includes concerns such as limiting and recovering from damage, and awareness. And just 'passing the blame' to the user is a rather poor justification for computer security.

if trying to use a feature simply makes code using it invalid (sorry, I can't let you do that), this works.

When I first got into language design, I thought as you did. Then I realized:

* With optional features, I have 2^N languages no matter how I implement them.
* I pay implementation, maintenance, debugging, documentation, and design costs for those features, along with different subsets of them.
* Library developers are motivated to write for the Least Common Denominator (LCD) language anyway, for reusable code. * Library developers can (and will) create frameworks, interpreters, EDSLs to support more features above the LCD. * Therefore, it is wasteful to spend my time on anything but the LCD features, and make life as cozy as possible for library developers and their EDSLs. *The only cogent purpose of general purpose language design is to raise the LCD.* Optional features are a waste of time and effort, BGB - yours, and of everyone who as a result submits a bug report or wades through the documentation. with a language/VM existing for approx 8 years and with ~ 540 opcodes, ... I guess things like this are inevitable. I think this is a property of your language design philosophy, rather than inherent to language development. but whitelisting is potentially much more effort than blacklisting, even if potentially somewhat better from a security perspective. Effectiveness for effort, whitelisting is typically far better than blacklisting. In most cases, it is less effort. Always, it is far easier to reason about. I think you'll need to stretch to find rare counter-examples. LambdaMoo found a MUD, if this is what was in question... LambdaMoo is a user-programmable MUD, with prototype based objects and a Unix-like security model. as for simple or efficient, a Unix-style security model doesn't look all that bad. Unix security model is complicated, inefficient, and ineffective compared to object capability model. But I agree that you could do worse. luckily, there are only a relatively small number of places I really need to put in security checks (mostly in the object system and similar). most of the rest of the typesystem or VM doesn't really need them. I recommend you pursue control of the toplevel capabilities (FFI, and explicit forwarding of math, etc.) 
as you demonstrated earlier, perhaps add some support for 'freezing' objects to block implicit delegation of assignment, and simply forget about Unix or permissions checks. Regards, David ___ fonc mailing list fonc@vpri.org http://vpri.org/mailman/listinfo/fonc
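The two patterns recommended above - freezing objects and the 'revocable forwarder' mentioned earlier in the thread - can be sketched together in plain JavaScript (the names `makeRevocable` and `proxy` are illustrative, not from any particular capability library):

```javascript
// Revocable forwarder: hand out 'proxy' instead of the real object; the
// supervisor keeps 'revoke' and can cut off access at any time.
function makeRevocable(target) {
  let current = target; // closure variable the frozen proxy forwards through
  const proxy = Object.freeze({
    call(method, ...args) {
      if (current === null) throw new Error("capability revoked");
      return current[method](...args);
    },
  });
  return { proxy, revoke: () => { current = null; } };
}

const file = { read: () => "secret contents" };
const { proxy, revoke } = makeRevocable(file);

console.log(proxy.call("read")); // "secret contents"
revoke();
try {
  proxy.call("read");
} catch (e) {
  console.log(e.message);        // "capability revoked"
}
```

Freezing the proxy blocks the implicit-assignment tampering discussed earlier, while the closure gives the grantor a kill switch without any permission checks at the call sites.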
Re: [fonc] misc: code security model
On 8/11/2011 7:35 PM, Tristan Slominski wrote: I feel obligated to comment on usage of MD5 for any security purpose: http://www.codeproject.com/KB/security/HackingMd5.aspx

but, to be fair, that is a fairly contrived example... it is at least not like, say, Adler-32 or CRC-32 where one can (fairly quickly) brute-force a glob of bozo bytes to make one file look like another (or apparently with CRC-32, make use of a few well-placed xors and similar, at least going by some examples I found on the internet).

like, say, file A has a certain CRC; file B is a virus, followed by a sequence of bytes to fiddle the CRC bits into the desired value, and to pad the file to the expected size (among other possibilities). Adler-32 is basically a pair of sums, and with some fiddling and arithmetic, one can (probably) push the sums up/down (in relation) until they match the expected values.

On Thu, Aug 11, 2011 at 19:06, BGB cr88...@gmail.com wrote: [snip]
Re: [fonc] misc: code security model
On 8/11/2011 8:16 PM, David Barbour wrote: On Thu, Aug 11, 2011 at 5:06 PM, BGB cr88...@gmail.com wrote: the big problem though: to try to implement this as a sole security model, and expecting it to be effective, would likely impact language design and programming strategy, and possibly lead to a fair amount of effort WRT hole plugging in an existing project.

A problem with language design is only a big problem if a lot of projects are using the language. Security is a big problem today because a lot of projects use languages that were not designed with effective security as a requirement.

or: if the alteration would make the language unfamiliar to people; if one has, say, a large pile of code (say, for example, 500 kloc or 1 Mloc or more), and fundamental design changes could impact any non-trivial amount of said code. for example, for a single developer, a fundamental redesign in a 750 kloc project is not a small task, and much easier is to find more quick and dirty ways to patch up problems as they arise, or find a few good (strategic and/or centralized) locations to add security checks, rather than a strategy which would essentially require lots of changes all over the place.

how to effectively prevent spoofing (say, one manages to extract the key from a trusted app, and then signs a piece of malware with it).

Reason about security *inductively*. Assume the piece holding the key is secure up to its API. If you start with assumptions like: well, let's assume the malware has backdoor access to your keys and such, you're assuming insecurity - you'll never reach security from there.

the problem though is that it may be possible for a person making the piece of malware to get at the keys indirectly...
a simple example would be a login-style system: malware author has, say, GoodApp they want to get the login key from; they make a dummy or hacked version of the VM (Bad VM), and run the good app with this; GoodApp does its thing, and authorizes itself; malware author takes this key, and puts it into BadApp; BadApp, when run on GoodVM, gives out GoodApp's key, and so can do whatever GoodApp can do.

these types of problems are typically addressed (partially) with the VM/... logging into a server and authenticating keys over the internet, but there are drawbacks with this as well.

Phrases such as 'trusted app' or 'trusted code' smell vaguely of brimstone - like a road built of good intentions. What is the app trusted with? How do we answer this question with a suitably fine-grained executable policy?

the terminology is mostly from what all I have read regarding the .NET and Windows security architecture... but, generally the trust is presumably spread between several parties: the vendors of the software (VM, apps, ...); the user of the software.

yes, there is still always the risk of a naive user confirming a piece of malware, but this is their own problem at this point.

I disagree. Computer security includes concerns such as limiting and recovering from damage, and awareness. And just 'passing the blame' to the user is a rather poor justification for computer security.

this is, however, how it is commonly handled with things like Windows. if something looks suspect to the OS (bad file signing, the app trying to access system files, ...) then Windows pops up a dialogue: Do you want to allow this app to do this? at which point the user confirms this; then yes, it is their problem.

the only real alternative is to assume that the user is too stupid for their own good, and essentially disallow them from using the software outright.
in practice, this is a much bigger problem, as then one has taken away user rights (say, they can no longer install non-signed apps on their system...). systems which have taken the above policy have then often been manually jailbroken or rooted by the users, essentially gaining personal freedom at the risk of (potentially) compromising their personal security (or voided their warranty, or broke the law). better I think to make the system do its best effort to keep itself secure, but then delegate to the user for the rest. if trying to use a feature simply makes code using it invalid (sorry, I can't let you do that), this works. When I first got into language design, I thought as you did. Then I realized: * With optional features, I have 2^N languages no matter how I implement them. * I pay implementation, maintenance, debugging, documentation, and design costs for those features, along with different subsets of them. * Library developers are motivated to write for the Least Common Denominator (LCD) language anyway, for reusable code. * Library developers can (and will) create frameworks, interpreters, EDSLs to support more features above the LCD. * Therefore, it is wasteful to spend my time on anything but the LCD