I feel obligated to comment on usage of MD5 for any security purpose: http://www.codeproject.com/KB/security/HackingMd5.aspx
On Thu, Aug 11, 2011 at 19:06, BGB <[email protected]> wrote:

> On 8/11/2011 12:55 PM, David Barbour wrote:
>
>> On Wed, Aug 10, 2011 at 7:35 PM, BGB <[email protected]> wrote:
>>
>>> not all code may be from trusted sources.
>>> consider, say, code that comes from the internet.
>>>
>>> what is a "good" way of enforcing security in such a case?
>>
>> Object capability security is probably the best approach available
>> today, in terms of a wide variety of criteria such as flexibility,
>> performance, precision, visibility, awareness, simplicity, and
>> usability.
>>
>> In this model, the ability to send a message to an object is
>> sufficient proof that you have rights to use it; there are no
>> passwords, no permission checks, etc. The security discipline
>> involves controlling who has access to which objects, i.e. there
>> are a number of patterns, such as 'revocable forwarders', where you
>> provide an intermediate object that allows you to audit and control
>> access to another object. You can read about several of these
>> patterns on the erights wiki [1].
>
> the big problem, though: trying to implement this as the sole
> security model, and expecting it to be effective, would likely
> impact language design and programming strategy, and possibly lead
> to a fair amount of effort WRT "hole plugging" in an existing
> project.
>
> granted, code will probably not use logins/passwords for authority,
> as this would likely be horridly ineffective for code (about as soon
> as a piece of malware knows the login used by a piece of "trusted"
> code, it can spoof as that code and do whatever it wants).
>
> "digital signing" is another possible strategy, but poses a similar
> problem: how to effectively prevent spoofing (say, if one manages to
> "extract" the key from a trusted app and then signs a piece of
> malware with it).
>
> AFAICT, the usual strategy used with SSL certificates is that they
> may expire and are checked against a "certificate authority".
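The 'revocable forwarder' pattern mentioned above can be sketched in a few lines. This is only a hedged illustration in Python; the class and method names are my own, not from the erights wiki:

```python
class Revoked(Exception):
    """Raised when a revoked capability is used."""

class RevocableForwarder:
    """Forwards attribute access to a wrapped target until revoked.

    Handing out the forwarder instead of the target lets the grantor
    cut off access later without touching the target itself.
    """
    def __init__(self, target):
        self._target = target

    def revoke(self):
        self._target = None

    def __getattr__(self, name):
        # Only called for names not found normally (i.e. anything other
        # than _target/revoke), so this forwards method access.
        if self._target is None:
            raise Revoked("capability has been revoked")
        return getattr(self._target, name)

class Counter:
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1
        return self.n

counter = Counter()
cap = RevocableForwarder(counter)   # grant: share 'cap', keep 'counter'
cap.bump()                          # works while the grant is live
cap.revoke()                        # further use raises Revoked
```

The point is that possession of `cap` *is* the permission; revocation is done by the grantor, not by a password or ACL check.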
> although maybe reasonably effective for the internet, this seems to
> be a fairly complex and heavyweight approach (not ideal for
> software, especially not FOSS, as most such authorities want money
> and require signing individual binaries, ...).
>
> my current thinking is roughly along the line that each piece of
> code will be given a "fingerprint" (possibly an MD5 or SHA hash),
> and this fingerprint is either known good to the VM itself (for
> example, its own code, or code that is part of the host
> application), or may be confirmed as "trusted" by the user (if it
> requires special access, ...).
>
> it is a little harder to spoof a hash, and tampering with a piece of
> code will change its hash (although with simpler hashes, such as
> checksums and CRCs, it is often possible to use a glob of "garbage
> bytes" to trick the checksum algorithm into giving the desired
> value).
>
> yes, there is still always the risk of a naive user confirming a
> piece of malware, but this is their own problem at that point.
>
>> Access to FFI and such would be regulated through objects. This
>> leaves the issue of deciding: how do we decide which objects
>> untrusted code should get access to? Disabling all of FFI is often
>> too extreme.
>
> potentially. my current thinking is, granted, that it will disable
> access to the "FFI access object" (internally called "ctop" in my
> VM), which would disable the ability to fetch new functions/... from
> the FFI (or perform "native import" operations with the current
> implementation).
>
> however, if retrieved functions are still accessible, it might be
> possible to retrieve them indirectly and make them visible this way.
>
> as noted in another message:
>
> native import C.math;
> var mathobj={sin: sin, cos: cos, tan: tan, ...};
>
> giving access to "mathobj" will still allow access to these
> functions, without necessarily giving access to "the entire C
> toplevel", which poses a much bigger security risk.
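The fingerprint scheme described above can be sketched as follows. Given the practical MD5 collisions linked at the top of the thread, the sketch uses SHA-256; the names (`KNOWN_GOOD`, `confirm_trust`, etc.) are assumptions of mine, not part of the actual VM:

```python
import hashlib

# Fingerprints the VM itself ships as known-good (in practice these
# would be the hashes of the host application's own scripts).
KNOWN_GOOD: set[str] = set()
USER_TRUSTED: set[str] = set()  # filled when the user confirms a module

def fingerprint(code: bytes) -> str:
    # SHA-256 rather than MD5: MD5 collisions are practical, so two
    # different code blobs could share an MD5 fingerprint.
    return hashlib.sha256(code).hexdigest()

def is_trusted(code: bytes) -> bool:
    fp = fingerprint(code)
    return fp in KNOWN_GOOD or fp in USER_TRUSTED

def confirm_trust(code: bytes) -> None:
    """Called after the user explicitly confirms an untrusted module."""
    USER_TRUSTED.add(fingerprint(code))
```

Unlike a CRC, any tampering with the code changes the SHA-256 fingerprint in a way that cannot practically be compensated for with padding bytes.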
> sadly, there is no real good way to safely "streamline" this in the
> current implementation.
>
>> My current design: FFI is a network of registries. Plugins and
>> services publish FFI objects (modules) to these registries.
>> Different registries are associated with different security levels,
>> and there might be connections between them based on relative trust
>> and security. A single FFI plugin might provide similar objects at
>> multiple security levels - e.g. access to HTTP service might be
>> provided at a low security level for remote addresses, but at a
>> high security level that allows for local (127, 192.168, 10.0.0,
>> etc.) addresses. One reason to favor plugin-based FFI is that it is
>> easy to develop security policy for high-level features compared to
>> low-level capabilities. (E.g. access to generic 'local storage' is
>> a lower security level than access to 'filesystem'.)
>
> my FFI is based on bulk importing the contents of C headers.
>
> although fairly powerful and convenient, "securing" such a beast is
> likely to be a bit of a problem.
>
> easier just to be like "code which isn't trusted can't directly use
> the FFI...".
>
>> Other than security, my design is intended to solve other difficult
>> problems involving code migration [2], multi-process and
>> distributed extensibility (easy to publish modules to registries
>> even from other processes or servers; similar to web-server CGI),
>> smooth transitions from legacy, extreme resilience and self-healing
>> (multiple fallbacks per FFI dependency), and policy & configuration
>> management [3].
>>
>> [1] http://wiki.erights.org/wiki/Walnut/Secure_Distributed_Computing
>> [2] http://wiki.erights.org/wiki/Unum
>> [3] http://c2.com/cgi/wiki?PolicyInjection
>
> I had done code migration in the past, but sadly my VMs haven't had
> this feature in a fairly long time (many years).
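The registry-per-security-level idea can be illustrated with a small sketch. All names and the two-level scheme are invented for illustration; the actual design is only described in prose above:

```python
# Modules published at level N are visible only to callers holding
# level N or higher. Levels and names are illustrative.
LOW, HIGH = 1, 2

class FFIRegistry:
    def __init__(self):
        self._modules = {}  # name -> (required level, module object)

    def publish(self, name, level, module):
        self._modules[name] = (level, module)

    def lookup(self, name, caller_level):
        level, module = self._modules[name]
        if caller_level < level:
            raise PermissionError(f"{name} requires level {level}")
        return module

registry = FFIRegistry()
# e.g. an HTTP plugin publishes two facets of the same service:
registry.publish("http.remote", LOW, {"get": lambda url: f"GET {url}"})
registry.publish("http.local", HIGH, {"get": lambda url: f"GET {url}"})
```

Untrusted code holding only `LOW` can reach `http.remote` but not `http.local`, matching the idea that policy is easier to state over high-level features than over raw C-level capabilities.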
> even then, it had a few ugly problems: the migration essentially
> involved transparently sending the AST and recompiling it on the
> other end. a result of this was that closures would tend to lose the
> "identity" of their lexical scope.
> ...
>
> over a socket, it had used a model where many data types (lists/...)
> were essentially passed as copies; things like builtin and native
> functions simply bound against their analogues on the other end
> (code in C land was unique to each node); objects were "mirrored"
> with an asynchronous consistency model (altering an object would
> send slot-change messages to the other nodes which held copies);
> other object types were passed by handle (basically, identifying the
> NodeID and ObjectID for a remote object); ...
>
> some later ideas (for reviving the above) had involved essentially
> mirroring a virtual heap over the network (using a system similar to
> "far pointers" and "segmented addressing"), but this would have
> introduced many nasty problems, and it didn't go anywhere.
>
> if I ever do get around to re-implementing something like this, I
> will probably use a variation of my original strategy, except that I
> would probably leave objects as being remotely accessed via handles,
> rather than trying to mirror them and keep them in sync (or, if
> mirroring is used, effectively using a "synchronized write" strategy
> of some sort...).
>
>>> the second thing seems to be the option of moving the code to a
>>> local toplevel where its ability to see certain things is severely
>>> limited.
>>
>> Yes, this is equivalent to controlling which 'capabilities' are
>> available in a given context. Unfortunately, developers lack
>> 'awareness' - i.e. it is not explicit in code that certain
>> capabilities are needed by a given library, so failures occur much
>> later when the library is actually loaded.
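The slot-change mirroring model described above can be sketched as follows. Delivery is modeled as a direct call; in the original scheme it would be an asynchronous socket message, and all names here are my own:

```python
class MirroredObject:
    """A slot store that broadcasts slot changes to replicas on other
    nodes, a simplified form of the asynchronous consistency model."""

    def __init__(self, node_id, obj_id):
        self.node_id = node_id
        self.obj_id = obj_id      # (NodeID, ObjectID) identify a handle
        self.slots = {}
        self.replicas = []        # copies of this object on other nodes

    def set_slot(self, name, value, _from_remote=False):
        self.slots[name] = value
        if not _from_remote:
            for replica in self.replicas:
                # "slot-change message" to every node holding a copy;
                # _from_remote=True prevents an infinite re-broadcast
                replica.set_slot(name, value, _from_remote=True)

a = MirroredObject("node-1", 42)
b = MirroredObject("node-2", 42)
a.replicas.append(b)
b.replicas.append(a)
a.set_slot("x", 1)   # b sees the change via the slot-change message
```

Because messages are applied whenever they arrive, two nodes writing the same slot concurrently can disagree transiently, which is exactly the consistency weakness the "synchronized write" variant mentioned above would address.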
>> This is part of why I eventually abandoned dynamic scopes (where
>> 'dynamic scope' would include the toplevel [4]).
>
> "dynamic scope" in my case refers to something very different. I
> generally call the objects+delegation model "object scope", which is
> the main model used by the toplevel.
>
> it differs some for import: by default, "import" actually exists in
> terms of the lexical scope (it is internally a delegate lexical
> variable); potentially confusingly, for "delegate import" the import
> is actually placed into the object scope (directly into the
> containing package or toplevel object), which is part of the reason
> for its unique semantics.
>
> say (at the toplevel):
>
> extern delegate import foo.bar;
>
> actually does something roughly similar to:
>
> load("foo/bar.bs"); //not exactly, but it is a similar idea...
> delegate var #'foo/bar'=#:"foo/bar"; //sort of...
>
> in turn invoking more funky semantics in the VM.
>
> note: #'...' and #:"..." are basically syntax for allowing
> identifiers and keywords containing otherwise invalid characters
> (characters invalid for identifiers).
>
>> [4] http://c2.com/cgi/wiki?ExplicitManagementOfImplicitContext
>
> ok.
>
>>> simply disabling compiler features may not be sufficient
>>
>> It is also a bad idea. You end up with 2^N languages for N
>> switches. That's hell to test and verify. Libraries developed for
>> different sets of switches will consequently prove buggy when
>> people try to compose them. This is even more documentation to
>> manage.
>
> it depends on the nature of the features and their impact on the
> language.
>
> if trying to use a feature simply makes code using it invalid
> ("sorry, I can't let you do that"), this works. if it leaves the
> code still valid but with different semantics, or if enabling a
> feature changes the semantics of code written with it disabled,
> well, this is a bit more ugly...
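The "object scope" model (variables living in objects, with unresolved names chained through delegates) can be sketched roughly as follows. This is my reading of the prose above, not the actual VM's implementation, and all names are invented:

```python
class ScopeObject:
    """Variables live in an object's slots; names not found locally
    chain through the object's delegates, depth-first. This is how a
    'delegate import' can make a package's bindings visible at the
    toplevel without copying them."""

    def __init__(self, slots=None):
        self.slots = dict(slots or {})
        self.delegates = []   # N-way delegation is allowed

    def lookup(self, name):
        if name in self.slots:
            return self.slots[name]
        for d in self.delegates:
            try:
                return d.lookup(name)
            except KeyError:
                continue
        raise KeyError(name)

# 'extern delegate import foo.bar;' roughly: load the package object,
# then add it as a delegate of the toplevel so its bindings resolve.
foo_bar = ScopeObject({"baz": lambda: "baz from foo/bar"})
toplevel = ScopeObject()
toplevel.delegates.append(foo_bar)
```

With this structure, restricting untrusted code amounts to handing it a toplevel whose delegate list omits the sensitive packages.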
> but, yes, sadly, I am already having enough issues with seemingly
> endless undocumented/forgotten features, and features which were
> mostly implemented but are subtly broken (for example, me earlier
> fixing a feature which existed in the parser/compiler, but depended
> on an opcode which for whatever reason was absent from the bytecode
> interpreter, ...).
>
> but, with a language/VM existing for approx 8 years and with ~540
> opcodes, ... I guess things like this are inevitable.
>
>>> anything still visible may be tampered with. for example, suppose
>>> a global package is made visible in the new toplevel, and the
>>> untrusted code decides to define functions in a system package,
>>> essentially overwriting the existing functions
>>
>> Indeed. Almost every language built for security makes heavy use of
>> immutable objects. They're easier to reason about. For example,
>> rather than replacing the function in the package, you would be
>> forced to create a new record that is the same as the old one but
>> replaces one of the functions.
>>
>> Access to mutable state is more tightly controlled - i.e. an
>> explicit capability to inject a new stage in a pipeline, rather
>> than implicit access to a variable. We don't lose any flexibility,
>> but the 'path of least resistance' is much more secure.
>
> yes, but this isn't as ideal in a pre-existing language where nearly
> everything is highly mutable. in this case, creating security may
> involve... "write protecting" things...
>
> a basic security mechanism then is that, by default, most non-owned
> objects will be marked read-only.
>
>>> an exposed API function may indirectly give untrusted code
>>> "unexpected levels of power" if it, by default, has unhindered
>>> access to the system, placing additional burden on library code
>>> not to perform operations which may be exploitable
>>
>> This is why whitelisting, rather than blacklisting, should be the
>> rule for security.
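The "write protecting" mechanism mentioned above (most non-owned objects marked read-only by default) can be sketched as follows; the class and method names are my own invention:

```python
class VMObject:
    """Slot store with a read-only flag; the VM would set the flag on
    most objects an untrusted module does not own, so system packages
    cannot be overwritten even if they remain visible."""

    def __init__(self, slots=None, read_only=False):
        self.slots = dict(slots or {})
        self.read_only = read_only

    def set_slot(self, name, value):
        if self.read_only:
            raise PermissionError("object is read-only")
        self.slots[name] = value

# a system package stays readable but rejects redefinition:
sys_pkg = VMObject({"open_file": lambda p: f"opened {p}"},
                   read_only=True)
```

This retrofits part of the immutability discipline onto a mutable language: the untrusted code's attempt to "define functions in a system package" fails with an exception rather than silently replacing the function.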
> but whitelisting is potentially much more effort than blacklisting,
> even if potentially somewhat better from a security perspective.
>
>>> assigning through a delegated object may in turn move up and
>>> assign the variable in a delegated-to object (at the VM level
>>> there are multiple assignment operators to address these different
>>> cases, namely which object will have the variable set in it...).
>>
>> The security problem isn't delegation, but rather the fact that
>> this chaining is 'implicit', so developers easily forget about it
>> and thus leave security holes.
>>
>> A library of security patterns could help out. E.g. you could
>> ensure your revocable forwarders and facet-pattern constructors
>> also provide barriers against propagation of assignment.
>
> potentially, or use cloning rather than delegation chaining
> (however, in my VM, it is only possible to clone from a single
> object, whereas one may do N-way delegation, making delegation
> generally more convenient for building the toplevel).
>
> my current thinking is that basically assignment delegation will
> stop once an object is hit which is read-only, forcing the
> assignment into a "nearer" object. trying to force-assign into a
> read-only object will result in an exception or similar.
>
> in general though, trying to assign top-level bindings (which are
> generally things like API functions) may be a bad practice in
> general.
>
>>> could a variation of, say, the Unix security model be applied at
>>> the VM level?
>>
>> Within the VM, this has been done before, e.g. Java introduced
>> thread capabilities. But the Unix security model is neither simple
>> nor flexible nor efficient, especially for fine-grained delegation.
>> I cannot recommend it. But if you do pursue this route: it has been
>> done before, and there's a lot of material you can learn from. Look
>> up LambdaMoo, for example.
>
> searching for LambdaMoo found a MUD, if this is what was in
> question...
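The "assignment delegation stops at a read-only object" rule described above can be sketched as follows. A single-parent delegation chain is used for simplicity (the VM allows N-way delegation), and all names are assumptions:

```python
class DelegObject:
    """Slot store with a delegate parent. Assignment walks up the
    chain to the object that already binds the name, but stops if a
    read-only object is hit, landing in the nearest writable object."""

    def __init__(self, slots=None, read_only=False, delegate=None):
        self.slots = dict(slots or {})
        self.read_only = read_only
        self.delegate = delegate

    def assign(self, name, value):
        obj, nearest_writable = self, None
        while obj is not None:
            if not obj.read_only and nearest_writable is None:
                nearest_writable = obj
            if name in obj.slots:
                if obj.read_only:
                    break          # binding lives in a protected object
                obj.slots[name] = value
                return
            obj = obj.delegate
        if nearest_writable is None:
            raise PermissionError("no writable object in chain")
        nearest_writable.slots[name] = value

toplevel = DelegObject({"sin": "builtin sin"}, read_only=True)
sandbox = DelegObject(delegate=toplevel)
sandbox.assign("sin", "shadowed")  # lands in sandbox, not the toplevel
```

This turns the implicit-chaining hazard David describes into shadowing: the untrusted code's write succeeds locally, but the shared read-only binding underneath is untouched.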
> I partly patched it in last night, and the performance overhead
> should be "modest" in the common case.
>
> as for "simple" or "efficient", a Unix-style security model doesn't
> look all that bad. at least I am not looking at implementing ACLs or
> a Windows-style security model, which would be a fair amount more
> complex and slower (absent static checking and optimization).
>
> luckily, there are only a relatively small number of places I really
> need to put in security checks (mostly in the object system and
> similar). most of the rest of the typesystem or VM doesn't really
> need them.
>
> or such...
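A minimal Unix-flavoured check of the kind described (an owner plus rwx-style bits, applied at a few points in the object system) might look like this; it is purely illustrative, with invented names and only owner/other classes:

```python
R, W = 4, 2   # read / write bits, as in Unix octal modes

class SecuredObject:
    """Slot store guarded by owner/other permission bits."""

    def __init__(self, owner, mode):
        self.owner = owner
        self.mode = mode      # e.g. 0o64 -> owner rw-, other r--
        self.slots = {}

    def check(self, uid, bit):
        bits = (self.mode >> 3) & 0o7 if uid == self.owner \
               else self.mode & 0o7
        if not (bits & bit):
            raise PermissionError("access denied")

    def get_slot(self, uid, name):
        self.check(uid, R)
        return self.slots.get(name)

    def set_slot(self, uid, name, value):
        self.check(uid, W)
        self.slots[name] = value

obj = SecuredObject(owner="vm", mode=0o64)  # owner rw-, others r--
obj.set_slot("vm", "x", 1)
```

The check is a couple of bit operations per guarded access, which is consistent with the "modest overhead in the common case" estimate above, and it avoids the per-entry scans an ACL model would require.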
_______________________________________________ fonc mailing list [email protected] http://vpri.org/mailman/listinfo/fonc
