I feel obligated to comment on usage of MD5 for any security purpose: http://www.codeproject.com/KB/security/HackingMd5.aspx
On Thu, Aug 11, 2011 at 19:06, BGB <[email protected]> wrote:

> On 8/11/2011 12:55 PM, David Barbour wrote:
>
>> On Wed, Aug 10, 2011 at 7:35 PM, BGB <[email protected]> wrote:
>>
>>> not all code may be from trusted sources.
>>> consider, say, code that comes from the internet.
>>>
>>> what is a "good" way of enforcing security in such a case?
>>
>> Object capability security is probably the best approach available
>> today, in terms of a wide variety of criteria such as flexibility,
>> performance, precision, visibility, awareness, simplicity, and
>> usability.
>>
>> In this model, the ability to send a message to an object is
>> sufficient proof that you have rights to use it; there are no
>> passwords, no permission checks, etc. The security discipline
>> involves controlling who has access to which objects, i.e. there
>> are a number of patterns, such as 'revocable forwarders', where you
>> provide an intermediate object that allows you to audit and control
>> access to another object. You can read about several of these
>> patterns on the erights wiki [1].
>
> the big problem, though: trying to implement this as the sole
> security model, and expecting it to be effective, would likely
> impact language design and programming strategy, and possibly lead
> to a fair amount of effort WRT "hole plugging" in an existing
> project.
>
> granted, code will probably not use logins/passwords for authority,
> as this would likely be horridly ineffective for code (about as soon
> as a piece of malware knows the login used by a piece of "trusted"
> code, it can spoof as that code and do whatever it wants).
>
> "digital signing" is another possible strategy, but poses a similar
> problem: how to effectively prevent spoofing (say, if one manages to
> "extract" the key from a trusted app and then signs a piece of
> malware with it).
>
> AFAICT, the usual strategy used with SSL certificates is that they
> may expire and are checked against a "certificate authority".
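The 'revocable forwarder' pattern mentioned above can be sketched in a few lines. This is only a hedged illustration in Python; the class and method names are my own, not from the erights wiki:

```python
class Revoked(Exception):
    """Raised when a revoked capability is used."""

class RevocableForwarder:
    """Forwards attribute access to a wrapped target until revoked.

    Handing out the forwarder instead of the target lets the grantor
    cut off access later without touching the target itself.
    """
    def __init__(self, target):
        self._target = target

    def revoke(self):
        self._target = None

    def __getattr__(self, name):
        # Only called for names not found normally (i.e. anything other
        # than _target/revoke), so this forwards method access.
        if self._target is None:
            raise Revoked("capability has been revoked")
        return getattr(self._target, name)

class Counter:
    def __init__(self):
        self.n = 0
    def bump(self):
        self.n += 1
        return self.n

counter = Counter()
cap = RevocableForwarder(counter)   # grant: share 'cap', keep 'counter'
cap.bump()                          # works while the grant is live
cap.revoke()                        # further use raises Revoked
```

The point is that possession of `cap` *is* the permission; revocation is done by the grantor, not by a password or ACL check.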
> although maybe reasonably effective for the internet, this seems to
> be a fairly complex and heavyweight approach (not ideal for
> software, especially not FOSS, as most such authorities want money
> and require signing individual binaries, ...).
>
> my current thinking is roughly along the line that each piece of
> code will be given a "fingerprint" (possibly an MD5 or SHA hash),
> and this fingerprint is either known good to the VM itself (for
> example, its own code, or code that is part of the host
> application), or may be confirmed as "trusted" by the user (if it
> requires special access, ...).
>
> it is a little harder to spoof a hash, and tampering with a piece of
> code will change its hash (although with simpler hashes, such as
> checksums and CRCs, it is often possible to use a glob of "garbage
> bytes" to trick the checksum algorithm into giving the desired
> value).
>
> yes, there is still always the risk of a naive user confirming a
> piece of malware, but this is their own problem at that point.
>
>> Access to FFI and such would be regulated through objects. This
>> leaves the issue of deciding: how do we decide which objects
>> untrusted code should get access to? Disabling all of FFI is often
>> too extreme.
>
> potentially. my current thinking is, granted, that it will disable
> access to the "FFI access object" (internally called "ctop" in my
> VM), which would disable the ability to fetch new functions/... from
> the FFI (or perform "native import" operations with the current
> implementation).
>
> however, if retrieved functions are still accessible, it might be
> possible to retrieve them indirectly and make them visible this way.
>
> as noted in another message:
>
> native import C.math;
> var mathobj={sin: sin, cos: cos, tan: tan, ...};
>
> giving access to "mathobj" will still allow access to these
> functions, without necessarily giving access to "the entire C
> toplevel", which poses a much bigger security risk.
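The fingerprint scheme described above can be sketched as follows. Given the practical MD5 collisions linked at the top of the thread, the sketch uses SHA-256; the names (`KNOWN_GOOD`, `confirm_trust`, etc.) are assumptions of mine, not part of the actual VM:

```python
import hashlib

# Fingerprints the VM itself ships as known-good (in practice these
# would be the hashes of the host application's own scripts).
KNOWN_GOOD: set[str] = set()
USER_TRUSTED: set[str] = set()  # filled when the user confirms a module

def fingerprint(code: bytes) -> str:
    # SHA-256 rather than MD5: MD5 collisions are practical, so two
    # different code blobs could share an MD5 fingerprint.
    return hashlib.sha256(code).hexdigest()

def is_trusted(code: bytes) -> bool:
    fp = fingerprint(code)
    return fp in KNOWN_GOOD or fp in USER_TRUSTED

def confirm_trust(code: bytes) -> None:
    """Called after the user explicitly confirms an untrusted module."""
    USER_TRUSTED.add(fingerprint(code))
```

Unlike a CRC, any tampering with the code changes the SHA-256 fingerprint in a way that cannot practically be compensated for with padding bytes.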
> sadly, there is no real good way to safely "streamline" this in the
> current implementation.
>
>> My current design: FFI is a network of registries. Plugins and
>> services publish FFI objects (modules) to these registries.
>> Different registries are associated with different security levels,
>> and there might be connections between them based on relative trust
>> and security. A single FFI plugin might provide similar objects at
>> multiple security levels - e.g. access to HTTP service might be
>> provided at a low security level for remote addresses, but at a
>> high security level that allows for local (127, 192.168, 10.0.0,
>> etc.) addresses. One reason to favor plugin-based FFI is that it is
>> easy to develop security policy for high-level features compared to
>> low-level capabilities. (E.g. access to generic 'local storage' is
>> a lower security level than access to 'filesystem'.)
>
> my FFI is based on bulk importing the contents of C headers.
>
> although fairly powerful and convenient, "securing" such a beast is
> likely to be a bit of a problem.
>
> easier just to be like "code which isn't trusted can't directly use
> the FFI...".
>
>> Other than security, my design is intended to solve other difficult
>> problems involving code migration [2], multi-process and
>> distributed extensibility (easy to publish modules to registries
>> even from other processes or servers; similar to web-server CGI),
>> smooth transitions from legacy, extreme resilience and self-healing
>> (multiple fallbacks per FFI dependency), and policy & configuration
>> management [3].
>>
>> [1] http://wiki.erights.org/wiki/Walnut/Secure_Distributed_Computing
>> [2] http://wiki.erights.org/wiki/Unum
>> [3] http://c2.com/cgi/wiki?PolicyInjection
>
> I had done code migration in the past, but sadly my VMs haven't had
> this feature in a fairly long time (many years).
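The registry-per-security-level idea can be illustrated with a small sketch. All names and the two-level scheme are invented for illustration; the actual design is only described in prose above:

```python
# Modules published at level N are visible only to callers holding
# level N or higher. Levels and names are illustrative.
LOW, HIGH = 1, 2

class FFIRegistry:
    def __init__(self):
        self._modules = {}  # name -> (required level, module object)

    def publish(self, name, level, module):
        self._modules[name] = (level, module)

    def lookup(self, name, caller_level):
        level, module = self._modules[name]
        if caller_level < level:
            raise PermissionError(f"{name} requires level {level}")
        return module

registry = FFIRegistry()
# e.g. an HTTP plugin publishes two facets of the same service:
registry.publish("http.remote", LOW, {"get": lambda url: f"GET {url}"})
registry.publish("http.local", HIGH, {"get": lambda url: f"GET {url}"})
```

Untrusted code holding only `LOW` can reach `http.remote` but not `http.local`, matching the idea that policy is easier to state over high-level features than over raw C-level capabilities.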
> even then, it had a few ugly problems: the migration essentially
> involved transparently sending the AST and recompiling it on the
> other end. a result of this was that closures would tend to lose the
> "identity" of their lexical scope.
> ...
>
> over a socket, it had used a model where many data types (lists/...)
> were essentially passed as copies; things like builtin and native
> functions simply bound against their analogues on the other end
> (code in C land was unique to each node); objects were "mirrored"
> with an asynchronous consistency model (altering an object would
> send slot-change messages to the other nodes which held copies);
> other object types were passed by handle (basically, identifying the
> NodeID and ObjectID for a remote object); ...
>
> some later ideas (for reviving the above) had involved essentially
> mirroring a virtual heap over the network (using a system similar to
> "far pointers" and "segmented addressing"), but this would have
> introduced many nasty problems, and it didn't go anywhere.
>
> if I ever do get around to re-implementing something like this, I
> will probably use a variation of my original strategy, except that I
> would probably leave objects as being remotely accessed via handles,
> rather than trying to mirror them and keep them in sync (or, if
> mirroring is used, effectively using a "synchronized write" strategy
> of some sort...).
>
>>> the second thing seems to be the option of moving the code to a
>>> local toplevel where its ability to see certain things is severely
>>> limited.
>>
>> Yes, this is equivalent to controlling which 'capabilities' are
>> available in a given context. Unfortunately, developers lack
>> 'awareness' - i.e. it is not explicit in code that certain
>> capabilities are needed by a given library, so failures occur much
>> later when the library is actually loaded.
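The slot-change mirroring model described above can be sketched as follows. Delivery is modeled as a direct call; in the original scheme it would be an asynchronous socket message, and all names here are my own:

```python
class MirroredObject:
    """A slot store that broadcasts slot changes to replicas on other
    nodes, a simplified form of the asynchronous consistency model."""

    def __init__(self, node_id, obj_id):
        self.node_id = node_id
        self.obj_id = obj_id      # (NodeID, ObjectID) identify a handle
        self.slots = {}
        self.replicas = []        # copies of this object on other nodes

    def set_slot(self, name, value, _from_remote=False):
        self.slots[name] = value
        if not _from_remote:
            for replica in self.replicas:
                # "slot-change message" to every node holding a copy;
                # _from_remote=True prevents an infinite re-broadcast
                replica.set_slot(name, value, _from_remote=True)

a = MirroredObject("node-1", 42)
b = MirroredObject("node-2", 42)
a.replicas.append(b)
b.replicas.append(a)
a.set_slot("x", 1)   # b sees the change via the slot-change message
```

Because messages are applied whenever they arrive, two nodes writing the same slot concurrently can disagree transiently, which is exactly the consistency weakness the "synchronized write" variant mentioned above would address.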
>> This is part of why I eventually abandoned dynamic scopes (where
>> 'dynamic scope' would include the toplevel [4]).
>
> "dynamic scope" in my case refers to something very different. I
> generally call the objects+delegation model "object scope", which is
> the main model used by the toplevel.
>
> it differs some for import: by default, "import" actually exists in
> terms of the lexical scope (it is internally a delegate lexical
> variable); potentially confusingly, for "delegate import" the import
> is actually placed into the object scope (directly into the
> containing package or toplevel object), which is part of the reason
> for its unique semantics.
>
> say (at the toplevel):
>
> extern delegate import foo.bar;
>
> actually does something roughly similar to:
>
> load("foo/bar.bs"); //not exactly, but it is a similar idea...
> delegate var #'foo/bar'=#:"foo/bar"; //sort of...
>
> in turn invoking more funky semantics in the VM.
>
> note: #'...' and #:"..." are basically syntax for allowing
> identifiers and keywords containing otherwise invalid characters
> (characters invalid for identifiers).
>
>> [4] http://c2.com/cgi/wiki?ExplicitManagementOfImplicitContext
>
> ok.
>
>>> simply disabling compiler features may not be sufficient
>>
>> It is also a bad idea. You end up with 2^N languages for N
>> switches. That's hell to test and verify. Libraries developed for
>> different sets of switches will consequently prove buggy when
>> people try to compose them. This is even more documentation to
>> manage.
>
> it depends on the nature of the features and their impact on the
> language.
>
> if trying to use a feature simply makes code using it invalid
> ("sorry, I can't let you do that"), this works. if it leaves the
> code still valid but with different semantics, or if enabling a
> feature changes the semantics of code written with it disabled,
> well, this is a bit more ugly...
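The "object scope" model (variables living in objects, with unresolved names chained through delegates) can be sketched roughly as follows. This is my reading of the prose above, not the actual VM's implementation, and all names are invented:

```python
class ScopeObject:
    """Variables live in an object's slots; names not found locally
    chain through the object's delegates, depth-first. This is how a
    'delegate import' can make a package's bindings visible at the
    toplevel without copying them."""

    def __init__(self, slots=None):
        self.slots = dict(slots or {})
        self.delegates = []   # N-way delegation is allowed

    def lookup(self, name):
        if name in self.slots:
            return self.slots[name]
        for d in self.delegates:
            try:
                return d.lookup(name)
            except KeyError:
                continue
        raise KeyError(name)

# 'extern delegate import foo.bar;' roughly: load the package object,
# then add it as a delegate of the toplevel so its bindings resolve.
foo_bar = ScopeObject({"baz": lambda: "baz from foo/bar"})
toplevel = ScopeObject()
toplevel.delegates.append(foo_bar)
```

With this structure, restricting untrusted code amounts to handing it a toplevel whose delegate list omits the sensitive packages.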
> but, yes, sadly, I am already having enough issues with seemingly
> endless undocumented/forgotten features, and features which were
> mostly implemented but are subtly broken (for example, me earlier
> fixing a feature which existed in the parser/compiler, but depended
> on an opcode which for whatever reason was absent from the bytecode
> interpreter, ...).
>
> but, with a language/VM existing for approx 8 years and with ~540
> opcodes, ... I guess things like this are inevitable.
>
>>> anything still visible may be tampered with. for example, suppose
>>> a global package is made visible in the new toplevel, and the
>>> untrusted code decides to define functions in a system package,
>>> essentially overwriting the existing functions
>>
>> Indeed. Almost every language built for security makes heavy use of
>> immutable objects. They're easier to reason about. For example,
>> rather than replacing the function in the package, you would be
>> forced to create a new record that is the same as the old one but
>> replaces one of the functions.
>>
>> Access to mutable state is more tightly controlled - i.e. an
>> explicit capability to inject a new stage in a pipeline, rather
>> than implicit access to a variable. We don't lose any flexibility,
>> but the 'path of least resistance' is much more secure.
>
> yes, but this isn't as ideal in a pre-existing language where nearly
> everything is highly mutable. in this case, creating security may
> involve... "write protecting" things...
>
> a basic security mechanism then is that, by default, most non-owned
> objects will be marked read-only.
>
>>> an exposed API function may indirectly give untrusted code
>>> "unexpected levels of power" if it, by default, has unhindered
>>> access to the system, placing additional burden on library code
>>> not to perform operations which may be exploitable
>>
>> This is why whitelisting, rather than blacklisting, should be the
>> rule for security.
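The "write protecting" mechanism mentioned above (most non-owned objects marked read-only by default) can be sketched as follows; the class and method names are my own invention:

```python
class VMObject:
    """Slot store with a read-only flag; the VM would set the flag on
    most objects an untrusted module does not own, so system packages
    cannot be overwritten even if they remain visible."""

    def __init__(self, slots=None, read_only=False):
        self.slots = dict(slots or {})
        self.read_only = read_only

    def set_slot(self, name, value):
        if self.read_only:
            raise PermissionError("object is read-only")
        self.slots[name] = value

# a system package stays readable but rejects redefinition:
sys_pkg = VMObject({"open_file": lambda p: f"opened {p}"},
                   read_only=True)
```

This retrofits part of the immutability discipline onto a mutable language: the untrusted code's attempt to "define functions in a system package" fails with an exception rather than silently replacing the function.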
> but whitelisting is potentially much more effort than blacklisting,
> even if potentially somewhat better from a security perspective.
>
>>> assigning through a delegated object may in turn move up and
>>> assign the variable in a delegated-to object (at the VM level
>>> there are multiple assignment operators to address these different
>>> cases, namely which object will have the variable set in it...).
>>
>> The security problem isn't delegation, but rather the fact that
>> this chaining is 'implicit', so developers easily forget about it
>> and thus leave security holes.
>>
>> A library of security patterns could help out. E.g. you could
>> ensure your revocable forwarders and facet-pattern constructors
>> also provide barriers against propagation of assignment.
>
> potentially, or use cloning rather than delegation chaining
> (however, in my VM, it is only possible to clone from a single
> object, whereas one may do N-way delegation, making delegation
> generally more convenient for building the toplevel).
>
> my current thinking is that basically assignment delegation will
> stop once an object is hit which is read-only, forcing the
> assignment into a "nearer" object. trying to force-assign into a
> read-only object will result in an exception or similar.
>
> in general though, trying to assign top-level bindings (which are
> generally things like API functions) may be a bad practice in
> general.
>
>>> could a variation of, say, the Unix security model be applied at
>>> the VM level?
>>
>> Within the VM, this has been done before, e.g. Java introduced
>> thread capabilities. But the Unix security model is neither simple
>> nor flexible nor efficient, especially for fine-grained delegation.
>> I cannot recommend it. But if you do pursue this route: it has been
>> done before, and there's a lot of material you can learn from. Look
>> up LambdaMoo, for example.
>
> searching for LambdaMoo found a MUD, if this is what was in
> question...
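The "assignment delegation stops at a read-only object" rule described above can be sketched as follows. A single-parent delegation chain is used for simplicity (the VM allows N-way delegation), and all names are assumptions:

```python
class DelegObject:
    """Slot store with a delegate parent. Assignment walks up the
    chain to the object that already binds the name, but stops if a
    read-only object is hit, landing in the nearest writable object."""

    def __init__(self, slots=None, read_only=False, delegate=None):
        self.slots = dict(slots or {})
        self.read_only = read_only
        self.delegate = delegate

    def assign(self, name, value):
        obj, nearest_writable = self, None
        while obj is not None:
            if not obj.read_only and nearest_writable is None:
                nearest_writable = obj
            if name in obj.slots:
                if obj.read_only:
                    break          # binding lives in a protected object
                obj.slots[name] = value
                return
            obj = obj.delegate
        if nearest_writable is None:
            raise PermissionError("no writable object in chain")
        nearest_writable.slots[name] = value

toplevel = DelegObject({"sin": "builtin sin"}, read_only=True)
sandbox = DelegObject(delegate=toplevel)
sandbox.assign("sin", "shadowed")  # lands in sandbox, not the toplevel
```

This turns the implicit-chaining hazard David describes into shadowing: the untrusted code's write succeeds locally, but the shared read-only binding underneath is untouched.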
> I partly patched it in last night, and the performance overhead
> should be "modest" in the common case.
>
> as for "simple" or "efficient", a Unix-style security model doesn't
> look all that bad. at least I am not looking at implementing ACLs or
> a Windows-style security model, which would be a fair amount more
> complex and slower (absent static checking and optimization).
>
> luckily, there are only a relatively small number of places I really
> need to put in security checks (mostly in the object system and
> similar). most of the rest of the typesystem or VM doesn't really
> need them.
>
> or such...
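A minimal Unix-flavoured check of the kind described (an owner plus rwx-style bits, applied at a few points in the object system) might look like this; it is purely illustrative, with invented names and only owner/other classes:

```python
R, W = 4, 2   # read / write bits, as in Unix octal modes

class SecuredObject:
    """Slot store guarded by owner/other permission bits."""

    def __init__(self, owner, mode):
        self.owner = owner
        self.mode = mode      # e.g. 0o64 -> owner rw-, other r--
        self.slots = {}

    def check(self, uid, bit):
        bits = (self.mode >> 3) & 0o7 if uid == self.owner \
               else self.mode & 0o7
        if not (bits & bit):
            raise PermissionError("access denied")

    def get_slot(self, uid, name):
        self.check(uid, R)
        return self.slots.get(name)

    def set_slot(self, uid, name, value):
        self.check(uid, W)
        self.slots[name] = value

obj = SecuredObject(owner="vm", mode=0o64)  # owner rw-, others r--
obj.set_slot("vm", "x", 1)
```

The check is a couple of bit operations per guarded access, which is consistent with the "modest overhead in the common case" estimate above, and it avoids the per-entry scans an ACL model would require.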
_______________________________________________ fonc mailing list [email protected] http://vpri.org/mailman/listinfo/fonc
