Splicing together snippets from several emails:

>  
> Of course not. But it does allow these to be frozen. SES makes use of that to 
> freeze them (after cleaning them up) as the last step of its initialization 
> <http://code.google.com/p/es-lab/source/browse/trunk/src/ses/initSES.js#557>. 


So why not provide the ability to disable weak map enumeration? Perhaps 
whether it defaults to on or off is a distinction that matters more to SES 
than it does for strict mode. 


>  
> 
> 3) Sniffing is already ubiquitous in terms of language / host API variants
> 
> Sure. What issue does that raise?
> 


It's an example of (static) data leakage we've already accepted, and, yep, I 
won't lose sleep over it anytime soon!



> More generally, ES mostly has what one might call sources of "static 
> nondeterminism" 
> <http://code.google.com/p/google-caja/wiki/SourcesOfNonDeterminism>. By 
> "static" I mean ways implementations of the spec may differ from each other. 
> But these only open a non-overt channel if they may differ dynamically among 
> runs on the same platform in a way that can be influenced.
>  



That page suggests we agree that a language implementation with a predictable 
cost model (e.g., the time to iterate a short list vs. a long one is 
observable), combined with "Date()", is enough to form a reliable covert 
channel. In the motivating mashup scenario for our discussion, I suspect 
we're just as likely (or more likely!) to get actual side channels from APIs 
with exploitable complexity as from enumerable weakmaps. I couldn't tell from 
the wiki page whether you thought we should remove "Date()" in general, 
remove it in Cajita in particular, or just be generally aware of it. 

Including "Date()" on that list suggests a regimen well beyond ocap-style ES5 
is needed if you want object-sharing (e.g., API-sharing) mashups without side 
channels.
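
For concreteness, here is a minimal sketch of the kind of channel I mean (the 
names, the 50ms threshold, and the turn-taking setup are all my assumptions, 
not anything from your page). A confined sender with no Date() and nothing 
shared overtly modulates how long its event-loop turns take; a colluding 
receiver that does have Date() reads the bit back from the gap between its 
own turns:

    // Sender: confined, no Date(), no shared references -- only a
    // predictable cost model (iterating a long list vs. a short one).
    function senderTurn(secretBit) {
      var n = secretBit ? 5000000 : 1;
      var sum = 0;
      for (var i = 0; i < n; i++) { sum += i; }  // bit=1: slow turn; bit=0: fast
      return sum;
    }

    // Receiver: alternates turns with the sender on the same event loop and
    // uses Date() to see how long the sender's turn took.
    var last = Date.now();
    function receiverTurn() {
      var now = Date.now();
      var bit = (now - last) > 50 ? 1 : 0;       // threshold is assumed tuning
      last = now;
      return bit;
    }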

> 
> 
> Nice. Excellent questions. The boundary between overt and non-overt is 
> relative to the semantics of a platform. For the same concrete platform 
> implementation, there can be multiple semantic accounts, and thereby multiple 
> notions of where the boundary is between overt and non-overt.


I've been thinking about the meaning of covert vs. overt in terms of 
granularities of semantics for the past couple of years. What is visible in a 
model is overt; what is visible only in a finer model is covert with respect 
to the coarse one. PL papers typically hide cost models and GC, but I view 
that more as a way to avoid distraction; developers do use some high-level 
model of both when writing programs, so aspects of them are indeed overt.

I'm less sure about side channels; the term seems too widely used for me to 
want to pick one clean meaning at the language level. In the case of your 
taxonomy breaker, it might even be orthogonal to overt vs. covert: any bug 
that communicates what wasn't intended, irrespective of (c)overtness, might 
count as a side channel. E.g., maybe you forgot to disable the globally 
readable debug log.
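
A toy version of that last example (all names invented): the leak is 
perfectly overt in the language semantics, yet it's exactly the sort of 
unintended disclosure people call a side channel.

    var DEBUG_LOG = [];                        // should have been disabled in production

    function makeAccount(balance) {
      return {
        deposit: function (amount) {
          DEBUG_LOG.push('deposit ' + amount); // every caller's amounts are now
          balance += amount;                   // readable by anyone with the global
        }
      };
    }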


> Formal models for reasoning about confidentiality that ignore non-overt 
> channels are generally unsound. Deterministic platforms aside, "correctly" 
> preserving confidentiality involves reasoning about implementations and about 
> multiple levels of abstraction.



Since we can always make a model more fine-grained, by my definition of 
covert vs. overt any confidentiality result is unsound (which is not 
surprising: you'd better not be sniffing my electrons!). So perhaps you mean 
that we only admit attacks within a restricted range of semantic models. The 
question then is which is the finest model we consider and where we draw the 
line between overt and covert ones. This starts to get very involved...


> 
> A correct program must only rely on the communications provided by overt 
> channels. This brings us to an important realization from Fred Spiessen's 
> thesis <http://www.evoluware.eu/fsp_thesis.pdf>: Formal models for reasoning 
> about integrity that ignore non-overt channels remain sound. The correctness 
> of a program at preserving integrity is within conventional correctness, 
> which we can reason about soundly in a modular manner.



I didn't see which part of the thesis you're referring to; perhaps you could 
point me to a section?


> 
> Now to your questions. My earlier statements are with regard to using a 
> conventional programming language semantics, in which memory allocation 
> mechanics are mostly ignored. At 
> <http://wiki.ecmascript.org/doku.php?id=strawman:gc_semantics> I captured 
> some notes and references to papers that explore how to extend such 
> conventional semantics to take account of some of these issues. At the 
> opposite extreme, we could reason about the semantics of the machine 
> instruction set, where the "program" we are reasoning about is the entire 
> tower of software leading up to the same program. At this level of abstraction, 
> there is finite memory and no memory allocation. Under this perspective, we 
> would get very different answers about what is correct or incorrect, and 
> corresponding different answers about what is overt and non-overt.
> 
> Of course, the artifacts this committee produces and this list discusses are 
> semantics for next standard versions of JavaScript. These are programming 
> language semantics in the conventional sense.
>  


1. "For code that can overtly sense what has been collected, such as by using 
weak references, such observation opens an overt information leakage channel" 
-- it seems here you are accepting weak references as an overt peephole into 
references they attach to. The more I think about it, the more I think about 
this as being an (acceptable) ocap explanation of weak references that isn't 
far from, say, (open) destructors.
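
Here is roughly how I picture that peephole, assuming a hypothetical 
makeWeakRef(target) primitive whose get() answers null after collection 
(nothing of the sort is in ES5; the shape is only for illustration). It reads 
a lot like an open destructor:

    function whenCollected(target, onCollected) {
      var weak = makeWeakRef(target);   // hypothetical weak reference primitive
      target = null;                    // the observer keeps no strong reference
      (function poll() {
        if (weak.get() === null) {      // someone else dropped it and GC ran...
          onCollected();                // ...and we overtly learn about it
        } else {
          setTimeout(poll, 100);
        }
      })();
    }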


I agree with your later comment about non-local reasoning. I've been trying 
to think of a way to make an enforceable decision at the allocation or 
sharing point -- e.g., declaring that an object cannot be reachable via a 
weak reference -- but haven't found anything satisfying beyond being able to 
disable weak enumerability at the global level as a MOP capability.
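
The best I can do is sketch the shape of that global toggle, assuming a 
hypothetical WeakMap.prototype.enumerate and ignoring real enforcement (which 
would need engine support); everything here is invented:

    var disableWeakEnumeration = (function () {
      var enabled = true;
      var original = WeakMap.prototype.enumerate;   // hypothetical method
      WeakMap.prototype.enumerate = function () {
        if (!enabled) { throw new Error('weak map enumeration is disabled'); }
        return original.apply(this, arguments);
      };
      return function () { enabled = false; };      // the capability itself
    })();

    // Trusted setup (something like initSES.js) would call
    // disableWeakEnumeration() and then drop it, much as SES freezes the
    // primordials as the last step of its initialization.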

Most selective controls will still enable implicit control dependencies 
similar to your example. I don't see ES5/Cajita/... providing a reliable way 
to do both general object sharing and information flow control against these 
without adding major restrictions and machinery that, I suspect, could also 
obviate the need for such controls.
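
A concrete instance of the implicit dependency I have in mind (again using 
the hypothetical makeWeakRef from above on the observer's side): the secret 
is never written anywhere, but branching on it to decide whether to retain a 
shared object lets a weak observer recover the bit once collection happens.

    var keepAlive = [];

    function handleSecret(secretBit, sharedObj) {
      if (secretBit) {
        keepAlive.push(sharedObj); // retained: the observer's weak ref stays live
      }                            // else dropped: eventually seen as collected
    }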


2. If we want to be side-channel free to the extent that weak map enumeration 
matters, more stringent GC progress guarantees seem essential. I haven't 
followed the literature here, but this sounds kind of neat at the research 
level (which might also be a concern for those whose address doesn't end in 
".edu"!).



> Ok, let's suppose that we consider a semantics in which a shared object 
> reference is an overt communications channel. Overt implies that correct 
> programs may rely on this communications channel. This semantics introduces a 
> non-modularity, in that providing an abstraction to two clients enables those 
> clients to legitimately interact in ways out of control of their shared 
> abstraction. 



I totally agree! However, I'd point my finger at garbage collection: even 
without weak references, it breaks modularity. For high-level semantic 
models, GC automates the memory allocation protocol, seemingly restoring 
modularity to programs. I don't recall the best paper describing this 
perspective; it surfaces in part in Dan Grossman's "The TM / GC Analogy" and 
maybe in Wilson's GC survey, but I think there was a more focused writeup 
elsewhere. However, the automation isn't perfect: memory leaks etc. still 
happen, and we still use non-local reasoning when passing references (for 
memory use, not confidentiality). Perhaps a more historically fair phrasing 
is that "garbage collection didn't fully solve modularity wrt memory use."



> How would we program modularly in such a system? We would need to avoid ever 
> sharing references except when we intend to enable communication.



Even without timing attacks, weak references, etc., I think ES5 is still 
unreliable for information flow properties. SES sounds like a step closer, 
but even there, I wouldn't trust 20,000+ lines of it without something like 
an information flow analysis. An information-safe ES seems like it would look 
much less like ES5 than even SES, AdSafe, etc. do.

Furthermore, for most analyses I've read, I think modularity (for 
confidentiality) isn't so negatively impacted. Yes, you now have to check 
that exposed tainted references outlive untrusted widgets (or finer 
properties, like collection not being influenced by high data), but I suspect 
most such analyses would disallow handing over the items to begin with! I 
don't recall any papers about such leaks; I think the timing attacks (even 
without traditional GC) are a bigger concern. 


Rereading this email, I'm not sure the covert/overt/side channel perspective 
is adding much to this discussion. Our willingness to address timing attacks 
(such as those enabled by a predictable GC), concrete acceptable levels of 
information leakage, and 'toggleability' seems more important. The other 
questions are interesting, but maybe not for here (say, the ocap list).

Regards,

- Leo

_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
