Re: [JS-internals] Taint analysis in SpiderMonkey

Nicolas B. Pierron Fri, 16 Aug 2013 10:50:37 -0700

On 08/15/2013 05:53 PM, Jim Blandy wrote:

On 08/15/2013 11:29 AM, Nicolas B. Pierron wrote:

On 08/09/2013 02:59 PM, Jim Blandy wrote:

Ivan Alagenchev and Mark Goodwin asked me to take a look at their project to
bring DOMinator, a taint analysis for SpiderMonkey, […]


On Tuesday, Koushik Sen gave a presentation, available on Air
Mozilla[1], in which he presented a JavaScript instrumentation framework
that uses a parser hook to rewrite the original script with extensible
instrumentation.

I think it's important to consider both the scale of the effort required and
the results produced. Implementing something like the StringLabeller (pace
Brendan) hooks would be a different order of magnitude of effort than the
alternatives suggested here.

I need to watch that presentation, but I did see Sen's presentation at
JSTools 2013 in Montpellier. Without any intent to contradict, Jalangi's
record-and-replay-with-shadow-execution approach did not seem to me like a
low-maintenance tooling approach. Certainly, using shadow execution to
recover the details of execution drastically reduces what one needs to
record, and thus its runtime impact. But the combination of the recording
annotations and the shadow interpreter does not seem like a light maintenance
burden. Am I being pessimistic?

This is something more general that would serve multiple purposes, so it
is likely to be more stable over time than a single corner-case
application. In addition, it can be customized by users, which would be a
good way to remove the burden of the analysis content from the JS engine.

I would prefer a similar solution over a simple tainting solution which
only considers intrusively annotating strings. First, the performance
impact would be isolated to people who are running the analysis. Second,
it is not as intrusive, because it would provide an alternate and contained
path in the bytecode emitter, and the rest should remain unchanged.

Jalangi's approach is exactly like self-hosting the analysis without
instrumenting the interpreter or the JITs, which means that the
instrumentation would induce no performance cost when users are not
running any analysis.

In addition, this would lower the cost of adding any other analysis to the
devtools, as the instrumentation work would only have to be done once. And
web developers could even write their own custom analyses, such as "do not
hold a cross-compartment wrapper except in these functions".

Further: having thought a bit more, I'm not sure that source-rewriting
techniques are going to be much better. Perhaps there's a beautiful trick
I'm not noticing, but it seems to me that making finer-grained distinctions
between strings than the language supports entails nothing less than a
self-hosted JavaScript interpreter, because you can't use strings
(meta-level) to represent strings (debuggee level).

From what I understand of Jalangi, you can add any kind of annotation by
boxing the results, and remove the annotation by unboxing the operands
around any operation.
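
As an illustration only, the boxing idea could be sketched as follows. This
is a minimal sketch, not Jalangi's actual API: Box, unbox, noteOf and
instrumentedAdd are hypothetical names, and a rewritten script is assumed
to call instrumentedAdd where the original had `a + b`.

```javascript
// Hypothetical sketch: none of these names come from Jalangi or SpiderMonkey.
class Box {
  constructor(value, note) {
    this.value = value; // the real (debuggee-level) value
    this.note = note;   // the analysis annotation, e.g. a taint label
  }
}

// Strip the annotation so the native operator sees an ordinary value.
function unbox(v) {
  return v instanceof Box ? v.value : v;
}

// Return the annotation carried by an operand, or null if it has none.
function noteOf(v) {
  return v instanceof Box ? v.note : null;
}

// What a rewritten `a + b` might call instead of the native operator.
function instrumentedAdd(a, b) {
  const result = unbox(a) + unbox(b);
  const note = noteOf(a) || noteOf(b);
  // Re-box only when at least one operand carried an annotation.
  return note ? new Box(result, note) : result;
}

// Usage: the annotation survives the concatenation.
const userInput = new Box("<script>", "tainted:location.hash");
const html = instrumentedAdd("<div>", userInput);
```

Because the annotation lives on a wrapper object rather than on the string
itself, two strings with identical contents can carry different
annotations in this sketch.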

The only aspect of it that I do not like is that they redefine the
operations, which does not guarantee correct behavior. I think we can do
better with maybeBox and maybeUnbox primitives, plus pre- and
post-operation hooks for updating the context. In addition, we can manage
to please TI and avoid the megamorphic operators that they have in Jalangi.
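
One possible reading of this proposal, as a rough sketch: only the names
maybeBox and maybeUnbox come from the text above. Their signatures, the
Tagged wrapper, the shared context object and the pre/post hooks are all
assumptions made for illustration. The point shown is that the native `+`
stays in the rewritten code, so its semantics are not reimplemented.

```javascript
// Hypothetical sketch; only the names maybeBox / maybeUnbox come from the
// message, everything else (signatures, hooks, context) is assumed.
class Tagged {
  constructor(value, label) {
    this.value = value;
    this.label = label;
  }
}

// Analysis context updated by the hooks around each operation.
const context = { label: null, seen: [] };

// Pre-operation hook: start each operation with a clean context.
function pre(op) {
  context.label = null;
}

// Post-operation hook: record operations that produced an annotated result.
function post(op, result) {
  if (result instanceof Tagged) context.seen.push(op + " -> " + result.label);
  return result;
}

// Unwrap an operand if it is annotated, remembering the first label seen.
function maybeUnbox(v) {
  if (v instanceof Tagged) {
    context.label = context.label || v.label;
    return v.value;
  }
  return v;
}

// Wrap the result only if one of the operands carried a label.
function maybeBox(v) {
  const label = context.label;
  context.label = null;
  return label === null ? v : new Tagged(v, label);
}

// What the rewriter might emit for `a + b`: the operator itself is native.
function rewrittenAdd(a, b) {
  pre("+");
  const raw = maybeUnbox(a) + maybeUnbox(b);
  return post("+", maybeBox(raw));
}

const query = rewrittenAdd("id=", new Tagged("42", "cookie"));
const sum = rewrittenAdd(1, 2); // no annotation involved, stays a plain number
```

In a real rewriter the body of rewrittenAdd would presumably be emitted
inline at each operation site rather than shared through one helper; the
helper is used here only to keep the sketch short.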


--
Nicolas B. Pierron

_______________________________________________
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
