[JS-internals] Improving our internal documentation.

2018-11-21 Thread Nicolas B. Pierron
Hi,

SpiderMonkey's internal documentation is sometimes lacking or out of date.
With the JIT team, we had a meeting to discuss ways to improve this state of
affairs. One of the ideas was to open a meta-bug to track internal
documentation issues.

If, while reading the code, you find a place where the documentation is
missing or out of date, or you just spent 30 minutes on IRC trying to
understand the state of the existing code, then take 10 seconds to file a
bug as a blocker of:

  Bug SMDOC: https://bugzilla.mozilla.org/show_bug.cgi?id=SMDOC

Then, as a SpiderMonkey developer, you now have the responsibility to fix some 
of these bugs at your own pace, knowing that in a few years, you might be the 
person asking these questions.

Also note that some of this documentation effort, when it goes in depth
into the description of some component, might be worth turning into a blog
post. Readers of the JavaScript blog are more than likely interested in
your prose, and might give you the feedback needed to make this
documentation understandable to anybody who has never seen SpiderMonkey
before.
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Let's crowdsource JS shell flag combinations for fuzzing

2018-05-15 Thread Nicolas B. Pierron
On Tuesday, May 15, 2018 at 2:42:16 PM UTC, Benjamin Bouvier wrote:
> let's add a text file containing a
> list of interesting JS shell flag combinations so that our fuzzing people
> can parse this file and automatically pick random combinations from it.

One question: should these combinations of flags be named, so that we can
bisect with a named combination even if the underlying flags are renamed?
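
For illustration, here is a hypothetical sketch of such a file; the format
and the names below are made up, since the thread has not settled on one:

  # name: flag combination (picked as a unit by the fuzzer)
  no-gvn:       --ion-gvn=off
  backtracking: --ion-regalloc=backtracking --ion-eager

A named entry would let a bisection script keep referring to "backtracking"
even if --ion-regalloc itself were later renamed, since only this file
would need updating.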
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] JS_STACK_GROWTH_DIRECTION

2017-11-17 Thread Nicolas B. Pierron
I agree with Jan: all the logic of the frame iterator is based on the stack
growing down, as is the way the pointers are coerced and interpreted.


For now, I think it is safe to assume that the stack grows down in the
generated code (the jit & wasm directories).
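
For what it is worth, code that bakes in this assumption can also guard it
at compile time; a minimal sketch, assuming the JS_STACK_GROWTH_DIRECTION
macro from Lars' mail below is visible at the point of use:

  // Fail the build, rather than miscompile frame-walking code, on any
  // platform where the stack grows up.
  static_assert(JS_STACK_GROWTH_DIRECTION == -1,
                "jit/ and wasm/ code assumes the stack grows down");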


On 11/17/2017 08:39 AM, Jan de Mooij wrote:

IMO it's okay to rely on the stack growing down in JIT code. All of our JIT
backends work like that right now and if this ever changes we would have to
refactor/audit a ton of things anyway (all callers of masm.push,
masm.getStackPointer would be a good start).

Jan

On Fri, Nov 17, 2017 at 9:06 AM, Lars Hansen <lhan...@mozilla.com> wrote:


JS_STACK_GROWTH_DIRECTION is normally -1 (down) but is defined as 1 (up)
for HPPA.

Does anyone test with stack-growing-up any more?  (I know HPPA is tier-3,
at best.)  Do any of you think about this possibility when you write code?
When you masm.Push something and you need the address of the pushed item,
do you worry about whether you should capture the stack pointer before or
after you push?

The wasm baseline compiler currently assumes that the stack grows down so I
guess I can just add an assert there and somebody can fix that if it
becomes necessary, but it would be nice to know if I should be worrying
about this at all.

(It looks like stack direction is now like floating point and endianness --
all mainstream systems agree, at last.  I just hope everyone will get
around to adopting TSO before too long.)

--lars




--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Should we remove TraceLogger?

2017-08-11 Thread Nicolas B. Pierron

On 08/11/2017 04:38 AM, Sean Stangl wrote:

The perf-html project is good enough now for me to use it in place of
Tracelogger.


perf-html is not available in the JS shell, and it is not as precise: even
with a smaller sampling interval, you will get more overhead than with the
tracelogger.


An alternative to complete removal would be to tune perf-html to have
labels for each location where the tracelogger can be enabled today.



I would also like to get rid of Iongraph. We should see if we can expose
more JIT information to the perf-html team.


I still rely frequently on Iongraph within the JS shell; I would not like
to see this one disappear.


I once tried to expose the exported JSON to the devtools, but this caused
more pain than anything.  We could try to have a buffer storing the log of
the compilation, as I did previously with the devtools, but without calling
back into JS.  For perf-html, though, this might cause transfer/recording
size issues.


Also, as highly as I think of perf-html, we should be careful about what we
expose in it.  Remember that not all users of perf-html are JIT experts, and
that exposing even small pieces of information such as bailouts can backfire
badly with a large number of false-positive bug reports.


I do not think we should expose the content presented by Iongraph in
perf-html.  Maybe we should focus on a synthesized version, such as
information about JIT optimizations, or displaying the speed-up at which
each piece of JIT-compiled code runs.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] How JS team members decide what to work on

2017-04-19 Thread Nicolas B. Pierron

On 04/11/2017 07:34 PM, Jason Orendorff wrote:

I have no plans to type in my notes from the JS meeting. If you want them,
ping me on IRC. But one thing I want to think about is how we decide what
to work on, especially performance work. Today, it's like this:

- If you're a volunteer, of course you decide what to pick up—we're just
glad you're here!

- A lot of us profile benchmarks and look for useful work in the profiles.

- Sometimes we do the same thing with random web sites.

- Bigger projects, like Waldo's work on parsing and djvj et al's work on GC
scheduling, are undertaken when we have stuff that has been showing up on
profiles "forever". This kind of work isn't driven by any one particular
measurement, like a benchmark.

Generally, I think we're working on stuff that makes sense (and have been
all along), but it's still not guaranteed to be representative of the web
as users see it. Is that fair? What else should we be doing?


I have looked at multiple performance issues in the past, and I globally
agree with the processes listed above.  The problem I see is that by fixing
problems we often forget to look at the big picture, or to ask the
meta-questions.


One of these meta-questions is: which parts of our current design force us
to keep writing code to fix performance issues?


The answers to this question are unfortunately larger projects than just
fixing a few performance issues observed on websites.


Still, a project like CacheIR is a good answer to this question.  Before
CacheIR, we had to investigate performance issues of Baseline and
performance issues of Ion.  Baseline ICs are usually more complete, while
Ion ICs are usually lacking despite being more optimized.  By unifying the
two IC systems, CacheIR frees up time to investigate other performance
issues in the future.


Our time is our scarcest resource, and the question you asked about how we
decide which project is the most important perfectly highlights that we
cannot keep up.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] How to implement the security scheme that prevents the RET instructions from being misused

2017-03-22 Thread Nicolas B. Pierron

On 03/22/2017 04:07 AM, Yuan Pinghai wrote:

In my current design, the cookie is stored in a new field (named
retCookie_) of JitCode, and each JITCODE (representing an instance of
JitCode) has its own cookie. In this way, when I need the original return
address, I can recover it by first getting the JITCODE and then fetching
the cookie. Now, my problem is: how can I get the correct JITCODE from an
address (e.g. the interrupted address before bailing out)?


The CalleeToken of JitFrameLayout frames holds either a JSFunction or a
JSScript, which contains pointers to the Baseline and Ion structures that
reference the JitCode.


When a JitCode is invalidated (Ion), the JitCode pointer is written above 
the return address.  JitFrameIterator::ionScript() should do the proper work 
to get the information you are looking for.
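
As a rough sketch of that lookup, assuming the SpiderMonkey-45-era
JitFrameIterator interface (method names are from memory, so double-check
them against your tree):

  // Walk the JIT frames of the current activation; for Ion frames, the
  // iterator recovers the IonScript, and from there the JitCode.
  for (jit::JitFrameIterator iter(cx); !iter.done(); ++iter) {
      if (!iter.isIonJS())
          continue;
      jit::IonScript* ionScript = iter.ionScript();
      jit::JitCode* code = ionScript->method();
      // `code` is where a field such as your retCookie_ could be fetched.
  }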


Note that JitCode is also used for all the trampoline code created when the
JitRuntime is created (see the Trampoline-*.cpp files), and these should be
registered on the Runtime.


Our stack frames make assumptions about the alignment of the stack; thus,
if you add any fields to CommonFrameLayout or JitFrameLayout, this might
cause issues in all code able to generate code, in which case you should
look at MacroAssembler::call and MacroAssembler::callJit.


WebAssembly / asm.js do not use any of the JIT frames.  Instead they use
the same frame layout as the system ABI, with some variations around the
manipulation of SIMD registers.



Could somebody give me a tip? Any suggestions are welcome. I appreciate
the help!


To be honest, our stack frames are not the easiest thing to manipulate.

Maybe, to prototype it, it would be easier to reserve some memory to be
used as a second stack space which only contains the cookies?


You could store them as part of the JSContext*, and fetch the one 
corresponding to the top of the stack.
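
A minimal sketch of that second stack, with every name hypothetical
(nothing below exists in SpiderMonkey):

  #include <cstdint>
  #include <vector>

  // Hypothetical side stack of return-address cookies, to be owned by the
  // JSContext: calls push a cookie and returns pop it, so the last entry
  // always corresponds to the youngest JIT frame.
  struct RetCookieStack {
      std::vector<uintptr_t> cookies;

      void push(uintptr_t cookie) { cookies.push_back(cookie); }

      uintptr_t pop() {
          uintptr_t cookie = cookies.back();
          cookies.pop_back();
          return cookie;
      }
  };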




By the way, I am working on SpiderMonkey 45. To be honest, I don't think I
have enough knowledge to fix the bailout and exception handling mechanisms;
I also need suggestions on them.


What are your issues with bailouts and exceptions?  They basically read a
register dump from the stack to build a MachineState (a structure of
pointers to each register's spill location, if any), then unwind the frame
and, in the case of a bailout, replace it with the one created by Baseline.


What matters is editing the return address while knowing the caller &
callee, which you should have in both cases with the JitFrameIterator.



--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Clang-format

2016-05-12 Thread Nicolas B. Pierron

On 05/11/2016 06:31 PM, Bill McCloskey wrote:

On Wed, May 11, 2016 at 5:01 AM, Nicolas B. Pierron <
nicolas.b.pier...@mozilla.com> wrote:

If the problem is the pointless arguments on dev.platform, which mistakenly
treat SpiderMonkey as Gecko's property, I would totally agree with moving
SpiderMonkey into its own repository.

I do not see how indentation differences could be a speed bump, and even if
they were, I am still not convinced that this alone could justify changing
95% of the lines of the project.

One thing I hate about Gecko's undesired continuous integration is that we
are held responsible for failures in tests that we cannot reproduce. Having
a separate project would make explicit the fact that someone is responsible
for the integration, and for converting such test cases into SpiderMonkey
test cases.  I honestly think I spend more time working out how to reproduce
Gecko failures than anybody else ever spent thinking about indentation.



This is a really bad attitude for Mozilla as a whole. Every one of us at
Mozilla has a responsibility to make Firefox the best web browser. The more
we divide ourselves into cliques and label bugs as "someone else's
problem", the sooner we will fail. You might think it's more productive for
you to focus on SpiderMonkey alone and let other people deal with other
issues. Unfortunately, many of the most important bugs span across
different areas; with your approach, these bugs will never be fixed.


This is not someone else's problem; it is my problem, except that someone
else who is much more experienced with the rest of the browser would already
have worked out how to help me reproduce the issue.


Basically, what I am suggesting by having a person responsible for the
integration of SpiderMonkey into Gecko is to have one or more people who
would become knowledgeable about all the various parts where I am not, thus
making *us* (Gecko & SpiderMonkey) more productive by having competent
people working in their domains of expertise.


We would then no longer be stuck for weeks on problems that we have no idea
how to address.


I know the time it takes to investigate such errors, and I value my time
and prioritize so that I can have the most impact.  When facing Gecko
failures, I have two choices:

 - Spend weeks figuring them out.
 - Switch to something else.

In both cases, I waste something: either the time to figure out the issue,
or the time it took me to do the initial work.


I sometimes take the second option, in the hope that fuzzers will find the
issue, or that other bugs will be easier to investigate, thus reducing the
amount of wasted time at the cost of extra latency.



Mozilla needs more people who understand multiple browser components. I'll
call them superheroes because of how valuable they are. Understanding and
reproducing browser tests can seem unrewarding, but it's a great way to
start to understand how the rest of the system works. People on the
SpiderMonkey team are in a great position to be superheroes: SpiderMonkey
and XPConnect are some of the hardest parts of the browser to understand,
and it's often necessary to step through them to debug other browser
issues. People who already understand them have an advantage over everyone
else.


The need for superheroes only highlights our lack of effort to make
SpiderMonkey easier to grasp from within a debugger, for embedders.


That's something I have wanted to change for a while, and I think we can
improve the SpiderMonkey embedder experience.  I have suggested multiple
times that we should improve the SpiderMonkey debugging experience under
gdb, by giving the ability to set breakpoints on JS code within gdb
(including JIT code).


The more we empower people to work in their domain(s) of expertise, the
less need we would have for such heroes.  Having people responsible for the
integration would help us with that.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Clang-format

2016-05-11 Thread Nicolas B. Pierron

On 05/11/2016 02:15 AM, Jason Orendorff wrote:

instead go with Terrence’s suggestion and simply adopt the same style as
the rest of Gecko, including the 2-space indent.



I've said before that we won't do this without talking it over as a team.
Well, team? What do you think?


Massive changes are always a bad idea, unless they are used to eliminate
classes of bugs/crashes by preventing us from writing them.


Changing the indentation is the kind of thing which brings no value and
introduces massive changes.  So I will always be totally against this kind
of change.


I agree that having a tool to *check* a coding style is nicer than having
no coding style, as long as the tool is flexible enough to allow local
inconsistencies made to keep the code more readable.



Personally I dislike the 2-space indent. But what matters to me here is
eliminating a speed bump for both Gecko and SM hackers; and reducing
pointless arguments on dev.platform.


If the problem is the pointless arguments on dev.platform, which mistakenly
treat SpiderMonkey as Gecko's property, I would totally agree with moving
SpiderMonkey into its own repository.


I do not see how indentation differences could be a speed bump, and even if
they were, I am still not convinced that this alone could justify changing
95% of the lines of the project.


One thing I hate about Gecko's undesired continuous integration is that we
are held responsible for failures in tests that we cannot reproduce. Having
a separate project would make explicit the fact that someone is responsible
for the integration, and for converting such test cases into SpiderMonkey
test cases.  I honestly think I spend more time working out how to reproduce
Gecko failures than anybody else ever spent thinking about indentation.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Clang-format

2016-05-10 Thread Nicolas B. Pierron

On 05/06/2016 06:07 PM, Jakob Stoklund Olesen wrote:



On May 6, 2016, at 09:59, Jason Orendorff <jorendo...@mozilla.com> wrote:

On Fri, May 6, 2016 at 10:43 AM, Jakob Stoklund Olesen <jole...@mozilla.com> wrote:
Unfortunately, the way SpiderMonkey indents case labels is too odd for 
clang-format. I don’t think it has a configuration flag that can do that 
half-indent.

Feel free to mass-change it to whatever Gecko does and update the style guide. 
We'll cope.


The mozilla style is to indent the case label by one level from the switch, and 
the code inside the case by one further level. With 4-space indent, it looks 
like this:

 switch (tag) {
     case SCRIPT_INT: {
         uint32_t i;
         if (mode == XDR_ENCODE)
             i = uint32_t(vp.toInt32());
         if (!xdr->codeUint32(&i))
             return false;
         if (mode == XDR_DECODE)
             vp.set(Int32Value(int32_t(i)));
         break;
     }
     case SCRIPT_DOUBLE: {
         double d;
         if (mode == XDR_ENCODE)
             d = vp.toDouble();
         if (!xdr->codeDouble(&d))
             return false;
         if (mode == XDR_DECODE)
             vp.set(DoubleValue(d));
         break;
     }

Applied to the current SM code base, this style change would move all lines 
inside a switch, not just the case labels.

I think that if we can cope with such an invasive mass change, we should 
instead go with Terrence’s suggestion and simply adopt the same style as the 
rest of Gecko, including the 2-space indent.


I would not go for indenting case labels by 4, as this would basically make
us avoid switch-case statements in favor of "else-if" chains, which do not
give us the same guarantees.


Another solution would be to remove the half-indent and replace it with no
indent, i.e. all visibility modifiers would be at the same indentation
level as the class keyword, and all case labels at the same level as the
switch statement.
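
Concretely, the no-indent variant would look like this (a sketch of the
proposed style, not existing code):

  class Foo
  {
  public:
      void bar();
  private:
      int x_;
  };

  switch (tag) {
  case SCRIPT_INT:
      // ...
      break;
  case SCRIPT_DOUBLE:
      // ...
      break;
  }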


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Reducing SpiderMonkey's crash rate

2016-05-04 Thread Nicolas B. Pierron

On 05/03/2016 08:10 PM, Steve Fink wrote:

On 05/03/2016 11:11 AM, Jakob Stoklund Olesen wrote:

LLVM had an EXPENSIVE_CHECKS macro for that kind of assertion, but I don’t
think they use it any more. People would rarely enable it, so the
expensive assertions had a tendency to bit rot. I think if they had been
enabled by default, they might have stayed in.


Yes, this would be worth doing, but would require some effort. I think we
can keep them from bitrotting by running with them on in automation. I would
say that a debug build should always have them compiled in, but the
expensive asserts could do a dynamic check before executing.



I do not recall who suggested it, but one idea was to run nightly builds
with MOZ_ASSERT compiled in, with the addition of a way to skip assertions
in order to throttle down their overhead.


While I think this is a good idea, I see some pitfalls: we don't want to
introduce opt-only bugs that are not caught by the nightly population, and
we also do not want to ship an opt build with all MOZ_ASSERTs enabled, as
this would cause issues with variables which only exist in debug builds.
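
To make the throttling idea concrete, here is a hypothetical sketch; the
macro name and the skipping policy are made up, and only MOZ_RELEASE_ASSERT
is a real mfbt name:

  // Run an assertion only on every 64th hit, to throttle its overhead if
  // it were compiled into nightly opt builds.  The static counter is not
  // thread-safe, and the condition must only use data which exists in opt
  // builds; a real version would need to address both points.
  #define THROTTLED_ASSERT(cond)                  \
      do {                                        \
          static unsigned hits = 0;               \
          if ((++hits % 64) == 0)                 \
              MOZ_RELEASE_ASSERT(cond);           \
      } while (0)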


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Reducing SpiderMonkey's crash rate

2016-05-03 Thread Nicolas B. Pierron

On 05/02/2016 11:32 PM, Nicholas Nethercote wrote:

On Thu, Apr 28, 2016 at 10:35 PM, Nicolas B. Pierron
<nicolas.b.pier...@mozilla.com> wrote:


For the JIT, what would improve our lives a lot would be the ability to
dump the code of the compiled function which is currently being executed.
If we had that, I think we could make a tool to reverse engineer the trace
of functions used to generate the assembly code, and potentially walk back
to the LIR / inline cache which produced the code.


Good idea. How hard would this be? Should I file a bug?


The idea I had was to have a compilation mode where we instrument the
assembler buffer to record the sequence of stack traces along with the
sequence of pushed bytes, then use this information to build a Markov chain
for each stack frame which is still live on the stack.


This way, the reverse engineering would work like a GLR parser on an island
grammar expressed by the Markov chains, producing as an AST the potential
compilation traces for the code which produced the assembly buffer.  The
Markov chains should provide the likelihood of each AST, and could also
potentially help us by highlighting corrupted bytes.


I think such a tool could be made in a matter of weeks.

The big unknown for me is where we can find the bytes surrounding the pc.
Jan told me that we are already doing so, but I have no access to such a
pool of information to experiment with.



I think this would be something we should consider doing if we are going to
rewrite the MIR representation / the compiler, as I expect to do as part of
THM, whose internal representation should be easy to {de,}serialize.

> […]
>

(BTW, what is "THM"?)


Three Headed Monkey, the project which should revolutionize the way we
write compilers, but on which I have effectively no time to work yet.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Reducing SpiderMonkey's crash rate

2016-04-28 Thread Nicolas B. Pierron

On 04/28/2016 06:48 AM, Nicholas Nethercote wrote:

This is a good moment to think hard about how we can improve things.

- Can we use static and dynamic analysis tools more? (Even simple
things like bug 1267551 can help.)


I think we already do that whenever we can think of a practical one.  For
dynamic analysis, fuzzers are extremely helpful in figuring out issues.



- How can we get better data in JIT and GC crash reports?


For the JIT, what would improve our lives a lot would be the ability to
dump the code of the compiled function which is currently being executed.
If we had that, I think we could make a tool to reverse engineer the trace
of functions used to generate the assembly code, and potentially walk back
to the LIR / inline cache which produced the code.



- Would "extended assertions" help? By this I mean verification passes
over complex data structures. Compilers often have these, e.g. after
each pass you can optionally run a pass that does a thorough sanity
check of the IR. Do we have that for the JITs?


We have such phases in IonMonkey, which ensure the sanity of the MIR graph.
For the moment they are only enabled in debug builds.  Most of the checks
could be done in release builds, but some depend on data which is only
available in debug builds, and which we would not want to enable in release
builds either.


I think it is doable to have such checks turned on if the browser is in a 
refined-error-detection mode.  I am thinking mostly of repeated start-up 
crashes, which are likely to be caused by more-or-less deterministic behaviours.


On the other hand, asserting on graph coherency would help locate the
error, but would hardly isolate it to a specific function.



- What defensive programming measures can we add in? What code
patterns are error-prone and should be avoided?


One idea I had would be to write unit tests for the phases of the compiler.
Unfortunately this is not trivial to add as-is, as we would have to spell
out all the hidden assumptions currently present in all the phases.


I think this would be something we should consider doing if we are going to
rewrite the MIR representation / the compiler, as I expect to do as part of
THM, whose internal representation should be easy to {de,}serialize.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] OOM exceptions

2016-04-22 Thread Nicolas B. Pierron

On 04/21/2016 09:05 PM, Shu-yu Guo wrote:

The first is ergonomic. I want phased workloads like parsing + BCE and JIT
compiling to be completely infallible. The allocation pattern of compilers
is a series of many small allocations. I'd like to just allocate a huge
buffer inside a LifoAlloc-like allocator in the beginning and call it a
day. That shouldn't impact 32bit address space fragmentation. Checking
false/null returns everywhere really puts a crimp in my evening.


We have been down this path with IonMonkey, and I honestly find that an
infallible allocator is not good practice, especially if you don't want to
cause a browser crash and want to properly handle OOMs, as we try to do in
IonMonkey.


The reason I think we should properly handle OOMs in IonMonkey is that
these allocations are not part of the pool of allocations mandated by the
script.  Thus, the user script should not see these OOMs, as it is
technically not responsible for them.


The problem, then, is that we have to armor every loop whose bounds are
controlled by user input.  If the user input can be made arbitrarily large,
then whatever ballast space you reserve can be overflowed.  And because this
handling is only occasional, I think it is much more misleading than having
a pattern repeated over and over.


I tried to suggest better approaches in Bug 1244824 [1], but the best I
could come up with is a way to emulate exception handling, and its pitfalls
are worse.  Still, if we had a static analysis to ensure that we never have
destructors to execute while leaving a scope, then I guess we could emulate
exceptions this way.


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1244824

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] OOM exceptions

2016-04-21 Thread Nicolas B. Pierron

On 04/21/2016 05:16 PM, Jan de Mooij wrote:

Is our only option doubling down on these fuzz bugs and adding more
assertions, or can we do better with static analysis, the type system,
annotations, something?


From the type-system point of view, I think we could add a type to
distinguish allocation failures from boolean types.  In many cases, I have
found that we were mixing the true/false expectation of an analysis with
the true/false of an allocation.


Using the type system would involve making a lot of modifications to the
code base, either to wrap/unwrap error codes or to add new enumerated
types.  I think this could be a good long-term solution, but hardly a way
to make incremental progress.
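
As a sketch of the kind of enumerated type this could mean (all names
hypothetical):

  // Give allocation failure its own type instead of reusing `bool`, so the
  // answer of an analysis and an OOM can no longer be confused.
  enum class AllocResult {
      Success,
      OutOfMemory
  };

  // The true/false answer of an analysis is then returned separately:
  AllocResult analyzeLoopBounds(bool* isBounded);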


A static analysis is probably the easiest way forward, and it should ensure
that the same value (false / Foo::ALLOC_ERROR) is always used to identify
allocation failures within a single function.


This means that the analysis should probably:

 1. Go through the bodies of functions, looking for the values returned in
    case of allocation failure.
 2. Annotate each function declaration with the value used on allocation
    failure.
 3. Revisit (going back to 1) any function which uses any of the annotated
    function declarations.
 4. Ensure that virtual functions have consistent error values.


This would leave the question of function pointers, but I guess this is 
something we can easily address either by review or with annotations.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Jit DevTools 0.0.1

2015-06-11 Thread Nicolas B. Pierron

On 06/11/2015 04:49 PM, cosinusoida...@gmail.com wrote:

On Thursday, June 11, 2015 at 2:19:40 PM UTC+1, Nicolas B. Pierron wrote:

On 06/11/2015 02:43 PM, cosinusoida...@gmail.com wrote:

Hi Nicolas,

I'm getting the following error when I attempt to use the addon:

console.error: jit-dev-tools:
Message: TypeError: this.debuggee is undefined
Stack:
  JitPanel.onReady@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://jit-dev-tools/lib/jit-panel.js:75:5
emitOnObject@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://gre/modules/commonjs/sdk/event/core.js:112:9
emit@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://gre/modules/commonjs/sdk/event/core.js:89:38
onStateChange@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://gre/modules/commonjs/dev/panel.js:70:3



I seem to get exactly the same issue when I try it in the latest nightly 
(https://hg.mozilla.org/mozilla-central/rev/bfd82015df48 according to 
about:buildconfig). I also built the latest version of your addon from git, but 
still I get the same error.


If you have a github / bugzilla account, I suggest we offload this 
discussion to [1] or [2].


Also, can you explain in the bug how you start the devtools and how they
appear?  Maybe I will manage to reproduce this issue.


Otherwise, if you have time, I suggest we discuss on irc.mozilla.org (nbp)
how to instrument the code to debug this addon.


Thanks for testing, and reporting issues :)

[1] https://github.com/nbp/jit-dev-tools/issues/new
[2] 
https://bugzilla.mozilla.org/enter_bug.cgi?assigned_to=nobody%40mozilla.orgbug_file_loc=http%3A%2F%2Fbug_ignored=0bug_severity=normalbug_status=NEWcc=:nbpcf_blocking_b2g=---cf_blocking_fennec=---cf_feature_b2g=---cf_fx_iteration=---cf_fx_points=---cf_status_b2g_2_0=---cf_status_b2g_2_0m=---cf_status_b2g_2_1=---cf_status_b2g_2_1_s=---cf_status_b2g_2_2=---cf_status_b2g_master=---cf_status_firefox38=---cf_status_firefox38_0_5=---cf_status_firefox39=---cf_status_firefox40=---cf_status_firefox41=---cf_status_firefox_esr31=---cf_status_firefox_esr38=---cf_tracking_b2g=---cf_tracking_e10s=---cf_tracking_firefox38=---cf_tracking_firefox38_0_5=---cf_tracking_firefox39=---cf_tracking_firefox40=---cf_tracking_firefox41=---cf_tracking_firefox_esr31=---cf_tracking_firefox_esr38=---cf_tracking_firefox_relnote=---cf_tracking_p11=---cf_tracking_relnote_b2g=---component=JavaScript%20Engine%3A%20JITcontenttypemethod=autodetectcontenttypeselection=text%2Fpl

aindefin
ed_groups=1flag_type-203=Xflag_type-37=Xflag_type-4=Xflag_type-41=Xflag_type-5=Xflag_type-607=Xflag_type-720=Xflag_type-721=Xflag_type-737=Xflag_type-781=Xflag_type-787=Xflag_type-791=Xflag_type-799=Xflag_type-800=Xflag_type-803=Xflag_type-835=Xflag_type-846=Xflag_type-855=Xflag_type-856=Xflag_type-857=Xflag_type-863=Xflag_type-864=Xflag_type-870=Xflag_type-875=Xflag_type-889=Xform_name=enter_bugmaketemplate=Remember%20values%20as%20bookmarkable%20templateop_sys=Unspecifiedpriority=--product=Corerep_platform=Unspecifiedshort_desc=Jit%20DevToolstarget_milestone=---version=unspecified

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Jit DevTools 0.0.1

2015-06-11 Thread Nicolas B. Pierron

On 06/11/2015 02:43 PM, cosinusoida...@gmail.com wrote:

Hi Nicolas,

I'm getting the following error when I attempt to use the addon:

console.error: jit-dev-tools:
   Message: TypeError: this.debuggee is undefined
   Stack:
 JitPanel.onReady@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://jit-dev-tools/lib/jit-panel.js:75:5
emitOnObject@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://gre/modules/commonjs/sdk/event/core.js:112:9
emit@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://gre/modules/commonjs/sdk/event/core.js:89:38
onStateChange@resource://gre/modules/commonjs/toolkit/loader.js - 
resource://gre/modules/commonjs/dev/panel.js:70:3


Thanks for reporting. I guess I might be using a new API added to the dev tools.


I get that in both Firefox 38 and Firefox Developer edition on x86_64 Linux.


Also, the Debugger.onIonCompilation hook is quite new (>= 41.0a1), so you
will have to use a nightly version of Firefox to use this addon, or wait
until the end of the month for the next release cycle.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


[JS-internals] Jit DevTools 0.0.1

2015-06-10 Thread Nicolas B. Pierron

Hello everybody,

I am pleased to make an early announcement of a new tool named Jit DevTools.

This new tool is an addon which mostly targets JIT developers.  It uses the
recently added Debugger.onIonCompilation hook to display the latest MIR [1]
and LIR graphs within the dev tools.


To use this tool, go to a web page and open the Jit DevTools panel, then
wait until a function gets compiled.


Once a function is compiled, you can select a compiled script, which will 
display the MIR graph of the compilation.  If you need to, you can also have 
a look at the LIR graph.  The output is similar to the one rendered with 
iongraph [5].


When you select an inlined script, the background of the block titles will
change color if the block corresponds to the inlined instance.  This feature
is quite useful for identifying the role a function plays in a compiled
script.


You might find this tool quite handy to use for the following use cases:
 - Investigating DOM optimization.
 - Investigating jsperf issues.
 - Comparing function implementations, and impacts on the generated code.

This addon works on optimized builds, and even with parallel compilation 
enabled.


You can download [2] this early version from
http://people.mozilla.org/~npierron/jit-dev-tools/ , or build it yourself
from the sources [3] with the jpm tool [4].


Have fun, and enjoy.


[1] http://people.mozilla.org/~npierron/jit-dev-tools/jit-dev-tools-0.0.1.png
[2] http://people.mozilla.org/~npierron/jit-dev-tools/jit-dev-tools-0.0.1.xpi
[3] https://github.com/nbp/jit-dev-tools/
[4] https://developer.mozilla.org/en-US/Add-ons/SDK/Tools/jpm
[5] 
https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Hacking_Tips#Using_IonMonkey_spew_%28JS_shell%29


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Bailout_DuringVMCall

2015-04-14 Thread Nicolas B. Pierron

On 04/14/2015 08:31 PM, madhukar.kedl...@gmail.com wrote:

For the past few months I have been working on using offline type profile 
information to avoid bailouts in SM. During my experiments I came across 
Bailout_DuringVMCall and was not able to trace what exactly caused it.

I was able to reproduce these bailouts using simple examples where I
changed the type of a global variable or the shape of an object in a hot
function after a few thousand invocations of it. Ideally, the bailout type
for such code should be either Bailout_TypeBarrierV or Bailout_ShapeGuard,
but I see Bailout_DuringVMCall being generated. Is it because the hot
function gets inlined into another hot function and there is no way to
figure out which bailout occurred?



This is unfortunately the placeholder for both JS and native function calls. 
 This can be interpreted as “the generated code got invalidated”.


Sadly, many things can produce code invalidations, and the location of the
bailout does not reveal its reason.  This bailout is just there to ensure
that we no longer execute code which is now unsafe, as it is based on
assumptions which no longer hold.


Tracing the reason of the invalidation would be nice, and might be doable by 
instrumenting calls to addPendingRecompile[1].


[1]
https://dxr.mozilla.org/mozilla-central/search?q=addPendingRecompile&case=true
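
As a sketch of such instrumentation, the helper below is hypothetical, and
the JSScript::filename()/lineno() accessors it uses are worth
double-checking for this era of the engine:

  #include <cstdio>

  // Hypothetical shim, to be called from each addPendingRecompile call site
  // with a human-readable reason, so invalidations can be traced back.
  static void LogPendingRecompile(JSScript* script, const char* reason)
  {
      fprintf(stderr, "[Invalidate] %s:%u because %s\n",
              script->filename(), unsigned(script->lineno()), reason);
  }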


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Optimization tracking API landed

2015-02-06 Thread Nicolas B. Pierron

On 02/06/2015 01:20 AM, Shu-yu Guo wrote:

I recently landed bug 1030389 to track the high-level optimizations
decisions
(i.e., deciding what MIR is emitted) made by IonBuilder. This information
will feed into the profiler and is attached to sampled JIT frames.


Not all optimization decisions are taken by IonBuilder; is there a plan to
make this API available to other transformation phases?  In particular, I
am thinking of Escape Analysis / GVN & LICM / Sink.



That's it! Instrumentation bugs also make great first bugs, and I would be
happy to mentor.


That's great!

Should we instrument every code path, or only the code paths which have a
big performance cliff?


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


[JS-internals] Jit: Source-code locality Compiled-code locality ?

2015-01-13 Thread Nicolas B. Pierron

Hi list,

Lately, we have been discussing ways to clean up the MIR.h file.  Among the
problems we have with this file is that all instructions are randomly
ordered in it, so if you need to look for an instruction, you must use the
search & jump feature of your editor.


Moreover, the problem is more general than MIR.h, as we see the same issue
in Lowering.cpp and CodeGenerator.cpp.


Having such files results in a terrible developer experience as the 
source-code locality is non-existent.  On the other hand, this model 
provides good compiled-code locality, as similar functions are packed 
together in the binary.


Previously, I suggested that we should move functions closer to the
transformation phases which make use of them.  As of today, we can see this
idea applied in RangeAnalysis.cpp and in Recover.cpp.  This idea gives
better compiled-code locality.


On the other hand, as a developer, I am saddened by the current state:
when I have to look for one instruction, I do not see everything related to
that instruction in one file.  Having source-code locality is good for
eyeballing the consistency of modifications (reducing review time?).


I think we should improve source-code locality while keeping or improving
compiled-code locality.  I want to know whether people are interested in
this topic, and whether we should continue this discussion in a bug with
actual code prototypes.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Contributing

2014-12-06 Thread Nicolas B. Pierron

Hi Xue,

and Welcome :)

On 12/06/2014 02:23 PM, Xue Fuqiao wrote:

Hi list,

A newbie here.

I'm interested in contributing to SpiderMonkey and I've read some
information about SpiderMonkey on MDN and MozillaWiki.

FYI - I'm familiar with:
   * JavaScript (ES5)
   * Bugzilla
   * MozillaWiki
   * ANSI C
   * C++98
   * JSON
   * Beavis and Butt-head Do America (which is the origin of the name
SpiderMonkey :-)

I'm not familiar with (yet):
   * ES6 (and TC-39, and the standardization process)
   * Mercurial
   * Try Server
   * JIT and bytecode
   * Garbage collection
   * JS engine benchmarking (Kraken/SunSpider/Octane)
   * Instruction sets
   * asm.js
   * IRC
   * Make (I can only write some very simple rules.)
   * Security


It is great that you made such a list, but don't worry, we can always help you.

About:
 - IRC:
   Read the documentation at [1]; you can join the #jsapi channel, which
is where SpiderMonkey developers are chatting.  Sadly, we are unlikely to
give a fast answer today, as most of us are travelling.


   [1] https://wiki.mozilla.org/IRC

 - Mercurial:
   You can find some documentation at the following link:

   https://developer.mozilla.org/en-US/docs/Mercurial

 - Try Server:
   Don't worry about it; the usual process is that we make a few patches
first, and ask the reviewer to push the patch to try and then to
mozilla-inbound.


(sorry I have to cut my reply short, I will continue tomorrow … I have to 
take a bus)


Welcome :)

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


[JS-internals] Jit test JS shell command line options.

2014-11-27 Thread Nicolas B. Pierron

Hi all,

Fuzzers are testing configurations which are not the default, such as
--ion-gvn=off and --ion-regalloc=backtracking.  Some of these options are
convenient when we want to be robust, or when we want to migrate to a new
configuration.


Ideally we should use the testing functions[1], such as gczeal or 
setJitCompilerOption as these can also be used in browser builds.


For options which are not yet covered by testing functions and which have
a command-line interface in the JS shell, I am adding a way to use them as
part of Bug 1105187 [2].  To use command-line options, just write a comment
like the following at the top of the test case:


 // |jit-test| --no-sse4; --ion-regalloc=backtracking; error:ReferenceError

 …

Note that command-line options are separated by semicolons, and that spaces
around them are stripped before they are appended to the command line of
the JS shell.


Only long command-line options (starting with --) are accepted, not the
short ones, so


 // |jit-test| -D

will not work and will output a warning message, while the following will
succeed:

 // |jit-test| --dump-bytecode

If you have any doubts, you can check whether the JS shell is invoked with
the right command line by using a command such as:

 python ./jit-test/jit_test.py -s -o ./path/to/js ion/bug1105187-sink.js


[1] 
http://dxr.mozilla.org/mozilla-central/source/js/src/builtin/TestingFunctions.cpp#2221

[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1105187

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


[JS-internals] Use MOZ_ASSERT and MOZ_ASSERT_IF

2014-10-01 Thread Nicolas B. Pierron

Hi,

I just replaced all instances of JS_ASSERT with MOZ_ASSERT, and JS_ASSERT_IF
with MOZ_ASSERT_IF.  Each commit contains the command used to do it
automatically, which is also listed in Bug 1074911.


JS_ASSERT is dead, Hurray for the new MOZ_ASSERT \o/

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Growing arrays

2014-07-17 Thread Nicolas B. Pierron

On 07/16/2014 11:08 PM, Nicholas Nethercote wrote:

So then I tried reverting that change and inserting this line just
before the loop:

   array[length - 1] = 0;

And now it avoids the doubling allocations -- the array elements are
allocated once, at the right size. But it feels dirty, and I don't know if
it would give the same behaviour in other JS engines.


Array implementations are really different between JavaScript engines.

I know that this trick used to make us produce a sparse array, where in
addition to setting the last element, we would pay an extra cost for each
assignment done in the middle afterwards.

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Dynamic analysis meeting w/ devtools

2014-07-01 Thread Nicolas B. Pierron

On 07/01/2014 11:11 AM, Till Schneidereit wrote:

On Tue, Jul 1, 2014 at 7:52 PM, Jason Orendorff jorendo...@mozilla.com
wrote:


The proposed implementation technique underlying this is bytecode
instrumentation. One reason for this is that we already have tons of
practice
adding new opcodes to Ion, baseline, and the interpreter. We already know
how
to make them work the same in all modes and fast in Ion. Of course the
implementation technique could vary per event. If we choose to support an
event
that already has a natural choke-point in C++, we would not need bytecode
instrumentation to intercept that event. It is also true that bytecode
instrumentation has a few weaknesses--things like exception handling are
not
done by executing bytecodes at all.



Isn't this exactly what tracelogging does? Or, a subset of what
tracelogging does, rather?


For this precise point, yes.

This is why I also discussed the Tracelogger with Hannes.  The main
difference is the exposure through Debugger.  But keep in mind that this
first aspect would be the groundwork for incremental updates of this API,
and that we can instrument on demand based on what is monitored by all the
Debuggers.


Code coverage might want a per-block overview of code usage, or a
per-instruction overview, depending on how much overhead is acceptable.


I think the Tracelogger output is only one kind of information that we want
to expose through an analysis API, such that we can stream & process events
and make them available inside the debugger.


One of the differences between the Tracelogger and what we want to achieve
at first here is that we only want to observe one compartment, not all the
runtimes of the browser (is there a tracelogger filter?).  One of the things
I would hope to see exposed to web developers is the time spent in the
parser / GC of each compartment.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Dynamic analysis meeting w/ devtools

2014-07-01 Thread Nicolas B. Pierron

On 07/01/2014 12:04 PM, Fitzgerald, Nick wrote:

On 7/1/14, 10:52 AM, Jason Orendorff wrote:


Events are *not* delivered synchronously. As JS code executes, a log is
written. Occasionally, the log is then parsed and records are delivered to
devtools code via the enterFunction and leaveFunction methods above. (This
batching should improve performance, by minimizing C++-to-JS-to-C++ and
cross-compartment calls.)


Because all the devtools are designed to work remotely from day one*, we
will be sending these logs over the Remote Debugging Protocol from the
debuggee device (the Firefox OS / Fennec phone, etc) to the debugger device
(desktop Firefox) where the data will be processed in a worker and
eventually displayed to the user.

It would be a shame if we did this:

1. Collect log in SpiderMonkey
2. Parse log into JS objects
3. Deliver to hooks devtools set
4. Re-serialize JS objects into a log for transport
5. Send log over RDP
6. Parse log into JS objects again

When if the log was exposed to devtools as some kind of blob / typed array
that we can send across the RDP as binary data, we could do this:

1. Collect the log in SpiderMonkey on the debuggee device
2. Deliver the log blob to a hook the devtools set
3. Send log blob over RDP
4. On the debugger device, devtools code asks Debugger to parse the blob

This way we aren't repeatedly parsing and serializing (to potentially
different formats!) for no good reason.


One of the issues with the blob logic is that the intent of making analyses
is to be able to inspect elements.  One of the ideas was to be able to proxy
objects, such that we can still provide a boxing mechanism, which makes
sense for synchronous analyses as they have in Jalangi.


On the other hand, now that I am thinking more about it, I do wonder to
what extent an asynchronous view of objects might be helpful compared to
some unique identifier for an object.


In that case, if you want to find the value corresponding to an identifier,
you will have to watch for object mutations as well, knowing that object
allocations/mutations/deallocations will stream you the list of
modifications made to all objects.


If we go through the RDP, then I guess we want the asynchronous tracing to
just provide an ArrayBuffer of its log, based on a list of callbacks (not
functions), such that we can easily write the server side of the pipeline.


Then I guess we want a second function into which we feed the ArrayBuffer,
and which calls all the callbacks (provided as a list to the first
function).  This would be on the client side of the Debugger.


// producer
var watched = [enterFunction, leaveFunction, setObject, newObject, 
freeObject];

dbg.addLogListener(watched, function (stream) {
  // ... send stream over the network, or locally ...
});

// consumer
var watcher = {
  enterFunction: function (event) { /* ... */ },
  leaveFunction: function (event) { /* ... */ },
  setObject: function (event) { /* ... */ },
  newObject: function (event) { /* ... */ },
  freeObject: function (event) { /* ... */ }
};

function onStreamReceived(stream) {
  dbg.dispatchLogEvents(stream, watcher);
}

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Dynamic Analysis API discussion

2014-06-26 Thread Nicolas B. Pierron
 or elsewhere where we're
depending on analysis.


Like everything else, but there is a greater chance of breaking something
which relies on source-to-source transformation than something which relies
on a lower-level (ECMA-based?) API.



#3 is interesting and perhaps where lessons learned from Java and other
contexts do not apply. I think we should dig into specific tool examples
for this; maybe some combination of more intelligent translation and
judicious API extensions can solve the problems.

Nicolas B. Pierron wrote:


Personally, I think that these issues imply that we should avoid relying
on a source-to-source mapping if we want to provide meaningful security
results. We could replicate the same or a similar API in SpiderMonkey, and
even make one compatible with Jalangi analyses.



It's not clear what you mean by the same or a similar API here.


I mean that I want such an API to be a JavaScript API.  I do not want us
to provide functions for adding individual hooks.  I want the JS engine to
provide one function for registering all the hooks you want in a separate
compartment.


  var a = newAnalysisGlobal();
  a.eval(load('my-analysis.js'));

  var g = newGlobal({analysis: a});

  // Generate bytecode probes based on function currently present on the
  // analysis global.
  g.eval(…);

We can either take inspiration from the Jalangi interface for writing
analyses, or just bridge the two with a wrapper.  Such analyses should be
implemented in JavaScript and not in any other language, as our primary
target is JavaScript developers.



If we add opcodes dedicated to monitoring values (at the bytecode-emitter
level) instead of doing source-to-source transformation, one of the
advantages would be that frontend developers would not have to maintain the
Jalangi sources when we add new features to SpiderMonkey; moreover, the
bytecode emitter already breaks everything down to opcodes, which are
easier to wrap than the source.

Analyses are usually made to observe the execution of code, not to mutate
it.  So if we only monitor the execution, instead of emulating it, we might
be able to batch analysis calls.  Doing batches asynchronously implies that
the overhead of running an analysis is minimal while the analyzed code is
running.



Logging and log analysis have their place, but a lot of dynamic analysis
tools rely on efficient synchronous online data processing in
instrumentation code. For example, if you want to count the number of times
a program point is reached, it's much more efficient to increment a global
variable at that program point than to log to a buffer every time that
point is reached, and count log entries offline. For many analyses of
real-world applications, high-volume data logging is neither efficient nor
scalable. Here are a couple of examples of Java tools I worked on where
synchronous online data processing was essential:
-- http://fsl.cs.illinois.edu/images/e/e8/P385-goldsmith.pdf
-- http://web5.cs.columbia.edu/~junfeng/09fa-e6998/papers/hybrid.pdf
So I think injection of synchronously executed instrumentation is essential
for a large class of analyses.


The asynchrony is one suggestion to make recording analyses faster, by
avoiding frequent cross-compartment calls.  I do not see any issue with
having synchronous requests; on the contrary, I think it might be
interesting to interrupt the program execution on such a request, or even
to change the program execution (things that we can only do synchronously)
to prevent security holes / privacy leaks.


On the other hand, I do think that we should have asynchronous analyses
first, but only the use cases of potential users can answer this question
for us.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Dynamic Analysis API discussion

2014-06-26 Thread Nicolas B. Pierron

On 06/26/2014 10:49 AM, Shu-yu Guo wrote:

On Jun 26, 2014, at 6:57 AM, Nicolas B. Pierron nicolas.b.pier...@mozilla.com 
wrote:


I have a question for you, and also for people who have made such analyses
in SpiderMonkey.  Why take all the pain of integrating such an analysis into
SpiderMonkey's code, which is hard and changes frequently, when it would be
easy (based on what you mention) to just do a source-to-source
transformation?

Why do we have 3 proposals for implementing taint analysis in SpiderMonkey
so far?  It sounds to me that there is something which is not easily
accessible from source-to-source transformation, which might be easier to
hook into once you are deep inside the engine.


Perhaps we can get those who tried to implement taint analysis in SpiderMonkey 
before to chime in about the pain points they experienced. Do we know who they 
are?


Yes, we know who they are, and we have contacted all of them.

But I know that at least one of them does not want to go public right now.


Extending a JS parser, maybe.  Extending 2 JS parsers the same way is harder.


New language features with complex semantics require significant tool
updates whatever API we use.


Not as much as the syntax; the bytecode is an example of this, as the
bytecode is a kind of subset that we target with the bytecode emitter.  As
you mentioned, manipulating bytecode is easy, but manipulating the source
while ensuring that we keep the same semantics might be more complex.


It seems a worse maintenance burden to me to have to update all analyses 
written when we decide to change the bytecode in SpiderMonkey, say, like 
decomposing some more fat ops. Exposing a bytecode-based instrumentation on a 
private bytecode makes the bytecode a de facto public and frozen API, which is 
undesirable.

As I’ve said before, which I’ll repeat here for the benefit of the discussion 
thread, I am in favor of a source-to-source approach because it seems to me 
that source-to-source is just as expressive as the API proposed here. I remain 
optimistic that an out-of-engine tool can be made performant, for some of the 
points roc mentioned. For maintenance, if nothing else, an out-of-engine tool 
is open to be maintained by a larger number of developers instead of just JS 
engine developers.


I do not disagree that source-to-source is more expressive, but it also
makes it easier to shoot yourself in the foot when doing such
modifications.


I want to make sure that it is as easy for analysis developers to write 
analyses as it is for us to maintain such an API.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


[JS-internals] Dynamic Analysis API discussion

2014-06-25 Thread Nicolas B. Pierron
 features in SpiderMonkey, and moreover, the bytecode emitter already 
breaks everything down to opcodes, which are easier to wrap than the 
source.


Analyses are usually made to observe the execution of code, not to 
mutate it.  So if we only monitor the execution, instead of emulating it, we 
might be able to batch analysis calls.  Doing batches asynchronously implies 
that the overhead of running an analysis is minimal while the analyzed code 
is running.
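
For illustration, here is a minimal standalone C++ sketch of such batching 
(all names are invented here, nothing is SpiderMonkey API): hooks record 
events synchronously into a buffer, and the analysis callback only runs once 
per batch, e.g. before a GC.

  #include <cstdint>
  #include <functional>
  #include <utility>
  #include <vector>

  // One recorded event: which hook fired, at which bytecode offset.
  struct AnalysisEvent {
      uint32_t hookId;
      uint32_t pcOffset;
  };

  class EventBatcher {
      std::vector<AnalysisEvent> buffer_;
      std::function<void(const std::vector<AnalysisEvent> &)> sink_;
      size_t capacity_;

    public:
      EventBatcher(size_t capacity,
                   std::function<void(const std::vector<AnalysisEvent> &)> sink)
        : sink_(std::move(sink)), capacity_(capacity)
      {
          buffer_.reserve(capacity_);
      }

      // Called from the instrumented code: a cheap append, no call into
      // the (possibly cross-compartment) analysis.
      void record(AnalysisEvent ev) {
          buffer_.push_back(ev);
          if (buffer_.size() >= capacity_)
              flush();
      }

      // Runs the analysis once per batch, e.g. before a GC, so a batch
      // does not keep the recorded data alive for too long.
      void flush() {
          if (buffer_.empty())
              return;
          sink_(buffer_);
          buffer_.clear();
      }
  };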


On an orthogonal aspect, we could isolate the analysis code from the 
analyzed code by making a separate compartment for the analysis.  This 
would provide boxing and unboxing as a safeguard, but it would be extremely 
expensive in terms of speed (without a batching system), and in terms of 
memory (without executing batches before GCs).  In addition to providing safe 
guards for people writing analyses, it avoids the pitfall of megamorphic 
calls.


Separating the analysis from the code being analyzed provides an additional 
advantage: we know what the analysis might be looking for. 
This implies that we could trace only the values which are being watched by the 
analysis, and thus avoid useless overhead.


[6] http://marijnhaverbeke.nl/acorn/

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] bailout return addresses

2014-06-10 Thread Nicolas B. Pierron

On 06/08/2014 09:27 PM, Cameron Kaiser wrote:

[Codegen] instruction CallGetIntrinsicValue
[Codegen] == push(immgcptr) ==
[Codegen] #label ((1068))
[Codegen] == push(imm) ==
[Codegen] 0201522c --- lis r0,913 (0x391)
[Codegen] 02015230 --- ori r0,r0,18768 (0x4950)
[Codegen] 02015234 --- stwu r0,-4(sp)
[Codegen] == callWithExitFrame(ion *) ==
[Codegen] == push(imm) ==
[Codegen] 02015238 --- li r0,2112 (0x840)
[Codegen] 0201523c --- stwu r0,-4(sp)
[Codegen] == call(JitCode) ==
[Codegen] 02015240 --- mfspr r0,lr
[Codegen] 02015244 --- bl .+4  lr = pc
[Codegen] 02015248 --- mfspr r12,lr    get pc into r12
[Codegen] 0201524c --- mtspr lr, r0
[Codegen] 02015250 --- addi r12,r12,32 (0x20)  push pc+32
[Codegen] 02015254 --- stwu r12,-4(sp)
[Codegen] 02015258 --- x_skip_this_jump
/* x_skip_this_jump is patched by the assembler to a lis/ori/mtctr stanza to
call the VM wrapper */
[Codegen] 0201525c --- nop
[Codegen] 02015260 --- nop
[Codegen] 02015264 --- bctrl   jump to CTR
/* VM gets called and returns here */
[Codegen] ##addPendingCall offs 0458 to 00cf1fa0
[Codegen] #label ((1128))
[Codegen] == push(reg) ==
[Codegen] 02015268 --- stwu r3,-4(sp)
[Codegen] == push(reg) ==
[Codegen] 0201526c --- stwu r4,-4(sp)


The return address (using these offsets) is 0x02015268. That's where it
should return to, if I understand what it's doing. By the way, it doesn't
look like I need to preserve LR and it was never saved in Baseline; it's
just here for paranoia.


The Safepoint return address is here as a convenient way to index the safepoint 
and to know where we can patch the code.  When we have an invalidation, we 
patch 4/8 bytes of the dead code below (which is not going to be 
executed once we return to this function) to register the pointer of the 
IonScript.




The return address doesn't correspond to where the OsiPoint got marked -- it
gets marked way down at 0x020153cc after the CallGetIntrinsic, or way back
at 0x020527c4 with the OsiPoint/MoveGroup. So it asserts in
IonScript::getOsiIndex() because the return address doesn't match any of the
recorded OsiPoints.


Indeed, the return address inside the Safepoint (the OsiPoint return 
address) should be set by the OsiPoint, and it does not correspond to the 
return address of the call inside the instruction which is using the Safepoint.


The return address which is on the stack is used to find the Safepoint. 
The return address which is in the Safepoint is used to locate the OsiPoint.


I would simply say that it is expected that the Safepoint does not contain 
its own return address, as we can use a Safepoint multiple times in one 
instruction.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] JS status report for Nightly 32

2014-06-10 Thread Nicolas B. Pierron

Hi Chris,

On 06/09/2014 06:42 PM, Chris Peterson wrote:

We have some very active community contributors for this release! nbp and
bbouvier have been mentoring many bugs for recover instructions, a
precursor for IonMonkey escape analysis and branch profiling.

Contributors working on Nightly 32 (in alphabetical order):

* Amol Mundayoor
* Heiher
* Inanc Seylan
* Julien Levesy
* Nathan Braswell
* Sankha Narayan Guria
* Sushant Dinesh
* Tooru Fujisawa

If I missed you, please let me know! If you need more bugs, just drop by
#jsapi on irc.mozilla.org. :)


Thanks Chris for doing this list. :)

It is amazing that we got so many new contributors working on the JS 
Engine, and I suggest others do the same; the rules are simple[1].


Also, with the Bugs Squashing Party[2] event in the Paris office, I am 
expecting to mentor new contributors over the weekend of the 21st/22nd of June.


[1] https://wiki.mozilla.org/Contribute/Coding/Mentoring
[2] 
https://www.eventbrite.com/e/mozilla-bugs-squashing-party-tickets-11619100041


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Hello all

2014-04-15 Thread Nicolas B. Pierron

Hi Sriram,

On 04/15/2014 09:33 AM, Sriram A S wrote:

I am new to SpiderMonkey group and wish to contribute in bug fixing and future 
developments. I have a working dev env already in my machine and going through 
the docs to understand the working of SpiderMonkey JS engine and get used to it.


Nice this is the first step for contributing. :)


I am not sure from where I need to start, so it will be great if someone can 
guide me. If there are any assignments for me, please let me know.


The JavaScript engine has multiple components such as the Parser, the Debugger 
API, the JITs, and so on … What are you interested in, and where do you want 
to contribute?


You should also join the IRC channels and get yourself known on 
irc.mozilla.org #jsapi as well as on #introduction.  These are the places 
where you can ask questions about the JS Engine and about how to start contributing.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Cake (& other beverages)

2014-04-08 Thread Nicolas B. Pierron

On 04/07/2014 02:32 PM, Jason Orendorff wrote:

What:  25 Minutes Of Cake And JavaScript
(bring your own cake)

A totally optional vidyo chat for random SM talk,
show & tell, brainstorming, etc.

When:  2nd and 4th Friday of each month,
starting this Friday,
10AM Mountain View Time.


Good idea,

I reserved a room in the Paris office for people who might want to join 
Benjamin and me there. (19:00 Paris time)


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Removing 'jit-tests' from make check

2014-04-04 Thread Nicolas B. Pierron

On 04/04/2014 03:39 AM, Daniel Minor wrote:

Just a heads up that very soon we'll be removing jit-tests from the make check target[1]. The 
tests have been split out into a separate test job on TBPL[2] (labelled Jit), have been running on Cedar for 
several months, and have been recently turned on for other trees. We've added a 
mach command, mach jittest, that runs the tests with the same arguments that make check currently does.


mach jittest ?

Is there any documentation which explains how to work only with the JS Shell 
using mach commands?  Does this change imply that every JS developer 
will have to compile the full browser just to work on the Shell?


The only documentation I know of [1] explains how to run configure & make.

[1] https://developer.mozilla.org/en-US/docs/SpiderMonkey/Build_Documentation

--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Having the compilation process produce a binary as well as a symbols file

2014-03-06 Thread Nicolas B. Pierron

On 03/05/2014 05:17 PM, Gary Kwong wrote:

How useful would it be for the compilation process to produce a binary as
well as a symbols file for everyone?


I will reformulate the question: how many bugs were hard to reproduce on 
other computers?  And were more dependent on the compiler output than on the 
configuration flags?



My usecase would be to be able to archive just the binary and symbols in a
cache folder so I don't need to compile it again when testing testcases.


Maybe you can just keep one full image for every mozilla-central merge, and 
only keep binary diffs compared to the parent commit.  I do not expect that 
the ~30 changes happening to the JS engine will shift things a lot.


Recovering the build of a changeset is then a matter of applying the right set 
of patches.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Getting the allocation point of every object

2014-02-27 Thread Nicolas B. Pierron

On 02/27/2014 03:02 AM, Brendan Eich wrote:

Fitzgerald, Nick wrote:

Or in self hosted code, right? Maybe the iterator { value, done } objects?


Are we optimizing away { value, done } objects that can't escape (from
iterators run afresh by for-of loops)? If not, is there a bug on file to so
optimize? If not, please file and cite here. Thanks,


Not as far as I know, and I do not think we can *properly* do anything like 
that before landing something similar to Bug 878503.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Better memory reporting of objects and shapes

2014-02-19 Thread Nicolas B. Pierron

On 02/18/2014 12:49 PM, Nicholas Nethercote wrote:

On Tue, Feb 18, 2014 at 1:57 AM, Nicolas B. Pierron
nicolas.b.pier...@mozilla.com wrote:


I think it might make sense to special-case the JSFunction class, so that we
can get the object prototype name in addition to the JSFunction class.


Interesting idea. What's the exact code for getting the object
prototype name from a JSFunction?


The corresponding JS code would be:

  obj.__proto__.constructor

obj.__proto__   maps to  JSObject::getProto(cx, obj, &proto)
   .constructor maps to  JS_GetConstructor(cx, proto)

Then, if the constructor is a JSFunction, you can extract the name:

RootedAtom name(cx);
if (constructor->isJSFunction())
    name = constructor->asJSFunction().displayAtom();

Which would be either the name of the function, or its inferred name.

This way, for the following code:

function Foo() {}
var x = new Foo();

We should be able to display Foo in the memory reporter.
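
For completeness, here is the lookup as one hedged sketch; the signatures 
below are approximated from the snippets above and were not checked against a 
real tree of that era.

  // Approximation only: names and signatures are taken from the snippets
  // above, not verified against a SpiderMonkey checkout.
  static JSAtom *
  MaybeGetConstructorName(JSContext *cx, JS::HandleObject obj)
  {
      JS::RootedObject proto(cx);
      if (!JSObject::getProto(cx, obj, &proto) || !proto)       // obj.__proto__
          return nullptr;
      JS::RootedObject ctor(cx, JS_GetConstructor(cx, proto));  // .constructor
      if (!ctor || !ctor->isJSFunction())
          return nullptr;
      return ctor->asJSFunction().displayAtom();  // name, or the inferred name
  }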

--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] JIT Inspector

2014-02-06 Thread Nicolas B. Pierron

On 02/05/2014 10:09 PM, Boris Zbarsky wrote:

JIT Inspector is a pretty awesome tool, but it's bitrotted slightly... In
particular:

1)  It doesn't seem to know anything about baseline.
2)  I've had a hard time making parts of it other than Ion Activity work.

Is there any interest in updating it to deal with the current state of the
world?


I am not a big fan of the current implementation of the JIT Inspector, 
especially since we have no tests, and this kind of issue is likely 
to happen again in the future.


I do not think we would get much value out of dumping the assembly of Baseline, 
and ICs especially might be harder.  On the other hand, I think it would be easy 
to dump IC chains as part of the PC Count interface.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Taint analysis in SpiderMonkey

2014-02-03 Thread Nicolas B. Pierron

Hi Stéphanie,

Thanks for looking again into this,

On 02/03/2014 02:08 AM, Stéphanie Ouillon wrote:

Ivan forwarded me the script Jim wrote to benchmark the impact of
tainting on SpiderMonkey (see attachments, I fixed bits in the patch to
apply it on recent mozilla-central code).


I looked at the patch as well as the benchmark[1], and I have multiple comments on 
them:
 - The patch does not instrument the CodeGenerator, which inlines CharAt[2]. 
This means that if we compile the JS function, only the concat 
instrumentation would be testing this flag.  (Also, the concat 
instrumentation is not needed, as we need to flatten a string before reading 
anything from it.)


 - AssertEq is a C++ function, and this would add some overhead for just 
doing a flatten.  Comparing the 2 strings in JS would be better, but then we 
also need to instrument the CodeGen.
 - Use an extra function which contains the inner loop, as we are only 
interested in this function and not in the top-level script.
 - Using a gc() call here may have a nasty effect with OSR 
(on-stack replacement), because we are jumping into Ion only from the 
outer loop, so I don't think we are even using the result of Ion's compilation.
 - The loop runs 10000 times, but results are only divided by 1000 (not 
really important)



I ran the tests several times (performance mode, in console) on commits


Your results are a bit noisy, did you pkill -18 firefox before running 
these benchmarks?



I don't know what was intended to be done after that, so I'm posting the
results here to have any feedback.


I think this is a way to highlight that checking this extra bit does not 
change the performance profile of the engine.  But it does not deal with 
the maintenance issue that I raised previously.


[1] https://gist.github.com/arroway/617c534a7e4cb24adeab
[2] 
http://dxr.mozilla.org/mozilla-central/source/js/src/jit/CodeGenerator.cpp#5006


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Replacing YARR

2014-01-06 Thread Nicolas B. Pierron

Hi,

On 01/05/2014 02:31 AM, julian.vier...@googlemail.com wrote:
 Before converting the entire Octane RegExp benchmark to run using
 RegExp.JS, I thought I would just try the first RegExp tested in the benchmark.
 This means, in terms of code changes:

diff --git a/regexp.js b/regexp.js
- var re0 = /^ba/;
+ var re0 = new RegExpJS(/^ba/);

Any reason why you are using the deconstructing RegExpJS function, instead 
of giving a string as an argument?


 var re0 = new RegExpJS("^ba");

--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] A question on Float32 codegen

2013-12-13 Thread Nicolas B. Pierron

On 12/12/2013 06:51 PM, Feng, Haitao wrote:

Why did not we use 4-byte stack slot and 4-byte snapshot slot for
float32 and use movss instead of movsd?


Can you be more precise?  The Snapshot slots are unlikely to be 4 bytes, 
because they are written into a compact buffer and they are not directly 
addressable.  Also, the snapshot slots are used to indicate the location, 
such as whether it is in a register or on the stack.


If you are talking about the spill of registers made for bailouts, then we 
will have trouble with Float32x4, as we are eagerly spilling Value-sized 
content at the moment.  It should not be a big deal to spill the full 
[xy]mm registers instead of the low parts.



If we introduced a FLOAT32_REG
type in LDefinition::Type and FLOAT32_REG in LAllocation::Kind,
it should be relatively easy to add Float32x4_REG.


I will talk about the MIRType, as this is the example I took at the time, 
but I think we can do the same thing for the LAllocation and LDefinition.


This is one thing I discussed with Benjamin Bouvier before he added 
Float32.  We should think of the future when doing such a design.  At the 
moment we are running conditionals to check if a type is a float32 or a 
double.  As we have many vector sizes (1, 2, 4, 8) and vector types (double, 
float, uint8, uint32), I think it would be better to abstract all of these 
and make the MIRType a structure which uses a bit-field to represent all 
of these vector types.


struct MIRType {
  enum Type {
    TYPE_VALUE,
    …,
    TYPE_DOUBLE,
    TYPE_FLOAT,
    TYPE_INT8,
    TYPE_INT16,
    TYPE_INT32
  };

  // Useful for finally supporting unsigned int8
  // and URSH without hacks.
  const uint32_t signedValue:1;

  const uint32_t padding_:12;

  // Shift index to obtain the number of elements in the vector.
  const uint32_t vectorScale:3;

  // use uint32_t instead of Type because of a
  // windows compiler issue.
  const uint32_t type:16;
};

static const MIRType MIRType_Value =
  {true,  0, /* 1 << */ 0, MIRType::TYPE_VALUE};
static const MIRType MIRType_Float32x4 =
  {true,  0, /* 1 << */ 2, MIRType::TYPE_FLOAT};
static const MIRType MIRType_UInt8x4 =
  {false, 0, /* 1 << */ 2, MIRType::TYPE_INT8};
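
With this encoding, the lane count falls out of the shift index; a 
hypothetical helper on the sketch above would be:

  // Hypothetical helper on the MIRType sketch: vectorScale is a shift
  // index, so MIRType_Float32x4 yields 1 << 2 == 4 elements.
  static inline uint32_t
  NumElements(const MIRType &type)
  {
      return uint32_t(1) << type.vectorScale;
  }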

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] thread safe and autoconf question

2013-11-30 Thread Nicolas B. Pierron

Hi Roelof,

On 11/29/2013 01:54 PM, Roelof Wobben wrote:

Can I somehow check with autoconf if mozjs185 is compiled with 
--enable-threadsafe --with-system-nspr ?


Usually, configure scripts leave some log files that are used to store 
all these details.  What you are interested in is likely a file named 
config.status in your build directory.  This file contains the list of 
substitutions made by the configure script.  Look for JS_THREADSAFE.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] [Fixed] Do Not Land: AWFY is not responding.

2013-10-15 Thread Nicolas B. Pierron

On 10/15/2013 01:11 PM, Jason Orendorff wrote:

On 10/15/13 12:40 PM, Nicolas B. Pierron wrote:

AWFY is our last barrier to prevent regressions.  Currently, all
slaves are down!

Until AWFY is fixed, I suggest that we do not land anything to the JS
Engine unless we provide benchmark results for sunspider and kraken.
V8 still runs on tbpl[1].


Filed, in case you want more info:
https://bugzilla.mozilla.org/show_bug.cgi?id=927094


For people who are not yet following the Bug, AWFY slaves are now collecting 
data again, so this is just a question of seeing the updated values.


The issue was caused by 3 (4?) unrelated bugs which happened simultaneously:
 - (?) The VM is slow at refreshing results.
 - x86/x86-64 computer harness ran into an infinite loop (?)
 - The ARM board rebooted and a lock file prevented the automatic restart.
 - The git mirror used by B2G builds got corrupted.

--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Taint analysis in SpiderMonkey

2013-08-19 Thread Nicolas B. Pierron

On 08/19/2013 10:14 AM, Jim Blandy wrote:

There are many issues here, but specifically regarding the runtime impact of
a DOMinator-style taint analysis when not in use:

Taint instrumentation is only needed in operations that allocate new strings
whose contents are taken from other strings. Such operations would gain a
branch per input (checking for taint), and a branch per output (checking
whether there was taint to be propagated). These branches sit alongside a
JSString allocation, and perhaps content copies. When taint is not in use,
the branches would be well-predicted (and we could annotate them unlikely,
if that would help).

That's not zero impact - but would you expect it to be measurable on
benchmarks?


Yes, I think this will damage performance in cases where people are 
building strings with a concatenation loop:


  for (var i = 0, ii = arr.length; i < ii; i++)
    s += String.fromCharCode(arr[i]);

PdfJS has a few of these, where an array/string is converted into a string, 
either to copy the content, or to go from base64 to some text.  This 
kind of code is also expected at the boundaries of typed arrays.


I agree, the trivial example above can be inferred, but the += is in 
question here, as we are allocating a JSStringRope for every operation.


So having hooks on the string allocation sounds like a terrible idea.

On the other hand, doing it as part of the flatten operation will remove 
half of the comparisons.  Still, I would expect some impact there.
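
To make the difference concrete, here is a toy standalone sketch (nothing 
here is real JSString code): concatenation just links rope nodes and stays 
check-free, and the single flatten walk is where taint gets propagated.

  #include <memory>
  #include <string>

  // Toy rope node; real JSString ropes are nothing like this. The point
  // is only *where* the taint check lives.
  struct Rope {
      std::string leaf;               // used iff this node is a leaf
      std::shared_ptr<Rope> lhs, rhs; // set iff this node is a concat
      bool tainted = false;
  };

  // Concatenation stays free of taint branches: it only links nodes.
  std::shared_ptr<Rope>
  Concat(std::shared_ptr<Rope> a, std::shared_ptr<Rope> b)
  {
      auto r = std::make_shared<Rope>();
      r->lhs = std::move(a);
      r->rhs = std::move(b);
      return r;
  }

  // Flatten walks the rope once; taint is OR-ed into the result here,
  // so the per-concat branches on the allocation path disappear.
  std::string
  Flatten(const Rope &r, bool *taintedOut)
  {
      if (!r.lhs) {   // leaf
          *taintedOut = *taintedOut || r.tainted;
          return r.leaf;
      }
      return Flatten(*r.lhs, taintedOut) + Flatten(*r.rhs, taintedOut);
  }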


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Taint analysis in SpiderMonkey

2013-08-16 Thread Nicolas B. Pierron

On 08/15/2013 05:53 PM, Jim Blandy wrote:

On 08/15/2013 11:29 AM, Nicolas B. Pierron wrote:

On 08/09/2013 02:59 PM, Jim Blandy wrote:

Ivan Alagenchev and Mark Goodwin asked me to take a look at their project to
bring DOMinator, a taint analysis for SpiderMonkey, […]


On Tuesday, Koushik Sen made a presentation which is available on Air
Mozilla[1] where he presented some JavaScript instrumentation which uses
a parser hook to rewrite the original script with some extensible
instrumentation.

I think it's important to consider both the scale of the effort required and
the results produced. Implementing something like the StringLabeller (pace
Brendan) hooks would be a different order of magnitude of effort than the
alternatives suggested here.

I need to watch that presentation, but I did see Sen's presentation at
JSTools 2013 in Montpellier. Without any intent to contradict, Jalangi's
record-and-replay-with-shadow-execution approach did not seem to me like a
low-maintenance tooling approach. Certainly, using shadow execution to
recover the details of execution drastically reduces what one needs to
record, and thus its runtime impact. But the combination of the recording
annotations and the shadow interpreter do not seem like a light maintenance
burden. Am I being pessimistic?


This is something more general, and it would have multiple purposes, 
which is likely to be more stable over time than just one corner 
application.  In addition, it can be customized by users, which would be a 
good way to remove the burden of the content of the analysis from the JS engine.


I would prefer a similar solution over a simple tainting solution which only 
considers intrusively annotating strings.  First, the performance impact 
would be isolated to people who are running the analysis.  Second, it is not 
as intrusive, because it would provide an alternate & contained path in the 
bytecode emitter, and the rest should remain unchanged.


Jalangi's approach is exactly like self-hosting the analysis without 
instrumenting the interpreter or the JITs, which means that we would have no 
performance issue induced by the instrumentation when users are not running 
any analysis.


In addition, this would lower the cost of adding any other analysis to the 
devtools, as the work would have to be done only once.  And web developers 
could even write their custom analyses, such as "do not hold cross-compartment 
wrappers except in these functions".



Further: having thought a bit more, I'm not sure that source-rewriting
techniques are going to be much better. Perhaps there's a beautiful trick
I'm not noticing, but it seems to me that making finer-grained distinctions
between strings than the language supports entails nothing less than a
self-hosted JavaScript interpreter, because you can't use strings
(meta-level) to represent strings (debuggee level).


From what I understand of Jalangi, you can add any kind of 
annotation by boxing the results, and remove the annotation by unboxing the 
operands around any operation.


The only aspect of it that I do not like is that they redefine the 
operations, which does not guarantee the correct behavior.  I think we can 
do better with maybeBox & maybeUnbox primitives and pre- & post- operations 
for updating the context.  In addition, we can manage to please TI and avoid 
the megamorphic operators they have in Jalangi.
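
As a standalone sketch of the maybeBox / maybeUnbox idea (no Jalangi or 
SpiderMonkey code here, all names invented): operands are unboxed so that the 
original operation runs unchanged, and the result is re-boxed with the 
propagated annotation.

  #include <optional>
  #include <string>

  // A value plus an optional analysis annotation (e.g. a taint label).
  template <typename T>
  struct Shadowed {
      T value;
      std::optional<std::string> label;
  };

  template <typename T>
  T maybeUnbox(const Shadowed<T> &s) { return s.value; }

  // Re-box the result, merging the operand annotations.
  template <typename T>
  Shadowed<T> maybeBox(T result, const Shadowed<T> &a, const Shadowed<T> &b)
  {
      return { result, a.label ? a.label : b.label };
  }

  // The '+' itself is untouched, so its semantics cannot drift: the
  // analysis only wraps around it instead of redefining the operation.
  Shadowed<int> add(const Shadowed<int> &a, const Shadowed<int> &b)
  {
      return maybeBox(maybeUnbox(a) + maybeUnbox(b), a, b);
  }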


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Taint analysis in SpiderMonkey

2013-08-15 Thread Nicolas B. Pierron

On 08/09/2013 02:59 PM, Jim Blandy wrote:

Ivan Alagenchev and Mark Goodwin asked me to take a look at their project to
bring DOMinator, a taint analysis for SpiderMonkey, […]


On Tuesday, Koushik Sen made a presentation which is available on Air 
Mozilla[1] where he presented some JavaScript instrumentation which uses a 
parser hook to rewrite the original script with some extensible instrumentation.


In his presentation, he highlights that only hundreds of lines are necessary 
to instrument a few parts of the engine with one kind of analysis.  One of 
the examples is taint analysis, which is written so that it works on objects.


The parser hook is used to transform the source so that it can replace 
JavaScript operations with function calls which perform the original 
operation.  The example given in the presentation is that 'x + y' is 
transformed into Binary('+', x, y).


I think this is the kind of problem that we can address at the bytecode 
emitter level.  We could make a second bytecode emitter which generates 
calls into some non-instrumented code registered for the analysis.  I think 
we could require that the analysis be loaded ahead of time, and only 
generate hooks if there is any instrumentation.
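
In shape, the emitter branch could be as simple as this (a standalone toy, 
not the real BytecodeEmitter; all names invented): the instrumented path is 
only taken when an analysis was registered ahead of time, so the normal path 
stays untouched.

  #include <cstdio>

  enum Op { OP_ADD, OP_CALL_HOOK };

  // Toy emitter: the hook opcode is generated only if an analysis was
  // loaded ahead of time.
  struct Emitter {
      bool instrumented;

      void emitAdd() {
          if (instrumented) {
              // 'x + y' becomes a call into the registered analysis
              // code, i.e. something like Binary('+', x, y).
              emit(OP_CALL_HOOK);
          } else {
              emit(OP_ADD);  // the normal, zero-overhead path
          }
      }

      void emit(Op op) { std::printf("emit %d\n", op); }
  };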


This approach would be way slower than the tainting suggested by the 
DOMinator project, but at the same time it serves a generic purpose for 
instrumenting the engine in any customizable way.


As he suggested in the presentation, we might want to make this visible to 
the user.  This means that we need to think of a way to provide some 
differential testing, so that analysis developers can check that they are 
not changing the behavior of the manipulated program (unless expected).


[1] https://air.mozilla.org/test-and-cure-your-javascript-blues-with-jalangi/

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Conformance testing of the JavaScript engine(s?)

2013-08-11 Thread Nicolas B. Pierron

On 08/11/2013 06:00 AM, David Bruant wrote:

And I was wondering if conformance and regression tests were run against all
3-4 engines (since each compilation step may introduce its own bugs) or
whether only the combination was tested.
I imagine that when I run [1], only the conformance of the interpreter is
tested, because no test really has time to become warm or hot (not even the
test harness, since it's injected in a new iframe for each test for good
reasons).
reasons).


As Terrence mentioned, we do have eager compilation, even if eager 
compilation might not be enough to test conformance.  The fact that the 
tests are present helps fuzzers generate new tests by mutation.  Such tests 
can then be used in differential testing to check for different behavior 
between the interpreter and the compiled code.


So as soon as we get it right in the interpreter, the rest should catch up if 
we forgot to update some corner cases of the JITs.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Taint analysis in SpiderMonkey

2013-08-09 Thread Nicolas B. Pierron

Hi,

On 08/09/2013 02:59 PM, Jim Blandy wrote:

The taint analysis applies to strings only, and has four parts:

  * It identifies certain *sources* of strings as tainted:
document.URL, input fields, and so on.
  * The JavaScript engine propagates taint information on strings.
Taking a substring of a tainted string, or concatenating a tainted
string, yields a tainted string. Regexp operations, charAt, and so
on all propagate taint information. And so on.
  * It identifies certain *sinks* as vulnerable: eval, 'src' attributes
on script elements, and so on.
  * Finally, the tool's user interface logs the appearance of tainted
strings at vulnerable sinks. The taint metadata actually records the
provenance of each region of a tainted string, so the tool can
explain exactly why the final string is tainted, which is really
helpful in constructing XSS attacks.

[…]

I looked at the code in the github repo at
https://github.com/alagenchev/spider_monkey. It seems to me that the
following issues need to be addressed:

  * The way the taint metadata is stored needs to change. A linked list
isn't appropriate for operating at scale.


I don't think we should use tainting.  The extra bit on the length is 
currently reserved to optimize strings which would be represented as an int.


The goal of the tainting is to reconstruct the inverted data-flow graph, 
i.e. finding the origin of a string which flows into a function.  And the 
data-flow graph is basically what is monitored when we register that we can 
see a new value flowing into a store at a specific code location.


I think that if we want to capture this kind of information, we should at 
least do it in such a way that we can also use it to improve our 
performance.  If we are able to isolate the data flow, we could optimize our 
data representation based on guarded invariants of the data flow (dynamic 
deforestation?), and with the support of a moving GC, we could 
optimize/deoptimize the value representation on GCs.  [To be seen as a JIT 
compiler for the data flow, instead of only having JIT compilers for the 
control flow.]



  * The patch adds functions on 'String', and an accessor on
'String.prototype'. We can't really add methods directly to String,
for the sake of web compatibility. Rather, we should handle taint
the way we handle Debugger.


I agree, we should make this visible from the developer tools, but not directly 
added to the String interface.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Taint analysis in SpiderMonkey

2013-08-09 Thread Nicolas B. Pierron

On 08/09/2013 05:27 PM, Jim Blandy wrote:

On 08/09/2013 04:29 PM, Nicolas B. Pierron wrote:

The goal of the tainting is to re-construct the inverted data-flow graph,
i-e finding the origin of a string which flow into a function. And the
data-flow graph is basically what is monitored when we register that we
can see a new values flowing inside a store at a specific code location.

I think that if we want to capture this kind of information, we should at
least make it in such a way that we can also use it to improve our
performance.  If we are able to isolate the data-flow, we could optimize
our data representation based on guarded invariants of the data-flow
(dynamic deforestation?), and with the support of a moving GC, we could
optimize/deoptimize the value representation on GCs.  [to be seen as a JIT
compiler for the data flow instead of only having JIT compilers for the
control flow]


It's true that, in principle, the flow graph the compiler uses and the flow
graph taint analysis uses are the same. But in practice they're very different.

  * Taint is concerned with flow *through* string primitives:
concatenation, substring, regexp match extraction, and so on. The
compiler doesn't know much about those operations, and so is only
concerned with getting them their arguments, and delivering their
results to the right place. It doesn't relate their inputs to their
outputs.


This is a problem of instrumentation, and it would still exist even with 
tainting.  Also, as I mentioned to Ivan, monitoring strings is an 
approximation, as a string might be given to JSON.parse or converted into an 
Array/TypedArray.



  * Taint needs to dynamically observe the flow of values in specific
actual executions. If a particular branch isn't taken, then the
not-executed code shouldn't affect taint results. But the compiler
needs to reach conservative conclusions that hold on all possible
executions.


This would be true in the case of a static compiler, but in the case of a 
dynamic compiler, we can omit information based on the monitored flow.  In 
fact, TI already restricts the possible types to the observed types, and it 
is for this precise reason that we need to insert type barriers in 
IonMonkey's code, when the set of observed types is not equal to the upper 
bound calculated by the type inference.
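
In sketch form (a standalone illustration of the concept, not the actual TI 
implementation):

  #include <set>

  enum class JSType { Int32, Double, String, Object };

  // Sketch of a type barrier: the compiled code is specialized for the
  // types observed so far (a subset of TI's inferred upper bound); any
  // value outside that set forces a bailout, after which the observed
  // set is widened and the code can be recompiled.
  struct TypeBarrier {
      std::set<JSType> observed;

      bool check(JSType incoming) {
          if (observed.count(incoming))
              return true;           // fast path: stay in compiled code
          observed.insert(incoming); // widen the observed set
          return false;              // bailout, recompile later
      }
  };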



What you propose would require substantial contributions from a group of
engineers (IonMonkey and GC hackers) that is in high demand; it's hard for
me to imagine taint support becoming a sufficient priority for that team -
especially since it's an unproven approach. In contrast, the taint analysis
I brought up here has been prototyped and shown to be valuable, and is
within reach of a volunteer (Ivan) from the security team.


One of the reasons why I would prefer us to depend on such information is 
that our focus is set on performance.  If a bug or an incorrect value 
appears in the analysis, then it would likely be related to a performance 
issue or an incorrect behavior.  The reason why I want to find a 
performance justification for doing this analysis is that we could then rely 
on it and make it better.


As a side note: currently, we conditionally maintain an artificial stack 
beside the Interpreter, Baseline and IonMonkey.  This stack is only used 
by the Gecko profiler.  Worse, tbpl does not even run tests on the JS Engine 
to ensure that we keep it in a correct shape.  So using the information 
collected by this profiling could be helpful in many ways, such as finding 
functions which are worth keeping across GCs.


Another example is the type inference.  Currently we collect a lot of 
information which is valuable for the developer tools.  Sadly, we do not 
have a well-detailed API to make it usable outside the engine.  But the 
fact that we rely on it ensures that the type information we see would be 
better than that of any static analysis tool.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] [GSoC 2013] Project Ideas

2013-04-23 Thread Nicolas B. Pierron

Hi,

On 04/22/2013 10:05 PM, Wei WU(吴伟) wrote:

Hi,


On Sun, Apr 21, 2013 at 8:37 AM, Nicolas B. Pierron 
nicolas.b.pier...@mozilla.com wrote:


On 04/20/2013 02:41 AM, 吴伟/Wei WU wrote:


On Fri, Apr 19, 2013 at 10:01 AM, Nicolas B. Pierron 
nicolas.b.pier...@mozilla.com wrote:




  - Adding profile guided optimization, the idea would be to profile which

branches are used and to prune branches which are unused, either while
generating the MIR Graph, or as a second optimization phase working on
the
graph.  A trivial way of doing so can just look at the jump target which
have been visited so far.  A more clever one might register the
transition
and use Markov chain to determine relations between branches, and
duplicate
a sub-part of the graph to improve constant propagation and the range
analysis.  Before doing anything fancy like the more-clever case we still
need to check that this is worth it and see the benefit of other
optimizations if we fold the graph.

  I'm interested in this idea and I'm willing to implement it. A simple

and
fast algorithm may be a good choice for GSoC, while the advanced methods
require more investigation. I haven't found any branch profiling code in
the interpreter so the instrumenting mechanism must be finished first, or
we can leverage baseline compiler to generate self-instrumented jit code.
Profiling data can be saved separately or in MIR nodes as an annotation.
In
the second case more other passes may benefit from it easily.



Indeed, there is no code for profiling yet.  The easiest way to think
about it is to do like the write barrier of the incremental GC, i.e. having
a buffer (potentially circular, as opposed to the write barrier) which
is filled with jump targets.  Thus, every time we jump to another location
in the code, we push the location of the offset into this buffer, and later we
reuse this buffer to re-organize basic blocks in IonMonkey and to duplicate
/ prune basic blocks if needed.



There are two possible ways to store branch profiling data. One is count
array, the other is target buffer.
I've read LLVM's implementation of Branch(Edge) Profiling and found that it
uses a vector (array) to save the information. LLVM allocates two counters
for each branch/edge. When a block jumps to another, the correlated counters
will be incremented by 1. Then these frequencies will be transformed into
probabilities and consumed by some transform passes. Jikes RVM stores
branch profiling data in the similar way.
One possible problem is that they use block ID to index branch counters and
we don't have such things in JavaScript bytecodes.


Indeed, but we can use the jsbytecode pointers, as targets, and later 
convert them to basic block numbers when we process the branch profile.



On the other hand, branch target buffer maintains a sequence of branch
targets and the index problem can be avoided. Furthermore, it makes it
possible to determine relations between branches. The main problem I have
considered is that the cost of calculating branch probabilities might be
high, since we must traverse the buffer to summarize the results.


I don't think the cost would be high; current CPUs are good at prefetching 
memory ahead of time on linear reads of a buffer, which is likely the case 
that we will encounter.  What you would then need is a way to map the targets 
to the basic blocks to recover the counters.


It might be good, as a first step, to have a circular buffer to experiment 
with the branch prediction, and later reduce it to simple counters if the 
buffer proves to be a major pitfall in terms of performance, or if we do 
not have any benchmark which can take advantage of a more clever way of 
inferring branch ordering.
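
A rough standalone sketch of such a circular buffer (illustration only; 
jsbytecode below stands in for the real bytecode type):

  #include <cstddef>
  #include <cstdint>

  typedef uint8_t jsbytecode;   // stand-in for the real bytecode type

  // Fixed-size circular buffer of jump targets: recording is one store
  // plus one increment, so the cost in Baseline code stays small.
  class BranchBuffer {
      static const size_t Size = 4096;   // power of two
      const jsbytecode *targets_[Size];
      size_t cursor_;

    public:
      BranchBuffer() : cursor_(0) {}

      void recordJump(const jsbytecode *target) {
          targets_[cursor_++ & (Size - 1)] = target;
      }

      // A later linear pass maps targets back to basic blocks and builds
      // counters (or Markov-style transition statistics) from them.
      template <typename F>
      void forEach(F &&f) const {
          size_t n = cursor_ < Size ? cursor_ : Size;
          for (size_t i = 0; i < n; i++)
              f(targets_[i]);
      }
  };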



I have encountered some other issues. Could you give me some suggestions?
- Instrumenting the Interpreter is fairly easy, while I'm not sure it is
possible to instrument the baseline compiler in a similar way. I think it's
better to support them both.


As opposed to TI, which needs to keep a consistent model of the types flowing 
in, there is no restriction on taking only a subset of the profiling data.


Profiling only in Baseline's code should be enough, and would avoid 
adding extra cost to the interpreter, which is used by all scripts which 
run only a few times.



- Overhead. Instrumenting every conditional jump may degrade performance,
and an 'accepted overhead rate' should be considered. Also, it should be
possible to switch it on/off by command line arguments.


I agree with the CLI switch.  The accepted overhead is well-defined as 
being our benchmark score and the memory usage.  If you can improve our 
benchmark score without taking too much memory, then the performance would be 
acceptable.



- Store/Restore profiling data on disks. Branch profiling data can be
stored and restored in LLVM and Jikes RVM. But I don't think it is
necessary in SpiderMonkey.


Indeed, I don't think it would be necessary either, and it would be even 
more

Re: [JS-internals] [GSoC 2013] Project Ideas

2013-04-20 Thread Nicolas B. Pierron

Hi,

On 04/20/2013 02:41 AM, 吴伟/Wei WU wrote:

On Fri, Apr 19, 2013 at 10:01 AM, Nicolas B. Pierron 
nicolas.b.pier...@mozilla.com wrote:


- Clarifying our heuristics, to be able to make guesses while we are
compiling in IonMonkey, and recompile without the guarded assumptions if
the bailout paths are too costly.  Our current view is mostly black & white,
and only one bad use case can destroy the performance.  We need to
introduce some gray view, saying that we are compiling for the likely
subset and accept bailouts as part of the normal lifetime of a script.  As
of today, IonMonkey is too much like a normal compiler; we should make it
an assumption-based compiler.




I found a bug (#825268,
https://bugzilla.mozilla.org/show_bug.cgi?id=825268) that
may be related to this project. According to the description of that bug I
realized that my understanding of the term 'heuristics' is relatively naive
(the strategy used by an optimization algorithm to modify the intermediate
representation is based on a few expressions calculated from the structure
of source codes.) and I think the heuristics you mentioned are more generic
than that.

If I understand correctly, 'compiling for the likely sub-set' means
that we can compile multiple versions of a method and execute one of them
based on which one's assumptions are satisfied. For example:

function g(x){
   ...use x...
}
function f(){
   for (var i = 0; i < 10000; i++){
     if (i % 1000){
       g(i);
     } else {
       g('string');
     }
   }
}

Currently IonMonkey compiles g(x) with the guard assert(typeof x === int),
and the jit code will be invalidated periodically. If g(x) can be compiled
with the assumption assert(typeof x in {int, string}), then the bailouts
would be avoided.

Am I right?


If g is not inlined, IonMonkey will compile with a guard to ensure that x is 
an Int.  As soon as a call is made with a string, we will discard the code, 
and later recompile it with a larger type set (int & string).  The problem 
is that doing operations on something which can be either an int or a string 
will always be slow, as we fall back on VM function calls.  In your example 
the string case can be executed in the baseline compiler while the Int case 
can remain in Ion-compiled code.  Doing a bailout might still be worth it 
compared to the price of a recompilation and a slower execution for the 
most likely use case.


The question which remains behind is: Should we keep or discard the code? 
To be able to answer this question we need some kind of heuristic.  And 
before making heuristics, we need a meaningful metric.  The only metric that 
is meaningful in such cases is time, but not ordinary time, as we are at 
the mercy of the scheduler.  Bug 825268 suggests cleaning up the way we are 
using the use-count to make it a reliable & *deterministic* source of time 
based on the execution of scripts.
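
As a strawman for the kind of heuristic such a metric enables (the counters 
and threshold below are invented, purely for illustration): compare the 
deterministic time credited to the compiled code against the bailouts it 
caused before deciding to discard it.

  #include <cstdint>

  // Strawman only: counters and threshold are invented for illustration.
  struct ScriptCounters {
      uint64_t useCount;      // deterministic "time": entries + loop heads
      uint64_t bailoutCount;  // how often the guarded assumptions failed
  };

  // Keep the Ion code while bailouts stay rare relative to useful runs;
  // discard (and recompile with wider assumptions) once they dominate.
  bool shouldDiscard(const ScriptCounters &c)
  {
      const uint64_t bailoutWeight = 100;   // made-up relative cost
      return c.bailoutCount * bailoutWeight > c.useCount;
  }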



- Adding profile guided optimization, the idea would be to profile which
branches are used and to prune branches which are unused, either while
generating the MIR Graph, or as a second optimization phase working on the
graph.  A trivial way of doing so can just look at the jump target which
have been visited so far.  A more clever one might register the transition
and use Markov chain to determine relations between branches, and duplicate
a sub-part of the graph to improve constant propagation and the range
analysis.  Before doing anything fancy like the more-clever case we still
need to check that this is worth it and see the benefit of other
optimizations if we fold the graph.


I'm interested in this idea and I'm willing to implement it. A simple and
fast algorithm may be a good choice for GSoC, while the advanced methods
require more investigation. I haven't found any branch profiling code in
the interpreter so the instrumenting mechanism must be finished first, or
we can leverage baseline compiler to generate self-instrumented jit code.
Profiling data can be saved separately or in MIR nodes as an annotation. In
the second case more other passes may benefit from it easily.


Indeed, there is no code for profiling yet.  The easiest way to think 
about it is to do like the write barrier of the incremental GC, i.e. having a 
buffer (potentially circular, as opposed to the write barrier) which is 
filled with jump targets.  Thus, every time we jump to another location in 
the code, we push the location of the offset into this buffer, and later we 
reuse this buffer to re-organize basic blocks in IonMonkey and to duplicate 
/ prune basic blocks if needed.



Two bugs (#410994 https://bugzilla.mozilla.org/show_bug.cgi?id=410994,
#419344 https://bugzilla.mozilla.org/show_bug.cgi?id=419344) have
mentioned PGO but may be irrelevant to this idea.


Indeed, they are irrelevant for doing PGO on JavaScript.


Other cases of smaller projects might be:

- Improving our Alias Analysis to take advantage of the type sets (this
might help a lot on Kraken benchmarks, by factoring

Re: [JS-internals] [GSoC 2013] Project Ideas

2013-04-20 Thread Nicolas B. Pierron

On 04/20/2013 05:37 PM, Nicolas B. Pierron wrote:

where Foo is a different type than Bar.  This implies that both b's are part
of different objects, which means that the previous JavaScript function can
be transformed to:

function f(a, arr) {
   a.b = 0; /* Assume shape of a is different than the shape of arr[i] */
   for (var i = 0; i < arr.length; i++) {
     arr[i].b = 1;
   }
}


Sorry, this is true only if we can prove that the loop body is executed at 
least once.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] [GSoC 2013] Project Ideas

2013-04-18 Thread Nicolas B. Pierron
, the idea would be to profile which 
branches are used and to prune branches which are unused, either while 
generating the MIR Graph, or as a second optimization phase working on the 
graph.  A trivial way of doing so can just look at the jump targets which 
have been visited so far.  A more clever one might register the transitions 
and use Markov chains to determine relations between branches, and duplicate 
a sub-part of the graph to improve constant propagation and the range 
analysis.  Before doing anything fancy like the more-clever case we still 
need to check that this is worth it and see the benefit of other 
optimizations if we fold the graph.


Other cases of smaller projects might be:

- Improving our Alias Analysis to take advantage of the type sets (this might 
help a lot on Kraken benchmarks, by factoring out array accesses).


- Improve dummy functions used for asm.js boundaries.  Asm.js needs to 
communicate with the DOM, and to do so it needs some trampoline functions 
which are used as an interface with the DOM API.  Such trampolines might 
transform typed arrays into strings or objects and serialize the result back 
into typed arrays.


--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Long-running compilation steps

2013-04-15 Thread Nicolas B. Pierron

On 04/14/2013 11:41 PM, Nicholas Nethercote wrote:

Hi,

For https://bugzilla.mozilla.org/show_bug.cgi?id=842800 I'm interested
in adding more checks of SpiderMonkey's operation callback in
potentially long-running operations.  For example, when running the
asm.js Unreal demo, parsing takes several seconds, so in my patch I
added a check after the parsing of each statement.

I'm also seeing that IonMonkey compilation can take a while on the
demo.  Is there a loop or loops in IonMonkey that are good candidates
for adding operation callback checks?


IonMonkey compilation is currently sequential for the IonBuilder phase and 
the CodeGenerator::link phase.  All the rest might run in a separate thread 
(or not, in the case of single-core architectures).


Before the compilation we allocate a lifo-alloc which gets extended if needed 
by the compilation.  As far as I know, everything which goes into the 
lifo-alloc is considered as dark matter from the about:memory point of view.


I am not sure what you are trying to look at, but what might be interesting 
would be to look at the usage of the lifo-alloc.  It would be safe to do it 
at the creation of each basic block, and in MIRGenerator::shouldCancel 
(called at each loop iteration on basic blocks) if it is safe to report such 
status from another thread.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Instruction scheduling and selection in IonMonkey

2013-03-15 Thread Nicolas B. Pierron

Hi,

On 03/15/2013 03:20 AM, Ting-Yuan Huang wrote:

It seems that there's no instruction scheduler in IonMonkey. If so, may I know 
why? Modern processors should benefit a lot from an instruction scheduler. 
I'd like to know if it is worth doing so before diving in :-)

Also I didn't see a formal (as appears in textbooks) instruction selector, 
such as tiling a tree/DAG by dynamic programming, or a peephole optimizer. I'm not sure, 
but it seems that the quality of instruction selection relies on the lowering process 
from MIR to LIR, so that a direct mapping from LIR to assembly code is efficient enough, 
right?


Indeed, our macro assembler writes directly into the buffer.  At the 
same time, the code that we are producing contains many checks, which might 
make it hard for assembly optimizations to trigger, as we need to handle 
corner cases such as bailouts.


In IonMonkey's case, I think this might be interesting in terms of code size 
and avoiding redundant operations, like avoiding test operations after ALU 
operations when we are checking whether the last computed register is zero, 
and also to get rid of the scratch register initialization on x64.  But I 
guess this would mostly be a code-size issue.


In the asm.js case (codename OdinMonkey), I think this might be interesting to 
test, as we are trying to recover the assembly out of infallible JavaScript 
(except for ARM bounds checks).  I don't know if the quality of the assembly 
that we are producing is good enough or not; Luke and Marty might know more 
about it.


In the case of ARM, I don't know what the impact of such optimizations would be.

--
Nicolas B. Pierron

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Exact rooting progress script

2012-11-06 Thread Nicolas B. Pierron

On 11/06/2012 03:18 PM, Terrence Cole wrote:

+------------------+
| m-i/js/xpconnect |
+------------------+
IonCode   : 2


This surprises me, so git grep IonCode in js/xpconnect returns:

src/XPCJSRuntime.cpp-    CREPORT_GC_BYTES(cJSPathPrefix + NS_LITERAL_CSTRING("gc-heap/ion-codes"),
src/XPCJSRuntime.cpp:                     cStats.gcHeapIonCodes,
src/XPCJSRuntime.cpp-                     "Memory on the garbage-collected JavaScript "
src/XPCJSRuntime.cpp-                     "heap that holds references to executable code pools "
src/XPCJSRuntime.cpp-                     "used by IonMonkey.");

src/nsXPConnect.cpp-static const char trace_types[][11] = {
src/nsXPConnect.cpp-    "Object",
src/nsXPConnect.cpp-    "String",
src/nsXPConnect.cpp-    "Script",
src/nsXPConnect.cpp:    "IonCode",
src/nsXPConnect.cpp-    "Xml",
src/nsXPConnect.cpp-    "Shape",
src/nsXPConnect.cpp-    "BaseShape",
src/nsXPConnect.cpp-    "TypeObject",
src/nsXPConnect.cpp-};

So you can remove 2 from the count-down. :)

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] Inlining heuristics and type inference

2012-08-01 Thread Nicolas B. Pierron

Hi Igor,

On 08/01/2012 12:50 PM, Igor Rafael wrote:

Hi guys,

 I would like to change the inlining heuristics in IonMonkey, so that a 
function would be inlined the first time it is met in the program flow. To do 
this, I have commented out the code:

if (script->getUseCount() < checkUses) {
    IonSpew(IonSpew_Inlining, "Not inlining, caller is not hot");
    return false;
}


The heuristic should be fine with the current mode of compilation, which 
implies running JM first and IonMonkey second on recompilation.  This 
means we are unlikely to hit this condition unless we hit an invalidation 
bailout, which will reset the counter of the inlined script.


If you run with IonMonkey first (--no-jm), then this check is likely to fail, 
because the use count is incremented at loop heads and at the function entry 
point.  This means that we are likely to have a smaller use count (off by 1) 
for inlined functions which are only called once per loop iteration.  It 
might be interesting to tweak this heuristic by expecting one less on the 
use-count counter of the inlined function, to balance the entry point.



However IonMonkey is answering me "Cannot inline due to oracle veto". I have 
tried to follow the calls, and this code seems to be related to the type inference 
algorithm. It seems that the type inference engine has not been executed at that point in 
time.
I would like to know what I could do to disable this veto. If it is not abusing 
your patience, I would like to know also when this type inference engine 
runs and why it does not run before the first compilation.


You will need to run the TypeInferenceOracle, which is just a wrapper on top 
of the Type Monitoring & Inference.  Look at ion/TypeOracle.cpp, the init 
function.


--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals


Re: [JS-internals] How do I see the assembly generated for a function in firefox

2012-08-01 Thread Nicolas B. Pierron

Hi Paddy,

On 08/01/2012 01:46 PM, Paddy Mullen wrote:

From Firefox, how do I know what type inferences are being picked up for my 
JavaScript code?  How do I see the generated assembly?  Do I have to use a raw 
build of IonMonkey?


From Firefox, I recommend you use the Code (JIT) Inspector made by 
Brian Hackett.  This is a Firefox extension which can display more 
information about your JavaScript.  Sadly we have no good UI to clarify its 
output.


If you want to see the bytecode, then you might use the functions used by 
this extension:


  const Ci = Components.interfaces;
  var utils = window
      .QueryInterface(Ci.nsIInterfaceRequestor)
      .getInterface(Ci.nsIDOMWindowUtils);

  utils.startPCCountProfiling();
  … some code to profile …
  utils.stopPCCountProfiling();

  var count = utils.getPCCountScriptCount();
  for (var i = 0; i < count; i++) {
      var summary = JSON.parse(utils.getPCCountScriptSummary(i));
      var detail = JSON.parse(utils.getPCCountScriptContents(i));
      …
  }
  utils.purgePCCounts();

The previous technique does not work yet with IonMonkey.  Once Bug 771118 is 
fixed, the previous extension might work with IonMonkey too.


If you want to look at the details of IonMonkey, I would not recommend 
looking at the assembly first, but rather at the codegen and at the output of 
iongraph[1].  When you run the shell with IONFLAGS=logs, IonMonkey will emit 
some spew into a temporary file which is picked up by iongraph.


If you absolutely want to look at the assembly, I recommend you use 
gdb to disassemble the code which is produced by IonMonkey.


[1] https://github.com/sstangl/iongraph

--
Nicolas B. Pierron
___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals