[JS-internals] Improving our internal documentation.
Hi,

SpiderMonkey's internal documentation is sometimes lacking or out-of-date. The JIT team met to discuss ways to improve this state of affairs. One of the ideas was to open a meta-bug to track internal documentation issues.

If, while reading the code, you find a place where documentation is missing or out-of-date, or you just spent 30 minutes on IRC trying to understand the existing code, then take 10 seconds to file a bug as a blocker of: Bug SMDOC: https://bugzilla.mozilla.org/show_bug.cgi?id=SMDOC

As a SpiderMonkey developer, you then have the responsibility to fix some of these bugs at your own pace, knowing that in a few years you might be the person asking these questions.

Also note that some of this documentation effort, when it goes in depth into the description of some component, might be worth a blog post. Readers of the JavaScript blog are more than likely interested in your prose, and might give you the feedback needed to make the documentation understandable to anybody who has never seen SpiderMonkey before.

___
dev-tech-js-engine-internals mailing list
dev-tech-js-engine-internals@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Let's crowdsource JS shell flag combinations for fuzzing
On Tuesday, May 15, 2018 at 2:42:16 PM UTC, Benjamin Bouvier wrote:
> let's add a text file containing a list of interesting JS shell flag
> combinations so that our fuzzing people can parse this file and
> automatically pick random combinations from it.

One question: should these combinations of flags be named, so that we can bisect with a named combination of flags even if the individual flags are renamed?
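To make the proposal concrete, here is a minimal sketch of what parsing such a file and picking a random named combination could look like. The file format ("name: --flag --flag" per line) and all identifiers are assumptions for illustration, not an agreed-upon design:

```cpp
// Sketch: parse a text file of named JS shell flag combinations and pick
// one at random. The "name: flags..." format is a hypothetical choice.
#include <cstdint>
#include <iterator>
#include <map>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Parse "name: --flag --flag" lines into a name -> flags map.
// Blank lines and '#' comments are skipped.
std::map<std::string, std::vector<std::string>>
parseFlagFile(std::istream& in) {
    std::map<std::string, std::vector<std::string>> combos;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#')
            continue;
        auto colon = line.find(':');
        if (colon == std::string::npos)
            continue;
        std::string name = line.substr(0, colon);
        std::istringstream flags(line.substr(colon + 1));
        std::vector<std::string>& v = combos[name];
        std::string flag;
        while (flags >> flag)
            v.push_back(flag);
    }
    return combos;
}

// Pick one combination uniformly at random (seeded for reproducibility,
// which also supports the bisect-by-name idea above).
const std::vector<std::string>&
pickRandom(const std::map<std::string, std::vector<std::string>>& combos,
           uint64_t seed) {
    std::mt19937_64 rng(seed);
    auto it = combos.begin();
    std::advance(it, rng() % combos.size());
    return it->second;
}
```

Naming each combination, as asked above, would let a bisection script refer to "ion-off" even after the underlying flags are renamed.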
Re: [JS-internals] JS_STACK_GROWTH_DIRECTION
I agree with Jan: all the logic of the frame iterator is based on the stack growing down, as is the way the pointers are coerced and interpreted. For now, I think it is safe to assume that the stack grows down in the generated code (jit & wasm directories).

On 11/17/2017 08:39 AM, Jan de Mooij wrote:
> IMO it's okay to rely on the stack growing down in JIT code. All of our
> JIT backends work like that right now, and if this ever changes we would
> have to refactor/audit a ton of things anyway (all callers of masm.push
> and masm.getStackPointer would be a good start).
>
> Jan
>
> On Fri, Nov 17, 2017 at 9:06 AM, Lars Hansen <lhan...@mozilla.com> wrote:
>> JS_STACK_GROWTH_DIRECTION is normally -1 (down) but is defined as 1 (up)
>> for HPPA. Does anyone test with stack-growing-up any more? (I know HPPA
>> is tier-3, at best.) Do any of you think about this possibility when you
>> write code? When you masm.Push something and you need the address of the
>> pushed item, do you worry about whether you should capture the stack
>> pointer before or after you push?
>>
>> The wasm baseline compiler currently assumes that the stack grows down,
>> so I guess I can just add an assert there and somebody can fix that if
>> it becomes necessary, but it would be nice to know if I should be
>> worrying about this at all.
>>
>> (It looks like stack direction is now like floating point and endianness
>> -- all mainstream systems agree, at last. I just hope everyone will get
>> around to adopting TSO before too long.)
>>
>> --lars

--
Nicolas B. Pierron
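For readers unfamiliar with the convention, stack growth direction can be probed with a small check in the spirit of JS_STACK_GROWTH_DIRECTION. This is an illustrative sketch only: comparing addresses of locals in different frames is formally unspecified behavior in C++, although it works on mainstream toolchains, and the noinline attribute is a GCC/Clang extension:

```cpp
// Probe whether the native stack grows down (-1) or up (1) by comparing
// the address of a local in a caller with one in a non-inlined callee.
// Technically unspecified behavior; fine as a mental model, not as
// production code.
#include <cassert>

__attribute__((noinline)) static int compareFrames(char* callerLocal) {
    char calleeLocal;
    // On a down-growing stack, the callee's frame sits below the caller's.
    return &calleeLocal < callerLocal ? -1 : 1;
}

int stackGrowthDirection() {
    char callerLocal;
    return compareFrames(&callerLocal);
}
```

On all the mainstream systems the thread mentions, this returns -1.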
Re: [JS-internals] Should we remove TraceLogger?
On 08/11/2017 04:38 AM, Sean Stangl wrote:
> The perf-html project is good enough now for me to use it in place of
> Tracelogger.

perf-html is not available in the JS shell, and it is not as precise: even with a smaller sampling interval, you will get more overhead than with TraceLogger. An alternative to complete removal would be to tune perf-html to have labels for each location where TraceLogger can be enabled today.

> I would also like to get rid of Iongraph. We should see if we can expose
> more JIT information to the perf-html team.

I still rely frequently on Iongraph within the JS shell; I would not like to see this one disappear. I once tried to expose the exported JSON to the devtools, but this caused more pain than anything. We could try to have a buffer storing the log of the compilation, as I did previously with the devtools, but without calling back into JS. Then again, for perf-html this might cause transfer/recording size issues.

Also, as highly as I think of perf-html, we should be careful about what we expose in it. Remember that not all users of perf-html are JIT experts, and that exposing even small pieces of information, such as bailouts, can backfire badly with a large number of false-positive bug reports. I do not think we should expose the content presented by Iongraph in perf-html. Maybe we should focus on a synthesized view, such as information about JIT optimizations, or displaying the speed-up at which each piece of JIT-compiled code runs.

--
Nicolas B. Pierron
Re: [JS-internals] How JS team members decide what to work on
On 04/11/2017 07:34 PM, Jason Orendorff wrote:
> I have no plans to type in my notes from the JS meeting. If you want
> them, ping me on IRC. But one thing I want to think about is how we
> decide what to work on, especially performance work. Today, it's like
> this:
>
> - If you're a volunteer, of course you decide what to pick up—we're
>   just glad you're here!
> - A lot of us profile benchmarks and look for useful work in the
>   profiles.
> - Sometimes we do the same thing with random web sites.
> - Bigger projects, like Waldo's work on parsing and djvj et al's work on
>   GC scheduling, are undertaken when we have stuff that has been showing
>   up on profiles "forever". This kind of work isn't driven by any one
>   particular measurement, like a benchmark.
>
> Generally, I think we're working on stuff that makes sense (and have
> been all along), but it's still not guaranteed to be representative of
> the web as users see it. Is that fair? What else should we be doing?

I have looked at multiple performance issues in the past, and I broadly agree with the processes listed above. The problem I see is that by fixing individual problems we often forget to look at the big picture, or to ask the meta-questions. One of these meta-questions is: what parts of our current design keep forcing us to write code to fix performance issues?

The answers to this question are unfortunately larger projects than just fixing a few performance issues observed on websites. Still, a project like CacheIR is a good answer to it. Before CacheIR, we had to investigate performance issues in Baseline and performance issues in Ion separately: Baseline ICs are usually more complete, while Ion ICs are usually lacking but more optimized. By unifying the two IC systems, CacheIR gives us time to investigate other performance issues in the future.

Our time is our scarcest resource, and the question you asked about how we decide which project matters most perfectly highlights that we cannot keep up.

--
Nicolas B. Pierron
Re: [JS-internals] How to implement the security scheme that prevents the RET instructions from being misused
On 03/22/2017 04:07 AM, Yuan Pinghai wrote:
> In my current design, the cookie is stored in a new field (named
> retCookie_) of JitCode, and each JITCODE (representing an instance of
> JitCode) has its own cookie. In this way, when I need the original
> return address, I can recover it by first getting the JITCODE and then
> fetching the cookie. Now, my problem is how can I get the correct
> JITCODE from an address (e.g. the interrupted address before bailing
> out)?

The CalleeToken of JitFrameLayout frames holds either a JSFunction or a JSScript, which contains a pointer to the Baseline and Ion structures holding references to the JitCode. When JitCode is invalidated (Ion), the JitCode pointer is written above the return address. JitFrameIterator::ionScript() should do the proper work to get the information you are looking for.

Note that JitCode is also used for all the trampoline code created when the JitRuntime is created (see the Trampoline-*.cpp files), and these should be registered on the Runtime.

Our stack frames make assumptions about the alignment of the stack, so if you add any fields to CommonFrameLayout or JitFrameLayout, this might cause issues in all the code that emits calls, in which case you should look at MacroAssembler::call and MacroAssembler::callJit. WebAssembly / asm.js do not use any of the Jit frames; instead they use the same frame layout as the ABI of the system, with some variations around the manipulation of SIMD registers.

> Could somebody give me a tip? Any suggestions are welcome. I appreciate
> the help!

To be honest, our stack frames are not the easiest thing to manipulate. Maybe, to prototype it, it would be easier to create some reserved memory used as a second stack space which only contains the cookies? You could store them as part of the JSContext*, and fetch the one corresponding to the top of the stack.

> By the way, I am working on Spidermonkey 45. To be honest, I don't think
> I have enough knowledge to fix the bailout and exception handling
> mechanisms; I also need suggestions on them.

What are your issues with bailouts and exceptions? They basically read a register dump from the stack to build a MachineState (a structure of pointers to each register's spilled location, if any), then unwind the frame and, in the case of a bailout, replace it with the one created by Baseline. What would matter would be to edit the return address, knowing the caller & callee, which you should have in both cases with the JitFrameIterator.

--
Nicolas B. Pierron
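The "second stack of cookies" prototype suggested above could look roughly like this. Everything here is hypothetical (class name, the XOR encoding, the fixed RNG seed); in a real engine the side stack would hang off the JSContext and the cookies would come from a hardened entropy source:

```cpp
// Sketch: return addresses are XOR-encoded with a per-frame cookie kept on
// a side stack, so the on-stack value is useless to ROP-style misuse of RET
// while the original address stays recoverable.
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

class CookieStack {
    std::vector<uint64_t> cookies_;
    std::mt19937_64 rng_{0xfeedfa11};  // fixed seed, sketch only

  public:
    // On call: pick a cookie, remember it, return the encoded address.
    uint64_t encodeOnCall(uint64_t retAddr) {
        uint64_t cookie = rng_() | 1;  // keep nonzero so encoding changes it
        cookies_.push_back(cookie);
        return retAddr ^ cookie;
    }

    // On return: pop the matching cookie and recover the original address.
    uint64_t decodeOnReturn(uint64_t encoded) {
        uint64_t cookie = cookies_.back();
        cookies_.pop_back();
        return encoded ^ cookie;
    }
};
```

Because the cookies are popped in LIFO order, nested calls decode correctly as frames unwind, which matches how bailouts and exception unwinding walk the frames.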
Re: [JS-internals] Clang-format
On 05/11/2016 06:31 PM, Bill McCloskey wrote:
> On Wed, May 11, 2016 at 5:01 AM, Nicolas B. Pierron
> <nicolas.b.pier...@mozilla.com> wrote:
>> If the problem is the pointless arguments on dev.platform, which
>> mistakenly consider SpiderMonkey to be Gecko's property, I would
>> totally agree on moving SpiderMonkey into its own repository. I do not
>> see how indentation differences could be a speed bump, and even if this
>> were a problem, I am still not convinced it alone could justify
>> changing 95% of the lines of the project.
>>
>> One thing I hate about Gecko's undesired continuous integration is that
>> we are held responsible for failures in tests that we cannot reproduce.
>> Having a separate project would make explicit the fact that someone is
>> responsible for the integration, and for converting such test cases
>> into SpiderMonkey test cases. I honestly think I spend more time
>> thinking about how to reproduce some Gecko failures than anybody else
>> spent thinking about indentation.
>
> This is a really bad attitude for Mozilla as a whole. Every one of us at
> Mozilla has a responsibility to make Firefox the best web browser. The
> more we divide ourselves into cliques and label bugs as "someone else's
> problem", the sooner we will fail. You might think it's more productive
> for you to focus on SpiderMonkey alone and let other people deal with
> other issues. Unfortunately, many of the most important bugs span
> different areas; with your approach, these bugs will never be fixed.

This is not someone else's problem; this is my problem, except that someone else who is much more experienced with the rest of the browser has already figured out how to help me reproduce the issue. Basically, what I am suggesting by having a person responsible for the integration of SpiderMonkey into Gecko is to have one or more people who would become knowledgeable about all the various parts where I am not, thus making *us* (Gecko & SpiderMonkey) more productive by having competent people working in their domains of expertise. We would then no longer be stuck for weeks on problems we have no idea how to address.

I know the time it takes to investigate such errors, and I value my time and choose by priority, so that I can have the most impact. When facing Gecko failures, I have two choices:
- Spend weeks to figure them out.
- Switch to something else.

In both cases, I waste something: either the time to figure out the issue, or the time it took me to do the initial work. I sometimes take the second option, in the hope that fuzzers will find the issue, or that other bugs will be easier to investigate, thus reducing the amount of wasted time at the cost of extra latency.

> Mozilla needs more people who understand multiple browser components.
> I'll call them superheroes because of how valuable they are.
> Understanding and reproducing browser tests can seem unrewarding, but
> it's a great way to start to understand how the rest of the system
> works. People on the SpiderMonkey team are in a great position to be
> superheroes: SpiderMonkey and XPConnect are some of the hardest parts of
> the browser to understand, and it's often necessary to step through them
> to debug other browser issues. People who already understand them have
> an advantage over everyone else.

The need for superheroes only highlights our lack of effort to make SpiderMonkey easier to grasp from within a debugger, for embedders. That is something I have wanted to change for a while, and I think we can improve the SpiderMonkey embedder experience. I have suggested multiple times that we should improve the SpiderMonkey debugging experience under gdb, by giving the ability to set breakpoints in JS code within gdb (including Jit code).

The more we empower people to work within their domain(s) of expertise, the less need we would have for such heroes. Having people responsible for the integration would help us with that.

--
Nicolas B. Pierron
Re: [JS-internals] Clang-format
On 05/11/2016 02:15 AM, Jason Orendorff wrote:
>> instead go with Terrence's suggestion and simply adopt the same style
>> as the rest of Gecko, including the 2-space indent.
>
> I've said before that we won't do this without talking it over as a
> team. Well, team? What do you think?

Massive changes are always bad ideas, unless they are used to eliminate classes of bugs/crashes by preventing us from writing them. Changing the indentation is the kind of thing which brings no value and introduces massive changes. So I will always be totally against this kind of change.

I agree that having a tool to *check* a coding style is nicer than having no coding style, as long as the tool is flexible enough to allow local inconsistencies made to keep the code readable.

> Personally I dislike the 2-space indent. But what matters to me here is
> eliminating a speed bump for both Gecko and SM hackers; and reducing
> pointless arguments on dev.platform.

If the problem is the pointless arguments on dev.platform, which mistakenly consider SpiderMonkey to be Gecko's property, I would totally agree on moving SpiderMonkey into its own repository. I do not see how indentation differences could be a speed bump, and even if this were a problem, I am still not convinced it alone could justify changing 95% of the lines of the project.

One thing I hate about Gecko's undesired continuous integration is that we are held responsible for failures in tests that we cannot reproduce. Having a separate project would make explicit the fact that someone is responsible for the integration, and for converting such test cases into SpiderMonkey test cases. I honestly think I spend more time thinking about how to reproduce some Gecko failures than anybody else spent thinking about indentation.

--
Nicolas B. Pierron
Re: [JS-internals] Clang-format
On 05/06/2016 06:07 PM, Jakob Stoklund Olesen wrote:
> On May 6, 2016, at 09:59, Jason Orendorff <jorendo...@mozilla.com> wrote:
>> On Fri, May 6, 2016 at 10:43 AM, Jakob Stoklund Olesen
>> <jole...@mozilla.com> wrote:
>>> Unfortunately, the way SpiderMonkey indents case labels is too odd for
>>> clang-format. I don't think it has a configuration flag that can do
>>> that half-indent.
>>
>> Feel free to mass-change it to whatever Gecko does and update the style
>> guide. We'll cope.
>
> The mozilla style is to indent the case label by one level from the
> switch, and the code inside the case by one further level. With 4-space
> indent, it looks like this:
>
>     switch (tag) {
>         case SCRIPT_INT: {
>             uint32_t i;
>             if (mode == XDR_ENCODE)
>                 i = uint32_t(vp.toInt32());
>             if (!xdr->codeUint32(&i))
>                 return false;
>             if (mode == XDR_DECODE)
>                 vp.set(Int32Value(int32_t(i)));
>             break;
>         }
>         case SCRIPT_DOUBLE: {
>             double d;
>             if (mode == XDR_ENCODE)
>                 d = vp.toDouble();
>             if (!xdr->codeDouble(&d))
>                 return false;
>             if (mode == XDR_DECODE)
>                 vp.set(DoubleValue(d));
>             break;
>         }
>
> Applied to the current SM code base, this style change would move all
> lines inside a switch, not just the case labels. I think that if we can
> cope with such an invasive mass change, we should instead go with
> Terrence's suggestion and simply adopt the same style as the rest of
> Gecko, including the 2-space indent.

I would not go for indenting case labels by 4, as this would basically push us away from switch-case statements in favor of "else-if", which does not give us the same guarantees. One other solution would be to remove the half-indent and replace it with no indent, i.e., all the visibility modifiers would be at the same level as the class keyword, and all the case labels at the same level as the switch statement.

--
Nicolas B. Pierron
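For reference, the Gecko-style options under discussion map onto real clang-format settings roughly as follows. This is a sketch of a `.clang-format` fragment, not the file Gecko eventually adopted; as the thread notes, clang-format has no option for SpiderMonkey's half-indent:

```yaml
# Sketch of Gecko-style clang-format settings discussed in this thread.
BasedOnStyle: Mozilla
IndentWidth: 2
IndentCaseLabels: true    # case labels one level in from the switch
AccessModifierOffset: -2  # public:/private: back at the class-keyword level
```

The last option also covers the "no indent" alternative proposed above: with an offset cancelling the indent width, visibility modifiers sit flush with the class keyword.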
Re: [JS-internals] Reducing SpiderMonkey's crash rate
On 05/03/2016 08:10 PM, Steve Fink wrote:
> On 05/03/2016 11:11 AM, Jakob Stoklund Olesen wrote:
>> LLVM had an EXPENSIVE_CHECKS macro for that kind of assertion, but I
>> don't think they use it any more. People would rarely enable it, so the
>> expensive assertions had a tendency to bit rot. I think if they had
>> been enabled by default, they might have stayed in. Yes, this would be
>> worth doing, but would require some effort.
>
> I think we can keep them from bitrotting by running with them on in
> automation. I would say that a debug build should always have them
> compiled in, but the expensive asserts could do a dynamic check before
> executing.

I do not recall who suggested it, but one idea was to run nightly builds with MOZ_ASSERT compiled in, plus a way to skip assertions in order to throttle the assertion overhead. While I think this is a good idea, I see some pitfalls: we don't want to introduce opt-only bugs that are not caught by the nightly population, and we also don't want an opt build with all MOZ_ASSERTs enabled, as this would cause issues with variables which only exist in debug builds.

--
Nicolas B. Pierron
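The "dynamic check before executing" idea above can be sketched as an assertion macro whose body only runs when a sampling counter (or a runtime toggle) says so, throttling the overhead. `MOZ_EXPENSIVE_ASSERT` and the sampling period are invented names for illustration; this is not the real MFBT macro:

```cpp
// Sketch: a throttled expensive-assertion macro. The predicate is only
// evaluated for 1 in kExpensiveAssertPeriod checks, so costly invariant
// walks can stay compiled into more builds without paying full cost.
#include <atomic>
#include <cassert>
#include <cstdint>

std::atomic<uint32_t> expensiveAssertCounter{0};
constexpr uint32_t kExpensiveAssertPeriod = 16;  // run 1 in 16 checks

// Decide whether this particular check should actually be evaluated.
inline bool shouldRunExpensiveAssert() {
    return expensiveAssertCounter++ % kExpensiveAssertPeriod == 0;
}

#define MOZ_EXPENSIVE_ASSERT(expr) \
    do { \
        if (shouldRunExpensiveAssert()) \
            assert(expr); \
    } while (0)
```

A runtime toggle instead of a counter would also cover the "refined-error-detection mode after repeated start-up crashes" scenario discussed elsewhere in this archive.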
Re: [JS-internals] Reducing SpiderMonkey's crash rate
On 05/02/2016 11:32 PM, Nicholas Nethercote wrote:
> On Thu, Apr 28, 2016 at 10:35 PM, Nicolas B. Pierron
> <nicolas.b.pier...@mozilla.com> wrote:
>> For the JIT, what would improve our life a lot would be if we could
>> dump the code of the compiled function which is currently being
>> executed. If we have that, I think we can make a tool to reverse
>> engineer the trace of functions used to generate the assembly code,
>> and potentially walk back to the LIR / Inline Cache which produced the
>> code.
>
> Good idea. How hard would this be? Should I file a bug?

The idea I had was to have a compilation mode where we instrument the assembler buffer to record the sequences of stack traces along with the sequences of pushed bytes, then use this information to build a Markov chain for each stack frame which is still live on the stack. This way, the reverse engineering would work like a GLR parser on an island grammar expressed by the Markov chains, producing as an AST the potential compilation traces for the code which filled the assembly buffer. The Markov chains should provide the likelihood of each AST, and also potentially help us by highlighting corrupted bytes.

I think such a tool can be made in a matter of weeks. The big unknown for me is where we can find the bytes surrounding the pc. Jan told me that we are already doing so, but I have no access to such a pool of information to experiment with.

>> I think this would be something we should consider doing if we are
>> going to rewrite the MIR representation / the compiler, as I expect to
>> do as part of THM, as its internal representation should be easy to
>> {de,}serialize.
>
> […]
> (BTW, what is "THM"?)

Three Headed Monkey, the project which should revolutionize the way we write compilers, but on which I have had effectively no time to work yet.

--
Nicolas B. Pierron
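To make the instrumentation half of that idea more concrete, here is a very rough sketch of the recording side: while the assembler emits bytes, record (compiler-stack id, emitted byte) pairs and accumulate per-stack byte-transition counts, which later passes could treat as Markov chains when matching crash-dump bytes back to compilation traces. All names are invented; nothing like this exists in the tree:

```cpp
// Sketch: accumulate per-compiler-stack byte transition counts while the
// assembler buffer is filled, as raw material for the Markov-chain matching
// described above.
#include <cstdint>
#include <map>
#include <utility>

using StackId = uint32_t;  // an interned compiler call-stack

struct EmissionModel {
    // For each compiler stack, count byte-to-byte transitions.
    std::map<StackId,
             std::map<std::pair<uint8_t, uint8_t>, uint32_t>> counts;
    std::map<StackId, uint8_t> lastByte;

    // Called once per byte pushed into the assembler buffer.
    void recordByte(StackId stack, uint8_t byte) {
        auto it = lastByte.find(stack);
        if (it != lastByte.end())
            counts[stack][{it->second, byte}]++;
        lastByte[stack] = byte;
    }
};
```

A real implementation would intern the stack traces and serialize the counts alongside the compiled code, so the reverse-engineering tool can run offline on a minidump.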
Re: [JS-internals] Reducing SpiderMonkey's crash rate
On 04/28/2016 06:48 AM, Nicholas Nethercote wrote:
> This is a good moment to think hard about how we can improve things.
> - Can we use static and dynamic analysis tools more? (Even simple
>   things like bug 1267551 can help.)

I think we already do that every time we can think of a practical one. For dynamic analysis, fuzzers are extremely helpful in figuring out issues.

> - How can we get better data in JIT and GC crash reports?

For the JIT, what would improve our life a lot would be if we could dump the code of the compiled function which is currently being executed. If we have that, I think we can make a tool to reverse engineer the trace of functions used to generate the assembly code, and potentially walk back to the LIR / Inline Cache which produced the code.

> - Would "extended assertions" help? By this I mean verification passes
>   over complex data structures. Compilers often have these, e.g. after
>   each pass you can optionally run a pass that does a thorough sanity
>   check of the IR. Do we have that for the JITs?

We have such phases in IonMonkey, which ensure the sanity of the MIR graph. For the moment they are only enabled in debug builds. Most of the checks could be done in release builds, but some depend on data which is only available in debug builds and which we would not want to enable in release builds either.

I think it is doable to have such checks turned on when the browser is in a refined-error-detection mode. I am thinking mostly of repeated start-up crashes, which are likely to be caused by more-or-less deterministic behaviours. On the other hand, asserting on graph coherency would help locate the error, but would hardly isolate it to a specific function.

> - What defensive programming measures can we add? What code patterns
>   are error-prone and should be avoided?

One idea I had would be to write unit tests for the phases of the compiler. Unfortunately this is not trivial to add as-is, and we would have to spell out all the hidden assumptions which are currently present in all the phases. I think this is something we should consider doing if we are going to rewrite the MIR representation / the compiler, as I expect to do as part of THM, as its internal representation should be easy to {de,}serialize.

--
Nicolas B. Pierron
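The run-a-checker-after-each-pass pattern discussed above can be sketched generically. The names here are invented stand-ins (SpiderMonkey's real MIR checks live inside the Ion pipeline); the point is only the shape of the control flow, with verification toggleable per build or per mode:

```cpp
// Sketch: a compiler pipeline that optionally runs every sanity check over
// the IR after each pass, so a broken invariant is caught at the pass that
// introduced it rather than at a crash much later.
#include <functional>
#include <vector>

struct Graph { int blocks = 1; };  // stand-in for a MIR graph

using Pass = std::function<void(Graph&)>;
using Check = std::function<bool(const Graph&)>;

// Run passes in order; when `verify` is set, run all checks after each one.
bool runPipeline(Graph& g, const std::vector<Pass>& passes,
                 const std::vector<Check>& checks, bool verify) {
    for (const Pass& pass : passes) {
        pass(g);
        if (verify) {
            for (const Check& check : checks) {
                if (!check(g))
                    return false;  // invariant broken right after this pass
            }
        }
    }
    return true;
}
```

This also illustrates the cost model mentioned above: with `verify` off (release builds) the checks compile in but never run, and a refined-error-detection mode could flip it on.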
Re: [JS-internals] OOM exceptions
On 04/21/2016 09:05 PM, Shu-yu Guo wrote:
> The first is ergonomic. I want phased workloads like parsing + BCE and
> JIT compiling to be completely infallible. The allocation pattern of
> compilers is a series of many small allocations. I'd like to just
> allocate a huge buffer inside a LifoAlloc-like allocator in the
> beginning and call it a day. That shouldn't impact 32-bit address space
> fragmentation. Checking false/null returns everywhere really puts a
> crimp in my evening.

We have been down this path with IonMonkey, and I honestly find that an infallible allocator is not a good practice, especially if you do not want to crash the browser and instead properly handle OOMs, as we try to do in IonMonkey. The reason I think we should properly handle OOMs in IonMonkey is that these allocations are not part of the allocations requested by the script itself; the user script should not see these OOMs, as it is technically not responsible for them.

The problem then is that we have to armor every loop whose bounds are controlled by user input. If the user input can be made arbitrarily large, then whatever ballast space you reserve can be overflowed. And as this is occasional, I think it is much more misleading than a pattern repeated over and over.

I tried to suggest better approaches in Bug 1244824 [1], but the best I can come up with is a way to emulate exception handling, and the pitfalls are worse. So, if we had a static analysis to ensure that we have no destructors to execute while leaving a scope, then I guess we could emulate exceptions this way.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1244824

--
Nicolas B. Pierron
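For readers outside the engine, the LifoAlloc-style allocator under debate looks roughly like this: one large upfront buffer, allocations bumped out of it, and nullptr on exhaustion so callers must check. This is a stripped-down sketch, not SpiderMonkey's LifoAlloc, which supports chunk chaining and mark/release:

```cpp
// Sketch: a fallible bump allocator. alloc() returns nullptr on OOM
// instead of crashing, which is the checking burden (and the safety)
// discussed above.
#include <cstddef>
#include <cstdint>
#include <memory>

class BumpAlloc {
    std::unique_ptr<uint8_t[]> buf_;
    size_t size_;
    size_t used_ = 0;

  public:
    explicit BumpAlloc(size_t size)
        : buf_(new uint8_t[size]), size_(size) {}

    // Returns nullptr on exhaustion; the caller must propagate failure.
    void* alloc(size_t n) {
        size_t aligned = (n + 7) & ~size_t(7);  // 8-byte align
        if (size_ - used_ < aligned)
            return nullptr;
        void* p = buf_.get() + used_;
        used_ += aligned;
        return p;
    }
};
```

The infallible variant Shu-yu asks for would crash (or grow) instead of returning nullptr; the fallible variant is what forces the armor-every-user-controlled-loop discipline described above.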
Re: [JS-internals] OOM exceptions
On 04/21/2016 05:16 PM, Jan de Mooij wrote:
> Is our only option doubling down on these fuzz bugs and adding more
> assertions, or can we do better with static analysis, the type system,
> annotations, something?

From the type-system point of view, I think we could add a type to distinguish allocation failures from boolean results. In many cases, I found that we were mixing the true/false answer of an analysis with the true/false of an allocation. Using the type system would involve making a lot of modifications to the code base, either to wrap/unwrap error codes or to add new enumerated types. I think this could be a good long-term solution, but hardly a way to make incremental progress.

A static analysis is probably the easiest way forward, and it should ensure that the same value (false / Foo::ALLOC_ERROR) is always used to identify allocation failures within a single function. Such an analysis should probably:

(1) Go through the body of each function and look for values returned in case of allocation failure.
(2) Annotate the function declaration with the value used on allocation failure.
(3) Revisit (by going back to 1) any function which uses any of the annotated function declarations.
(4) Ensure that virtual functions have consistent error values.

This leaves the question of function pointers, but I guess that is something we can easily address either in review or with annotations.

--
Nicolas B. Pierron
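The type-system idea above can be sketched as a dedicated result type, so that an allocation failure can never be confused with an analysis genuinely answering "false". This stripped-down version is similar in spirit to what `mozilla::Result` later provides; the function name is a hypothetical example:

```cpp
// Sketch: separate "out of memory" from a boolean answer in the type
// system, so callers cannot silently conflate the two meanings of false.
#include <cassert>

enum class AllocStatus { Ok, OutOfMemory };

template <typename T>
struct AllocResult {
    AllocStatus status;
    T value;  // meaningful only when status == Ok

    static AllocResult oom() { return {AllocStatus::OutOfMemory, T{}}; }
    static AllocResult ok(T v) { return {AllocStatus::Ok, v}; }
    bool isOOM() const { return status == AllocStatus::OutOfMemory; }
};

// A hypothetical analysis: "false" is now a real answer, distinct from OOM.
AllocResult<bool> rangeAnalysisSucceeded(bool simulateOOM) {
    if (simulateOOM)
        return AllocResult<bool>::oom();
    return AllocResult<bool>::ok(false);  // the analysis answered "no"
}
```

With such a type, the static analysis sketched in steps (1)-(4) becomes largely unnecessary for converted functions, since the compiler enforces the distinction.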
Re: [JS-internals] Jit DevTools 0.0.1
On 06/11/2015 04:49 PM, cosinusoida...@gmail.com wrote:
> On Thursday, June 11, 2015 at 2:19:40 PM UTC+1, Nicolas B. Pierron wrote:
>> On 06/11/2015 02:43 PM, cosinusoida...@gmail.com wrote:
>>> Hi Nicolas, I'm getting the following error when I attempt to use the
>>> addon:
>>>
>>> console.error: jit-dev-tools: Message: TypeError: this.debuggee is undefined
>>> Stack:
>>> JitPanel.onReady@resource://gre/modules/commonjs/toolkit/loader.js - resource://jit-dev-tools/lib/jit-panel.js:75:5
>>> emitOnObject@resource://gre/modules/commonjs/toolkit/loader.js - resource://gre/modules/commonjs/sdk/event/core.js:112:9
>>> emit@resource://gre/modules/commonjs/toolkit/loader.js - resource://gre/modules/commonjs/sdk/event/core.js:89:38
>>> onStateChange@resource://gre/modules/commonjs/toolkit/loader.js - resource://gre/modules/commonjs/dev/panel.js:70:3
>
> I seem to get exactly the same issue when I try it in the latest nightly
> (https://hg.mozilla.org/mozilla-central/rev/bfd82015df48 according to
> about:buildconfig). I also built the latest version of your addon from
> git, but still I get the same error.

If you have a GitHub / Bugzilla account, I suggest we offload this discussion to [1] or [2]. Also, can you explain in the bug how you start the devtools and how they appear? Maybe I can manage to reproduce this issue. Otherwise, if you have time, I suggest we discuss on irc.mozilla.org (nbp) how to instrument the code to debug this addon.

Thanks for testing, and for reporting issues :)

[1] https://github.com/nbp/jit-dev-tools/issues/new
[2] https://bugzilla.mozilla.org/enter_bug.cgi?assigned_to=nobody%40mozilla.orgbug_file_loc=http%3A%2F%2Fbug_ignored=0bug_severity=normalbug_status=NEWcc=:nbpcf_blocking_b2g=---cf_blocking_fennec=---cf_feature_b2g=---cf_fx_iteration=---cf_fx_points=---cf_status_b2g_2_0=---cf_status_b2g_2_0m=---cf_status_b2g_2_1=---cf_status_b2g_2_1_s=---cf_status_b2g_2_2=---cf_status_b2g_master=---cf_status_firefox38=---cf_status_firefox38_0_5=---cf_status_firefox39=---cf_status_firefox40=---cf_status_firefox41=---cf_status_firefox_esr31=---cf_status_firefox_esr38=---cf_tracking_b2g=---cf_tracking_e10s=---cf_tracking_firefox38=---cf_tracking_firefox38_0_5=---cf_tracking_firefox39=---cf_tracking_firefox40=---cf_tracking_firefox41=---cf_tracking_firefox_esr31=---cf_tracking_firefox_esr38=---cf_tracking_firefox_relnote=---cf_tracking_p11=---cf_tracking_relnote_b2g=---component=JavaScript%20Engine%3A%20JITcontenttypemethod=autodetectcontenttypeselection=text%2Fplaindefined_groups=1flag_type-203=Xflag_type-37=Xflag_type-4=Xflag_type-41=Xflag_type-5=Xflag_type-607=Xflag_type-720=Xflag_type-721=Xflag_type-737=Xflag_type-781=Xflag_type-787=Xflag_type-791=Xflag_type-799=Xflag_type-800=Xflag_type-803=Xflag_type-835=Xflag_type-846=Xflag_type-855=Xflag_type-856=Xflag_type-857=Xflag_type-863=Xflag_type-864=Xflag_type-870=Xflag_type-875=Xflag_type-889=Xform_name=enter_bugmaketemplate=Remember%20values%20as%20bookmarkable%20templateop_sys=Unspecifiedpriority=--product=Corerep_platform=Unspecifiedshort_desc=Jit%20DevToolstarget_milestone=---version=unspecified

--
Nicolas B. Pierron
Re: [JS-internals] Jit DevTools 0.0.1
On 06/11/2015 02:43 PM, cosinusoida...@gmail.com wrote:
> Hi Nicolas, I'm getting the following error when I attempt to use the
> addon:
>
> console.error: jit-dev-tools: Message: TypeError: this.debuggee is undefined
> Stack:
> JitPanel.onReady@resource://gre/modules/commonjs/toolkit/loader.js - resource://jit-dev-tools/lib/jit-panel.js:75:5
> emitOnObject@resource://gre/modules/commonjs/toolkit/loader.js - resource://gre/modules/commonjs/sdk/event/core.js:112:9
> emit@resource://gre/modules/commonjs/toolkit/loader.js - resource://gre/modules/commonjs/sdk/event/core.js:89:38
> onStateChange@resource://gre/modules/commonjs/toolkit/loader.js - resource://gre/modules/commonjs/dev/panel.js:70:3

Thanks for reporting; I guess I might be using a new API added to the devtools.

> I get that in both Firefox 38 and Firefox Developer Edition on x86_64
> Linux.

Also, the Debugger.onIonCompilation hook is quite new (>= 41.0a1), so you will have to use a Nightly version of Firefox to use this addon, or wait until the end of the month for the next release cycle.

--
Nicolas B. Pierron
[JS-internals] Jit DevTools 0.0.1
Hello everybody,

I am pleased to announce an early version of a new tool named Jit DevTools. This new tool is an addon which mostly targets JIT developers. It uses the recently added Debugger.onIonCompilation hook to display the latest MIR [1] and LIR graphs within the dev tools.

To use this tool, go to a web page, open the Jit DevTools panel, and wait until a function gets compiled. Once a function is compiled, you can select a compiled script, which will display the MIR graph of the compilation. If you need to, you can also have a look at the LIR graph. The output is similar to the one rendered with iongraph [5]. When you select any inlined script, the background of the block titles changes color if the block corresponds to the inlined instance. This feature is quite useful for identifying the role a function plays in a compiled script.

You might find this tool quite handy for the following use cases:
- Investigating DOM optimizations.
- Investigating jsperf issues.
- Comparing function implementations, and their impact on the generated code.

This addon works on optimized builds, and even with parallel compilation enabled. You can download [2] this early version from http://people.mozilla.org/~npierron/jit-dev-tools/ , or build it yourself from the sources [3] with the jpm tool [4].

Have fun, and enjoy.

[1] http://people.mozilla.org/~npierron/jit-dev-tools/jit-dev-tools-0.0.1.png
[2] http://people.mozilla.org/~npierron/jit-dev-tools/jit-dev-tools-0.0.1.xpi
[3] https://github.com/nbp/jit-dev-tools/
[4] https://developer.mozilla.org/en-US/Add-ons/SDK/Tools/jpm
[5] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Hacking_Tips#Using_IonMonkey_spew_%28JS_shell%29

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Bailout_DuringVMCall
On 04/14/2015 08:31 PM, madhukar.kedl...@gmail.com wrote:
> For the past few months I have been working on using offline type profile information to avoid bailouts in SM. During my experiments I came across Bailout_DuringVMCall and was not able to trace what exactly caused it. I was able to reproduce these bailouts using simple examples where I changed the type of a global variable or the shape of an object in a hot function after a few thousand invocations of it. Ideally, the bailout type for such code should be either Bailout_TypeBarrierV or Bailout_ShapeGuard. But I see Bailout_DuringVMCall being generated. Is it because the hot function gets inlined into another hot function and there is no way to figure out which bailout occurred?

This is unfortunately the placeholder for both JS and native function calls. It can be interpreted as "the generated code got invalidated". Sadly, many things can cause code invalidations, and the location of the bailout does not discriminate among the reasons for it. This bailout is just here to ensure that we no longer execute code which is now unsafe, as it is based on assumptions which no longer hold. Tracing the reason for the invalidation would be nice, and might be doable by instrumenting calls to addPendingRecompile [1].

[1] https://dxr.mozilla.org/mozilla-central/search?q=addPendingRecompile&case=true

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
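For reference, the reproduction pattern described in the question can be sketched in plain JS. The invalidation behavior is SpiderMonkey-internal and not observable from the snippet itself; the comments about what the engine does are a description of the mechanism above, not something this code asserts:

```javascript
// Sketch of the pattern described above: a hot function gets
// Ion-compiled specialized on the type of a global, then that code is
// invalidated when the global's type changes, forcing a bailout back
// to unoptimized code. (In a SpiderMonkey shell this could be observed
// with Ion spew; here we only show the JS-level trigger.)
var g = 1;
function hot() { return g + 1; }
for (var i = 0; i < 20000; i++)
  hot();                 // becomes hot; compiled assuming g is an int32
g = "s";                 // type of the global changes: assumption broken
var r = hot();           // optimized code is invalidated; result is "s1"
```
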
Re: [JS-internals] Optimization tracking API landed
On 02/06/2015 01:20 AM, Shu-yu Guo wrote:
> I recently landed bug 1030389 to track the high-level optimization decisions (i.e., deciding what MIR is emitted) made by IonBuilder. This information will feed into the profiler and is attached to sampled JIT frames.

Not all optimization decisions are taken by IonBuilder; is there a plan to make this API available to other transformation phases? In particular, I am thinking of Escape Analysis / GVN & LICM / Sink.

> That's it! Instrumentation bugs also make great first bugs, and I would be happy to mentor.

That's great! Should we instrument every code path, or only instrument the code paths which have a big performance cliff?

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
[JS-internals] Jit: Source-code locality & Compiled-code locality?
Hi list,

Lately, we have been discussing ways to clean up the MIR.h file. Among the problems that we have with this file is that all instructions are randomly ordered in it. Thus, if one needs to look for an instruction, one must use the search & jump feature of one's editor. Moreover, the problem is more general than MIR.h, as we see the same issue in Lowering.cpp and CodeGenerator.cpp. Having such files results in a terrible developer experience, as source-code locality is non-existent. On the other hand, this model provides good compiled-code locality, as similar functions are packed together in the binary.

Previously, I suggested that we should move functions closer to the transformation phases which make use of them. As of today, we can see this idea applied to RangeAnalysis.cpp and to Recover.cpp. This idea gives better compiled-code locality. On the other hand, as a developer, I am sad about the current state, as when I have to look for one instruction, I do not see everything related to this instruction in one file. Having source-code locality is good for eye-balling the consistency of modifications (reducing review time?).

I think we should improve source-code locality while keeping/improving compiled-code locality. I want to know if people are interested in this topic, and whether we should continue this discussion in a bug with actual code prototypes?

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Contributing
Hi Xue, and welcome :)

On 12/06/2014 02:23 PM, Xue Fuqiao wrote:
> Hi list, A newbie here. I'm interested in contributing to SpiderMonkey and I've read some information about SpiderMonkey on MDN and MozillaWiki. FYI - I'm familiar with:
> * JavaScript (ES5) * Bugzilla * MozillaWiki * ANSI C * C++98 * JSON
> * Beavis and Butt-head Do America (which is the origin of the name SpiderMonkey :-)
> I'm not familiar with (yet):
> * ES6 (and TC-39, and the standardization process) * Mercurial * Try Server * JIT and bytecode * Garbage collection * JS engine benchmarking (Kraken/SunSpider/Octane) * Instruction sets * asm.js * IRC * Make (I can only write some very simple rules.) * Security

It is great that you made such a list, but don't worry, we can always help you. About:

- IRC: Read the documentation which is on [1], and you can join the #jsapi channel, which is where SpiderMonkey developers are chatting. Sadly, we are unlikely to give a fast answer today as most of us are travelling. [1] https://wiki.mozilla.org/IRC
- Mercurial: You can find some documentation at the following link: https://developer.mozilla.org/en-US/docs/Mercurial
- Try Server: Don't worry about it; the usual process is that we make a few patches first, and ask the reviewer to push the patch to try and then to mozilla-inbound.

(sorry I have to cut my reply short, I will continue tomorrow … I have to take a bus)

Welcome :)

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
[JS-internals] Jit test JS shell command line options.
Hi all,

Fuzzers are testing configurations which are not the default one, such as --ion-gvn=off and --ion-regalloc=backtracking. Some of these options are convenient to test, either because we want to be robust or because we want to migrate to a new configuration.

Ideally we should use the testing functions [1], such as gczeal or setJitCompilerOption, as these can also be used in browser builds. For options which are not yet covered by testing functions and which have a command line interface in the JS Shell, I am adding a way to use them, as part of Bug 1105187 [2]. To use command line options, just write a comment similar to the following at the top of the test case:

// |jit-test| --no-sse4; --ion-regalloc=backtracking; error:ReferenceError

…

Note that command line options are separated by semi-colons, and that spaces around them are stripped before appending them to the command line of the JS Shell. Only long command line options (starting with --) are accepted, not short ones, so

// |jit-test| -D

will not work and will output a warning message, while the following will succeed:

// |jit-test| --dump-bytecode

If you have any doubt, you can check whether the JS Shell is invoked with the right command line by using a command similar to:

python ./jit-test/jit_test.py -s -o ./path/to/js ion/bug1105187-sink.js

[1] http://dxr.mozilla.org/mozilla-central/source/js/src/builtin/TestingFunctions.cpp#2221
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1105187

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
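Putting the pieces together, a complete (made-up) test file using this syntax could look like the following; the flag choice is only an example:

```javascript
// |jit-test| --ion-regalloc=backtracking; --no-sse4
// The line above is read by jit_test.py, which strips the |jit-test|
// comment and appends each semicolon-separated long option to the JS
// shell command line. The rest of the file is an ordinary test body.
var sum = 0;
for (var i = 0; i < 1000; i++)
  sum += i;
if (sum !== 499500)
  throw new Error("unexpected sum: " + sum);
```
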
[JS-internals] Use MOZ_ASSERT and MOZ_ASSERT_IF
Hi,

I just replaced all instances of JS_ASSERT with MOZ_ASSERT, and JS_ASSERT_IF with MOZ_ASSERT_IF. Each commit contains the command used to do it automatically, and it is also listed in Bug 1074911.

JS_ASSERT is dead; hurray for the new MOZ_ASSERT \o/

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
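The exact commands are recorded in the commits; a mechanical rewrite of this kind can be reproduced with GNU sed along these lines (the file set below is illustrative, not the command from the commits):

```shell
# Rewrite the old assertion macros in place across js/src (illustrative).
# The \b word boundaries keep the JS_ASSERT pattern from also matching
# the prefix of JS_ASSERT_IF.
find js/src \( -name '*.cpp' -o -name '*.h' \) -print0 |
  xargs -0 sed -i \
    -e 's/\bJS_ASSERT_IF\b/MOZ_ASSERT_IF/g' \
    -e 's/\bJS_ASSERT\b/MOZ_ASSERT/g'
```
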
Re: [JS-internals] Growing arrays
On 07/16/2014 11:08 PM, Nicholas Nethercote wrote:
> So then I tried reverting that change and inserting this line just before the loop: array[length - 1] = 0; And now it avoids the doubling allocations -- the array elements are allocated once, at the right size. But it feels dirty, and I don't know if it would give the same behaviour in other JS engines.

Array implementations are really different between JavaScript engines. I know that this trick used to make us produce a sparse array, where in addition to setting the last element, we paid an extra cost for each assignment done in the middle afterwards.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
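The quoted trick, written out in full; the "allocated once at the right size" behavior is what the quoted message observed in SpiderMonkey at the time, not something the language guarantees:

```javascript
// Touch the last element first so the engine can size the element
// storage up-front, instead of doubling it while the loop appends.
// Whether this avoids reallocation -- or instead creates a sparse
// array with slower element writes, as the reply above notes -- is
// entirely engine-specific.
var length = 100000;
var array = [];
array[length - 1] = 0;          // hint: the final size is known
for (var i = 0; i < length; i++)
  array[i] = i;
```
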
Re: [JS-internals] Dynamic analysis meeting w/ devtools
On 07/01/2014 11:11 AM, Till Schneidereit wrote:
> On Tue, Jul 1, 2014 at 7:52 PM, Jason Orendorff <jorendo...@mozilla.com> wrote:
>> The proposed implementation technique underlying this is bytecode instrumentation. One reason for this is that we already have tons of practice adding new opcodes to Ion, Baseline, and the interpreter. We already know how to make them work the same in all modes and fast in Ion. Of course the implementation technique could vary per event. If we choose to support an event that already has a natural choke-point in C++, we would not need bytecode instrumentation to intercept that event. It is also true that bytecode instrumentation has a few weaknesses -- things like exception handling are not done by executing bytecodes at all.
>
> Isn't this exactly what tracelogging does? Or, a subset of what tracelogging does, rather?

For this precise point, yes, which is why I also discussed the Tracelogger with Hannes. The main difference is the exposure through Debugger. But keep in mind that this first aspect would be the groundwork for the incremental updates of this API, and that we can instrument on demand based on what is monitored by all the Debuggers. Code coverage might want to have a per-block overview of the code usage, or a per-instruction overview, depending on how much overhead is acceptable.

I think the Tracelogger output is only one kind of information that we want to expose through any analysis API, such that we can stream-process events and make them available inside the debugger. One of the differences between the Tracelogger and what we want to achieve at first here is that we only want to observe one compartment, and not all the runtimes of the browser. (Is there a tracelogger filter?)

One of the things that I would hope to see exposed to web developers would be the time spent in the Parser / GC of each compartment.

-- Nicolas B.
Pierron ___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Dynamic analysis meeting w/ devtools
On 07/01/2014 12:04 PM, Fitzgerald, Nick wrote:
> On 7/1/14, 10:52 AM, Jason Orendorff wrote:
>> Events are *not* delivered synchronously. As JS code executes, a log is written. Occasionally, the log is then parsed and records are delivered to devtools code via the enterFunction and leaveFunction methods above. (This batching should improve performance, by minimizing C++-to-JS-to-C++ and cross-compartment calls.)
>
> Because all the devtools are designed to work remotely from day one*, we will be sending these logs over the Remote Debugging Protocol from the debuggee device (the Firefox OS / Fennec phone, etc.) to the debugger device (desktop Firefox), where the data will be processed in a worker and eventually displayed to the user. It would be a shame if we did this:
>
> 1. Collect log in SpiderMonkey
> 2. Parse log into JS objects
> 3. Deliver to hooks devtools set
> 4. Re-serialize JS objects into a log for transport
> 5. Send log over RDP
> 6. Parse log into JS objects again
>
> Whereas if the log was exposed to devtools as some kind of blob / typed array that we can send across the RDP as binary data, we could do this:
>
> 1. Collect the log in SpiderMonkey on the debuggee device
> 2. Deliver the log blob to a hook the devtools set
> 3. Send log blob over RDP
> 4. On the debugger device, devtools code asks Debugger to parse the blob
>
> This way we aren't repeatedly parsing and serializing (to potentially different formats!) for no good reason.

One of the issues with the blob logic is that the intent of making an analysis is to be able to inspect elements. One of the ideas was to be able to proxy objects, such that we can still provide a boxing mechanism, which makes sense for synchronous analyses as they have in Jalangi. On the other hand, now that I am thinking more about it, I do wonder to what extent having an asynchronous view of objects might be helpful compared to some unique identifier of an object.
In which case, if you want to find the value corresponding to one identifier, you will have to watch for object mutations as well, knowing that object allocations/mutations/deallocations will stream you the list of modifications made to all objects.

If we go through the RDP, then I guess we want the asynchronous tracing to just provide an ArrayBuffer of its log based on a list of callbacks (not functions), such that we can easily write the server side of the pipeline. Then, I guess we want a second function into which we feed the ArrayBuffer, and which calls all the callbacks (provided as a list to the first function). And this would be on the client side of the Debugger.

// producer
var watched = [enterFunction, leaveFunction, setObject, newObject, freeObject];
dbg.addLogListener(watched, function (stream) {
  // ... send stream over the network, or locally ...
});

// consumer
var watcher = {
  enterFunction: function (event) { /* ... */ },
  leaveFunction: function (event) { /* ... */ },
  setObject: function (event) { /* ... */ },
  newObject: function (event) { /* ... */ },
  freeObject: function (event) { /* ... */ },
};

function onStreamReceived(stream) {
  dbg.dispatchLogEvents(stream, watcher);
}

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Dynamic Analysis API discussion
> or elsewhere where we're depending on analysis.

Like everything else, but there is more chance of breaking something which relies on source-to-source transformation than something which relies on a lower-level (ECMA-based?) API.

> #3 is interesting and perhaps where lessons learned from Java and other contexts do not apply. I think we should dig into specific tool examples for this; maybe some combination of more intelligent translation and judicious API extensions can solve the problems.
>
> Nicolas B. Pierron wrote:
>> Personally, I think that these issues imply that we should avoid relying on a source-to-source mapping if we want to provide meaningful security results. We could replicate the same or a similar API in SpiderMonkey, and even make one compatible with Jalangi analyses.
>
> It's not clear what you mean by the same or a similar API here.

I mean that I want such an API to be a JavaScript API. I do not want us to provide functions for adding hooks. I want the JS engine to provide one function for registering all the hooks you want in a separate compartment.

var a = newAnalysisGlobal();
a.eval(load('my-analysis.js'));
var g = newGlobal({analysis: a});
// Generate bytecode probes based on functions currently present on the
// analysis global.
g.eval(…);

We can either take inspiration from Jalangi's interface for making analyses, or just bridge the two with a wrapper. Such analyses should be implemented in JavaScript and not in any other language, as our primary target is JavaScript developers. If we added opcodes dedicated to monitoring values (at the bytecode emitter level) instead of doing a source-to-source transformation, one advantage would be that frontend developers would not have to maintain the Jalangi sources when we add new features to SpiderMonkey; moreover, the bytecode emitter already breaks everything down to opcodes, which are easier to wrap than the source. Analyses are usually made to observe the execution of code, and not to mutate it.
So if we only monitor the execution, instead of emulating it, we might be able to batch analysis calls. Doing the batches asynchronously implies that the overhead of running an analysis is minimal while the analyzed code is running.

> Logging and log analysis have their place, but a lot of dynamic analysis tools rely on efficient synchronous online data processing in instrumentation code. For example, if you want to count the number of times a program point is reached, it's much more efficient to increment a global variable at that program point than to log to a buffer every time that point is reached, and count log entries offline. For many analyses of real-world applications, high-volume data logging is neither efficient nor scalable. Here are a couple of examples of Java tools I worked on where synchronous online data processing was essential:
> -- http://fsl.cs.illinois.edu/images/e/e8/P385-goldsmith.pdf
> -- http://web5.cs.columbia.edu/~junfeng/09fa-e6998/papers/hybrid.pdf
> So I think injection of synchronously executed instrumentation is essential for a large class of analyses.

The asynchronism is one suggestion to make recording analyses faster, by avoiding frequent cross-compartment calls. I do not see any issue with having synchronous requests; on the contrary, I think it might be interesting to interrupt the program execution on such a request, or even change the program execution (something that we can only do synchronously) to prevent security holes / privacy leaks. On the other hand, I do think that we should have asynchronous analyses first, but only the use cases of potential users can answer this question for us.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
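The counting argument from the quoted message can be illustrated with a tiny synchronous instrument; the wrap() helper here is hypothetical and not part of any proposed SpiderMonkey API:

```javascript
// Synchronous online processing: increment a counter at the program
// point, instead of appending one log record per hit and counting the
// records offline.
var hits = 0;
function wrap(fn) {              // hypothetical instrumentation helper
  return function () {
    hits++;                      // cheap synchronous update
    return fn.apply(this, arguments);
  };
}
var square = wrap(function (x) { return x * x; });
var last = 0;
for (var i = 0; i < 5; i++)
  last = square(i);
// hits is now 5; an asynchronous log would instead have buffered 5
// records to be parsed and counted later.
```
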
Re: [JS-internals] Dynamic Analysis API discussion
On 06/26/2014 10:49 AM, Shu-yu Guo wrote:
> On Jun 26, 2014, at 6:57 AM, Nicolas B. Pierron <nicolas.b.pier...@mozilla.com> wrote:
>> I have a question for you, and also for people who have made such analyses in SpiderMonkey. Why take all the pain of integrating such an analysis into SpiderMonkey's code, which is hard and changes frequently, when it would be easy (based on what you mention) to just do a source-to-source transformation? Why do we have 3 propositions for implementing taint analysis in SpiderMonkey so far? It sounds to me that there is something which is not easily accessible from a source-to-source transformation, and which might be easier to hook into once you are deep inside the engine.
>
> Perhaps we can get those who tried to implement taint analysis in SpiderMonkey before to chime in about the pain points they experienced. Do we know who they are?

Yes, we know who they are, and we contacted all of them. But I know that at least one of them does not want to go public right now.

>> Extending a JS parser, maybe. Extending 2 JS parsers the same way is harder.
>
> New language features with complex semantics require significant tool updates whatever API we use.

Not as much as the syntax: the bytecode is an example of it, as the bytecode is some kind of subset that we target with the bytecode emitter. As you mentioned, manipulating bytecode is easy, but manipulating the source to ensure that we have the same semantics might be more complex.

> It seems a worse maintenance burden to me to have to update all analyses written when we decide to change the bytecode in SpiderMonkey, say, like decomposing some more fat ops. Exposing a bytecode-based instrumentation on a private bytecode makes the bytecode a de facto public and frozen API, which is undesirable. As I've said before, which I'll repeat here for the benefit of the discussion thread, I am in favor of a source-to-source approach because it seems to me that source-to-source is just as expressive as the API proposed here. I remain optimistic that an out-of-engine tool can be made performant, for some of the points roc mentioned. For maintenance, if nothing else, an out-of-engine tool is open to being maintained by a larger number of developers instead of just JS engine developers.

I do not disagree that source-to-source is more expressive, but it is also easier to shoot yourself in the foot by doing such modifications. I want to make sure that it is as easy for analysis developers to make analyses as it is for us to maintain such an API.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
[JS-internals] Dynamic Analysis API discussion
features in SpiderMonkey; moreover, the bytecode emitter already breaks everything down to opcodes, which are easier to wrap than the source. Analyses are usually made to observe the execution of code, and not to mutate it. So if we only monitor the execution, instead of emulating it, we might be able to batch analysis calls. Doing the batches asynchronously implies that the overhead of running an analysis is minimal while the analyzed code is running.

On an orthogonal aspect, we could isolate the analysis code from the analyzed code by making a separate compartment for the analysis. This would provide boxing and unboxing as a safeguard, but it would be extremely expensive in terms of speed (without a batching system) and in terms of memory (without executing the batches before GCs). In addition to providing safeguards for the people making analyses, it avoids the pitfall of mega-morphic calls. Separating the analysis from the code being analyzed provides an additional advantage, which is that we know what the analysis might be looking for. This implies that we could trace only the values which are being watched by the analysis, and thus avoid useless overhead.

[6] http://marijnhaverbeke.nl/acorn/

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] bailout return addresses
On 06/08/2014 09:27 PM, Cameron Kaiser wrote:
> [Codegen] instruction CallGetIntrinsicValue
> [Codegen] == push(immgcptr) ==
> [Codegen] #label ((1068))
> [Codegen] == push(imm) ==
> [Codegen] 0201522c --- lis r0,913 (0x391)
> [Codegen] 02015230 --- ori r0,r0,18768 (0x4950)
> [Codegen] 02015234 --- stwu r0,-4(sp)
> [Codegen] == callWithExitFrame(ion *) ==
> [Codegen] == push(imm) ==
> [Codegen] 02015238 --- li r0,2112 (0x840)
> [Codegen] 0201523c --- stwu r0,-4(sp)
> [Codegen] == call(JitCode) ==
> [Codegen] 02015240 --- mfspr r0,lr
> [Codegen] 02015244 --- bl .+4                  lr = pc
> [Codegen] 02015248 --- mfspr r12,lr            get pc into r12
> [Codegen] 0201524c --- mtspr lr, r0
> [Codegen] 02015250 --- addi r12,r12,32 (0x20)  push pc+32
> [Codegen] 02015254 --- stwu r12,-4(sp)
> [Codegen] 02015258 --- x_skip_this_jump
> /* x_skip_this_jump is patched by the assembler to a lis/ori/mtctr stanza to call the VM wrapper */
> [Codegen] 0201525c --- nop
> [Codegen] 02015260 --- nop
> [Codegen] 02015264 --- bctrl                   jump to CTR
> /* VM gets called and returns here */
> [Codegen] ##addPendingCall offs 0458 to 00cf1fa0
> [Codegen] #label ((1128))
> [Codegen] == push(reg) ==
> [Codegen] 02015268 --- stwu r3,-4(sp)
> [Codegen] == push(reg) ==
> [Codegen] 0201526c --- stwu r4,-4(sp)
>
> The return address (using these offsets) is 0x02015268. That's where it should return to, if I understand what it's doing. By the way, it doesn't look like I need to preserve LR and it was never saved in Baseline; it's just here for paranoia.

The Safepoint return address is here as a convenient way to index the Safepoint and to know where we can patch the code. When we have an invalidation, we patch 4/8 bytes of the dead code below (which is not going to be executed once we return to this function) to register the pointer of the IonScript.

> The return address doesn't correspond to where the OsiPoint got marked -- it gets marked way down at 0x020153cc after the CallGetIntrinsic, or way back at 0x020527c4 with the OsiPoint/MoveGroup.
> So it asserts in IonScript::getOsiIndex() because the return address doesn't match any of the recorded OsiPoints.

Indeed, the return address inside the Safepoint (the OsiPoint return address) is set by the OsiPoint, and it does not correspond to the return address of the call inside the instruction which is using the Safepoint. The return address which is on the stack is used to find the Safepoint. The return address which is in the Safepoint is used to locate the OsiPoint. I would simply say that it is expected that the Safepoint does not contain its own return address, as we can use a Safepoint multiple times in one instruction.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] JS status report for Nightly 32
Hi Chris,

On 06/09/2014 06:42 PM, Chris Peterson wrote:
> We have some very active community contributors for this release! nbp and bbouvier have been mentoring many bugs for recover instructions, a precursor for IonMonkey escape analysis and branch profiling. Contributors working on Nightly 32 (in alphabetical order):
> * Amol Mundayoor
> * Heiher
> * Inanc Seylan
> * Julien Levesy
> * Nathan Braswell
> * Sankha Narayan Guria
> * Sushant Dinesh
> * Tooru Fujisawa
> If I missed you, please let me know! If you need more bugs, just drop by #jsapi on irc.mozilla.org. :)

Thanks Chris for making this list. :) It is amazing that we got so many new contributors working on the JS Engine, and I would suggest that others do the same; the rules are simple [1]. Also, with the Bugs Squashing Party [2] event in the Paris office, I am expecting to mentor new contributors over the weekend of the 21st/22nd of June.

[1] https://wiki.mozilla.org/Contribute/Coding/Mentoring
[2] https://www.eventbrite.com/e/mozilla-bugs-squashing-party-tickets-11619100041

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Hello all
Hi Sriram,

On 04/15/2014 09:33 AM, Sriram A S wrote:
> I am new to the SpiderMonkey group and wish to contribute to bug fixing and future developments. I have a working dev env already on my machine and am going through the docs to understand the workings of the SpiderMonkey JS engine and get used to it.

Nice, this is the first step for contributing. :)

> I am not sure where I need to start, so it will be great if someone can guide me. If there are any assignments for me, please let me know.

The JavaScript engine has multiple components, such as the Parser, the Debugger API, the JITs, and so on … What are you interested in, and where do you want to contribute? You should also join IRC and get yourself known on irc.mozilla.org in #jsapi as well as in #introduction. These are the places where you can ask questions about the JS Engine and about how to start contributing.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Cake (& other beverages)
On 04/07/2014 02:32 PM, Jason Orendorff wrote:
> What: 25 Minutes Of Cake And JavaScript (bring your own cake). A totally optional vidyo chat for random SM talk, show & tell, brainstorming, etc.
> When: 2nd and 4th Friday of each month, starting this Friday, 10AM Mountain View Time.

Good idea. I reserved a room in the Paris office for people who might want to join Benjamin and me there. (19:00 Paris time)

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Removing 'jit-tests' from make check
On 04/04/2014 03:39 AM, Daniel Minor wrote:
> Just a heads up that very soon we'll be removing jit-tests from the make check target [1]. The tests have been split out into a separate test job on TBPL [2] (labelled Jit), have been running on Cedar for several months, and have been recently turned on for other trees. We've added a mach command -- mach jittest -- that runs the tests with the same arguments that make check currently does.

mach jittest? Is there any documentation which explains how to work with only the JS Shell using mach commands? Does this change imply that every JS developer will have to compile the full browser just to work on the Shell? The only documentation I know of [1] explains how to run a configure && make.

[1] https://developer.mozilla.org/en-US/docs/SpiderMonkey/Build_Documentation

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Having the compilation process produce a binary as well as a symbols file
On 03/05/2014 05:17 PM, Gary Kwong wrote:
> How useful would it be for the compilation process to produce a binary as well as a symbols file for everyone?

I will reformulate the question: how many bugs were hard to reproduce on other people's computers, and were more dependent on the compiler output than on the configuration flags?

> My usecase would be to be able to archive just the binary and symbols in a cache folder so I don't need to compile it again when testing testcases.

Maybe you can just keep one full image for every mozilla-central merge, and only keep binary diffs compared to the parent commit. I do not expect that the ~30 changes happening to the JS engine will shift things a lot. Recovering the build of a changeset is then a matter of applying the right set of patches.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Getting the allocation point of every object
On 02/27/2014 03:02 AM, Brendan Eich wrote:
> Fitzgerald, Nick wrote:
>> Or in self-hosted code, right? Maybe the iterator { value, done } objects?
>
> Are we optimizing away { value, done } objects that can't escape (from iterators run afresh by for-of loops)? If not, is there a bug on file to so optimize? If not, please file and cite here. Thanks,

Not as far as I know, and I do not think we can *properly* do anything like that before landing something similar to Bug 878503.

-- Nicolas B. Pierron
___ dev-tech-js-engine-internals mailing list dev-tech-js-engine-internals@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
Re: [JS-internals] Better memory reporting of objects and shapes
On 02/18/2014 12:49 PM, Nicholas Nethercote wrote: On Tue, Feb 18, 2014 at 1:57 AM, Nicolas B. Pierron nicolas.b.pier...@mozilla.com wrote: I think it might make sense to special-case the JSFunction class, such that we can get the object prototype name in addition to the JSFunction class. Interesting idea. What's the exact code for getting the object prototype name from a JSFunction?

The corresponding JS code would be:

  obj.__proto__.constructor

obj.__proto__ maps to JSObject::getProto(cx, obj, proto), and .constructor maps to JS_GetConstructor(cx, proto). Then, if the constructor is a JSFunction, you can extract its name:

  RootedAtom name(cx);
  if (constructor->isJSFunction())
    name = constructor->asJSFunction().displayAtom();

which would be either the name of the function, or its inferred name. This way, for the following code:

  function Foo() {}
  var x = new Foo();

we should be able to display Foo in the memory reporter.

-- Nicolas B. Pierron
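The same lookup can be sketched from the JS side; this is only an illustration of the data being recovered (constructorName is a made-up helper, not memory reporter code):

```javascript
// Recover a constructor name for an object, mirroring the
// proto/constructor walk described above.
function constructorName(obj) {
  const proto = Object.getPrototypeOf(obj);
  if (proto && typeof proto.constructor === "function") {
    // Like displayAtom(): the function's name, if it has one.
    return proto.constructor.name || "<anonymous>";
  }
  return "<unknown>";
}

function Foo() {}
var x = new Foo();
// constructorName(x) gives "Foo", the label the reporter would show.
```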
Re: [JS-internals] JIT Inspector
On 02/05/2014 10:09 PM, Boris Zbarsky wrote: JIT Inspector is a pretty awesome tool, but it's bitrotted slightly... In particular: 1) It doesn't seem to know anything about baseline. 2) I've had a hard time making parts of it other than Ion Activity work. Is there any interest in updating it to deal with the current state of the world?

I am not a big fan of the current implementation of the JIT Inspector, especially since we have no tests, and this kind of issue is likely to come up again in the future. I do not think we would get much value out of dumping the assembly of Baseline, and ICs especially might be harder. On the other hand, I think it would be easy to dump IC chains as part of the PC Count interface.

-- Nicolas B. Pierron
Re: [JS-internals] Taint analysis in SpiderMonkey
Hi Stéphanie, Thanks for looking again into this,

On 02/03/2014 02:08 AM, Stéphanie Ouillon wrote: Ivan forwarded me the script Jim wrote to benchmark the impact of tainting on SpiderMonkey (see attachments, I fixed bits in the patch to apply it on recent mozilla-central code).

I looked at the patch as well as the benchmark[1]; I have multiple comments on them:
- The patch does not instrument the CodeGenerator, which inlines CharAt[2]. This means that if we compile the JS function, only the concat instrumentation would be testing this flag. (Also, the concat instrumentation is not needed, as we need to flatten a string before reading anything from it.)
- AssertEq is a C++ function, and this would add some overhead on top of just doing a flatten. Comparing the 2 strings in JS would be better, but then we also need to instrument the CodeGen.
- Use an extra function which contains the inner loop, as we are only interested in this function and not in the top-level script.
- Using a gc() call here may have a nasty effect with OSR (on-stack replacement), because we are jumping into Ion only from the outer loop, so I don't think we are even using the result of Ion's compilation.
- The loop runs 1 times, and results are only divided by 1000 (not really important).

I ran the tests several times (performance mode, in console) on commits

Your results are a bit noisy, did you pkill -18 firefox before running these benchmarks?

I don't know what was intended to be done after that, so I'm posting the results here to have any feedback.

I think this is a way to highlight that checking this extra bit is not changing the performance profile of the engine. But this does not deal with the maintenance question that I raised previously.

[1] https://gist.github.com/arroway/617c534a7e4cb24adeab
[2] http://dxr.mozilla.org/mozilla-central/source/js/src/jit/CodeGenerator.cpp#5006

-- Nicolas B. Pierron
Re: [JS-internals] Replacing YARR
Hi, On 01/05/2014 02:31 AM, julian.vier...@googlemail.com wrote: Before converting the entire Octane RegExp benchmark to run using RegExp.JS I thought I would just try the first RegExp tested in the benchmark. This means, in terms of code changes:

  diff --git a/regexp.js b/regexp.js
  - var re0 = /^ba/;
  + var re0 = new RegExpJS(/^ba/);

Any reason why you are constructing the RegExpJS from a regexp literal, instead of giving a string as argument?

  var re0 = new RegExpJS("^ba");

-- Nicolas B. Pierron
Re: [JS-internals] A question on Float32 codegen
On 12/12/2013 06:51 PM, Feng, Haitao wrote: Why did we not use a 4-byte stack slot and 4-byte snapshot slot for float32, and use movss instead of movsd?

Can you be more precise? The snapshot slots are unlikely to be 4 bytes, because they are written into a compact buffer and they are not directly addressable. Also, the snapshot slots are used to indicate the location, such as whether it is in a register or on the stack.

If you are talking about the spill of registers made for bailouts, then we will have trouble with Float32x4, as we are eagerly spilling Value-size content at the moment. It should not be a big deal to spill the full [xy]mm registers instead of the low parts.

If we introduced a FLOAT32_REG type in LDefinition::Type and FLOAT32_REG in LAllocation::Kind, it should be relatively easy to add Float32x4_REG.

I will talk about the MIRType, as this is the example I took at the time, but I think we can do the same thing for LAllocation and LDefinition. This is one thing I discussed with Benjamin Bouvier before he added Float32 support. We should think of the future when doing such a design. At the moment we are running conditionals to check whether this is a float32 or a double. As we have many vector sizes (1, 2, 4, 8) and vector types (double, float, uint8, uint32), I think it would be better to abstract all of these and make the MIRType a structure which uses a bit-field to represent all of these vector types:

  struct MIRType {
    enum Type {
      TYPE_VALUE, …, TYPE_DOUBLE, TYPE_FLOAT, TYPE_INT8, TYPE_INT16, TYPE_INT32
    };
    // Useful for finally supporting unsigned int8 and URSH without hacks.
    const uint32_t signedValue : 1;
    const uint32_t padding_ : 12;
    // Shift index to obtain the number of elements in the vector.
    const uint32_t vectorScale : 3;
    // Use uint32_t instead of Type because of a Windows compiler issue.
    const uint32_t type : 16;
  };

  static const MIRType MIRType_Value     = { true,  /* 1 */ 0, MIRType::TYPE_VALUE };
  static const MIRType MIRType_Float32x4 = { true,  /* 1 */ 2, MIRType::TYPE_FLOAT };
  static const MIRType MIRType_UInt8x4   = { false, /* 1 */ 2, MIRType::TYPE_INT8 };

-- Nicolas B. Pierron
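To make the packing above concrete, here is a hypothetical JavaScript model of the same bit-field layout (constants and helper names are invented for illustration; the real structure would of course be C++):

```javascript
// Hypothetical model of the packed MIRType sketched above:
// bit 0: signedValue, bits 1-12: padding, bits 13-15: vectorScale
// (log2 of the element count), bits 16+: base type.
const TYPE = { VALUE: 0, DOUBLE: 1, FLOAT: 2, INT8: 3, INT16: 4, INT32: 5 };

function mirType(signedValue, vectorScale, type) {
  return (signedValue ? 1 : 0) | (vectorScale << 13) | (type << 16);
}

function numElements(t) {
  return 1 << ((t >> 13) & 0x7);   // decode the vector scale
}

const Float32x4 = mirType(true, 2, TYPE.FLOAT);
// numElements(Float32x4) == 4
```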
Re: [JS-internals] thread safe and autoconf question
Hi Roelof, On 11/29/2013 01:54 PM, Roelof Wobben wrote: Can I somehow check with autoconf if mozjs185 is compiled with --enable-threadsafe --with-system-nspr?

Usually, configure scripts leave some log files that are used to store all these details. What you are interested in is likely a file named config.status in your build directory. This file contains the list of substitutions made by the configure script. Look for JS_THREADSAFE.

-- Nicolas B. Pierron
Re: [JS-internals] [Fixed] Do Not Land: AWFY is not responding.
On 10/15/2013 01:11 PM, Jason Orendorff wrote: On 10/15/13 12:40 PM, Nicolas B. Pierron wrote: AWFY is our last barrier to prevent regressions. Currently, all slaves are down! Until AWFY is fixed, I suggest that we do not land anything in the JS Engine unless we provide benchmark results for SunSpider and Kraken. V8 still runs on tbpl[1]. Filed, in case you want more info: https://bugzilla.mozilla.org/show_bug.cgi?id=927094

For people who are not yet following the bug, AWFY slaves are now collecting data again, so this is just a question of seeing the updated values. The issue was caused by 3 (4?) unrelated bugs which happened simultaneously:
- (?) The VM is slow at refreshing results.
- The x86/x86-64 computer harness ran into an infinite loop (?).
- The ARM board rebooted and a lock file prevented the automatic restart.
- The git mirror used by B2G builds got corrupted.

-- Nicolas B. Pierron
Re: [JS-internals] Taint analysis in SpiderMonkey
On 08/19/2013 10:14 AM, Jim Blandy wrote: There are many issues here, but specifically regarding the runtime impact of a DOMinator-style taint analysis when not in use: Taint instrumentation is only needed in operations that allocate new strings whose contents are taken from other strings. Such operations would gain a branch per input (checking for taint), and a branch per output (checking whether there was taint to be propagated). These branches sit alongside a JSString allocation, and perhaps content copies. When taint is not in use, the branches would be well-predicted (and we could annotate them unlikely, if that would help). That's not zero impact - but would you expect it to be measurable on benchmarks?

Yes, I think this will damage performance in cases where people are building strings with a concatenation loop:

  for (var i = 0, ii = arr.length; i < ii; i++)
    s += String.fromCharCode(arr[i]);

PdfJS has a few of these, where an array/string is converted into a string, either to copy the content or to go from base64 to some text. This kind of code is also expected at the boundaries of typed arrays. I agree, the trivial example above can be inferred, but the += is in question here, as we are allocating a rope for every operation. So having hooks on the string allocation sounds like a terrible idea. On the other hand, doing it as part of the flatten operation will remove half of the comparisons. Still, I would expect some impact there.

-- Nicolas B. Pierron
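The PdfJS-style pattern in question, spelled out as a complete function (bytesToString is a made-up name for the shape of code being described): every += allocates a new rope node, so a taint check hooked on string allocation would run once per character, while a check hooked on flattening runs once per string.

```javascript
// Convert an array of char codes to a string, one += at a time.
// Each iteration allocates a new rope; flattening only happens when
// the final string is read, which is why instrumenting flatten
// removes most of the per-allocation checks.
function bytesToString(arr) {
  var s = "";
  for (var i = 0, ii = arr.length; i < ii; i++)
    s += String.fromCharCode(arr[i]);
  return s;
}
```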
Re: [JS-internals] Taint analysis in SpiderMonkey
On 08/15/2013 05:53 PM, Jim Blandy wrote: On 08/15/2013 11:29 AM, Nicolas B. Pierron wrote: On 08/09/2013 02:59 PM, Jim Blandy wrote: Ivan Alagenchev and Mark Goodwin asked me to take a look at their project to bring DOMinator, a taint analysis for SpiderMonkey, […] On Tuesday, Koushik Sen made a presentation which is available on Air Mozilla[1] where he presented some JavaScript instrumentation which uses a parser hook to rewrite the original script with some extensible instrumentation.

I think it's important to consider both the scale of the effort required and the results produced. Implementing something like the StringLabeller (pace Brendan) hooks would be a different order of magnitude of effort than the alternatives suggested here. I need to watch that presentation, but I did see Sen's presentation at JSTools 2013 in Montpellier. Without any intent to contradict, Jalangi's record-and-replay-with-shadow-execution approach did not seem to me like a low-maintenance tooling approach. Certainly, using shadow execution to recover the details of execution drastically reduces what one needs to record, and thus its runtime impact. But the combination of the recording annotations and the shadow interpreter do not seem like a light maintenance burden. Am I being pessimistic?

This is something more general, and it would have multiple purposes, which is likely to be more stable over time than just one corner application. In addition, it can be customized by users, so this would be a good way to remove the burden of the content of the analysis from the JS engine. I would prefer a similar solution over a simple tainting solution which only considers intrusively annotating strings. First, the performance impact would be isolated to people who are running the analysis. Second, it is not as intrusive, because it would provide an alternate contained path in the bytecode emitter, and the rest should remain unchanged.
Jalangi's approach is exactly like self-hosting the analysis without instrumenting the interpreter or the JITs, which means that we would have no performance issue induced by the instrumentation when users are not running any analysis. In addition, this would lower the cost of adding any other analysis to the devtools, as this work would only have to be done once. And web developers could even make their own custom analyses, such as "do not hold a cross-compartment wrapper except in these functions".

Further: having thought a bit more, I'm not sure that source-rewriting techniques are going to be much better. Perhaps there's a beautiful trick I'm not noticing, but it seems to me that making finer-grained distinctions between strings than the language supports entails nothing less than a self-hosted JavaScript interpreter, because you can't use strings (meta-level) to represent strings (debuggee level).

From what I understand of Jalangi, you can add any kind of annotation by boxing the results, and remove the annotation by unboxing the operands around any operation. The only aspect of it that I do not like is that they redefine the operations, which does not guarantee the correct behavior. I think we can do better with maybeBox / maybeUnbox primitives and pre-/post-operation hooks for updating the context. In addition, we can manage to please TI and avoid the megamorphic operators they have in Jalangi.

-- Nicolas B. Pierron
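A minimal sketch of the maybeBox / maybeUnbox idea (all names are invented; this is not Jalangi's actual API): annotations travel in a box around the value, are stripped before the original operation runs, and are re-attached to the result.

```javascript
// Box a value together with an annotation (here: a taint bit).
function box(value, taint) { return { value: value, taint: taint }; }

// Unbox if boxed, otherwise wrap with a neutral annotation.
function maybeUnbox(v) {
  return (v !== null && typeof v === "object" && "taint" in v) ? v : box(v, false);
}

// An instrumented '+': unbox the operands, run the original operation,
// then propagate the annotation to the result.
function instrumentedAdd(x, y) {
  const a = maybeUnbox(x), b = maybeUnbox(y);
  return box(a.value + b.value, a.taint || b.taint);
}
```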
Re: [JS-internals] Taint analysis in SpiderMonkey
On 08/09/2013 02:59 PM, Jim Blandy wrote: Ivan Alagenchev and Mark Goodwin asked me to take a look at their project to bring DOMinator, a taint analysis for SpiderMonkey, […]

On Tuesday, Koushik Sen made a presentation which is available on Air Mozilla[1] where he presented some JavaScript instrumentation which uses a parser hook to rewrite the original script with some extensible instrumentation. In his presentation, he highlights that only hundreds of lines are necessary to instrument a few parts of the engine with one kind of analysis. One of the examples is taint analysis, which is made such that it works on objects. The parser hook is used to transform the source so that it can replace JavaScript operations by function calls which perform the original operation. The example given in the presentation is that 'x + y' is transformed to Binary('+', x, y).

I think this is the kind of problem that we can address at the bytecode emitter level. We could make a second bytecode emitter which generates calls into some non-instrumented code registered for the analysis. I think we could require that the analysis must be loaded ahead of time, and only generate hooks if there is any instrumentation. This approach would be way slower than the tainting suggested by the DOMinator project, but at the same time it serves a generic purpose for instrumenting the engine in any customizable way. As he suggested in the presentation, we might want to make this visible to the user. Which means that we need to think of a way to provide some differential testing, such that analysis developers can check that they are not changing the behavior of the manipulated program (unless expected).

[1] https://air.mozilla.org/test-and-cure-your-javascript-blues-with-jalangi/

-- Nicolas B. Pierron
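The 'x + y' to Binary('+', x, y) rewriting can be sketched as follows (a toy version of what the talk describes; the hook body is invented):

```javascript
// The function the rewritten source calls in place of the operator.
// An analysis can observe x, y and the result here; without an
// analysis loaded, it just performs the original operation.
function Binary(op, x, y) {
  switch (op) {
    case "+": return x + y;
    case "*": return x * y;
    default: throw new Error("unhandled operator: " + op);
  }
}

// Original source:   x + y
// Rewritten source:  Binary('+', x, y)
```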
Re: [JS-internals] Conformance testing of the JavaScript engine(s?)
On 08/11/2013 06:00 AM, David Bruant wrote: And I was wondering if conformance and regression tests were run against all 3-4 engines (since each compilation step may introduce its own bugs) or whether only the combination was tested. I imagine that when I run [1], only the conformance of the interpreter is tested, because no test really has time to become warm or hot (not even the test harness, since it's injected in a new iframe for each test, for good reasons).

As Terrence mentioned, we do have eager compilation, even if eager compilation might not be enough to test conformance. The fact that the tests are present helps fuzzers generate new tests by mutation. Such tests can then be used in differential tests to check for different behavior between the interpreter and the compiled code. So as soon as we get it right in the interpreter, the rest should catch up if we forgot to update some corner cases in the JITs.

-- Nicolas B. Pierron
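The differential idea can be sketched as a toy harness (not the fuzzers' actual tooling; names are made up): run a function cold, warm it up so the JIT tiers kick in, and compare the two sets of results.

```javascript
// Compare the results a function produces before and after warm-up.
// In a JS shell, the cold runs exercise the interpreter and the warm
// runs exercise the compiled tiers; any divergence is an engine bug.
function differentialCheck(fn, inputs) {
  const cold = inputs.map(fn);
  for (let i = 0; i < 10000; i++) fn(inputs[0]);  // let the JITs warm up
  const warm = inputs.map(fn);
  return cold.every((v, i) => Object.is(v, warm[i]));
}
```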
Re: [JS-internals] Taint analysis in SpiderMonkey
Hi, On 08/09/2013 02:59 PM, Jim Blandy wrote: The taint analysis applies to strings only, and has four parts: * It identifies certain *sources* of strings as tainted: document.URL, input fields, and so on. * The JavaScript engine propagates taint information on strings. Taking a substring of a tainted string, or concatenating a tainted string, yields a tainted string. Regexp operations, charAt, and so on all propagate taint information. And so on. * It identifies certain *sinks* as vulnerable: eval, 'src' attributes on script elements, and so on. * Finally, the tool's user interface logs the appearance of tainted strings at vulnerable sinks. The taint metadata actually records the provenance of each region of a tainted string, so the tool can explain exactly why the final string is tainted, which is really helpful in constructing XSS attacks. […] I looked at the code in the github repo at https://github.com/alagenchev/spider_monkey. It seems to me that the following issues need to be addressed: * The way the taint metadata is stored needs to change. A linked list isn't appropriate for operating at scale.

I don't think we should use tainting. The extra bit on the length is currently reserved to optimize strings which would be represented as an int. The goal of the tainting is to re-construct the inverted data-flow graph, i.e. to find the origin of a string which flows into a function. And the data-flow graph is basically what is monitored when we register that we can see a new value flowing into a store at a specific code location. I think that if we want to capture this kind of information, we should at least do it in such a way that we can also use it to improve our performance. If we are able to isolate the data flow, we could optimize our data representation based on guarded invariants of the data flow (dynamic deforestation?), and with the support of a moving GC, we could optimize/deoptimize the value representation on GCs.
[to be seen as a JIT compiler for the data flow instead of only having JIT compilers for the control flow]

* The patch adds functions on 'String', and an accessor on 'String.prototype'. We can't really add methods directly to String, for the sake of web compatibility. Rather, we should handle taint the way we handle Debugger.

I agree, we should make this visible from the developer tools, but not directly added to the String interface.

-- Nicolas B. Pierron
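For readers new to the thread, the source/propagation/sink pipeline can be modelled in a few lines of plain JS. This is only a toy model: the real analysis stores the taint bit on the JSString itself, while script-level JS can only hang a bit off wrapper objects; all names below are invented.

```javascript
const tainted = new WeakSet();  // the taint bit, kept outside the string

// Source: mark a string as tainted (wrapped in a String object, since
// primitives cannot carry metadata in plain JS).
function taint(s) { const b = new String(s); tainted.add(b); return b; }
function isTainted(s) { return s instanceof String && tainted.has(s); }

// Propagation: a string operation whose result inherits taint.
function concat(a, b) {
  const r = new String(String(a) + String(b));
  if (isTainted(a) || isTainted(b)) tainted.add(r);
  return r;
}

// Sink: reject (or log) tainted input, as the tool's UI would.
function sink(s) {
  if (isTainted(s)) throw new Error("tainted string reached sink");
}
```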
Re: [JS-internals] Taint analysis in SpiderMonkey
On 08/09/2013 05:27 PM, Jim Blandy wrote: On 08/09/2013 04:29 PM, Nicolas B. Pierron wrote: The goal of the tainting is to re-construct the inverted data-flow graph, i.e. to find the origin of a string which flows into a function. And the data-flow graph is basically what is monitored when we register that we can see a new value flowing into a store at a specific code location. I think that if we want to capture this kind of information, we should at least do it in such a way that we can also use it to improve our performance. If we are able to isolate the data flow, we could optimize our data representation based on guarded invariants of the data flow (dynamic deforestation?), and with the support of a moving GC, we could optimize/deoptimize the value representation on GCs. [to be seen as a JIT compiler for the data flow instead of only having JIT compilers for the control flow]

It's true that, in principle, the flow graph the compiler uses and the flow graph taint analysis uses are the same. But in practice they're very different. * Taint is concerned with flow *through* string primitives: concatenation, substring, regexp match extraction, and so on. The compiler doesn't know much about those operations, and so is only concerned with getting them their arguments, and delivering their results to the right place. It doesn't relate their inputs to their outputs.

This is a problem of instrumentation, and it would still exist even with tainting. Also, as I mentioned to Ivan, monitoring strings is an approximation, as a string might be given to JSON.parse or converted into an Array/TypedArray.

* Taint needs to dynamically observe the flow of values in specific actual executions. If a particular branch isn't taken, then the not-executed code shouldn't affect taint results. But the compiler needs to reach conservative conclusions that hold on all possible executions.
This would be true in the case of a static compiler, but in the case of a dynamic compiler we can omit information based on the monitored flow. In fact, TI already restricts the possible types to the observed types, and it is for this precise reason that we need to insert type barriers in IonMonkey's code when the set of observed types is not equal to the upper bound calculated by the type inference.

What you propose would require substantial contributions from a group of engineers (IonMonkey and GC hackers) that is in high demand; it's hard for me to imagine taint support becoming a sufficient priority for that team - especially since it's an unproven approach. In contrast, the taint analysis I brought up here has been prototyped and shown to be valuable, and is within reach of a volunteer (Ivan) from the security team.

One of the reasons why I would prefer us to depend on such information is that our focus is set on performance. If a bug or an incorrect value appears in the analysis, then it would likely be related to a performance issue or an incorrect behavior. The reason why I want to find a performance justification for doing this analysis is that we could then rely on it and make it better.

As a side note: currently, we conditionally maintain an artificial stack alongside the Interpreter, Baseline and IonMonkey. This stack is only used by the Gecko profiler. Worse, tbpl does not even run tests on the JS Engine to ensure that we keep it in a correct shape. So using the information collected by this profiling could be helpful in many ways, such as finding functions which are worth keeping across GCs.

Another example is the type inference. Currently we collect a lot of information which is valuable for the developer tools. Sadly, we do not have a well-detailed API to make it usable outside the engine. But the fact that we rely on it ensures that the type information we see would be better than any static analysis tool's.

-- Nicolas B. Pierron
Re: [JS-internals] [GSoC 2013] Project Ideas
Hi, On 04/22/2013 10:05 PM, Wei WU(吴伟) wrote: Hi, On Sun, Apr 21, 2013 at 8:37 AM, Nicolas B. Pierron nicolas.b.pier...@mozilla.com wrote: On 04/20/2013 02:41 AM, 吴伟/Wei WU wrote: On Fri, Apr 19, 2013 at 10:01 AM, Nicolas B. Pierron nicolas.b.pier...@mozilla.com wrote: - Adding profile guided optimization: the idea would be to profile which branches are used and to prune branches which are unused, either while generating the MIR graph, or as a second optimization phase working on the graph. A trivial way of doing so can just look at the jump targets which have been visited so far. A more clever one might register the transitions and use a Markov chain to determine relations between branches, and duplicate a sub-part of the graph to improve constant propagation and the range analysis. Before doing anything fancy like the more-clever case we still need to check that this is worth it and see the benefit of other optimizations if we fold the graph.

I'm interested in this idea and I'm willing to implement it. A simple and fast algorithm may be a good choice for GSoC, while the advanced methods require more investigation. I haven't found any branch profiling code in the interpreter, so the instrumenting mechanism must be finished first, or we can leverage the baseline compiler to generate self-instrumented jit code. Profiling data can be saved separately or in MIR nodes as an annotation. In the second case more other passes may benefit from it easily.

Indeed, there is no code for profiling yet. The easiest way to think about it is to do like the write barrier of the incremental GC, i.e. having a buffer (potentially circular, as opposed to the write barrier's) which is filled with jump targets. Thus, every time we jump to another location in the code, we push the location of the offset into this buffer, and later we reuse this buffer to re-organize basic blocks in IonMonkey and to duplicate / prune basic blocks if needed.
There are two possible ways to store branch profiling data. One is a count array, the other is a target buffer. I've read LLVM's implementation of branch (edge) profiling and found that it uses a vector (array) to save the information. LLVM allocates two counters for each branch/edge. When a block jumps to another, the correlated counters will be incremented by 1. Then these frequencies will be transformed into probabilities and consumed by some transform passes. Jikes RVM stores branch profiling data in a similar way. One possible problem is that they use block IDs to index branch counters, and we don't have such things in JavaScript bytecode.

Indeed, but we can use the jsbytecode pointers as targets, and later convert them to basic block numbers when we process the branch profile.

On the other hand, a branch target buffer maintains a sequence of branch targets, and the index problem can be avoided. Furthermore, it makes it possible to determine relations between branches. The main problem I have considered is that the cost of calculating branch probabilities might be high, since we must traverse the buffer to summarize the results.

I don't think the cost would be high; current CPUs are good at prefetching memory ahead of time on a linear read of a buffer, which is likely the case that we will encounter. What you need then would be a way to map the target to the basic block to recover the counters. It might be good, as a first step, to have a circular buffer to experiment with the branch prediction, and later reduce it to simple counters if the buffer proves to be a major pitfall in terms of performance, or if we do not have any benchmark which can take advantage of a more-clever way of inferring branch ordering.

I have encountered some other issues. Could you give me some suggestions? - Instrumenting the Interpreter is fairly easy, while I'm not sure it is possible to instrument the baseline compiler in a similar way. I think it's better to support them both.
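The circular buffer of jump targets discussed above can be sketched like this (a JS model of what would be engine C++; names and sizes are invented):

```javascript
const SIZE = 1024;               // capacity of the circular buffer
const buffer = new Array(SIZE);
let cursor = 0;

// Called on every taken branch: push the jump target, overwriting the
// oldest entry once the buffer wraps around.
function recordJump(target) {
  buffer[cursor] = target;
  cursor = (cursor + 1) % SIZE;
}

// Later, the compiler folds the buffer into per-target counters
// (target -> taken count) with a single linear, prefetch-friendly pass.
function summarize() {
  const counts = new Map();
  for (const t of buffer) {
    if (t !== undefined) counts.set(t, (counts.get(t) || 0) + 1);
  }
  return counts;
}
```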
As opposed to TI, which needs to keep a consistent model of the types flowing in, there is no restriction on taking only a sub-set of the profiling data. Profiling only in Baseline's code should be enough, and would avoid adding extra cost to the interpreter, which is used by all scripts that only run a few times.

- Overhead. Instrumenting every conditional jump may degrade performance, and an 'accepted overhead rate' should be considered. Also, it should be possible to switch it on/off by command line arguments.

I agree with the CLI switch. The accepted overhead is well-defined as being our benchmark score and the memory usage. If you can improve our benchmark score without taking too much memory, then this would be acceptable performance.

- Store/Restore profiling data on disk. Branch profiling data can be stored and restored in LLVM and Jikes RVM. But I don't think it is necessary in SpiderMonkey.

Indeed, I don't think it would be necessary either, and it would be even more
Re: [JS-internals] [GSoC 2013] Project Ideas
Hi, On 04/20/2013 02:41 AM, 吴伟/Wei WU wrote: On Fri, Apr 19, 2013 at 10:01 AM, Nicolas B. Pierron nicolas.b.pier...@mozilla.com wrote: - Clarifying our heuristics, to be able to make guesses while we are compiling in IonMonkey, and recompile without the guarded assumptions if the bailout paths are too costly. Our current view is mostly black & white, and only one bad use case can destroy the performance. We need to introduce some gray view, saying that we are compiling for the likely sub-set and accept bailouts as part of the normal lifetime of a script. As of today, IonMonkey is too much like a normal compiler; we should make it an assumption-based compiler.

I found a bug (#825268 https://bugzilla.mozilla.org/show_bug.cgi?id=825268) that may be related to this project. According to the description of that bug I realized that my understanding of the term 'heuristics' is relatively naive (the strategy used by an optimization algorithm to modify the intermediate representation is based on a few expressions calculated from the structure of source code), and I think the heuristics you mentioned are more generic than that. If I understand correctly, 'compiling for the likely sub-set' means that we can compile multiple versions of a method and execute one of them based on which one's assumptions are satisfied. For example:

  function g(x){ ...use x... }
  function f(){
    for (var i = 0; i < 1; i++){
      if (i % 1000){
        g(i);
      } else {
        g('string');
      }
    }
  }

Currently IonMonkey compiles g(x) with the guard assert(typeof x === int) and the jit code will be invalidated periodically. If g(x) can be compiled with the assumption assert(typeof x in {int, string}) then the bailouts would be avoided. Am I right?

If g is not inlined, IonMonkey will compile with a guard to ensure that x is an Int. As soon as a call is made with a string, we will discard the code, and later recompile it with a larger type set (int or string).
The problem is that doing operations on something which can be either an int or a string will always be slow, as we fall back on VM function calls. In your example, the Int case can be executed in the baseline compiler while the string case can remain in Ion-compiled code. Doing a bailout might still be worth it compared to the price of a recompilation and a slower execution for the most-likely use case. The question which remains behind is: Should we keep or discard the code? To be able to answer this question we need some kind of heuristic. And before making heuristics, we need a meaningful metric. The only metric that is meaningful in such cases is time, but not ordinary time, as we are at the mercy of the scheduler. Bug 825268 suggests cleaning up the way we are using the use-count to make it a reliable *deterministic* source of time based on the execution of scripts.

- Adding profile guided optimization: the idea would be to profile which branches are used and to prune branches which are unused, either while generating the MIR graph, or as a second optimization phase working on the graph. A trivial way of doing so can just look at the jump targets which have been visited so far. A more clever one might register the transitions and use a Markov chain to determine relations between branches, and duplicate a sub-part of the graph to improve constant propagation and the range analysis. Before doing anything fancy like the more-clever case we still need to check that this is worth it and see the benefit of other optimizations if we fold the graph.

I'm interested in this idea and I'm willing to implement it. A simple and fast algorithm may be a good choice for GSoC, while the advanced methods require more investigation. I haven't found any branch profiling code in the interpreter, so the instrumenting mechanism must be finished first, or we can leverage the baseline compiler to generate self-instrumented jit code.
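The keep-or-discard question raised earlier in this message could be reduced to a heuristic over deterministic use counts, for example (every name and threshold below is made up for illustration; this is not IonMonkey code):

```javascript
// Decide whether to discard compiled code: keep it while bailouts are
// rare relative to successful executions, discard once the accumulated
// bailout cost outweighs the price of a recompilation.
function shouldDiscard(script) {
  const BAILOUT_COST = 100;     // hypothetical: one bailout ~ 100 uses
  const RECOMPILE_COST = 1000;  // hypothetical recompilation price
  return script.bailoutCount * BAILOUT_COST >
         script.useCount + RECOMPILE_COST;
}
```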
Profiling data can be saved separately or in MIR nodes as annotations; in the second case, other passes may benefit from it more easily.

Indeed, there is no code for profiling yet. The easiest way to think about it is to do as the write barrier of the incremental GC does, i.e. have a buffer (potentially circular, unlike the write-barrier buffer) which is filled with jump targets. Thus, every time we jump to another location in the code we push the target offset into this buffer, and later we reuse this buffer to re-organize basic blocks in IonMonkey and to duplicate / prune basic blocks if needed.

Two bugs (#410994 https://bugzilla.mozilla.org/show_bug.cgi?id=410994, #419344 https://bugzilla.mozilla.org/show_bug.cgi?id=419344) have mentioned PGO but may be irrelevant to this idea.

Indeed, they are irrelevant for doing PGO on JavaScript.

Other candidates for smaller projects might be:
- Improving our alias analysis to take advantage of the type sets (this might help a lot on the Kraken benchmarks, by factoring out array accesses).
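The circular buffer of jump targets suggested above can be sketched as follows. The class and method names are invented for this sketch, not existing SpiderMonkey code.

```javascript
// Sketch of the proposed branch-profiling buffer: a fixed-size circular
// buffer of jump-target offsets, later summarized into per-target counts
// that a PGO pass could use to reorder or prune basic blocks.
// All names here are hypothetical.
class BranchProfileBuffer {
  constructor(capacity) {
    this.targets = new Array(capacity);
    this.capacity = capacity;
    this.next = 0;
    this.filled = false; // set once the buffer has wrapped around
  }
  // Called on every taken jump, analogous to appending to the GC's
  // write-barrier buffer (but overwriting old entries instead of flushing).
  recordJump(targetOffset) {
    this.targets[this.next] = targetOffset;
    this.next = (this.next + 1) % this.capacity;
    if (this.next === 0) this.filled = true;
  }
  // Collapse the buffer into a Map of {offset -> count} for the optimizer.
  counts() {
    const limit = this.filled ? this.capacity : this.next;
    const out = new Map();
    for (let i = 0; i < limit; i++) {
      out.set(this.targets[i], (out.get(this.targets[i]) || 0) + 1);
    }
    return out;
  }
}
```

Because the buffer is circular, a long-running script only keeps recent branch history, which biases the counts toward the current hot paths rather than cold startup behavior.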
Re: [JS-internals] [GSoC 2013] Project Ideas
On 04/20/2013 05:37 PM, Nicolas B. Pierron wrote: where Foo is a different type than Bar. Which implies that both b properties belong to different objects, which means that the previous JavaScript function can be transformed to:

function f(a, arr) {
  a.b = 0; /* Assume the shape of a is different from the shape of arr[i] */
  for (var i = 0; i < arr.length; i++) {
    arr[i].b = 1;
  }
}

Sorry, this is true only if we can prove that the loop body is executed at least once.

-- Nicolas B. Pierron
Re: [JS-internals] [GSoC 2013] Project Ideas
- Adding profile-guided optimization: the idea would be to profile which branches are taken and to prune branches which are unused, either while generating the MIR graph, or in a second optimization phase working on the graph. A trivial approach could just look at the jump targets which have been visited so far. A cleverer one might record the transitions and use a Markov chain to determine relations between branches, and duplicate a sub-part of the graph to improve constant propagation and range analysis. Before doing anything fancy like the cleverer case, we still need to check that this is worth it and measure the benefit to other optimizations if we fold the graph.

Other candidates for smaller projects might be:
- Improving our alias analysis to take advantage of the type sets (this might help a lot on the Kraken benchmarks, by factoring out array accesses).
- Improving the dummy functions used at asm.js boundaries. Asm.js needs to communicate with the DOM, and to do so it needs some trampoline functions which are used as an interface with the DOM API. Such trampolines might transform typed arrays into strings or objects and serialize the result back into typed arrays.

-- Nicolas B. Pierron
Re: [JS-internals] Long-running compilation steps
On 04/14/2013 11:41 PM, Nicholas Nethercote wrote: Hi, For https://bugzilla.mozilla.org/show_bug.cgi?id=842800 I'm interested in adding more checks of SpiderMonkey's operation callback in potentially long-running operations. For example, when running the asm.js Unreal demo, parsing takes several seconds, so in my patch I added a check after the parsing of each statement. I'm also seeing that IonMonkey compilation can take a while on the demo. Is there a loop or loops in IonMonkey that are good candidates for adding operation callback checks?

IonMonkey compilation is currently sequential for the IonBuilder phase and the CodeGenerator::link phase. All the rest might run in a separate thread (or not, on single-core architectures). Before the compilation we allocate a lifo-alloc, which gets extended as needed during the compilation. As far as I know, everything which goes into the lifo-alloc is considered dark matter from the about:memory point of view. I am not sure what you are trying to look at, but what might be interesting would be to look at the usage of the lifo-alloc. It would be safe to do so at the creation of each basic block, and in MIRGenerator::shouldCancel (called at each loop iteration over basic blocks), if it is safe to report such status from another thread.

-- Nicolas B. Pierron
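The pattern being discussed, polling a cancellation check at each basic block the way MIRGenerator::shouldCancel is polled per loop iteration, can be sketched like this. The function names are invented for illustration.

```javascript
// Sketch of cooperative interruption in a long-running compilation loop:
// poll a cancellation predicate once per basic block, so a multi-second
// compilation can be aborted promptly. Names here are hypothetical, not
// the actual MIRGenerator API.
function processBlocks(blocks, shouldCancel, processOne) {
  for (const block of blocks) {
    // Cheap check at a safe point, before committing to the next block.
    if (shouldCancel()) {
      return { finished: false, reason: "cancelled" };
    }
    processOne(block);
  }
  return { finished: true };
}
```

The trade-off is check frequency: per basic block is cheap and bounds the latency of a cancellation by the cost of one block, whereas checking per instruction would add measurable overhead to every compilation.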
Re: [JS-internals] Instruction scheduling and selection in IonMonkey
Hi,

On 03/15/2013 03:20 AM, Ting-Yuan Huang wrote: It seems that there's no instruction scheduler in IonMonkey. If so, may I know why? Modern processors should benefit a lot from an instruction scheduler. I'd like to know if it is worth doing before diving in :-) Also, I didn't see a formal (textbook) instruction selector, such as tiling a tree/DAG by dynamic programming, or a peephole optimizer. I'm not sure, but it seems that the quality of instruction selection relies on the lowering process from MIR to LIR, so that a direct mapping from LIR to assembly code is efficient enough, right?

Indeed, our macro assembler writes directly into the buffer. At the same time, the code that we produce contains many checks which might make it hard for assembly optimizations to trigger, as we need to handle corner cases such as bailouts.

In IonMonkey's case, I think this might be interesting in terms of code size and for avoiding redundant operations, like avoiding test instructions after an ALU operation when we are checking whether the last computed register is zero, and getting rid of scratch-register initialization on x64. But I guess this would mostly be a code-size issue.

In asm.js's case (codename OdinMonkey), I think this might be interesting to test, as we are trying to recover the assembly out of infallible JavaScript (except for bounds checks on ARM). I don't know whether the quality of the assembly that we produce is good enough or not; Luke and Marty might know more about it. In the case of ARM, I don't know what the impact of such optimizations would be.

-- Nicolas B. Pierron
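The specific redundancy mentioned, a test instruction checking for zero right after an ALU operation that already set the flags, is the kind of pattern a tiny peephole pass could remove. The instruction encoding below is invented for this sketch; it is not IonMonkey's LIR.

```javascript
// Toy peephole pass: drop an explicit "test reg, reg" (zero check) when the
// immediately preceding ALU instruction already set the flags for that
// register. The instruction representation is hypothetical.
const FLAG_SETTING_OPS = new Set(["add", "sub", "and", "or"]);

function peephole(insns) {
  const out = [];
  for (const insn of insns) {
    const prev = out[out.length - 1];
    if (
      insn.op === "test" &&
      insn.src === insn.dst && // "test reg, reg" is a zero check
      prev !== undefined &&
      FLAG_SETTING_OPS.has(prev.op) &&
      prev.dst === insn.dst // flags already reflect this register
    ) {
      continue; // redundant: the ALU op already set ZF for this register
    }
    out.push(insn);
  }
  return out;
}
```

This is a code-size win as the reply suggests: the conditional jump after the removed test still sees the flags set by the ALU instruction.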
Re: [JS-internals] Exact rooting progress script
On 11/06/2012 03:18 PM, Terrence Cole wrote:
+--+
| m-i/js/xpconnect |
+--+
IonCode : 2

This surprises me, so git grep IonCode in js/xpconnect returns:

src/XPCJSRuntime.cpp-    CREPORT_GC_BYTES(cJSPathPrefix + NS_LITERAL_CSTRING("gc-heap/ion-codes"),
src/XPCJSRuntime.cpp:                     cStats.gcHeapIonCodes,
src/XPCJSRuntime.cpp-                     "Memory on the garbage-collected JavaScript "
src/XPCJSRuntime.cpp-                     "heap that holds references to executable code pools "
src/XPCJSRuntime.cpp-                     "used by IonMonkey.");
src/nsXPConnect.cpp-static const char trace_types[][11] = {
src/nsXPConnect.cpp-    "Object",
src/nsXPConnect.cpp-    "String",
src/nsXPConnect.cpp-    "Script",
src/nsXPConnect.cpp:    "IonCode",
src/nsXPConnect.cpp-    "Xml",
src/nsXPConnect.cpp-    "Shape",
src/nsXPConnect.cpp-    "BaseShape",
src/nsXPConnect.cpp-    "TypeObject",
src/nsXPConnect.cpp-};

So you can remove 2 from the count-down. :)

-- Nicolas B. Pierron
Re: [JS-internals] Inlining heuristics and type inference
Hi Igor,

On 08/01/2012 12:50 PM, Igor Rafael wrote: Hi guys, I would like to change the inlining heuristics in IonMonkey, so that a function would be inlined the first time it is met in the program flow. To do this, I have commented out the code:

if (script->getUseCount() < checkUses) {
    IonSpew(IonSpew_Inlining, "Not inlining, caller is not hot");
    return false;
}

The heuristic should be fine with the current mode of compilation, which implies running JM first and IonMonkey second for the recompilation. This means we are unlikely to hit this condition unless we hit an invalidation bailout, which will reset the counter of the inlined script. If you run with IonMonkey first (--no-jm), then this check is likely to fail because the use count is incremented at loop heads and at the function entry point. This means that we are likely to have a smaller use-count (off by one) for inlined functions which are only called once per loop iteration. It might be interesting to tweak this heuristic by expecting one less on the use-count of the inlined function, to balance the entry point.

However IonMonkey is answering me "Cannot inline due to oracle veto". I have tried to follow the calls and this code seems to be related to the type inference algorithm. It seems that the type inference engine has not been executed at that point in time. I would like to know what I could do to disable this veto. If it is not to abuse your patience, I would like to know also when this type inference engine runs and why it does not run before the first compilation.

You will need to run the TypeInferenceOracle, which is just a wrapper on top of the Type Inference monitoring. Look at ion/TypeOracle.cpp, the init function.

-- Nicolas B. Pierron
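The suggested off-by-one tweak can be sketched as follows. The function and parameter names are invented for illustration; the real heuristic lives in IonMonkey's C++ inlining code.

```javascript
// Sketch of the off-by-one adjustment: a callee called once per loop
// iteration misses one use-count increment relative to the loop head,
// so grant it one unit of slack against the hotness threshold.
// All names here are hypothetical.
function isHotEnoughToInline(useCount, checkUses, calledOncePerIteration) {
  const slack = calledOncePerIteration ? 1 : 0;
  return useCount + slack >= checkUses;
}
```

With this adjustment, a callee whose count lags the caller's loop head by exactly one still qualifies for inlining, instead of being rejected as "not hot".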
Re: [JS-internals] How do I see the assembly generated for a function in firefox
Hi Paddy,

On 08/01/2012 01:46 PM, Paddy Mullen wrote: From Firefox, how do I know what type inferences are being picked up for my JavaScript code? How do I see the generated assembly? Do I have to use a raw build of IonMonkey?

From Firefox, I recommend using the Code (JIT) Inspector made by Brian Hackett. This is a Firefox extension which can display more information about your JavaScript. Sadly we have no good UI to clarify its output. If you want to see the bytecode then you can use the functions used by this extension:

const Ci = Components.interfaces;
var utils = window
    .QueryInterface(Ci.nsIInterfaceRequestor)
    .getInterface(Ci.nsIDOMWindowUtils);
utils.startPCCountProfiling();
// … some code to profile …
utils.stopPCCountProfiling();
var count = utils.getPCCountScriptCount();
for (var i = 0; i < count; i++) {
  var summary = JSON.parse(utils.getPCCountScriptSummary(i));
  var detail = JSON.parse(utils.getPCCountScriptContents(i));
  // …
}
utils.purgePCCounts();

This technique does not work yet with IonMonkey. Once Bug 771118 is fixed, the extension might work with IonMonkey too.

If you want to look into the details of IonMonkey, I would not recommend starting with the assembly, but with the codegen and the output of iongraph [1]. When you run the shell with IONFLAGS=logs, IonMonkey writes some spew to a temporary file which is read by iongraph. If you absolutely want to look at the assembly, I recommend using gdb to disassemble the code produced by IonMonkey.

[1] https://github.com/sstangl/iongraph

-- Nicolas B. Pierron