Author: Carl Friedrich Bolz <cfb...@gmx.de>
Branch: extradoc
Changeset: r3617:3748a7d6e071
Date: 2011-06-08 16:22 +0200
http://bitbucket.org/pypy/extradoc/changeset/3748a7d6e071/
Log: fix the typos the reviewers pointed out

diff --git a/talk/icooolps2011/paper.tex b/talk/icooolps2011/paper.tex
--- a/talk/icooolps2011/paper.tex
+++ b/talk/icooolps2011/paper.tex
@@ -124,10 +124,10 @@
%___________________________________________________________________________
\section{Introduction}
-One of the hardest parts of implementing a dynamic language efficiently is to
-optimize its object model. This is made harder by the fact that many recent
-languages such as Python, JavaScript or Ruby have a rather complex core object
-semantics. For them, even implementing just an interpreter is already a complex
+One of the hardest parts of implementing an object-oriented dynamic language well is to
+optimize its object model. This is made harder by the complexity of the core
+object semantics of many recent languages such as Python, JavaScript or Ruby.
+For them, even implementing just an interpreter is already a difficult
task. Implementing these languages efficiently with a just-in-time compiler
(JIT) is extremely challenging, because of their many corner-cases.
@@ -226,7 +226,7 @@
\label{sub:tracing}
A recently popular approach to JIT compilers is that of tracing JITs. Tracing
-JITs have their origin in the Dynamo project, which used the technique for dynamic
+JITs were popularized by the Dynamo project, which used the technique for dynamic
machine code optimization \cite{bala_dynamo:_2000}. Later they were used to
implement a lightweight JIT for Java \cite{gal_hotpathvm:_2006} and for dynamic
languages such as JavaScript \cite{gal_trace-based_2009}.
@@ -257,7 +257,7 @@
Therefore PyPy's JIT is a \emph{meta-tracer} \cite{bolz_tracing_2009}. It does
not trace the execution of the user program, but instead traces the execution of
the \emph{interpreter} that is running the program. This means that the traces
-it produces don't contain the bytecodes of the language in question, but
+it produces do not contain the bytecodes of the language in question, but
RPython-level operations that the interpreter did to execute the program.
Tracing through the execution of an interpreter has many advantages. It makes
@@ -312,7 +312,7 @@
object model that just supports classes and instances, without any inheritance or
other advanced features. In the model classes contain methods. Instances have a
class. Instances have their own attributes (or fields). When looking up an
-attribute on an instance, the instance's attributes are searched. If the
+attribute of an instance, the instance's attributes are searched. If the
attribute is not found there, the class' methods are searched.
\begin{figure}
@@ -335,7 +335,7 @@
When using this object model in an interpreter, a large amount of time will be
spent doing lookups in these dictionaries.
-Let's assume we trace through code that sums three attributes, such as:
+Let us assume we trace through code that sums three attributes, such as:
\anto{I still think it's a bit weird to call them ``methods'' and then use
them as attributes in the example}
@@ -362,7 +362,7 @@
condition in the original code.
The trace contains five calls to \texttt{dict.get}, which is slow. To make the
language efficient using a tracing JIT, we need to find a way to get rid of these dictionary
-lookups somehow. How to achieve this will be topic of
+lookups. How to achieve this will be the topic of
Section~\ref{sec:fastobjmodel}.
@@ -441,7 +441,7 @@
typical reason to do that is if there is a lot of computation depending on the
value of one variable.
-Let's make this more concrete. If we trace a call to the function (written in
+Let us make this more concrete. If we trace a call to the function (written in
RPython) on the left, we get the trace on the right:
\begin{minipage}[b]{0.5\linewidth}
@@ -468,7 +468,7 @@
\end{minipage}
Observe how the first two operations could be constant-folded if the value of
-$x_1$ were known. Let's assume that the value of \texttt{x} in the RPython code can vary, but does so
+$x_1$ were known. Let us assume that the value of \texttt{x} in the RPython code can vary, but does so
rarely, i.e. only takes a few different values at runtime. If this is the case,
we can add a hint to promote \texttt{x}, like this:
@@ -504,7 +504,7 @@
to be written down slightly differently in the actual code.}
When just running the code, the \texttt{promote} function has no effect. When tracing, some extra work
-is done. Let's assume that this changed function is traced with
+is done. Let us assume that this changed function is traced with
the arguments \texttt{4} and \texttt{8}. The trace will be the same, except for
one operation at the beginning.
@@ -513,10 +513,9 @@
then be exploited by the compiler.
The introduced guard specializes the trace, because it only works if the value
of $x_1$ is \texttt{4}. From the point of view of the
-optimizer, this guard is not any different than the one produced by the \texttt{if}
-statement in the first example. After the guard, the rest of the trace can
-assume that $x_1$ is equal to \texttt{4}, meaning that the optimizer will turn this
-trace into:
+optimizer, this guard is not different from the one produced by the \texttt{if}
+statement in the first example. After the guard, it can be assumed that $x_1$
+is equal to \texttt{4}, meaning that the optimizer will turn this trace into:
{\noop
\begin{lstlisting}[mathescape,basicstyle=\ttfamily]
@@ -547,8 +546,8 @@
This new trace will be attached to the guard instruction of the first trace. If
$x_1$ takes on even more values, a new trace will eventually be made for all of
them, linking them into a chain. This is clearly not desirable, so we should promote
-only variables that don't vary much. However, adding a promotion hint will never produce wrong
-results. It might just lead to too much assembler code being generated.
+only variables that do not vary much. However, adding a promotion hint will never produce wrong
+results. It might just lead to too much machine code being generated.
Promoting integers, as in the examples above, is not used that often. However,
the internals of dynamic language interpreters often
@@ -580,7 +579,7 @@
idempotent side effects\footnote{This property is less strict than that of a
pure function, because it is only about actual calls during execution. All pure
functions are trace-elidable though.}.
-From this definition follows that a call to an trace-elidable function with
+From this definition follows that a call to a trace-elidable function with
constant arguments in a trace can be replaced with the result of the call seen
during tracing.
As an example, take the class on the left. Tracing the call \texttt{a.f(10)} of
@@ -621,7 +620,7 @@
which lets the interpreter author communicate invariants to the optimizer. In
this case, she could decide that the \texttt{x} field of instances of
\texttt{A} is immutable, and therefore \texttt{c}
-is an trace-elidable function. To communicate this, there is an \texttt{@elidable} decorator.
+is a trace-elidable function. To communicate this, there is an \texttt{@elidable} decorator.
If the code in \texttt{c} should be constant-folded away, we would change the
class as follows:
@@ -698,7 +697,7 @@
The first step in making \texttt{getattr} faster in our object model is to optimize
away the dictionary lookups on the instances.
The hints of the previous section
-don't seem to help with the current object model. There is
+do not seem to help with the current object model. There is
no trace-elidable function to be seen, and the instance is not a candidate for
promotion, because there tend to be many instances.
@@ -726,7 +725,7 @@
reference to a map, which maps field names to indexes into a storage list. The
storage list contains the actual field values. Maps are shared between
different instances, therefore they have to be immutable, which means
-that their \texttt{getindex} method is an trace-elidable function. When a new attribute is added
+that their \texttt{getindex} method is a trace-elidable function. When a new attribute is added
to an instance, a new map needs to be chosen, which is done with the
\texttt{add\_attribute} method on the previous map. This function is also
trace-elidable, because it caches all new instances of \texttt{Map} that it creates, to make
@@ -735,7 +734,7 @@
introduced maps, it is safe to promote the map everywhere, because we assume
that the number of different instance layouts is small.
-With this adapted instance implementation, the trace we saw in Section~\ref{sub:running} changes to the
+With this adapted instance implementation, the trace we saw in Section~\ref{sub:running} changes to that of
Figure~\ref{fig:trace2}. There \texttt{0xb74af4a8} is the memory address of the
\texttt{Map} instance that has been promoted. Operations that can be
optimized away are grayed out, their results will be replaced with
@@ -776,7 +775,7 @@
enough.\footnote{There is a more complex variant of the presented technique
that can accommodate quick-changing class fields a lot better.}
-What we would really like is if the \texttt{Class.find\_method} method were trace-elidable.
+What we would really like is that the \texttt{Class.find\_method} method is trace-elidable.
But it cannot be, because it is always possible to change the class itself.
Every time the class changes, \texttt{find\_method} can potentially return a
new value.
@@ -798,7 +797,7 @@
What is interesting here is that \texttt{\_find\_method} takes the \texttt{version}
argument but it does not use it at all. Its only purpose is to make the call
trace-elidable, because when the version object changes, the result of the call might be
-different than the previous one.
+different from the previous one.
\begin{figure}
\input{code/trace4.tex}
@@ -956,7 +955,7 @@
Lua VM in C, which makes it hard to judge the effectiveness of the approach.
SPUR \cite{bebenita_spur:_2010} is a tracing JIT for CIL bytecode, which is then
-used to trace through an JavaScript implementation written in C\#. The
+used to trace through a JavaScript implementation written in C\#. The
JavaScript implementation compiles JavaScript to CIL bytecode together with an
implementation of the JavaScript object model. The object model uses maps and
inline caches to speed up operations on objects. The tracer traces through
@@ -985,7 +984,7 @@
\cite{rose_bytecodes_2009} that will be added to the JVM is supposed to make
the implementation of dynamic languages on top of JVMs easier. The bytecode
gives the user access to generalized inline caches.
It requires of course compilation to JVM bytecode instead of simply writing an
interpreter, predictability of performance across JVMs is also an open question.
-We already explored promotion in other context, such as earlier versions of
+We already explored promotion in other contexts, such as earlier versions of
PyPy's JIT. %as well as a Prolog partial evaluator \cite{bolz_towards_2009}
Promotion is also heavily
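
For readers of this patch who want to see the technique the changed passages describe in one place, below is a minimal, self-contained sketch of the map-based instance layout with trace-elidable lookups and map promotion. It is illustrative plain Python, not code from the paper or from PyPy: the elidable and promote stand-ins only mark where the RPython JIT hints discussed in the text would go, the name EMPTY_MAP is invented for the example, and the class-side lookup (find_method and class versions) is left out.

def elidable(func):
    # Stand-in for the @elidable hint described in the text; it does nothing here.
    return func

def promote(value):
    # Stand-in for the promotion hint described in the text; it does nothing here.
    return value

class Map(object):
    def __init__(self, indexes=None):
        self.indexes = indexes if indexes is not None else {}  # field name -> storage index
        self.other_maps = {}  # cache of maps derived from this one by adding one attribute

    @elidable
    def getindex(self, name):
        return self.indexes.get(name, -1)

    @elidable
    def add_attribute(self, name):
        # Caching the successor maps is what makes this function trace-elidable:
        # asking the same map for the same new attribute always returns the same map.
        if name not in self.other_maps:
            new_indexes = self.indexes.copy()
            new_indexes[name] = len(self.indexes)
            self.other_maps[name] = Map(new_indexes)
        return self.other_maps[name]

EMPTY_MAP = Map()

class Instance(object):
    def __init__(self):
        self.map = EMPTY_MAP
        self.storage = []  # the actual field values, indexed via the map

    def getattr(self, name):
        map = promote(self.map)  # few layouts exist, so promoting the map is cheap
        index = map.getindex(name)
        if index != -1:
            return self.storage[index]
        raise AttributeError(name)

    def setattr(self, name, value):
        map = promote(self.map)
        index = map.getindex(name)
        if index != -1:
            self.storage[index] = value
        else:
            self.map = map.add_attribute(name)
            self.storage.append(value)

if __name__ == "__main__":
    inst = Instance()
    inst.setattr("x", 5)
    inst.setattr("y", 6)
    assert inst.getattr("x") == 5 and inst.getattr("y") == 6

Because all instances with the same attribute layout share one Map object, promoting the map lets the tracer treat it as a constant, and the elidable getindex call can then be constant-folded out of the trace, which is the effect the changed passages describe.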