Author: Hakan Ardo <[email protected]>
Branch: extradoc
Changeset: r3650:ec569faca194
Date: 2011-06-12 18:22 +0200
http://bitbucket.org/pypy/extradoc/changeset/ec569faca194/
Log: started to describe some benchmarks
diff --git a/talk/iwtc11/paper.tex b/talk/iwtc11/paper.tex
--- a/talk/iwtc11/paper.tex
+++ b/talk/iwtc11/paper.tex
@@ -578,6 +578,62 @@
\section{Benchmarks}
+The loop peeling optimization was implemented in the PyPy
+framework, which means that the JIT compilers generated for all
+interpreters implemented within PyPy can now take advantage of
+it. Benchmarks have been executed for a few different interpreters,
+and we see improvements in several cases. The ideal loop for this
+optimization is a short numerical calculation with no failing guards
+and no external calls.
+
+\subsection{Python}
+The Python interpreter of the PyPy framework is a complete Python
+version 2.7 compatible interpreter. A set of numerical calculations
+were implemented in both Python and C and their runtimes
+compared. The benchmarks are
+\begin{itemize}
+\item {\bf sqrt}: approximates the square root of $y$ as $x_\infty$
+  with $x_0=y/2$ and $x_k = \left( x_{k-1} + y/x_{k-1} \right) /
+  2$. There are three different versions of this benchmark where $x_k$
+  is represented with different types of objects: ints, floats and
+  Fix16s. The latter, Fix16, is a custom class that implements
+  fixed point arithmetic with 16 bits of precision. In Python there is
+  only a single implementation of the benchmark that gets specialized
+  depending on the class of its input argument, $y$, while in C there
+  are three different implementations.
+\item {\bf conv3}: one dimensional convolution with a kernel of fixed
+ size $3$.
+\item {\bf conv5}: one dimensional convolution with a kernel of fixed
+ size $5$.
+\item {\bf conv3x3}: two dimensional convolution with a kernel of
+  fixed size $3 \times 3$ using a custom class to represent two
+  dimensional arrays.
+\item {\bf dilate3x3}: two dimensional dilation with a kernel of fixed
+  size $3 \times 3$. This is similar to convolution, but instead of
+  summing over the elements, the maximum is taken. That places an
+  external call to a max function within the loop, which prevents some
+  of the optimizations.
+\item {\bf sobel}: a low level video processing algorithm used to
+  locate edges in an image. It calculates the gradient magnitude
+  using Sobel derivatives. The algorithm is implemented in Python
+  on top of a custom image class that is specially designed for the
+  problem. It ensures that there will be no failing guards and makes
+  a lot of the two dimensional index calculations loop invariant. The
+  intention here is twofold. It shows that the performance impact of
+  having wrapper classes that give objects some application specific
+  properties is negligible; this is due to the inlining performed
+  during tracing and the allocation removal of the index objects
+  introduced. It also shows that it is possible to do some low level
+  hand optimizations of the Python code and hide those optimizations
+  behind a nice interface without losing performance.
+\end{itemize}
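As an illustration, the sqrt and conv3 benchmarks described above can be sketched in plain Python. This is a hypothetical sketch only: the function names, the iteration count, and the kernel handling are assumptions, not the actual benchmark sources.

```python
# Hypothetical sketches of two benchmarks; names and details are
# illustrative assumptions, not the actual benchmark code.

def sqrt_approx(y, iterations=10):
    """Approximate sqrt(y) with the iteration x0 = y/2,
    x_k = (x_{k-1} + y/x_{k-1}) / 2 (Newton's method)."""
    x = y / 2.0
    for _ in range(iterations):
        x = (x + y / x) / 2.0
    return x

def conv3(a, k):
    """One dimensional convolution of sequence a with a kernel k
    of fixed size 3 (no boundary padding in this sketch)."""
    return [k[0] * a[i] + k[1] * a[i + 1] + k[2] * a[i + 2]
            for i in range(len(a) - 2)]

print(sqrt_approx(2.0))                # converges toward sqrt(2)
print(conv3([1, 2, 3, 4], [1, 1, 1]))  # [6, 9]
```

The int, float and Fix16 versions of sqrt would all run through this single Python implementation, which the tracing JIT then specializes according to the class of $y$.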
+
+\subsection{Numpy}
+XXX: Fijal?
+
+\subsection{Prolog}
+XXX: Carl?
+
\appendix
\section{Appendix Title}
_______________________________________________
pypy-commit mailing list
[email protected]
http://mail.python.org/mailman/listinfo/pypy-commit