Author: Remi Meier <remi.me...@inf.ethz.ch>
Branch: extradoc
Changeset: r5349:897d0fa65804
Date: 2014-06-17 17:23 +0200
http://bitbucket.org/pypy/extradoc/changeset/897d0fa65804/
Log:	some adjustments

diff --git a/talk/icooolps2014/position-paper.pdf b/talk/icooolps2014/position-paper.pdf
index 4c20e424eab52b3d1bdeed6cd22738c5db3b0800..e7a9e9a08d66ba2c9c26bafa1c59d5d4a2e7bf14
GIT binary patch
[cut]

diff --git a/talk/icooolps2014/position-paper.tex b/talk/icooolps2014/position-paper.tex
--- a/talk/icooolps2014/position-paper.tex
+++ b/talk/icooolps2014/position-paper.tex
@@ -116,7 +116,7 @@
 Dynamic languages became very popular in recent years. At some point,
 the need for concurrency arose, and many of them made the choice to
 use a single global interpreter lock (GIL) to synchronise
-the interpreter in a multithreading scenario. This choice however
+the interpreter in a multithreading scenario. This choice, however,
 makes it impossible to actually run code in parallel.

 Here we want to compare different approaches to replacing the GIL
@@ -143,7 +143,7 @@
 performance increases less and less every year, many dynamic languages
 have a problem. While there is certainly a lot of popularity around
 languages like Python and Ruby, their ability to make use of multiple
-cores is somewhat limited. For ease of implementation they chose to
+cores is somewhat limited. For ease of implementation, they chose to
 use a single, global interpreter lock (GIL) to synchronise the
 execution of code in multiple threads. While this is a
 straight-forward way to eliminate synchronisation issues in the
@@ -168,43 +168,20 @@
 threads for concurrency, as well as the changes that are required to
 the interpreter itself.

-These requirements are not easy to meet. We argue that STM is the
-overall winner. While it currently has a big performance problem, it
-gets more points in all the other categories. We think that it is the
-only solution that also provides a better synchronisation mechanism to
-the application in the form of parallelisable atomic blocks.
-
-%% \subsection{Issue}
-%% The issue that we want to discuss is how to efficiently support
-%% multi-core parallel execution of code in dynamic languages that were
-%% designed with GIL semantics in mind.
-
-%% Furthermore, a solution to this problem should also bring better
-%% synchronisation mechanism with it...
-
-%% (supporting (large) atomic blocks for synchronisation)
-
-%% \subsection{Our Position}
-%% Current solutions for replacing the GIL include STM, HTM, and
-%% fine-grained locking. STM is usually too slow, HTM very limited, and
-%% locking suffers from complexity that makes it hard to implement
-%% correctly. We argue that the best way forward is still STM and that
-%% its performance problem can be solved.
-
-%% Current solutions like STM, HTM, and fine-grained locking are slow, hard
-%% to implement correctly, and don't fit the specific problems of dynamic
-%% language. STM is the best way forward but has bad performance, so we
-%% fix that.
+These requirements are not easy to meet. The author's position is that
+STM provides the best way forward. While STM currently has a big
+performance problem, it gets more points in the other categories. We
+think that it is the only solution that also provides a better
+synchronisation mechanism to the application in the form of
+parallelisable atomic blocks. In the following section, we try to
+present a balanced view of the compared approaches.

 \section{Discussion}

-In this section we examine the approaches and highlight their
-advantages and disadvantages.
-%% \paragraph{dynamic language VM problems}
-%% XXX:
-%% - high allocation rate (short lived objects)\\
-%% - (don't know anything about the program that runs until it actually runs: arbitrary atomic block size)
+In this section we first explain the motivation for using a GIL and
+then examine different approaches to remove or avoid it -- highlighting
+their advantages and disadvantages.
 \subsection{Why is there a GIL?}

@@ -214,7 +191,7 @@
 also applies to Abstract Syntax Tree (AST) interpreters, where the
 GIL may only be released between interpreting two AST nodes.}. The
 interpreter can thus rely on complete isolation and atomicity for the
-instructions' execution. Thus, accesses to data structures like
+instructions' execution. Also, accesses to data structures like
 dictionaries and lists happen atomically and do not need additional
 protection from data races when shared between threads.
@@ -222,8 +199,8 @@
 model~\cite{lamport79}. This can be very valuable as it means fewer
 surprises for the programmer. For example in Table~\ref{tab:seq_cons},
 the programmer can expect the critical section to only be entered by
-one thread. If the model allowed to buffer the writes, both threads
-may enter the critical section at the same time.
+one thread. On the other hand, if the model allowed the writes to be
+buffered, both threads might enter the critical section at the same time.

 \begin{table}[!ht]
 \begin{center}
@@ -231,11 +208,12 @@
 \hline
 Thread 1 & Thread 2 \\
 \hline
-\multicolumn{2}{|l|}{\texttt{A = B = 0}} \\
+\multicolumn{2}{|c|}{\texttt{A = B = 0}} \\
 \hline
 \texttt{A = 1} & \texttt{B = 1} \\
 \texttt{if B == 0:} & \texttt{if A == 0:} \\
-\texttt{ critical section} & \texttt{ critical section} \\
+\multicolumn{2}{|c|}{only one thread enters here} \\
+\multicolumn{2}{|c|}{(e.g.\ critical section)} \\
 \hline
 \end{tabular}
 \caption{Critical section with a sequential consistency model.}
@@ -263,7 +241,7 @@
 degree. Again, a potential solution should be able to integrate with
 external libraries with similar ease. We will however focus our
 argumentation more on running code in the interpreted language in
-parallel, not the external C calls.
+parallel, not the external C code.

 Since the GIL is mostly an implementation detail of the interpreter,
 it is not exposed to the application running on top of it.
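The store-buffering pattern from the table above can be sketched directly in Python. This illustration is not from the paper; it assumes CPython-style execution, where the GIL gives these reads and writes sequentially consistent behaviour, so at most one thread can observe the other's flag still at zero:

```python
import threading

# Dekker-style flags from the table: A = B = 0, each thread sets its
# own flag and then checks the other's.
A = 0
B = 0
entered = []                      # records which threads "entered"
record_lock = threading.Lock()    # protects the shared list

def thread1():
    global A
    A = 1
    if B == 0:                    # under SC: B == 0 means T2 hasn't written yet
        with record_lock:
            entered.append("T1")

def thread2():
    global B
    B = 1
    if A == 0:
        with record_lock:
            entered.append("T2")

t1 = threading.Thread(target=thread1)
t2 = threading.Thread(target=thread2)
t1.start(); t2.start()
t1.join(); t2.join()

# With sequentially consistent writes (as under a GIL), at most one
# thread can read the other's flag as 0, so at most one entry appears.
# A model that buffers writes could let both threads enter.
print(entered)
```

Running this repeatedly on CPython always yields zero or one entry, never two; that is the guarantee the sequential consistency model provides.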
 To
@@ -284,7 +262,8 @@
 that the GIL is not released during the execution of the atomic
 block. Of course, this still means that no two atomic blocks can
 execute in parallel or even concurrently. Potential solutions are
-preferable if they provide a good way to implement atomic blocks that
+preferable if they provide a good way to implement atomic blocks
+(or another, comparable synchronisation mechanism) that
 are also able to be executed in parallel.
@@ -318,8 +297,8 @@
 \subsection{Potential Solutions}
 \label{sec:pot_solutions}

-For the discussion we define a set of criteria to evaluate the
-multiple potential solutions for removing or avoiding the GIL and its
+For the discussion, we define a set of criteria to evaluate the
+potential solutions for removing or avoiding the GIL and its
 limitations:

 \begin{description}
@@ -350,14 +329,13 @@
 and releasing locks produces. The former means that sometimes it is
 necessary to fall back to less fine-grained locking, preventing some
 potential parallelism in order to keep the complexity manageable.
-The latter means that we lose a bit of performance in the
-single-threaded case compared to the GIL, which requires much less
-acquire-release operations.
+The latter means that we lose a bit of performance compared to the
+GIL, which requires far fewer acquire-release operations.

 Jython~\cite{webjython} is one project that implements an interpreter
 for Python on the Java Virtual Machine (JVM) and that uses
 fine-grained locking\footnote{The performance impact of
-fine-grained locking is milder in Java than it would be in a typical piece
+fine-grained locking is milder on the JVM than it would be in a typical piece
 of C code; see e.g.~\cite{biased}.} to correctly synchronise the
 interpreter.
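The atomic-block idea discussed above can be sketched with a deliberately naive emulation: a single re-entrant lock that stands in for "do not release the GIL during this block". The `atomic` object and `transfer` function here are made up for illustration; real systems such as pypy-stm aim to run such blocks in parallel via STM, whereas this emulation serialises them, which is exactly the limitation the text describes:

```python
import threading

# One global lock emulating "the GIL is held for the whole block".
_global_lock = threading.RLock()

class _Atomic:
    """Hypothetical atomic-block context manager (serialising sketch)."""
    def __enter__(self):
        _global_lock.acquire()
    def __exit__(self, *exc):
        _global_lock.release()
        return False

atomic = _Atomic()

counter = 0

def transfer():
    global counter
    with atomic:           # the composite read-modify-write appears indivisible
        tmp = counter
        counter = tmp + 1

threads = [threading.Thread(target=transfer) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

Without the `with atomic:` block, the read-modify-write could interleave and lose updates; with it, the count is always exact, but no two blocks ever overlap, which is why solutions that can run atomic blocks in parallel score better.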
 For a language like Python, one needs quite a few, carefully placed
 locks -- every dictionary, list, instance, or mutable
@@ -367,21 +345,21 @@
 there is no central location for all these locks, the complexity of
 the implementation is quite a bit larger compared to using a
 GIL. Integrating external, non-thread-safe libraries should
-however be very simple too. One could simply use one lock per library
+however be very simple too. One can simply use one lock per library
 to avoid this issue.

 In the end, fine-grained locking can transparently replace the GIL
 and therefore parallelise existing applications, generally without any
-changes\footnote{There are rare cases where not having atomic
-  bytecodes actually changes the semantics. E.g.\ in Jython,
-  \texttt{dict1.update(dict2)} is not atomic: it first reads data from
-  \texttt{dict2} with \texttt{dict2}'s lock, and then puts it into
-  \texttt{dict1} with \texttt{dict1}'s lock. A lot can happen
-  in-between.}. An implementation has to follow the GIL semantics very
+changes. An implementation has to follow the GIL semantics very
 closely, otherwise it may expose latent data races in existing
-applications which are just not exposed with a GIL. This approach does
-however not provide a better parallelising synchronisation mechanism
-to the application like e.g.\ atomic blocks.
+applications which simply never surface with a GIL\footnote{There are
+  rare cases where not having atomic bytecodes actually changes the
+  semantics. E.g.\ in Jython, \texttt{dict1.update(dict2)} is not
+  atomic: it first reads data from \texttt{dict2} with \texttt{dict2}'s
+  lock, and then puts it into \texttt{dict1} with \texttt{dict1}'s
+  lock. A lot can happen in-between.}. This approach, however, does
+not provide a better parallelising synchronisation mechanism to the
+application and still requires explicit locking.
@@ -389,7 +367,7 @@
 There are also approaches that work around the GIL instead of trying
 to replace it.
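The footnote's Jython example can be made concrete with a sketch of per-object locking. The `LockedDict` class below is hypothetical (Jython locks its builtin dicts internally); it shows the structure of the problem: `update_from` takes `other`'s lock for the read and `self`'s lock for the write, so the update as a whole is not atomic and another thread may mutate `other` between the two steps:

```python
import threading

class LockedDict:
    """Hypothetical fine-grained-locked dict: one lock per instance."""
    def __init__(self):
        self.lock = threading.Lock()
        self.data = {}

    def set(self, key, value):
        with self.lock:
            self.data[key] = value

    def snapshot(self):
        with self.lock:               # atomic read of this dict alone
            return dict(self.data)

    def update_from(self, other):
        items = other.snapshot()      # step 1: read under other's lock
        # ... a concurrent thread may modify `other` right here,
        # so the two steps together are NOT atomic ...
        with self.lock:               # step 2: write under our own lock
            self.data.update(items)

d1, d2 = LockedDict(), LockedDict()
d2.set("x", 1)
d1.update_from(d2)
print(d1.data)
```

Each individual operation is race-free, yet `d1.update_from(d2)` is weaker than the GIL-atomic `dict1.update(dict2)` bytecode semantics; closing that gap without a central lock is what makes a correct fine-grained implementation hard.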
 If an application can be split into completely
-independent parts that only very rarely need to share anything, or
+independent parts that only very rarely need to share something, or
 only do so via an external program like a database, then it is
 sensible to have one GIL per independent part. At the extreme, there
 are applications that parallelise perfectly simply by running
@@ -450,7 +428,7 @@
 suffered a lot from this.

 The performance of HTM is pretty good as it does not introduce much
-overhead ($<40\%$ overhead~\cite{odaira14}). And it can transparently
+overhead ($<40\%$~\cite{odaira14}). It can also transparently
 parallelise existing applications to some degree. The implementation
 is very straight-forward because it directly replaces the GIL in a
 central place. HTM is also directly compatible with any external
@@ -512,8 +490,8 @@
 general overview in Table \ref{tab:comparison}. The points were assigned
 according to the criteria described in \ref{sec:pot_solutions}. Since
 the criteria are defined intuitively, there are no formal justifications
-for the assigned points. The reader is thus advised to take them with a
-grain of salt.
+for the number of points. The reader is thus advised to take the result
+with a grain of salt.

 The general picture is anything but clear. It looks like HTM
 may be a good solution to replace the GIL in the near future. Current
@@ -537,9 +515,9 @@
 One way to get more performance is to develop STM systems that make
 better use of low-level features in existing OS kernels. We are
-currently working on a STM system that makes use of several such
+currently working on an STM system that makes use of several such
 features like virtual memory and memory segmentation. We further
-tailor the system to the discussed use case which gives us an
+tailor the system to the discussed use case, which gives us an
 advantage over other STM systems that are more general.
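The "one GIL per independent part" workaround described above corresponds to process-based parallelism in standard Python. A minimal sketch with `multiprocessing` (the `count_primes` workload is made up for illustration): each worker process runs its own interpreter with its own GIL, so CPU-bound work proceeds in parallel as long as the parts share essentially nothing:

```python
from multiprocessing import Pool

def count_primes(bound):
    """CPU-bound toy workload: count primes below `bound` by trial division."""
    return sum(
        1
        for n in range(2, bound)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )

if __name__ == "__main__":
    # Four independent parts, one interpreter (and one GIL) each.
    with Pool(4) as pool:
        results = pool.map(count_primes, [2000] * 4)
    print(results)
```

The cost is exactly what the text notes: the parts communicate only by message passing (pickled arguments and results), so anything that must be shared frequently makes this approach fall apart.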
 With this approach, initial results suggest that we can keep the
 overhead of STM well below 50\%. A hybrid TM system, which also uses
 HTM to accelerate

_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
https://mail.python.org/mailman/listinfo/pypy-commit