Add Plasma blog post

Project: http://git-wip-us.apache.org/repos/asf/arrow-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow-site/commit/3b67853c
Tree: http://git-wip-us.apache.org/repos/asf/arrow-site/tree/3b67853c
Diff: http://git-wip-us.apache.org/repos/asf/arrow-site/diff/3b67853c

Branch: refs/heads/asf-site
Commit: 3b67853c506b90b862ddd9f1aecf69b2e609fa7c
Parents: b286da8
Author: Wes McKinney <wes.mckin...@twosigma.com>
Authored: Tue Aug 8 10:25:46 2017 -0400
Committer: Wes McKinney <wes.mckin...@twosigma.com>
Committed: Tue Aug 8 10:25:46 2017 -0400

----------------------------------------------------------------------
 .../08/plasma-in-memory-object-store/index.html | 261 +++++++++++++++++++
 blog/index.html                                 | 153 +++++++++++
 css/main.css                                    |   2 +-
 docs/ipc.html                                   |  27 +-
 docs/memory_layout.html                         |  49 ++--
 docs/metadata.html                              |  27 +-
 feed.xml                                        | 123 ++++++++-
 7 files changed, 599 insertions(+), 43 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3b67853c/blog/2017/08/08/plasma-in-memory-object-store/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/08/08/plasma-in-memory-object-store/index.html 
b/blog/2017/08/08/plasma-in-memory-object-store/index.html
new file mode 100644
index 0000000..3fd78fa
--- /dev/null
+++ b/blog/2017/08/08/plasma-in-memory-object-store/index.html
@@ -0,0 +1,261 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Plasma In-Memory Object Store</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <title>Apache Arrow Homepage</title>
+    <link rel="stylesheet" 
href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js";
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" 
data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache 
Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW";>Issue 
Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow";>Source Code</a></li>
+            <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/";>Mailing List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com";>Slack 
Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/";>ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/";>License</a></li>
+            <li><a 
href="http://www.apache.org/foundation/sponsorship.html";>Donate</a></li>
+            <li><a 
href="http://www.apache.org/foundation/thanks.html";>Thanks</a></li>
+            <li><a href="http://www.apache.org/security/";>Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/";>
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Plasma In-Memory Object Store
+      <a href="/blog/2017/08/08/plasma-in-memory-object-store/" 
class="permalink" title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            08 Aug 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://people.apache.org/~Philipp Moritz and Robert 
Nishihara"><i class="fa fa-user"></i>  (Philipp Moritz and Robert Nishihara)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p><em><a href="https://people.eecs.berkeley.edu/~pcmoritz/";>Philipp 
Moritz</a> and <a href="http://www.robertnishihara.com";>Robert Nishihara</a> 
are graduate students at UC
+ Berkeley.</em></p>
+
+<h2 id="plasma-a-high-performance-shared-memory-object-store">Plasma: A 
High-Performance Shared-Memory Object Store</h2>
+
+<h3 id="motivating-plasma">Motivating Plasma</h3>
+
+<p>This blog post presents Plasma, an in-memory object store that is being
+developed as part of Apache Arrow. <strong>Plasma holds immutable objects in 
shared
+memory so that they can be accessed efficiently by many clients across process
+boundaries.</strong> In light of the trend toward larger and larger multicore 
machines,
+Plasma enables critical performance optimizations in the big data regime.</p>
+
+<p>Plasma was initially developed as part of <a 
href="https://github.com/ray-project/ray";>Ray</a>, and has recently been moved
+to Apache Arrow in the hopes that it will be broadly useful.</p>
+
+<p>One of the goals of Apache Arrow is to serve as a common data layer enabling
+zero-copy data exchange between multiple frameworks. A key component of this
+vision is the use of off-heap memory management (via Plasma) for storing and
+sharing Arrow-serialized objects between applications.</p>
+
+<p><strong>Expensive serialization and deserialization as well as data copying 
are a
+common performance bottleneck in distributed computing.</strong> For example, a
+Python-based execution framework that wishes to distribute computation across
+multiple Python “worker” processes and then aggregate the results in a 
single
+“driver” process may choose to serialize data using the built-in <code 
class="highlighter-rouge">pickle</code>
+library. Assuming one Python process per core, each worker process would have 
to
+copy and deserialize the data, resulting in excessive memory usage. The driver
+process would then have to deserialize results from each of the workers,
+resulting in a bottleneck.</p>
+
+<p>Using Plasma plus Arrow, the data being operated on would be placed in the
+Plasma store once, and all of the workers would read the data without copying 
or
+deserializing it (the workers would map the relevant region of memory into 
their
+own address spaces). The workers would then put the results of their 
computation
+back into the Plasma store, which the driver could then read and aggregate
+without copying or deserializing the data.</p>
+
+<h3 id="the-plasma-api">The Plasma API:</h3>
+
+<p>Below we illustrate a subset of the API. The C++ API is documented more 
fully
+<a 
href="https://github.com/apache/arrow/blob/master/cpp/apidoc/tutorials/plasma.md";>here</a>,
 and the Python API is documented <a 
href="https://github.com/apache/arrow/blob/master/python/doc/source/plasma.rst";>here</a>.</p>
+
+<p><strong>Object IDs:</strong> Each object is associated with a string of 
bytes.</p>
+
+<p><strong>Creating an object:</strong> Objects are stored in Plasma in two 
stages. First, the
+object store <em>creates</em> the object by allocating a buffer for it. At 
this point,
+the client can write to the buffer and construct the object within the 
allocated
+buffer. When the client is done, the client <em>seals</em> the buffer making 
the object
+immutable and making it available to other Plasma clients.</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="c"># Create an object.</span>
+<span class="n">object_id</span> <span class="o">=</span> <span 
class="n">pyarrow</span><span class="o">.</span><span 
class="n">plasma</span><span class="o">.</span><span 
class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> 
<span class="o">*</span> <span class="n">b</span><span 
class="s">'a'</span><span class="p">)</span>
+<span class="n">object_size</span> <span class="o">=</span> <span 
class="mi">1000</span>
+<span class="nb">buffer</span> <span class="o">=</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">client</span><span class="o">.</span><span 
class="n">create</span><span class="p">(</span><span 
class="n">object_id</span><span class="p">,</span> <span 
class="n">object_size</span><span class="p">))</span>
+
+<span class="c"># Write to the buffer.</span>
+<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> 
<span class="nb">range</span><span class="p">(</span><span 
class="mi">1000</span><span class="p">):</span>
+    <span class="nb">buffer</span><span class="p">[</span><span 
class="n">i</span><span class="p">]</span> <span class="o">=</span> <span 
class="mi">0</span>
+
+<span class="c"># Seal the object making it immutable and available to other 
clients.</span>
+<span class="n">client</span><span class="o">.</span><span 
class="n">seal</span><span class="p">(</span><span 
class="n">object_id</span><span class="p">)</span>
+</code></pre>
+</div>
+
+<p><strong>Getting an object:</strong> After an object has been sealed, any 
client who knows the
+object ID can get the object.</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="c"># Get the object from the store. This 
blocks until the object has been sealed.</span>
+<span class="n">object_id</span> <span class="o">=</span> <span 
class="n">pyarrow</span><span class="o">.</span><span 
class="n">plasma</span><span class="o">.</span><span 
class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> 
<span class="o">*</span> <span class="n">b</span><span 
class="s">'a'</span><span class="p">)</span>
+<span class="p">[</span><span class="n">buff</span><span class="p">]</span> 
<span class="o">=</span> <span class="n">client</span><span 
class="o">.</span><span class="n">get</span><span class="p">([</span><span 
class="n">object_id</span><span class="p">])</span>
+<span class="nb">buffer</span> <span class="o">=</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">buff</span><span class="p">)</span>
+</code></pre>
+</div>
+
+<p>If the object has not been sealed yet, then the call to <code 
class="highlighter-rouge">client.get</code> will block
+until the object has been sealed.</p>
+
+<h3 id="a-sorting-application">A sorting application</h3>
+
+<p>To illustrate the benefits of Plasma, we demonstrate an <strong>11x 
speedup</strong> (on a
+machine with 20 physical cores) for sorting a large pandas DataFrame (one
+billion entries). The baseline is the built-in pandas sort function, which 
sorts
+the DataFrame in 477 seconds. To leverage multiple cores, we implement the
+following standard distributed sorting scheme.</p>
+
+<ul>
+  <li>We assume that the data is partitioned across K pandas DataFrames and 
that
+each one already lives in the Plasma store.</li>
+  <li>We subsample the data, sort the subsampled data, and use the result to 
define
+L non-overlapping buckets.</li>
+  <li>For each of the K data partitions and each of the L buckets, we find the
+subset of the data partition that falls in the bucket, and we sort that
+subset.</li>
+  <li>For each of the L buckets, we gather all of the K sorted subsets that 
fall in
+that bucket.</li>
+  <li>For each of the L buckets, we merge the corresponding K sorted 
subsets.</li>
+  <li>We turn each bucket into a pandas DataFrame and place it in the Plasma 
store.</li>
+</ul>
+
+<p>Using this scheme, we can sort the DataFrame (the data starts and ends in 
the
+Plasma store), in 44 seconds, giving an 11x speedup over the baseline.</p>
+
+<h3 id="design">Design</h3>
+
+<p>The Plasma store runs as a separate process. It is written in C++ and is
+designed as a single-threaded event loop based on the <a 
href="https://redis.io/";>Redis</a> event loop library.
+The plasma client library can be linked into applications. Clients communicate
+with the Plasma store via messages serialized using <a 
href="https://google.github.io/flatbuffers/";>Google Flatbuffers</a>.</p>
+
+<h3 id="call-for-contributions">Call for contributions</h3>
+
+<p>Plasma is a work in progress, and the API is currently unstable. Today 
Plasma is
+primarily used in <a href="https://github.com/ray-project/ray";>Ray</a> as an 
in-memory cache for Arrow serialized objects.
+We are looking for a broader set of use cases to help refine Plasma’s API. In
+addition, we are looking for contributions in a variety of areas including
+improving performance and building other language bindings. Please let us know
+if you are interested in getting involved with the project.</p>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache 
Arrow project logo are either registered trademarks or trademarks of The Apache 
Software Foundation in the United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3b67853c/blog/index.html
----------------------------------------------------------------------
diff --git a/blog/index.html b/blog/index.html
index 1350659..bf83de3 100644
--- a/blog/index.html
+++ b/blog/index.html
@@ -111,6 +111,159 @@
     
   <div class="container">
     <h2>
+      Plasma In-Memory Object Store
+      <a href="/blog/2017/08/08/plasma-in-memory-object-store/" 
class="permalink" title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            08 Aug 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href=""><i class="fa fa-user"></i>  (Philipp Moritz and Robert 
Nishihara)</a>
+        </div>
+      </div>
+    </div>
+    <!--
+
+-->
+
+<p><em><a href="https://people.eecs.berkeley.edu/~pcmoritz/";>Philipp 
Moritz</a> and <a href="http://www.robertnishihara.com";>Robert Nishihara</a> 
are graduate students at UC
+ Berkeley.</em></p>
+
+<h2 id="plasma-a-high-performance-shared-memory-object-store">Plasma: A 
High-Performance Shared-Memory Object Store</h2>
+
+<h3 id="motivating-plasma">Motivating Plasma</h3>
+
+<p>This blog post presents Plasma, an in-memory object store that is being
+developed as part of Apache Arrow. <strong>Plasma holds immutable objects in 
shared
+memory so that they can be accessed efficiently by many clients across process
+boundaries.</strong> In light of the trend toward larger and larger multicore 
machines,
+Plasma enables critical performance optimizations in the big data regime.</p>
+
+<p>Plasma was initially developed as part of <a 
href="https://github.com/ray-project/ray";>Ray</a>, and has recently been moved
+to Apache Arrow in the hopes that it will be broadly useful.</p>
+
+<p>One of the goals of Apache Arrow is to serve as a common data layer enabling
+zero-copy data exchange between multiple frameworks. A key component of this
+vision is the use of off-heap memory management (via Plasma) for storing and
+sharing Arrow-serialized objects between applications.</p>
+
+<p><strong>Expensive serialization and deserialization as well as data copying 
are a
+common performance bottleneck in distributed computing.</strong> For example, a
+Python-based execution framework that wishes to distribute computation across
+multiple Python “worker” processes and then aggregate the results in a 
single
+“driver” process may choose to serialize data using the built-in <code 
class="highlighter-rouge">pickle</code>
+library. Assuming one Python process per core, each worker process would have 
to
+copy and deserialize the data, resulting in excessive memory usage. The driver
+process would then have to deserialize results from each of the workers,
+resulting in a bottleneck.</p>
+
+<p>Using Plasma plus Arrow, the data being operated on would be placed in the
+Plasma store once, and all of the workers would read the data without copying 
or
+deserializing it (the workers would map the relevant region of memory into 
their
+own address spaces). The workers would then put the results of their 
computation
+back into the Plasma store, which the driver could then read and aggregate
+without copying or deserializing the data.</p>
+
+<h3 id="the-plasma-api">The Plasma API:</h3>
+
+<p>Below we illustrate a subset of the API. The C++ API is documented more 
fully
+<a 
href="https://github.com/apache/arrow/blob/master/cpp/apidoc/tutorials/plasma.md";>here</a>,
 and the Python API is documented <a 
href="https://github.com/apache/arrow/blob/master/python/doc/source/plasma.rst";>here</a>.</p>
+
+<p><strong>Object IDs:</strong> Each object is associated with a string of 
bytes.</p>
+
+<p><strong>Creating an object:</strong> Objects are stored in Plasma in two 
stages. First, the
+object store <em>creates</em> the object by allocating a buffer for it. At 
this point,
+the client can write to the buffer and construct the object within the 
allocated
+buffer. When the client is done, the client <em>seals</em> the buffer making 
the object
+immutable and making it available to other Plasma clients.</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="c"># Create an object.</span>
+<span class="n">object_id</span> <span class="o">=</span> <span 
class="n">pyarrow</span><span class="o">.</span><span 
class="n">plasma</span><span class="o">.</span><span 
class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> 
<span class="o">*</span> <span class="n">b</span><span 
class="s">'a'</span><span class="p">)</span>
+<span class="n">object_size</span> <span class="o">=</span> <span 
class="mi">1000</span>
+<span class="nb">buffer</span> <span class="o">=</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">client</span><span class="o">.</span><span 
class="n">create</span><span class="p">(</span><span 
class="n">object_id</span><span class="p">,</span> <span 
class="n">object_size</span><span class="p">))</span>
+
+<span class="c"># Write to the buffer.</span>
+<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> 
<span class="nb">range</span><span class="p">(</span><span 
class="mi">1000</span><span class="p">):</span>
+    <span class="nb">buffer</span><span class="p">[</span><span 
class="n">i</span><span class="p">]</span> <span class="o">=</span> <span 
class="mi">0</span>
+
+<span class="c"># Seal the object making it immutable and available to other 
clients.</span>
+<span class="n">client</span><span class="o">.</span><span 
class="n">seal</span><span class="p">(</span><span 
class="n">object_id</span><span class="p">)</span>
+</code></pre>
+</div>
+
+<p><strong>Getting an object:</strong> After an object has been sealed, any 
client who knows the
+object ID can get the object.</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="c"># Get the object from the store. This 
blocks until the object has been sealed.</span>
+<span class="n">object_id</span> <span class="o">=</span> <span 
class="n">pyarrow</span><span class="o">.</span><span 
class="n">plasma</span><span class="o">.</span><span 
class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> 
<span class="o">*</span> <span class="n">b</span><span 
class="s">'a'</span><span class="p">)</span>
+<span class="p">[</span><span class="n">buff</span><span class="p">]</span> 
<span class="o">=</span> <span class="n">client</span><span 
class="o">.</span><span class="n">get</span><span class="p">([</span><span 
class="n">object_id</span><span class="p">])</span>
+<span class="nb">buffer</span> <span class="o">=</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">buff</span><span class="p">)</span>
+</code></pre>
+</div>
+
+<p>If the object has not been sealed yet, then the call to <code 
class="highlighter-rouge">client.get</code> will block
+until the object has been sealed.</p>
+
+<h3 id="a-sorting-application">A sorting application</h3>
+
+<p>To illustrate the benefits of Plasma, we demonstrate an <strong>11x 
speedup</strong> (on a
+machine with 20 physical cores) for sorting a large pandas DataFrame (one
+billion entries). The baseline is the built-in pandas sort function, which 
sorts
+the DataFrame in 477 seconds. To leverage multiple cores, we implement the
+following standard distributed sorting scheme.</p>
+
+<ul>
+  <li>We assume that the data is partitioned across K pandas DataFrames and 
that
+each one already lives in the Plasma store.</li>
+  <li>We subsample the data, sort the subsampled data, and use the result to 
define
+L non-overlapping buckets.</li>
+  <li>For each of the K data partitions and each of the L buckets, we find the
+subset of the data partition that falls in the bucket, and we sort that
+subset.</li>
+  <li>For each of the L buckets, we gather all of the K sorted subsets that 
fall in
+that bucket.</li>
+  <li>For each of the L buckets, we merge the corresponding K sorted 
subsets.</li>
+  <li>We turn each bucket into a pandas DataFrame and place it in the Plasma 
store.</li>
+</ul>
+
+<p>Using this scheme, we can sort the DataFrame (the data starts and ends in 
the
+Plasma store), in 44 seconds, giving an 11x speedup over the baseline.</p>
+
+<h3 id="design">Design</h3>
+
+<p>The Plasma store runs as a separate process. It is written in C++ and is
+designed as a single-threaded event loop based on the <a 
href="https://redis.io/";>Redis</a> event loop library.
+The plasma client library can be linked into applications. Clients communicate
+with the Plasma store via messages serialized using <a 
href="https://google.github.io/flatbuffers/";>Google Flatbuffers</a>.</p>
+
+<h3 id="call-for-contributions">Call for contributions</h3>
+
+<p>Plasma is a work in progress, and the API is currently unstable. Today 
Plasma is
+primarily used in <a href="https://github.com/ray-project/ray";>Ray</a> as an 
in-memory cache for Arrow serialized objects.
+We are looking for a broader set of use cases to help refine Plasma’s API. In
+addition, we are looking for contributions in a variety of areas including
+improving performance and building other language bindings. Please let us know
+if you are interested in getting involved with the project.</p>
+
+
+  </div>
+
+  
+
+  
+    
+  <div class="container">
+    <h2>
       Speeding up PySpark with Apache Arrow
       <a href="/blog/2017/07/26/spark-arrow/" class="permalink" 
title="Permalink">∞</a>
     </h2>

Reply via email to