Add blog post for 0.11
Project: http://git-wip-us.apache.org/repos/asf/arrow-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow-site/commit/dc12f116
Tree: http://git-wip-us.apache.org/repos/asf/arrow-site/tree/dc12f116
Diff: http://git-wip-us.apache.org/repos/asf/arrow-site/diff/dc12f116

Branch: refs/heads/asf-site
Commit: dc12f116260813301ccb068ddf14816994f3aaaa
Parents: ca3b718
Author: Wes McKinney <[email protected]>
Authored: Tue Oct 9 05:01:03 2018 -0400
Committer: Wes McKinney <[email protected]>
Committed: Tue Oct 9 05:01:03 2018 -0400

----------------------------------------------------------------------
 blog/2017/05/08/0.3-release-japanese/index.html | 4 +-
 blog/2017/05/08/0.3-release/index.html | 4 +-
 blog/2017/05/23/0.4.0-release/index.html | 2 +-
 blog/2017/06/14/0.4.1-release/index.html | 2 +-
 blog/2017/06/16/turbodbc-arrow/index.html | 2 +-
 blog/2017/07/25/0.5.0-release/index.html | 2 +-
 blog/2017/07/26/spark-arrow/index.html | 2 +-
 .../08/plasma-in-memory-object-store/index.html | 22 +-
 blog/2017/08/16/0.6.0-release/index.html | 2 +-
 blog/2017/09/19/0.7.0-release/index.html | 2 +-
 .../index.html | 12 +-
 blog/2017/12/18/0.8.0-release/index.html | 2 +-
 .../12/18/java-vector-improvements/index.html | 2 +-
 blog/2018/03/22/0.9.0-release/index.html | 2 +-
 blog/2018/03/22/go-code-donation/index.html | 2 +-
 blog/2018/07/20/jemalloc/index.html | 2 +-
 blog/2018/08/07/0.10.0-release/index.html | 2 +-
 blog/index.html | 143 ++++-
 committers/index.html | 2 +-
 docs/ipc.html | 2 +-
 docs/memory_layout.html | 2 +-
 docs/metadata.html | 2 +-
 feed.xml | 216 +++----
 index.html | 2 +-
 install/index.html | 2 +-
 powered_by/index.html | 2 +-
 release/0.1.0.html | 2 +-
 release/0.10.0.html | 2 +-
 release/0.11.0.html | 596 ++++++++++---------
 release/0.2.0.html | 2 +-
 release/0.3.0.html | 2 +-
 release/0.4.0.html | 2 +-
 release/0.4.1.html | 2 +-
 release/0.5.0.html | 2 +-
 release/0.6.0.html | 2 +-
 release/0.7.0.html | 2 +-
 release/0.7.1.html | 2 +-
 release/0.8.0.html | 2 +-
 release/0.9.0.html
| 2 +- release/index.html | 2 +- 40 files changed, 570 insertions(+), 493 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/05/08/0.3-release-japanese/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/05/08/0.3-release-japanese/index.html b/blog/2017/05/08/0.3-release-japanese/index.html index 11ed626..294602f 100644 --- a/blog/2017/05/08/0.3-release-japanese/index.html +++ b/blog/2017/05/08/0.3-release-japanese/index.html @@ -217,7 +217,7 @@ <span class="n">In</span> <span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="n">buf</span> <span class="n">Out</span><span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="o"><</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">_io</span><span class="o">.</span><span class="n">Buffer</span> <span class="n">at</span> <span class="mh">0x7f6c0a84b538</span><span class="o">></span> -<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> +<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="o"><</span><span class="n">memory</span> <span class="n">at</span> <span class="mh">0x7f6c0a8c5e88</span><span class="o">></span> <span class="n">In</span> <span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="n">buf</span><span class="o">.</span><span class="n">to_pybytes</span><span class="p">()</span> @@ -283,7 +283,7 @@ <footer 
class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/05/08/0.3-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/05/08/0.3-release/index.html b/blog/2017/05/08/0.3-release/index.html index f5f09e3..d23718f 100644 --- a/blog/2017/05/08/0.3-release/index.html +++ b/blog/2017/05/08/0.3-release/index.html @@ -270,7 +270,7 @@ and memoryviews, so now code like this is possible:</p> <span class="n">In</span> <span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="n">buf</span> <span class="n">Out</span><span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="o"><</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">_io</span><span class="o">.</span><span class="n">Buffer</span> <span class="n">at</span> <span class="mh">0x7f6c0a84b538</span><span class="o">></span> -<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> +<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span 
class="n">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="o"><</span><span class="n">memory</span> <span class="n">at</span> <span class="mh">0x7f6c0a8c5e88</span><span class="o">></span> <span class="n">In</span> <span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="n">buf</span><span class="o">.</span><span class="n">to_pybytes</span><span class="p">()</span> @@ -359,7 +359,7 @@ instructions for getting started.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/05/23/0.4.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/05/23/0.4.0-release/index.html b/blog/2017/05/23/0.4.0-release/index.html index b04eda8..f75519a 100644 --- a/blog/2017/05/23/0.4.0-release/index.html +++ b/blog/2017/05/23/0.4.0-release/index.html @@ -223,7 +223,7 @@ Linux. 
We are working on providing binary wheel installers for Windows as well.< <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/06/14/0.4.1-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/06/14/0.4.1-release/index.html b/blog/2017/06/14/0.4.1-release/index.html index 727e8a1..9dee5e4 100644 --- a/blog/2017/06/14/0.4.1-release/index.html +++ b/blog/2017/06/14/0.4.1-release/index.html @@ -183,7 +183,7 @@ conda install turbodbc -c conda-forge <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> 
</div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/06/16/turbodbc-arrow/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/06/16/turbodbc-arrow/index.html b/blog/2017/06/16/turbodbc-arrow/index.html index a0c59a8..023538c 100644 --- a/blog/2017/06/16/turbodbc-arrow/index.html +++ b/blog/2017/06/16/turbodbc-arrow/index.html @@ -228,7 +228,7 @@ nitty-gritty details, check out parts <a href="https://tech.blue-yonder.com/maki <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/07/25/0.5.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/07/25/0.5.0-release/index.html b/blog/2017/07/25/0.5.0-release/index.html index 29c9c37..f5d5ff3 100644 --- a/blog/2017/07/25/0.5.0-release/index.html +++ b/blog/2017/07/25/0.5.0-release/index.html @@ -231,7 +231,7 @@ mailing list</a>. 
Please join the discussion there.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/07/26/spark-arrow/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/07/26/spark-arrow/index.html b/blog/2017/07/26/spark-arrow/index.html index a892187..cd6a129 100644 --- a/blog/2017/07/26/spark-arrow/index.html +++ b/blog/2017/07/26/spark-arrow/index.html @@ -268,7 +268,7 @@ helped push this effort forwards.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> 
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/08/08/plasma-in-memory-object-store/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/08/08/plasma-in-memory-object-store/index.html b/blog/2017/08/08/plasma-in-memory-object-store/index.html index fb635a9..308ef61 100644 --- a/blog/2017/08/08/plasma-in-memory-object-store/index.html +++ b/blog/2017/08/08/plasma-in-memory-object-store/index.html @@ -196,26 +196,26 @@ the client can write to the buffer and construct the object within the allocated buffer. When the client is done, the client <em>seals</em> the buffer making the object immutable and making it available to other Plasma clients.</p> -<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Create an object. -</span><span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> +<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create an object.</span> +<span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> <span class="n">object_size</span> <span class="o">=</span> <span class="mi">1000</span> -<span class="nb">buffer</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">client</span><span class="o">.</span><span 
class="n">create</span><span class="p">(</span><span class="n">object_id</span><span class="p">,</span> <span class="n">object_size</span><span class="p">))</span> +<span class="nb">buffer</span> <span class="o">=</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">client</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">object_id</span><span class="p">,</span> <span class="n">object_size</span><span class="p">))</span> -<span class="c1"># Write to the buffer. -</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span> +<span class="c"># Write to the buffer.</span> +<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span> <span class="nb">buffer</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> -<span class="c1"># Seal the object making it immutable and available to other clients. -</span><span class="n">client</span><span class="o">.</span><span class="n">seal</span><span class="p">(</span><span class="n">object_id</span><span class="p">)</span> +<span class="c"># Seal the object making it immutable and available to other clients.</span> +<span class="n">client</span><span class="o">.</span><span class="n">seal</span><span class="p">(</span><span class="n">object_id</span><span class="p">)</span> </code></pre></div></div> <p><strong>Getting an object:</strong> After an object has been sealed, any client who knows the object ID can get the object.</p> -<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Get the object from the store. This blocks until the object has been sealed. 
-</span><span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> +<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Get the object from the store. This blocks until the object has been sealed.</span> +<span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> <span class="p">[</span><span class="n">buff</span><span class="p">]</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get</span><span class="p">([</span><span class="n">object_id</span><span class="p">])</span> -<span class="nb">buffer</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">buff</span><span class="p">)</span> +<span class="nb">buffer</span> <span class="o">=</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">buff</span><span class="p">)</span> </code></pre></div></div> <p>If the object has not been sealed yet, then the call to <code class="highlighter-rouge">client.get</code> will block @@ -269,7 +269,7 @@ if you are interested in getting involved with the project.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software 
Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/08/16/0.6.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/08/16/0.6.0-release/index.html b/blog/2017/08/16/0.6.0-release/index.html index f4e6d3c..499f17b 100644 --- a/blog/2017/08/16/0.6.0-release/index.html +++ b/blog/2017/08/16/0.6.0-release/index.html @@ -230,7 +230,7 @@ Java and C++. Please join the discussion there.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/09/19/0.7.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/09/19/0.7.0-release/index.html b/blog/2017/09/19/0.7.0-release/index.html index edabde7..96c2614 100644 --- 
a/blog/2017/09/19/0.7.0-release/index.html +++ b/blog/2017/09/19/0.7.0-release/index.html @@ -306,7 +306,7 @@ bindings to more languages.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html b/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html index c210d9f..a3d59bc 100644 --- a/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html +++ b/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html @@ -354,16 +354,16 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <span class="k">def</span> <span class="nf">benchmark_object</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">number</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span> - <span class="c1"># Time serialization and deserialization for pickle. 
-</span> <span class="n">pickle_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> + <span class="c"># Time serialization and deserialization for pickle.</span> + <span class="n">pickle_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">pickle</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">pickle</span><span class="o">.</span><span class="n">HIGHEST_PROTOCOL</span><span class="p">),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> <span class="n">serialized_obj</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">pickle</span><span class="o">.</span><span class="n">HIGHEST_PROTOCOL</span><span class="p">)</span> <span class="n">pickle_deserialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="k">lambda</span><span class="p">:</span> <span class="n">pickle</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">serialized_obj</span><span class="p">),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> - <span class="c1"># Time serialization and deserialization for Ray. 
-</span> <span class="n">ray_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> + <span class="c"># Time serialization and deserialization for Ray.</span> + <span class="n">ray_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">serialize</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span><span class="o">.</span><span class="n">to_buffer</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> <span class="n">serialized_obj</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">serialize</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span><span class="o">.</span><span class="n">to_buffer</span><span class="p">()</span> <span class="n">ray_deserialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> @@ -394,7 +394,7 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">yticks</span><span 
class="p">(</span><span class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span> - <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'plot-'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'.png'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">)</span> + <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'plot-'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'.png'</span><span class="p">,</span> <span class="n">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">)</span> <span class="n">test_objects</span> <span class="o">=</span> <span class="p">[</span> @@ -422,7 +422,7 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> 
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/12/18/0.8.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/12/18/0.8.0-release/index.html b/blog/2017/12/18/0.8.0-release/index.html index 615cfac..11f44f6 100644 --- a/blog/2017/12/18/0.8.0-release/index.html +++ b/blog/2017/12/18/0.8.0-release/index.html @@ -303,7 +303,7 @@ implementations and bindings to more languages.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2017/12/18/java-vector-improvements/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/12/18/java-vector-improvements/index.html b/blog/2017/12/18/java-vector-improvements/index.html index da2c6ee..e6e0d13 100644 --- a/blog/2017/12/18/java-vector-improvements/index.html +++ b/blog/2017/12/18/java-vector-improvements/index.html @@ -231,7 +231,7 @@ In the new implementation, all vectors in Java are nullable in nature.</li> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 
2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2018/03/22/0.9.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2018/03/22/0.9.0-release/index.html b/blog/2018/03/22/0.9.0-release/index.html index d7b1150..b20a96e 100644 --- a/blog/2018/03/22/0.9.0-release/index.html +++ b/blog/2018/03/22/0.9.0-release/index.html @@ -219,7 +219,7 @@ computational libraries within the project.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2018/03/22/go-code-donation/index.html ---------------------------------------------------------------------- diff --git a/blog/2018/03/22/go-code-donation/index.html b/blog/2018/03/22/go-code-donation/index.html index 0a3d039..6e0b94c 
100644 --- a/blog/2018/03/22/go-code-donation/index.html +++ b/blog/2018/03/22/go-code-donation/index.html @@ -196,7 +196,7 @@ at <a href="https://arrow.apache.org">https://arrow.apache.org</a> and join the <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2018/07/20/jemalloc/index.html ---------------------------------------------------------------------- diff --git a/blog/2018/07/20/jemalloc/index.html b/blog/2018/07/20/jemalloc/index.html index 9998a67..22f42fe 100644 --- a/blog/2018/07/20/jemalloc/index.html +++ b/blog/2018/07/20/jemalloc/index.html @@ -259,7 +259,7 @@ reallocation.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" 
integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/2018/08/07/0.10.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2018/08/07/0.10.0-release/index.html b/blog/2018/08/07/0.10.0-release/index.html index fad6c70..c1df241 100644 --- a/blog/2018/08/07/0.10.0-release/index.html +++ b/blog/2018/08/07/0.10.0-release/index.html @@ -187,7 +187,7 @@ languages.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/blog/index.html ---------------------------------------------------------------------- diff --git a/blog/index.html b/blog/index.html index 386bff9..e141538 100644 --- a/blog/index.html +++ b/blog/index.html @@ -132,6 +132,115 @@ <div class="blog-post" style="margin-bottom: 4rem"> <h1> + Apache Arrow 0.11.0 Release + <a href="/blog/2018/10/09/0.11.0-release/" class="permalink" title="Permalink">â</a> + </h1> + + + + <p> + <span class="badge badge-secondary">Published</span> + <span class="published"> + 09 Oct 2018 + </span> + <br /> + <span class="badge badge-secondary">By</span> + <a 
href="http://wesmckinney.com">Wes McKinney (wesm)</a> + </p> + <!-- + +--> + +<p>The Apache Arrow team is pleased to announce the 0.11.0 release. It is the +product of 2 months of development and includes <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.11.0"><strong>287 resolved +issues</strong></a>.</p> + +<p>See the <a href="https://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your +platform. The <a href="https://arrow.apache.org/release/0.11.0.html">complete changelog</a> is also available.</p> + +<p>We discuss some highlights from the release and other project news in this +post.</p> + +<h2 id="arrow-flight-rpc-and-messaging-framework">Arrow Flight RPC and Messaging Framework</h2> + +<p>We are developing a new Arrow-native RPC framework, Arrow Flight, based on +<a href="http://grpc.io">gRPC</a> for high-performance Arrow-based messaging. Through low-level +extensions to gRPC’s internal memory management, we are able to avoid expensive +parsing when receiving datasets over the wire, unlocking unprecedented levels +of performance in moving datasets from one machine to another. We will be +writing more about Flight on the Arrow blog in the future.</p> + +<p>Prototype implementations are available in Java and C++, and we will be focused +in the coming months on hardening the Flight RPC framework for enterprise-grade +production use cases.</p> + +<h2 id="parquet-and-arrow-c-communities-joining-forces">Parquet and Arrow C++ communities joining forces</h2> + +<p>After discussion over the last year, the Apache Arrow and Apache Parquet C++ +communities decided to merge the Parquet C++ codebase into the Arrow C++ +codebase and work together in a “monorepo” structure. 
This should result in +better developer productivity in core Parquet work as well as in Arrow +integration.</p> + +<p>Before this codebase merge, we had a circular dependency between the Arrow and +Parquet codebases, since the Parquet C++ library is used in the Arrow Python +library.</p> + +<h2 id="gandiva-llvm-expression-compiler-donation">Gandiva LLVM Expression Compiler donation</h2> + +<p><a href="http://dremio.com">Dremio Corporation</a> has donated the <a href="http://github.com/dremio/gandiva">Gandiva</a> LLVM expression compiler +to Apache Arrow. We will be working on cross-platform builds, packaging, and +language bindings (e.g. in Python) for Gandiva in the upcoming 0.12 release and +beyond. We will write more about Gandiva in the future.</p> + +<h2 id="parquet-c-glib-bindings-donation">Parquet C GLib Bindings Donation</h2> + +<p>PMC member <a href="https://github.com/kou">Kouhei Sutou</a> has donated GLib bindings for the Parquet C++ +libraries, which are designed to work together with the existing Arrow GLib +bindings.</p> + +<h2 id="c-csv-reader-project">C++ CSV Reader Project</h2> + +<p>We have begun developing a general-purpose multithreaded CSV file parser in +C++. The purpose of this library is to parse and convert comma-separated text +files into Arrow columnar record batches as efficiently as possible. The +prototype version features Python bindings and can be used from any language that can use the +C++ libraries (including C, R, and Ruby).</p> + +<h2 id="new-matlab-bindings">New MATLAB bindings</h2> + +<p><a href="https://mathworks.com">The MathWorks</a> has contributed an initial MEX file binding to the Arrow +C++ libraries. Initially, it is possible to read Arrow-based Feather files in +MATLAB. 
We are looking forward to seeing more developments for MATLAB users.</p> + +<h2 id="r-library-in-development">R Library in Development</h2> + +<p>The community has begun implementing <a href="https://github.com/apache/arrow/tree/master/r">R language bindings and interoperability</a> +with the Arrow C++ libraries. This will include support for zero-copy shared +memory IPC and other tools needed to improve R integration with Apache Spark +and more.</p> + +<h2 id="support-for-cuda-based-gpus-in-python">Support for CUDA-based GPUs in Python</h2> + +<p>This release includes Python bindings to the Arrow CUDA integration C++ +library. This work is targeting interoperability with <a href="https://github.com/numba/numba">Numba</a> and the <a href="http://gpuopenanalytics.com/">GPU +Open Analytics Initiative</a>.</p> + +<h2 id="upcoming-roadmap">Upcoming Roadmap</h2> + +<p>In the coming months, we will continue to make progress on many fronts, with +Gandiva packaging, expanded language support (especially in R), and improved +data access (e.g. CSV, Parquet files) in focus.</p> + + + </div> + + + + + + <div class="blog-post" style="margin-bottom: 4rem"> + <h1> Apache Arrow 0.10.0 Release <a href="/blog/2018/08/07/0.10.0-release/" class="permalink" title="Permalink">â</a> </h1> @@ -1011,16 +1120,16 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <span class="k">def</span> <span class="nf">benchmark_object</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">number</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span> - <span class="c1"># Time serialization and deserialization for pickle. 
-</span> <span class="n">pickle_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> + <span class="c"># Time serialization and deserialization for pickle.</span> + <span class="n">pickle_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">pickle</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">pickle</span><span class="o">.</span><span class="n">HIGHEST_PROTOCOL</span><span class="p">),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> <span class="n">serialized_obj</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">pickle</span><span class="o">.</span><span class="n">HIGHEST_PROTOCOL</span><span class="p">)</span> <span class="n">pickle_deserialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="k">lambda</span><span class="p">:</span> <span class="n">pickle</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">serialized_obj</span><span class="p">),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> - <span class="c1"># Time serialization and deserialization for Ray. 
-</span> <span class="n">ray_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> + <span class="c"># Time serialization and deserialization for Ray.</span> + <span class="n">ray_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">serialize</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span><span class="o">.</span><span class="n">to_buffer</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> <span class="n">serialized_obj</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">serialize</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span><span class="o">.</span><span class="n">to_buffer</span><span class="p">()</span> <span class="n">ray_deserialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> @@ -1051,7 +1160,7 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">yticks</span><span 
class="p">(</span><span class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span> - <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'plot-'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'.png'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">)</span> + <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'plot-'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'.png'</span><span class="p">,</span> <span class="n">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">)</span> <span class="n">test_objects</span> <span class="o">=</span> <span class="p">[</span> @@ -1439,26 +1548,26 @@ the client can write to the buffer and construct the object within the allocated buffer. When the client is done, the client <em>seals</em> the buffer making the object immutable and making it available to other Plasma clients.</p> -<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Create an object. 
-</span><span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> +<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create an object.</span> +<span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> <span class="n">object_size</span> <span class="o">=</span> <span class="mi">1000</span> -<span class="nb">buffer</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">client</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">object_id</span><span class="p">,</span> <span class="n">object_size</span><span class="p">))</span> +<span class="nb">buffer</span> <span class="o">=</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">client</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">object_id</span><span class="p">,</span> <span class="n">object_size</span><span class="p">))</span> -<span class="c1"># Write to the buffer. 
-</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span> +<span class="c"># Write to the buffer.</span> +<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span> <span class="nb">buffer</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> -<span class="c1"># Seal the object making it immutable and available to other clients. -</span><span class="n">client</span><span class="o">.</span><span class="n">seal</span><span class="p">(</span><span class="n">object_id</span><span class="p">)</span> +<span class="c"># Seal the object making it immutable and available to other clients.</span> +<span class="n">client</span><span class="o">.</span><span class="n">seal</span><span class="p">(</span><span class="n">object_id</span><span class="p">)</span> </code></pre></div></div> <p><strong>Getting an object:</strong> After an object has been sealed, any client who knows the object ID can get the object.</p> -<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Get the object from the store. This blocks until the object has been sealed. -</span><span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> +<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Get the object from the store. 
This blocks until the object has been sealed.</span> +<span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> <span class="p">[</span><span class="n">buff</span><span class="p">]</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get</span><span class="p">([</span><span class="n">object_id</span><span class="p">])</span> -<span class="nb">buffer</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">buff</span><span class="p">)</span> +<span class="nb">buffer</span> <span class="o">=</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">buff</span><span class="p">)</span> </code></pre></div></div> <p>If the object has not been sealed yet, then the call to <code class="highlighter-rouge">client.get</code> will block @@ -2171,7 +2280,7 @@ and memoryviews, so now code like this is possible:</p> <span class="n">In</span> <span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="n">buf</span> <span class="n">Out</span><span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="o"><</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">_io</span><span class="o">.</span><span class="n">Buffer</span> <span class="n">at</span> <span class="mh">0x7f6c0a84b538</span><span class="o">></span> -<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> +<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span 
class="p">]:</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span> <span class="n">Out</span><span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="o"><</span><span class="n">memory</span> <span class="n">at</span> <span class="mh">0x7f6c0a8c5e88</span><span class="o">></span> <span class="n">In</span> <span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="n">buf</span><span class="o">.</span><span class="n">to_pybytes</span><span class="p">()</span> @@ -2267,7 +2376,7 @@ instructions for getting started.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/committers/index.html ---------------------------------------------------------------------- diff --git a/committers/index.html b/committers/index.html index 1f764a5..266071c 100644 --- a/committers/index.html +++ b/committers/index.html @@ -342,7 +342,7 @@ <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script 
type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/docs/ipc.html ---------------------------------------------------------------------- diff --git a/docs/ipc.html b/docs/ipc.html index e1d927c..2a9fdb7 100644 --- a/docs/ipc.html +++ b/docs/ipc.html @@ -379,7 +379,7 @@ region) to be multiples of 64 bytes:</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/docs/memory_layout.html ---------------------------------------------------------------------- diff --git a/docs/memory_layout.html b/docs/memory_layout.html index ae01a2c..a36b36f 100644 --- a/docs/memory_layout.html +++ b/docs/memory_layout.html @@ -791,7 +791,7 @@ type: List<String> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered 
trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/docs/metadata.html ---------------------------------------------------------------------- diff --git a/docs/metadata.html b/docs/metadata.html index 124652b..0ed948e 100644 --- a/docs/metadata.html +++ b/docs/metadata.html @@ -541,7 +541,7 @@ indicated unit. For second and millisecond: 32-bit, for the others 64-bit.</p> <footer class="footer"> <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> <p>© 2017 Apache Software Foundation</p> - <script type="text/javascript" src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous"></script> + <script src="/assets/main-8d2a359fd27a888246eb638b36a4e8b68ac65b9f11c48b9fac601fa0c9a7d796.js" integrity="sha256-jSo1n9J6iIJG62OLNqTotorGW58RxIufrGAfoMmn15Y=" crossorigin="anonymous" type="text/javascript"></script> </footer> </div> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/dc12f116/feed.xml ---------------------------------------------------------------------- diff --git a/feed.xml b/feed.xml index f053934..bbd6c64 100644 --- a/feed.xml +++ b/feed.xml @@ -1,4 +1,87 @@ -<?xml version="1.0" 
encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.8.4">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2018-10-08T09:13:06-04:00</updated><id>/feed.xml</id><entry><title type="html">Apache Arrow 0.10.0 Release</title><link href="/blog/2018/08/07/0.10.0-release/" rel="alternate" type="text/html" title="Apache Arrow 0.10.0 Release" /><published>2018-08-07T00:00:00-04:00</published><updated>2018-08-07T00:00:00-04:00</updated><id>/blog/2018/08/07/0.10.0-release</id><content type="html" xml:base="/blog/2018/08/07/0.10.0-release/"><!-- +<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.8.4">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2018-10-09T05:00:31-04:00</updated><id>/feed.xml</id><entry><title type="html">Apache Arrow 0.11.0 Release</title><link href="/blog/2018/10/09/0.11.0-release/" rel="alternate" type="text/html" title="Apache Arrow 0.11.0 Release" /><published>2018-10-09T00:00:00-04:00</published><updated>2018-10-09T00:00:00-04:00</updated><id>/blog/2018/10/09/0.11.0-release</id><content type="html" xml:base="/blog/2018/10/09/0.11.0-release/"><!-- + +--> + +<p>The Apache Arrow team is pleased to announce the 0.11.0 release. It is the +product of 2 months of development and includes <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.11.0"><strong>287 resolved +issues</strong></a>.</p> + +<p>See the <a href="https://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your +platform. 
The <a href="https://arrow.apache.org/release/0.11.0.html">complete changelog</a> is also available.</p> + +<p>We discuss some highlights from the release and other project news in this +post.</p> + +<h2 id="arrow-flight-rpc-and-messaging-framework">Arrow Flight RPC and Messaging Framework</h2> + +<p>We are developing a new Arrow-native RPC framework, Arrow Flight, based on +<a href="http://grpc.io">gRPC</a> for high-performance Arrow-based messaging. Through low-level +extensions to gRPC’s internal memory management, we are able to avoid expensive +parsing when receiving datasets over the wire, unlocking unprecedented levels +of performance in moving datasets from one machine to another. We will be +writing more about Flight on the Arrow blog in the future.</p> + +<p>Prototype implementations are available in Java and C++, and we will be focused +in the coming months on hardening the Flight RPC framework for enterprise-grade +production use cases.</p> + +<h2 id="parquet-and-arrow-c-communities-joining-forces">Parquet and Arrow C++ communities joining forces</h2> + +<p>After discussion over the last year, the Apache Arrow and Apache Parquet C++ +communities decided to merge the Parquet C++ codebase into the Arrow C++ +codebase and work together in a “monorepo” structure. This should result in +better developer productivity in core Parquet work as well as in Arrow +integration.</p> + +<p>Before this codebase merge, we had a circular dependency between the Arrow and +Parquet codebases, since the Parquet C++ library is used in the Arrow Python +library.</p> + +<h2 id="gandiva-llvm-expression-compiler-donation">Gandiva LLVM Expression Compiler donation</h2> + +<p><a href="http://dremio.com">Dremio Corporation</a> has donated the <a href="http://github.com/dremio/gandiva">Gandiva</a> LLVM expression compiler +to Apache Arrow. We will be working on cross-platform builds, packaging, and +language bindings (e.g. 
in Python) for Gandiva in the upcoming 0.12 release and +beyond. We will write more about Gandiva in the future.</p> + +<h2 id="parquet-c-glib-bindings-donation">Parquet C GLib Bindings Donation</h2> + +<p>PMC member <a href="https://github.com/kou">Kouhei Sutou</a> has donated GLib bindings for the Parquet C++ +libraries, which are designed to work together with the existing Arrow GLib +bindings.</p> + +<h2 id="c-csv-reader-project">C++ CSV Reader Project</h2> + +<p>We have begun developing a general-purpose multithreaded CSV file parser in +C++. The purpose of this library is to parse and convert comma-separated text +files into Arrow columnar record batches as efficiently as possible. The +prototype version features Python bindings and can be used from any language that can use the +C++ libraries (including C, R, and Ruby).</p> + +<h2 id="new-matlab-bindings">New MATLAB bindings</h2> + +<p><a href="https://mathworks.com">The MathWorks</a> has contributed an initial MEX file binding to the Arrow +C++ libraries. Initially, it is possible to read Arrow-based Feather files in +MATLAB. We are looking forward to seeing more developments for MATLAB users.</p> + +<h2 id="r-library-in-development">R Library in Development</h2> + +<p>The community has begun implementing <a href="https://github.com/apache/arrow/tree/master/r">R language bindings and interoperability</a> +with the Arrow C++ libraries. This will include support for zero-copy shared +memory IPC and other tools needed to improve R integration with Apache Spark +and more.</p> + +<h2 id="support-for-cuda-based-gpus-in-python">Support for CUDA-based GPUs in Python</h2> + +<p>This release includes Python bindings to the Arrow CUDA integration C++ +library. 
This work is targeting interoperability with <a href="https://github.com/numba/numba">Numba</a> and the <a href="http://gpuopenanalytics.com/">GPU +Open Analytics Initiative</a>.</p> + +<h2 id="upcoming-roadmap">Upcoming Roadmap</h2> + +<p>In the coming months, we will continue to make progress on many fronts, with +Gandiva packaging, expanded language support (especially in R), and improved +data access (e.g. CSV, Parquet files) in focus.</p></content><author><name>wesm</name></author></entry><entry><title type="html">Apache Arrow 0.10.0 Release</title><link href="/blog/2018/08/07/0.10.0-release/" rel="alternate" type="text/html" title="Apache Arrow 0.10.0 Release" /><published>2018-08-07T00:00:00-04:00</published><updated>2018-08-07T00:00:00-04:00</updated><id>/blog/2018/08/07/0.10.0-release</id><content type="html" xml:base="/blog/2018/08/07/0.10.0-release/"><!-- --> @@ -706,16 +789,16 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> <span class="k">def</span> <span class="nf">benchmark_object</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">number</span><span class="o">=</span><span class="mi">10</span><span class="p">):</span> - <span class="c1"># Time serialization and deserialization for pickle. 
-</span> <span class="n">pickle_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> + <span class="c"># Time serialization and deserialization for pickle.</span> + <span class="n">pickle_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">pickle</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">protocol</span><span class="o">=</span><span class="n">pickle</span><span class="o">.</span><span class="n">HIGHEST_PROTOCOL</span><span class="p">),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> <span class="n">serialized_obj</span> <span class="o">=</span> <span class="n">pickle</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">obj</span><span class="p">,</span> <span class="n">pickle</span><span class="o">.</span><span class="n">HIGHEST_PROTOCOL</span><span class="p">)</span> <span class="n">pickle_deserialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span><span class="k">lambda</span><span class="p">:</span> <span class="n">pickle</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">serialized_obj</span><span class="p">),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> - <span class="c1"># Time serialization and deserialization for Ray. 
-</span> <span class="n">ray_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> + <span class="c"># Time serialization and deserialization for Ray.</span> + <span class="n">ray_serialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> <span class="k">lambda</span><span class="p">:</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">serialize</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span><span class="o">.</span><span class="n">to_buffer</span><span class="p">(),</span> <span class="n">number</span><span class="o">=</span><span class="n">number</span><span class="p">)</span> <span class="n">serialized_obj</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">serialize</span><span class="p">(</span><span class="n">obj</span><span class="p">)</span><span class="o">.</span><span class="n">to_buffer</span><span class="p">()</span> <span class="n">ray_deserialize</span> <span class="o">=</span> <span class="n">timeit</span><span class="o">.</span><span class="n">timeit</span><span class="p">(</span> @@ -746,7 +829,7 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">(</span><span class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="n">plt</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">yticks</span><span class="p">(</span><span 
class="n">fontsize</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span> - <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'plot-'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'.png'</span><span class="p">,</span> <span class="nb">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">)</span> + <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'plot-'</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">+</span> <span class="s">'.png'</span><span class="p">,</span> <span class="n">format</span><span class="o">=</span><span class="s">'png'</span><span class="p">)</span> <span class="n">test_objects</span> <span class="o">=</span> <span class="p">[</span> @@ -1001,123 +1084,4 @@ milliseconds, or <code class="highlighter-rouge">'us'</code&g <p>We are still discussing the roadmap to 1.0.0 release on the <a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">developer mailing list</a>. The focus of the 1.0.0 release will likely be memory format stability and hardening integration tests across the remaining data types implemented in -Java and C++. 
Please join the discussion there.</p></content><author><name>wesm</name></author></entry><entry><title type="html">Plasma In-Memory Object Store</title><link href="/blog/2017/08/08/plasma-in-memory-object-store/" rel="alternate" type="text/html" title="Plasma In-Memory Object Store" /><published>2017-08-08T00:00:00-04:00</published><updated>2017-08-08T00:00:00-04:00</updated><id>/blog/2017/08/08/plasma-in-memory-object-store</id><content type="html" xml:base="/blog/2017/08/08/plasma-in-memory-object-store/"><!-- - ---> - -<p><em><a href="https://people.eecs.berkeley.edu/~pcmoritz/">Philipp Moritz</a> and <a href="http://www.robertnishihara.com">Robert Nishihara</a> are graduate students at UC - Berkeley.</em></p> - -<h2 id="plasma-a-high-performance-shared-memory-object-store">Plasma: A High-Performance Shared-Memory Object Store</h2> - -<h3 id="motivating-plasma">Motivating Plasma</h3> - -<p>This blog post presents Plasma, an in-memory object store that is being -developed as part of Apache Arrow. <strong>Plasma holds immutable objects in shared -memory so that they can be accessed efficiently by many clients across process -boundaries.</strong> In light of the trend toward larger and larger multicore machines, -Plasma enables critical performance optimizations in the big data regime.</p> - -<p>Plasma was initially developed as part of <a href="https://github.com/ray-project/ray">Ray</a>, and has recently been moved -to Apache Arrow in the hopes that it will be broadly useful.</p> - -<p>One of the goals of Apache Arrow is to serve as a common data layer enabling -zero-copy data exchange between multiple frameworks. 
A key component of this -vision is the use of off-heap memory management (via Plasma) for storing and -sharing Arrow-serialized objects between applications.</p> - -<p><strong>Expensive serialization and deserialization as well as data copying are a -common performance bottleneck in distributed computing.</strong> For example, a -Python-based execution framework that wishes to distribute computation across -multiple Python “worker” processes and then aggregate the results in a single -“driver” process may choose to serialize data using the built-in <code class="highlighter-rouge">pickle</code> -library. Assuming one Python process per core, each worker process would have to -copy and deserialize the data, resulting in excessive memory usage. The driver -process would then have to deserialize results from each of the workers, -resulting in a bottleneck.</p> - -<p>Using Plasma plus Arrow, the data being operated on would be placed in the -Plasma store once, and all of the workers would read the data without copying or -deserializing it (the workers would map the relevant region of memory into their -own address spaces). The workers would then put the results of their computation -back into the Plasma store, which the driver could then read and aggregate -without copying or deserializing the data.</p> - -<h3 id="the-plasma-api">The Plasma API:</h3> - -<p>Below we illustrate a subset of the API. The C++ API is documented more fully -<a href="https://github.com/apache/arrow/blob/master/cpp/apidoc/tutorials/plasma.md">here</a>, and the Python API is documented <a href="https://github.com/apache/arrow/blob/master/python/doc/source/plasma.rst">here</a>.</p> - -<p><strong>Object IDs:</strong> Each object is associated with a string of bytes.</p> - -<p><strong>Creating an object:</strong> Objects are stored in Plasma in two stages. First, the -object store <em>creates</em> the object by allocating a buffer for it. 
At this point, -the client can write to the buffer and construct the object within the allocated -buffer. When the client is done, the client <em>seals</em> the buffer making the object -immutable and making it available to other Plasma clients.</p> - -<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Create an object. -</span><span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> -<span class="n">object_size</span> <span class="o">=</span> <span class="mi">1000</span> -<span class="nb">buffer</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">client</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">object_id</span><span class="p">,</span> <span class="n">object_size</span><span class="p">))</span> - -<span class="c1"># Write to the buffer. -</span><span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1000</span><span class="p">):</span> - <span class="nb">buffer</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span> - -<span class="c1"># Seal the object making it immutable and available to other clients. 
-</span><span class="n">client</span><span class="o">.</span><span class="n">seal</span><span class="p">(</span><span class="n">object_id</span><span class="p">)</span> -</code></pre></div></div> - -<p><strong>Getting an object:</strong> After an object has been sealed, any client who knows the -object ID can get the object.</p> - -<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Get the object from the store. This blocks until the object has been sealed. -</span><span class="n">object_id</span> <span class="o">=</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">plasma</span><span class="o">.</span><span class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> <span class="o">*</span> <span class="n">b</span><span class="s">'a'</span><span class="p">)</span> -<span class="p">[</span><span class="n">buff</span><span class="p">]</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get</span><span class="p">([</span><span class="n">object_id</span><span class="p">])</span> -<span class="nb">buffer</span> <span class="o">=</span> <span class="nb">memoryview</span><span class="p">(</span><span class="n">buff</span><span class="p">)</span> -</code></pre></div></div> - -<p>If the object has not been sealed yet, then the call to <code class="highlighter-rouge">client.get</code> will block -until the object has been sealed.</p> - -<h3 id="a-sorting-application">A sorting application</h3> - -<p>To illustrate the benefits of Plasma, we demonstrate an <strong>11x speedup</strong> (on a -machine with 20 physical cores) for sorting a large pandas DataFrame (one -billion entries). The baseline is the built-in pandas sort function, which sorts -the DataFrame in 477 seconds. 
To leverage multiple cores, we implement the -following standard distributed sorting scheme.</p> - -<ul> - <li>We assume that the data is partitioned across K pandas DataFrames and that -each one already lives in the Plasma store.</li> - <li>We subsample the data, sort the subsampled data, and use the result to define -L non-overlapping buckets.</li> - <li>For each of the K data partitions and each of the L buckets, we find the -subset of the data partition that falls in the bucket, and we sort that -subset.</li> - <li>For each of the L buckets, we gather all of the K sorted subsets that fall in -that bucket.</li> - <li>For each of the L buckets, we merge the corresponding K sorted subsets.</li> - <li>We turn each bucket into a pandas DataFrame and place it in the Plasma store.</li> -</ul> - -<p>Using this scheme, we can sort the DataFrame (the data starts and ends in the -Plasma store), in 44 seconds, giving an 11x speedup over the baseline.</p> - -<h3 id="design">Design</h3> - -<p>The Plasma store runs as a separate process. It is written in C++ and is -designed as a single-threaded event loop based on the <a href="https://redis.io/">Redis</a> event loop library. -The plasma client library can be linked into applications. Clients communicate -with the Plasma store via messages serialized using <a href="https://google.github.io/flatbuffers/">Google Flatbuffers</a>.</p> - -<h3 id="call-for-contributions">Call for contributions</h3> - -<p>Plasma is a work in progress, and the API is currently unstable. Today Plasma is -primarily used in <a href="https://github.com/ray-project/ray">Ray</a> as an in-memory cache for Arrow serialized objects. -We are looking for a broader set of use cases to help refine Plasma’s API. In -addition, we are looking for contributions in a variety of areas including -improving performance and building other language bindings. 
Please let us know -if you are interested in getting involved with the project.</p></content><author><name>Philipp Moritz and Robert Nishihara</name></author></entry></feed> \ No newline at end of file +Java and C++. Please join the discussion there.</p></content><author><name>wesm</name></author></entry></feed> \ No newline at end of file
