http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/08/15/0.6.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/08/15/0.6.0-release/index.html b/blog/2017/08/15/0.6.0-release/index.html new file mode 100644 index 0000000..4276b8c --- /dev/null +++ b/blog/2017/08/15/0.6.0-release/index.html @@ -0,0 +1,234 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span 
class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + </ul> + </li> + <!-- <li><a href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" 
class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + Apache Arrow 0.6.0 Release + <a href="/blog/2017/08/15/0.6.0-release/" class="permalink" title="Permalink">∞</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 15 Aug 2017 + </span> + </div> + <div> + <span class="label label-default">By</span> + <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes McKinney (wesm)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>The Apache Arrow team is pleased to announce the 0.6.0 release. It includes +<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.6.0"><strong>90 resolved JIRAs</strong></a>, among them the new Plasma shared memory object store as well as +improvements and bug fixes to the various language implementations. The Arrow +memory format has remained stable since the 0.3.x release.</p> + +<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your +platform. 
The <a href="http://arrow.apache.org/release/0.6.0.html">complete changelog</a> is also available.</p> + +<h2 id="plasma-shared-memory-object-store">Plasma Shared Memory Object Store</h2> + +<p>This release includes the <a href="http://arrow.apache.org/blog/2017/08/08/plasma-in-memory-object-store/">Plasma Store</a>, which you can read more about in +the linked blog post. This system was originally developed as part of the <a href="https://ray-project.github.io/ray/">Ray +Project</a> at the <a href="https://rise.cs.berkeley.edu/">UC Berkeley RISELab</a>. We recognized that Plasma would be +highly valuable to the Arrow community as a tool for shared memory management +and zero-copy deserialization. Additionally, we believe we will be able to +develop a stronger software stack through sharing of IO and buffer management +code.</p> + +<p>The Plasma store is a server application which runs as a separate process. A +reference C++ client, with Python bindings, is made available in this +release. Clients can be developed in Java or other languages in the future to +enable simple sharing of complex datasets through shared memory.</p> + +<h2 id="arrow-format-addition-map-type">Arrow Format Addition: Map type</h2> + +<p>We added a Map logical type to represent ordered and unordered maps +in-memory. This corresponds to the <code class="highlighter-rouge">MAP</code> logical type annotation in the Parquet +format (where maps are represented as repeated structs).</p> + +<p>Map is represented as a list of structs. It is the first example of a logical +type whose physical representation is a nested type. 
We have not yet created +Map containers in any of the language implementations, but this can +be done in a future release.</p> + +<p>As an example, the Python data:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>data = [{'a': 1, 'bb': 2, 'cc': 3}, {'dddd': 4}] +</code></pre> +</div> + +<p>could be represented in an Arrow <code class="highlighter-rouge">Map<String, Int32></code> as:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>Map<String, Int32> = List<Struct<keys: String, values: Int32>> + is_valid: [true, true] + offsets: [0, 3, 4] + values: Struct<keys: String, values: Int32> + children: + - keys: String + is_valid: [true, true, true, true] + offsets: [0, 1, 3, 5, 9] + data: abbccdddd + - values: Int32 + is_valid: [true, true, true, true] + data: [1, 2, 3, 4] +</code></pre> +</div> +<h2 id="python-changes">Python Changes</h2> + +<p>Some highlights of Python development outside of bug fixes and general API +improvements include:</p> + +<ul> + <li>New <code class="highlighter-rouge">strings_to_categorical=True</code> option when calling <code class="highlighter-rouge">Table.to_pandas</code> will +yield pandas <code class="highlighter-rouge">Categorical</code> types from Arrow binary and string columns</li> + <li>Expanded Hadoop Filesystem (HDFS) functionality to improve compatibility with +Dask and other HDFS-aware Python libraries.</li> + <li>s3fs and other Dask-oriented filesystems can now be used with +<code class="highlighter-rouge">pyarrow.parquet.ParquetDataset</code></li> + <li>More graceful handling of pandas’s nanosecond timestamps when writing to +Parquet format. 
You can now pass <code class="highlighter-rouge">coerce_timestamps='ms'</code> to cast to +milliseconds, or <code class="highlighter-rouge">'us'</code> for microseconds.</li> +</ul> + +<h2 id="toward-arrow-100-and-beyond">Toward Arrow 1.0.0 and Beyond</h2> + +<p>We are still discussing the roadmap to the 1.0.0 release on the <a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">developer mailing +list</a>. The focus of the 1.0.0 release will likely be memory format stability +and hardening integration tests across the remaining data types implemented in +Java and C++. Please join the discussion there.</p> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html>
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/09/18/0.7.0-release/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/09/18/0.7.0-release/index.html b/blog/2017/09/18/0.7.0-release/index.html new file mode 100644 index 0000000..6504954 --- /dev/null +++ b/blog/2017/09/18/0.7.0-release/index.html @@ -0,0 +1,311 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span 
class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + </ul> + </li> + <!-- <li><a href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" 
class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + Apache Arrow 0.7.0 Release + <a href="/blog/2017/09/18/0.7.0-release/" class="permalink" title="Permalink">∞</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 18 Sep 2017 + </span> + </div> + <div> + <span class="label label-default">By</span> + <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes McKinney (wesm)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>The Apache Arrow team is pleased to announce the 0.7.0 release. It includes +<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.7.0"><strong>133 resolved JIRAs</strong></a>, with many new features and bug fixes to the various +language implementations. The Arrow memory format has remained stable since the +0.3.x release.</p> + +<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your +platform. 
The <a href="http://arrow.apache.org/release/0.7.0.html">complete changelog</a> is also available.</p> + +<p>We include some highlights from the release in this post.</p> + +<h2 id="new-pmc-member-kouhei-sutou">New PMC Member: Kouhei Sutou</h2> + +<p>Since the last release we have added <a href="https://github.com/kou">Kou</a> to the Arrow Project Management +Committee. He is also a PMC member for Apache Subversion, and a major contributor to +many other open source projects.</p> + +<p>As an active member of the Ruby community in Japan, Kou has been developing the +GLib-based C bindings for Arrow with associated Ruby wrappers, to enable Ruby +users to benefit from the work that’s happening in Apache Arrow.</p> + +<p>We are excited to be collaborating with the Ruby community on shared +infrastructure for in-memory analytics and data science.</p> + +<h2 id="expanded-javascript-typescript-implementation">Expanded JavaScript (TypeScript) Implementation</h2> + +<p><a href="https://github.com/trxcllnt">Paul Taylor</a> from the <a href="https://github.com/netflix/falcor">Falcor</a> and <a href="http://reactivex.io">ReactiveX</a> projects has worked to +expand the JavaScript implementation (which is written in TypeScript), using +the latest in modern JavaScript build and packaging technology. We are looking +forward to building out the JS implementation and bringing it up to full +functionality with the C++ and Java implementations.</p> + +<p>We are looking for more JavaScript developers to join the project and work +together to make Arrow for JS work well with many kinds of front end use cases, +like real-time data visualization.</p> + +<h2 id="type-casting-for-c-and-python">Type casting for C++ and Python</h2> + +<p>As part of longer-term efforts to build an Arrow-native in-memory analytics +library, we implemented a variety of type conversion functions. These functions +are essential in ETL tasks when conforming one table schema to another. 
These +are similar to the <code class="highlighter-rouge">astype</code> function in NumPy.</p> + +<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">In</span> <span class="p">[</span><span class="mi">17</span><span class="p">]:</span> <span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span> + +<span class="n">In</span> <span class="p">[</span><span class="mi">18</span><span class="p">]:</span> <span class="n">arr</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="bp">True</span><span class="p">,</span> <span class="bp">False</span><span class="p">,</span> <span class="bp">None</span><span class="p">,</span> <span class="bp">True</span><span class="p">])</span> + +<span class="n">In</span> <span class="p">[</span><span class="mi">19</span><span class="p">]:</span> <span class="n">arr</span> +<span class="n">Out</span><span class="p">[</span><span class="mi">19</span><span class="p">]:</span> +<span class="o"><</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">lib</span><span class="o">.</span><span class="n">BooleanArray</span> <span class="nb">object</span> <span class="n">at</span> <span class="mh">0x7ff6fb069b88</span><span class="o">></span> +<span class="p">[</span> + <span class="bp">True</span><span class="p">,</span> + <span class="bp">False</span><span class="p">,</span> + <span class="n">NA</span><span class="p">,</span> + <span class="bp">True</span> +<span class="p">]</span> + +<span class="n">In</span> <span class="p">[</span><span class="mi">20</span><span class="p">]:</span> <span class="n">arr</span><span class="o">.</span><span class="n">cast</span><span class="p">(</span><span class="n">pa</span><span class="o">.</span><span class="n">int32</span><span class="p">())</span> +<span class="n">Out</span><span class="p">[</span><span 
class="mi">20</span><span class="p">]:</span> +<span class="o"><</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">lib</span><span class="o">.</span><span class="n">Int32Array</span> <span class="nb">object</span> <span class="n">at</span> <span class="mh">0x7ff6fb0383b8</span><span class="o">></span> +<span class="p">[</span> + <span class="mi">1</span><span class="p">,</span> + <span class="mi">0</span><span class="p">,</span> + <span class="n">NA</span><span class="p">,</span> + <span class="mi">1</span> +<span class="p">]</span> +</code></pre> +</div> + +<p>Over time these will expand to support as many input-and-output type +combinations as possible, with optimized conversions.</p> + +<h2 id="new-arrow-gpu-cuda-extension-library-for-c">New Arrow GPU (CUDA) Extension Library for C++</h2> + +<p>To help with GPU-related projects using Arrow, like the <a href="http://gpuopenanalytics.com/">GPU Open Analytics +Initiative</a>, we have started a C++ add-on library to simplify Arrow memory +management on CUDA-enabled graphics cards. 
We would like to expand this to +include a library of reusable CUDA kernel functions for GPU analytics on Arrow +columnar memory.</p> + +<p>For example, we could write a record batch from CPU memory to GPU device memory +like so (some error checking omitted):</p> + +<div class="language-c++ highlighter-rouge"><pre class="highlight"><code><span class="cp">#include <arrow/api.h> +#include <arrow/gpu/cuda_api.h> +</span> +<span class="k">using</span> <span class="k">namespace</span> <span class="n">arrow</span><span class="p">;</span> + +<span class="n">gpu</span><span class="o">::</span><span class="n">CudaDeviceManager</span><span class="o">*</span> <span class="n">manager</span><span class="p">;</span> +<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">gpu</span><span class="o">::</span><span class="n">CudaContext</span><span class="o">></span> <span class="n">context</span><span class="p">;</span> + +<span class="n">gpu</span><span class="o">::</span><span class="n">CudaDeviceManager</span><span class="o">::</span><span class="n">GetInstance</span><span class="p">(</span><span class="o">&</span><span class="n">manager</span><span class="p">);</span> +<span class="n">manager</span><span class="o">-></span><span class="n">GetContext</span><span class="p">(</span><span class="n">kGpuNumber</span><span class="p">,</span> <span class="o">&</span><span class="n">context</span><span class="p">);</span> + +<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">RecordBatch</span><span class="o">></span> <span class="n">batch</span> <span class="o">=</span> <span class="n">GetCpuData</span><span class="p">();</span> + +<span class="n">std</span><span class="o">::</span><span class="n">shared_ptr</span><span class="o"><</span><span class="n">gpu</span><span class="o">::</span><span class="n">CudaBuffer</span><span class="o">></span> 
<span class="n">device_serialized</span><span class="p">;</span> +<span class="n">gpu</span><span class="o">::</span><span class="n">SerializeRecordBatch</span><span class="p">(</span><span class="o">*</span><span class="n">batch</span><span class="p">,</span> <span class="n">context</span><span class="p">.</span><span class="n">get</span><span class="p">(),</span> <span class="o">&</span><span class="n">device_serialized</span><span class="p">);</span> +</code></pre> +</div> + +<p>We can then “read” the GPU record batch, but the returned <code class="highlighter-rouge">arrow::RecordBatch</code> +internally will contain GPU device pointers that you can use for CUDA kernel +calls:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>std::shared_ptr<RecordBatch> device_batch; +gpu::ReadRecordBatch(batch->schema(), device_serialized, + default_memory_pool(), &device_batch); + +// Now run some CUDA kernels on device_batch +</code></pre> +</div> + +<h2 id="decimal-integration-tests">Decimal Integration Tests</h2> + +<p><a href="http://github.com/cpcloud">Phillip Cloud</a> has been working on decimal support in C++ to enable Parquet +read/write support in C++ and Python, and also end-to-end testing against the +Arrow Java libraries.</p> + +<p>In the upcoming releases, we hope to complete the remaining data types that +need end-to-end testing between Java and C++:</p> + +<ul> + <li>Fixed size lists (variable-size lists already implemented)</li> + <li>Fixed size binary</li> + <li>Unions</li> + <li>Maps</li> + <li>Time intervals</li> +</ul> + +<h2 id="other-notable-python-changes">Other Notable Python Changes</h2> + +<p>Some highlights of Python development outside of bug fixes and general API +improvements include:</p> + +<ul> + <li>Simplified <code class="highlighter-rouge">put</code> and <code class="highlighter-rouge">get</code> of arbitrary Python objects in Plasma</li> + <li><a href="http://arrow.apache.org/docs/python/ipc.html">High-speed, 
memory efficient object serialization</a>. This is important +enough that we will likely write a dedicated blog post about it.</li> + <li>New <code class="highlighter-rouge">flavor='spark'</code> option to <code class="highlighter-rouge">pyarrow.parquet.write_table</code> to enable easy +writing of Parquet files maximized for Spark compatibility</li> + <li><code class="highlighter-rouge">parquet.write_to_dataset</code> function with support for partitioned writes</li> + <li>Improved support for Dask filesystems</li> + <li>Improved Python usability for IPC: read and write schemas and record batches +more easily. See the <a href="http://arrow.apache.org/docs/python/api.html">API docs</a> for more about these.</li> +</ul> + +<h2 id="the-road-ahead">The Road Ahead</h2> + +<p>Upcoming Arrow releases will continue to expand the project to cover more use +cases. In addition to completing end-to-end testing for all the major data +types, some of us will be shifting attention to building Arrow-native in-memory +analytics libraries.</p> + +<p>We are looking for more JavaScript, R, and other programming language +developers to join the project and expand the available implementations and +bindings to more languages.</p> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html ---------------------------------------------------------------------- diff --git a/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html b/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html index 20b8f1d..debd6c6 100644 --- 
a/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html +++ b/blog/2017/10/15/fast-python-serialization-with-ray-and-arrow/index.html @@ -64,6 +64,7 @@ <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> </ul> </li> <li class="dropdown"> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/index.html ---------------------------------------------------------------------- diff --git a/blog/index.html b/blog/index.html index a295e06..6e60f8c 100644 --- a/blog/index.html +++ b/blog/index.html @@ -63,6 +63,7 @@ <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> </ul> </li> <li class="dropdown"> @@ -432,7 +433,7 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <div class="container"> <h2> Apache Arrow 0.7.0 Release - <a href="/blog/2017/09/19/0.7.0-release/" class="permalink" title="Permalink">∞</a> + <a href="/blog/2017/09/18/0.7.0-release/" class="permalink" title="Permalink">∞</a> </h2> @@ -443,7 +444,7 @@ Benchmarking <code class="highlighter-rouge">ray.put</code> and <code class="hig <span class="label label-default">Published</span> <span class="published"> <i class="fa fa-calendar"></i> - 19 Sep 2017 + 18 Sep 2017 </span> </div> <div> @@ -623,7 +624,7 @@ bindings to more languages.</p> <div class="container"> <h2> Apache Arrow 0.6.0 Release - <a href="/blog/2017/08/16/0.6.0-release/" class="permalink" title="Permalink">∞</a> + <a href="/blog/2017/08/15/0.6.0-release/" class="permalink" title="Permalink">∞</a> </h2> @@ -634,7 +635,7 @@ bindings to more languages.</p> <span 
class="label label-default">Published</span> <span class="published"> <i class="fa fa-calendar"></i> - 16 Aug 2017 + 15 Aug 2017 </span> </div> <div> @@ -737,7 +738,7 @@ Java and C++. Please join the discussion there.</p> <div class="container"> <h2> Plasma In-Memory Object Store - <a href="/blog/2017/08/08/plasma-in-memory-object-store/" class="permalink" title="Permalink">∞</a> + <a href="/blog/2017/08/07/plasma-in-memory-object-store/" class="permalink" title="Permalink">∞</a> </h2> @@ -748,7 +749,7 @@ Java and C++. Please join the discussion there.</p> <span class="label label-default">Published</span> <span class="published"> <i class="fa fa-calendar"></i> - 08 Aug 2017 + 07 Aug 2017 </span> </div> <div> @@ -947,7 +948,7 @@ the conversion to Arrow data can be done on the JVM and pushed back for the Spar executors to perform in parallel, drastically reducing the load on the driver.</p> <p>As of the merging of <a href="https://issues.apache.org/jira/browse/SPARK-13534">SPARK-13534</a>, the use of Arrow when calling <code class="highlighter-rouge">toPandas()</code> -needs to be enabled by setting the SQLConf “spark.sql.execution.arrow.enable” to +needs to be enabled by setting the SQLConf “spark.sql.execution.arrow.enabled” to “true”. Let’s look at a simple usage example.</p> <div class="highlighter-rouge"><pre class="highlight"><code>Welcome to @@ -973,7 +974,7 @@ In [2]: %time pdf = df.toPandas() CPU times: user 17.4 s, sys: 792 ms, total: 18.1 s Wall time: 20.7 s -In [3]: spark.conf.set("spark.sql.execution.arrow.enable", "true") +In [3]: spark.conf.set("spark.sql.execution.arrow.enabled", "true") In [4]: %time pdf = df.toPandas() CPU times: user 40 ms, sys: 32 ms, total: 72 ms @@ -1008,7 +1009,7 @@ It is planned to add pyarrow as a pyspark dependency so that <p>Currently, the controlling SQLConf is disabled by default. 
This can be enabled programmatically as in the example above or by adding the line -“spark.sql.execution.arrow.enable=true” to <code class="highlighter-rouge">SPARK_HOME/conf/spark-defaults.conf</code>.</p> +“spark.sql.execution.arrow.enabled=true” to <code class="highlighter-rouge">SPARK_HOME/conf/spark-defaults.conf</code>.</p> <p>Also, not all Spark data types are currently supported and limited to primitive types. Expanded type support is in the works and expected to also be in the Spark @@ -1041,7 +1042,7 @@ helped push this effort forwards.</p> <div class="container"> <h2> Apache Arrow 0.5.0 Release - <a href="/blog/2017/07/25/0.5.0-release/" class="permalink" title="Permalink">∞</a> + <a href="/blog/2017/07/24/0.5.0-release/" class="permalink" title="Permalink">∞</a> </h2> @@ -1052,7 +1053,7 @@ helped push this effort forwards.</p> <span class="label label-default">Published</span> <span class="published"> <i class="fa fa-calendar"></i> - 25 Jul 2017 + 24 Jul 2017 </span> </div> <div> @@ -1334,7 +1335,7 @@ conda install turbodbc -c conda-forge <div class="container"> <h2> Apache Arrow 0.4.0 Release - <a href="/blog/2017/05/23/0.4.0-release/" class="permalink" title="Permalink">∞</a> + <a href="/blog/2017/05/22/0.4.0-release/" class="permalink" title="Permalink">∞</a> </h2> @@ -1345,7 +1346,7 @@ conda install turbodbc -c conda-forge <span class="label label-default">Published</span> <span class="published"> <i class="fa fa-calendar"></i> - 23 May 2017 + 22 May 2017 </span> </div> <div> @@ -1439,7 +1440,7 @@ Linux. We are working on providing binary wheel installers for Windows as well.< <div class="container"> <h2> Apache Arrow 0.3.0 Release - <a href="/blog/2017/05/08/0.3-release/" class="permalink" title="Permalink">∞</a> + <a href="/blog/2017/05/07/0.3-release/" class="permalink" title="Permalink">∞</a> </h2> @@ -1450,7 +1451,7 @@ Linux. 
We are working on providing binary wheel installers for Windows as well.< <span class="label label-default">Published</span> <span class="published"> <i class="fa fa-calendar"></i> - 08 May 2017 + 07 May 2017 </span> </div> <div> @@ -1463,7 +1464,7 @@ Linux. We are working on providing binary wheel installers for Windows as well.< --> -<p>Translations: <a href="/blog/2017/05/08/0.3-release-japanese/">日本語</a></p> +<p>Translations: <a href="/blog/2017/05/07/0.3-release-japanese/">日本語</a></p> <p>The Apache Arrow team is pleased to announce the 0.3.0 release of the project. It is the product of an intense 10 weeks of development since the http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/committers/index.html ---------------------------------------------------------------------- diff --git a/committers/index.html b/committers/index.html index 5e8fac6..738495c 100644 --- a/committers/index.html +++ b/committers/index.html @@ -63,6 +63,7 @@ <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> </ul> </li> <li class="dropdown">
