This is an automated email from the ASF dual-hosted git repository.
wesm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 9895181 ARROW-1934: [Website] 0.8.0 release highlights blog post
9895181 is described below
commit 9895181610c0a5ef1ba836300ea2036e1a3e5621
Author: Wes McKinney <[email protected]>
AuthorDate: Tue Dec 19 10:29:30 2017 -0500
ARROW-1934: [Website] 0.8.0 release highlights blog post
I took a hack at this after poring through the changelog, others please let
me know if you'd like to add or change anything. I need to incorporate Sidd's
blog post and add a link to that here. I can publish all of this sometime
tomorrow morning New York time and post to social media etc.
Author: Wes McKinney <[email protected]>
Closes #1432 from wesm/ARROW-1934 and squashes the following commits:
4aa6bc6a [Wes McKinney] Tweaks for publication
da9e65d4 [Wes McKinney] Start drafting 0.8.0 release blog post
---
site/_posts/2017-12-19-0.8.0-release.md | 192 ++++++++++++++++++++++++++++++++
1 file changed, 192 insertions(+)
diff --git a/site/_posts/2017-12-19-0.8.0-release.md
b/site/_posts/2017-12-19-0.8.0-release.md
new file mode 100644
index 0000000..c65d308
--- /dev/null
+++ b/site/_posts/2017-12-19-0.8.0-release.md
@@ -0,0 +1,192 @@
+---
+layout: post
+title: "Apache Arrow 0.8.0 Release"
+date: "2017-12-19 00:01:00 -0400"
+author: wesm
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 0.8.0 release. It is the
+product of 10 weeks of development andincludes [**286 resolved JIRAs**][1] with
+many new features and bug fixes to the various language implementations. This
+is the largest release since 0.3.0 earlier this year.
+
+As part of work towards a stabilizing the Arrow format and making a 1.0.0
+release sometime in 2018, we made a series of backwards-incompatible changes to
+the serialized Arrow metadata that requires Arrow readers and writers (0.7.1
+and earlier) to upgrade in order to be compatible with 0.8.0 and higher. We
+expect future backwards-incompatible changes to be rare going forward.
+
+See the [Install Page][2] to learn how to get the libraries for your
+platform. The [complete changelog][3] is also available.
+
+We discuss some highlights from the release and other project news in this
+post.
+
+## Projects "Powered By" Apache Arrow
+
+A growing ecosystem of projects are using Arrow to solve in-memory analytics
+and data interchange problems. We have added a new [Powered By][18] page to the
+Arrow website where we can acknowledge open source projects and companies which
+are using Arrow. If you would like to add your project to the list as an Arrow
+user, please let us know.
+
+## New Arrow committers
+
+Since the last release, we have added 5 new Apache committers:
+
+* [Phillip Cloud][5], who has mainly contributed to C++ and Python
+* [Bryan Cutler][13], who has mainly contributed to Java and Spark integration
+* [Li Jin][14], who has mainly contributed to Java and Spark integration
+* [Paul Taylor][4], who has mainly contributed to JavaScript
+* [Siddharth Teotia][15], who has mainly contributed to Java
+
+Welcome to the Arrow team, and thank you for your contributions!
+
+## Improved Java vector API, performance improvements
+
+Siddharth Teotia led efforts to revamp the Java vector API to make things
+simpler and faster. As part of this, we removed the dichotomy between nullable
+and non-nullable vectors.
+
+See [Sidd's blog post][10] for more about these changes.
+
+## Decimal support in C++, Python, consistency with Java
+
+[Phillip Cloud][5] led efforts this release to harden details about exact
+decimal values in the Arrow specification and ensure a consistent
+implementation across Java, C++, and Python.
+
+Arrow now supports decimals represented internally as a 128-bit little-endian
+integer, with a set precision and scale (as defined in many SQL-based
+systems). As part of this work, we needed to change Java's internal
+representation from big- to little-endian.
+
+We are now integration testing decimals between Java, C++, and Python, which
+will facilitate Arrow adoption in Apache Spark and other systems that use both
+Java and Python.
+
+Decimal data can now be read and written by the [Apache Parquet C++
+library][6], including via pyarrow.
+
+In the future, we may implement support for smaller-precision decimals
+represented by 32- or 64-bit integers.
+
+## C++ improvements: expanded kernels library and more
+
+In C++, we have continued developing the new `arrow::compute` submodule
+consisting of native computation fuctions for Arrow data. New contributor
+[Licht Takeuchi][7] helped expand the supported types for type casting in
+`compute::Cast`. We have also implemented new kernels `Unique` and
+`DictionaryEncode` for computing the distinct elements of an array and
+dictionary encoding (conversion to categorical), respectively.
+
+We expect the C++ computation "kernel" library to be a major expansion area for
+the project over the next year and beyond. Here, we can also implement SIMD-
+and GPU-accelerated versions of basic in-memory analytics functionality.
+
+As minor breaking API change in C++, we have made the `RecordBatch` and `Table`
+APIs "virtual" or abstract interfaces, to enable different implementations of a
+record batch or table which conform to the standard interface. This will help
+enable features like lazy IO or column loading.
+
+There was significant work improving the C++ library generally and supporting
+work happening in Python and C. See the change log for full details.
+
+## GLib C improvements: Meson build, GPU support
+
+Developing of the GLib-based C bindings has generally tracked work happening in
+the C++ library. These bindings are being used to develop [data science tools
+for Ruby users][8] and elsewhere.
+
+The C bindings now support the [Meson build system][9] in addition to
+autotools, which enables them to be built on Windows.
+
+The Arrow GPU extension library is now also supported in the C bindings.
+
+## JavaScript: first independent release on NPM
+
+[Brian Hulette][11] and [Paul Taylor][4] have been continuing to drive efforts
+on the TypeScript-based JavaScript implementation.
+
+Since the last release, we made a first JavaScript-only Apache release, version
+0.2.0, which is [now available on NPM][12]. We decided to make separate
+JavaScript releases to enable the JS library to release more frequently than
+the rest of the project.
+
+## Python improvements
+
+In addition to some of the new features mentioned above, we have made a variety
+of usability and performance improvements for integrations with pandas, NumPy,
+Dask, and other Python projects which may make use of pyarrow, the Arrow Python
+library.
+
+Some of these improvements include:
+
+* [Component-based serialization][16] for more flexible and memory-efficient
+ transport of large or complex Python objects
+* Substantially improved serialization performance for pandas objects when
+ using `pyarrow.serialize` and `pyarrow.deserialize`. This includes a special
+ `pyarrow.pandas_serialization_context` which further accelerates certain
+ internal details of pandas serialization * Support zero-copy reads for
+* `pandas.DataFrame` using `pyarrow.deserialize` for objects without Python
+ objects
+* Multithreaded conversions from `pandas.DataFrame` to `pyarrow.Table` (we
+ already supported multithreaded conversions from Arrow back to pandas)
+* More efficient conversion from 1-dimensional NumPy arrays to Arrow format
+* New generic buffer compression and decompression APIs `pyarrow.compress` and
+ `pyarrow.decompress`
+* Enhanced Parquet cross-compatibility with [fastparquet][17] and improved Dask
+ support
+* Python support for accessing Parquet row group column statistics
+
+## Upcoming Roadmap
+
+The 0.8.0 release includes some API and format changes, but upcoming releases
+will focus on ompleting and stabilizing critical functionality to move the
+project closer to a 1.0.0 release.
+
+With the ecosystem of projects using Arrow expanding rapidly, we will be
+working to improve and expand the libraries in support of downstream use cases.
+
+We continue to look for more JavaScript, Julia, R, Rust, and other programming
+language developers to join the project and expand the available
+implementations and bindings to more languages.
+
+[1]:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.8.0
+[2]: https://arrow.apache.org/install
+[3]: https://arrow.apache.org/release/0.7.0.html
+[3]: https://github.com/kou
+[4]: https://github.com/trxcllnt
+[5]: https://github.com/cpcloud
+[6]: https://github.com/apache/parquet-cpp
+[7]: https://github.com/licht-t
+[8]: https://github.com/red-data-tools
+[9]: https://mesonbuild.com
+[10]: https://arrow.apache.org/blog/2017/12/19/java-vector-improvements/
+[11]: https://github.com/TheNeuralBit
+[12]: http://npmjs.org/package/apache-arrow
+[13]: https://github.com/BryanCutler
+[14]: https://github.com/icexelloss
+[15]: https://github.com/siddharthteotia
+[16]: http://arrow.apache.org/docs/python/ipc.html
+[17]: https://github.com/dask/fastparquet
+[18]: http://arrow.apache.org/powered_by/
--
To stop receiving notification emails like this one, please contact
['"[email protected]" <[email protected]>'].