pitrou commented on a change in pull request #63:
URL: https://github.com/apache/arrow-site/pull/63#discussion_r456640654



##########
File path: _posts/2020-07-16-1.0.0-release.md
##########
@@ -0,0 +1,211 @@
+---
+layout: post
+title: "Apache Arrow 1.0.0 Release"
+date: "2020-07-16 00:00:00 -0600"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 1.0.0 release. This covers
+over 2 months of development work and includes [**XX resolved issues**][1]
+from [**XX distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## 1.0 Format Release
+
+The Arrow format received several changes and additions, leading to the
+1.0 format version:
+
+* The metadata version was bumped to a new version V5, indicating an
+  incompatible change in the buffer layout of Union types (ARROW-9258).
+  All other types keep the same layout as in V4.
+
+* Dictionary indices are now allowed to be unsigned rather than signed
+  (ARROW-9259). Using UInt64 is still discouraged because of poor Java
+  support.
+
+* A "Feature" enum has been added to announce the use of specific optional
+  features in an IPC stream, such as buffer compression (ARROW-9308).  This
+  new field is not used by any implementation yet.
+
+* Optional buffer compression using LZ4 or ZStandard was added to the IPC
+  format (ARROW-300).
+
+* Decimal types now have an optional "bitWidth" field, defaulting to 128
+  (ARROW-8985).  This will allow for future support of other decimal widths
+  such as 32- and 64-bit.
+
+The 1.0 major release indicates that the Arrow columnar format is declared
+stable, with [forward and backward compatibility guarantees][5].
+
+Integration testing has been expanded to test for extension types and
+nested dictionaries.
+
+XXX Link to implementation matrix
+
+Format changes: unions, unsigned int dictionary indices, decimal bit width, feature enum.
+
+## Community
+
+Since the last release, we have added two new committers:
+
+* Liya Fan
+* Ji Liu
+
+Thank you for all your contributions!
+
+<!-- Acknowledge and link to any new committers and PMC members since the last release. See previous release announcements for examples. -->
+
+## Arrow Flight RPC notes
+
+Flight now offers DoExchange, a fully bidirectional data endpoint, in addition
+to DoGet and DoPut, in C++, Java, and Python. Middlewares in all languages now
+expose binary-valued headers. Additionally, servers and clients can set Arrow
+IPC read/write options in all languages, making compatibility easier with earlier
+versions of Arrow Flight.
+
+In C++ and Python, Flight now exposes more options from gRPC, including the
+address of the client (on the server) and the ability to set low-level gRPC
+client options. Flight also supports mutual TLS authentication and the ability
+for a client to control the size of a data message on the wire.
+
+## C++ notes
+
+Support for static linking with Arrow has been vastly improved, including the
+introduction of a `libarrow_bundled_dependencies.a` library bundling all
+required dependencies together (ARROW-7605).
+
+Following the Arrow format changes, Union arrays cannot have a top-level
+bitmap anymore (ARROW-9278; see also the IPC changes below).
+
+A number of improvements were made to reduce the overall code size in the
+Arrow library.
+
+A convenience API `GetBuildInfo` allows querying the characteristics of
+the Arrow library (ARROW-6521).  We encourage you to suggest any desired
+addition to the returned information.
+
+We added an optional dependency on the `utf8proc` library, used in several
+compute functions (ARROW-8961; see below).
+
+Instead of sharing the same concrete classes, sparse and dense unions now
+have separate classes (`SparseUnionType` and `DenseUnionType`, as well
+as `SparseUnionArray`, `DenseUnionArray`, `SparseUnionScalar`,
+`DenseUnionScalar`; ARROW-8866).
+
+Arrow can now be built for iOS using the right set of CMake options, though
+we don't officially support it (ARROW-8795).  See
+[this writeup](https://github.com/UnfoldedInc/deck.gl-native-dependencies/blob/master/docs/iOS-BUILD.md#arrow-v0170)
+for details.
+
+### Compute functions
+
+The compute kernel layer was extensively reworked (ARROW-8792).  It now offers
+a generic function lookup, dispatch and execution mechanism.  Furthermore,
+elaborate internal scaffoldings make it vastly easier to write new function
+kernels.
+
+Several compute functions have been added.  Unicode-compliant predicates and
+transforms, such as lowercase and uppercase transforms, are now available.
+
+The available compute functions are listed exhaustively in the Sphinx-generated
+documentation.
+
+### Datasets
+
+Datasets can now be read from CSV files (ARROW-7759).
+
+### Feather
+
+The Feather format is now available in version 2, which is simply the Arrow
+IPC file format with another name.
+
+### IPC
+
+By default, we now write IPC streams with metadata V5.  However, metadata V4
+can be requested by setting the appropriate member in `IpcWriteOptions`.
+
+V4 as well as V5 metadata IPC streams can be read properly, with one
+exception: a V4 metadata stream containing Union arrays with top-level
+null values is rejected by the reader.
+
+Support for dictionary replacement and dictionary delta was implemented
+(ARROW-7285).
+
+### Parquet
+
+Writing files with the LZ4 codec is disabled, because it produces files
+incompatible with the widely-used Hadoop Parquet implementation (ARROW-9424).
+Support will be reenabled once we align the LZ4 implementation with the
+special buffer encoding expected by Hadoop.
+
+## C# notes
+
+## Go notes
+
+## Java notes
+
+## JavaScript notes
+
+## Python notes
+
+The size of wheel packages is significantly reduced.  One side effect is
+that these wheels do not enable Gandiva anymore (ARROW-5082).
+
+The Scalar class hierarchy was reworked to more closely follow its C++
+counterpart (ARROW-9017).
+
+TLS CA certificates are looked up more reliably when using the S3 filesystem,
+especially with manylinux wheels (ARROW-9261).
+
+The encoding of CSV files can now be specified explicitly, defaulting to UTF-8
+(ARROW-9106).  Custom timestamp parsers can now be used for CSV files
+(ARROW-8711).
+
+Filesystems can now be implemented in pure Python (ARROW-8766).  As a result,
+[fsspec](https://filesystem-spec.readthedocs.io)-based filesystems can now
+be used in datasets (ARROW-9383).
+
+## R notes
+
+The R package added support for converting to and from many additional Arrow 
types. Tables showing how R types are mapped to Arrow types and vice versa have 
been added to the [introductory vignette][6], and nearly all types are handled. 
In addition, R `attributes` like custom classes and metadata are now preserved 
when converting a `data.frame` to an Arrow Table and are restored when loading 
them back into R.

Review comment:
       Please, can text be properly word-wrapped at ~80 characters for easier 
review and editing?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

