mrkn commented on code in PR #207:
URL: https://github.com/apache/arrow-site/pull/207#discussion_r869727077


##########
_posts/2022-05-05-8.0.0-release.md:
##########
@@ -0,0 +1,230 @@
+---
+layout: post
+title: "Apache Arrow 8.0.0 Release"
+date: "2022-05-05 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+
+The Apache Arrow team is pleased to announce the 8.0.0 release. This covers
+over 3 months of development work and includes [**586 resolved issues**][1]
+from [**YYY distinct contributors**][2]. See the Install Page to learn how to
+get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 7.0.0 release, Kun Liu, Raphael Taylor-Davies, Xudong Wang, Yijie Shen
+and Liang-Chi Hsieh have been invited to be committers.
+Thanks for your contributions and participation in the project!
+
+## Columnar Format Notes
+
+## Arrow Flight RPC notes
+
+Flight SQL has been extended with a method to get type metadata 
([ARROW-15313](https://issues.apache.org/jira/browse/ARROW-15313)) and column 
metadata in returned schemas 
([ARROW-15314](https://issues.apache.org/jira/browse/ARROW-15314), 
[ARROW-16064](https://issues.apache.org/jira/browse/ARROW-16064)). New 
documentation is available describing Flight and Flight SQL, along with several 
Cookbook recipes 
([ARROW-14698](https://issues.apache.org/jira/browse/ARROW-14698), 
[ARROW-16065](https://issues.apache.org/jira/browse/ARROW-16065)).
+
+The C++ libraries now support UCX as a network transport 
([ARROW-15706](https://issues.apache.org/jira/browse/ARROW-15706)), and the 
APIs have been refactored to allow other transports to be implemented 
([ARROW-15282](https://issues.apache.org/jira/browse/ARROW-15282)). UCX support 
is experimental and still subject to change. Many of the APIs have been 
refactored to use the `arrow::Result` type, and the original variants have been 
deprecated ([ARROW-16032](https://issues.apache.org/jira/browse/ARROW-16032)). 
Support for gRPC >= 1.43 has been added 
([ARROW-15551](https://issues.apache.org/jira/browse/ARROW-15551)).
+## C++ notes
+
+### Compute
+
+Arrow C++ can now optionally build with support for the experimental
+[Substrait](https://substrait.io/) query representation format 
([ARROW-15238](https://issues.apache.org/jira/browse/ARROW-15238)).
+
+A number of compute kernels operating on temporal data have been added:
+
+* addition, subtraction and multiplication between various temporal types;
+* a new "is_dst" function to compute whether the input timestamps fall within
+daylight saving time (DST);
+* a new "is_leap_year" function to compute whether the input timestamps fall
+within a leap year.
+
+It is possible to enable a timezone database on Windows at runtime
+by calling the `arrow::Initialize()` function 
([ARROW-13168](https://issues.apache.org/jira/browse/ARROW-13168)).
+
+New hash aggregations are available: "hash_one" to return one value from each
+group ([ARROW-13993](https://issues.apache.org/jira/browse/ARROW-13993)), and 
"hash_list" to return all values from each group
+([ARROW-15152](https://issues.apache.org/jira/browse/ARROW-15152)).  Null 
columns are now supported on the sum, mean and product
+hash aggregates 
([ARROW-15506](https://issues.apache.org/jira/browse/ARROW-15506)).  Also, it 
is now possible to execute hash
+"aggregations" with only key columns 
([ARROW-15609](https://issues.apache.org/jira/browse/ARROW-15609)).
+
+A new compute function "map_lookup" allows looking up a given key in a map
+array ([ARROW-15089](https://issues.apache.org/jira/browse/ARROW-15089)).
+
+New compute functions "sqrt" and "sqrt_checked" allow extracting the square
+root of their input 
([ARROW-15614](https://issues.apache.org/jira/browse/ARROW-15614)).
+
+Casting between two struct types is now possible, assuming the destination 
field
+names all exist in the source struct type 
([ARROW-1888](https://issues.apache.org/jira/browse/ARROW-1888), 
[ARROW-15643](https://issues.apache.org/jira/browse/ARROW-15643)).
+
+Optional OpenTelemetry tracing has been added to kernel functions and execution
+plan nodes ([ARROW-15061](https://issues.apache.org/jira/browse/ARROW-15061)).
+
+The CMake build option `ARROW_ENGINE` has been renamed to `ARROW_SUBSTRAIT`,
+to better reflect its actual effect 
([ARROW-16158](https://issues.apache.org/jira/browse/ARROW-16158)).
+
+### CSV
+
+It is now possible to change the field delimiter when writing a CSV file
+([ARROW-15672](https://issues.apache.org/jira/browse/ARROW-15672)).
+
+### Dataset
+
+The ORC dataset scanner now observes the batch size parameter 
([ARROW-14153](https://issues.apache.org/jira/browse/ARROW-14153)).
+
+The dataset layer now supports filename-based partitioning, where the data
+files are all laid out in the dataset's base directory, their names prefixed
+with the partition values separated by underscore characters 
([ARROW-14612](https://issues.apache.org/jira/browse/ARROW-14612)).
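
As a rough illustration of the file-name layout described above, the following pure-Python sketch splits the leading underscore-separated values off a file name. It is not Arrow's implementation: the helper `parse_filename_partitions`, the field names `year` and `month`, and the example path are all hypothetical, and the values are kept as plain strings with no type conversion.

```python
from pathlib import PurePosixPath


def parse_filename_partitions(path, field_names):
    """Split the leading underscore-separated partition values off a file name.

    `field_names` lists the partition fields in order (a hypothetical input
    for this sketch). Returns the values keyed by field name, plus the rest
    of the file name.
    """
    name = PurePosixPath(path).name
    # One split per partition field; whatever follows is the file's own name.
    parts = name.split("_", len(field_names))
    if len(parts) <= len(field_names):
        raise ValueError(
            f"{name!r} has fewer than {len(field_names)} partition values"
        )
    values = dict(zip(field_names, parts[:len(field_names)]))
    return values, parts[-1]


# Hypothetical file in the dataset's base directory.
values, rest = parse_filename_partitions(
    "base_dir/2022_05_part-0.parquet", ["year", "month"]
)
print(values, rest)
```

Capping the number of splits at the number of partition fields leaves any later underscores in the remaining file name untouched.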
+
+Optional OpenTelemetry tracing has been added to the dataset scanner 
([ARROW-15067](https://issues.apache.org/jira/browse/ARROW-15067)).
+
+### Filesystem
+
+It is possible to instantiate a Google Cloud Storage (GCS) filesystem
+from a URI, making GCS implicitly usable in the datasets layer 
([ARROW-14893](https://issues.apache.org/jira/browse/ARROW-14893)).
+Recognized URI schemes are `gs` and `gcs`.
+
+`FileSystem::DeleteDirContents` can now optionally succeed when the directory
+doesn't exist 
([ARROW-16159](https://issues.apache.org/jira/browse/ARROW-16159)).
+
+### IO
+
+It is possible to override the number of IO threads using the
+environment variable `ARROW_IO_THREADS` 
([ARROW-15941](https://issues.apache.org/jira/browse/ARROW-15941)).
+
+### IPC
+
+The IPC file reader and writer now allow accessing the custom metadata
+associated with record batches 
([ARROW-16131](https://issues.apache.org/jira/browse/ARROW-16131)).
+
+### Miscellaneous
+
+It is possible to enable lightweight memory checks on the standard memory pools
+using a dedicated environment variable `ARROW_DEBUG_MEMORY_POOL` 
([ARROW-15550](https://issues.apache.org/jira/browse/ARROW-15550)).
+These checks are not a replacement for sophisticated checkers such as Address
+Sanitizer or Valgrind, but may prove useful if those tools are not
+available.
+
+Temporal data is now validated when doing full array validation 
([ARROW-10924](https://issues.apache.org/jira/browse/ARROW-10924)).
+The validation catches values not matching the specification (for example,
+a time value being outside of the span of one day).
+
+The GDB plugin now attempts to print the data of an array, in addition to its
+metadata ([ARROW-15389](https://issues.apache.org/jira/browse/ARROW-15389)).  
This only works for primitive datatypes.
+
+Pretty-printing is now shorter and more customizable for nested datatypes
+([ARROW-14798](https://issues.apache.org/jira/browse/ARROW-14798)).
+
+## C# notes
+
+## Go notes
+
+### Bug fixes
+
+* parquet_reader / parquet_schema no longer crash 
([ARROW-15509](https://github.com/apache/arrow/pull/12303))
+* Base64 encoding of origin Arrow schema properly uses padding and decodes 
both with or without the padding in pqarrow.getOriginSchema 
([ARROW-15544](https://github.com/apache/arrow/pull/12679))
+* ipc.Writer no longer includes unnecessary offsets when encoding sliced 
arrays ([ARROW-15715](https://github.com/apache/arrow/pull/12453))
+* Use base64.StdEncoding for Arrow Flight Basic Auth middleware for proper 
encoding/decoding ([ARROW-15772](https://github.com/apache/arrow/pull/12503))
+* Fix panic during concurrent compression of ipc body buffers due to negative 
WaitGroup counter ([ARROW-15792](https://github.com/apache/arrow/pull/12518))
+* Fix memory leak in pqarrow.NewColumnWriter with nested structures 
([ARROW-15946](https://github.com/apache/arrow/pull/12641))
+* ipc.FileReader no longer leaks memory when using ZSTD compression 
([ARROW-16163](https://github.com/apache/arrow/pull/12857))
+
+### Enhancements
+
+#### Flight RPC
+
+* *Breaking Change* gRPC version is updated and Flight Server creation has 
been simplified. Flight servers must now embed flight.BaseFlightServer 
([ARROW-15418](https://github.com/apache/arrow/pull/12233))
+* You can now provide a full net.Listener as an alternative to just providing 
an address to bind to when creating a Flight server 
([ARROW-16082](https://github.com/apache/arrow/pull/12768))
+
+#### Parquet
+
+* `unpack_bool`, `sum_float64`, and bitmap handling functions have been given 
optimized assembly implementations for Arm64 NEON 
([ARROW-15440](https://github.com/apache/arrow/pull/12398), 
[ARROW-15742](https://github.com/apache/arrow/pull/12502), 
[ARROW-15995](https://github.com/apache/arrow/pull/12687))
+* Go Parquet handling has been simplified to only need io.ReadSeeker instead 
of a ReadAtSeeker interface 
([ARROW-15963](https://github.com/apache/arrow/pull/12658))
+* BitSetRunReader and helper functions have been lifted to internal/bitutils 
package to share between arrow and parquet implementations 
([ARROW-15950](https://github.com/apache/arrow/pull/12926))
+* Parquet Reader now properly obeys the buffer size read property for buffered 
streams ([ARROW-16187](https://github.com/apache/arrow/pull/12876))
+* parquet NewBufferedReader no longer panics 
([ARROW-16283](https://github.com/apache/arrow/pull/12960))
+
+#### Arrow
+
+* Go Arrow library now supports Dictionary Arrays 
([ARROW-3039](https://issues.apache.org/jira/browse/ARROW-3039), 
[ARROW-9378](https://issues.apache.org/jira/browse/ARROW-9378), 
[GH-12158](https://github.com/apache/arrow/pull/12158))
+* array.ArrayEqual and array.ArrayApproxEqual have been renamed to array.Equal 
and array.ApproxEqual. Aliases are provided to avoid breaking existing code; 
they will be removed in v9 
([ARROW-5598](https://github.com/apache/arrow/pull/12877))
+* The custom CPU discovery package has been replaced with golang.org/x/sys/cpu 
([ARROW-16193](https://github.com/apache/arrow/pull/12764))
+* *Breaking Change* array.Interface, array.Record, array.Table, etc. were 
deprecated in v7 in favor of arrow.Array, arrow.Record, etc. The deprecated 
aliases have been removed in v8 
([ARROW-16192](https://github.com/apache/arrow/pull/12960))
+
+#### CI
+
+* staticcheck is now run as part of CI to lint the Go code 
([ARROW-15296](https://github.com/apache/arrow/pull/12540))
+* Travis builds on Arm64 for Go are no longer allowed to fail 
([ARROW-15400](https://github.com/apache/arrow/pull/12214))
+
+## Java notes
+* Java 17 is now supported as a target and has been added to tested platforms
+* When scanning datasets, an `ArrowReader` is now returned, which makes it easier 
to create a `VectorSchemaRoot` from it.
+* The Java documentation has been improved overall, with a few sections added and 
most sections rewritten as clearer tutorials
+* Overall improvements to FlightSQL support in Java
+* [Java Cookbook](https://arrow.apache.org/cookbook/java/index.html) is now 
available
+## JavaScript notes
+
+## Python notes
+
+In general, the Python bindings benefit from improvements in the C++ library
+(e.g. new compute functions); see the C++ notes above for additional details.
+In addition:
+
+* Tables and Datasets now support the ``join`` operation to perform `left`, 
`right`, `full` joins of `inner` or `outer` types. The result of the join 
operation will be a new table 
([ARROW-14293](https://issues.apache.org/jira/browse/ARROW-14293)). See 
https://arrow.apache.org/docs/dev/python/compute.html#table-and-dataset-joins 
for examples.
+* Additional legacy keywords and properties of the ``ParquetDataset`` class 
have been deprecated and will issue a warning, in favor of functionality based 
on the `pyarrow.dataset` functionality 
([ARROW-16119](https://issues.apache.org/jira/browse/ARROW-16119)).
+* It is now possible to create references to nested fields in a Table or 
Dataset using ``pc.field("a", "b")`` 
([ARROW-11259](https://issues.apache.org/jira/browse/ARROW-11259)).
+* Docstrings in ``Schema``, ``ChunkedArray``, ``Tensor``, ``RecordBatch``, 
``parquet`` and ``Table`` now include examples of how to use the methods and 
classes 
([ARROW-15367](https://issues.apache.org/jira/browse/ARROW-15367)).
+* Reading and writing Parquet files now supports encryption 
([ARROW-9947](https://issues.apache.org/jira/browse/ARROW-9947)). See the 
[docs](https://arrow.apache.org/docs/python/parquet.html#parquet-modular-encryption-columnar-encryption)
 for more details.
+* Support for `zoneinfo` (Python 3.9+) and `dateutil` timezones in conversion 
to Arrow data structures ([ARROW-5248](https://issues.apache.org/jira/browse/)).
+* Multiple bugfixes and segfaults resolved.
+
+## R notes
+
+This release includes:
+
+ * Support for over 20 additional `lubridate` and `base` date and time 
functions in Arrow dplyr queries,
+ * An API to allow external packages to define custom extension array types 
and custom conversions into Arrow types,
+ * Support for concatenating Arrays, RecordBatches, and Tables, including with 
`c()`, `rbind()` and `cbind()`.
+
+For more on what’s in the 8.0.0 R package, see the [R changelog][4].
+
+## Ruby and C GLib notes
+
+### Ruby
+
+### C GLib
+

Review Comment:
   ```suggestion
   ### C GLib
   
   * Add `gaflight_client_close` ([ARROW-15487](https://issues.apache.org/jira/browse/ARROW-15487))
   * Add `GParquetFileMetadata` and `gparquet_arrow_file_reader_get_metadata` ([ARROW-16214](https://issues.apache.org/jira/browse/ARROW-16214))
   * Fix `GArrowGIOInputStream` so that all the data is completely read ([ARROW-15626](https://issues.apache.org/jira/browse/ARROW-15626))
   * Add `garrow_string_array_builder_append_string_len` and `garrow_large_string_array_builder_append_string_len` ([ARROW-15629](https://issues.apache.org/jira/browse/ARROW-15629))
   * Add `GParquetRowGroupMetadata` ([ARROW-16245](https://issues.apache.org/jira/browse/ARROW-16245))
   * Add `GParquetColumnChunkMetadata` ([ARROW-16250](https://issues.apache.org/jira/browse/ARROW-16250))
   * Add `GArrowGCSFileSystem` ([ARROW-16247](https://issues.apache.org/jira/browse/ARROW-16247))
   * Add `GParquetStatistics` and its family ([ARROW-16251](https://issues.apache.org/jira/browse/ARROW-16251))
   * Add missing casts for `GArrowRoundMode` ([ARROW-16296](https://issues.apache.org/jira/browse/ARROW-16296))
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
