This is an automated email from the ASF dual-hosted git repository.

raulcd pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 47f69206be7 [Website] Version 12.0.0 blog post (#346)
47f69206be7 is described below

commit 47f69206be7fb50120a224409bb411624158e0f1
Author: Raúl Cumplido <[email protected]>
AuthorDate: Mon May 8 12:31:38 2023 +0200

    [Website] Version 12.0.0 blog post (#346)
    
    PR to start adding the blog post information for the Release 12.0.0
    
    The 12.0.0 milestone with all the GitHub closed issues can be found
    here:
    https://github.com/apache/arrow/milestone/51?closed=1
    
    ---------
    
    Co-authored-by: Dominik Moritz <[email protected]>
    Co-authored-by: Sutou Kouhei <[email protected]>
    Co-authored-by: David Li <[email protected]>
    Co-authored-by: Matt Topol <[email protected]>
    Co-authored-by: Dewey Dunnington <[email protected]>
    Co-authored-by: Joris Van den Bossche <[email protected]>
    Co-authored-by: Alenka Frim <[email protected]>
    Co-authored-by: Weston Pace <[email protected]>
---
 _posts/2023-05-02-12.0.0-release.md | 289 ++++++++++++++++++++++++++++++++++++
 1 file changed, 289 insertions(+)

diff --git a/_posts/2023-05-02-12.0.0-release.md b/_posts/2023-05-02-12.0.0-release.md
new file mode 100644
index 00000000000..ea2f63325e0
--- /dev/null
+++ b/_posts/2023-05-02-12.0.0-release.md
@@ -0,0 +1,289 @@
+---
+layout: post
+title: "Apache Arrow 12.0.0 Release"
+date: "2023-05-02 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 12.0.0 release. This covers
+over 3 months of development work and includes [**476 resolved issues**][1]
+with [**531 commits from 97 distinct contributors**][2].
+See the [Install Page](https://arrow.apache.org/install/)
+to learn how to get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 11.0.0 release, Wang Mingming, Mustafa Akur and Ruihang Xia
+have been invited to be committers.
+Will Jones has joined the Project Management Committee (PMC).
+
+Thanks for your contributions and participation in the project!
+
+## Columnar Format Notes
+
+A first "canonical" extension type has been formalized: `arrow.fixed_shape_tensor` to
+represent an Array where each slot contains a tensor, with all tensors having the same
+dimension and shape, [GH-33923](https://github.com/apache/arrow/issues/33923).
+This is based on a Fixed-Size List layout as storage array
+(<https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor-extension>).
+
+## Arrow Flight RPC notes
+
+The JDBC driver for Arrow Flight SQL has had some bugfixes, and has been refactored into a core library (which is not distributed as an uberjar with shaded names) and a driver (which is distributed as an uberjar).
+
+The Java server builder API now offers easier access to the underlying gRPC builder.
+
+Go now implements the Flight SQL extensions for Substrait and transaction support.
+
+## Plasma notes
+
+Plasma has been deprecated since 10.0.0 and is removed in this
+release. [GH-33243][GH-33243]
+
+## C++ notes
+
+* Run-End Encoded Arrays have been implemented and are accessible ([GH-32104](https://github.com/apache/arrow/issues/32104))
+* The FixedShapeTensor Logical value type has been implemented using ExtensionType ([GH-15483](https://github.com/apache/arrow/issues/15483), [GH-34796](https://github.com/apache/arrow/issues/34796))
+
+### Compute
+
+* New kernel to convert timestamp with timezone to wall time ([GH-33143](https://github.com/apache/arrow/issues/33143))
+* Cast kernels are now built into libarrow by default ([GH-34388](https://github.com/apache/arrow/issues/34388))
+
+### Acero
+
+* Acero has been moved out of libarrow into its own shared library, allowing for smaller builds of the core libarrow ([GH-15280](https://github.com/apache/arrow/issues/15280))
+* Exec nodes can now have a concept of "ordering" and will reject nonsensical plans ([GH-34136](https://github.com/apache/arrow/issues/34136))
+* New exec nodes: "pivot_longer" ([GH-34266](https://github.com/apache/arrow/issues/34266)), "order_by" ([GH-34248](https://github.com/apache/arrow/issues/34248)) and "fetch" ([GH-34059](https://github.com/apache/arrow/issues/34059))
+* _Breaking Change:_ Reordered the output fields of the "group_by" node so that keys/segment keys come before aggregates ([GH-33616](https://github.com/apache/arrow/issues/33616))
+
+### Substrait
+
+* Add support for the `round` function [GH-33588](https://github.com/apache/arrow/issues/33588)
+* Add support for the `cast` expression element [GH-31910](https://github.com/apache/arrow/issues/31910)
+* Added API reference documentation [GH-34011](https://github.com/apache/arrow/issues/34011)
+* Added an extension relation to support segmented aggregation [GH-34626](https://github.com/apache/arrow/issues/34626)
+* The output of the aggregate relation now conforms to the spec [GH-34786](https://github.com/apache/arrow/issues/34786)
+
+### Parquet
+
+* Added support for DeltaLengthByteArray encoding to the Parquet writer ([GH-33024](https://github.com/apache/arrow/issues/33024))
+* NaNs are now handled correctly for Parquet predicate push-downs ([GH-18481](https://github.com/apache/arrow/issues/18481))
+* Added support for reading Parquet page indexes ([GH-33596](https://github.com/apache/arrow/issues/33596)) and writing page indexes ([GH-34053](https://github.com/apache/arrow/issues/34053))
+* Parquet writer can now write columns in parallel ([GH-33655](https://github.com/apache/arrow/issues/33655))
+* Fixed incorrect number of rows in Parquet V2 page headers ([GH-34086](https://github.com/apache/arrow/issues/34086))
+* Fixed incorrect Parquet page null_count when stats are disabled ([GH-34326](https://github.com/apache/arrow/issues/34326))
+* Added support for reading BloomFilters to the Parquet Reader ([GH-34665](https://github.com/apache/arrow/issues/34665))
+* Parquet File-writer can now add additional key-value metadata after it has been opened ([GH-34888](https://github.com/apache/arrow/issues/34888))
+* _Breaking Change:_ The default row group size for the Arrow writer changed from 64Mi rows to 1Mi rows ([GH-34280](https://github.com/apache/arrow/issues/34280))
+
+### ORC
+
+* Added support for the union type in ORC writer ([GH-34262](https://github.com/apache/arrow/issues/34262))
+* Fixed ORC CHAR type mapping with Arrow ([GH-34823](https://github.com/apache/arrow/issues/34823))
+* Fixed timestamp type mapping between ORC and Arrow ([GH-34590](https://github.com/apache/arrow/issues/34590))
+
+### Datasets
+
+* Added support for reading JSON datasets ([GH-33209](https://github.com/apache/arrow/issues/33209))
+* Dataset writer now supports specifying a function callback to construct the file name in addition to the existing file name template ([GH-34565](https://github.com/apache/arrow/issues/34565))
+
+### Filesystems
+
+* GcsFileSystem::OpenInputFile avoids unnecessary downloads ([GH-34051](https://github.com/apache/arrow/issues/34051))
+
+### Other changes
+
+* Convenience Append(std::optional<T>...) methods have been added to array builders ([GH-14863](https://github.com/apache/arrow/issues/14863))
+* A deprecated OpenTelemetry header was removed from the Flight library ([GH-34417](https://github.com/apache/arrow/issues/34417))
+* Fixed crash in "take" kernels on ExtensionArrays with an underlying dictionary type ([GH-34619](https://github.com/apache/arrow/issues/34619))
+* Fixed bug where the C-Data bridge did not preserve nullability of map values on import ([GH-34983](https://github.com/apache/arrow/issues/34983))
+* Added support for EqualOptions to RecordBatch::Equals ([GH-34968](https://github.com/apache/arrow/issues/34968))
+* zstd dependency upgraded to v1.5.5 ([GH-34899](https://github.com/apache/arrow/issues/34899))
+* Improved handling of "logical" nulls such as with union and RunEndEncoded arrays ([GH-34361](https://github.com/apache/arrow/issues/34361))
+* Fixed incorrect handling of uncompressed body buffers in IPC reader, added IpcWriteOptions::min_space_savings for optional compression optimizations ([GH-15102](https://github.com/apache/arrow/issues/15102))
+
+## C# notes
+
+* Support added for importing / exporting schemas and types via the
+C data interface [GH-34737](https://github.com/apache/arrow/issues/34737)
+* Support added for the half-float data type [GH-25163](https://github.com/apache/arrow/issues/25163)
+* Schemas are now allowed to have multiple fields with the same name
+[GH-34076](https://github.com/apache/arrow/issues/34076)
+* Added support for reading compressed IPC files [GH-32240](https://github.com/apache/arrow/issues/32240)
+* Added [] operator to Schema [GH-34119](https://github.com/apache/arrow/issues/34119)
+
+## Go notes
+
+* Run-End Encoded Arrays have been added to the Golang implementation ([GH-32104](https://github.com/apache/arrow/issues/32104), [GH-32946](https://github.com/apache/arrow/issues/32946), [GH-20407](https://github.com/apache/arrow/issues/20407), [GH-32949](https://github.com/apache/arrow/issues/32949))
+* The SQLite Flight SQL Example has been improved and you can now install a simple SQLite Flight SQL Server mainprog using `go get github.com/apache/arrow/go/v12/arrow/flight/flightsql/example/cmd/sqlite_flightsql_server` ([GH-33840](https://github.com/apache/arrow/issues/33840))
+* Fixed bug causing builds to fail with the `noasm` build tag ([GH-34044](https://github.com/apache/arrow/issues/34044)) and added a CI test run that uses the `noasm` tag ([GH-34055](https://github.com/apache/arrow/issues/34055))
+* Fixed the build on riscv64-freebsd ([GH-34629](https://github.com/apache/arrow/issues/34629))
+* Fixed issue preventing building on 32-bit platforms ([GH-35133](https://github.com/apache/arrow/issues/35133))
+
+### Arrow
+
+* Fixed bug in C-Data handling of `ArrowArrayStream.get_next` when handling uninitialized `ArrowArrays` ([GH-33767](https://github.com/apache/arrow/issues/33767))
+* _Breaking Change:_ Added `Err()` method to `RecordReader` interface so that it can propagate errors ([GH-33789](https://github.com/apache/arrow/issues/33789))
+* Fixed potential panic in C-Data API for misusing an invalid handle ([GH-33864](https://github.com/apache/arrow/issues/33864))
+* A new cgo-based Allocator that does not depend on libarrow has been added to the memory package ([GH-33901](https://github.com/apache/arrow/issues/33901))
+* CSV Reader and Writer now support Extension type arrays ([GH-34334](https://github.com/apache/arrow/issues/34334))
+* Fixed bug preventing the reading of IPC streams/files with compression enabled but uncompressed buffers ([GH-34385](https://github.com/apache/arrow/issues/34385))
+* Added interface which can be added to an `ExtensionType` to allow Builders to be created via `NewBuilder`, enabling easy building of nested fields containing extension types ([GH-34453](https://github.com/apache/arrow/issues/34453))
+* Added utilities to perform Array diffing ([GH-34790](https://github.com/apache/arrow/issues/34790))
+* Added `SetColumn` method to `arrow.Record` ([GH-34832](https://github.com/apache/arrow/issues/34832))
+* Added `GetValue` method to `arrow.Metadata` ([GH-34855](https://github.com/apache/arrow/issues/34855))
+* Added `Pow` method to `decimal128.Num` and `decimal256.Num` ([GH-34863](https://github.com/apache/arrow/issues/34863))
+
+#### Flight
+
+* Fixed bug in `StreamChunks` for Flight SQL to correctly propagate to the gRPC client ([GH-33717](https://github.com/apache/arrow/issues/33717))
+* Fixed issue that prevented compatibility with gRPC < v1.45 ([GH-33734](https://github.com/apache/arrow/issues/33734))
+* Added support to bind a `RecordReader` for supplying parameters to a Flight SQL Prepared statement ([GH-33794](https://github.com/apache/arrow/issues/33794))
+* Prepared Statement methods for FlightSQL client now allow gRPC Call Options ([GH-33867](https://github.com/apache/arrow/issues/33867))
+* FlightSQL Extensions have been implemented (for transactions and Substrait support) ([GH-33935](https://github.com/apache/arrow/issues/33935))
+* A driver compatible with `database/sql` for FlightSQL has been added ([GH-34332](https://github.com/apache/arrow/issues/34332))
+
+#### Compute
+
+* "run_end_encode" and "run_end_decode" functions added to compute functions ([GH-20408](https://github.com/apache/arrow/issues/20408))
+* "unique" function added ([GH-34171](https://github.com/apache/arrow/issues/34171))
+
+### Parquet
+
+* `pqarrow` pkg now handles DICTIONARY fields natively ([GH-33466](https://github.com/apache/arrow/issues/33466))
+* Fixed rare panic when writing list of 8 structs ([GH-33600](https://github.com/apache/arrow/issues/33600))
+* Added support for `pqarrow` pkg to write LargeString and LargeBinary types ([GH-33875](https://github.com/apache/arrow/issues/33875))
+* Fixed bug where `pqarrow.NewSchemaManifest` created the wrong field type for Array Object fields ([GH-34101](https://github.com/apache/arrow/issues/34101))
+* Added support to Parquet Writer for Extension type Arrays ([GH-34330](https://github.com/apache/arrow/issues/34330))
+
+## Java notes
+
+* Update Java JNI modules to consider Arrow ACERO [GH-34862](https://github.com/apache/arrow/issues/34862)
+* Ability to register additional GRPC services with FlightServer [GH-34778](https://github.com/apache/arrow/issues/34778)
+* Allow sending custom headers/properties through Arrow Flight SQL JDBC [GH-33874](https://github.com/apache/arrow/issues/33874)
+
+## JavaScript notes
+
+No changes.
+
+## Python notes
+
+Compatibility notes:
+
+* Plasma has been removed in this release ([GH-33243](https://github.com/apache/arrow/issues/33243)). In addition, the deprecated serialization module in PyArrow has been removed ([GH-29705](https://github.com/apache/arrow/issues/29705)). The IPC (Inter-Process Communication) functionality of pyarrow or the standard library pickle should be used instead.
+* The deprecated `use_async` keyword has been removed from the dataset module ([GH-30774](https://github.com/apache/arrow/issues/30774))
+* Minimum Cython version to build PyArrow from source has been raised to 0.29.31 ([GH-34933](https://github.com/apache/arrow/issues/34933)). In addition, PyArrow can now be compiled using Cython 3 ([GH-34564](https://github.com/apache/arrow/issues/34564)).
+
+New features:
+
+* A new `pyarrow.acero` module with initial bindings for the Acero execution engine has been added ([GH-33976](https://github.com/apache/arrow/issues/33976))
+* A new canonical extension type for fixed shaped tensor data has been defined. This is exposed in PyArrow as the FixedShapeTensorType ([GH-34882](https://github.com/apache/arrow/issues/34882), [GH-34956](https://github.com/apache/arrow/issues/34956))
+* Run-End Encoded arrays binding has been implemented ([GH-34686](https://github.com/apache/arrow/issues/34686), [GH-34568](https://github.com/apache/arrow/issues/34568))
+* Method `is_nan` has been added to Array, ChunkedArray and Expression ([GH-34154](https://github.com/apache/arrow/issues/34154))
+* Dataframe interchange protocol has been implemented for RecordBatch ([GH-33926](https://github.com/apache/arrow/issues/33926))
+
+Other improvements:
+
+* Extension arrays can now be concatenated ([GH-31868](https://github.com/apache/arrow/issues/31868))
+* `get_partition_keys` helper function is implemented in the `dataset` module to access the partitioning field's key/value from the partition expression of a certain dataset fragment ([GH-33825](https://github.com/apache/arrow/issues/33825))
+* PyArrow Array objects can now be accepted by the `pa.array()` constructor ([GH-34411](https://github.com/apache/arrow/issues/34411))
+* The default row group size when writing Parquet files has been changed from 64Mi rows to 1Mi rows ([GH-34280](https://github.com/apache/arrow/issues/34280))
+* RecordBatch has the `select()` method implemented ([GH-34359](https://github.com/apache/arrow/issues/34359))
+* New method `drop_column` on the `pyarrow.Table` supports passing a single column as a string ([GH-33377](https://github.com/apache/arrow/issues/33377))
+* User-defined tabular functions, which are user functions implemented in Python that return a stateful stream of tabular data, are now also supported ([GH-32916](https://github.com/apache/arrow/issues/32916))
+* The [Arrow Archery tool](https://arrow.apache.org/docs/dev/developers/continuous_integration/archery.html) now includes linting of the Cython files ([GH-31905](https://github.com/apache/arrow/issues/31905))
+* _Breaking Change:_ Reordered the output fields of the "group_by" node so that keys/segment keys come before aggregates ([GH-33616](https://github.com/apache/arrow/issues/33616))
+
+Relevant bug fixes:
+
+* Acero can now detect and raise an error in case a join operation needs too many bytes of key data ([GH-34474](https://github.com/apache/arrow/issues/34474))
+* Fix for converting non-sequence object in `pa.array()` ([GH-34944](https://github.com/apache/arrow/issues/34944))
+* Fix erroneous table conversion to pandas if table includes an extension array that does not implement `to_pandas_dtype` ([GH-34906](https://github.com/apache/arrow/issues/34906))
+* Reading from a closed `ArrayStreamBatchReader` now returns an invalid status instead of segfaulting ([GH-34165](https://github.com/apache/arrow/issues/34165))
+* `array()` now returns `pyarrow.Array` and not `pyarrow.ChunkedArray` for columns with an `__arrow_array__` method and only one chunk, so that the conversion of a pandas DataFrame with a categorical column of dtype `string[pyarrow]` does not fail ([GH-33727](https://github.com/apache/arrow/issues/33727))
+* Custom type mapper in `to_pandas` now converts index dtypes together with column dtypes ([GH-34283](https://github.com/apache/arrow/issues/34283))
+
+## R notes
+
+* The `read_parquet()` and `read_feather()` functions can now accept URL
+  arguments.
+* The `json_credentials` argument in `GcsFileSystem$create()` now accepts
+  a file path containing the appropriate authentication token.
+* The `$options` member of `GcsFileSystem` objects can now be inspected.
+* The `read_csv_arrow()` and `read_json_arrow()` functions now accept literal text input wrapped in
+  `I()` to improve compatibility with `readr::read_csv()`.
+* Nested fields can now be accessed using `$` and `[[` in dplyr expressions.
+
+For more on what’s in the 12.0.0 R package, see the [R changelog][4].
+
+## Ruby and C GLib notes
+
+* Flight SQL: Added support for authentication. [GH-34074][GH-34074]
+* Compute: Added `GArrowRankOptions`. [GH-34425][GH-34425]
+* Compute: Added `GArrowFilterOptions`. [GH-34650][GH-34650]
+* Compute: Added `GArrowIndexOptions`. [GH-15286][GH-15286]
+* Compute: Added `GArrowMatchSubstringOptions`. [GH-15285][GH-15285]
+* Added `GArrowDenseUnionArrayBuilder`. [GH-21429][GH-21429]
+* Added `GArrowSparseUnionArrayBuilder`. [GH-21430][GH-21430]
+
+### Ruby notes
+
+* Improved `Arrow::Table#join` API. [GH-15287][GH-15287]
+* Flight SQL: Added `ArrowFlight::RecordBatchReader#each`. [GH-15287][GH-15287]
+* Added `Arrow::DenseUnionArray#get_value`. [GH-34837][GH-34837]
+* Added `Arrow::SparseUnionArray#get_value`. [GH-34837][GH-34837]
+* Expression: Added support for
+  `table.slice {|slicer| slicer.column.match_substring(string)}` and
+  related shortcuts. [GH-34819][GH-34819] [GH-34951][GH-34951]
+* _Breaking change:_ `Arrow::Table#slice` with a filter removes null
+  records. [GH-34953][GH-34953]
+
+## Rust notes
+
+The Rust projects have moved to separate repositories outside the
+main Arrow monorepo. For notes on the latest release of the Rust
+implementation, see the latest [Arrow Rust changelog][5].
+
+[1]: https://github.com/apache/arrow/milestone/51?closed=1
+[2]: {{ site.baseurl }}/release/12.0.0.html#contributors
+[3]: {{ site.baseurl }}/release/12.0.0.html#changelog
+[4]: {{ site.baseurl }}/docs/r/news/
+[5]: https://github.com/apache/arrow-rs/tags
+
+[GH-15285]: https://github.com/apache/arrow/issues/15285
+[GH-15286]: https://github.com/apache/arrow/issues/15286
+[GH-15287]: https://github.com/apache/arrow/issues/15287
+[GH-21429]: https://github.com/apache/arrow/issues/21429
+[GH-21430]: https://github.com/apache/arrow/issues/21430
+[GH-33243]: https://github.com/apache/arrow/issues/33243
+[GH-34819]: https://github.com/apache/arrow/issues/34819
+[GH-34837]: https://github.com/apache/arrow/issues/34837
+[GH-34074]: https://github.com/apache/arrow/issues/34074
+[GH-34425]: https://github.com/apache/arrow/issues/34425
+[GH-34650]: https://github.com/apache/arrow/issues/34650
+[GH-34951]: https://github.com/apache/arrow/issues/34951
+[GH-34953]: https://github.com/apache/arrow/issues/34953
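
The Columnar Format Notes in the post above say that `arrow.fixed_shape_tensor` uses a Fixed-Size List layout as its storage array. As a rough illustration of what that means, here is a minimal pure-Python sketch. It does not use the Arrow API: the helper name `tensor_at` is hypothetical, and row-major element order is an assumption made for this example.

```python
from math import prod


def tensor_at(storage, shape, index):
    """Return slot `index` of a flat storage list as nested lists.

    Hypothetical helper, not part of any Arrow library. Every slot is
    assumed to hold prod(shape) elements in row-major order.
    """
    size = prod(shape)
    # Each slot occupies one fixed-size slice of the flat storage.
    flat = storage[index * size:(index + 1) * size]

    def build(offset, dims):
        if not dims:
            return flat[offset]
        # Number of elements covered by one step along the first dimension.
        stride = prod(dims[1:])
        return [build(offset + i * stride, dims[1:]) for i in range(dims[0])]

    return build(0, list(shape))


# Two 2x2 tensors stored back to back in one flat list.
storage = [1, 2, 3, 4, 5, 6, 7, 8]
second = tensor_at(storage, (2, 2), 1)
```

Because every slot has the same length, locating a tensor only needs a multiplication by the slot size, with no offsets buffer.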

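Run-End Encoded arrays appear in the C++, Go and Python notes of the post above. The idea behind the encoding can be sketched in plain Python as follows. This is an illustration only, not the Arrow implementation: the function names mirror the "run_end_encode" and "run_end_decode" compute functions mentioned in the Go notes, but the signatures here are assumptions for this example.

```python
from itertools import groupby


def run_end_encode(values):
    """Return (run_ends, run_values) for a sequence of values.

    Illustrative sketch only. Each entry of run_ends is the exclusive
    logical end index of the corresponding run.
    """
    run_ends, run_values = [], []
    position = 0
    for value, group in groupby(values):
        position += sum(1 for _ in group)  # length of this run
        run_ends.append(position)
        run_values.append(value)
    return run_ends, run_values


def run_end_decode(run_ends, run_values):
    """Expand (run_ends, run_values) back into the original sequence."""
    decoded, start = [], 0
    for end, value in zip(run_ends, run_values):
        decoded.extend([value] * (end - start))
        start = end
    return decoded


ends, vals = run_end_encode(["a", "a", "a", "b", "b", "c"])
restored = run_end_decode(ends, vals)
```

Storing run ends rather than run lengths means the run covering a given logical index can be found with a binary search over `run_ends`.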