This is an automated email from the ASF dual-hosted git repository.
raulcd pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-site.git
The following commit(s) were added to refs/heads/master by this push:
new c7c529cc358 [Website] Version 11.0.0 blog post (#300)
c7c529cc358 is described below
commit c7c529cc35862f939152e969356f0cbfd9a6dfe6
Author: Raúl Cumplido <[email protected]>
AuthorDate: Thu Feb 2 12:19:32 2023 +0100
[Website] Version 11.0.0 blog post (#300)
PR to start adding the blog post information for the Release 11.0.0
With the movement from JIRA to GitHub issues here a couple of links to
help finding the issues for 11.0.0.
11.0.0 RC 0 changelog so far can be found on this commit:
https://github.com/apache/arrow/commit/6d0f608d1491ef00cd8adfdb030fffb6f7605104
The 11.0.0 milestone with all the GitHub closed issues can be found
here:
https://github.com/apache/arrow/milestone/1?closed=1
---------
Co-authored-by: David Li <[email protected]>
Co-authored-by: Matt Topol <[email protected]>
Co-authored-by: Dominik Moritz <[email protected]>
Co-authored-by: Alenka Frim <[email protected]>
Co-authored-by: Nic Crane <[email protected]>
Co-authored-by: Larry White <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Kenta Murata <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Co-authored-by: Weston Pace <[email protected]>
---
_posts/2023-01-25-11.0.0-release.md | 212 ++++++++++++++++++++++++++++++++++++
1 file changed, 212 insertions(+)
diff --git a/_posts/2023-01-25-11.0.0-release.md
b/_posts/2023-01-25-11.0.0-release.md
new file mode 100644
index 00000000000..770b525e44f
--- /dev/null
+++ b/_posts/2023-01-25-11.0.0-release.md
@@ -0,0 +1,212 @@
+---
+layout: post
+title: "Apache Arrow 11.0.0 Release"
+date: "2023-01-25 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+
+The Apache Arrow team is pleased to announce the 11.0.0 release. This covers
+over 3 months of development work and includes [**423 resolved issues**][1]
+from [**95 distinct contributors**][2]. See the [Install
Page](https://arrow.apache.org/install/)
+to learn how to get the libraries for your platform.
+
+The release notes below are not exhaustive and only expose selected highlights
+of the release. Many other bugfixes and improvements have been made: we refer
+you to the [complete changelog][3].
+
+## Community
+
+Since the 10.0.0 release, Ben Baumgold, Will Jones, Eric Patrick Hanson,
+Curtis Vogt, Yang Jiang, Jarrett Revels, Raúl Cumplido, Jacob Wujciak,
+Jie Wen and Brent Gardner have been invited to be committers.
+Kun Liu have joined the Project Management Committee (PMC).
+
+As per our newly started tradition of rotating the PMC chair once a year
+Andrew Lamb was elected as the new PMC chair and VP.
+
+Thanks for your contributions and participation in the project!
+
+## Columnar Format Notes
+
+## Arrow Flight RPC notes
+
+In the C++/Python Flight clients, DoAction now properly streams the results,
instead of blocking until the call finishes. Applications that did not consume
the iterator before should fully consume the result.
([#15069](https://github.com/apache/arrow/issues/15069))
+
+## C++ notes
+ * It is now possible to specify alignment when making allocations with a
MemoryPool [GH-33056](https://github.com/apache/arrow/issues/33056)
+ * It is now possible to run an ExecPlan without using any CPU threads
+ * Added kernel for slicing list values
[GH-33168](https://github.com/apache/arrow/issues/33168)
+ * Added kernel for slicing binary arrays
[GH-20357](https://github.com/apache/arrow/issues/20357)
+ * When comparing list arrays for equality the list field name is now ignored
[GH-30519](https://github.com/apache/arrow/issues/30519)
+ * Add support for partitioning on columns that contain special characters
[GH-33448](https://github.com/apache/arrow/issues/33448)
+ * Added a streaming reader for JSON
[GH-33140](https://github.com/apache/arrow/issues/33140)
+ * Added support for incremental writes to the ORC writer
[GH-33047](https://github.com/apache/arrow/issues/33047)
+ * Added support for casting decimal to string and writing decimal to CSV
[GH-33002](https://github.com/apache/arrow/issues/33002)
+ * Fixed an assert in the scanner that would occur when batch_readahead was
set to 0 [GH-15264](https://github.com/apache/arrow/issues/15264)
+ * Fixed bug where arrays with a null data buffer would not be accepted when
imported via the C data API
[GH-14875](https://github.com/apache/arrow/issues/14875)
+ * Fixed bug where arrays with a zero-case union data type would not be
accepted when imported via the C data API
[GH-14855](https://github.com/apache/arrow/issues/14855)
+ * Fixed bug where case_when could return incorrect values
[GH-33382](https://github.com/apache/arrow/issues/33382)
+ * Fixed bug where RecordBatch::Equals was ignoring field names
[GH-33285](https://github.com/apache/arrow/issues/33285)
+## C# notes
+
+No major changes to C#.
+
+## Go notes
+* Go's benchmarks will now get added to [Conbench](https://conbench.ursa.dev)
alongside the benchmarks for other implementations
(GH-32983)[https://github.com/apache/arrow/issues/32983]
+* Exposed FlightService_ServiceDesc and RegisterFlightServiceServer to allow
easily incorporating a flight service into an existing gRPC server
(GH-15174)[https://github.com/apache/arrow/issues/15174]
+
+### Arrow
+* Function `ApproxEquals` was implemented for scalar values
(GH-29581)[https://github.com/apache/arrow/issues/29581]
+* `UnmarshalJSON` for the `RecordBuilder` now properly handles extra unknown
fields with complex/nested values
(GH-31840)[https://github.com/apache/arrow/issues/31840]
+* Decimal128 and Decimal256 type support has been added to the CSV reader
(GH-33111)[https://github.com/apache/arrow/issues/33111]
+* Fixed bug in `array.UnionBuilder` where `Len` method always returned 0
(GH-14775)[https://github.com/apache/arrow/issues/14775]
+* Fixed bug for handling slices of Map arrays when marshalling to JSON and for
IPC (GH-14780)[https://github.com/apache/arrow/issues/14780]
+* Fixed memory leak when compressing IPC message body buffers
(GH-14883)[https://github.com/apache/arrow/issues/14883]
+* Added the ability to easily append scalar values to array builders
(GH-15005)[https://github.com/apache/arrow/issues/15005]
+
+#### Compute
+* Scalar binary (add/subtract/multiply/divide/etc.) and unary arithmetic
(abs/neg/sqrt/sign/etc.) has been implemented for the compute package
(GH-33086)[https://github.com/apache/arrow/issues/33086] this includes easy
functions like `compute.Add` and `compute.Divide` etc.
+* Scalar boolean functions like AND/OR/XOR/etc. have been implemented for
compute (GH-33279)[https://github.com/apache/arrow/issues/33279]
+* Scalar comparison function kernels have been implemented for compute
(equal/greater/greater_equal/less/less_equal)
(GH-33308)[https://github.com/apache/arrow/issues/33308]
+* Scalar compute functions are compatible with dictionary encoded arrays by
casting them to their value types
(GH-33502)[https://github.com/apache/arrow/issues/33502]
+
+### Parquet
+* Panic when decoding a delta_bit_packed encoded column has been fixed
(GH-33483)[https://github.com/apache/arrow/issues/33483]
+* Fixed memory leak from Allocator in `pqarrow.WriteArrowToColumn`
(GH-14865)[https://github.com/apache/arrow/issues/14865]
+* Fixed `writer.WriteBatch` to properly handle writing encrypted parquet
columns and no longer silently fail, but instead propagate an error
(GH-14940)[https://github.com/apache/arrow/issues/14940]
+
+## Java notes
+* Implement support for writing compressed files
([#15223](https://github.com/apache/arrow/pull/15223))
+* Improve performance by short-circuiting null checks when comparing non null
field types ([#15106](https://github.com/apache/arrow/pull/15106))
+* Several enhancements to dictionary encoding
([#14891](https://github.com/apache/arrow/pull/14891),
([#14902](https://github.com/apache/arrow/pull/14902),
([#14874](https://github.com/apache/arrow/pull/14874))
+* Extend Table to support additional vector types
([#14573](https://github.com/apache/arrow/pull/14573))
+* Enhance and simplify handling of allocation management by integrating C Data
into allocator hierarchy ([#14506](https://github.com/apache/arrow/pull/14506))
+* Make ComplexCopier agnostic of specific implementation of MapWriter
([#14557](https://github.com/apache/arrow/pull/14557))
+* Distribute Apple M1 compatible JNI libraries via mavencentral
([#14472](https://github.com/apache/arrow/pull/14472))
+* Extend Table copy functionality, and support returning copies of individual
vectors ([#14389](https://github.com/apache/arrow/pull/14389))
+
+## JavaScript notes
+
+* Bugfixes and dependency updates.
+* Arrow now requires BigInt support.
[GH-33681](https://github.com/apache/arrow/pull/33682)
+
+## Python notes
+
+Compatibility notes:
+
+* PyArrow now requires pandas >= 1.0
([ARROW-18173](https://issues.apache.org/jira/browse/ARROW-18173))
+* The `pyarrow.parquet.ParquetDataset()` class now by default uses the new
Dataset API
+ under the hood (`use_legacy_dataset=False`). You can still pass
+ `use_legacy_dataset=True` to get the legacy implementation, but this option
will be
+ removed in a next release
+ ([ARROW-16728](https://issues.apache.org/jira/browse/ARROW-16728)).
+
+New features:
+
+* Added support for the [DataFrame Interchange
Protocol](https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html)
+ for ``pyarrow.Table``
([GH-33346](https://github.com/apache/arrow/issues/33346)).
+* New kernels: `list_slice()` to slice each list element of a ListArray
+ returning a new ListArray
([ARROW-17960](https://issues.apache.org/jira/browse/ARROW-17960)).
+* A new `filter()` method on the Dataset class as additional API to filter a
Dataset
+ before consuming it
([ARROW-16616](https://issues.apache.org/jira/browse/ARROW-16616)).
+* New `sort()` method for (Chunked)Array and `sort_by()` method for
RecordBatch,
+ providing a convenience on top of the `sort_indices` kernel
+ ([GH-14778](https://github.com/apache/arrow/issues/14778)), and a new
+ `Dataset.sort_by()` method
([GH-14975](https://github.com/apache/arrow/issues/14975)).
+
+Other improvements:
+
+* Support for custom metadata of record batches in the IPC read and write APIs
+ ([ARROW-16430](https://issues.apache.org/jira/browse/ARROW-16430)).
+* Support URIs and the `filesystem` parameter in `pyarrow.parquet.ParquetFile`
+ ([ARROW-18272](https://issues.apache.org/jira/browse/ARROW-18272)) and
+ `pyarrow.parquet.write_metadata`
+ ([ARROW-18225](https://issues.apache.org/jira/browse/ARROW-18225)).
+* When writing a dataset to IPC using `pyarrow.dataset.write_dataset()`, you
can now
+ specify IPC specific options, such as compression
+ ([ARROW-17991](https://issues.apache.org/jira/browse/ARROW-17991))
+* The `pyarrow.array()` function now allows to construct a MapArray from a
sequence of
+ dicts (in addition to a sequence of tuples)
+ ([ARROW-17832](https://issues.apache.org/jira/browse/ARROW-17832)).
+* The `struct_field()` kernel now also accepts field names in addition to
integer
+ indices ([ARROW-17989](https://issues.apache.org/jira/browse/ARROW-17989)).
+* Casting to string is now supported for duration
([ARROW-15822](https://issues.apache.org/jira/browse/ARROW-15822))
+ and decimal
([ARROW-17458](https://issues.apache.org/jira/browse/ARROW-17458)) types,
+ which also means those can now be written to CSV.
+* When writing to CSV, you can now specify the quoting style
+ ([GH-14755](https://github.com/apache/arrow/issues/14755)).
+* The `pyarrow.ipc.read_schema()` function now accepts a Message object
+ ([ARROW-18423](https://issues.apache.org/jira/browse/ARROW-18423)).
+* The Time32Scalar, Time64Scalar, Date32Scalar and Date64Scalar classes got a
`.value`
+ attribute to access the underlying integer value, similar to the other
date-time
+ related scalars
([ARROW-18264](https://issues.apache.org/jira/browse/ARROW-18264))
+* Duration type is now supported in the hash kernels like `dictionary_encode`
+ ([GH-15226](https://github.com/apache/arrow/issues/15226)).
+* Fix silent overflow when converting `datetime.timedelta` to duration type
+ ([ARROW-15026](https://issues.apache.org/jira/browse/ARROW-15026)).
+
+Relevant bug fixes:
+
+* Numpy conversion for ListArray is improved taking into account sliced
offset, avoiding
+ increased memory usage
([GH-20512](https://github.com/apache/arrow/issues/20512)
+* Fix writing files with multi-byte characters in file name
+ ([ARROW-18123](https://issues.apache.org/jira/browse/ARROW-18123)).
+
+## R notes
+* map_batches() is lazy by default; it now returns a RecordBatchReader instead
of a list of RecordBatch objects unless lazy = FALSE.
[GH-14521](https://github.com/apache/arrow/issues/14521)
+* A substantial reorganisation, rewrite of and addition to, many of the
vignettes and README. [GH-14514](https://github.com/apache/arrow/issues/14514)
+
+For more on what’s in the 11.0.0 R package, see the [R changelog][4].
+
+## Ruby and C GLib notes
+
+### Ruby
+
+* `Arrow::Table#save` now always returns self instead of the result of its
`raw_records`[GH-15289](https://github.com/apache/arrow/issues/15289)
+* Improve the GC-related crash prevention system by guarding the shared
objects from GC [ARROW-18161](https://issues.apache.org/jira/browse/ARROW-18161)
+* Add `Arrow::HalfFloat` and `raw_records` support in `Arrow::HalfFloatArray`
[ARROW-18086](https://issues.apache.org/jira/browse/ARROW-18086)
+* Support omitting join keys in `Table#join`
[GH-15084](https://github.com/apache/arrow/issues/15084)
+* Add support for `Arrow::Table.load(uri, schema:)`
[ARROW-15206](https://issues.apache.org/jira/browse/ARROW-15206)
+* Add `Arrow::ColumnContainable#column_names` (e.g.
`Arrow::Table#column_names`)
[GH-15085](https://github.com/apache/arrow/issues/15085)
+* Add `to_arrow_chunked_array` methods to support converting to
`Arrow::ChunkedArray`
[ARROW-18405](https://issues.apache.org/jira/browse/ARROW-18405)
+
+### C GLib
+
+* Add `garrow_chunked_array_new_empty()`
[GH-33671](https://github.com/apache/arrow/issues/33671)
+* Add `GArrowProjectNodeOptions`
[GH-33670](https://github.com/apache/arrow/issues/33670)
+* Add `GADatasetHivePartitioning`
[GH-15257](https://github.com/apache/arrow/issues/15257)
+* The signature of `garrow_execute_plain_wait()` was changed to take the
`error` argument and to return the finished status
[GH-15254](https://github.com/apache/arrow/issues/15254)
+* Add support for half float
[GH-15168](https://github.com/apache/arrow/issues/15168)
+* Add `GADatasetFinishOptions`
[GH-15146](https://github.com/apache/arrow/issues/15146)
+
+## Rust notes
+
+The Rust projects have moved to separate repositories outside the
+main Arrow monorepo. For notes on the latest release of the Rust
+implementation, see the latest [Arrow Rust changelog][5].
+
+[1]: https://github.com/apache/arrow/milestone/1?closed=1
+[2]: {{ site.baseurl }}/release/11.0.0.html#contributors
+[3]: {{ site.baseurl }}/release/11.0.0.html#changelog
+[4]: {{ site.baseurl }}/docs/r/news/
+[5]: https://github.com/apache/arrow-rs/tags
\ No newline at end of file