This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-site.git
The following commit(s) were added to refs/heads/main by this push:
new b4cd319cd81 Website: Add blog post for arrow-rs 57.0.0 (#720)
b4cd319cd81 is described below
commit b4cd319cd81b3355244f30fe57239d7f4a110cf0
Author: Andrew Lamb <[email protected]>
AuthorDate: Thu Oct 30 12:58:35 2025 -0400
Website: Add blog post for arrow-rs 57.0.0 (#720)
- Closes https://github.com/apache/arrow-rs/issues/8463
Preview URL:
https://alamb.github.io/arrow-site/blog/2025/09/04/arrow-rs-57.0.0/
This release has a crazy amount of content so we should tell the world
about it. Here are two related blogs:
- https://github.com/apache/arrow-site/pull/712
- https://github.com/apache/arrow-site/pull/711
---------
Co-authored-by: Copilot <[email protected]>
---
_posts/2025-10-30-arrow-rs-57.0.0.md | 249 +++++++++++++++++++++++++++++++++++
1 file changed, 249 insertions(+)
diff --git a/_posts/2025-10-30-arrow-rs-57.0.0.md b/_posts/2025-10-30-arrow-rs-57.0.0.md
new file mode 100644
index 00000000000..5e9405dc004
--- /dev/null
+++ b/_posts/2025-10-30-arrow-rs-57.0.0.md
@@ -0,0 +1,249 @@
+---
+layout: post
+title: "Apache Arrow Rust 57.0.0 Release"
+date: "2025-10-30 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce that the v57.0.0 release of Apache Arrow
+Rust is now available on crates.io ([arrow] and [parquet]) and as [source download].
+
+[arrow]: https://crates.io/crates/arrow
+[parquet]: https://crates.io/crates/parquet
+[source download]: https://dist.apache.org/repos/dist/release/arrow/arrow-rs-57.0.0
+
+See the [57.0.0 changelog] for a full list of changes.
+
+[57.0.0 changelog]: https://github.com/apache/arrow-rs/blob/57.0.0/CHANGELOG.md
+
+
+## New Features
+
+Note: Arrow Rust hosts the development of the [parquet] crate, a high
+performance Rust implementation of [Apache Parquet].
+
+### Performance: 4x Faster Parquet Metadata Parsing 🚀
+
+Ed Seidl ([@etseidl]) and Jörn Horstmann ([@jhorstmann]) contributed a rewritten
+thrift metadata parser for Parquet files which is almost 4x faster than the
+previous parser based on the `thrift` crate. This is especially exciting for low
+latency use cases and reading Parquet files with large amounts of metadata (e.g.
+many row groups or columns).
+See the [blog post about the new Parquet metadata parser] for more details.
+
+<div style="display: flex; gap: 16px; justify-content: center; align-items: flex-start;">
+ <img src="{{ site.baseurl }}/img/rust-parquet-metadata/results.png" width="100%" class="img-responsive" alt="" aria-hidden="true">
+</div>
+
+*Figure 1:* Performance improvements of [Apache Parquet] metadata parsing between version `56.2.0` and `57.0.0`.
+
+
+[Apache Parquet]: https://parquet.apache.org/
+[@etseidl]: https://github.com/etseidl
+[@jhorstmann]: https://github.com/jhorstmann
+
+[blog post about the new Parquet metadata parser]: https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/
+
+### New `arrow-avro` Crate
+
+The `57.0.0` release introduces a new [`arrow-avro`] crate contributed by [@jecsand838]
+and [@nathaniel-d-ef] that provides much more efficient conversion between
+[Apache Avro](https://avro.apache.org/) and Arrow `RecordBatch`es, as well as broader feature support.
+
+Previously, Arrow‑based systems that read or wrote Avro data
+typically used the general‑purpose [apache-avro] crate. While mature and
+feature‑complete, its row-oriented API does not support features such as
+projection pushdown or vectorized execution. The new `arrow-avro` crate supports
+these features efficiently by converting Avro data directly into Arrow's
+columnar format.
+
+See the [blog post about adding arrow-avro] for more details.
+
+<div style="display: flex; gap: 16px; justify-content: center; align-items: flex-start; padding: 20px 15px;">
+<img src="{{ site.baseurl }}/img/introducing-arrow-avro/arrow-avro-architecture.svg"
+ width="100%"
+ alt="High-level `arrow-avro` architecture"
+ style="background:#fff">
+</div>
+
+*Figure 2:* Architecture of the `arrow-avro` crate.
+
+
+[@jecsand838]: https://github.com/jecsand838
+[@nathaniel-d-ef]: https://github.com/nathaniel-d-ef
+[apache-avro]: https://crates.io/crates/apache-avro
+[`arrow-avro`]: https://crates.io/crates/arrow-avro
+
+[blog post about adding arrow-avro]: https://arrow.apache.org/blog/2025/10/23/introducing-arrow-avro/
+
+
+### Parquet Variant Support 🧬
+
+The Apache Parquet project recently added a [new `Variant` type] for
+representing semi-structured data. The `57.0.0` release includes support for reading and
+writing both normal and shredded `Variant` values to and from Parquet files. It
+also includes [parquet-variant], a complete library for working with `Variant`
+values, [`VariantArray`] for working with arrays of `Variant` values in Apache
+Arrow, computation kernels for converting to/from JSON and Arrow types,
+extracting paths, and shredding values.
+
+[new `Variant` type]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
+[`VariantArray`]: https://docs.rs/parquet/latest/parquet/variant/struct.VariantArray.html
+[parquet-variant]: https://crates.io/crates/parquet-variant
+
+```rust
+// Imports assumed for this snippet (not part of the original example;
+// module paths may differ between versions)
+use std::sync::Arc;
+use arrow::array::{ArrayRef, RecordBatch};
+use arrow::datatypes::Schema;
+use parquet::arrow::ArrowWriter;
+use parquet::variant::{VariantArrayBuilder, VariantBuilderExt};
+
+// Use the VariantArrayBuilder to build a VariantArray
+let mut builder = VariantArrayBuilder::new(3);
+builder.new_object().with_field("name", "Alice").finish(); // row 1: {"name": "Alice"}
+builder.append_value("such wow"); // row 2: "such wow" (a string)
+let array = builder.build();
+
+// Since VariantArray is an ExtensionType, it needs to be converted
+// to an ArrayRef and Field with the appropriate metadata
+// before it can be written to a Parquet file
+let field = array.field("data");
+let array = ArrayRef::from(array);
+// create a RecordBatch with the VariantArray
+let schema = Schema::new(vec![field]);
+let batch = RecordBatch::try_new(Arc::new(schema), vec![array])?;
+
+// Now you can write the RecordBatch to the Parquet file, as normal
+let file = std::fs::File::create("variant.parquet")?;
+let mut writer = ArrowWriter::try_new(file, batch.schema(), None)?;
+writer.write(&batch)?;
+writer.close()?;
+```
+
+
+This support is being integrated into downstream projects, such as DataFusion
+(via [@friendlymatthew]'s [`datafusion-variant`] crate) and [delta-rs]. While
+this support is still experimental, we believe the APIs are mostly complete and
+do not expect major changes. Please consider trying it out and providing
+feedback and improvements.
+
+[`datafusion-variant`]: https://github.com/datafusion-contrib/datafusion-variant
+[delta-rs]: https://github.com/delta-io/delta-rs/issues/3637
+
+Thanks to the many contributors who made this possible, including:
+* Ryan Johnson ([@scovich]), Congxian Qiu ([@klion26]), and Liam Bao ([@liamzwbao]) for completing the implementation
+* Li Jiaying ([@PinkCrow007]), Aditya Bhatnagar ([@carpecodeum]), and Malthe Karbo ([@mkarbo]) for initiating the work
+* Everyone else who has contributed, including [@superserious-dev], [@friendlymatthew], [@micoo227], [@Weijun-H],
+ [@harshmotw-db], [@odysa], [@viirya], [@adriangb], [@kosiew], [@codephage2020],
+ [@ding-young], [@mbrobbel], [@petern48], [@sdf-jkl], [@abacef], and [@mprammer].
+
+[@PinkCrow007]: https://github.com/PinkCrow007
+[@mkarbo]: https://github.com/mkarbo
+[@carpecodeum]: https://github.com/carpecodeum
+[@scovich]: https://github.com/scovich
+[@superserious-dev]: https://github.com/superserious-dev
+[@friendlymatthew]: https://github.com/friendlymatthew
+[@micoo227]: https://github.com/micoo227
+[@Weijun-H]: https://github.com/Weijun-H
+[@harshmotw-db]: https://github.com/harshmotw-db
+[@odysa]: https://github.com/odysa
+[@viirya]: https://github.com/viirya
+[@klion26]: https://github.com/klion26
+[@adriangb]: https://github.com/adriangb
+[@kosiew]: https://github.com/kosiew
+[@liamzwbao]: https://github.com/liamzwbao
+[@codephage2020]: https://github.com/codephage2020
+[@ding-young]: https://github.com/ding-young
+[@mbrobbel]: https://github.com/mbrobbel
+[@petern48]: https://github.com/petern48
+[@sdf-jkl]: https://github.com/sdf-jkl
+[@abacef]: https://github.com/abacef
+[@mprammer]: https://github.com/mprammer
+
+See the ticket [Variant type support in Parquet #6736] for more details.
+
+
+[Variant type support in Parquet #6736]: https://github.com/apache/arrow-rs/issues/6736
+
+
+### Parquet Geometry Support 🗺️
+
+
+The `57.0.0` release also includes support for reading and writing
+[Parquet Geometry types], `GEOMETRY` and `GEOGRAPHY`, including `GeospatialStatistics`
+contributed by Kyle Barron ([@kylebarron]), Dewey Dunnington ([@paleolimbot]),
+Kaushik Srinivasan ([@kaushiksrini]), and Blake Orth ([@BlakeOrth]).
+
+Please see the [Implement Geometry and Geography type support in Parquet] tracking ticket for more details.
+
+[@kylebarron]: https://github.com/kylebarron
+[@paleolimbot]: https://github.com/paleolimbot
+[@kaushiksrini]: https://github.com/kaushiksrini
+[@BlakeOrth]: https://github.com/BlakeOrth
+
+[Parquet Geometry types]: https://github.com/apache/parquet-format/blob/master/Geospatial.md
+
+
+[Implement Geometry and Geography type support in Parquet]: https://github.com/apache/arrow-rs/issues/8373
+
+## Thanks to Our Contributors
+```console
+$ git shortlog -sn 56.0.0..57.0.0
+ 36 Matthijs Brobbel
+ 20 Andrew Lamb
+ 13 Ryan Johnson
+ 11 Ed Seidl
+ 10 Connor Sanders
+ 8 Alex Huang
+ 5 Emil Ernerfeldt
+ 5 Liam Bao
+ 5 Matthew Kim
+ 4 nathaniel-d-ef
+ 3 Raz Luvaton
+ 3 albertlockett
+ 3 dependabot[bot]
+ 3 mwish
+ 2 Ben Ye
+ 2 Congxian Qiu
+ 2 Dewey Dunnington
+ 2 Kyle Barron
+ 2 Lilian Maurel
+ 2 Mark Nash
+ 2 Nuno Faria
+ 2 Pepijn Van Eeckhoudt
+ 2 Tobias Schwarzinger
+ 2 lichuang
+ 1 Adam Gutglick
+ 1 Adam Reeve
+ 1 Alex Stephen
+ 1 Chen Chongchen
+ 1 Jack
+ 1 Jeffrey Vo
+ 1 Jörn Horstmann
+ 1 Kaushik Srinivasan
+ 1 Li Jiaying
+ 1 Lin Yihai
+ 1 Marco Neumann
+ 1 Piotr Findeisen
+ 1 Piotr Srebrny
+ 1 Samuele Resca
+ 1 Van De Bio
+ 1 Yan Tingwang
+ 1 ding-young
+ 1 kosiew
+ 1 张林伟
+```