This is an automated email from the ASF dual-hosted git repository.

paleolimbot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 6a9752df967 Add nanoarrow 0.6.0 release post (#545)
6a9752df967 is described below

commit 6a9752df96766b6c28795b41881a4bcf801686c5
Author: Dewey Dunnington <[email protected]>
AuthorDate: Wed Oct 23 02:18:35 2024 +0000

    Add nanoarrow 0.6.0 release post (#545)
    
    This post adds an update for the nanoarrow 0.6.0 release!
    
    ---------
    
    Co-authored-by: Sutou Kouhei <[email protected]>
    Co-authored-by: David Li <[email protected]>
---
 _posts/2024-10-07-nanoarrow-0.6.0-release.md | 289 +++++++++++++++++++++++++++
 1 file changed, 289 insertions(+)

diff --git a/_posts/2024-10-07-nanoarrow-0.6.0-release.md b/_posts/2024-10-07-nanoarrow-0.6.0-release.md
new file mode 100644
index 00000000000..3496763d5f2
--- /dev/null
+++ b/_posts/2024-10-07-nanoarrow-0.6.0-release.md
@@ -0,0 +1,289 @@
+---
+layout: post
+title: "Apache Arrow nanoarrow 0.6.0 Release"
+date: "2024-10-07 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 0.6.0 release of
+Apache Arrow nanoarrow. This release covers 114 resolved issues from
+10 contributors.
+
+## Release Highlights
+
+- Run End Encoding support
+- StringView support
+- IPC Write support
+- DLPack/device support
+- IPC/Device available from CMake/Meson as feature flags
+
+See the
+[Changelog](https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.6.0/CHANGELOG.md)
+for a detailed list of contributions to this release.
+
+## Breaking Changes
+
+Most changes included in the nanoarrow 0.6.0 release will not break downstream
+code; however, two changes with respect to packaging and distribution may require
+users to update the code used to bring nanoarrow in as a dependency.
+
+First, in nanoarrow 0.5.0 and earlier, the bundled single-file amalgamation was included in
+the `dist/` subdirectory or could be generated using a specially-crafted CMake
+command. The nanoarrow 0.6.0 release removes the pre-generated files from the repository and migrates
+the code used to generate them to Python. This setup is less confusing for contributors
+(whose editors would frequently jump into the wrong `nanoarrow.h`) and simplifies the CMake configuration.
+Users can generate the `dist/` subdirectory as it previously existed with:
+
+``` shell
+python ci/scripts/bundle.py \
+  --source-output-dir=dist \
+  --include-output-dir=dist \
+  --header-namespace= \
+  --with-device \
+  --with-ipc \
+  --with-testing \
+  --with-flatcc
+```
+
+Second, the Arrow IPC and ArrowDeviceArray implementations previously lived in the `extensions/`
+subdirectory of the repository. This was helpful during the initial development of these
+features; however, the nanoarrow 0.6.0 release added the requisite feature coverage and testing
+such that the appropriate home for them is now the main `src/` directory. As such, one
+can now build nanoarrow with IPC and/or device support using:
+
+``` shell
+cmake -S . -B build -DNANOARROW_IPC=ON -DNANOARROW_DEVICE=ON
+```
+
+## Features
+
+### Float16, StringView, and REE Support
+
+The nanoarrow 0.6.0 release adds support for Arrow's float16 (half float), string view,
+and run-end encoded types. The C library supports building float16 arrays using
+`ArrowArrayAppendDouble()`. It supports building string view and binary view arrays
+using `ArrowArrayAppendString()` and/or `ArrowArrayAppendBytes()` and consuming them
+using `ArrowArrayViewGetStringUnsafe()` and/or `ArrowArrayViewGetBytesUnsafe()`. R and
+Python users can request a string view or float16 type when building an array, and
+conversion back to R/Python strings is supported.
+
+``` python
+# pip install nanoarrow
+# conda install nanoarrow -c conda-forge
+import nanoarrow as na
+
+na.Array(["abc", "def", None], na.string_view())
+#> nanoarrow.Array<string_view>[3]
+#> 'abc'
+#> 'def'
+#> None
+na.Array([1, 2, 3], na.float16())
+#> nanoarrow.Array<half_float>[3]
+#> 1.0
+#> 2.0
+#> 3.0
+```
+
+``` r
+# install.packages("nanoarrow")
+library(nanoarrow)
+
+as_nanoarrow_array(c("abc", "def", NA), schema = na_string_view()) |>
+  convert_array()
+#> [1] "abc" "def" NA
+as_nanoarrow_array(c(1, 2, 3), schema = na_half_float()) |>
+  convert_array()
+#> [1] 1 2 3
+```
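
For readers unfamiliar with the string view type, the following is a minimal pure-Python sketch of the 16-byte "view" layout defined by the Arrow columnar format: values of 12 bytes or fewer are stored inline, and longer values store a 4-byte prefix plus a buffer index and offset into a separate data buffer. The helper names (`encode_view`, `decode_view`, `build_views`) are hypothetical and are not part of the nanoarrow API; validity (nulls) is omitted.

```python
import struct

INLINE_MAX = 12  # values up to 12 bytes are stored inside the view itself


def encode_view(data: bytes, buffer_index: int, offset: int) -> bytes:
    """Pack one value into a 16-byte view (illustrative helper)."""
    if len(data) <= INLINE_MAX:
        # int32 length followed by 12 bytes of zero-padded inline data
        return struct.pack("<i12s", len(data), data)
    # int32 length, 4-byte prefix, int32 buffer index, int32 offset
    return struct.pack("<i4sii", len(data), data[:4], buffer_index, offset)


def decode_view(view: bytes, data_buffers) -> bytes:
    """Recover the value bytes from a view and the list of data buffers."""
    (length,) = struct.unpack_from("<i", view, 0)
    if length <= INLINE_MAX:
        return view[4:4 + length]
    _prefix, buffer_index, offset = struct.unpack_from("<4sii", view, 4)
    return data_buffers[buffer_index][offset:offset + length]


def build_views(strings):
    """Build views plus a single data buffer holding only the long values."""
    data = bytearray()
    views = []
    for s in strings:
        raw = s.encode("utf-8")
        views.append(encode_view(raw, 0, len(data)))
        if len(raw) > INLINE_MAX:
            data += raw
    return views, [bytes(data)]


views, buffers = build_views(["abc", "a string longer than twelve bytes"])
decoded = [decode_view(v, buffers).decode("utf-8") for v in views]
```

In this sketch only the second value lands in the data buffer; the first is stored entirely within its view.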
+
+Creating/consuming run-end encoded arrays by element is not yet
+supported in C, R, or Python; however, arrays can be built or consumed by assembling
+the correct array/buffer structure in C.
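
As a rough illustration of the structure that has to be assembled, a run-end encoded array has two children: `run_ends` (the exclusive logical end index of each run) and `values` (one value per run). The sketch below models this with plain Python lists; the function names are hypothetical and it does not use the nanoarrow API.

```python
from bisect import bisect_right


def ree_encode(logical_values):
    """Collapse consecutive equal values into (run_ends, run_values)."""
    run_ends, run_values = [], []
    for i, value in enumerate(logical_values):
        if run_values and run_values[-1] == value:
            run_ends[-1] = i + 1  # extend the current run
        else:
            run_values.append(value)
            run_ends.append(i + 1)  # start a new run ending after index i
    return run_ends, run_values


def ree_get(run_ends, run_values, i):
    """Look up logical element i with a binary search over run_ends."""
    length = run_ends[-1] if run_ends else 0
    if not 0 <= i < length:
        raise IndexError(i)
    return run_values[bisect_right(run_ends, i)]


def ree_decode(run_ends, run_values):
    """Expand the runs back into the full list of logical values."""
    out, start = [], 0
    for end, value in zip(run_ends, run_values):
        out.extend([value] * (end - start))
        start = end
    return out


run_ends, run_values = ree_encode([1, 1, 1, 2, 2, None, None, 3])
```

Note that nulls live in the `values` child in this layout, which is why `None` appears as an ordinary run value here.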
+
+Thank you to [cocoa-xu](https://github.com/cocoa-xu) for adding float16 and run-end encoding
+support and thank you to [WillAyd](https://github.com/WillAyd) for adding string view support!
+
+### IPC Write Support
+
+The nanoarrow library has supported reading
+[Arrow IPC streams](https://arrow.apache.org/docs/format/Columnar.html)
+since 0.4.0 but could not write streams of its own. The nanoarrow 0.6.0 release adds
+support for stream writing from C using the `ArrowIpcWriter` and stream writing
+from R and Python:
+
+```python
+import io
+import nanoarrow as na
+from nanoarrow import ipc
+
+out = io.BytesIO()
+with ipc.StreamWriter.from_writable(out) as writer:
+    writer.write_stream(ipc.InputStream.example())
+
+out.seek(0)
+na.ArrayStream.from_readable(out).read_all()
+#> nanoarrow.Array<non-nullable struct<some_col: int32>>[3]
+#> {'some_col': 1}
+#> {'some_col': 2}
+#> {'some_col': 3}
+```
+
+``` r
+library(nanoarrow)
+
+tf <- tempfile()
+nycflights13::flights |> write_nanoarrow(tf)
+
+read_nanoarrow(tf) |> tibble::as_tibble()
+#> # A tibble: 336,776 × 19
+#>     year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
+#>    <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
+#>  1  2013     1     1      517            515         2      830            819
+#>  2  2013     1     1      533            529         4      850            830
+#>  3  2013     1     1      542            540         2      923            850
+#>  4  2013     1     1      544            545        -1     1004           1022
+#>  5  2013     1     1      554            600        -6      812            837
+#>  6  2013     1     1      554            558        -4      740            728
+#>  7  2013     1     1      555            600        -5      913            854
+#>  8  2013     1     1      557            600        -3      709            723
+#>  9  2013     1     1      557            600        -3      838            846
+#> 10  2013     1     1      558            600        -2      753            745
+#> # ℹ 336,766 more rows
+#> # ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
+#> #   tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
+#> #   hour <dbl>, minute <dbl>, time_hour <dttm>
+```
+
+As a result of the IPC write support, nanoarrow now joins the Arrow IPC integration tests
+to ensure compatibility across implementations. With the exception of
+[arrow-rs due to a bug in the Rust flatbuffers implementation](https://github.com/apache/arrow-rs/issues/5052),
+nanoarrow is now tested against all participating Arrow implementations with every commit.
+
+A huge thank you to [bkietz](https://github.com/bkietz) for implementing this support and
+the tests (which included multiple bugfixes and identification of inconsistencies in
+flatbuffer verification across C, Rust, and C++!).
+
+### DLPack/CUDA Support
+
+The nanoarrow 0.6.0 release includes improved support for the
+[Arrow C Device data interface](https://arrow.apache.org/docs/format/CDeviceDataInterface.html).
+In particular, the CUDA device implementation was improved to more efficiently coordinate
+synchronization when copying arrays to/from the GPU and migrated to use the driver API
+for wider compatibility. The nanoarrow Python bindings have limited support for creating
+`ArrowDeviceArray` wrappers that implement the
+[`__arrow_c_device_array__` protocol](https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html#export-protocol)
+from anything that implements DLPack:
+
+``` python
+# Currently requires:
+# export NANOARROW_PYTHON_CUDA=/usr/local/cuda
+# pip install --force-reinstall --no-binary=":all:" nanoarrow
+import nanoarrow as na
+from nanoarrow import device
+import cupy as cp
+
+device.c_device_array(cp.array([1, 2, 3]))
+#> <nanoarrow.device.CDeviceArray>
+#> - device_type: CUDA <2>
+#> - device_id: 0
+#> - array: <nanoarrow.c_array.CArray int64>
+#>   - length: 3
+#>   - offset: 0
+#>   - null_count: 0
+#>   - buffers: (0, 133980798058496)
+#>   - dictionary: NULL
+#>   - children[0]:
+
+darray = device.c_device_array(cp.array([1, 2, 3]))
+cp.from_dlpack(darray.array.view().buffer(1))
+#> array([1, 2, 3])
+```
+
+Thank you to [AlenkaF](https://github.com/AlenkaF), [shwina](https://github.com/shwina),
+and [danepitkin](https://github.com/danepitkin) for their contributions to and
+review of this feature!
+
+### Build System Support for IPC/Device
+
+Lastly, the CMake build system was refactored to enable `FetchContent` to
+work in an even wider variety of
+[develop/build/install scenarios](https://github.com/apache/arrow-nanoarrow/tree/main/examples/cmake-scenarios).
+In most cases, CMake-based projects should be able
+to add the nanoarrow C library with device and/or IPC support as a dependency with:
+
+``` cmake
+include(FetchContent)
+
+# If required:
+# set(NANOARROW_IPC ON)
+# set(NANOARROW_DEVICE ON)
+fetchcontent_declare(nanoarrow
+                     URL "https://www.apache.org/dyn/closer.lua?action=download&filename=arrow/nanoarrow-0.6.0/apache-arrow-0.6.0.tar.gz")
+fetchcontent_makeavailable(nanoarrow)
+
+add_executable(some_target ...)
+target_link_libraries(some_target
+                      PRIVATE
+                      nanoarrow::nanoarrow
+                      # If needed
+                      # nanoarrow::nanoarrow_ipc
+                      # nanoarrow::nanoarrow_device
+                      )
+```
+
+Linking against nanoarrow installed via `cmake --install` and located
+via `find_package()` is also supported.
+
+Users of the Meson build system can install the latest nanoarrow with:
+
+``` shell
+mkdir subprojects
+meson wrap install nanoarrow
+```
+
+...and declare it as a dependency with:
+
+``` shell
+nanoarrow_dep = dependency('nanoarrow')
+example_exec = executable('example_meson_minimal_app',
+                          'src/app.cc',
+                          dependencies: [nanoarrow_dep])
+```
+
+## Contributors
+
+This release consists of contributions from 10 contributors in addition
+to the invaluable advice and support of the Apache Arrow community.
+
+```console
+$ git shortlog -sn apache-arrow-nanoarrow-0.6.0.dev..apache-arrow-nanoarrow-0.6.0 | grep -v "GitHub Actions"
+    64  Dewey Dunnington
+    19  William Ayd
+    16  Benjamin Kietzman
+     5  Cocoa
+     2  Abhishek Singh
+     1  Ashwin Srinath
+     1  Dane Pitkin
+     1  Jacob Wujciak-Jens
+     1  Matt Topol
+     1  Tao Zuhong
+```
