This is an automated email from the ASF dual-hosted git repository.
paleolimbot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-site.git
The following commit(s) were added to refs/heads/main by this push:
new a2c45ab5158 [Website]: Add nanoarrow 0.1 release post (#288)
a2c45ab5158 is described below
commit a2c45ab5158755d30343f137872ac5667c50e5df
Author: Dewey Dunnington <[email protected]>
AuthorDate: Tue Mar 7 17:00:04 2023 -0400
[Website]: Add nanoarrow 0.1 release post (#288)
Co-authored-by: David Li <[email protected]>
---
_posts/2023-03-01-nanoarrow-0.1.0-release.md | 231 +++++++++++++++++++++++++++
1 file changed, 231 insertions(+)
diff --git a/_posts/2023-03-01-nanoarrow-0.1.0-release.md
b/_posts/2023-03-01-nanoarrow-0.1.0-release.md
new file mode 100644
index 00000000000..9b66b048ccd
--- /dev/null
+++ b/_posts/2023-03-01-nanoarrow-0.1.0-release.md
@@ -0,0 +1,231 @@
+---
+layout: post
+title: "Apache Arrow nanoarrow 0.1 Release"
+date: "2023-03-07 00:00:00"
+author: pmc
+categories: [release]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+The Apache Arrow team is pleased to announce the 0.1.0 release of
+Apache Arrow nanoarrow. This initial release covers 31 resolved issues from
+6 contributors.
+
+## Release Highlights
+
+This initial release includes the following:
+
+- A [C library](#using-nanoarrow-in-c) bundled as two files (nanoarrow.c
+ and nanoarrow.h).
+- An [R package](#using-nanoarrow-in-c-r-and-python) providing bindings for
users
+ of the R programming language.
+
+See the
+[Changelog](https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.1.0/CHANGELOG.md)
+for a detailed list of contributions leading up to the initial release.
+
+## Why nanoarrow?
+
+The adoption of the
+[Arrow C Data
Interface](https://arrow.apache.org/docs/format/CDataInterface.html)
+and the [Arrow C Stream
Interface](https://arrow.apache.org/docs/format/CStreamInterface.html)
+since their
+[introduction](https://arrow.apache.org/blog/2020/05/03/introducing-arrow-c-data-interface/)
+have been impressive and enthusiastic: not only have Arrow language bindings
+adopted the standard to pass data among themselves, a growing number of
+high-profile libraries like
+[GDAL](https://gdal.org/development/rfc/rfc86_column_oriented_api.html) and
+[DuckDB](https://duckdb.org/2021/12/03/duck-arrow.html) use the standard to
+improve performance and provide an ABI-stable interface to tabular input and
output.
+
+GDAL and DuckDB are fortunate to have hard-working and forward-thinking
maintainers
+that were motivated to provide support for the Arrow C Data and Stream
interfaces
+even though the code to do so required an intimate knowledge of both the
interface
+and the columnar specification on which it is based.
+
+The vision of [nanoarrow](https://github.com/apache/arrow-nanoarrow)
+is that it should be trivial for a library or application to implement an
Arrow-based
+interface: if a library consumes or produces tabular data, Arrow should be the
+first place developers look. Developers shouldn't have to be familiar with the
+details of the columnar specification---nor should they have to take on any
+build-time dependencies---to get started.
+
+The [Arrow Database Connectivity
(ADBC)](https://arrow.apache.org/docs/format/ADBC.html)
+specification is a good example of such a project, and provided a strong
+motivator for the development of nanoarrow: at the heart of ADBC is the
+idea of a core "driver manager" and database-specific drivers that are
distributed
+as independent C/C++/Python/R/Java/Go projects. At least in R and Python,
+embedding an existing Arrow implementation (e.g., Arrow C++) is challenging
+in the context of multiple packages intended to be loaded into the same
process.
+As of this writing, ADBC includes nanoarrow-based SQLite and PostgreSQL drivers
+and a nanoarrow-based validation suite for drivers.
+
+## Using nanoarrow in C
+
+The nanoarrow C library is distributed as
+[two files (nanoarrow.h and
nanoarrow.c)](https://github.com/apache/arrow-nanoarrow/tree/main/dist)
+that can be copied and vendored into an existing code base. This results in
+a static library of about 50 KB and builds in less than a second. Some
features
+that nanoarrow provides are:
+
+* [Helpers to create types, schemas, and
metadata](https://apache.github.io/arrow-nanoarrow/dev/c.html#creating-schemas)
+* [Growable
buffers](https://apache.github.io/arrow-nanoarrow/dev/c.html#owning-growable-buffers),
+ including the option for custom allocators/deallocators.
+* [Bitmap (i.e., bitpacked boolean)
utilities](https://apache.github.io/arrow-nanoarrow/dev/c.html#bitmap-utilities)
+* An [API for building arrays from
buffers](https://apache.github.io/arrow-nanoarrow/dev/c.html#creating-arrays)
+* An [API for building arrays
element-wise](https://apache.github.io/arrow-nanoarrow/dev/c.html#creating-arrays)
+* An [API to extract elements
element-wise](https://apache.github.io/arrow-nanoarrow/dev/c.html#reading-arrays)
+ from an existing array.
+
+For example, one can build an integer array element-wise:
+
+```c
+#include "nanoarrow.h"
+
+int make_simple_array(struct ArrowArray* array_out, struct ArrowSchema*
schema_out) {
+ struct ArrowError error;
+ array_out->release = NULL;
+ schema_out->release = NULL;
+
+ NANOARROW_RETURN_NOT_OK(ArrowArrayInitFromType(array_out,
NANOARROW_TYPE_INT32));
+
+ NANOARROW_RETURN_NOT_OK(ArrowArrayStartAppending(array_out));
+ NANOARROW_RETURN_NOT_OK(ArrowArrayAppendInt(array_out, 1));
+ NANOARROW_RETURN_NOT_OK(ArrowArrayAppendInt(array_out, 2));
+ NANOARROW_RETURN_NOT_OK(ArrowArrayAppendInt(array_out, 3));
+ NANOARROW_RETURN_NOT_OK(ArrowArrayFinishBuilding(array_out, &error));
+
+ NANOARROW_RETURN_NOT_OK(ArrowSchemaInitFromType(schema_out,
NANOARROW_TYPE_INT32));
+
+ return NANOARROW_OK;
+}
+```
+
+Similarly, one can extract elements from an array:
+
+```c
+#include <stdio.h>
+#include "nanoarrow.h"
+
+int print_simple_array(struct ArrowArray* array, struct ArrowSchema* schema) {
+ struct ArrowError error;
+ struct ArrowArrayView array_view;
+ NANOARROW_RETURN_NOT_OK(ArrowArrayViewInitFromSchema(&array_view, schema,
&error));
+
+ if (array_view.storage_type != NANOARROW_TYPE_INT32) {
+ printf("Array has storage that is not int32\n");
+ }
+
+ int result = ArrowArrayViewSetArray(&array_view, array, &error);
+ if (result != NANOARROW_OK) {
+ ArrowArrayViewReset(&array_view);
+ return result;
+ }
+
+ for (int64_t i = 0; i < array->length; i++) {
+ printf("%d\n", (int)ArrowArrayViewGetIntUnsafe(&array_view, i));
+ }
+
+ ArrowArrayViewReset(&array_view);
+ return NANOARROW_OK;
+}
+```
+
+## Using nanoarrow in C++, R, and Python
+
+Recognizing that many projects for which nanoarrow may be useful will have
+access a higher-level runtime than C, there are experiments to provide
+these users with a minimal set of useful tools.
+
+For C++ projects, an experimental
+["nanoarrow.hpp"](https://apache.github.io/arrow-nanoarrow/dev/cpp.html)
+interface provides `unique_ptr`-like wrappers for nanoarrow C objects to
+reduce the verbosity of using the nanoarrow API. For example, the previous
+`print_simple_array()` implementation would collapse to:
+
+```cpp
+#include <stdio.h>
+#include "nanoarrow.hpp"
+
+int print_simple_array2(struct ArrowArray* array, struct ArrowSchema* schema) {
+ struct ArrowError error;
+ nanoarrow::UniqueArrayView array_view;
+ NANOARROW_RETURN_NOT_OK(ArrowArrayViewInitFromSchema(array_view.get(),
schema, &error));
+ NANOARROW_RETURN_NOT_OK(ArrowArrayViewSetArray(array_view.get(), array,
&error));
+ for (int64_t i = 0; i < array->length; i++) {
+ printf("%d\n", (int)ArrowArrayViewGetIntUnsafe(array_view.get(), i));
+ }
+ return NANOARROW_OK;
+}
+```
+
+For R packages, experimental
+[R bindings](https://apache.github.io/arrow-nanoarrow/dev/r/index.html) provide
+a limited set of conversions between R vectors and Arrow arrays such that
+R bindings for a library with an Arrow-based interface do not need to provide
+this behaviour themselves. Additional features include printing and validating
+the content of the C structures at the heart of the C Data and C Stream
+interfaces to facilitate the development of bindings to Arrow-based libraries.
+
+```r
+# install.packages("remotes")
+remotes::install_github("apache/arrow-nanoarrow/r", build = FALSE)
+library(nanoarrow)
+
+as_nanoarrow_array(1:5)
+#> <nanoarrow_array int32[5]>
+#> $ length : int 5
+#> $ null_count: int 0
+#> $ offset : int 0
+#> $ buffers :List of 2
+#> ..$ :<nanoarrow_buffer_validity[0 b] at 0x0>
+#> ..$ :<nanoarrow_buffer_data_int32[20 b] at 0x135d13c28>
+#> $ dictionary: NULL
+#> $ children : list()
+```
+
+A [Python package
skeleton](https://github.com/apache/arrow-nanoarrow/tree/main/python)
+exists in the nanoarrow repository and further functionality may be added once
+the C library interface has stabilized.
+
+## Try nanoarrow
+
+For any interested in giving nanoarrow a try, the easiest way to get started
is to clone the
+[nanoarrow repository from
GitHub](https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.1.0)
+and build/modify the
+[minimal CMake build
example](https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.1.0/examples/cmake-minimal).
+For applied usage, one can refer to the
+[ADBC SQLite
driver](https://github.com/apache/arrow-adbc/tree/main/c/driver/sqlite)
+and the [ADBC PostgreSQL
driver](https://github.com/apache/arrow-adbc/tree/main/c/driver/postgresql).
+
+## Contributors
+
+This initial release consists of contributions from 6 contributors in addition
+to the invaluable advice and support of the Apache Arrow developer mailing
list.
+
+```
+$ git shortlog -sn
8339114637919b661c1c8fae6764ceed532c935e..apache-arrow-nanoarrow-0.1.0 | grep
-v "GitHub Actions"
+ 100 Dewey Dunnington
+ 7 David Li
+ 2 Dirk Eddelbuettel
+ 1 Dane Pitkin
+ 1 Jonathan Keane
+ 1 Joris Van den Bossche
+```