paleolimbot commented on code in PR #340:
URL: https://github.com/apache/arrow-nanoarrow/pull/340#discussion_r1449116618
##########
python/README.md:
##########
@@ -43,97 +43,129 @@ If you can import the namespace, you're good to go!
 import nanoarrow as na
 ```
 
-## Example
+## Low-level C library bindings
 
-The Arrow C Data and Arrow C Stream interfaces are comprised of three structures: the `ArrowSchema` which represents a data type of an array, the `ArrowArray` which represents the values of an array, and an `ArrowArrayStream`, which represents zero or more `ArrowArray`s with a common `ArrowSchema`. All three can be wrapped by Python objects using the nanoarrow Python package.
+The Arrow C Data and Arrow C Stream interfaces are comprised of three structures: the `ArrowSchema` which represents a data type of an array, the `ArrowArray` which represents the values of an array, and an `ArrowArrayStream`, which represents zero or more `ArrowArray`s with a common `ArrowSchema`.
 
 ### Schemas
 
-Use `nanoarrow.schema()` to convert a data type-like object to an `ArrowSchema`. This is currently only implemented for pyarrow objects.
+Use `nanoarrow.c_schema()` to convert an object to an `ArrowSchema` and wrap it as a Python object. This works for any object implementing the [Arrow PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface.html) (e.g., `pyarrow.Schema`, `pyarrow.DataType`, and `pyarrow.Field`).
 
 ```python
 import pyarrow as pa
-schema = na.schema(pa.decimal128(10, 3))
+schema = na.c_schema(pa.decimal128(10, 3))
+schema
 ```
 
-You can extract the fields of a `Schema` object one at a time or parse it into a view to extract deserialized parameters.
+
+
+
+    <nanoarrow.c_lib.CSchema decimal128(10, 3)>
+    - format: 'd:10,3'
+    - name: ''
+    - flags: 2
+    - metadata: NULL
+    - dictionary: NULL
+    - children[0]:
+
+
+
+You can extract the fields of a `CSchema` object one at a time or parse it into a view to extract deserialized parameters.
 
 ```python
-print(schema.format)
-print(schema.view().decimal_precision)
-print(schema.view().decimal_scale)
+na.c_schema_view(schema)
 ```
 
-    d:10,3
-    10
-    3
 
-The `nanoarrow.schema()` helper is currently only implemented for pyarrow objects. If your data type has an `_export_to_c()`-like function, you can get the address of a freshly-allocated `ArrowSchema` as well:
+
+    <nanoarrow.c_lib.CSchemaView>
+    - type: 'decimal128'
+    - storage_type: 'decimal128'
+    - decimal_bitwidth: 128
+    - decimal_precision: 10
+    - decimal_scale: 3
+
+
+
+Advanced users can allocate an empty `CSchema` and populate its contents by passing its `._addr()` to a schema-exporting function.
 
 ```python
-schema = na.Schema.allocate()
+schema = na.c_schema()

Review Comment:
   That's a great point... `na.c_schema()` can in theory be used to sanitize input, and allocating a blank one has a totally different use case. I updated this to `nanoarrow.allocate_c_XXX()` for now... I'm not sure the `CSchema` family of classes should be in the root namespace, and defining the function in Python gives better documentation when typing in an IDE 🤷
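
For readers following the diff, the relationship between the `d:10,3` format string and the `decimal_precision`, `decimal_scale`, and `decimal_bitwidth` values in the schema view output can be sketched in plain Python. This is only an illustration, not how nanoarrow implements its schema view: `parse_decimal_format` is a hypothetical helper, and it assumes the Arrow C Data Interface convention that a decimal format string is `d:precision,scale` with an optional third bit-width field that defaults to 128.

```python
def parse_decimal_format(fmt: str) -> dict:
    """Hypothetical helper: split an Arrow decimal format string into its parts.

    Assumes the form 'd:precision,scale' or 'd:precision,scale,bitwidth'.
    """
    prefix, sep, params = fmt.partition(":")
    if prefix != "d" or not sep:
        raise ValueError(f"not a decimal format string: {fmt!r}")

    # int() raises ValueError on empty or non-numeric fields.
    parts = [int(p) for p in params.split(",")]
    if len(parts) == 2:
        precision, scale = parts
        bitwidth = 128  # assumed default when the third field is omitted
    elif len(parts) == 3:
        precision, scale, bitwidth = parts
    else:
        raise ValueError(f"expected 2 or 3 decimal parameters in {fmt!r}")

    return {
        "decimal_bitwidth": bitwidth,
        "decimal_precision": precision,
        "decimal_scale": scale,
    }


# The format string from the CSchema repr in the diff.
print(parse_decimal_format("d:10,3"))
```

The sketch covers only the decimal case; other format strings are rejected with a `ValueError`.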
