Re: [PR] refactor(python): Document, prefix, and add reprs for C-wrapping classes [arrow-nanoarrow]

via GitHub Thu, 11 Jan 2024 08:33:48 -0800


paleolimbot commented on code in PR #340:
URL: https://github.com/apache/arrow-nanoarrow/pull/340#discussion_r1449116618



##########
python/README.md:
##########
@@ -43,97 +43,129 @@ If you can import the namespace, you're good to go!
 import nanoarrow as na
 ```
 
-## Example
+## Low-level C library bindings
 
-The Arrow C Data and Arrow C Stream interfaces are comprised of three 
structures: the `ArrowSchema` which represents a data type of an array, the 
`ArrowArray` which represents the values of an array, and an 
`ArrowArrayStream`, which represents zero or more `ArrowArray`s with a common 
`ArrowSchema`. All three can be wrapped by Python objects using the nanoarrow 
Python package.
+The Arrow C Data and Arrow C Stream interfaces are comprised of three 
structures: the `ArrowSchema` which represents a data type of an array, the 
`ArrowArray` which represents the values of an array, and an 
`ArrowArrayStream`, which represents zero or more `ArrowArray`s with a common 
`ArrowSchema`.
 
 ### Schemas
 
-Use `nanoarrow.schema()` to convert a data type-like object to an 
`ArrowSchema`. This is currently only implemented for pyarrow objects.
+Use `nanoarrow.c_schema()` to convert an object to an `ArrowSchema` and wrap 
it as a Python object. This works for any object implementing the [Arrow 
PyCapsule Interface](https://arrow.apache.org/docs/format/CDataInterface.html) 
(e.g., `pyarrow.Schema`, `pyarrow.DataType`, and `pyarrow.Field`).
 
 
 ```python
 import pyarrow as pa
-schema = na.schema(pa.decimal128(10, 3))
+schema = na.c_schema(pa.decimal128(10, 3))
+schema
 ```
 
-You can extract the fields of a `Schema` object one at a time or parse it into 
a view to extract deserialized parameters.
+
+
+
+    <nanoarrow.c_lib.CSchema decimal128(10, 3)>
+    - format: 'd:10,3'
+    - name: ''
+    - flags: 2
+    - metadata: NULL
+    - dictionary: NULL
+    - children[0]:
+
+
+
+You can extract the fields of a `CSchema` object one at a time or parse it 
into a view to extract deserialized parameters.
 
 
 ```python
-print(schema.format)
-print(schema.view().decimal_precision)
-print(schema.view().decimal_scale)
+na.c_schema_view(schema)
 ```
 
-    d:10,3
-    10
-    3
 
 
-The `nanoarrow.schema()` helper is currently only implemented for pyarrow 
objects. If your data type has an `_export_to_c()`-like function, you can get 
the address of a freshly-allocated `ArrowSchema` as well:
+
+    <nanoarrow.c_lib.CSchemaView>
+    - type: 'decimal128'
+    - storage_type: 'decimal128'
+    - decimal_bitwidth: 128
+    - decimal_precision: 10
+    - decimal_scale: 3
+
+
+
+Advanced users can allocate an empty `CSchema` and populate its contents by 
passing its `._addr()` to a schema-exporting function.
 
 
 ```python
-schema = na.Schema.allocate()
+schema = na.c_schema()

Review Comment:
   That's a great point: `na.c_schema()` can in theory be used to sanitize 
input, and allocating a blank one is a totally different use case.
   
   I updated this to `nanoarrow.allocate_c_XXX()` for now. I'm not sure the 
`CSchema` family of classes should be in the root namespace, and defining the 
function in Python gives better documentation when typing in an IDE 🤷 
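
   To make the distinction concrete, here is a rough sketch of the split 
between "sanitize input" and "allocate a blank one". It uses pure-Python 
stand-ins, not the actual nanoarrow classes: `FakeCSchema`, `Decimal128Like`, 
and the bodies of `c_schema()` and `allocate_c_schema()` are hypothetical 
and only illustrate the idea. In particular, a real `__arrow_c_schema__()` 
returns a PyCapsule, not a format string.

```python
class FakeCSchema:
    """Hypothetical stand-in for a wrapped ArrowSchema (not nanoarrow's CSchema)."""

    def __init__(self, format=None):
        # format stays None until something populates the struct
        self.format = format

    def is_valid(self):
        return self.format is not None


def allocate_c_schema():
    """Allocate a blank schema for an exporter to fill in later."""
    return FakeCSchema()


def c_schema(obj):
    """Sanitize input: always take an object, always return a populated schema."""
    if isinstance(obj, FakeCSchema):
        if not obj.is_valid():
            raise ValueError("schema has not been populated")
        return obj
    if hasattr(obj, "__arrow_c_schema__"):
        # Assumption for this sketch only: the protocol method returns a
        # format string. The real protocol hands back a PyCapsule.
        return FakeCSchema(obj.__arrow_c_schema__())
    raise TypeError(f"Can't convert {type(obj).__name__} to a schema")


class Decimal128Like:
    """Hypothetical exporter used only for this example."""

    def __arrow_c_schema__(self):
        return "d:10,3"
```

   With two functions, `c_schema()` never has to guess whether a missing 
argument means "allocate": it either returns something populated or raises.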



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

