jorgecarleitao commented on a change in pull request #9778:
URL: https://github.com/apache/arrow/pull/9778#discussion_r600723317
##########
File path: rust/arrow/src/ffi.rs
##########
@@ -425,6 +447,27 @@ unsafe fn create_buffer(
NonNull::new(ptr as *mut u8).map(|ptr| Buffer::from_unowned(ptr, len,
array))
}
+unsafe fn create_child_arrays(
+ array: Arc<FFI_ArrowArray>,
+ schema: Arc<FFI_ArrowSchema>,
+) -> Result<Vec<ArrayData>> {
+ if array.n_children == 0 {
+ return Ok(vec![]);
+ }
+ let mut children = Vec::with_capacity(array.n_children as usize);
+ for i in 0..array.n_children as usize {
+ let arr_ptr = *array.children.add(i);
+ let schema_ptr = *schema.children.add(i);
+ let arrow_arr = ArrowArray::try_from_raw(
+ arr_ptr as *const FFI_ArrowArray,
+ schema_ptr as *const FFI_ArrowSchema,
+ )?;
+ let data = ArrayData::try_from(arrow_arr)?;
+ children.push(data)
+ }
+ Ok(children)
Review comment:
can probably be simplified to
```rust
(0..array.n_children).map(|| {
...
}).collect()
```
##########
File path: rust/arrow/src/datatypes/field.rs
##########
@@ -39,6 +39,12 @@ pub struct Field {
metadata: Option<BTreeMap<String, String>>,
}
+impl Default for Field {
+ fn default() -> Self {
+ Field::new("", DataType::Null, false)
+ }
+}
+
Review comment:
Why should a default field be the Null field? My feeling is that should
not have a default for `Field`.
##########
File path: rust/arrow/src/ffi.rs
##########
@@ -193,6 +204,8 @@ fn to_datatype(format: &str) -> Result<DataType> {
"ttm" => DataType::Time32(TimeUnit::Millisecond),
"ttu" => DataType::Time64(TimeUnit::Microsecond),
"ttn" => DataType::Time64(TimeUnit::Nanosecond),
+ "+l" => DataType::List(Box::new(Default::default())),
+ "+L" => DataType::LargeList(Box::new(Default::default())),
Review comment:
I am not sure this is sufficient. For nested types, I think that we must
transverse the childs before initializing the parent's data type. As is, the
parent will get `List(Null)`, but the child will get whatever data type the
child has.
We unfortunately do not even validate that `child dataType = parent
datatype`, which is why this trick works. However, we will not be fulfilling
the specification and can result in UB.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]