jorgecarleitao commented on a change in pull request #9778:
URL: https://github.com/apache/arrow/pull/9778#discussion_r600723317



##########
File path: rust/arrow/src/ffi.rs
##########
@@ -425,6 +447,27 @@ unsafe fn create_buffer(
     NonNull::new(ptr as *mut u8).map(|ptr| Buffer::from_unowned(ptr, len, 
array))
 }
 
+unsafe fn create_child_arrays(
+    array: Arc<FFI_ArrowArray>,
+    schema: Arc<FFI_ArrowSchema>,
+) -> Result<Vec<ArrayData>> {
+    if array.n_children == 0 {
+        return Ok(vec![]);
+    }
+    let mut children = Vec::with_capacity(array.n_children as usize);
+    for i in 0..array.n_children as usize {
+        let arr_ptr = *array.children.add(i);
+        let schema_ptr = *schema.children.add(i);
+        let arrow_arr = ArrowArray::try_from_raw(
+            arr_ptr as *const FFI_ArrowArray,
+            schema_ptr as *const FFI_ArrowSchema,
+        )?;
+        let data = ArrayData::try_from(arrow_arr)?;
+        children.push(data)
+    }
+    Ok(children)

Review comment:
       can probably be simplified to 
   
   ```rust
   (0..array.n_children).map(|| {
   ...
   }).collect()
   ```

##########
File path: rust/arrow/src/datatypes/field.rs
##########
@@ -39,6 +39,12 @@ pub struct Field {
     metadata: Option<BTreeMap<String, String>>,
 }
 
+impl Default for Field {
+    fn default() -> Self {
+        Field::new("", DataType::Null, false)
+    }
+}
+

Review comment:
       Why should a default field be the Null field? My feeling is that should 
not have a default for `Field`.

##########
File path: rust/arrow/src/ffi.rs
##########
@@ -193,6 +204,8 @@ fn to_datatype(format: &str) -> Result<DataType> {
         "ttm" => DataType::Time32(TimeUnit::Millisecond),
         "ttu" => DataType::Time64(TimeUnit::Microsecond),
         "ttn" => DataType::Time64(TimeUnit::Nanosecond),
+        "+l" => DataType::List(Box::new(Default::default())),
+        "+L" => DataType::LargeList(Box::new(Default::default())),

Review comment:
       I am not sure this is sufficient. For nested types, I think that we must 
transverse the childs before initializing the parent's data type. As is, the 
parent will get `List(Null)`, but the child will get whatever data type the 
child has.
   
   We unfortunately do not even validate that `child dataType = parent 
datatype`, which is why this trick works. However, we will not be fulfilling 
the specification and can result in UB.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to