alamb commented on code in PR #10060:
URL: https://github.com/apache/arrow-rs/pull/10060#discussion_r3351659721


##########
arrow-array/src/ffi.rs:
##########
@@ -287,14 +287,9 @@ pub unsafe fn from_ffi(array: FFI_ArrowArray, schema: 
&FFI_ArrowSchema) -> Resul
         data_type: dt,
         owner: &array,
     };
-    let mut data = tmp.consume()?;
-    // arrow-rs has stricter alignment requirements than the C Data Interface 
spec;
-    // a no-op when buffers are already aligned. Unreachable under
-    // `cfg(feature = "force_validate")`; tracked in #10034.
-    // See https://github.com/apache/arrow/issues/43552 and
-    // https://github.com/apache/arrow-rs/issues/10028 for context.
-    data.align_buffers();
-    Ok(data)
+    // `consume` realigns buffers before validating (see #10034), so no

Review Comment:
   can you please add the actual github url here?



##########
arrow-array/src/ffi.rs:
##########
@@ -687,30 +686,48 @@ mod tests_to_then_from_ffi {
     }
     // case with nulls is tested in the docs, through the example on this 
module.
 
+    // Functional round-trip coverage of importing an under-aligned 
`Decimal128`
+    // buffer over the C Data Interface. NOTE: this is *not* the 
`force_validate`

Review Comment:
   the note about this not being a regression guard seems only related to this 
PR and will be confusing to future readers outside the context of this PR
   
   >  `arrow-array`'s `force_validate` feature is
   >     // empty and does not forward to `arrow-data/force_validate`, so under
   >     // `force_validate` this test does not actually reach `arrow-data`'s
   
   
   It seems to me like a better change would be to  forward the 
`force_validate` feature rather than making a second copy of the test 🤔 



##########
arrow-array/src/ffi.rs:
##########
@@ -356,18 +346,27 @@ impl ImportedArrowArray<'_> {
             child_data.push(d.consume()?);
         }
 
-        // Should FFI be checking validity?
-        Ok(unsafe {
-            ArrayData::new_unchecked(
-                self.data_type,
-                len,
-                null_count,
-                null_bit_buffer,
-                offset,
-                buffers,
-                child_data,
-            )
-        })
+        // 8-byte alignment is legal over the C Data Interface but violates
+        // arrow-rs's stricter `ArrayData` alignment invariant, so realign the
+        // imported buffers *before* validating them. `align_buffers(true)`
+        // realigns and then `build` validates (or skips validation on the FFI
+        // hot path, except under `force_validate` which always validates). 
This
+        // ordering is what makes import work under `force_validate`; see 
#10034.
+        let mut builder = ArrayDataBuilder::new(self.data_type)
+            .len(len)
+            .offset(offset)
+            .null_bit_buffer(null_bit_buffer)
+            .buffers(buffers)
+            .child_data(child_data)
+            .align_buffers(true);
+        if let Some(null_count) = null_count {
+            builder = builder.null_count(null_count);
+        }
+        // SAFETY: `skip_validation` only skips the redundant FFI validation on

Review Comment:
   the safety of this statement is that FFI is implicuty trusted. We can 
probably reference the security model here as justification: 
https://arrow.apache.org/docs/format/Security.html#c-data-interface
   
   
   The `force_validate` build comments seme to be about this PR and would 
likely be confusing to someone reading this PR in the future



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to