Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-29 Thread via GitHub


viirya commented on PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#issuecomment-3133236650

   Thank you @alamb 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-29 Thread via GitHub


alamb commented on PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#issuecomment-3132941010

   Thanks again @viirya 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-29 Thread via GitHub


alamb merged PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


alamb commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237949057


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   sorry I was confused



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237922973


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   Hmm, did you want to ask why I added `#[cfg(not(feature = 
"force_validate"))]`? It is because our CI runs arrow-array tests with 
`force_validate` enabled. If I don't add this line, CI will run the test and 
fail. So I must add the line to pass CI.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237922973


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   Hmm, did you want to ask why I added `#[cfg(not(feature = 
"force_validate"))]`? It is because our CI runs arrow-array tests with 
`force_validate` enabled. If I don't add this line, CI will run the test and 
fail.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237920593


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   ?
   
   Yea, the test works when `force_validate` is not enabled. So if you run with 
`cargo test -p arrow-array --features=ffi` , it works.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237920593


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   Yea, the test works when `force_validate` is not enabled. So if you run with 
`cargo test -p arrow-array --features=ffi` , it works.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


alamb commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237881296


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   
   > Because cargo test -p arrow-array --features=ffi doesn't enable 
force_validate so it works. 
   
   Right, so `force_validate` is not enabled
   
   > This new test only works when force_validate is not enabled.
   
   Isn't that same?  The test works when force_validate is not enabled 😕 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237873697


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   > I ran the test without this line and it seems to work fine:
   > cargo test -p arrow-array --features=ffi
   
   Because `cargo test -p arrow-array --features=ffi` doesn't enable 
`force_validate` so it works. This new test only works when `force_validate` is 
not enabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237870120


##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   If `force_validate` is enabled, when we call 
`GenericByteViewArray::new_unchecked` to create a Utf8View array, even it is 
"unchecked" one, but `force_validate` will force to validate the views buffer. 
But in this test the `ScalarBuffer` is a invalid one (it uses an unaligned 
dangling pointer), so there will be a runtime error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237865740


##
arrow-array/src/ffi.rs:
##
@@ -408,7 +408,17 @@ impl ImportedArrowArray<'_> {
 .map(|index| {
 let len = self.buffer_len(index, variadic_buffer_lens, 
&self.data_type)?;
 match unsafe { create_buffer(self.owner.clone(), self.array, 
index, len) } {
-Some(buf) => Ok(buf),
+Some(buf) => {
+// External libraries may use a dangling pointer for a 
buffer with length 0.
+// We respect the array length specified in the C Data 
Interface. Actually,

Review Comment:
   I just wanted to say that if there is a question on if the length is zero, 
but the pointer isn't a dangling pointer. Previously we simply take any 
non-null pointer as valid one and didn't consider the length. We ignored the 
possibility of a dangling pointer.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


alamb commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237842657


##
.github/workflows/arrow.yml:
##
@@ -68,7 +68,10 @@ jobs:
   - name: Test arrow-schema
 run: cargo test -p arrow-schema --all-features
   - name: Test arrow-array
-run: cargo test -p arrow-array --all-features
+run: |
+  cargo test -p arrow-array --all-features
+  # Disable feature `force_validate`
+  cargo test -p arrow-array

Review Comment:
   I suggest we explicitly run with the ffi feature too
   
   ```suggestion
 cargo test -p arrow-array --features=ffi
   ```



##
arrow-array/src/ffi.rs:
##
@@ -408,7 +408,17 @@ impl ImportedArrowArray<'_> {
 .map(|index| {
 let len = self.buffer_len(index, variadic_buffer_lens, 
&self.data_type)?;
 match unsafe { create_buffer(self.owner.clone(), self.array, 
index, len) } {
-Some(buf) => Ok(buf),
+Some(buf) => {
+// External libraries may use a dangling pointer for a 
buffer with length 0.
+// We respect the array length specified in the C Data 
Interface. Actually,

Review Comment:
   I don't understand this comment. I guess in my mind  "if the length is 
incorrect we can't make a good buffer" is "obvious" but maybe it is not for 
other readers



##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   I don't understand why this can't work with `force_validate`
   
   I ran the test without this line and it seems to work fine:
   
   ```diff
   diff --git a/arrow-array/src/ffi.rs b/arrow-array/src/ffi.rs
   index 2ee2fd379e..5b05bcd81d 100644
   --- a/arrow-array/src/ffi.rs
   +++ b/arrow-array/src/ffi.rs
   @@ -621,7 +621,6 @@ mod tests_to_then_from_ffi {
}
   
#[test]
   -#[cfg(not(feature = "force_validate"))]
fn test_decimal_round_trip() -> Result<()> {
// create an array natively
let original_array = [Some(12345_i128), Some(-12345_i128), None]
   @@ -1523,7 +1522,6 @@ mod tests_from_ffi {
}
   
#[test]
   -#[cfg(not(feature = "force_validate"))]
fn test_empty_string_with_non_zero_offset() -> Result<()> {
use super::ImportedArrowArray;
use arrow_buffer::{MutableBuffer, OffsetBuffer};
   @@ -1677,7 +1675,6 @@ mod tests_from_ffi {
}
   
#[test]
   -#[cfg(not(feature = "force_validate"))]
fn test_utf8_view_ffi_from_dangling_pointer() {
let empty = 
GenericByteViewBuildernew().finish();
let buffers = empty.data_buffers().to_vec();
   ```
   
   ```shell
   cargo test -p arrow-array --features=ffi
   ```
   test result: ok. 185 passed; 0 failed; 1 ignored; 0 measured; 0 filtered 
out; finished in 24.13s
   ```
   



##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
 }
 }
 
+#[test]
+#[cfg(not(feature = "force_validate"))]

Review Comment:
   I don't understand why this can't work with `force_validate`
   
   I ran the test without this line and it seems to work fine:
   
   ```diff
   diff --git a/arrow-array/src/ffi.rs b/arrow-array/src/ffi.rs
   index 2ee2fd379e..5b05bcd81d 100644
   --- a/arrow-array/src/ffi.rs
   +++ b/arrow-array/src/ffi.rs
   @@ -621,7 +621,6 @@ mod tests_to_then_from_ffi {
}
   
#[test]
   -#[cfg(not(feature = "force_validate"))]
fn test_decimal_round_trip() -> Result<()> {
// create an array natively
let original_array = [Some(12345_i128), Some(-12345_i128), None]
   @@ -1523,7 +1522,6 @@ mod tests_from_ffi {
}
   
#[test]
   -#[cfg(not(feature = "force_validate"))]
fn test_empty_string_with_non_zero_offset() -> Result<()> {
use super::ImportedArrowArray;
use arrow_buffer::{MutableBuffer, OffsetBuffer};
   @@ -1677,7 +1675,6 @@ mod tests_from_ffi {
}
   
#[test]
   -#[cfg(not(feature = "force_validate"))]
fn test_utf8_view_ffi_from_dangling_pointer() {
let empty = 
GenericByteViewBuildernew().finish();
let buffers = empty.data_buffers().to_vec();
   ```
   
   ```shell
   cargo test -p arrow-array --features=ffi
   ```
   ```
   test result: ok. 185 passed; 0 failed; 1 ignored; 0 measured; 0 filtered 
out; finished in 24.13s
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2236685051


##
arrow-buffer/src/buffer/scalar.rs:
##
@@ -99,6 +112,16 @@ impl ScalarBuffer {
 pub fn ptr_eq(&self, other: &Self) -> bool {
 self.buffer.ptr_eq(&other.buffer)
 }
+
+/// Returns the number of elements in the buffer
+pub fn len(&self) -> usize {
+self.buffer.len() / std::mem::size_of::()
+}
+
+/// Returns if the buffer is empty
+pub fn is_empty(&self) -> bool {
+self.len() == 0
+}

Review Comment:
   Added them to avoid `ScalarBuffer` derefed as `[T]` in the added test.
   
   



##
arrow-buffer/src/buffer/scalar.rs:
##
@@ -99,6 +112,16 @@ impl ScalarBuffer {
 pub fn ptr_eq(&self, other: &Self) -> bool {
 self.buffer.ptr_eq(&other.buffer)
 }
+
+/// Returns the number of elements in the buffer
+pub fn len(&self) -> usize {
+self.buffer.len() / std::mem::size_of::()
+}
+
+/// Returns if the buffer is empty
+pub fn is_empty(&self) -> bool {
+self.len() == 0
+}

Review Comment:
   Added them to avoid `ScalarBuffer` derefed as `[T]` in the added test.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]

2025-07-28 Thread via GitHub


viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2236676096


##
.github/workflows/arrow.yml:
##
@@ -68,7 +68,10 @@ jobs:
   - name: Test arrow-schema
 run: cargo test -p arrow-schema --all-features
   - name: Test arrow-array
-run: cargo test -p arrow-array --all-features
+run: |
+  cargo test -p arrow-array --all-features
+  # Disable feature `force_validate`
+  cargo test -p arrow-array

Review Comment:
   Some tests are only run if `force_validate` is not enabled, I found that we 
don't run these tests in CI. Might be a problem if we silently break them. I 
enabled them here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]