Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on PR #8009: URL: https://github.com/apache/arrow-rs/pull/8009#issuecomment-3133236650 Thank you @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
alamb commented on PR #8009: URL: https://github.com/apache/arrow-rs/pull/8009#issuecomment-3132941010 Thanks again @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
alamb merged PR #8009: URL: https://github.com/apache/arrow-rs/pull/8009 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
alamb commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237949057
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
sorry I was confused
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237922973
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
Hmm, did you want to ask why I added `#[cfg(not(feature =
"force_validate"))]`? It is because our CI runs arrow-array tests with
`force_validate` enabled. If I don't add this line, CI will run the test and
fail. So I must add the line to pass CI.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237922973
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
Hmm, did you want to ask why I added `#[cfg(not(feature =
"force_validate"))]`? It is because our CI runs arrow-array tests with
`force_validate` enabled. If I don't add this line, CI will run the test and
fail.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237920593
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
?
Yea, the test works when `force_validate` is not enabled. So if you run with
`cargo test -p arrow-array --features=ffi` , it works.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237920593
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
Yea, the test works when `force_validate` is not enabled. So if you run with
`cargo test -p arrow-array --features=ffi` , it works.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
alamb commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237881296
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
> Because cargo test -p arrow-array --features=ffi doesn't enable
force_validate so it works.
Right, so `force_validate` is not enabled
> This new test only works when force_validate is not enabled.
Isn't that same? The test works when force_validate is not enabled 😕
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237873697
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
> I ran the test without this line and it seems to work fine:
> cargo test -p arrow-array --features=ffi
Because `cargo test -p arrow-array --features=ffi` doesn't enable
`force_validate` so it works. This new test only works when `force_validate` is
not enabled.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237870120
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
If `force_validate` is enabled, when we call
`GenericByteViewArray::new_unchecked` to create a Utf8View array, even it is
"unchecked" one, but `force_validate` will force to validate the views buffer.
But in this test the `ScalarBuffer` is a invalid one (it uses an unaligned
dangling pointer), so there will be a runtime error.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237865740
##
arrow-array/src/ffi.rs:
##
@@ -408,7 +408,17 @@ impl ImportedArrowArray<'_> {
.map(|index| {
let len = self.buffer_len(index, variadic_buffer_lens,
&self.data_type)?;
match unsafe { create_buffer(self.owner.clone(), self.array,
index, len) } {
-Some(buf) => Ok(buf),
+Some(buf) => {
+// External libraries may use a dangling pointer for a
buffer with length 0.
+// We respect the array length specified in the C Data
Interface. Actually,
Review Comment:
I just wanted to say that if there is a question on if the length is zero,
but the pointer isn't a dangling pointer. Previously we simply take any
non-null pointer as valid one and didn't consider the length. We ignored the
possibility of a dangling pointer.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
alamb commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2237842657
##
.github/workflows/arrow.yml:
##
@@ -68,7 +68,10 @@ jobs:
- name: Test arrow-schema
run: cargo test -p arrow-schema --all-features
- name: Test arrow-array
-run: cargo test -p arrow-array --all-features
+run: |
+ cargo test -p arrow-array --all-features
+ # Disable feature `force_validate`
+ cargo test -p arrow-array
Review Comment:
I suggest we explicitly run with the ffi feature too
```suggestion
cargo test -p arrow-array --features=ffi
```
##
arrow-array/src/ffi.rs:
##
@@ -408,7 +408,17 @@ impl ImportedArrowArray<'_> {
.map(|index| {
let len = self.buffer_len(index, variadic_buffer_lens,
&self.data_type)?;
match unsafe { create_buffer(self.owner.clone(), self.array,
index, len) } {
-Some(buf) => Ok(buf),
+Some(buf) => {
+// External libraries may use a dangling pointer for a
buffer with length 0.
+// We respect the array length specified in the C Data
Interface. Actually,
Review Comment:
I don't understand this comment. I guess in my mind "if the length is
incorrect we can't make a good buffer" is "obvious" but maybe it is not for
other readers
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
I don't understand why this can't work with `force_validate`
I ran the test without this line and it seems to work fine:
```diff
diff --git a/arrow-array/src/ffi.rs b/arrow-array/src/ffi.rs
index 2ee2fd379e..5b05bcd81d 100644
--- a/arrow-array/src/ffi.rs
+++ b/arrow-array/src/ffi.rs
@@ -621,7 +621,6 @@ mod tests_to_then_from_ffi {
}
#[test]
-#[cfg(not(feature = "force_validate"))]
fn test_decimal_round_trip() -> Result<()> {
// create an array natively
let original_array = [Some(12345_i128), Some(-12345_i128), None]
@@ -1523,7 +1522,6 @@ mod tests_from_ffi {
}
#[test]
-#[cfg(not(feature = "force_validate"))]
fn test_empty_string_with_non_zero_offset() -> Result<()> {
use super::ImportedArrowArray;
use arrow_buffer::{MutableBuffer, OffsetBuffer};
@@ -1677,7 +1675,6 @@ mod tests_from_ffi {
}
#[test]
-#[cfg(not(feature = "force_validate"))]
fn test_utf8_view_ffi_from_dangling_pointer() {
let empty =
GenericByteViewBuildernew().finish();
let buffers = empty.data_buffers().to_vec();
```
```shell
cargo test -p arrow-array --features=ffi
```
test result: ok. 185 passed; 0 failed; 1 ignored; 0 measured; 0 filtered
out; finished in 24.13s
```
##
arrow-array/src/ffi.rs:
##
@@ -1660,6 +1676,25 @@ mod tests_from_ffi {
}
}
+#[test]
+#[cfg(not(feature = "force_validate"))]
Review Comment:
I don't understand why this can't work with `force_validate`
I ran the test without this line and it seems to work fine:
```diff
diff --git a/arrow-array/src/ffi.rs b/arrow-array/src/ffi.rs
index 2ee2fd379e..5b05bcd81d 100644
--- a/arrow-array/src/ffi.rs
+++ b/arrow-array/src/ffi.rs
@@ -621,7 +621,6 @@ mod tests_to_then_from_ffi {
}
#[test]
-#[cfg(not(feature = "force_validate"))]
fn test_decimal_round_trip() -> Result<()> {
// create an array natively
let original_array = [Some(12345_i128), Some(-12345_i128), None]
@@ -1523,7 +1522,6 @@ mod tests_from_ffi {
}
#[test]
-#[cfg(not(feature = "force_validate"))]
fn test_empty_string_with_non_zero_offset() -> Result<()> {
use super::ImportedArrowArray;
use arrow_buffer::{MutableBuffer, OffsetBuffer};
@@ -1677,7 +1675,6 @@ mod tests_from_ffi {
}
#[test]
-#[cfg(not(feature = "force_validate"))]
fn test_utf8_view_ffi_from_dangling_pointer() {
let empty =
GenericByteViewBuildernew().finish();
let buffers = empty.data_buffers().to_vec();
```
```shell
cargo test -p arrow-array --features=ffi
```
```
test result: ok. 185 passed; 0 failed; 1 ignored; 0 measured; 0 filtered
out; finished in 24.13s
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009:
URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2236685051
##
arrow-buffer/src/buffer/scalar.rs:
##
@@ -99,6 +112,16 @@ impl ScalarBuffer {
pub fn ptr_eq(&self, other: &Self) -> bool {
self.buffer.ptr_eq(&other.buffer)
}
+
+/// Returns the number of elements in the buffer
+pub fn len(&self) -> usize {
+self.buffer.len() / std::mem::size_of::()
+}
+
+/// Returns if the buffer is empty
+pub fn is_empty(&self) -> bool {
+self.len() == 0
+}
Review Comment:
Added them to avoid `ScalarBuffer` derefed as `[T]` in the added test.
##
arrow-buffer/src/buffer/scalar.rs:
##
@@ -99,6 +112,16 @@ impl ScalarBuffer {
pub fn ptr_eq(&self, other: &Self) -> bool {
self.buffer.ptr_eq(&other.buffer)
}
+
+/// Returns the number of elements in the buffer
+pub fn len(&self) -> usize {
+self.buffer.len() / std::mem::size_of::()
+}
+
+/// Returns if the buffer is empty
+pub fn is_empty(&self) -> bool {
+self.len() == 0
+}
Review Comment:
Added them to avoid `ScalarBuffer` derefed as `[T]` in the added test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] Create empty buffer for a buffer specified in the C Data Interface with length zero [arrow-rs]
viirya commented on code in PR #8009: URL: https://github.com/apache/arrow-rs/pull/8009#discussion_r2236676096 ## .github/workflows/arrow.yml: ## @@ -68,7 +68,10 @@ jobs: - name: Test arrow-schema run: cargo test -p arrow-schema --all-features - name: Test arrow-array -run: cargo test -p arrow-array --all-features +run: | + cargo test -p arrow-array --all-features + # Disable feature `force_validate` + cargo test -p arrow-array Review Comment: Some tests are only run if `force_validate` is not enabled, I found that we don't run these tests in CI. Might be a problem if we silently break them. I enabled them here. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
