HaoYang670 commented on code in PR #2686:
URL: https://github.com/apache/arrow-rs/pull/2686#discussion_r966584235
##########
arrow/src/array/array_string.rs:
##########
@@ -164,8 +162,7 @@ impl<OffsetSize: OffsetSizeTrait> GenericStringArray<OffsetSize> {
.add_buffer(child_data.buffers()[0].slice(child_data.offset()))
.null_bit_buffer(v.data().null_buffer().cloned());
- let array_data = unsafe { builder.build_unchecked() };
- Self::from(array_data)
+ Self::from(builder.build().unwrap())
Review Comment:
Do we need to run the full data validation here?
I think we only need `validate_utf8`.
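To illustrate the narrower check I have in mind, here is a minimal std-only sketch. `validate_utf8_only` and its plain `offsets`/`values` slices are hypothetical stand-ins, not arrow-rs APIs; the real `ArrayData` layout is more involved.

```rust
use std::str;

/// Hypothetical helper (not an arrow-rs API): checks only that every value
/// delimited by consecutive `offsets` is valid UTF-8, and nothing else.
fn validate_utf8_only(offsets: &[usize], values: &[u8]) -> Result<(), String> {
    for (i, w) in offsets.windows(2).enumerate() {
        // `get` returns None for out-of-bounds or decreasing offsets.
        let bytes = values
            .get(w[0]..w[1])
            .ok_or_else(|| format!("offsets for value {i} are invalid"))?;
        // Validating each slice separately also rejects offsets that
        // split a multi-byte character.
        str::from_utf8(bytes).map_err(|e| format!("value {i} is not valid UTF-8: {e}"))?;
    }
    Ok(())
}
```

Because each slice is validated on its own, a buffer that is valid UTF-8 as a whole still fails if an offset lands in the middle of a multi-byte character.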
##########
arrow/src/array/array_string.rs:
##########
@@ -352,8 +349,7 @@ impl<OffsetSize: OffsetSizeTrait> From<GenericBinaryArray<OffsetSize>>
{
fn from(v: GenericBinaryArray<OffsetSize>) -> Self {
Review Comment:
It would be better to add a comment here, such as "This conversion is slow
because it validates that every value is UTF-8. Use `from_list_unchecked` if
you can guarantee that the values are valid UTF-8 strings". (I am not sure
whether we can add docs to a `From` impl.)
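On the docs question: as far as I know, Rust accepts doc comments on a trait impl block and on its methods, and rustdoc renders them. A minimal sketch with toy types (`RawBytes` and `Text` are hypothetical, not arrow-rs types):

```rust
/// Toy wrappers; hypothetical names, not arrow-rs types.
pub struct RawBytes(pub Vec<u8>);
pub struct Text(pub String);

/// Converts raw bytes into text, validating UTF-8 along the way.
///
/// The validation is linear in the byte length, so this conversion is
/// not free. This comment sits on the impl block itself.
impl From<RawBytes> for Text {
    /// # Panics
    ///
    /// Panics if the bytes are not valid UTF-8.
    fn from(v: RawBytes) -> Self {
        Text(String::from_utf8(v.0).expect("RawBytes must hold valid UTF-8"))
    }
}
```

The same placement should work for the `From<GenericBinaryArray<OffsetSize>>` impl, so the cost and the panic condition can be documented where users will look for them.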
##########
arrow/src/array/array_string.rs:
##########
@@ -352,8 +349,7 @@ impl<OffsetSize: OffsetSizeTrait> From<GenericBinaryArray<OffsetSize>>
{
fn from(v: GenericBinaryArray<OffsetSize>) -> Self {
let builder = v.into_data().into_builder().data_type(Self::DATA_TYPE);
- let data = unsafe { builder.build_unchecked() };
- Self::from(data)
+ Self::from(builder.build().unwrap())
Review Comment:
We could directly use the `from_list` function.
##########
arrow/src/array/array_string.rs:
##########
@@ -129,8 +129,6 @@ impl<OffsetSize: OffsetSizeTrait> GenericStringArray<OffsetSize> {
}
/// Convert a list array to a string array.
- /// This method is unsound because it does
- /// not check the utf-8 validation for each element.
fn from_list(v: GenericListArray<OffsetSize>) -> Self {
Review Comment:
It would be better to preserve the unchecked implementation as an `unsafe`
function and rename it `from_list_unchecked`.
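Roughly the shape I am suggesting, as a std-only sketch. `ToyStringArray`, `from_parts`, and `from_parts_unchecked` are hypothetical names used for illustration and do not mirror the actual arrow-rs signatures.

```rust
/// Toy stand-in for a string array: consecutive `offsets` delimit the
/// values inside `data`. Hypothetical type, not an arrow-rs API.
pub struct ToyStringArray {
    offsets: Vec<usize>,
    data: Vec<u8>,
}

impl ToyStringArray {
    /// Checked constructor: validates every value as UTF-8 (linear cost).
    pub fn from_parts(offsets: Vec<usize>, data: Vec<u8>) -> Result<Self, String> {
        for (i, w) in offsets.windows(2).enumerate() {
            let bytes = data
                .get(w[0]..w[1])
                .ok_or_else(|| format!("offsets for value {i} are invalid"))?;
            std::str::from_utf8(bytes).map_err(|e| format!("value {i}: {e}"))?;
        }
        Ok(Self { offsets, data })
    }

    /// Unchecked constructor: skips validation entirely.
    ///
    /// # Safety
    ///
    /// The caller must guarantee that the offsets are non-decreasing and in
    /// bounds, and that every delimited value is valid UTF-8.
    pub unsafe fn from_parts_unchecked(offsets: Vec<usize>, data: Vec<u8>) -> Self {
        Self { offsets, data }
    }

    /// Returns value `i` as a string slice; panics if `i` is out of range.
    pub fn value(&self, i: usize) -> &str {
        let bytes = &self.data[self.offsets[i]..self.offsets[i + 1]];
        // SAFETY: the checked constructor verified UTF-8, and the unchecked
        // constructor makes it the caller's obligation.
        unsafe { std::str::from_utf8_unchecked(bytes) }
    }
}
```

Keeping both variants makes the safe path the default while still letting callers who already hold validated data skip the extra pass, with the obligation spelled out in the `# Safety` section.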