a10y commented on code in PR #6168:
URL: https://github.com/apache/arrow-rs/pull/6168#discussion_r1700817315


##########
arrow/src/util/bench_util.rs:
##########
@@ -160,6 +160,34 @@ pub fn create_string_array_with_len<Offset: OffsetSizeTrait>(
         .collect()
 }
 
+/// Creates a random (but fixed-seeded) string view array of a given size and null density.
+///
+/// See `create_string_array` above for more details.
+pub fn create_string_view_array(size: usize, null_density: f32) -> StringViewArray {
+    create_string_view_array_with_max_len(size, null_density, 400)

Review Comment:
   Yep, that's a good point. I do think the change will be more impactful for 
outlined strings than for inlined ones, since validation time is linear in the 
length of the string.
   
   That said, even for the relatively short strings in TPC-H, the validation 
step was really significant (~22% of execution time).
   
   I have a flamegraph for q10 in 
https://github.com/spiraldb/vortex/pull/476#issuecomment-2261686919. 
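
   To make the inlined vs. outlined point concrete, here is a minimal, std-only sketch. It is not arrow-rs code: `is_inlined`, `bytes_to_validate`, and `validate_all` are hypothetical helpers written for illustration, and the only Arrow fact assumed is that a view stores values of at most 12 bytes inline. It shows that the number of bytes a UTF-8 validator has to scan grows with string length, which is why longer (outlined) values should benefit more.

```rust
use std::str;

/// In the Arrow view layout, values of at most 12 bytes live inside the
/// 16-byte view itself; longer values are "outlined" into a data buffer.
const MAX_INLINE_LEN: usize = 12;

/// Hypothetical helper: would a value of this byte length be inlined?
fn is_inlined(len: usize) -> bool {
    len <= MAX_INLINE_LEN
}

/// Hypothetical helper: total bytes a UTF-8 validator has to scan.
/// Validation cost grows linearly with this number.
fn bytes_to_validate(values: &[&[u8]]) -> usize {
    values.iter().map(|v| v.len()).sum()
}

/// Hypothetical helper: validates every value and returns the index of the
/// first invalid one, if any.
fn validate_all(values: &[&[u8]]) -> Result<(), usize> {
    for (i, v) in values.iter().enumerate() {
        if str::from_utf8(v).is_err() {
            return Err(i);
        }
    }
    Ok(())
}

fn main() {
    // 400 mirrors the max length passed in the diff above (illustrative only).
    let long = "x".repeat(400);
    let values: [&[u8]; 2] = [b"BUILDING", long.as_bytes()];

    for v in values.iter() {
        println!("len = {:3}, inlined = {}", v.len(), is_inlined(v.len()));
    }
    println!("bytes to validate: {}", bytes_to_validate(&values));
    println!("all valid: {}", validate_all(&values).is_ok());
}
```

   The short value fits in the view and costs almost nothing to validate, while the 400-byte value dominates the scanned bytes. Skipping that scan when the data is already known to be valid is where the savings would come from.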


