Re: [PR] chore: increase row count and batch size for more deterministic tests [arrow-rs]

via GitHub Tue, 06 Jan 2026 13:48:10 -0800


alamb commented on code in PR #9088:
URL: https://github.com/apache/arrow-rs/pull/9088#discussion_r2666399032



##########
arrow-json/benches/serde.rs:
##########
@@ -22,12 +22,15 @@ use rand::{Rng, rng};
 use serde::Serialize;
 use std::sync::Arc;
 
+const ROWS: usize = 1 << 18;
+
 #[allow(deprecated)]
 fn do_bench<R: Serialize>(c: &mut Criterion, name: &str, rows: &[R], schema: &Schema) {
     let schema = Arc::new(schema.clone());
+    let batch_size = rows.len();
     c.bench_function(name, |b| {
         b.iter(|| {
-            let builder = ReaderBuilder::new(schema.clone()).with_batch_size(64);
+            let builder = ReaderBuilder::new(schema.clone()).with_batch_size(batch_size);

Review Comment:
   I think a batch size of 256K rows (2**18) is also too big. Can we use 4K or 8K
rows instead? I think that would be more realistic.
   
   Parsing 256K rows in total does make sense to me, though.
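
   To illustrate the distinction between total rows parsed and rows per batch, here is a minimal std-only sketch of the arithmetic. `ROWS` matches the constant in the diff; `BATCH_SIZE` and `batch_lengths` are hypothetical names that do not appear in the PR, and 8192 is just one of the two sizes suggested above.

```rust
// ROWS comes from the diff above; BATCH_SIZE is an assumed value for illustration.
const ROWS: usize = 1 << 18; // 262,144 rows parsed per benchmark iteration
const BATCH_SIZE: usize = 8192; // rows per emitted batch (hypothetical choice)

/// Returns the length of each batch produced when `rows` rows are split
/// into consecutive batches of at most `batch_size` rows.
/// Hypothetical helper, not part of arrow-json.
fn batch_lengths(rows: usize, batch_size: usize) -> Vec<usize> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut lengths = Vec::new();
    let mut remaining = rows;
    while remaining > 0 {
        // The last batch may be shorter than batch_size.
        let n = remaining.min(batch_size);
        lengths.push(n);
        remaining -= n;
    }
    lengths
}

fn main() {
    let lengths = batch_lengths(ROWS, BATCH_SIZE);
    println!(
        "{} rows -> {} batches of up to {} rows",
        ROWS,
        lengths.len(),
        BATCH_SIZE
    );
}
```

   Under these assumptions the benchmark would still parse all 256K rows per iteration, but as 32 batches of 8K rows (or 64 batches at 4K) rather than as one 256K-row batch.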



