alamb opened a new issue, #9386:
URL: https://github.com/apache/arrow-rs/issues/9386

   Since this recurses, this could potentially blow out the stack with 
pathological inputs (e.g. a RecordBatch with 1M rows with a max_row_group_count 
of 1). I don't think it is necessary to fix now; I just wanted to point it out.
   
   _Originally posted by @alamb in 
https://github.com/apache/arrow-rs/pull/9357#discussion_r2790342725_
   
   
   Here is a reproducer (add to arrow/arrow_writer/mod.rs) that fails: the 
process aborts due to a stack overflow.
   
   ```rust
       #[test]
       fn test_row_group_limit_rows_only_pathological_stack_overflow_demo() {
           let schema = Arc::new(Schema::new(vec![Field::new(
               "int",
               ArrowDataType::Int32,
               false,
           )]));
           let array = Int32Array::from((0..1_000_000_i32).collect::<Vec<_>>());
           let batch = RecordBatch::try_new(schema.clone(), vec![Arc::new(array)]).unwrap();
   
           let props = WriterProperties::builder()
               .set_max_row_group_row_count(Some(1))
               .set_max_row_group_bytes(None)
               .build();
   
           let file = tempfile::tempfile().unwrap();
           let mut writer = ArrowWriter::try_new(file, schema, Some(props)).unwrap();
   
           // This currently recurses once per row-group split and can overflow
           // the stack.
           writer.write(&batch).unwrap();
       }
   ```
               
   
   
   The expected behavior is either an error or, ideally, a successfully 
written file.
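
For illustration, here is a minimal sketch of how the split could be planned with a loop instead of recursion, so that stack depth stays constant no matter how many row groups a batch spans. It uses only the standard library and is not the actual arrow-rs code: `plan_row_group_slices` and its parameters are hypothetical names, and it assumes the writer only needs `(offset, len)` pairs to slice the batch and that a full row group is flushed before more rows are buffered.

```rust
/// Hypothetical helper, not part of arrow-rs.
///
/// Plans how a batch of `num_rows` rows is cut into slices so that no row
/// group holds more than `max_rows` rows. `buffered` is the number of rows
/// already sitting in the in-progress row group.
///
/// Returns `(offset, len)` pairs in order. The work is done in a loop, so the
/// stack depth does not grow with the number of splits.
fn plan_row_group_slices(
    num_rows: usize,
    buffered: usize,
    max_rows: usize,
) -> Vec<(usize, usize)> {
    assert!(max_rows > 0, "max_rows must be non-zero");

    let mut slices = Vec::new();
    let mut offset = 0;
    let mut in_progress = buffered;

    while offset < num_rows {
        // Assumption: a full row group is flushed here, which resets the
        // number of buffered rows to zero.
        if in_progress >= max_rows {
            in_progress = 0;
        }

        // Take as many rows as still fit in the current row group.
        let len = (max_rows - in_progress).min(num_rows - offset);
        slices.push((offset, len));

        offset += len;
        in_progress += len;
    }

    slices
}
```

With the inputs from the reproducer above (1,000,000 rows and a limit of one row per row group), this yields 1,000,000 single-row slices and never recurses. In a real writer each pair would presumably drive a `RecordBatch::slice(offset, len)` call followed by a write and, when the row group is full, a flush.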

