kawadakk opened a new issue, #4549: URL: https://github.com/apache/arrow-rs/issues/4549
**Describe the bug**

`FixedSizeListBuilder::new` allocates extraneous capacity for the validity buffer.

https://github.com/apache/arrow-rs/blob/72cafde586af831d911473c6d1bbd56d2482cfdb/arrow-array/src/builder/fixed_size_list_builder.rs#L73-L78

Line 77 specifies to pre-allocate `capacity` (`values_builder.len()`) elements when in fact only `values_builder.len() / value_length` elements are necessary in the validity buffer to cover all existing values in `values_builder`.

**To Reproduce**

```rust
use std::{
    alloc::GlobalAlloc,
    sync::atomic::{AtomicUsize, Ordering},
};

fn main() {
    let mut el_builder = arrow::array::UInt8Builder::with_capacity(1024 * 1024);
    for _ in 0..1024 * 1024 {
        el_builder.append_value(0);
    }

    MAX.store(0, Ordering::Relaxed); // ignore the allocation for `UInt8Builder`
    let mut builder = arrow::array::FixedSizeListBuilder::new(el_builder, 1024);
    for _ in 0..1024 * 1024 / 1024 {
        builder.append(false);
    }

    // Check the allocation size of `FixedSizeListBuilder` validity buffer
    assert!(dbg!(MAX.load(Ordering::Relaxed)) <= 1024 * 1024 / 1024 / 8 + arrow::alloc::ALIGNMENT);
}

struct Alloc(std::alloc::System);

#[global_allocator]
static _A: Alloc = Alloc(std::alloc::System);

static MAX: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Alloc {
    unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
        // Remember largest allocation
        MAX.fetch_max(dbg!(layout.size()), Ordering::Relaxed);
        self.0.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
        self.0.dealloc(ptr, layout)
    }
}
```

**Expected behavior**
<!-- A clear and concise description of what you expected to happen. -->

**Additional context**
<!-- Add any other context about the problem here. -->

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
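For reference, the capacity arithmetic from the bug description can be sketched without depending on `arrow` at all. This is only an illustration: `validity_slots` and `bitmap_bytes` are hypothetical helpers, not arrow-rs APIs, `usize` is assumed for both lengths, and any rounding the real buffer applies for alignment is ignored.

```rust
/// Hypothetical helper (not an arrow-rs API): number of list slots, and so
/// validity bits, needed to cover `values_len` already-appended child values.
fn validity_slots(values_len: usize, value_length: usize) -> usize {
    assert!(value_length > 0, "value_length must be non-zero");
    values_len / value_length
}

/// Hypothetical helper: bytes needed for a bitmap with one bit per slot.
/// Alignment padding of the real buffer is deliberately ignored here.
fn bitmap_bytes(bits: usize) -> usize {
    (bits + 7) / 8
}

fn main() {
    // Same numbers as the reproduction: 1 Mi child values, lists of 1024.
    let values_len = 1024 * 1024;
    let value_length = 1024;

    // Reported behavior: one validity bit reserved per child value.
    let reported = bitmap_bytes(values_len);
    // Behavior argued for in the description: one validity bit per list slot.
    let needed = bitmap_bytes(validity_slots(values_len, value_length));

    println!("reserved as reported: {reported} bytes, needed: {needed} bytes");
}
```

With the reproduction's numbers this works out to 131072 bytes reserved versus 128 bytes needed, which is the gap the assertion in the reproduction is meant to catch.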
