tustvold commented on issue #6839:
URL: https://github.com/apache/arrow-rs/issues/6839#issuecomment-2521731031

   The issue is that `GenericColumnWriter::memory_size` does not account for the
`data_pages` it has buffered while waiting for the dictionary page to be flushed.
This should be a relatively straightforward matter of changing it to:
   
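   ```rust
   pub(crate) fn memory_size(&self) -> usize {
       // Data pages buffered until the dictionary page is flushed
       self.data_pages.iter().map(|x| x.data().len()).sum::<usize>()
           // Bytes already written out for this column
           + self.column_metrics.total_bytes_written as usize
           // The encoder's own estimate of the memory it holds
           + self.encoder.estimated_memory_size()
   }
   ```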
   
   And adding an appropriate test.
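
   As a rough illustration of what such a test could check, here is a minimal,
   self-contained sketch. It does not use the real parquet crate types:
   `ToyColumnWriter`, `BufferedPage`, `buffer_page`, `flush_pages`, and
   `encoder_estimate` are all hypothetical stand-ins, and the real writer's
   flushing behaviour is more involved. The property it demonstrates is that
   buffered pages are counted before they are flushed, and that flushing does
   not change the reported total.

   ```rust
   use std::collections::VecDeque;

   /// Hypothetical stand-in for a compressed data page held by the writer.
   struct BufferedPage {
       data: Vec<u8>,
   }

   /// Hypothetical, simplified model of a column writer. This is NOT the
   /// real `GenericColumnWriter`; it only mirrors the three terms summed
   /// in the proposed `memory_size`.
   struct ToyColumnWriter {
       /// Pages held back until the dictionary page is flushed.
       data_pages: VecDeque<BufferedPage>,
       /// Bytes already handed to the underlying page writer.
       total_bytes_written: u64,
       /// Stand-in for `encoder.estimated_memory_size()`.
       encoder_estimate: usize,
   }

   impl ToyColumnWriter {
       fn new() -> Self {
           ToyColumnWriter {
               data_pages: VecDeque::new(),
               total_bytes_written: 0,
               encoder_estimate: 0,
           }
       }

       /// Buffers a page instead of writing it immediately.
       fn buffer_page(&mut self, data: Vec<u8>) {
           self.data_pages.push_back(BufferedPage { data });
       }

       /// Drains every buffered page into the "written" counter.
       fn flush_pages(&mut self) {
           while let Some(page) = self.data_pages.pop_front() {
               self.total_bytes_written += page.data.len() as u64;
           }
       }

       /// Same shape as the proposed fix: buffered + written + encoder.
       fn memory_size(&self) -> usize {
           self.data_pages.iter().map(|p| p.data.len()).sum::<usize>()
               + self.total_bytes_written as usize
               + self.encoder_estimate
       }
   }
   ```

   In this toy model, buffering a 100-byte page and a 50-byte page should make
   `memory_size()` report 150 both before and after `flush_pages()`. Without
   the first term of the sum, it would report 0 until the flush.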


