bjchambers opened a new issue #657:
URL: https://github.com/apache/arrow-rs/issues/657
**Describe the bug**
When writing a Parquet file with a boolean column, I receive the error
`EOF: unable to put boolean value`.
**To Reproduce**
I don't yet have a minimal repro. In my case, I was writing batches of ~100k
rows containing a uint64 column and a boolean column. The boolean column
included null values.
**Expected behavior**
The boolean column to be correctly written.
**Additional context**
Based on debugging, I believe the problem may be the method used to extend
the array:
```
if bit_writer.bytes_written() + values.len() / 8 >= bit_writer.capacity() {
    bit_writer.extend(256);
}
```
When my case fails, `values.len()` is 27054. I suspect that even *after*
adding 256 bytes there isn't enough room for all of the values (256 bytes is
only 2048 bits). It would actually need to be extended by 3382 bytes
(27054 / 8, rounded up) to have enough capacity for all of those values.
It should probably be extended by something like
`max(256, values.len() / 8)` (with the division rounded up), or by
`values.len() / 8` rounded up to a power of 2.
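
To make the arithmetic concrete, here is a minimal sketch of how the growth amount could be computed. `extra_bytes_needed` is a hypothetical helper written for illustration only; it is not part of the parquet crate, and it assumes one bit per boolean value and a buffer described only by its bytes written and its capacity.

```rust
/// Hypothetical helper (not a parquet crate API): how many bytes a bit buffer
/// should grow by so that `num_values` booleans, at one bit each, will fit.
fn extra_bytes_needed(bytes_written: usize, capacity: usize, num_values: usize) -> usize {
    // Round the bit count up to whole bytes: 27054 bits need 3382 bytes.
    let needed = (num_values + 7) / 8;
    // Space still available in the buffer.
    let free = capacity.saturating_sub(bytes_written);
    if needed <= free {
        // Everything already fits, no growth required.
        0
    } else {
        // Grow by at least 256 bytes, and by enough to hold every value.
        (needed - free).max(256)
    }
}
```

With a full 256-byte buffer, `extra_bytes_needed(256, 256, 27054)` returns 3382, which matches the figure above. Keeping 256 as a floor preserves the existing growth step for small writes while never under-allocating for large batches.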