sunchao commented on a change in pull request #443:
URL: https://github.com/apache/arrow-rs/pull/443#discussion_r650588350
##########
File path: parquet/src/data_type.rs
##########
@@ -661,8 +661,15 @@ pub(crate) mod private {
_: &mut W,
bit_writer: &mut BitWriter,
) -> Result<()> {
+ if bit_writer.bytes_written() + values.len() >=
bit_writer.capacity() {
Review comment:
Seems here `values.len` is the number of bits to be written? should we
use `values.len() / 8`?
##########
File path: parquet/src/data_type.rs
##########
@@ -661,8 +661,15 @@ pub(crate) mod private {
_: &mut W,
bit_writer: &mut BitWriter,
) -> Result<()> {
+ if bit_writer.bytes_written() + values.len() >=
bit_writer.capacity() {
+ bit_writer.extend(256);
+ }
for value in values {
- bit_writer.put_value(*value as u64, 1);
+ if !bit_writer.put_value(*value as u64, 1) {
Review comment:
Yea, we can either do this or make sure up front that there's enough
capacity to write. One minor concern is putting the if branch inside the for
loop might hurt the performance.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]