mapleFU commented on PR #34632:
URL: https://github.com/apache/arrow/pull/34632#issuecomment-1483365469
It seems Java has a config, but users would not change it:
```java
/**
 * Config for delta binary packing
 */
class DeltaBinaryPackingConfig {
  final int blockSizeInValues;
  final int miniBlockNumInABlock;
  final int miniBlockSizeInValues;

  public DeltaBinaryPackingConfig(int blockSizeInValues, int miniBlockNumInABlock) {
    this.blockSizeInValues = blockSizeInValues;
    this.miniBlockNumInABlock = miniBlockNumInABlock;
    double miniSize = (double) blockSizeInValues / miniBlockNumInABlock;
    Preconditions.checkArgument(miniSize % 8 == 0, "miniBlockSize must be multiple of 8, but it's " + miniSize);
    this.miniBlockSizeInValues = (int) miniSize;
  }

  public static DeltaBinaryPackingConfig readConfig(InputStream in) throws IOException {
    return new DeltaBinaryPackingConfig(BytesUtils.readUnsignedVarInt(in),
        BytesUtils.readUnsignedVarInt(in));
  }

  public BytesInput toBytesInput() {
    return BytesInput.concat(
        BytesInput.fromUnsignedVarInt(blockSizeInValues),
        BytesInput.fromUnsignedVarInt(miniBlockNumInABlock));
  }
}
```
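For comparison, here is a minimal Rust sketch of what the Java config above does: it checks that the mini-block size is a multiple of 8 and serializes the two header fields as unsigned varints. The names `DeltaConfig` and `write_unsigned_varint` are hypothetical and come from neither parquet-mr nor arrow-rs. The sketch assumes `BytesUtils.readUnsignedVarInt` uses the usual ULEB128 layout (7 payload bits per byte, high bit as continuation flag).

```rust
/// Hypothetical Rust mirror of the Java `DeltaBinaryPackingConfig` quoted above.
#[derive(Debug, PartialEq)]
struct DeltaConfig {
    block_size: u32,
    num_mini_blocks: u32,
    mini_block_size: u32,
}

impl DeltaConfig {
    /// Rejects configurations whose mini-block size is not a whole multiple of 8.
    fn new(block_size: u32, num_mini_blocks: u32) -> Result<Self, String> {
        if num_mini_blocks == 0 || block_size % num_mini_blocks != 0 {
            return Err(format!(
                "block size {block_size} is not divisible into {num_mini_blocks} mini blocks"
            ));
        }
        let mini_block_size = block_size / num_mini_blocks;
        if mini_block_size % 8 != 0 {
            return Err(format!(
                "miniBlockSize must be multiple of 8, but it's {mini_block_size}"
            ));
        }
        Ok(DeltaConfig { block_size, num_mini_blocks, mini_block_size })
    }

    /// Serializes the two header fields in the same order as `toBytesInput`.
    fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        write_unsigned_varint(self.block_size, &mut out);
        write_unsigned_varint(self.num_mini_blocks, &mut out);
        out
    }
}

/// Assumed ULEB128 encoding: low 7 bits first, high bit set while more bytes follow.
fn write_unsigned_varint(mut value: u32, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7F) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}
```

With a block size of 128 and 4 mini blocks, this yields a mini-block size of 32 and a three-byte header. A block size of 100 with 4 mini blocks is rejected because 25 is not a multiple of 8.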
arrow-rs uses a fixed-size writer:
```rust
impl<T: DataType> DeltaBitPackEncoder<T> {
    /// Creates new delta bit packed encoder.
    pub fn new() -> Self {
        Self::assert_supported_type();

        // Size miniblocks so that they can be efficiently decoded
        let mini_block_size = match T::T::PHYSICAL_TYPE {
            Type::INT32 => 32,
            Type::INT64 => 64,
            _ => unreachable!(),
        };

        let num_mini_blocks = DEFAULT_NUM_MINI_BLOCKS;
        let block_size = mini_block_size * num_mini_blocks;
        assert_eq!(block_size % 128, 0);

        DeltaBitPackEncoder {
            page_header_writer: BitWriter::new(MAX_PAGE_HEADER_WRITER_SIZE),
            bit_writer: BitWriter::new(MAX_BIT_WRITER_SIZE),
            total_values: 0,
            first_value: 0,
            current_value: 0, // current value to keep adding deltas
            block_size, // can write fewer values than block size for last block
            mini_block_size,
            num_mini_blocks,
            values_in_block: 0, // will be at most block_size
            deltas: vec![0; block_size],
            _phantom: PhantomData,
        }
    }
```
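To make the fixed sizes concrete, here is a small standalone sketch of the same arithmetic. It assumes `DEFAULT_NUM_MINI_BLOCKS` is 4, since the snippet above only references the constant by name. `PhysicalType` and `block_layout` are hypothetical stand-ins, not arrow-rs APIs.

```rust
/// Assumed value; the quoted snippet does not show the constant's definition.
const DEFAULT_NUM_MINI_BLOCKS: usize = 4;

/// Hypothetical stand-in for the two physical types the encoder supports.
#[derive(Clone, Copy)]
enum PhysicalType {
    Int32,
    Int64,
}

/// Returns `(mini_block_size, block_size)` using the same rule as the quoted constructor.
fn block_layout(ty: PhysicalType) -> (usize, usize) {
    let mini_block_size = match ty {
        PhysicalType::Int32 => 32,
        PhysicalType::Int64 => 64,
    };
    let block_size = mini_block_size * DEFAULT_NUM_MINI_BLOCKS;
    // Same invariant as the quoted code: block size is a multiple of 128.
    assert_eq!(block_size % 128, 0);
    (mini_block_size, block_size)
}
```

Under that assumption, INT32 columns get blocks of 128 values and INT64 columns get blocks of 256 values, and neither is configurable by the caller.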
I think I can get this merged first, and then open an issue to make that
behavior configurable.