luoyuxia opened a new issue, #128:
URL: https://github.com/apache/paimon-rust/issues/128
## Parent Issue
Part of #124 (support partitioned table)
Depends on #126 (BinaryRow deserialization), #127 (partition path generation)
## Background
`TableScan::plan_snapshot()` currently discards partition information when
building `DataSplit`s:
```rust
// table_scan.rs:154
for ((_partition, bucket), group_entries) in groups {
    // ...
    // table_scan.rs:171-173
    // todo: consider partitioned table
    let bucket_path = format!("{base_path}/bucket-{bucket}");
    let partition = BinaryRow::new(0); // Always empty!
}
```
For partitioned tables, the correct path should be
`{table_path}/{partition_path}/bucket-{bucket}`, e.g.,
`{table_path}/dt=2024-01-01/bucket-0/`.
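
A minimal sketch of the intended path layout, for illustration only. `bucket_path` here is a hypothetical free function, not an existing API in the crate, and it assumes the partition path segment (e.g. `dt=2024-01-01`) has already been rendered by the utility from #127. An empty segment is treated as an unpartitioned table, which keeps the current behavior.

```rust
/// Hypothetical helper (not part of paimon-rust): joins the table path,
/// an already-rendered partition path segment, and the bucket directory.
/// An empty `partition_path` is treated as "unpartitioned table".
fn bucket_path(table_path: &str, partition_path: &str, bucket: i32) -> String {
    // Normalize slashes so callers may pass "t/" or "dt=2024-01-01/".
    let base = table_path.trim_end_matches('/');
    let partition = partition_path.trim_matches('/');
    if partition.is_empty() {
        // Same layout the scan produces today.
        format!("{base}/bucket-{bucket}")
    } else {
        format!("{base}/{partition}/bucket-{bucket}")
    }
}
```

For example, calling `bucket_path("/warehouse/db/t", "dt=2024-01-01", 0)` yields `/warehouse/db/t/dt=2024-01-01/bucket-0`.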
## What needs to be done
1. **Pass partition type info to `plan_snapshot()`**
   - Add partition keys (names) and partition field types (from
     `TableSchema`) as parameters, or pass the `TableSchema` itself
   - Alternatively, change `plan_snapshot()` from a static method to an
     instance method that can access `self.table.schema`
2. **Decode partition bytes into `BinaryRow`**
   - For each group key `(partition_bytes, bucket)`, construct a `BinaryRow`
     from the raw bytes using `BinaryRow::from_bytes(arity, data)`
   - The arity is the number of partition keys
3. **Generate partition path using `PartitionPathUtils`**
   - Call the partition path utility (from #127) to compute the partition
     path segment
   - Construct `bucket_path` as
     `{table_path}/{partition_path}/bucket-{bucket}`
4. **Store actual partition data in `DataSplit`**
   - Pass the decoded `BinaryRow` (with real data) to
     `DataSplitBuilder.with_partition()` instead of the empty `BinaryRow::new(0)`
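
Steps 2 to 4 could be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the actual implementation: `RowStub` and `SplitStub` are hypothetical stand-ins for `BinaryRow` and `DataSplit`, `plan_splits` is a made-up name, and the partition path utility from #127 is modeled as a closure because its real signature is not defined here. Only the `from_bytes(arity, data)` shape is taken from this issue.

```rust
use std::collections::BTreeMap;

/// Stand-in for `BinaryRow` (assumption): keeps only arity and raw bytes.
#[derive(Debug, Clone, PartialEq)]
struct RowStub {
    arity: usize,
    data: Vec<u8>,
}

impl RowStub {
    fn from_bytes(arity: usize, data: Vec<u8>) -> Self {
        Self { arity, data }
    }
}

/// Stand-in for `DataSplit` (assumption): only the fields relevant here.
#[derive(Debug, PartialEq)]
struct SplitStub {
    partition: RowStub,
    bucket: i32,
    bucket_path: String,
    files: Vec<String>,
}

/// Builds one split per `(partition_bytes, bucket)` group.
/// `render_partition_path` models the utility from #127.
fn plan_splits<F>(
    table_path: &str,
    partition_keys: &[String],
    groups: BTreeMap<(Vec<u8>, i32), Vec<String>>,
    render_partition_path: F,
) -> Vec<SplitStub>
where
    F: Fn(&RowStub) -> String,
{
    // Step 2: the arity is the number of partition keys.
    let arity = partition_keys.len();
    groups
        .into_iter()
        .map(|((partition_bytes, bucket), files)| {
            let partition = RowStub::from_bytes(arity, partition_bytes);
            // Step 3: unpartitioned tables keep the existing layout.
            let bucket_path = if arity == 0 {
                format!("{table_path}/bucket-{bucket}")
            } else {
                let partition_path = render_partition_path(&partition);
                format!("{table_path}/{partition_path}/bucket-{bucket}")
            };
            // Step 4: keep the decoded partition instead of an empty row.
            SplitStub { partition, bucket, bucket_path, files }
        })
        .collect()
}
```

Taking the path renderer as a parameter keeps this sketch independent of how #126 and #127 end up exposing their APIs; in the real change the call would presumably go directly to the partition path utility.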
## Affected files
- `crates/paimon/src/table/table_scan.rs` — `plan_snapshot()` method