luoyuxia opened a new issue, #131:
URL: https://github.com/apache/paimon-rust/issues/131

   ## Parent Issue
   Part of #124 (support partitioned table)
   Depends on #126 (BinaryRow deserialization), #127 (partition path generation)
   
   ## Background
   
   Currently, `TableScan` and `TableRead` do not support partitioned tables:
   
   1. **`TableScan::plan_snapshot()`** discards partition info and builds incorrect bucket paths (`{base_path}/bucket-{bucket}` instead of `{base_path}/{partition_path}/bucket-{bucket}`)
   2. **`TableRead::to_arrow()`** explicitly rejects partitioned tables with an `Unsupported` error
   3. No integration tests for partitioned table reading
   
   ## What needs to be done
   
   ### 1. Fix TableScan to generate correct partition paths
   
   - Pass partition type info (from `TableSchema`) into `plan_snapshot()`, or 
change it to an instance method that can access `self.table.schema`
   - Decode partition bytes from `ManifestEntry` into `BinaryRow` using 
`BinaryRow::from_bytes(arity, data)`
   - Use partition path utils (from #127) to compute the partition path segment
   - Construct `bucket_path` as `{table_path}/{partition_path}/bucket-{bucket}`
   - Store the decoded `BinaryRow` in `DataSplit` instead of an empty `BinaryRow::new(0)`
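
The path construction described above could look roughly like the sketch below. It is standalone and does not use the real paimon-rust types: `partition_path` is a hypothetical stand-in for the utility from #127, the partition is passed as already-decoded `(key, value)` string pairs rather than a `BinaryRow`, and Hive-style `key=value` segments are an assumption.

```rust
/// Hypothetical stand-in for the partition path helper proposed in #127.
/// Assumes Hive-style `key=value` segments joined by `/`; the real utility
/// may escape values or handle nulls differently.
fn partition_path(partition: &[(&str, &str)]) -> String {
    partition
        .iter()
        .map(|(key, value)| format!("{key}={value}"))
        .collect::<Vec<_>>()
        .join("/")
}

/// Builds `{table_path}/{partition_path}/bucket-{bucket}`, falling back to
/// `{table_path}/bucket-{bucket}` when the table has no partition keys.
fn bucket_path(table_path: &str, partition: &[(&str, &str)], bucket: i32) -> String {
    // Avoid a doubled slash if the table path already ends with one.
    let base = table_path.trim_end_matches('/');
    let segment = partition_path(partition);
    if segment.is_empty() {
        format!("{base}/bucket-{bucket}")
    } else {
        format!("{base}/{segment}/bucket-{bucket}")
    }
}
```

Keeping the unpartitioned fallback in the same function means existing non-partitioned tables would keep their current `bucket_path` values.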
   
   ### 2. Remove partitioned table read restriction in TableRead
   
   - Remove the partition key check in `TableRead::to_arrow()` 
(`read_builder.rs:88-95`)
   - ArrowReader should work without changes once paths are correct
   
   ### 3. Add integration tests
   
   - Prepare test fixtures for partitioned tables (single and multiple 
partition keys)
   - Integration test: read a partitioned table end-to-end, verify DataSplits 
have correct `bucket_path` with partition segments
   - Verify partition column values are correct in returned RecordBatches
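
For the `bucket_path` check, a test could parse the partition segments back out of each path and compare them with the expected partition values. The helper below is a hypothetical sketch, not part of paimon-rust; it assumes `key=value` path segments and a numeric `bucket-N` directory name.

```rust
/// Extracts the `key=value` partition segments that sit between the table
/// root and the trailing `bucket-N` directory. Returns `None` if the path
/// does not have that shape. Hypothetical test helper.
fn partition_segments(table_path: &str, bucket_path: &str) -> Option<Vec<(String, String)>> {
    let root = table_path.trim_end_matches('/');
    let rest = bucket_path.strip_prefix(root)?.strip_prefix('/')?;
    let mut parts: Vec<&str> = rest.split('/').collect();
    // The last component must look like `bucket-<number>`.
    let bucket_dir = parts.pop()?;
    bucket_dir.strip_prefix("bucket-")?.parse::<u32>().ok()?;
    // Every remaining component must be a `key=value` pair.
    parts
        .into_iter()
        .map(|part| {
            let (key, value) = part.split_once('=')?;
            Some((key.to_string(), value.to_string()))
        })
        .collect()
}
```

An integration test could call this on each split's `bucket_path` and assert that the result matches the partition keys of the fixture table, in order.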
   
   ## Affected files
   - `crates/paimon/src/table/table_scan.rs` — `plan_snapshot()` method
   - `crates/paimon/src/table/read_builder.rs` — `TableRead::to_arrow()` method
   - `crates/integration_tests/tests/` — new test file(s)

