924060929 commented on PR #65135: URL: https://github.com/apache/doris/pull/65135#issuecomment-4875292359
### Design notes (for discussion, not blockers) Separating these from the bugs above since they're about the integration seams, not clear defects. The core direction is right: `position_deletes` is fundamentally a file scan (unlike the other sys tables, which are JVM-side `DataTask`s), so reading it natively in BE and reusing the Parquet/ORC readers + the deletion-vector helper is the correct call. - **Partition column is built at the wrong layer.** Rendering the delete file's own-spec partition to plan-time JSON and then matching it positionally against the sys table's *union* partition struct (`Partitioning.partitionType`) is exactly what produces the silent-`NULL` in #3. Coercing `PartitionData` to the table's unified partition type (as Iceberg's own metadata scan does) would be more robust than reconstructing per-file-spec JSON on the FE. - **Sys-table split planning isn't polymorphic.** `position_deletes` is special-cased by string compare in `doGetSystemTableSplits` plus a parallel `setIcebergPositionDeleteSysTableParams`. The next natively-planned sys table would need the same copy-paste (string check + `doGetXxxSplits` + `createXxxSysSplit` + `setXxxParams` + a new `TableFormatType` + a BE bypass). A hook on `SysTable`/`IcebergSysTable` would localize this. - **`IcebergSplit` has become a 3-way union.** Seven `positionDelete*` fields + a boolean discriminator now sit on the shared `IcebergSplit` alongside the data-split and JNI-serialized-split shapes. `positionDeleteContent` is a primitive `int` defaulting to `0`, which is the `DATA` content id — a split that sets the DV flag but forgets to set the content would be misread as a plain position delete with no error. A subclass (or a self-describing descriptor) would make the three shapes explicit. - **Thrift `TIcebergDeleteFileDesc.file_format`** is stamped for this path but deliberately left unset by the MOR delete path (and PUFFIN is relabeled `PARQUET` so the range lands on the duplicated `file_scanner` branch). Worth aligning so the same field means one thing regardless of which producer filled it. These are design opinions, not blockers — happy to defer to your and the committers' judgment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
