schenksj commented on PR #4532:
URL: 
https://github.com/apache/datafusion-comet/pull/4532#issuecomment-4827359543

   Good question. Short version: for a normal Comet-native scan of a 
partitioned table this code never runs — partition values aren't 
`ConstantColumnVector` on that path. Comet's native scan serializes partition 
values into the native plan and DataFusion expands them into Arrow columns 
natively (`CometNativeScan.scala:169-191`, `partition2Proto`), so the 1M-batch 
partitioned-scan case you describe incurs zero cost from this change.
   
   **Why Delta hits this at all (the motivating case):** contrib-delta 
accelerates Delta reads natively — partition values, column-mapping, and 
row-tracking are all synthesized *inside* the kernel/native scan, so its output 
is pure Arrow and never contains a `ConstantColumnVector`. But certain Delta 
*write* rewrites — OPTIMIZE bin-packing and the deletion-vector 
DELETE/UPDATE/MERGE rewrites — need read semantics the native scan can't 
faithfully reproduce. For example, with 
`deletionVectors.useMetadataRowIndex=false` the rewrite requests two 
distinctly-named row-index columns and native can only synthesize one; the 
inverted row-index-filter semantics and the column-mapping-materialized 
`row_commit_version` similarly can't be reconstructed from kernel enumeration. 
For exactly those reads the contrib *deliberately declines* native acceleration 
and hands the scan back to vanilla Spark's `FileSourceScanExec` — that's the 
"Comet → plain" step, and it's a correctness decision, not
  an oversight. Spark's vectorized reader then emits the partition and 
Delta-synthetic columns (`__delta_internal_is_row_deleted`, the row-index, 
row-tracking ids) as `ConstantColumnVector`s.
   
   The rest of the rewrite plan *above* that one declined scan is still 
Comet-accelerated — the repartition/bin-pack and the columnar-to-row step that 
feeds Delta's writer — so the constant-bearing Spark batch crosses *back* into 
a Comet operator (the "plain → Comet" step). That's where it reaches 
`getBatchFieldVectors` (shuffle/broadcast serialize) or 
`NativeUtil.exportBatch` (FFI), which previously only handled `CometVector` and 
threw "Comet execution only takes Arrow Arrays". So the round trip isn't Comet 
leaking constants out of native execution — it's a single deliberately-declined 
scan sandwiched in an otherwise-native plan, with Comet operators on both sides 
of it.
   
   So this materialization is per-batch, O(numRows) per constant column, but 
only in that bridging window. I looked at whether a cheaper encoding could 
avoid the expansion there, and it can't without new native work: the FFI export 
side requires a real Arrow `FieldVector`, and the native import side normalizes 
everything to materialized arrays — dictionaries are proactively unpacked 
(`scan.rs:99-100`) and RunEndEncoded isn't supported at all, with no 
scalar-backed column type crossing FFI. A dictionary would just get unpacked on 
entry (`scan.rs:161`), erasing the saving. And on `main` this case isn't 
cheaper — it throws `SparkException`. So it's correctness vs. crash on a 
non-hot path, not a per-batch regression on scans. Happy to file a follow-up if 
we ever want a true constant/REE column across FFI, but that's a sizable native 
feature.
   
   (Disclosure: this analysis and reply were written with help from AI — Claude 
Code.)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to