malinjawi commented on code in PR #12131:
URL: https://github.com/apache/gluten/pull/12131#discussion_r3304804564
##########
gluten-delta/src/main/scala/org/apache/gluten/execution/DeltaScanTransformer.scala:
##########
@@ -55,16 +67,30 @@ case class DeltaScanTransformer(
override lazy val fileFormat: ReadFileFormat =
ReadFileFormat.ParquetReadFormat
+ private lazy val deltaDeletionVectorRegistration
+ : DeltaScanTransformer.DeletionVectorRegistration =
+ DeltaScanTransformer.registerDeletionVectorsFromFileFormat(relation)
Review Comment:
Thanks for pointing that out, @zhztheplayer.
I updated this in d06ec27bb to follow the same split-time handoff shape
from #10740 more closely.
The PR now removes the driver-side `DeltaDeletionVectorRegistry` and the
relation-wide lazy DV registration from `DeltaScanTransformer`. DV payloads are
materialized from each `PartitionedFile`'s Delta metadata when building split
info, then passed to the native side through the external split payload buffers
added in this PR.
I kept the #12040-style external payload channel instead of embedding the
serialized bitmap directly into Substrait as #10740 did, because #12040
introduced the native Delta split descriptor path that consumes payload buffers
separately. Reusing that path keeps the code simpler and keeps large binary DV
payloads out of the Substrait plan.
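For illustration only, here is a minimal sketch of the split-time handoff shape described above. It is not the code in this PR: `FileEntry`, `SplitInfo`, `SplitTimeDvHandoff`, and the slot-index convention are hypothetical stand-ins, and `FileEntry` only approximates a `PartitionedFile` whose Delta metadata may carry a serialized DV. The point is that each file's DV bytes are collected while building split info and travel in side buffers, so the plan-facing part only needs a small slot index per file.

```scala
// Hypothetical stand-in for a PartitionedFile plus its Delta metadata.
// serializedDv is None when the file has no deletion vector.
final case class FileEntry(path: String, serializedDv: Option[Array[Byte]])

// Hypothetical split info: paths and per-file slot indices are small and
// plan-facing; payloadBuffers are handed to native separately.
final case class SplitInfo(
    paths: Seq[String],
    dvSlot: Seq[Int],
    payloadBuffers: Seq[Array[Byte]])

object SplitTimeDvHandoff {
  // Assumed sentinel meaning "this file has no DV payload".
  val NoDv: Int = -1

  def buildSplitInfo(files: Seq[FileEntry]): SplitInfo = {
    val buffers = Vector.newBuilder[Array[Byte]]
    var nextSlot = 0
    // toVector forces strict evaluation so slots and buffers stay in sync.
    val slots = files.toVector.map { file =>
      file.serializedDv match {
        case Some(bytes) =>
          buffers += bytes
          val slot = nextSlot
          nextSlot += 1
          slot
        case None => NoDv
      }
    }
    SplitInfo(files.map(_.path), slots, buffers.result())
  }
}
```

With this shape no relation-wide registry is needed on the driver: a file without a DV gets the sentinel slot, and a file with a DV points at its own buffer.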
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]