malinjawi commented on code in PR #12131:
URL: https://github.com/apache/gluten/pull/12131#discussion_r3304804564


##########
gluten-delta/src/main/scala/org/apache/gluten/execution/DeltaScanTransformer.scala:
##########
@@ -55,16 +67,30 @@ case class DeltaScanTransformer(
 
   override lazy val fileFormat: ReadFileFormat = ReadFileFormat.ParquetReadFormat
 
+  private lazy val deltaDeletionVectorRegistration
+      : DeltaScanTransformer.DeletionVectorRegistration =
+    DeltaScanTransformer.registerDeletionVectorsFromFileFormat(relation)

Review Comment:
   Good point. I updated this in d06ec27bb to follow the split-time handoff 
shape from #10740 more closely.
   
   The PR now removes the driver-side `DeltaDeletionVectorRegistry` and the 
relation-wide lazy DV registration from `DeltaScanTransformer`. DV payloads are 
materialized from each `PartitionedFile`'s Delta metadata when building split 
info, then passed to native through the external split payload buffers added in 
this PR.
   
   I kept the #12040-style external payload channel rather than embedding the 
serialized bitmap directly in Substrait as #10740 did, because #12040 
introduced the native Delta split descriptor path, which consumes payload 
buffers separately. Reusing that path keeps the code simpler and keeps large 
binary DV payloads out of the Substrait plan.
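   
   For illustration, the split-time handoff has roughly the shape below. This 
is a simplified sketch, not the code in the PR: `FileWithMetadata`, 
`SplitEntry`, `SplitInfo`, `buildSplitInfo`, and the `serializedDeletionVector` 
metadata key are hypothetical stand-ins for `PartitionedFile`, the split 
descriptor, and the external payload buffers.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for a PartitionedFile carrying per-file Delta metadata.
final case class FileWithMetadata(path: String, metadata: Map[String, Any])

// Hypothetical split descriptor entry: the plan side only holds an index
// into the payload list, never the DV bytes themselves.
final case class SplitEntry(path: String, dvPayloadIndex: Option[Int])

// Hypothetical split info: descriptors plus the out-of-band payload buffers.
final case class SplitInfo(entries: Seq[SplitEntry], payloadBuffers: Seq[ByteBuffer])

object SplitTimeDvHandoff {
  // Assumed metadata key; the real key name may differ.
  val DvKey = "serializedDeletionVector"

  // Materialize DV payloads per file while building split info, so no
  // driver-side registry or relation-wide lazy registration is needed.
  def buildSplitInfo(files: Seq[FileWithMetadata]): SplitInfo = {
    val start = (Vector.empty[SplitEntry], Vector.empty[ByteBuffer])
    val (entries, payloads) = files.foldLeft(start) {
      case ((es, ps), file) =>
        file.metadata.get(DvKey) match {
          case Some(bytes: Array[Byte]) =>
            // The file has a DV: append its bytes and record the buffer index.
            val buffer = ByteBuffer.wrap(bytes).asReadOnlyBuffer()
            (es :+ SplitEntry(file.path, Some(ps.size)), ps :+ buffer)
          case _ =>
            // No DV for this file: no payload buffer is added.
            (es :+ SplitEntry(file.path, None), ps)
        }
    }
    SplitInfo(entries, payloads)
  }
}
```

   The point of the sketch is only that the descriptor and the binary payload 
travel separately, so the plan stays small regardless of DV size.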
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

