malinjawi commented on code in PR #12131:
URL: https://github.com/apache/gluten/pull/12131#discussion_r3304804564
##########
gluten-delta/src/main/scala/org/apache/gluten/execution/DeltaScanTransformer.scala:
##########
@@ -55,16 +67,30 @@ case class DeltaScanTransformer(
   override lazy val fileFormat: ReadFileFormat = ReadFileFormat.ParquetReadFormat
 
+  private lazy val deltaDeletionVectorRegistration
+      : DeltaScanTransformer.DeletionVectorRegistration =
+    DeltaScanTransformer.registerDeletionVectorsFromFileFormat(relation)

Review Comment:
   Good point. I updated this in d06ec27bb to follow the same split-time handoff shape from #10740 more closely.

   The PR now removes the driver-side `DeltaDeletionVectorRegistry` and the relation-wide lazy DV registration from `DeltaScanTransformer`. DV payloads are materialized from each `PartitionedFile`'s Delta metadata when building split info, then passed to native through the external split payload buffers added in this PR.

   I kept the #12040-style external payload channel instead of embedding the serialized bitmap directly into Substrait like #10740 did, because #12040 introduced the native Delta split descriptor path that consumes payload buffers separately. This keeps the code simpler while avoiding large binary DV payloads in the Substrait plan.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
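For readers following the thread, the split-time handoff described in the review comment can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the PR: `FileEntry`, `SplitDescriptor`, `SplitInfo`, and `buildSplitInfo` are hypothetical stand-ins, and none of them are Gluten, Spark, or Delta APIs. The idea shown is that each file's DV bytes go into a side list of payload buffers, while the plan-side descriptor carries only an index into that list.

```scala
// Hypothetical stand-in for a per-file entry with optional deletion-vector bytes.
// In the PR this information comes from each PartitionedFile's Delta metadata.
final case class FileEntry(path: String, dvBytes: Option[Array[Byte]])

// Hypothetical plan-side descriptor: it references a payload by index
// and never embeds the serialized bitmap itself.
final case class SplitDescriptor(path: String, payloadIndex: Option[Int])

// Descriptors plus the external payload buffers that travel alongside them.
final case class SplitInfo(
    descriptors: Seq[SplitDescriptor],
    payloads: Seq[Array[Byte]])

object SplitPayloadSketch {
  // Materialize DV payloads per file at split-building time,
  // with no driver-side registry and no relation-wide state.
  def buildSplitInfo(files: Seq[FileEntry]): SplitInfo = {
    val payloads = Vector.newBuilder[Array[Byte]]
    var next = 0
    // toVector forces eager evaluation so the buffers are filled before result().
    val descriptors = files.toVector.map { file =>
      val index = file.dvBytes.map { bytes =>
        payloads += bytes
        val assigned = next
        next += 1
        assigned
      }
      SplitDescriptor(file.path, index)
    }
    SplitInfo(descriptors, payloads.result())
  }
}
```

Under this shape, a file without a deletion vector simply gets `payloadIndex = None`, and the size of the plan-side descriptors does not grow with the size of the bitmaps.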
