iemejia opened a new pull request, #12390:
URL: https://github.com/apache/gluten/pull/12390

   ## What changes were proposed in this pull request?
   
   Optimize the Delta Lake integration's planning-time performance, targeting 
two hot paths: DV (Deletion Vector) materialization on the driver and 
post-transform rule application.
   
   ### DV Materialization (`DeltaDeletionVectorScanInfo.normalize`)
   
   - **Cache table path resolution**: resolve once per partition instead of 
once per file. For a partition with N files, this eliminates N-1 redundant 
`FileSystem.exists()` calls (HTTP HEAD requests on object stores).
   - **Cache Hadoop Configuration**: create one instance per partition instead 
of deep-cloning it for every file.
   - **Read raw DV bytes directly**: for on-disk DVs, read the raw Portable 
Roaring Bitmap bytes via `DeletionVectorStore.readRangeFromStream` (with 
checksum verification) instead of deserializing into Java Roaring objects and 
re-serializing. The on-disk format already matches what Velox expects.
   - **(delta40)** Cache reflective method lookup for `parseDescriptor` in a 
`lazy val`.
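
   The per-partition caching idea can be sketched as follows. This is a 
   minimal illustration, not the code in this PR: `DvFile`, 
   `PathResolutionSketch`, and `resolveTablePath` are hypothetical names, and 
   the counter stands in for the `FileSystem.exists()` round trips.

   ```scala
   // Hypothetical stand-in for a DV-bearing file entry; not a Gluten or Delta type.
   final case class DvFile(tablePath: String, name: String)

   object PathResolutionSketch {
     // Counts simulated existence checks (each would be a HEAD request on an object store).
     var existsCalls: Int = 0

     // Assumed helper: pretends to verify the table path and returns it normalized.
     private def resolveTablePath(raw: String): String = {
       existsCalls += 1
       raw.stripSuffix("/")
     }

     // Per-file resolution: one existence check for every file.
     def normalizePerFile(files: Seq[DvFile]): Seq[String] =
       files.map(f => s"${resolveTablePath(f.tablePath)}/${f.name}")

     // Per-partition resolution: each distinct table path is resolved once and reused.
     def normalizePerPartition(files: Seq[DvFile]): Seq[String] = {
       val resolved = scala.collection.mutable.Map.empty[String, String]
       files.map { f =>
         val root = resolved.getOrElseUpdate(f.tablePath, resolveTablePath(f.tablePath))
         s"$root/${f.name}"
       }
     }
   }
   ```

   With 100 files that share one table path, the per-file variant performs 100 
   simulated checks and the per-partition variant performs 1, while both 
   return the same paths.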
   
   ### Post-transform Rules (`DeltaPostTransformRules`)
   
   - **Early-exit guard**: skip all Delta rules if no `DeltaScanTransformer` is 
present. Eliminates 5 full plan traversals for non-Delta queries.
   - **Fused rule execution**: combine 3 Delta rules under a single registered 
rule.
   - **Shallow `containsNativeDeltaScan`**: O(1) direct child/grandchild check 
instead of O(n^2) subtree traversal.
   - **Pre-computed `inputFileRelatedNames`**: static `Set[String]` instead of 
allocating 3 Expression objects per call.
   - **Batched `createPhysicalAttributes`**: single call with full attribute 
list instead of per-column.
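
   The early-exit guard and the fused rule can be illustrated with a toy plan 
   tree. Everything in this sketch is hypothetical (`Plan`, `Scan`, `Project`, 
   `EarlyExitSketch`, and the placeholder rules); it only shows the control 
   flow, not Gluten's actual rule API.

   ```scala
   // Hypothetical miniature plan tree; not Spark's SparkPlan.
   sealed trait Plan { def children: Seq[Plan] }
   final case class Scan(format: String) extends Plan { def children: Seq[Plan] = Nil }
   final case class Project(child: Plan) extends Plan { def children: Seq[Plan] = Seq(child) }

   object EarlyExitSketch {
     var ruleRuns: Int = 0 // counts individual rule applications

     // One traversal that stops at the first Delta scan it finds.
     private def containsDeltaScan(plan: Plan): Boolean = plan match {
       case Scan("delta") => true
       case other => other.children.exists(containsDeltaScan)
     }

     // Placeholder rules: each only records that it ran and returns the plan unchanged.
     private def ruleA(plan: Plan): Plan = { ruleRuns += 1; plan }
     private def ruleB(plan: Plan): Plan = { ruleRuns += 1; plan }
     private def ruleC(plan: Plan): Plan = { ruleRuns += 1; plan }

     // Single entry point: guard first, then the three rules back to back.
     def applyDeltaRules(plan: Plan): Plan =
       if (!containsDeltaScan(plan)) plan
       else ruleC(ruleB(ruleA(plan)))
   }
   ```

   In this sketch a plan without a Delta scan pays for the guard traversal 
   only and is returned untouched; a plan with a Delta scan runs all three 
   placeholder rules through the one entry point.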
   
   ### Allocation Reduction
   
   - **`scanFilters` as `lazy val`**: avoids rebuilding the `physicalByExprId` 
map and expression tree walk on every call (invoked 3+ times per scan node).
   - **`UnsafeByteOperations.unsafeWrap`**: zero-copy ByteString for DV bytes 
instead of `ByteString.copyFrom`.
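
   Both allocation savings can be shown with the standard library alone. This 
   is a sketch under stated assumptions: `ScanNodeSketch` and `ZeroCopySketch` 
   are hypothetical, and `java.nio.ByteBuffer.wrap` is used here as a 
   stand-in for the protobuf zero-copy wrapper so the example needs no extra 
   dependency.

   ```scala
   import java.nio.ByteBuffer

   // Hypothetical stand-in for a scan node; not Gluten's actual class.
   final class ScanNodeSketch(rawFilters: Seq[String]) {
     var builds: Int = 0 // counts how often the filter list is rebuilt

     private def build(): Seq[String] = {
       builds += 1
       rawFilters.map(_.trim.toLowerCase)
     }

     // Recomputed on every call.
     def scanFiltersPerCall: Seq[String] = build()

     // Computed on first access, then reused.
     lazy val scanFilters: Seq[String] = build()
   }

   object ZeroCopySketch {
     // Copies the bytes: later writes to `bytes` are not visible through the result.
     def copied(bytes: Array[Byte]): ByteBuffer = ByteBuffer.wrap(bytes.clone())

     // Wraps the same array without copying: the caller must not mutate it afterwards.
     def wrapped(bytes: Array[Byte]): ByteBuffer = ByteBuffer.wrap(bytes)
   }
   ```

   The trade-off of the zero-copy variant is visible in the sketch: the 
   wrapper shares the caller's array, so it is only safe when the bytes are 
   not modified after wrapping.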
   
   ## Measured Results (local filesystem, 100 DV-bearing files)
   
   ```
   Benchmark                         Before    After     Speedup
   ---------                         ------    -----     -------
   DV Materialization (100 files)    22 ms     7 ms      3.3x
   Post-transform rules (Delta)      37 us     20 us     1.8x
   Post-transform rules (parquet)    4908 ns   220 ns    22.3x
   ```
   
   ### Projected impact on object stores (100 DV files)
   
   | Storage | Before | After | Speedup |
   |---------|--------|-------|---------|
   | Local FS | 22 ms | 7 ms | 3.3x |
   | ABFS | 2-24 s | 1.0-1.1 s | 2-22x |
   | GCS | 3-30 s | 1.0-1.1 s | 3-27x |
   | S3 | 5-45 s | 1.1-1.2 s | 5-38x |
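
   The speedup ranges in the object-store rows are consistent with dividing 
   the low ends and the high ends of the Before and After ranges and rounding 
   to the nearest integer. The PR does not state how the ranges were derived, 
   so this pairing is an assumption, and `speedupRange` is a hypothetical 
   helper:

   ```scala
   object SpeedupRangeSketch {
     // Speedup range from (low, high) latency ranges in seconds,
     // pairing low with low and high with high (assumed pairing).
     def speedupRange(before: (Double, Double), after: (Double, Double)): (Long, Long) =
       (math.round(before._1 / after._1), math.round(before._2 / after._2))
   }
   ```

   For example, the ABFS row gives 2 / 1.0 = 2 and 24 / 1.1 ≈ 21.8, which 
   rounds to the 2-22x shown in the table.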
   
   ## How was this patch tested?
   
   - All existing Delta tests pass (`VeloxDeltaSuite`, 
`DeltaDeletionVectorScanInfoSuite`)
   - Added targeted unit tests:
     - `post-transform rules are no-op on non-Delta plans` (validates 
early-exit guard)
     - `post-transform rules produce DeltaScanTransformer for Delta tables` 
(validates offloading)
     - `scanFilters returns consistent results on repeated access` (validates 
lazy val caching)
   - Added `DeltaPlanningBenchmark` for reproducible before/after measurement
   - Scalastyle, Checkstyle, Spotless: all pass with zero violations
   
   ## Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude claude-opus-4.6


