durgaprasadml opened a new pull request, #38750:
URL: https://github.com/apache/beam/pull/38750

   Description:
   
   ## What does this PR do?
   
   This PR implements the Delta Lake source reader using the Delta Kernel API 
and adds performance/integration tests for Delta Lake reads.
   
   The implementation introduces a parallelized read path for Delta tables by 
planning scans on the coordinator and distributing Parquet file reads across 
Beam workers.
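
   The plan-then-distribute shape described above can be sketched in plain Java. This is only an illustration under stated assumptions: `ScanPlanSketch`, `FileDescriptor`, `planScan`, `readFile`, and `readTable` are hypothetical names, the file listing is synthesized rather than read from a Delta log, and a parallel stream stands in for Beam workers. It does not call Beam or Delta Kernel and is not the code in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java stand-in for the plan-then-distribute pattern; no Beam or Delta Kernel calls.
class ScanPlanSketch {

    // Hypothetical descriptor for one Parquet data file selected by the scan plan.
    static final class FileDescriptor {
        final String path;
        final long rowCount;

        FileDescriptor(String path, long rowCount) {
            this.path = path;
            this.rowCount = rowCount;
        }
    }

    // "Coordinator" step: runs once and lists the files that make up the snapshot.
    // The listing is synthesized here; a real reader would derive it from the table log.
    static List<FileDescriptor> planScan(String tablePath, int fileCount, long rowsPerFile) {
        List<FileDescriptor> files = new ArrayList<>();
        for (int i = 0; i < fileCount; i++) {
            files.add(new FileDescriptor(tablePath + "/part-" + i + ".parquet", rowsPerFile));
        }
        return files;
    }

    // "Worker" step: handles one file independently of the others.
    // It only synthesizes row labels instead of decoding Parquet.
    static List<String> readFile(FileDescriptor file) {
        List<String> rows = new ArrayList<>();
        for (long r = 0; r < file.rowCount; r++) {
            rows.add(file.path + "#" + r);
        }
        return rows;
    }

    // Plan once, then fan the per-file reads out in parallel and flatten the results.
    static List<String> readTable(String tablePath, int fileCount, long rowsPerFile) {
        return planScan(tablePath, fileCount, rowsPerFile)
            .parallelStream()
            .flatMap(file -> readFile(file).stream())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(readTable("/tmp/table", 4, 3).size() + " rows read");
    }
}
```

   Because each descriptor is self-contained, the per-file step needs no shared state, which is what lets the reads be spread across workers.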
   
   ## Changes Included
   
   ### DeltaIO Reader Implementation
   - Completed DeltaIO.ReadRows
   - Added Delta Kernel snapshot loading support
   - Added scan planning and file descriptor generation
   - Implemented parallel Parquet reads using Beam transforms
   - Added Beam Schema inference from Delta schemas
   - Added logical Delta Row → Beam Row conversion
   - Added support for:
     - primitive types
     - nested structs
     - arrays
     - maps
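
   A recursive mapping is one plausible way to cover primitives, structs, arrays, and maps in schema inference. The sketch below is an assumption-laden illustration, not the PR's implementation: the `DeltaType` classes are hypothetical stand-ins for the Delta Kernel type model, only a handful of primitive names are handled, and the result is a string built from Beam type names (INT64, ARRAY, MAP, ROW) rather than a real Beam Schema object.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical, dependency-free model of recursive Delta-to-Beam type mapping.
class TypeMappingSketch {

    interface DeltaType {}

    static final class Primitive implements DeltaType {
        final String name;
        Primitive(String name) { this.name = name; }
    }

    static final class ArrayOf implements DeltaType {
        final DeltaType element;
        ArrayOf(DeltaType element) { this.element = element; }
    }

    static final class MapOf implements DeltaType {
        final DeltaType key;
        final DeltaType value;
        MapOf(DeltaType key, DeltaType value) { this.key = key; this.value = value; }
    }

    static final class Struct implements DeltaType {
        // LinkedHashMap keeps fields in declaration order.
        final Map<String, DeltaType> fields = new LinkedHashMap<>();
        Struct field(String name, DeltaType type) { fields.put(name, type); return this; }
    }

    // Renders the Beam-style type name for a Delta type, recursing into containers.
    static String toBeamTypeName(DeltaType type) {
        if (type instanceof Primitive) {
            String name = ((Primitive) type).name;
            switch (name) {
                case "boolean": return "BOOLEAN";
                case "integer": return "INT32";
                case "long":    return "INT64";
                case "double":  return "DOUBLE";
                case "string":  return "STRING";
                default: throw new IllegalArgumentException("Unsupported primitive: " + name);
            }
        }
        if (type instanceof ArrayOf) {
            return "ARRAY<" + toBeamTypeName(((ArrayOf) type).element) + ">";
        }
        if (type instanceof MapOf) {
            MapOf map = (MapOf) type;
            return "MAP<" + toBeamTypeName(map.key) + ", " + toBeamTypeName(map.value) + ">";
        }
        if (type instanceof Struct) {
            return ((Struct) type).fields.entrySet().stream()
                .map(e -> e.getKey() + " " + toBeamTypeName(e.getValue()))
                .collect(Collectors.joining(", ", "ROW<", ">"));
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        DeltaType schema = new Struct()
            .field("id", new Primitive("long"))
            .field("tags", new ArrayOf(new Primitive("string")));
        System.out.println(toBeamTypeName(schema));
    }
}
```

   Failing fast on an unrecognized type, as this sketch does, makes unsupported columns visible at schema-inference time instead of during the read.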
   
   ### Performance / Integration Tests
   Added:
   - DeltaIOIT
   - DeltaIOTestPipelineOptions
   
   Test scenarios:
   - testReadSmall
   - testReadLarge
   - testReadPartitioned
   
   The tests:
   - generate Delta tables locally
   - create Delta logs dynamically
   - validate partitioned reads
   - collect throughput and latency metrics
   - publish metrics using IOITMetrics
   
   ### Build Updates
   Updated sdks/java/io/delta/build.gradle with the integration test 
dependencies and the Hadoop runtime dependencies that Delta Kernel needs.
   
   ## Verification
   
   Executed:
   
       ./gradlew :sdks:java:io:delta:compileJava
       ./gradlew :sdks:java:io:delta:compileTestJava
       ./gradlew :sdks:java:io:delta:test --tests org.apache.beam.sdk.io.delta.DeltaIOIT
   
   Fixes #38559


