Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-20 Thread via GitHub


the-other-tim-brown commented on PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791#issuecomment-3937667388

   Thanks @brishi19791 for the optimizaiton!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-20 Thread via GitHub


the-other-tim-brown merged PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-13 Thread via GitHub


brishi19791 commented on code in PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791#discussion_r2804569193


##
xtable-core/src/main/java/org/apache/xtable/delta/DeltaTableExtractor.java:
##
@@ -44,6 +44,10 @@ public class DeltaTableExtractor {
 
   public InternalTable table(DeltaLog deltaLog, String tableName, Long 
version) {
 Snapshot snapshot = deltaLog.getSnapshotAt(version, Option.empty());
+return table(snapshot, tableName);
+  }
+
+  public InternalTable table(Snapshot snapshot, String tableName) {

Review Comment:
   thanks for the feedback Ashvin. added javadoc and validator



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-04 Thread via GitHub


ashvina commented on code in PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791#discussion_r2767163196


##
xtable-core/src/main/java/org/apache/xtable/delta/DeltaTableExtractor.java:
##
@@ -44,6 +44,10 @@ public class DeltaTableExtractor {
 
   public InternalTable table(DeltaLog deltaLog, String tableName, Long 
version) {
 Snapshot snapshot = deltaLog.getSnapshotAt(version, Option.empty());
+return table(snapshot, tableName);
+  }
+
+  public InternalTable table(Snapshot snapshot, String tableName) {

Review Comment:
   Consider adding a brief Javadoc on the new public api  `table(Snapshot, 
String)` explaining assumptions (`snapshot` must be valid and correspond to 
`tableName`’s table).
   
   Or, add a input validator 
   ```
   private static Snapshot requireMetadata(Snapshot s, String tableName) {
 if (s == null || s.metadata() == null || s.metadata().schema() == null) {
   throw new IllegalStateException("Missing metadata/schema for table: " + 
tableName 
 + " at version: " + (s != null ? s.version() : "unknown"));
 }
 return s;
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-02 Thread via GitHub


brishi19791 commented on code in PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791#discussion_r2754967531


##
xtable-core/src/main/java/org/apache/xtable/delta/DeltaConversionSource.java:
##
@@ -85,16 +85,15 @@ public InternalTable getTable(Long version) {
 
   @Override
   public InternalTable getCurrentTable() {
-DeltaLog deltaLog = DeltaLog.forTable(sparkSession, basePath);
 Snapshot snapshot = deltaLog.snapshot();
-return getTable(snapshot.version());
+return tableExtractor.table(snapshot, tableName);
   }
 
   @Override
   public InternalSnapshot getCurrentSnapshot() {
 DeltaLog deltaLog = DeltaLog.forTable(sparkSession, basePath);

Review Comment:
   yes



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-01 Thread via GitHub


kevinjqliu commented on code in PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791#discussion_r2752061685


##
xtable-core/src/main/java/org/apache/xtable/delta/DeltaConversionSource.java:
##
@@ -85,16 +85,15 @@ public InternalTable getTable(Long version) {
 
   @Override
   public InternalTable getCurrentTable() {
-DeltaLog deltaLog = DeltaLog.forTable(sparkSession, basePath);
 Snapshot snapshot = deltaLog.snapshot();
-return getTable(snapshot.version());
+return tableExtractor.table(snapshot, tableName);
   }
 
   @Override
   public InternalSnapshot getCurrentSnapshot() {
 DeltaLog deltaLog = DeltaLog.forTable(sparkSession, basePath);

Review Comment:
   ```suggestion
   ```
   not needed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-02-01 Thread via GitHub


kevinjqliu commented on PR #791:
URL: https://github.com/apache/incubator-xtable/pull/791#issuecomment-3831985828

   > We now fetch the snapshot once per commit/version and reuse it to 
construct the InternalTable (via a DeltaTableExtractor.table(Snapshot, 
tableName) overload), instead of re-resolving the same snapshot multiple times.
   
   Thats awesome, thanks for adding the perf improvement. It might be a good 
idea to add a test (either here or a follow up PR) to verify that the entire 
conversion process should only read the DeltaLog once. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



[PR] Remove redundant getSnapshotAt calls per commit [incubator-xtable]

2026-01-30 Thread via GitHub


brishi19791 opened a new pull request, #791:
URL: https://github.com/apache/incubator-xtable/pull/791

   This PR removes redundant DeltaLog.getSnapshotAt(version) calls in the Delta 
source conversion path that were happening for every commit. getSnapshotAt can 
internally trigger an expensive Spark job and associated network I/O (e.g., 
listing/reading Delta log metadata from remote storage) to resolve the snapshot 
for a given version. We now fetch the snapshot once per commit/version and 
reuse it to construct the InternalTable (via a 
DeltaTableExtractor.table(Snapshot, tableName) overload), instead of 
re-resolving the same snapshot multiple times.
   
   Impact
   
   - Avoids redundant snapshot resolution work per commit/version (and the 
Spark job + network calls it may trigger).
   - Reduces end-to-end conversion latency, especially for large commit 
backlogs.
   - No intended functional behavior change; performance optimization only.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]