gengliangwang commented on code in PR #56121:
URL: https://github.com/apache/spark/pull/56121#discussion_r3364760596
##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/transactions/Transaction.java:
##########
@@ -66,6 +67,20 @@ public interface Transaction extends Closeable {
*/
void abort();
+ /**
+ * Attempts to register materialized scans against this transaction's read
set.
+ * <p>
+ * An example use case is cache reuse. Spark passes the scans of a candidate
cached subtree
+ * for the transaction's catalog and the connector decides whether to accept
them.
+ * <p>
+ * The connector must either accept (returning {@code true} after adding any
relevant scans
+ * to the read set) or refuse (returning {@code false} without modifying the
read set).
+ *
+ * @param scans the scans to register.
Review Comment:
Two small doc points on this `@Evolving` API:
1. The accept/refuse decision is all-or-nothing for the whole `scans` array,
but the doc doesn't say so — "must either accept … or refuse" can read as if it
were per-scan. Worth one sentence that the return value covers the entire
batch: the connector accepts reuse of all the passed scans or none, there's no
partial acceptance — a `false` makes Spark bypass the whole cached subtree.
2. `@param scans the scans to register.` is circular (it restates the method
name). For a public API, describe what they are:
```suggestion
* @param scans the materialized scans Spark offers for registration
against this transaction's read set.
```
Both are contract-level, so they keep the body generic per @aokolnychyi's
earlier point about not over-fitting to caching. The caller-specific scope —
that Spark passes only this one cached subtree's scans, already filtered to the
transaction's catalog — is better documented at the call site
(`CacheManager.validateCachedEntryForTransaction`) than baked into this
contract.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]