Copilot commented on code in PR #6380:
URL: https://github.com/apache/hive/pull/6380#discussion_r3292917112
##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java:
##########
@@ -1355,6 +1356,12 @@ public void setSnapshotRef(String snapshotRef) {
this.snapshotRef = snapshotRef;
}
+ public String getQualifier() {
+ return Stream.of(metaTable, snapshotRef, asOfVersion, asOfTimestamp)
+ .filter(Objects::nonNull).findFirst()
+ .orElse("");
Review Comment:
`Table#getQualifier()` currently returns the *first* non-null of (metaTable,
snapshotRef, asOfVersion, asOfTimestamp). This can still cause cache-key
collisions (and incorrect scan merging) when values overlap across qualifier
types (e.g., a tag/branch named the same as an Iceberg metadata table like
"snapshots"), and it also ignores `versionIntervalFrom`. Additionally, blank
values (e.g., "") are treated as present, which can mask a non-blank
snapshotRef/asOfVersion and recreate collisions. Consider building a structured
qualifier that (a) skips blank strings, (b) includes type prefixes (e.g.,
meta=…, ref=…, ver=…, ts=…, from=…), and (c) incorporates all applicable
time-travel fields (including interval-from) so distinct snapshots/ranges
cannot share a cache key.
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/PrunedPartitionList.java:
##########
@@ -56,7 +54,7 @@ public PrunedPartitionList(Table source, Set<Partition>
partitions,
public PrunedPartitionList(Table source, String key, Set<Partition>
partitions,
List<String> referred, boolean hasUnknowns) {
this.source = Objects.requireNonNull(source);
- this.ppListKey = Optional.ofNullable(key);
+ this.ppListKey = key;
this.referred = Objects.requireNonNull(referred);
Review Comment:
`PrunedPartitionList` now stores the key as a plain `String`, but the
constructor still accepts (and the default constructor passes) `null`. This
weakens the API contract compared to `Optional<String>` and makes it easy for
callers to accidentally dereference the key as non-null. Either enforce a
non-null key (and update call sites to always provide one), or explicitly
keep/restore optionality (e.g., `Optional<String>` or an annotated `@Nullable`
return) to avoid hidden nulls propagating into caches/maps.
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java:
##########
@@ -646,9 +646,12 @@ protected boolean areMergeable(ParseContext pctx,
TableScanOperator tsOp1, Table
return false;
}
- // HIVE-29509: Include snapshotRef to ensure different Iceberg
branches/tags are treated as distinct tables
- if (!Objects.equals(tsOp1.getConf().getSnapshotRef(),
tsOp2.getConf().getSnapshotRef())) {
- LOG.debug("Snapshot Ref differ {} ~ {}",
tsOp1.getConf().getSnapshotRef(), tsOp2.getConf().getSnapshotRef());
+ // Time-travel qualifier (snapshotRef, asOfVersion, asOfTimestamp) must
match,
+ // otherwise the two scans address different snapshots of the same table.
+ String qualifier1 = tsOp1.getConf().getTableMetadata().getQualifier();
+ String qualifier2 = tsOp2.getConf().getTableMetadata().getQualifier();
+ if (!Objects.equals(qualifier1, qualifier2)) {
+ LOG.debug("Qualifier differ {} ~ {}", qualifier1, qualifier2);
Review Comment:
Log message grammar: "Qualifier differ" should be "Qualifier differs" (or
"Qualifiers differ").
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]