laskoviymishka commented on code in PR #1098:
URL: https://github.com/apache/iceberg-go/pull/1098#discussion_r3272278011
##########
table/evaluators.go:
##########
@@ -1568,12 +1568,36 @@ func (m *strictMetricsEval) canContainNulls(fieldID
int) bool {
return exists && cnt > 0
}
+func (m *strictMetricsEval) mayContainNulls(field iceberg.NestedField) bool {
+ cnt, exists := m.nullCounts[field.ID]
+ if !exists {
+ return !field.Required
+ }
+
+ return cnt > 0
+}
+
func (m *strictMetricsEval) canContainNans(fieldID int) bool {
cnt, exists := m.nanCounts[fieldID]
return exists && cnt > 0
}
+func (m *strictMetricsEval) mayContainNans(field iceberg.NestedField) bool {
+ switch field.Type.(type) {
+ case iceberg.Float32Type, iceberg.Float64Type:
+ default:
+ return false
+ }
Review Comment:
The empty `case iceberg.Float32Type, iceberg.Float64Type:` arm with
`default: return false` below is valid but unusual — a reader pauses to confirm
the float arm is intentionally empty-and-fall-through rather than an oversight.
A one-line comment in the arm, or inverting to a positive guard, would save the
double-take.
##########
table/evaluators.go:
##########
@@ -1568,12 +1568,36 @@ func (m *strictMetricsEval) canContainNulls(fieldID
int) bool {
return exists && cnt > 0
}
+func (m *strictMetricsEval) mayContainNulls(field iceberg.NestedField) bool {
+ cnt, exists := m.nullCounts[field.ID]
+ if !exists {
+ return !field.Required
Review Comment:
`VisitNotEqual` and `VisitNotIn` still call `canContainNulls(fieldID) ||
canContainNans(fieldID)` — same absent-count short-circuit this PR is replacing
everywhere else. Under their inverted return semantics (`rowsMustMatch` instead
of `rowsMightNotMatch`), the bug class is identical: for an optional field with
no per-key null count, `canContainNulls` returns `false`, the guard
short-circuits, and the bounds check can wrongly conclude all rows satisfy `col
!= x` when nulls may actually be present. Both methods already have `field :=
t.Ref().Field()` in scope, so this is the same two-line substitution to
`m.mayContainNulls(field) || m.mayContainNans(field)`. Worth folding in here to
fully close #1091.
##########
table/evaluators_test.go:
##########
@@ -2346,6 +2346,87 @@ func (suite *StrictMetricsTestSuite) TestMissingStats() {
}
}
+func (suite *StrictMetricsTestSuite) TestMissingNullCounts() {
Review Comment:
Both new tests exercise only `GreaterThan`. The six migrated visitors share
the same early guard so coverage is defensible, but one `EqualTo` + one `In`
case (or a comment noting `GreaterThan` is representative) would make the
regression intent explicit. Once `VisitNotEqual` / `VisitNotIn` are addressed,
a case there closes the loop.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]