laskoviymishka commented on code in PR #984:
URL: https://github.com/apache/iceberg-go/pull/984#discussion_r3235445080
##########
types.go:
##########
@@ -777,6 +822,143 @@ func (UnknownType) primitive() {}
func (UnknownType) Type() string { return "unknown" }
func (UnknownType) String() string { return "unknown" }
+const defaultGeoCRS = "OGC:CRS84"
+
+type GeometryType struct {
+ crs string
+}
+
+func GeometryTypeOf(crs string) (GeometryType, error) {
+ if crs == "" {
+ return GeometryType{}, fmt.Errorf("%w: invalid CRS: (empty
string)", ErrInvalidTypeString)
+ }
+ if strings.EqualFold(crs, defaultGeoCRS) {
+ return GeometryType{}, nil
+ }
+
+ return GeometryType{crs: crs}, nil
+}
+
+func (g GeometryType) CRS() string {
+ if g.crs == "" {
+ return defaultGeoCRS
+ }
+
+ return g.crs
+}
+
+func (g GeometryType) Equals(other Type) bool {
+ rhs, ok := other.(GeometryType)
+ if !ok {
+ return false
+ }
+
+ return g.crs == rhs.crs
+}
+
+func (GeometryType) primitive() {}
+func (g GeometryType) Type() string {
+ if g.crs == "" {
+ return "geometry"
+ }
+
+ return fmt.Sprintf("geometry(%s)", g.crs)
+}
+
+func (g GeometryType) String() string {
+ return g.Type()
+}
+
+const defaultGeographyAlgorithm = string(geoarrow.EdgeSpherical)
+
+func toGeoArrowEdgeInterpolation(s string) (geoarrow.EdgeInterpolation, error)
{
+ algo :=
geoarrow.EdgeInterpolation(strings.ToLower(strings.TrimSpace(s)))
+ switch algo {
+ case geoarrow.EdgeSpherical, geoarrow.EdgeVincenty,
+ geoarrow.EdgeThomas, geoarrow.EdgeAndoyer, geoarrow.EdgeKarney:
+ return algo, nil
+ default:
+ return "", fmt.Errorf("%w: invalid edge interpolation
algorithm", ErrInvalidTypeString)
+ }
+}
+
+type GeographyType struct {
+ crs string
+ algorithm string
+}
+
+func GeographyTypeOf(crs string, algorithm string) (GeographyType, error) {
+ if crs == "" {
+ return GeographyType{}, fmt.Errorf("%w: invalid CRS: (empty
string)", ErrInvalidTypeString)
+ }
+ if algorithm == "" {
+ return GeographyType{}, fmt.Errorf("%w: invalid algorithm:
(empty string)", ErrInvalidTypeString)
+ }
+
+ normalizedCRS := crs
+ if strings.EqualFold(crs, defaultGeoCRS) {
+ normalizedCRS = ""
+ }
+
+ validatedAlg, err := toGeoArrowEdgeInterpolation(algorithm) // validate
algorithm
+ if err != nil {
+ return GeographyType{}, fmt.Errorf("invalid algorithm: %w", err)
+ }
+ normalizedAlgorithm := string(validatedAlg)
+ if strings.EqualFold(algorithm, defaultGeographyAlgorithm) {
Review Comment:
Canonicalization branches on the raw input, not the validated form.
`toGeoArrowEdgeInterpolation` above trims and lowercases, but
`strings.EqualFold(algorithm, defaultGeographyAlgorithm)` here compares against
the raw string — so `GeographyTypeOf("srid:4326", "spherical")` normalizes to
`{algorithm: ""}` but `GeographyTypeOf("srid:4326", " spherical ")` keeps
`{algorithm: "spherical"}`. Two semantically identical types whose `Equals`
returns false. JSON path trims before calling so this is invisible there, but
Go callers can trip it.
One-line fix — branch on `validatedAlg`, not `algorithm`:
```go
if validatedAlg == geoarrow.EdgeInterpolation(defaultGeographyAlgorithm) {
normalizedAlgorithm = ""
}
```
##########
exprs.go:
##########
@@ -450,6 +450,8 @@ func createBoundRef(field NestedField, acc accessor)
BoundReference {
return &boundRef[Decimal]{field: field, acc: acc}
case UUIDType:
return &boundRef[uuid.UUID]{field: field, acc: acc}
+ case GeographyType, GeometryType:
+ return &boundRef[[]byte]{field: field, acc: acc}
Review Comment:
Found while tracing this — `createBoundRef` handles geo here, but four
downstream dispatch sites that switch on the same `Type` don't, and they're
inconsistent in how they fail.
`exprs.go:637` (unary predicate dispatch) — `case FixedType, BinaryType`
doesn't include geo, falls through to `panic("unhandled bound reference type:
...")` at line 644. `IsNull(Ref("geom_col"))` on an optional geo column
(exactly the case this PR's null-default rule enforces) crashes the process.
`exprs.go:799` (literal predicate) and `exprs.go:967` (set predicate) — same
shape, friendly error on fall-through.
`visitors.go:323` (`doCmp`) — `case BinaryType, FixedType` falls through to
`panic(ErrType)` at line 332. Metric-evaluation paths hit this.
Two options: (a) add `case GeometryType, GeographyType` at all four sites,
treating geo as `[]byte` like Binary — most useful since `IsNull` / `NotNull` /
`IsIn(null)` on geo are semantically valid; (b) reject geo in `createBoundRef`
so all expression paths fail consistently at bind. I'd take (a).
##########
table/substrait/substrait.go:
##########
@@ -169,6 +169,10 @@ func (convertToSubstrait) VisitUnknown() types.Type {
// Returning nil indicates this type cannot be converted to Substrait
return nil
}
+func (convertToSubstrait) VisitGeometry(iceberg.GeometryType) types.Type {
return &types.BinaryType{} }
Review Comment:
gofmt doesn't force one-line vs multi-line on single-return bodies: it
preserves whatever you write. If a line-length linter is balking, easier path
is to make VisitGeometry multi-line too for symmetry. Reads cleaner than either
form on its own.
##########
types.go:
##########
@@ -777,6 +822,143 @@ func (UnknownType) primitive() {}
func (UnknownType) Type() string { return "unknown" }
func (UnknownType) String() string { return "unknown" }
+const defaultGeoCRS = "OGC:CRS84"
+
+type GeometryType struct {
+ crs string
+}
+
+func GeometryTypeOf(crs string) (GeometryType, error) {
+ if crs == "" {
+ return GeometryType{}, fmt.Errorf("%w: invalid CRS: (empty
string)", ErrInvalidTypeString)
+ }
+ if strings.EqualFold(crs, defaultGeoCRS) {
Review Comment:
Same shape as the algorithm canonicalization in `GeographyTypeOf` below —
`strings.EqualFold(crs, defaultGeoCRS)` compares the raw input, so
`GeometryTypeOf("OGC:CRS84")` normalizes to the zero value but
`GeometryTypeOf(" OGC:CRS84 ")` doesn't. Trim at the top of the constructor
(`crs = strings.TrimSpace(crs)`) to make canonicalization tolerant.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]