Hi all,

I noticed a discrepancy between the specification and the behavior of
the current reference implementation regarding valid source types for
the identity transform.

The current spec states that the identity transform can be applied to:

```
Any type except for geometry, geography, and variant
```

However, the Java impl[1] only accepts primitive types and
additionally rejects geometry, geography and variant, and the Go
impl[2] follows the same behavior.

This means that types such as structs, lists, and maps are rejected by
the implementations, even though they would appear to be allowed by
the wording "Any type except ...".

Iceberg Python currently accepts all primitive types[3] and I opened a
PR[4] to reject geometry and geography to align with Java and Go.

Iceberg C++ currently also accepts all primitive types[5]; I am
updating this behavior as part of the v3 type support work[6].

Iceberg Rust is not currently affected because it does not yet support
geometry, geography, or variant.

Given the existing Java and Go behavior, I think the specification
could be clarified to more accurately describe the intended valid
source types for the identity transform.

My proposed wording is below (variant is not mentioned because it is
not a primitive type, so it is already excluded), and I created a
PR[7] for this.

```
Any primitive type except for geometry and geography
```
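
To make the difference concrete, here is a minimal Python sketch that
compares the two wordings. It is only an illustration: the string type
names, the (non-exhaustive) type sets, and both function names are
hypothetical and are not taken from any of the implementations linked
below.

```python
# Illustrative, non-exhaustive type sets. Type names as plain strings
# are an assumption made for this sketch only.
PRIMITIVES = frozenset({
    "boolean", "int", "long", "float", "double", "decimal",
    "date", "time", "timestamp", "timestamptz",
    "string", "uuid", "fixed", "binary",
    "geometry", "geography",
})
NON_PRIMITIVES = frozenset({"struct", "list", "map", "variant"})


def allowed_by_current_wording(type_name: str) -> bool:
    """'Any type except for geometry, geography, and variant'."""
    return type_name not in {"geometry", "geography", "variant"}


def allowed_by_proposed_wording(type_name: str) -> bool:
    """'Any primitive type except for geometry and geography'."""
    return type_name in PRIMITIVES and type_name not in {"geometry", "geography"}


def types_where_wordings_differ() -> list[str]:
    """Return the types that the two wordings treat differently."""
    all_types = sorted(PRIMITIVES | NON_PRIMITIVES)
    return [
        t for t in all_types
        if allowed_by_current_wording(t) != allowed_by_proposed_wording(t)
    ]


if __name__ == "__main__":
    print(types_where_wordings_differ())  # ['list', 'map', 'struct']
```

Under these assumptions the two wordings disagree only on struct, list,
and map, which are exactly the types the Java and Go implementations
already reject.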

WDYT?

[1] https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/transforms/Identity.java#L100-L106
[2] https://github.com/apache/iceberg-go/blob/main/transforms.go#L115-L123
[3] https://github.com/apache/iceberg-python/blob/main/pyiceberg/transforms.py#L719-L720
[4] https://github.com/apache/iceberg-python/pull/3517
[5] https://github.com/apache/iceberg-cpp/blob/main/src/iceberg/transform_function.cc#L43-L46
[6] https://github.com/apache/iceberg-cpp/pull/752
[7] https://github.com/apache/iceberg/pull/16836

-- 
Regards
Junwang Zhao
