Hi Jonas,

Thanks for the proposal! I added some comments in the docs, but I'd like to
emphasize my biggest concern here as well.

When we talk about upper- or lower-casing, we have to know the locale in
which that operation is performed.

Choosing a specific locale means declaring a particular natural-language
context. The question, then, is how we deal with identifiers that can
contain Unicode characters from different languages.
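
To make the concern concrete, here is a minimal sketch in Python (chosen
only because its str methods apply the locale-independent Unicode default
case mappings; it is not taken from the proposal or from Polaris). The
fold() helper is a hypothetical normalizer I made up for illustration. It
shows that a "no locale" rule handles German ß reasonably but does not
match what a Turkish reader would expect for I/ı and İ/i.

```python
# Python's str.lower()/upper()/casefold() use the locale-independent
# Unicode default case mappings, so they model a "no locale" rule.


def fold(identifier: str) -> str:
    """Hypothetical normalizer: locale-independent Unicode case folding."""
    return identifier.casefold()


# Turkish pairs I with dotless ı, and dotted İ with i.
# The default mappings pair I with i instead, so Turkish names do not
# round-trip through upper/lower.
assert "ı".upper() == "I"        # dotless i uppercases to plain I
assert "I".lower() == "i"        # ...which lowercases to dotted i, not ı
assert "İ".lower() == "i\u0307"  # İ becomes i + combining dot above

# German ß expands to two letters when uppercased, so plain lower() is
# not enough to match the two spellings; full case folding is.
assert "ß".upper() == "SS"
assert "straße".lower() != "STRASSE".lower()
assert fold("straße") == fold("STRASSE") == "strasse"
```

A Turkish-locale conversion would fix the first group, but it would then
lowercase the "I" in English identifiers to "ı", which is the trade-off I
am asking about.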

Tip of the "iceberg" :) : https://github.com/apache/iceberg/issues/9276

Thanks,
Dmitri.

On Fri, Oct 17, 2025 at 7:26 PM Honah J. <[email protected]> wrote:

> Hi everyone,
>
> I would like to start a discussion around supporting an option to make
> a catalog case-insensitive.
>
> In multi-engine data lake environments, different engines (Spark, Trino,
> Flink, etc.) apply different casing and normalization rules when reading or
> writing identifiers. As a result, the same logical table may be interpreted
> differently across engines. For example, Polaris currently preserves
> identifier casing, so a table created by Spark with mixed-case names may
> not be discoverable from Trino, which lowercases identifiers. This
> inconsistency burdens users and undermines script portability.
>
> I drafted a proposal[1] with more details and a solution: introducing an
> immutable catalog property to store and look up namespaces, tables, and
> other objects case‑insensitively.
>
> I’d love to hear your feedback and suggestions!
>
> [1]
>
> https://docs.google.com/document/d/1-3ywobpRvgdHPhe0J4w7l6t4NX79iqaeFOohCXG_12U/edit?usp=sharing
>
> Best regards,
> Jonas
>
