szehon-ho commented on code in PR #16727: URL: https://github.com/apache/iceberg/pull/16727#discussion_r3392519053
##########
format/udf-spec.md:
##########
@@ -131,6 +132,18 @@ Examples of complete definition-id signatures:
 * `int,string` – two parameters: int and string
 * `int,list<int>,struct<id:int,name:string>` – three parameters: an int, a list and a struct
+#### Specific Name
+The `specific-name` is an optional, user-assignable identifier for a single definition, analogous to the SQL standard's
+routine *specific name*. It provides a stable handle for a definition that is independent of its signature, which engines
+may use to reference a specific overload unambiguously (e.g., for SQL statements such as `DROP SPECIFIC FUNCTION`).
+
+* `specific-name` is for identification only. It **must not** be used for overload resolution; resolution is based on the
+ definition's `parameters` (see [Function Call Convention and Resolution in Engines](#function-call-convention-and-resolution-in-engines)).
+* When present, `specific-name` **must** be unique among all definitions within the UDF metadata.

Review Comment:
Good catch. We considered specifying exact case-sensitive equality here, but decided **not** to define case-sensitivity (or any folding) in the spec, for consistency with how the rest of Iceberg's catalog handles names.

In the REST catalog spec, namespace levels and table/view names are plain opaque `string`s with no normalization rules, and the only `case-sensitive` flag in that spec is explicitly scoped to scan field matching, not to identifiers. Identifier folding (the `Add_One` vs `add_one` question, regular vs. delimited identifiers, Spark lowercasing vs. an uppercasing engine) is applied by the engine *before* the name is written, exactly as it is for table and namespace names today.

If the UDF spec mandated exact case-sensitive equality, it would be specifying *more* about name comparison than the catalog does for table names, and it still wouldn't resolve cross-engine folding disagreements, which genuinely live at the engine layer.
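
As a rough illustration of that split, here is a minimal sketch in Python. Everything in it is hypothetical: the helper names, the `"lowercasing"`/`"uppercasing"` engine labels, and the dict layout of a definition are assumptions for illustration, not part of the spec or of any Iceberg API. It only shows that the uniqueness check compares stored strings literally, while any folding happens on the engine side before the name is written.

```python
from collections import Counter


def fold_for_engine(name: str, engine: str) -> str:
    """Hypothetical engine-side folding, applied before the name is stored."""
    if engine == "lowercasing":
        return name.lower()
    if engine == "uppercasing":
        return name.upper()
    return name  # e.g. a delimited identifier, kept verbatim


def duplicate_specific_names(definitions: list[dict]) -> list[str]:
    """Return specific-name values that occur more than once.

    Comparison is over the literal stored strings; no folding is applied here.
    Definitions without a specific-name are skipped, since the field is optional.
    """
    names = [d["specific-name"] for d in definitions if "specific-name" in d]
    return sorted(n for n, count in Counter(names).items() if count > 1)


# Assumed, simplified shape of the definitions in one UDF metadata file.
definitions = [
    {"definition-id": "int", "specific-name": "Add_One"},
    {"definition-id": "int,int", "specific-name": "add_one"},
    {"definition-id": "string"},  # no specific-name
]

# Literal comparison: "Add_One" and "add_one" are distinct stored strings.
literal_dups = duplicate_specific_names(definitions)

# A lowercasing engine that folds before writing would produce a collision,
# which is an engine-level outcome rather than something the spec decides.
folded = [
    {**d, "specific-name": fold_for_engine(d["specific-name"], "lowercasing")}
    if "specific-name" in d
    else d
    for d in definitions
]
folded_dups = duplicate_specific_names(folded)
```

With these inputs, `literal_dups` is empty and `folded_dups` contains `add_one`, which is the cross-engine disagreement the spec deliberately does not try to settle.
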
Within a single metadata file the rule is deterministic: uniqueness is over the literal stored `definition-id` strings, so we leave folding to the engine rather than introduce a UDF-specific identifier-comparison rule that diverges from the catalog.
