szehon-ho opened a new pull request, #16727: URL: https://github.com/apache/iceberg/pull/16727
## Summary Adds an optional `specific-name` field to the UDF function definition model in the [UDF spec](https://github.com/apache/iceberg/blob/main/format/udf-spec.md), analogous to the SQL standard's routine *specific name*. Resolves #16700 ## Motivation In the SQL standard, a routine has two distinct identifiers: - a **routine name** — the user-facing function name, which may be shared across overloads, and - a **specific name** — a schema-unique name that identifies a single overload unambiguously. The specific name exists precisely because referencing an overload by `name + signature` is awkward, so the standard lets a user assign a stable, human-readable handle that engines can use for statements like `DROP SPECIFIC FUNCTION`. Today the UDF model only has `definition-id`, which is *derived from the canonical parameter-type tuple*. It uniquely identifies an overload, but it is an opaque, signature-bound string rather than a user-chosen name. This change adds `specific-name` as an optional convenience alongside `definition-id`, giving engines a stable name to agree on without overloading the meaning of `definition-id`. ### SQL-standard example A user specifies the specific name at creation time with the `SPECIFIC` clause: \`\`\`sql CREATE FUNCTION add_one(x INT) RETURNS INT SPECIFIC add_one_int RETURN x + 1; CREATE FUNCTION add_one(x FLOAT) RETURNS FLOAT SPECIFIC add_one_float RETURN x + 1.0; \`\`\` and can then reference a single overload by that name, independent of its signature: \`\`\`sql DROP SPECIFIC FUNCTION add_one_int; \`\`\` This maps directly onto the new field: the two overloads above serialize with `"specific-name": "add_one_int"` and `"specific-name": "add_one_float"` respectively (see the updated Appendix A example). ## Design notes - `specific-name` is **optional** and **backward compatible** — `definition-id` is left unchanged, so existing metadata stays valid and readers that ignore the field still work. - It is for **identification only** and **must not** be used for overload resolution; resolution remains based on the definition's `parameters`. - When present, `specific-name` must be unique among all definitions in the UDF metadata, and must not collide with any other definition's `definition-id`, so a single string references at most one definition. - When absent, `definition-id` continues to serve as the definition's identifier. ## Changes - New optional `specific-name` row in the Definition fields table. - New **Specific Name** subsection defining its semantics and uniqueness rules. - Updated Appendix A (overloaded `add_one`) to illustrate the field. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
