szehon-ho commented on PR #16727:
URL: https://github.com/apache/iceberg/pull/16727#issuecomment-4652308675

   ## Alternative considered: making `definition-id` user-specifiable (not 
chosen)
   
   While designing this, we considered an alternative to adding a separate 
field: **allow `definition-id` itself to be optionally user-specified**, so it 
could double as the specific name.
   
   ### What Option B would look like
   
   Instead of deriving `definition-id` strictly from the canonical 
parameter-type tuple, it would *default* to that canonical form but be 
overridable with any user-chosen, unique string. The signature used for 
overload resolution and the one-definition-per-signature rule would then be 
based solely on the `parameters` list, and `definition-id` would become an 
opaque identity token (which could carry the human-readable "specific name").
   
   ### Why we didn't choose it
   
   - **It breaks a useful, checkable invariant.** Today `definition-id == 
canonical(parameters)`, so a reader can recompute the ID from the signature and 
validate it (some readers even parse it to recover types). Making it 
user-overridable removes that property and requires a *separate* uniqueness 
check on `parameters` that currently comes for free.
   - **Backward-compatibility risk.** Existing consumers may assume 
`definition-id` equals the signature. Allowing arbitrary overrides could 
silently break them. Option A leaves `definition-id` completely untouched, so 
it's purely additive.
   - **It re-conflates two concepts.** The whole motivation (#16700) is that 
the SQL standard keeps *signature* (for resolution) and *specific name* (for 
identification) separate. Option B reduces the field count but folds both roles 
back into one field — the exact conflation we're trying to address. A 
self-documenting, signature-derived `definition-id` plus an explicit 
`specific-name` keeps the two roles cleanly separated and matches the 
standard's model directly.
   - **Footgun potential.** A user could set `definition-id: "int"` on a 
`float` overload, producing an ID that looks like a signature but isn't. The 
canonical-by-default form is self-documenting; a free-form override is not.
   
   Net: Option A (this PR) is fully additive, preserves the verifiable 
canonical `definition-id`, and is the closest match to the SQL standard's 
two-name model, so we went with it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to