shtripat opened a new pull request, #765:
URL: https://github.com/apache/iceberg-go/pull/765

   ### Problem
   
   When a NestedField has a nil Type — which happens when the JSON payload uses 
the wrong key (e.g. "field_type" instead of "type") or omits the "type" key 
entirely — two failure modes exist:
   
   1. NestedField.UnmarshalJSON silently accepts nil types.
   
   Go's encoding/json leaves fields at their zero value when a key is absent or 
unrecognised. Since typeIFace wraps a Type interface, the zero value is nil. 
UnmarshalJSON already validates id but applies no equivalent guard to type, so 
a payload with a missing or misspelled type key produces a NestedField with 
Type == nil and returns no error.
   
   2. typeIFace.MarshalJSON panics on nil type.
   ```
   // before
   func (t *typeIFace) MarshalJSON() ([]byte, error) {
       if nested, ok := t.Type.(NestedType); ok {
           return json.Marshal(nested)
       }
       return []byte(`"` + t.Type.Type() + `"`), nil  // nil pointer 
dereference if t.Type == nil
   }
   ```
   
   When a schema containing a nil-type field is later marshalled (e.g. during 
CreateTable, CommitTable, or any serialisation path), t.Type.Type() is a method 
call on a nil interface — an unrecovered panic. There is no defer recover() in 
the json.Marshal call path, so the process crashes.
   
   ```
   Example triggering input:
   {
     "type": "struct",
     "fields": [
       {"id": 1, "name": "col", "field_type": "long", "required": true}
     ]
   }
   ```
   
   Before this fix: json.Unmarshal succeeds silently, subsequent json.Marshal 
panics. After this fix: json.Unmarshal returns ErrInvalidSchema immediately.
   
   ### Changes
   
   * types.go
   
   NestedField.UnmarshalJSON — add a nil-type guard after decoding, consistent 
with the existing id and name guards introduced in PRs #758 and #761: if n.Type 
== nil {
       return fmt.Errorf("%w: field %q is missing required 'type' key in JSON", 
ErrInvalidSchema, n.Name)
   }
   
   typeIFace.MarshalJSON — add a defensive nil guard so that a NestedField 
constructed programmatically with Type == nil returns an error instead of 
panicking: if t.Type == nil {
       return nil, fmt.Errorf("%w: field type is nil; use the \"type\" JSON key 
with a valid Iceberg type such as \"long\", \"string\", or \"boolean\"", 
ErrInvalidSchema)
   }
   
   * types_test.go — three new tests:
   
   - TestNestedFieldUnmarshalMissingType: absent "type" key → ErrInvalidSchema
   - TestNestedFieldUnmarshalWrongTypeKey: "field_type" instead of "type" → 
ErrInvalidSchema
   - TestTypeIFaceMarshalJSONNilType: programmatically constructed 
NestedField{Type: nil} → json.Marshal returns ErrInvalidSchema, no panic
   
   ### Behaviour after this change
   
   |                   Scenario                    |              Before        
       |             After             |
   
|-----------------------------------------------|-----------------------------------|-------------------------------|
   | "type" key absent from JSON                   | Silent nil type, panic on 
marshal | ErrInvalidSchema at unmarshal |
   | Wrong key ("field_type") used                 | Silent nil type, panic on 
marshal | ErrInvalidSchema at unmarshal |
   | NestedField with nil type marshalled directly | Unrecovered panic          
       | ErrInvalidSchema returned     |
   | Valid field with correct "type" key           | OK                         
       | OK (unchanged)                |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to