shtripat opened a new pull request, #765:
URL: https://github.com/apache/iceberg-go/pull/765
### Problem
When a NestedField has a nil Type — which happens when the JSON payload uses
the wrong key (e.g. "field_type" instead of "type") or omits the "type" key
entirely — two failure modes exist:
1. NestedField.UnmarshalJSON silently accepts nil types.
Go's encoding/json leaves fields at their zero value when a key is absent or
unrecognised. Since typeIFace wraps a Type interface, the zero value is nil.
UnmarshalJSON already validates id but applies no equivalent guard to type, so
a payload with a missing or misspelled type key produces a NestedField with
Type == nil and returns no error.
2. typeIFace.MarshalJSON panics on nil type.
```
// before
func (t *typeIFace) MarshalJSON() ([]byte, error) {
if nested, ok := t.Type.(NestedType); ok {
return json.Marshal(nested)
}
return []byte(`"` + t.Type.Type() + `"`), nil // nil pointer
dereference if t.Type == nil
}
```
When a schema containing a nil-type field is later marshalled (e.g. during
CreateTable, CommitTable, or any serialisation path), t.Type.Type() is a method
call on a nil interface — an unrecovered panic. There is no defer recover() in
the json.Marshal call path, so the process crashes.
```
Example triggering input:
{
"type": "struct",
"fields": [
{"id": 1, "name": "col", "field_type": "long", "required": true}
]
}
```
Before this fix: json.Unmarshal succeeds silently, subsequent json.Marshal
panics. After this fix: json.Unmarshal returns ErrInvalidSchema immediately.
### Changes
* types.go
NestedField.UnmarshalJSON — add a nil-type guard after decoding, consistent
with the existing id and name guards introduced in PRs #758 and #761: if n.Type
== nil {
return fmt.Errorf("%w: field %q is missing required 'type' key in JSON",
ErrInvalidSchema, n.Name)
}
typeIFace.MarshalJSON — add a defensive nil guard so that a NestedField
constructed programmatically with Type == nil returns an error instead of
panicking: if t.Type == nil {
return nil, fmt.Errorf("%w: field type is nil; use the \"type\" JSON key
with a valid Iceberg type such as \"long\", \"string\", or \"boolean\"",
ErrInvalidSchema)
}
* types_test.go — three new tests:
- TestNestedFieldUnmarshalMissingType: absent "type" key → ErrInvalidSchema
- TestNestedFieldUnmarshalWrongTypeKey: "field_type" instead of "type" →
ErrInvalidSchema
- TestTypeIFaceMarshalJSONNilType: programmatically constructed
NestedField{Type: nil} → json.Marshal returns ErrInvalidSchema, no panic
### Behaviour after this change
| Scenario | Before
| After |
|-----------------------------------------------|-----------------------------------|-------------------------------|
| "type" key absent from JSON | Silent nil type, panic on
marshal | ErrInvalidSchema at unmarshal |
| Wrong key ("field_type") used | Silent nil type, panic on
marshal | ErrInvalidSchema at unmarshal |
| NestedField with nil type marshalled directly | Unrecovered panic
| ErrInvalidSchema returned |
| Valid field with correct "type" key | OK
| OK (unchanged) |
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]