stevenzwu opened a new pull request, #16586:
URL: https://github.com/apache/iceberg/pull/16586

   ## Summary
   
   `JsonUtil.getStringArray(JsonNode)` calls `asText()` on each element without 
verifying that it is actually a JSON string. Non-textual elements are silently 
coerced (e.g. `45` becomes `"45"`, `true` becomes `"true"`), masking malformed 
inputs instead of surfacing them.
   
   This change adds an `isTextual()` check per element, mirroring the 
validation already performed by `JsonStringArrayIterator` (used by 
`getStringList`, `getStringSet`, and `getStringListOrNull`).
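
The difference between the two behaviors can be sketched without Jackson on the classpath. The example below is a stand-in, not the actual `JsonUtil` code: it assumes a parsed JSON array can be modeled as a `List<Object>` whose elements are `String`, `Integer`, or `Boolean`, and the class and method names (`StringArraySketch`, `coercing`, `strict`) are hypothetical. `String.valueOf` plays the role of `asText()`, and `instanceof String` plays the role of `isTextual()`.

```java
import java.util.Arrays;
import java.util.List;

public class StringArraySketch {

  // Stand-in for the old behavior: every element is stringified,
  // so the number 45 quietly becomes the string "45".
  static String[] coercing(List<?> elements) {
    String[] result = new String[elements.size()];
    for (int i = 0; i < result.length; i++) {
      result[i] = String.valueOf(elements.get(i));
    }
    return result;
  }

  // Stand-in for the new behavior: each element must already be a string,
  // otherwise the call fails with the message quoted in this PR.
  static String[] strict(List<?> elements) {
    String[] result = new String[elements.size()];
    for (int i = 0; i < result.length; i++) {
      Object element = elements.get(i);
      if (!(element instanceof String)) {
        throw new IllegalArgumentException(
            "Cannot parse string from non-text value: " + element);
      }
      result[i] = (String) element;
    }
    return result;
  }

  public static void main(String[] args) {
    // Models the malformed JSON array ["a", 45, true].
    List<Object> malformed = Arrays.asList("a", 45, true);

    // prints [a, 45, true]
    System.out.println(Arrays.toString(coercing(malformed)));

    try {
      strict(malformed);
    } catch (IllegalArgumentException e) {
      // prints Cannot parse string from non-text value: 45
      System.out.println(e.getMessage());
    }
  }
}
```

With the coercing variant the caller cannot tell that the input was malformed; with the strict variant the first offending element is named in the exception.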
   
   ## Why this is safe
   
   There are three callers of `JsonUtil.getStringArray` today, and all three 
expect arrays of JSON strings; none rely on the coercion behavior:
   
   1. `ViewVersionParser.fromJson` 
(`core/src/main/java/org/apache/iceberg/view/ViewVersionParser.java:100`) — 
parses the `default-namespace` field of a view version. The Iceberg view spec 
defines this as a list of strings; only string levels are valid.
   2. `RESTSerializers.NamespaceDeserializer.deserialize` 
(`core/src/main/java/org/apache/iceberg/rest/RESTSerializers.java:262`) — 
deserializes a `Namespace` from a REST payload. The REST OpenAPI spec defines a 
Namespace as `array<string>`; non-string elements would be a protocol violation.
   3. `RemoteSignRequestParser.headersFromJson` 
(`core/src/main/java/org/apache/iceberg/rest/requests/RemoteSignRequestParser.java:136`)
 — parses HTTP header values, which are by definition strings.
   
   In every case, a non-string element indicates malformed input. Failing fast 
with `Cannot parse string from non-text value: <element>` is strictly more 
useful than silently coercing.
   
   ## Test plan
   
   - [x] New unit tests in `TestJsonUtil.getStringArray` covering: `null` node, 
non-array node, array containing a non-string element, valid string array, and 
empty array.
   - [x] Existing `TestJsonUtil`, `TestViewVersionParser`, and 
`TestRemoteSignRequestParser` tests continue to pass.
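
The five input shapes listed for the new unit tests can be walked through with a small stdlib-only sketch. This is not the code in `TestJsonUtil`: a plain `Object` stands in for the JSON node (`null` for a null node, a `List` for an array, anything else for a non-array), the names `GetStringArrayCases` and `rejects` are hypothetical, and the non-array error message is an assumption. Only the non-text message is taken from this PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class GetStringArrayCases {

  // Hypothetical stand-in for the patched method; a plain Object models the JSON node.
  static String[] getStringArray(Object node) {
    if (!(node instanceof List)) {
      // Covers both a null node and a non-array node (message is an assumption).
      throw new IllegalArgumentException(
          "Cannot parse string array from non-array: " + node);
    }
    List<?> elements = (List<?>) node;
    String[] result = new String[elements.size()];
    for (int i = 0; i < result.length; i++) {
      Object element = elements.get(i);
      if (!(element instanceof String)) {
        throw new IllegalArgumentException(
            "Cannot parse string from non-text value: " + element);
      }
      result[i] = (String) element;
    }
    return result;
  }

  // Returns true when the input is rejected with IllegalArgumentException.
  static boolean rejects(Object node) {
    try {
      getStringArray(node);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(rejects(null));                    // null node: true
    System.out.println(rejects("not-an-array"));          // non-array node: true
    System.out.println(rejects(Arrays.asList("a", 45)));  // non-string element: true
    System.out.println(
        Arrays.toString(getStringArray(Arrays.asList("a", "b"))));       // valid: [a, b]
    System.out.println(getStringArray(Collections.emptyList()).length);  // empty: 0
  }
}
```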

