anshul98ks123 opened a new pull request, #17534:
URL: https://github.com/apache/pinot/pull/17534
## Issue
`FieldSpec.getStringValue()` fails to serialize Scala collections to valid
JSON
## Description
When `jackson-module-scala` is on the classpath (via Kafka connectors),
Jackson deserializes empty JSON objects `{}` as Scala collections instead of
Java collections. The existing `getStringValue()` method only handles Java
`Map`/`List` via `instanceof` checks, but **Scala collections don't implement
`java.util.Map` or `java.util.List`**, causing them to fall through to
`toString()` which produces invalid JSON.
For example, a schema with a `ComplexFieldSpec`:
```
"complexFieldSpecs": [
  {
    "name": "dimensions",
    "dataType": "MAP",
    "defaultNullValue": {}
  }
]
```
When Jackson deserializes this with `jackson-module-scala` on the classpath:
1. `"defaultNullValue": {}` → `scala.collection.immutable.Map$EmptyMap$`
2. `setDefaultNullValue(scalaMap)` calls `getStringValue(scalaMap)`
3. `instanceof Map` returns false (Scala `Map` ≠ `java.util.Map`)
4. Falls through to `scalaMap.toString()` → `"Map()"`
5. Later, `setDataType(DataType.MAP)` triggers `getDefaultNullValue()`, which
   calls `DataType.MAP.convert("Map()")`
6. `JsonUtils.stringToObject("Map()", Map.class)` fails because `"Map()"` is
   not valid JSON
```
Caused by: com.fasterxml.jackson.databind.exc.MismatchedInputException:
Cannot convert value: 'Map()' to type: MAP
at [Source: REDACTED; line: 333, column: 33] (through reference chain:
TablePreviewApi["tableConfigs"]->TableConfigs["schema"]->Schema["complexFieldSpecs"]
->ArrayList[0]->ComplexFieldSpec["dataType"])
```
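Steps 3 and 4 above can be reproduced without Scala on the classpath. The following is a minimal sketch, not Pinot code: `FakeScalaEmptyMap` is a hypothetical stand-in for `scala.collection.immutable.Map$EmptyMap$`, and `toStringValue` is a simplified dispatch that keeps only the `instanceof` check and the `toString()` fallback.

```java
import java.util.HashMap;
import java.util.Map;

final class FallThroughDemo {
  // Hypothetical stand-in for the Scala empty map: it does not implement
  // java.util.Map and prints itself as "Map()".
  static final class FakeScalaEmptyMap {
    @Override
    public String toString() {
      return "Map()";
    }
  }

  // Simplified dispatch (assumption): only the instanceof test and the fallback.
  static String toStringValue(Object value) {
    if (value instanceof Map) {
      return "<handled as java.util.Map>";
    }
    return value.toString();
  }

  public static void main(String[] args) {
    System.out.println(toStringValue(new HashMap<String, Object>())); // <handled as java.util.Map>
    System.out.println(toStringValue(new FakeScalaEmptyMap()));       // Map()
  }
}
```

The second call returns `Map()`, which is the string that later fails JSON parsing in step 6.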
## Bug
The bug is in
[FieldSpec.getStringValue()](https://github.com/startreedata/pinot/blob/3bb68a04bb03dd108030fd2280aee1701bb46058/pinot-spi/src/main/java/org/apache/pinot/spi/data/FieldSpec.java#L347):
```
public static String getStringValue(Object value) {
  if (value instanceof BigDecimal) {
    return ((BigDecimal) value).toPlainString();
  }
  if (value instanceof byte[]) {
    return BytesUtils.toHexString((byte[]) value);
  }
  return value.toString(); // ← BUG: Scala Map.toString() = "Map()"
}
```
## Fix
This PR fixes that by detecting Scala collections via class name and
serializing them to JSON:
```
public static String getStringValue(Object value) {
  // ... BigDecimal, byte[] handling ...
  // Handle Java collections AND Scala collections
  if (value instanceof Map || value instanceof List || isScalaCollection(value)) {
    try {
      return JsonUtils.objectToString(value); // Serialize to JSON properly
    } catch (JsonProcessingException e) {
      throw new RuntimeException("Failed to serialize collection to JSON", e);
    }
  }
  return value.toString();
}
```
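The body of `isScalaCollection(value)` is not shown in this description. As a rough sketch of what detection "via class name" could look like, the hypothetical helper below treats any object whose class name starts with `scala.collection.` as a Scala collection. The class name `ScalaCollectionCheck`, the prefix constant, and the `isScalaCollectionClassName` split are assumptions made for illustration; the PR's actual check may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: the PR's real isScalaCollection is not shown here.
final class ScalaCollectionCheck {
  // Assumed prefix; a prefix match is coarse and covers the whole package.
  private static final String SCALA_COLLECTION_PREFIX = "scala.collection.";

  private ScalaCollectionCheck() {
  }

  // True when the fully qualified class name lives under scala.collection.
  static boolean isScalaCollectionClassName(String className) {
    return className != null && className.startsWith(SCALA_COLLECTION_PREFIX);
  }

  // Null-safe check on the runtime class of the value.
  static boolean isScalaCollection(Object value) {
    return value != null && isScalaCollectionClassName(value.getClass().getName());
  }

  public static void main(String[] args) {
    Map<String, Object> javaMap = new HashMap<>();
    System.out.println(isScalaCollection(javaMap)); // false
    System.out.println(
        isScalaCollectionClassName("scala.collection.immutable.Map$EmptyMap$")); // true
  }
}
```

Matching on the class name rather than using `instanceof` means the code compiles without referencing any Scala classes, so it does not need the Scala library as a compile-time dependency.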
## Testing
- [X] UTs
- [X] Lint Check
- [X] Ran STP locally and validated the Preview API with the following block:
```
{
  "schema": {
    ...
    ...
    "complexFieldSpecs": [
      {
        "fieldType": "COMPLEX",
        "childFieldSpecs": {
          ...
          ...
        },
        "singleValueField": true,
        "notNull": false,
        "allowTrailingZeros": false,
        "defaultNullValueString": "{}",
        "name": "address",
        "defaultNullValue": {},
        "dataType": "MAP"
      }
    ]
  }
}
```