Maxim Gekk created SPARK-26303:
----------------------------------

             Summary: Return partial results for bad JSON records
                 Key: SPARK-26303
                 URL: https://issues.apache.org/jira/browse/SPARK-26303
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Maxim Gekk

Currently, the JSON datasource and JSON functions return a row with all nulls for a malformed JSON string in PERMISSIVE mode when the specified schema has the struct type. All nulls are returned even if some of the fields were parsed and converted to the desired types successfully. The ticket aims to solve the problem by returning the fields that were already parsed. The corrupt-record column, specified via the JSON option `columnNameOfCorruptRecord` or the SQL config, should contain the whole original JSON string.

For example, if the input has one JSON string:
{code:json}
{"a":0.1,"b":{},"c":"def"}
{code}
and the specified schema is:
{code:sql}
a DOUBLE, b ARRAY<INT>, c STRING, _corrupt_record STRING
{code}
the expected output of `from_json` in PERMISSIVE mode is:
{code}
+---+----+---+--------------------------+
|a  |b   |c  |_corrupt_record           |
+---+----+---+--------------------------+
|0.1|null|def|{"a":0.1,"b":{},"c":"def"}|
+---+----+---+--------------------------+
{code}
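
The requested behavior can be illustrated with a small pure-Python sketch. This is not Spark code and not the proposed implementation; it only mimics the semantics described in the ticket: convert each field independently, leave failed fields as null, and keep the original string in the corrupt-record column. The names `parse_permissive`, `SCHEMA`, and the converter helpers are hypothetical and exist only for this illustration.

```python
import json

# Hypothetical per-field converters for the schema used in the ticket
# (a DOUBLE, b ARRAY<INT>, c STRING). They are assumptions, not Spark APIs.
def to_double(value):
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("expected a number")
    return float(value)

def to_int_array(value):
    if not isinstance(value, list):
        raise ValueError("expected an array")
    if any(isinstance(x, bool) or not isinstance(x, int) for x in value):
        raise ValueError("expected integer elements")
    return value

def to_string(value):
    if not isinstance(value, str):
        raise ValueError("expected a string")
    return value

SCHEMA = {"a": to_double, "b": to_int_array, "c": to_string}

def parse_permissive(raw, schema=SCHEMA, corrupt_col="_corrupt_record"):
    """Return a row dict, keeping fields that converted successfully."""
    row = {name: None for name in schema}
    row[corrupt_col] = None
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        # Nothing could be parsed: all nulls plus the original string.
        row[corrupt_col] = raw
        return row
    if not isinstance(obj, dict):
        row[corrupt_col] = raw
        return row
    bad = False
    for name, convert in schema.items():
        if obj.get(name) is None:
            continue  # missing or null field stays null
        try:
            row[name] = convert(obj[name])
        except ValueError:
            bad = True  # this field stays null; the others are kept
    if bad:
        row[corrupt_col] = raw
    return row

record = '{"a":0.1,"b":{},"c":"def"}'
print(parse_permissive(record))
```

For the record from the ticket, `a` and `c` are kept, `b` is null because `{}` is not an array, and `_corrupt_record` holds the whole input string. A fully valid record leaves `_corrupt_record` as null.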