Raghu Angadi created SPARK-46275:
------------------------------------
Summary: Protobuf: Permissive mode should return null rather than
struct with null fields
Key: SPARK-46275
URL: https://issues.apache.org/jira/browse/SPARK-46275
Project: Spark
Issue Type: Bug
Components: Protobuf, Structured Streaming
Affects Versions: 3.5.0
Reporter: Raghu Angadi
Fix For: 4.0.0, 3.5.1
Consider a protobuf message with two fields: {{message Person \{ string name = 1;
int32 id = 2; }}}.
* The struct returned by {{from_protobuf("Person")}} looks like this:
** {{STRUCT<name STRING, id INT>}}
* If the underlying binary record fails to deserialize, it results in an
exception and the query fails.
* But if the option {{mode}} is set to {{PERMISSIVE}}, malformed records are
tolerated and {{null}} is returned.
** {*}BUT{*}: The returned struct looks like this: {{{"name": null, "id": null}}}
*** This is inconvenient for the user.
*** *Ideally,* {{from_protobuf()}} *should return* {{null}} *.*
*** {{from_protobuf()}} borrowed the current behavior from the {{from_avro()}}
implementation. It is not clear what the motivation was.
I think we should update the implementation to return {{null}} rather than a
struct with null fields inside.
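
The difference between the two behaviors can be modeled with a small sketch. This is plain Python, not Spark code: {{decode_person}} is a hypothetical toy decoder that parses {{b"name,id"}} rather than the protobuf wire format, and the two {{permissive_*}} functions are illustrative names only. It shows why a plain null is easier to detect than a struct whose fields are all null.

```python
from typing import Optional


def decode_person(raw: bytes) -> dict:
    """Toy stand-in for deserialization (NOT the protobuf wire format).

    Expects bytes of the form b"name,id" and raises ValueError otherwise.
    """
    name, id_text = raw.decode("utf-8").split(",")
    return {"name": name, "id": int(id_text)}


def permissive_current(raw: bytes) -> Optional[dict]:
    """Models the behavior described in this issue: struct with null fields."""
    try:
        return decode_person(raw)
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        return {"name": None, "id": None}


def permissive_proposed(raw: bytes) -> Optional[dict]:
    """Models the proposed behavior: the whole value is null."""
    try:
        return decode_person(raw)
    except ValueError:
        return None


records = [b"alice,1", b"\xff\xfe"]  # second record is malformed

current = [permissive_current(r) for r in records]
proposed = [permissive_proposed(r) for r in records]

# With the current behavior a simple null check misses the malformed record.
missed = [row is None for row in current]
# With the proposed behavior the same check finds it.
found = [row is None for row in proposed]
```

With the current behavior, a caller has to inspect every field to recognize a malformed record; with the proposed behavior a single null check on the returned value is enough.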