Raghu Angadi created SPARK-46275:
------------------------------------

             Summary: Protobuf: Permissive mode should return null rather than 
struct with null fields
                 Key: SPARK-46275
                 URL: https://issues.apache.org/jira/browse/SPARK-46275
             Project: Spark
          Issue Type: Bug
          Components: Protobuf, Structured Streaming
    Affects Versions: 3.5.0
            Reporter: Raghu Angadi
             Fix For: 4.0.0, 3.5.1


Consider a protobuf message with two fields: {{message Person \{ string name = 1; 
int32 id = 2; }}}.
 * The struct returned by {{from_protobuf("Person")}} looks like this:

 ** {{STRUCT<name STRING, id INT>}}

 * If the underlying binary record fails to deserialize, it results in an 
exception and the query fails.

 * But if the option {{mode}} is set to {{PERMISSIVE}}, malformed records are 
tolerated and {{null}} is returned.

 ** {*}BUT{*}: The returned struct looks like this: {{{"name": null, "id": null}}}

 *** This is inconvenient for the user.

 *** *Ideally,* {{from_protobuf()}} *should return* {{null}} *.*

 *** {{from_protobuf()}} borrowed the current behavior from the {{from_avro()}} 
implementation. The motivation for that behavior is not clear.

I think we should update the implementation to return {{null}} rather than a 
struct with null fields inside.
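
The difference between the two behaviors can be modeled in a few lines of plain Python. This is only an illustrative sketch, not Spark's implementation: {{parse_person}}, {{from_record}}, the {{null_on_error}} flag, and the toy "name,id" byte format are all hypothetical stand-ins for the real protobuf deserializer and its options.

```python
from typing import Optional

# Mirrors the fields of STRUCT<name STRING, id INT> from the example above.
FIELDS = ("name", "id")


class MalformedRecord(Exception):
    """Stand-in for a protobuf deserialization failure."""


def parse_person(raw: bytes) -> dict:
    # Hypothetical toy format "name,id"; real code would parse protobuf bytes.
    try:
        name, id_text = raw.decode("utf-8").split(",")
        return {"name": name, "id": int(id_text)}
    except (UnicodeDecodeError, ValueError) as exc:
        raise MalformedRecord(raw) from exc


def from_record(raw: bytes, mode: str = "FAILFAST",
                null_on_error: bool = False) -> Optional[dict]:
    try:
        return parse_person(raw)
    except MalformedRecord:
        if mode != "PERMISSIVE":
            raise  # default: the failure propagates and the query fails
        if null_on_error:
            return None  # proposed: the whole result is null
        # current: a struct whose fields are all null
        return {field: None for field in FIELDS}
```

With the proposed behavior, a caller in this model can detect a malformed record with a single {{is None}} check on the result instead of inspecting each field.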


