[ 
https://issues.apache.org/jira/browse/SPARK-46275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793453#comment-17793453
 ] 

Raghu Angadi commented on SPARK-46275:
--------------------------------------

Context from [~gengliang] about `from_avro()` behavior:
{quote}This follows the JSON/CSV data sources. On read failure, those file 
sources still try to fill in partial results instead of nulls. In Avro, it is 
hard to get a partial parser result on error, so we simply create a row with all 
empty values. 
{quote}
 

> Protobuf: Permissive mode should return null rather than struct with null 
> fields
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-46275
>                 URL: https://issues.apache.org/jira/browse/SPARK-46275
>             Project: Spark
>          Issue Type: Bug
>          Components: Protobuf, Structured Streaming
>    Affects Versions: 3.5.0
>            Reporter: Raghu Angadi
>            Priority: Major
>             Fix For: 4.0.0, 3.5.1
>
>
> Consider a protobuf message with two fields: {{message Person { string name = 1; 
> int32 id = 2; }}}
>  * The struct returned by {{from_protobuf("Person")}} looks like this:
>  ** STRUCT<name STRING, id INT>
>  * If the underlying binary record fails to deserialize, it results in an 
> exception and the query fails.
>  * But if the option {{mode}} is set to {{PERMISSIVE}}, malformed records 
> are tolerated and {{null}} is returned.
>  ** {*}BUT{*}: The returned struct looks like this: {{{"name": null, "id": 
> null}}}
>  *** This is inconvenient for the user.
>  *** *Ideally,* {{from_protobuf()}} *should return* {{null}} *.*
>  ** {{from_protobuf()}} borrowed the current behavior from the {{from_avro()}} 
> implementation. It is not clear what the motivation was.
> I think we should update the implementation to return {{null}} rather than a 
> struct with null fields inside.
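
The difference between the two behaviors described above can be sketched in plain Python. This is only an illustrative model, not Spark code: {{parse_person}}, the comma-separated record format, and the function names are hypothetical stand-ins for protobuf deserialization, chosen so the sketch runs without a Spark installation.

```python
from typing import Any, Dict, List, Optional

FIELDS = ("name", "id")  # mirrors STRUCT<name STRING, id INT>


def parse_person(raw: bytes) -> Dict[str, Any]:
    """Hypothetical stand-in for protobuf deserialization.

    Expects b"name,id" and raises ValueError on malformed input.
    """
    name, id_text = raw.decode("utf-8").split(",")
    return {"name": name, "id": int(id_text)}


def permissive_current(raw: bytes) -> Dict[str, Any]:
    """Models the reported behavior: a struct whose fields are all null."""
    try:
        return parse_person(raw)
    except ValueError:
        return {field: None for field in FIELDS}


def permissive_proposed(raw: bytes) -> Optional[Dict[str, Any]]:
    """Models the proposed behavior: the whole struct is null."""
    try:
        return parse_person(raw)
    except ValueError:
        return None


records: List[bytes] = [b"alice,1", b"\xff\xfe", b"bob,2"]
current = [permissive_current(r) for r in records]
proposed = [permissive_proposed(r) for r in records]

# Proposed behavior: one null check identifies a malformed record.
bad_proposed = [i for i, row in enumerate(proposed) if row is None]

# Reported behavior: every field has to be inspected to guess that the
# record was malformed.
bad_current = [
    i for i, row in enumerate(current)
    if all(value is None for value in row.values())
]
```

In this model, a caller of the proposed variant can drop malformed records with a single null check on the struct, whereas the reported variant forces a check on every field.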



