Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/15329
  
    @yhuai Sure, I support that idea too, as I see it the same way; these 
would be corner cases. However, one case that immediately comes to mind is 
reusing the same schema as-is (because inferring a schema from JSON is 
discouraged in production). For example,
    
    ```scala
    // Write JSON, then read it back with the same schema to avoid inference.
    df.write.format("json").save(...)
    ...
    spark.read.format("json").schema(df.schema).load(...)
    ```
    
    or
    
    ```scala
    // from_json takes a Column, not a column name string.
    import org.apache.spark.sql.functions.{col, from_json}
    anotherDF.select(from_json(col("a"), df.schema))
    ```
    
    In that case, I guess the original `df`'s schema could contain non-nullable fields.
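    For illustration, here is a minimal sketch of how such a `df` could arise; the local session setup and field name are assumptions for the example, not from the discussion above:
    
    ```scala
    import org.apache.spark.sql.{Row, SparkSession}
    import org.apache.spark.sql.types.{StringType, StructField, StructType}
    
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    
    // createDataFrame keeps the user-supplied nullability,
    // so field "a" stays non-nullable in df.schema.
    val schema = StructType(Seq(StructField("a", StringType, nullable = false)))
    val df = spark.createDataFrame(
      spark.sparkContext.parallelize(Seq(Row("x"))),
      schema)
    df.printSchema()  // |-- a: string (nullable = false)
    ```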
    
    FYI, as you might already know, Spark forces the schema to nullable when 
reading/writing files, but not for Structured Streaming, reading from an 
RDD, or `from_json`. I opened a PR before to make this consistent 
(https://github.com/apache/spark/pull/14124).
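    To make the inconsistency concrete, here is a hedged sketch reusing the `df` from the snippet above; the path is illustrative, and the `printSchema` comments reflect the behavior described in this comment rather than a verified output:
    
    ```scala
    import org.apache.spark.sql.functions.{col, from_json}
    import spark.implicits._
    
    // File round-trip: file sources force the schema to nullable,
    // so the non-nullable flag on "a" is lost on read.
    df.write.format("json").save("/tmp/nullability-example")
    spark.read.format("json").schema(df.schema).load("/tmp/nullability-example")
      .printSchema()  // |-- a: string (nullable = true)
    
    // from_json keeps the supplied schema's nullability as-is,
    // which is the inconsistency PR 14124 tried to address.
    Seq("""{"a": "x"}""").toDF("raw")
      .select(from_json(col("raw"), df.schema).as("parsed"))
      .printSchema()  // |-- parsed: struct ... a: string (nullable = false)
    ```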

