zero323 commented on pull request #35296:
URL: https://github.com/apache/spark/pull/35296#issuecomment-1019586587


   Thank you for your proposal @bjornjorgensen.
   
   First of all, some formalities:
   
   - Could you [enable GitHub actions](https://github.com/apache/spark/pull/35296/checks?check_run_id=4914852660) in your fork?
   - It would be better to have a JIRA ticket that specifically targets the documentation update, as the one you linked doesn't even use `pyspark.pandas`.
   
   
   Regarding the change ‒ as-is, this doesn't really describe the actual 
behavior:
   
   - If `path` is not provided, `to_json` delegates the transformation to pandas, so fields with NULL are preserved.
   - If `path` is provided, the standard writer is used, so the same rules apply (in particular, the `ignoreNullFields` keyword option and the `spark.sql.jsonGenerator.ignoreNullFields` config are respected).
   - In both cases, behavior doesn't depend on the number of null values in the 
column.
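
   The first point above can be sketched with plain pandas (which is what `to_json` delegates to when no `path` is given). This is a minimal illustration, not the `pyspark.pandas` code path itself; the column name `a` is just an example:

   ```python
   import pandas as pd

   # A frame with a missing value in column "a".
   df = pd.DataFrame({"a": [1, None]})

   # pandas serializes missing values as JSON null rather than
   # dropping the field, so NULL fields are preserved in the output.
   out = df.to_json(orient="records")
   print(out)  # [{"a":1.0},{"a":null}]
   ```

   By contrast, with a `path` Spark's JSON writer applies `ignoreNullFields` (default `true`), so null fields would be omitted unless that option or the `spark.sql.jsonGenerator.ignoreNullFields` config is set to `false`.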
   
   
   Finally, you have a typo ‒ "The column **well** be deleted"  -> "The column 
will be deleted."
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


