vladanvasi-db commented on PR #49148: URL: https://github.com/apache/spark/pull/49148#issuecomment-2543171258
From my understanding, `toJSON` is a proper API: it was added in Spark 2.0.0 and is documented [here](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/Dataset.html#toJSON--) in the official `Dataset` Spark documentation. I see many code examples on the internet that use the `toJSON` API to produce a series of JSON strings from a `Dataset`/`DataFrame`, and I have not seen it described anywhere as improper or slated for deprecation.

The workaround that @HyukjinKwon proposed works in my case, but I would like to hear your thoughts on whether it is worth extending the `Dataset` API with a `toJSON(jsonOptions: Map[String, String])` method. These JSON options are also well documented [here](https://spark.apache.org/docs/3.5.1/sql-data-sources-json.html), and I think users would be better served by a `toJSON(options)` method than by the workaround proposed above. Please share your thoughts @cloud-fan @MaxGekk @HyukjinKwon, because this is critical for timestamp precision in JSON strings converted from a `Dataset`.
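For context, a minimal sketch of the workaround vs. the proposed API, assuming a `SparkSession` named `spark` and a `DataFrame` named `df` are already in scope, and using `timestampFormat` (a documented JSON data source option) purely as an illustrative example:

```scala
import org.apache.spark.sql.functions.{col, struct, to_json}
import spark.implicits._

// Example JSON option controlling timestamp precision in the output.
val options = Map("timestampFormat" -> "yyyy-MM-dd'T'HH:mm:ss.SSSSSSXXX")

// Workaround: pack all columns into a struct and serialize with to_json,
// which (unlike Dataset.toJSON) already accepts an options map.
val jsonDs = df
  .select(to_json(struct(df.columns.map(col): _*), options).as("json"))
  .as[String]

// Proposed API, which would make the intent direct:
//   val jsonDs = df.toJSON(options)
```

The workaround produces the same `Dataset[String]` shape as `toJSON`, but an overload taking options would spare users from rebuilding the row as a struct themselves.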
