Brijesh-Thakkar commented on code in PR #3018:
URL: https://github.com/apache/datafusion-comet/pull/3018#discussion_r2656218965
##########
native/spark-expr/src/json_funcs/to_json.rs:
##########
@@ -181,6 +188,23 @@ fn escape_string(input: &str) -> String {
escaped_string
}
+fn normalize_special_floats(arr: &StringArray) -> ArrayRef {
+ let mut builder = StringBuilder::with_capacity(arr.len(), arr.len() * 8);
+
+ for i in 0..arr.len() {
+ if arr.is_null(i) {
+ builder.append_null();
+ } else {
+ match arr.value(i) {
+ "Infinity" | "-Infinity" | "NaN" => builder.append_null(),
Review Comment:
I agree that handling this earlier would be preferable in general. In this
case, to_json delegates primitive type handling to spark_cast, and the goal
here was to avoid changing spark_cast behavior globally, since it is used by
other expressions where preserving the "NaN" / "Infinity" string output may
be expected.
Normalizing the values at the to_json layer keeps the change scoped
specifically to JSON semantics while still aligning the output with Spark’s
behavior.
That said, I’m happy to move the check earlier or adjust the approach if you
think handling this during float-to-string conversion would be more
appropriate for Comet.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
For additional commands, e-mail: [email protected]