Brijesh-Thakkar commented on code in PR #3018:
URL: https://github.com/apache/datafusion-comet/pull/3018#discussion_r2656218965
##########
native/spark-expr/src/json_funcs/to_json.rs:
##########
@@ -181,6 +188,23 @@ fn escape_string(input: &str) -> String {
escaped_string
}
+fn normalize_special_floats(arr: &StringArray) -> ArrayRef {
+ let mut builder = StringBuilder::with_capacity(arr.len(), arr.len() * 8);
+
+ for i in 0..arr.len() {
+ if arr.is_null(i) {
+ builder.append_null();
+ } else {
+ match arr.value(i) {
+ "Infinity" | "-Infinity" | "NaN" => builder.append_null(),
Review Comment:
I agree that handling this earlier would be preferable in general. In this
case, to_json delegates primitive type handling to spark_cast, and the goal
here was to avoid changing spark_cast behavior globally, since it is used by
other expressions where preserving the "NaN" / "Infinity" string output may
be expected.
Normalizing the values at the to_json layer keeps the change scoped
specifically to JSON semantics while still aligning the output with Spark’s
behavior.
That said, I’m happy to move the check earlier or adjust the approach if you
think handling this during float-to-string conversion would be more
appropriate for Comet.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
For additional commands, e-mail: [email protected]