andygrove commented on PR #4592:
URL: 
https://github.com/apache/datafusion-comet/pull/4592#issuecomment-4739768650

   A few observations from a review pass. Acknowledging up front that this 
comment was drafted with help from AI (Claude Code).
   
   1. `native/spark-expr/src/string_funcs/split.rs` — `spark_split_sql` matches 
`(Array, Scalar)`, `(Array, Array)`, and `(Scalar, Scalar)`, but `(Scalar, 
Array)` falls through to the `exec_err!` arm. Could that shape ever come 
through, e.g. `split_part('abc', d, 1)` where the delimiter is a column? 
`spark_split` has the same gap so this is consistent with what's there today, 
but if it's reachable it's probably worth treating the same as `(Array, Array)` 
after broadcasting.
   
   2. `native/spark-expr/src/array_funcs/list_extract.rs` — want to make sure 
I'm reading the cast change correctly. `data_type(&schema)` returns the child 
field's element type, so the previous code was casting the default value to the 
outer list type instead of the element type, which would either no-op into a 
typed null or fail for `split_part`'s empty-string default. Switching to 
`element_type` lines up with how Spark's `ElementAt(... Some(Literal.create("", 
str.dataType)) ...)` wants the default typed. Is that the right reading?
   
   3. `spark/src/test/resources/sql-tests/expressions/string/split_part.sql` — 
coverage looks solid. Two nice-to-haves if you feel like adding them. A 
column-data row with a multi-character delimiter would exercise the path where 
`str::split` advances by more than one byte per match. And a fixture where the 
string contains regex metacharacters like `*` or `.` paired with a literal 
delimiter would assert at the SQL level what 
`test_split_sql_keeps_regex_chars_literal` asserts in Rust.
   
   4. CI hasn't reported any checks on 
`support-stringsplitsql-split-part-4561`. A push or workflow re-run would help 
confirm the SQL fixtures pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to