andygrove commented on PR #4592:
URL:
https://github.com/apache/datafusion-comet/pull/4592#issuecomment-4739768650
A few observations from a review pass. Acknowledging up front that this
comment was drafted with help from AI (Claude Code).
1. `native/spark-expr/src/string_funcs/split.rs` — `spark_split_sql` matches
`(Array, Scalar)`, `(Array, Array)`, and `(Scalar, Scalar)`, but `(Scalar,
Array)` falls through to the `exec_err!` arm. Could that shape ever come
through, e.g. `split_part('abc', d, 1)` where the delimiter is a column?
`spark_split` has the same gap so this is consistent with what's there today,
but if it's reachable it's probably worth treating the same as `(Array, Array)`
after broadcasting.
2. `native/spark-expr/src/array_funcs/list_extract.rs` — want to make sure
I'm reading the cast change correctly. `data_type(&schema)` returns the child
field's element type, so the previous code was casting the default value to the
outer list type instead of the element type, which would either no-op into a
typed null or fail for `split_part`'s empty-string default. Switching to
`element_type` lines up with how Spark's `ElementAt(... Some(Literal.create("",
str.dataType)) ...)` wants the default typed. Is that the right reading?
3. `spark/src/test/resources/sql-tests/expressions/string/split_part.sql` —
coverage looks solid. Two nice-to-haves if you feel like adding them. A
column-data row with a multi-character delimiter would exercise the path where
`str::split` advances by more than one byte per match. And a fixture where the
string contains regex metacharacters like `*` or `.` paired with a literal
delimiter would assert at the SQL level what
`test_split_sql_keeps_regex_chars_literal` asserts in Rust.
4. CI hasn't reported any checks on
`support-stringsplitsql-split-part-4561`. A push or workflow re-run would help
confirm the SQL fixtures pass.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]