Blizzara opened a new issue, #11410:
URL: https://github.com/apache/datafusion/issues/11410

   ### Describe the bug
   
   `regexp_replace` returns the wrong number of rows if either the `pattern` 
or `replacement` argument is a scalar NULL.  
   
   I think this is due to 
https://github.com/apache/datafusion/blob/7a23ea9bce32dc8ae195caa8ca052673031c06c9/datafusion/functions/src/regex/regexpreplace.rs#L316
 `fetch_string_arg` not passing the correct length to the 
`_regexp_replace_early_abort` function. When the argument in question is a 
scalar, its "array len" is just 1, so the abort function creates a length-1 
array as the result instead of one entry per input row.
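
The suspected mechanism can be sketched in plain Rust without any DataFusion or Arrow dependency. Everything in this sketch is hypothetical and only models the idea: the `Arg` enum stands in for a scalar-or-array argument, `early_abort` stands in for the all-NULL early return, and `null_result_fixed` shows one possible way to size the result (taking the largest argument length), which is an assumption and not necessarily how the real code should be fixed.

```rust
/// Hypothetical stand-in for a function argument: either a single scalar
/// value or an array holding one value per row.
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Scalar(Option<String>),
    Array(Vec<Option<String>>),
}

impl Arg {
    /// Length of the argument when viewed as an array; a scalar counts as 1.
    fn array_len(&self) -> usize {
        match self {
            Arg::Scalar(_) => 1,
            Arg::Array(values) => values.len(),
        }
    }
}

/// Hypothetical early return: an all-NULL result with `num_rows` entries.
fn early_abort(num_rows: usize) -> Vec<Option<String>> {
    vec![None; num_rows]
}

/// Suspected buggy shape: the result is sized from the NULL scalar itself,
/// so it always has exactly one row.
fn null_result_buggy(null_arg: &Arg) -> Vec<Option<String>> {
    early_abort(null_arg.array_len())
}

/// One possible fix (an assumption): size the result from the longest
/// argument, so an array input determines the row count.
fn null_result_fixed(args: &[Arg]) -> Vec<Option<String>> {
    let num_rows = args.iter().map(Arg::array_len).max().unwrap_or(1);
    early_abort(num_rows)
}
```

With a two-row array for `col` and a scalar NULL `pattern`, `null_result_buggy` yields one row while `null_result_fixed` yields two, which mirrors the "Expected: 2, Got: 1" mismatch reported below.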
   
   ### To Reproduce
   
   Normal case - values are valid scalars or array nulls:
   ```
   > select regexp_replace(col, 'a', 'c') from (values ('a'), ('b')) as tbl(col);
   +---------------------------------------------+
   | regexp_replace(tbl.col,Utf8("a"),Utf8("c")) |
   +---------------------------------------------+
   | c                                           |
   | b                                           |
   +---------------------------------------------+
   2 row(s) fetched. 
   Elapsed 0.001 seconds.
   
   > select regexp_replace(col, ncol, 'c') from (values ('a', NULL), ('b', NULL)) as tbl(col, ncol);
   +--------------------------------------------+
   | regexp_replace(tbl.col,tbl.ncol,Utf8("c")) |
   +--------------------------------------------+
   |                                            |
   |                                            |
   +--------------------------------------------+
   ```
   
   Failing case - pattern or replacement is a scalar NULL:
   ```
   > select regexp_replace(col, NULL, 'c') from (values ('a'), ('b')) as tbl(col);
   Internal error: UDF returned a different number of rows than expected. Expected: 2, Got: 1.
   This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
   
   > select regexp_replace(col, 'a', NULL) from (values ('a'), ('b')) as tbl(col);
   Internal error: UDF returned a different number of rows than expected. Expected: 2, Got: 1.
   This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
   ```
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

