JayjeetAtGithub opened a new issue, #7345:
URL: https://github.com/apache/arrow-datafusion/issues/7345
### Describe the bug
On running the query below on the ClickBench multi-file dataset,
```sql
SELECT REGEXP_REPLACE("Referer", '^https?://(?:www.)?([^/]+)/.*$', '1') AS k,
       AVG(length("Referer")) AS l,
       COUNT(*) AS c,
       MIN("Referer")
FROM hits
WHERE "Referer" <> ''
GROUP BY k
HAVING COUNT(*) > 100000
ORDER BY l DESC
LIMIT 25;
```
we get this error:
```text
Internal error: The "regex_replace" function can only accept strings.. This
was likely caused by a bug in DataFusion's code and we would welcome that you
file an bug report in our issue tracker
```
### To Reproduce
Download the data using:
```bash
./benchmarks/bench.sh data clickbench_partitioned
```
A `hits_multi` directory with the parquet files will be created.
Execute the above query:
```bash
datafusion-cli -c "CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION 'hits_multi';" "{query}"
```
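The `{query}` placeholder stands for the SQL statement from the bug description. A minimal sketch of the full invocation, assuming `datafusion-cli` is on the `PATH` and the `hits_multi` directory is in the current working directory; the names `CREATE_TABLE` and `QUERY` are illustrative and not part of the original report:

```shell
#!/usr/bin/env bash
# Statement that registers the parquet directory as the external table "hits".
CREATE_TABLE="CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION 'hits_multi';"

# The failing query, copied from the bug description. The quoted heredoc
# delimiter keeps the shell from expanding anything inside the SQL text.
QUERY=$(cat <<'SQL'
SELECT REGEXP_REPLACE("Referer", '^https?://(?:www.)?([^/]+)/.*$', '1') AS k,
       AVG(length("Referer")) AS l,
       COUNT(*) AS c,
       MIN("Referer")
FROM hits
WHERE "Referer" <> ''
GROUP BY k
HAVING COUNT(*) > 100000
ORDER BY l DESC
LIMIT 25;
SQL
)

# Run both statements only when the CLI and the data directory are present
# (both are assumptions about the local setup).
if command -v datafusion-cli >/dev/null 2>&1 && [ -d hits_multi ]; then
  datafusion-cli -c "$CREATE_TABLE" "$QUERY"
else
  echo "datafusion-cli or the hits_multi directory was not found; nothing was run" >&2
fi
```

Keeping the SQL in a quoted heredoc avoids having to escape the double and single quotes that the query itself contains.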
### Expected behavior
The query should run successfully without raising an error.
### Additional context
DataFusion 29.0.0