JayjeetAtGithub opened a new issue, #7345:
URL: https://github.com/apache/arrow-datafusion/issues/7345

   ### Describe the bug
   
   On running the query below on the ClickBench multi-file dataset,
   
   ```sql
   SELECT REGEXP_REPLACE("Referer", '^https?://(?:www.)?([^/]+)/.*$', '1') AS k,
          AVG(length("Referer")) AS l, COUNT(*) AS c, MIN("Referer")
   FROM hits
   WHERE "Referer" <> ''
   GROUP BY k
   HAVING COUNT(*) > 100000
   ORDER BY l DESC
   LIMIT 25;
   ```
   
   we get this error:
   
   ```bash
   Internal error: The "regex_replace" function can only accept strings.. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
   ```
   
   ### To Reproduce
   
   Download the data using:
   
   ```bash
    ./benchmarks/bench.sh data clickbench_partitioned
   ```
   
   This creates a `hits_multi` directory containing the Parquet files.
   
   Execute the query above:
   
   ```bash
   datafusion-cli -c "CREATE EXTERNAL TABLE hits STORED AS PARQUET LOCATION 'hits_multi';" "{query}"
   ```
   
   ### Expected behavior
   
   The query should run successfully without errors.
   
   ### Additional context
   
   DataFusion 29.0.0
   
   

