Dandandan opened a new issue, #22247:
URL: https://github.com/apache/datafusion/issues/22247
### Describe the bug
While comparing simple SQL statements between PostgreSQL and DataFusion, I
found several PostgreSQL compatibility mismatches where DataFusion returns a
different answer or returns a value where PostgreSQL raises a domain error.
I excluded cases that are only unsupported PostgreSQL functions in
DataFusion and minor type/display-format differences.
### Environment
PostgreSQL was built locally from source:
```text
PostgreSQL 19devel
commit aa1f93a3387ad619c14cea2b8ed01e6f49cb6600
```
DataFusion:
```text
datafusion-cli 53.1.0
commit 1ab146ad6cc119c7656ae1def75fd40697e5f94a
```
### Mismatches
#### 1. `^` evaluates as bitwise XOR instead of PostgreSQL exponentiation
```sql
SELECT 2 ^ 3;
```
PostgreSQL:
```text
8
```
DataFusion:
```text
1
```
#### 2. `SIMILAR TO` should treat `%` as a wildcard
```sql
SELECT 'abc' SIMILAR TO 'a%';
```
PostgreSQL:
```text
true
```
DataFusion:
```text
false
```
#### 3. `replace` with an empty search string should be a no-op
```sql
SELECT replace('abc', '', 'x');
```
PostgreSQL:
```text
abc
```
DataFusion:
```text
xaxbxcx
```
#### 4. `array_length` of an empty array dimension should be NULL
```sql
SELECT array_length(array[]::int[], 1);
```
PostgreSQL:
```text
NULL
```
DataFusion:
```text
0
```
#### 5. Negative array subscripts should not index from the end
```sql
SELECT (array[10,20,30])[-1];
```
PostgreSQL:
```text
NULL
```
DataFusion:
```text
30
```
#### 6. `time + interval` should wrap within the 24-hour time domain
```sql
SELECT time '23:30' + interval '2 hours';
```
PostgreSQL:
```text
01:30:00
```
DataFusion:
```text
25 hours 30 mins
```
#### 7. `time - interval` should wrap within the 24-hour time domain
```sql
SELECT time '01:30' - interval '2 hours';
```
PostgreSQL:
```text
23:30:00
```
DataFusion:
```text
-30 mins
```
#### 8. `extract(second ...)` should preserve fractional seconds
```sql
SELECT extract(second from timestamp '2020-01-01 00:00:12.345678');
```
PostgreSQL:
```text
12.345678
```
DataFusion:
```text
12
```
#### 9. `extract(milliseconds ...)` should preserve fractional milliseconds
```sql
SELECT extract(milliseconds from timestamp '2020-01-01 00:00:12.345678');
```
PostgreSQL:
```text
12345.678
```
DataFusion:
```text
12345
```
#### 10. `regexp_count` should count empty-pattern matches
```sql
SELECT regexp_count('abc', '');
```
PostgreSQL:
```text
4
```
DataFusion:
```text
0
```
#### 11. `regexp_instr` with an empty pattern should return 1
```sql
SELECT regexp_instr('abc', '');
```
PostgreSQL:
```text
1
```
DataFusion:
```text
0
```
#### 12. `regexp_like` should honor PostgreSQL multiline flag `m`
```sql
SELECT regexp_like(E'a\nb', '^b', 'm');
```
PostgreSQL:
```text
true
```
DataFusion:
```text
false
```
#### 13. `regexp_replace` should honor PostgreSQL multiline flag `m`
```sql
SELECT regexp_replace(E'a\nb', '^b', 'x', 'm');
```
PostgreSQL:
```text
a
x
```
DataFusion:
```text
a\nb
```
#### 14. `round(float8)` should match PostgreSQL half-tie behavior
```sql
SELECT round(2.5::float8);
```
PostgreSQL:
```text
2
```
DataFusion:
```text
3.0
```
#### 15. `factorial(21)` should not overflow when PostgreSQL returns a
numeric answer
```sql
SELECT factorial(21);
```
PostgreSQL:
```text
51090942171709440000
```
DataFusion:
```text
Overflow happened on FACTORIAL(21)
```
#### 16. `factorial` of a negative value should error
```sql
SELECT factorial(-1);
```
PostgreSQL:
```text
ERROR: factorial of a negative number is undefined
```
DataFusion:
```text
1
```
#### 17. `sqrt(-1.0::float8)` should error, not return NaN
```sql
SELECT sqrt((-1.0)::float8);
```
PostgreSQL:
```text
ERROR: cannot take square root of a negative number
```
DataFusion:
```text
NaN
```
#### 18. `ln(-1.0::float8)` should error, not return NaN
```sql
SELECT ln((-1.0)::float8);
```
PostgreSQL:
```text
ERROR: cannot take logarithm of a negative number
```
DataFusion:
```text
NaN
```
#### 19. `log(0.0::float8)` should error, not return `-inf`
```sql
SELECT log(0.0::float8);
```
PostgreSQL:
```text
ERROR: cannot take logarithm of zero
```
DataFusion:
```text
-inf
```
#### 20. `power(0.0::float8, -1.0::float8)` should error, not return infinity
```sql
SELECT power(0.0::float8, -1.0::float8);
```
PostgreSQL:
```text
ERROR: zero raised to a negative power is undefined
```
DataFusion:
```text
inf
```
### Expected behavior
For PostgreSQL-compatible SQL semantics, DataFusion should either match
PostgreSQL's result or raise the same class of domain/semantic error for these
simple expressions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]