Dandandan opened a new issue, #22247:
URL: https://github.com/apache/datafusion/issues/22247

   ### Describe the bug
   
   While comparing simple SQL statements between PostgreSQL and DataFusion, I 
found several PostgreSQL compatibility mismatches where DataFusion returns a 
different answer or returns a value where PostgreSQL raises a domain error.
   
   I excluded cases that are only unsupported PostgreSQL functions in 
DataFusion and minor type/display-format differences.
   
   ### Environment
   
   PostgreSQL was built locally from source:
   
   ```text
   PostgreSQL 19devel
   commit aa1f93a3387ad619c14cea2b8ed01e6f49cb6600
   ```
   
   DataFusion:
   
   ```text
   datafusion-cli 53.1.0
   commit 1ab146ad6cc119c7656ae1def75fd40697e5f94a
   ```
   
   ### Mismatches
   
   #### 1. `^` evaluates as bitwise XOR instead of PostgreSQL exponentiation
   
   ```sql
   SELECT 2 ^ 3;
   ```
   
   PostgreSQL:
   
   ```text
   8
   ```
   
   DataFusion:
   
   ```text
   1
   ```
   
   #### 2. `SIMILAR TO` should treat `%` as a wildcard
   
   ```sql
   SELECT 'abc' SIMILAR TO 'a%';
   ```
   
   PostgreSQL:
   
   ```text
   true
   ```
   
   DataFusion:
   
   ```text
   false
   ```
   
   #### 3. `replace` with an empty search string should be a no-op
   
   ```sql
   SELECT replace('abc', '', 'x');
   ```
   
   PostgreSQL:
   
   ```text
   abc
   ```
   
   DataFusion:
   
   ```text
   xaxbxcx
   ```
   
   #### 4. `array_length` of an empty array dimension should be NULL
   
   ```sql
   SELECT array_length(array[]::int[], 1);
   ```
   
   PostgreSQL:
   
   ```text
   NULL
   ```
   
   DataFusion:
   
   ```text
   0
   ```
   
   #### 5. Negative array subscripts should not index from the end
   
   ```sql
   SELECT (array[10,20,30])[-1];
   ```
   
   PostgreSQL:
   
   ```text
   NULL
   ```
   
   DataFusion:
   
   ```text
   30
   ```
   
   #### 6. `time + interval` should wrap within the 24-hour time domain
   
   ```sql
   SELECT time '23:30' + interval '2 hours';
   ```
   
   PostgreSQL:
   
   ```text
   01:30:00
   ```
   
   DataFusion:
   
   ```text
   25 hours 30 mins
   ```
   
   #### 7. `time - interval` should wrap within the 24-hour time domain
   
   ```sql
   SELECT time '01:30' - interval '2 hours';
   ```
   
   PostgreSQL:
   
   ```text
   23:30:00
   ```
   
   DataFusion:
   
   ```text
   -30 mins
   ```
   
   #### 8. `extract(second ...)` should preserve fractional seconds
   
   ```sql
   SELECT extract(second from timestamp '2020-01-01 00:00:12.345678');
   ```
   
   PostgreSQL:
   
   ```text
   12.345678
   ```
   
   DataFusion:
   
   ```text
   12
   ```
   
   #### 9. `extract(milliseconds ...)` should preserve fractional milliseconds
   
   ```sql
   SELECT extract(milliseconds from timestamp '2020-01-01 00:00:12.345678');
   ```
   
   PostgreSQL:
   
   ```text
   12345.678
   ```
   
   DataFusion:
   
   ```text
   12345
   ```
   
   #### 10. `regexp_count` should count empty-pattern matches
   
   ```sql
   SELECT regexp_count('abc', '');
   ```
   
   PostgreSQL:
   
   ```text
   4
   ```
   
   DataFusion:
   
   ```text
   0
   ```
   
   #### 11. `regexp_instr` with an empty pattern should return 1
   
   ```sql
   SELECT regexp_instr('abc', '');
   ```
   
   PostgreSQL:
   
   ```text
   1
   ```
   
   DataFusion:
   
   ```text
   0
   ```
   
   #### 12. `regexp_like` should honor PostgreSQL multiline flag `m`
   
   ```sql
   SELECT regexp_like(E'a\nb', '^b', 'm');
   ```
   
   PostgreSQL:
   
   ```text
   true
   ```
   
   DataFusion:
   
   ```text
   false
   ```
   
   #### 13. `regexp_replace` should honor PostgreSQL multiline flag `m`
   
   ```sql
   SELECT regexp_replace(E'a\nb', '^b', 'x', 'm');
   ```
   
   PostgreSQL:
   
   ```text
   a
   x
   ```
   
   DataFusion:
   
   ```text
   a\nb
   ```
   
   #### 14. `round(float8)` should match PostgreSQL half-tie behavior
   
   ```sql
   SELECT round(2.5::float8);
   ```
   
   PostgreSQL:
   
   ```text
   2
   ```
   
   DataFusion:
   
   ```text
   3.0
   ```
   
   #### 15. `factorial(21)` should not overflow when PostgreSQL returns a 
numeric answer
   
   ```sql
   SELECT factorial(21);
   ```
   
   PostgreSQL:
   
   ```text
   51090942171709440000
   ```
   
   DataFusion:
   
   ```text
   Overflow happened on FACTORIAL(21)
   ```
   
   #### 16. `factorial` of a negative value should error
   
   ```sql
   SELECT factorial(-1);
   ```
   
   PostgreSQL:
   
   ```text
   ERROR: factorial of a negative number is undefined
   ```
   
   DataFusion:
   
   ```text
   1
   ```
   
   #### 17. `sqrt(-1.0::float8)` should error, not return NaN
   
   ```sql
   SELECT sqrt((-1.0)::float8);
   ```
   
   PostgreSQL:
   
   ```text
   ERROR: cannot take square root of a negative number
   ```
   
   DataFusion:
   
   ```text
   NaN
   ```
   
   #### 18. `ln(-1.0::float8)` should error, not return NaN
   
   ```sql
   SELECT ln((-1.0)::float8);
   ```
   
   PostgreSQL:
   
   ```text
   ERROR: cannot take logarithm of a negative number
   ```
   
   DataFusion:
   
   ```text
   NaN
   ```
   
   #### 19. `log(0.0::float8)` should error, not return `-inf`
   
   ```sql
   SELECT log(0.0::float8);
   ```
   
   PostgreSQL:
   
   ```text
   ERROR: cannot take logarithm of zero
   ```
   
   DataFusion:
   
   ```text
   -inf
   ```
   
   #### 20. `power(0.0::float8, -1.0::float8)` should error, not return infinity
   
   ```sql
   SELECT power(0.0::float8, -1.0::float8);
   ```
   
   PostgreSQL:
   
   ```text
   ERROR: zero raised to a negative power is undefined
   ```
   
   DataFusion:
   
   ```text
   inf
   ```
   
   ### Expected behavior
   
   For PostgreSQL-compatible SQL semantics, DataFusion should either match 
PostgreSQL's result or raise the same class of domain/semantic error for these 
simple expressions.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to