mcassels opened a new issue #723:
URL: https://github.com/apache/arrow-datafusion/issues/723
**Describe the bug**
`ABS(col - x)` in the WHERE clause of a query sometimes does not filter results
correctly. The attached test file has a single float column (`c0`) with
high-precision values. We want to use `ABS(col - x) < y` to do approximate
equality comparisons on high-precision float columns.
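For context, the tolerance idiom itself can be sketched outside DataFusion. The snippet below is plain Python, not DataFusion code, and the helper name `approx_eq` is hypothetical. It only illustrates why a strict `ABS(a - b) < tol` check is used in place of `=` for floats.

```python
def approx_eq(a: float, b: float, tol: float) -> bool:
    """Return True when a and b differ by strictly less than tol,
    mirroring the SQL predicate ABS(a - b) < tol."""
    return abs(a - b) < tol


# Classic case where exact equality fails but a tolerance check succeeds.
total = 0.1 + 0.2
print(total == 0.3)                 # exact comparison
print(approx_eq(total, 0.3, 1e-9))  # tolerance comparison
```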
**To Reproduce**
datafusion-cli example using the attached test Parquet file:
```
➜ arrow-datafusion git:(master) ✗ cargo run --bin datafusion-cli
Finished dev [unoptimized + debuginfo] target(s) in 0.22s
Running `target/debug/datafusion-cli`
> CREATE EXTERNAL TABLE foo STORED AS PARQUET LOCATION 'test.parquet';
0 rows in set. Query took 0.001 seconds.
> select * from foo;
+--------------------+
| c0                 |
+--------------------+
| 107.0090813093981  |
| 125.51519138755981 |
| 141.83342587451415 |
| 113.65534481251639 |
| 251.10794896957802 |
| 112.08361695028363 |
+--------------------+
6 rows in set. Query took 0.006 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo;
+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 107.0090813093981  | 144.0988676601799                         |
| 125.51519138755981 | 125.59275758201821                        |
| 141.83342587451415 | 109.27452309506387                        |
| 113.65534481251639 | 137.45260415706161                        |
| 251.10794896957802 | 0                                         |
| 112.08361695028363 | 139.02433201929438                        |
+--------------------+-------------------------------------------+
6 rows in set. Query took 0.006 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 1;
0 rows in set. Query took 0.005 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 111;
0 rows in set. Query took 0.003 seconds.
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 150;
+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 107.0090813093981  | 144.0988676601799                         |
| 125.51519138755981 | 125.59275758201821                        |
| 141.83342587451415 | 109.27452309506387                        |
| 113.65534481251639 | 137.45260415706161                        |
| 251.10794896957802 | 0                                         |
| 112.08361695028363 | 139.02433201929438                        |
+--------------------+-------------------------------------------+
6 rows in set. Query took 0.007 seconds.
```
**Expected behavior**
The query
```
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 1;
```
was expected to return the following row:
```
+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 251.10794896957802 | 0                                         |
+--------------------+-------------------------------------------+
```
And the query
```
> select c0, ABS(c0 - 251.10794896957802) from foo where ABS(c0 - 251.10794896957802) < 111;
```
was expected to return the following 2 rows:
```
+--------------------+-------------------------------------------+
| c0                 | abs(c0 Minus Float64(251.10794896957802)) |
+--------------------+-------------------------------------------+
| 251.10794896957802 | 0                                         |
| 141.83342587451415 | 109.27452309506387                        |
+--------------------+-------------------------------------------+
```
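As a cross-check of the expected row counts, here is a minimal sketch in plain Python (not DataFusion) that applies the same predicate to the six `c0` values printed above. The names `values`, `target`, and `expected_rows` are hypothetical helpers for illustration. It says nothing about why DataFusion returns 0 rows; it only shows what the predicate selects when evaluated row by row. Rows come back in input order, which may differ from the order shown in the tables above.

```python
# The six c0 values as printed by `select * from foo` above.
values = [
    107.0090813093981,
    125.51519138755981,
    141.83342587451415,
    113.65534481251639,
    251.10794896957802,
    112.08361695028363,
]
target = 251.10794896957802


def expected_rows(threshold: float) -> list[tuple[float, float]]:
    """Return (c0, abs(c0 - target)) for rows where abs(c0 - target) < threshold."""
    return [(v, abs(v - target)) for v in values if abs(v - target) < threshold]


for threshold in (1, 111, 150):
    rows = expected_rows(threshold)
    print(f"threshold {threshold}: {len(rows)} row(s)")
    for c0, diff in rows:
        print(f"  {c0!r}  {diff!r}")
```

With these inputs the predicate selects 1 row for `< 1`, 2 rows for `< 111`, and all 6 rows for `< 150`, matching the expectations stated above.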