alamb commented on issue #22142:
URL: https://github.com/apache/datafusion/issues/22142#issuecomment-4722752314
Ok, I have found this whole issue quite confusing as it is not clear to me
what the scope of the issue is from this ticket as it describes a code problem,
rather than the wrong results that can result. Specifically it is not clear to
me from this issue, nor the various PRs that have been proposed, if this is
something specific to timestamps or if it is a more general problem
```sql
-- Two DISTINCT nanosecond timestamps that both truncate to the SAME
millisecond.
CREATE TABLE t AS
SELECT arrow_cast(1704067200001000000, 'Timestamp(Nanosecond, None)') AS
ts_ns
UNION ALL
SELECT arrow_cast(1704067200001234567, 'Timestamp(Nanosecond, None)');
-- WRONG: should return 2 rows, returns 1.
SELECT * FROM t
WHERE arrow_cast(ts_ns, 'Timestamp(Millisecond, None)') =
arrow_cast(1704067200001, 'Timestamp(Millisecond, None)');
Elapsed 0.002 seconds.
+-------------------------+
| ts_ns |
+-------------------------+
| 2024-01-01T00:00:00.001 |
+-------------------------+
1 row(s) fetched.
Elapsed 0.000 seconds.
```
You can see this is incorrect as you both rows have the truncated values
```sql
SELECT arrow_cast(ts_ns, 'Timestamp(Millisecond, None)') from t;
+----------------------------------------------------------+
| arrow_cast(t.ts_ns,Utf8("Timestamp(Millisecond, None)")) |
+----------------------------------------------------------+
| 2024-01-01T00:00:00.001 |
| 2024-01-01T00:00:00.001 |
+----------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.000 seconds.
```
The same thing happens with decimals:
```sql
CREATE TABLE t(d DECIMAL(20,4)) AS VALUES (1.2345), (1.2399), (1.2500);
-- 1.2345 -> 1.23 at scale 2, so this MUST return 1 row:
SELECT * FROM t
WHERE arrow_cast(d, 'Decimal128(20, 2)') = arrow_cast(1.23, 'Decimal128(20,
2)');
-- predicate rewritten to `d = 1.2300` (exact) -> returns 0 rows. WRONG.
+---+
| d |
+---+
+---+
0 row(s) fetched.
Elapsed 0.001 seconds.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]