jorisvandenbossche commented on issue #37355:
URL: https://github.com/apache/arrow/issues/37355#issuecomment-1692915164
A simpler reproducer:
```python
import pyarrow as pa
import pyarrow.compute as pc
# create a table with a tz-aware, nanosecond-resolution timestamp column
table = pa.table({'timestamp': pa.array([1], pa.timestamp("ns", "UTC"))})
# comparison with a microsecond scalar works if its value has non-zero microseconds
table.filter(pc.field("timestamp") <= pa.scalar(1, pa.timestamp("us", "UTC")))
# comparison with a microsecond scalar fails if its value has no microseconds
table.filter(pc.field("timestamp") <= pa.scalar(0, pa.timestamp("us", "UTC")))
# ...
# ArrowNotImplementedError: Function 'less_equal' has no kernel
# matching input types (timestamp[ns, tz=UTC], timestamp[s])
# but it works again if the resolution matches
table.filter(pc.field("timestamp") <= pa.scalar(0, pa.timestamp("ns", "UTC")))
```
The filter expression somehow completely loses the type information of the
scalar (both the resolution and the timezone) somewhere inside Acero.
Calling the compute kernel directly, instead of going through an expression
and executing it with Acero, seems to work fine:
```
>>> pc.less_equal(table["timestamp"], pa.scalar(0, pa.timestamp("us", "UTC")))
<pyarrow.lib.ChunkedArray object at 0x7f804bbb0c20>
[
  [
    false
  ]
]
```
---
The above reproducer actually also fails with pyarrow 12.0.0, but it does
seem to be fixed on the main branch. So some other change in 13.0.0 might have
caused the dataset filtering to take the code path from the reproducer above.