pitrou commented on issue #40874:
URL: https://github.com/apache/arrow/issues/40874#issuecomment-2025694839
By the way, one can also take a look at the C++ results (with mimalloc):
```
CastInt64ToDoubleUnsafe/524288/1000   148707 ns   148677 ns   4705   items_per_second=3.52635G/s null_percent=0.1 size=524.288k
CastInt64ToDoubleUnsafe/524288/10     148746 ns   148718 ns   4601   items_per_second=3.5254G/s  null_percent=10  size=524.288k
CastInt64ToDoubleUnsafe/524288/2      148716 ns   148681 ns   4709   items_per_second=3.52625G/s null_percent=50  size=524.288k
CastInt64ToDoubleUnsafe/524288/1      149002 ns   148976 ns   4699   items_per_second=3.51927G/s null_percent=100 size=524.288k
CastInt64ToDoubleUnsafe/524288/0      146288 ns   146262 ns   4783   items_per_second=3.58459G/s null_percent=0   size=524.288k
```
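As a quick sanity check on those numbers, `items_per_second` is just the item count divided by the measured time. A back-of-the-envelope sketch (assuming the counter is derived from the CPU-time column, which matches the reported figure here):

```python
# Reproduce the reported throughput for the null_percent=0.1 run.
size = 524288          # items per iteration (size=524.288k)
cpu_time_ns = 148677   # reported CPU time for that run

items_per_second = size / (cpu_time_ns * 1e-9)
print(f"{items_per_second / 1e9:.3f} G items/s")  # → 3.526 G items/s
```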
and compare them with a corresponding Python benchmark:
```python
>>> %timeit arr[:524288].cast(pa.float64(), safe=False, memory_pool=pa.mimalloc_memory_pool())
153 µs ± 1.82 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
```
and with NumPy:
```python
>>> np_arr = arr.to_numpy()
>>> %timeit np_arr[:524288].astype('float64')
153 µs ± 5.48 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
```
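For anyone wanting to reproduce the NumPy side outside of IPython, here is a self-contained sketch using the stdlib `timeit` module instead of `%timeit` (the array contents are arbitrary; absolute timings will of course vary by machine):

```python
import timeit

import numpy as np

# Stand-in for arr.to_numpy(): any int64 array of the benchmarked length.
np_arr = np.arange(524288, dtype=np.int64)

# Time the unsafe-equivalent cast int64 -> float64.
n_loops = 1000
total = timeit.timeit(lambda: np_arr.astype('float64'), number=n_loops)
print(f"{total / n_loops * 1e6:.1f} µs per loop")

# The cast preserves length and values while changing the dtype.
out = np_arr.astype('float64')
```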
All three come out very similar (~150 µs per iteration), despite the different amounts of internal boilerplate involved.