pitrou commented on issue #40874:
URL: https://github.com/apache/arrow/issues/40874#issuecomment-2025694839
By the way, one can also take a look at the C++ results (with mimalloc):
```
CastInt64ToDoubleUnsafe/524288/1000   148707 ns   148677 ns   4705   items_per_second=3.52635G/s null_percent=0.1 size=524.288k
CastInt64ToDoubleUnsafe/524288/10     148746 ns   148718 ns   4601   items_per_second=3.5254G/s  null_percent=10  size=524.288k
CastInt64ToDoubleUnsafe/524288/2      148716 ns   148681 ns   4709   items_per_second=3.52625G/s null_percent=50  size=524.288k
CastInt64ToDoubleUnsafe/524288/1      149002 ns   148976 ns   4699   items_per_second=3.51927G/s null_percent=100 size=524.288k
CastInt64ToDoubleUnsafe/524288/0      146288 ns   146262 ns   4783   items_per_second=3.58459G/s null_percent=0   size=524.288k
```
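As a quick sanity check on those numbers, `items_per_second` is just the item count divided by the measured time. A back-of-the-envelope sketch (assuming the counter is derived from the CPU-time column, which matches the reported figure here):

```python
# Reproduce the reported throughput for the null_percent=0.1 run.
size = 524288          # items per iteration (size=524.288k)
cpu_time_ns = 148677   # reported CPU time for that run

items_per_second = size / (cpu_time_ns * 1e-9)
print(f"{items_per_second / 1e9:.3f} G items/s")  # → 3.526 G items/s
```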
and compare them with a corresponding Python benchmark:
```python
>>> %timeit arr[:524288].cast(pa.float64(), safe=False, memory_pool=pa.mimalloc_memory_pool())
153 µs ± 1.82 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
```
and with NumPy:
```python
>>> np_arr = arr.to_numpy()
>>> %timeit np_arr[:524288].astype('float64')
153 µs ± 5.48 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
```
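For anyone wanting to reproduce the NumPy side outside of IPython, here is a self-contained sketch using the stdlib `timeit` module instead of `%timeit` (the array contents are arbitrary; absolute timings will of course vary by machine):

```python
import timeit

import numpy as np

# Stand-in for arr.to_numpy(): any int64 array of the benchmarked length.
np_arr = np.arange(524288, dtype=np.int64)

# Time the unsafe-equivalent cast int64 -> float64.
n_loops = 1000
total = timeit.timeit(lambda: np_arr.astype('float64'), number=n_loops)
print(f"{total / n_loops * 1e6:.1f} µs per loop")

# The cast preserves length and values while changing the dtype.
out = np_arr.astype('float64')
```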
All three come out very similar (~150 µs per iteration), despite the different amounts of internal boilerplate involved.