cyb70289 commented on issue #13787:
URL: https://github.com/apache/arrow/issues/13787#issuecomment-1207606511

   It looks like the two cases are tested against different data sizes?
   [1, 1, 544, **192**] vs. [1, 1, 544, **992**]
   
   Besides, for the first test case, I believe the line below actually 
allocates (touches) all the physical pages up front.
   `buffer = sharedctypes.RawArray(ctypes.c_uint8, capacity + 1)`
   So the benchmarked loop won't cause any page faults.
   
   But in the pyarrow case, the line below only reserves pages without 
actually allocating any of them.
   `mmap = pa.create_memory_map(path, 5000000 * 1000)`
   So the benchmarked code loop will trigger tons of page faults.
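   
The lazy-allocation effect described above can be observed with a minimal sketch like the following. It is an illustration under stated assumptions, not the benchmark from this issue: it uses an anonymous mapping from the standard-library `mmap` module as a stand-in for `pa.create_memory_map`, reads the process's minor page fault counter via `resource.getrusage` (Unix only), and the 64 MiB size and the helper names `minor_faults`, `touch_pages`, and `measure` are hypothetical choices made for the example.

```python
import mmap
import resource

PAGE_SIZE = mmap.PAGESIZE
SIZE = 64 * 1024 * 1024  # 64 MiB; arbitrary size chosen for illustration


def minor_faults():
    """Return the number of minor page faults this process has taken so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch_pages(buf, size, page_size=PAGE_SIZE):
    """Write one byte into every page of buf; return how many pages were touched."""
    touched = 0
    for offset in range(0, size, page_size):
        buf[offset] = 1
        touched += 1
    return touched


def measure(buf, size):
    """Touch every page and report (pages touched, minor faults taken meanwhile)."""
    before = minor_faults()
    touched = touch_pages(buf, size)
    return touched, minor_faults() - before


# Anonymous mapping: address space is reserved, but physical pages are
# typically not allocated until they are first written.
buf = mmap.mmap(-1, SIZE)

# First pass: the pages have not been touched yet, so writes may fault.
first_touched, first_faults = measure(buf, SIZE)

# Second pass: the pages are already resident, so few or no faults are expected.
second_touched, second_faults = measure(buf, SIZE)

print(f"first pass:  {first_touched} pages, {first_faults} minor faults")
print(f"second pass: {second_touched} pages, {second_faults} minor faults")

buf.close()
```

The exact fault counts depend on the operating system, the page size, and whether transparent huge pages are enabled, so they should be read as indicative only. The point is that a timed loop writing into a freshly mapped region also pays for the first-touch faults, while a loop over a buffer that was already written does not.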
   
   I benchmarked the running time of the whole program with the same data 
size and the tempfile under '/dev/shm': pyarrow (0.459s) is faster than 
sharedctypes (0.947s).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
