kdkavanagh opened a new issue, #41206:
URL: https://github.com/apache/arrow/issues/41206

   ### Describe the enhancement requested
   
   We have some real-time stream-processing systems using Arrow that are 
primarily write-only within the process: they accumulate large record batches 
in memory before flushing them to disk.  Because we never read the data back in 
the process (we only re-access particular array indexes when flushing the 
batch), I'm curious whether it would make sense to support `_mm_stream_*` 
(non-temporal) writes in the array builder API, so that writes bypass the 
caches and go straight to main memory, avoiding needless cache thrashing.
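
   To illustrate the kind of write path I have in mind, here is a minimal 
sketch, not a proposal for the actual API. `StreamCopyInt64` is a hypothetical 
helper that does not exist in Arrow; it assumes an x86 target with SSE2 and a 
16-byte-aligned destination, and falls back to ordinary stores elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>  // _mm_loadu_si128, _mm_stream_si128, _mm_sfence
#endif

// Hypothetical helper (not part of Arrow): copies `n` int64 values from `src`
// into `dst`, using non-temporal (cache-bypassing) stores where SSE2 is
// available. `dst` must be 16-byte aligned; `src` may be unaligned.
inline void StreamCopyInt64(std::int64_t* dst, const std::int64_t* src,
                            std::size_t n) {
  assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
  std::size_t i = 0;
#if defined(__SSE2__)
  // Two int64 values per 128-bit non-temporal store.
  for (; i + 2 <= n; i += 2) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
    _mm_stream_si128(reinterpret_cast<__m128i*>(dst + i), v);
  }
#endif
  // Remaining element(s), or the whole range without SSE2: ordinary stores.
  for (; i < n; ++i) {
    dst[i] = src[i];
  }
#if defined(__SSE2__)
  // Non-temporal stores are weakly ordered; fence before other threads (or a
  // later flush) rely on the data being visible.
  _mm_sfence();
#endif
}
```

A builder using something like this would need to keep its write cursor 
16-byte aligned and issue the fence before handing the buffer to a reader. 
Whether it actually helps would have to be benchmarked.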
   
   I'd imagine this would be a pretty common use case for systems that 
generate Arrow datasets for downstream consumption.
   
   I don't believe this is currently supported, as the only reference to 
`_mm_stream_` I can find is at 
https://github.com/apache/arrow/blob/ec2d7cbfb426854ccb57be1d2abd9e2f88b268b9/cpp/src/arrow/io/memory_benchmark.cc#L82
   
   ### Component(s)
   
   C++

