mbignotti opened a new issue, #37008: URL: https://github.com/apache/arrow/issues/37008
### Describe the usage question you have. Please include as many useful details as possible.

Hi! I'm relatively new to Arrow, so I apologize if the question is not 100% clear. I've posted [here](https://stackoverflow.com/questions/76785928/fifo-queue-with-polars-dataframe) what I'm trying to achieve, but I feel this is probably easier to solve with PyArrow tools.

In essence, I'm looking for the most efficient way to implement a FIFO queue with Arrow data, where the size of the queue is defined in terms of the number of rows in a table. Here is a trivial example with Tables:

```python
import pyarrow as pa
import pyarrow.compute as pc

# Define the maximum number of rows in the buffer
buffer_size = 4

# Initialize the buffer with some data
buffer = pa.table([pa.array([1, 2, 3, 4]), pa.array([11, 12, 13, 14])], names=['a', 'b'])

# A new table arrives; it needs to be appended to the buffer
new_table = pa.table([pa.array([5, 6]), pa.array([15, 16])], names=['a', 'b'])

# Append the new data to the buffer
buffer = pa.concat_tables([buffer, new_table])

# Trim the buffer so that it respects the maximum number of rows allowed
buffer_len = len(buffer)
buffer = pc.take(buffer, pa.array([buffer_len - i - 1 for i in reversed(range(buffer_size))]))
# Not sure there's an easier way to select the last n rows

# Continue...
```

Of course, incoming data must respect the schema. However, I'm pretty sure the approach shown above is not the most efficient one. Can we use Arrow buffers to achieve the same result? Or maybe something else? Thanks a lot!

### Component(s)

Python
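For the "select the last n rows" step, one possibly cheaper alternative to `pc.take` is `Table.slice`, which is zero-copy: it adjusts offsets into the existing chunks rather than gathering rows into a new table. A minimal sketch of the FIFO update step using `slice` (same toy data as above):

```python
import pyarrow as pa

buffer_size = 4
buffer = pa.table([pa.array([1, 2, 3, 4]), pa.array([11, 12, 13, 14])], names=['a', 'b'])
new_table = pa.table([pa.array([5, 6]), pa.array([15, 16])], names=['a', 'b'])

# Append, then keep only the last `buffer_size` rows.
# Table.slice is zero-copy, so no row data is materialized here.
buffer = pa.concat_tables([buffer, new_table])
if len(buffer) > buffer_size:
    buffer = buffer.slice(len(buffer) - buffer_size)

assert buffer.to_pydict() == {'a': [3, 4, 5, 6], 'b': [13, 14, 15, 16]}
```

Note that repeated `concat_tables` calls accumulate many small chunks over time, so periodically calling `buffer.combine_chunks()` may be worthwhile to keep downstream operations fast.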
