[GitHub] [arrow] westonpace commented on issue #35268: [C++] OrderBy with spillover

via GitHub Fri, 09 Jun 2023 16:30:41 -0700


westonpace commented on issue #35268:
URL: https://github.com/apache/arrow/issues/35268#issuecomment-1585258861


   Are you trying to read a single row?  Or a whole batch of rows?
   
   If you need random access to individual rows then Parquet is not going to be 
a good fit.  We might want to investigate some kind of row-major format.
   
   If you only need to load specific batches of data then could you create a 
row group for each batch?  Or a separate file for each batch?
   
   If you need random access to batches of data (i.e. you don't know the row 
group boundaries at write time, but you are not accessing individual rows) 
then we could maybe use the row skip feature that was recently added to 
Parquet (I don't think it has been exposed yet).
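
   Until row skipping is exposed, one possible workaround (not the row skip 
feature itself) is to map the requested row range onto the row groups that 
overlap it, read those whole, and slice the result.  The sketch below covers 
only the lookup step; `locate_rows` is a hypothetical helper, and the row 
group sizes are assumed to come from the file metadata.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_rows(row_group_sizes, start, length):
    """Find the row groups covering rows [start, start + length).

    Returns (indices, offset): the row group indices to read, and the
    position of row `start` within the first of those row groups.
    """
    # ends[i] is the total number of rows in row groups 0..i
    ends = list(accumulate(row_group_sizes))
    total = ends[-1] if ends else 0
    if length <= 0 or start < 0 or start + length > total:
        raise ValueError("requested row range is out of bounds")
    # First row group whose cumulative end lies beyond the target row
    first = bisect_right(ends, start)
    last = bisect_right(ends, start + length - 1)
    group_start = ends[first] - row_group_sizes[first]
    return list(range(first, last + 1)), start - group_start
```

   With pyarrow, the sizes could be taken from 
`ParquetFile.metadata.row_group(i).num_rows`, and the returned indices passed 
to `ParquetFile.read_row_groups` before slicing with `Table.slice(offset, 
length)`.  This still decodes every row in the touched row groups, which is 
the cost a real row skip would avoid.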


