westonpace commented on issue #35268: URL: https://github.com/apache/arrow/issues/35268#issuecomment-1585258861
Are you trying to read a single row, or a whole batch of rows?

If you need random access to individual rows, then Parquet is not going to be a good fit. We might want to investigate some kind of row-major format.

If you only need to load specific batches of data, could you create a row group for each batch? Or a separate file for each batch?

If you need random access to batches of data (e.g. you don't know the row group boundaries at write time, but you aren't doing random access to individual rows), then we could maybe use the row skip feature that was recently added to Parquet (I don't think it has been exposed yet).
