marsupialtail commented on PR #14078:
URL: https://github.com/apache/arrow/pull/14078#issuecomment-1241119871

   To recap the discussion: we want to add slicing to the FileFragment class so 
that we can open a Reader over just a partial byte range. This PR implements a 
Slice method on FileFragment that returns a new FileFragment restricted to a 
specified byte range.
   
   Currently this assumes that the supplied byte ranges respect line breaks for 
the CSV file format. If a byte range starts or ends in the middle of a line, 
an error will be thrown when the Reader parses the first or last block. As a 
result, this PR doesn't support Parquet; for Parquet you should use the subset 
API that is already available to get this functionality.
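   
   To illustrate the constraint, here is a minimal Python sketch of how a 
caller might pick byte ranges that respect line breaks before slicing. It does 
not use the Arrow API; `line_aligned_ranges` and `target_size` are hypothetical 
names, and the sketch assumes `\n` line endings and no newlines inside quoted 
CSV fields.

```python
def line_aligned_ranges(data: bytes, target_size: int) -> list[tuple[int, int]]:
    """Split `data` into (start, end) byte ranges that each end right after a
    newline (or at the end of the data). Hypothetical helper, not Arrow API."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    ranges = []
    start = 0
    n = len(data)
    while start < n:
        # Tentative end of this range, before aligning to a line break.
        tentative = min(start + target_size, n)
        if tentative >= n:
            end = n
        else:
            # Extend to just past the next newline at or after the last byte
            # of the tentative range, so no line is cut in half.
            newline = data.find(b"\n", tentative - 1)
            end = n if newline == -1 else newline + 1
        ranges.append((start, end))
        start = end
    return ranges


csv_bytes = b"a,b\n1,2\n3,4\n5,6\n"
print(line_aligned_ranges(csv_bytes, 6))  # [(0, 8), (8, 16)]
```

   Each (start, end) pair covers whole lines only, which is the shape of byte 
range that the Slice method described above expects for CSV.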
   
   In the future, we want to incorporate @zhztheplayer's work to allow slicing 
in the middle of a row group for Parquet, and possibly slicing in the middle of 
a line for CSV.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
