marsupialtail commented on code in PR #13830:
URL: https://github.com/apache/arrow/pull/13830#discussion_r944829495


##########
cpp/src/arrow/dataset/file_base.cc:
##########
@@ -89,6 +89,28 @@ Result<std::shared_ptr<io::InputStream>> FileSource::OpenCompressed(
   return io::CompressedInputStream::Make(codec.get(), std::move(file));
 }
 
+Result<std::shared_ptr<io::InputStream>> FileSource::OpenRange(int64_t start,
+                                                               int64_t end) const {

Review Comment:
   I understand your use case to be this: you have one Parquet file, you specify a byte range, and you want to read all the row groups that fit in that byte range.
   
   With this API, I don't think you need to split a single ParquetFileFragment into several, nor do you need a new C++ function. We can just incorporate your changes into file_parquet.cc and have ParquetFileFragment interpret the start and end bytes the way you specified, instead of insisting that they align with row group boundaries.
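
   To make the suggestion concrete, here is a rough sketch of how a byte range could be mapped to row groups. It is not Arrow code: `RowGroupExtent` and `SelectRowGroupsInRange` are hypothetical names, and it assumes each row group's starting offset and total byte size are already known from the Parquet footer. It also reads "fit" as "lies entirely inside the half-open range [start, end)", which is only one possible interpretation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of where one row group lives in the file.
// In practice these values would come from the Parquet file metadata.
struct RowGroupExtent {
  int64_t offset;  // byte offset of the row group's first byte
  int64_t length;  // total size of the row group in bytes
};

// Returns the indices of the row groups that lie entirely inside the
// half-open byte range [start, end). The range does not have to line up
// with row group boundaries; partially covered row groups are skipped.
std::vector<int> SelectRowGroupsInRange(const std::vector<RowGroupExtent>& groups,
                                        int64_t start, int64_t end) {
  std::vector<int> selected;
  for (int i = 0; i < static_cast<int>(groups.size()); ++i) {
    const int64_t first = groups[i].offset;
    const int64_t past_last = first + groups[i].length;  // one past the last byte
    if (first >= start && past_last <= end) {
      selected.push_back(i);
    }
  }
  return selected;
}
```

   The fragment could then scan only the returned row group indices, so the caller never has to compute exact row group boundaries.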



