kou commented on issue #36001:
URL: https://github.com/apache/arrow/issues/36001#issuecomment-1583411241
> 1. Is there a way to tell Arrow to use at most 500MB-800MB of RAM?
Unfortunately, Arrow doesn't have an option for that.
> 2. Otherwise, is there a way to read and query the file piece by piece
> without loading it entirely on RAM?
You can read data per row group:
```ruby
require "parquet"

Arrow::FileInputStream.open("data.parquet") do |input|
  reader = Parquet::ArrowFileReader.new(input)
  # Load one row group at a time instead of the whole file
  reader.n_row_groups.times do |i|
    table = reader.read_row_group(i)
    # Process table
  end
end
```
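
If you want to aggregate over the whole file this way, something like the following may work. This is only a sketch: `each_row_group` and `count_rows` are hypothetical helper names, not part of Red Parquet, and it assumes the reader responds to `n_row_groups` and `read_row_group` as above and that each table responds to `n_rows`. Peak memory should then depend on the largest row group in the file rather than on the file size.

```ruby
# Hypothetical helper: yields one row group at a time, so only a single
# row group's table needs to be referenced at once.
# `reader` is any object that responds to `n_row_groups` and
# `read_row_group`, such as the Parquet::ArrowFileReader above.
def each_row_group(reader)
  return enum_for(:each_row_group, reader) unless block_given?

  reader.n_row_groups.times do |i|
    yield reader.read_row_group(i)
  end
end

# Hypothetical aggregation: counts rows across all row groups without
# building the full table. Each table must respond to `n_rows`.
def count_rows(reader)
  each_row_group(reader).sum { |table| table.n_rows }
end
```

Call `count_rows(reader)` inside the `Arrow::FileInputStream.open` block, because the reader needs the input stream to stay open while it reads.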