kou commented on issue #45117:
URL: https://github.com/apache/arrow/issues/45117#issuecomment-2564593313
In general, you should not convert large data to raw Ruby objects. It's slow,
as you have seen.
However, Red Arrow provides an optimized way to convert an Arrow array to
a Ruby array: `Arrow::XXXArray#to_a`.
For example, you can get a `Hash` of `Array`s (not an `Array` of `Hash`es)
with the feature:
```ruby
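require "arrow"
require "parquet" # Red Parquet, for format: :parquet

def read_parquet
  table = Arrow::TableLoader.load('data.parquet', { format: :parquet })
  # Prepare one empty Ruby Array per column.
  data = {}
  table.schema.fields.each do |field|
    data[field.name] = []
  end
  # Convert each column of each record batch in bulk with #to_a.
  table.each_record_batch do |record_batch|
    record_batch.each_column do |column|
      data[column.name].concat(column.data.to_a)
    end
  end
  data
end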
```
Do you really need an `Array` of `Hash`es?
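
If an `Array` of `Hash`es does turn out to be necessary, one option is to build it from the column-oriented `Hash` afterwards, in plain Ruby. The following is only a minimal sketch: `rows_from_columns` is a hypothetical helper (not part of Red Arrow), the sample data is made up, and it assumes every column has the same length. Note that it creates one `Hash` per row, so it brings back much of the per-object cost for large data.

```ruby
# Hypothetical helper, not part of Red Arrow: converts a Hash of Arrays
# (column name => values) into an Array of Hashes (one Hash per row).
# Assumes all columns have the same number of values.
def rows_from_columns(columns)
  names = columns.keys
  n_rows = columns.values.first&.size || 0
  Array.new(n_rows) do |i|
    # Build the i-th row by picking the i-th value from every column.
    names.to_h { |name| [name, columns[name][i]] }
  end
end

# Made-up sample standing in for the Hash returned by read_parquet.
columns = { "id" => [1, 2, 3], "name" => ["a", "b", nil] }
rows = rows_from_columns(columns)
```

With the sample above, `rows.first` is `{ "id" => 1, "name" => "a" }`.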