kou commented on issue #45117:
URL: https://github.com/apache/arrow/issues/45117#issuecomment-2564593313

   In general, you should not convert large data to raw Ruby objects. It's slow, 
as you have seen.
   
   But Red Arrow provides an optimized feature that converts an Arrow array to 
a Ruby array. You can use it via `Arrow::XXXArray#to_a`.
   
   For example, you can get a `Hash` of `Array`s (not an `Array` of `Hash`es) 
with the feature:
   
   ```ruby
   def read_parquet
     table = Arrow::TableLoader.load('data.parquet', { format: :parquet })
     # Prepare one empty Ruby Array per column, keyed by column name.
     data = {}
     table.schema.fields.each do |field|
       data[field.name] = []
     end
     # Convert each column batch by batch with the optimized #to_a.
     table.each_record_batch do |record_batch|
       record_batch.each_column do |column|
         data[column.name].concat(column.data.to_a)
       end
     end
     data
   end
   ```
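   
   If row-shaped `Hash`es are needed only for part of the data, one option is 
to keep the `Hash` of `Array`s and build rows on demand. The following is a 
minimal plain-Ruby sketch, not a Red Arrow API: `rows_of` is a hypothetical 
helper name, and the sample `data` only stands in for the result of 
`read_parquet`. It assumes every column has the same length.
   
   ```ruby
   # Hypothetical helper (not part of Red Arrow): returns an Enumerator that
   # builds one Hash per row on demand from a Hash of Arrays.
   def rows_of(data)
     names = data.keys
     # Assumes all columns have the same length; empty input yields no rows.
     row_count = data.values.first&.size || 0
     Enumerator.new do |yielder|
       row_count.times do |i|
         yielder << names.to_h { |name| [name, data[name][i]] }
       end
     end
   end

   # Sample data standing in for the Hash of Arrays built by read_parquet.
   data = { "id" => [1, 2, 3], "name" => ["a", "b", "c"] }
   first_row = rows_of(data).first
   ```
   
   Because rows are created only as they are consumed, `rows_of(data).first(10)` 
builds ten `Hash`es rather than one per record.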
   
   Do you really need an `Array` of `Hash`es?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
