On Wednesday, February 20, 2019 at 9:35:59 AM UTC-8, 
[email protected] wrote:
>
> Hi
>
> I'm using Sequel with PostgreSQL.
>
> I want to index each row of my table into an elasticsearch index.
> I want to execute a bulk index on elasticsearch to minimize the number of 
> requests.
>
>
> *How can I get batches of rows from PostgreSQL to perform bulk index?*
>
> Dataset#paged_each executes the block once per row, passing a hash 
> (representing the row) as the argument.
> Dataset#fetch_rows has the same behavior as Dataset#paged_each.
>
> *Is there a method which executes a block on each batch of rows?*
>
> In sequel-5.16.0/lib/sequel/adapters/postgres.rb, Dataset#fetch_rows would 
> behave the way I need if Dataset#yield_hash_rows were implemented as below:
>       def yield_hash_rows(res, cols)
>         ntuples = res.ntuples
>         recnum = 0
>         rows = []
>         # Convert every tuple in the result set into a hash keyed by column
>         while recnum < ntuples
>           fieldnum = 0
>           nfields = cols.length
>           converted_rec = {}
>           while fieldnum < nfields
>             type_proc, fieldsym = cols[fieldnum]
>             value = res.getvalue(recnum, fieldnum)
>             converted_rec[fieldsym] = (value && type_proc) ? type_proc.call(value) : value
>             fieldnum += 1
>           end
>           recnum += 1
>           rows << converted_rec
>         end
>         # Yield all converted rows as a single array instead of one at a time
>         yield rows
>       end
>
>
> Thank you for your help.
>

Assuming you want an array of hashes yielded to the block instead of each 
hash, each_slice should work:

ds.paged_each.each_slice(100) do |array_of_hashes|
  # array_of_hashes is an array of up to 100 row hashes
  # ...
end
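
For the Elasticsearch side, a minimal sketch of feeding each slice into a bulk 
request might look like this, assuming the elasticsearch-ruby client; the index 
name, id column, and client setup are placeholders, not taken from your code:

require 'sequel'
require 'elasticsearch'

es = Elasticsearch::Client.new  # assumed client pointing at your cluster

ds.paged_each.each_slice(100) do |array_of_hashes|
  # Build one bulk action per row; the data: key carries the document source
  actions = array_of_hashes.map do |row|
    { index: { _index: 'my_index', _id: row[:id], data: row } }
  end
  es.bulk(body: actions)
end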

Thanks,
Jeremy
