On Monday, August 15, 2016 at 11:01:28 AM UTC-7, Lee Smith wrote:
>
> Hi,
>
> I'm using the copy_table method to extract a 1 billion row Postgres table
> to csv in parts. This could be made simpler if copy_table returned an
> Enumerator instead of the entire table contents (if no block is given), as
> I could use the each_slice method.
>
>
> DB.copy_table(:table).each_slice(10000).with_index(1) do |part, part_count|
>   File.open(format('%s_%05d.csv', :table, part_count), 'w') do |csv_part|
>     part.each do |row|
>       csv_part.puts(row)
>     end
>   end
> end
>
> Here is a possible implementation:
> <https://github.com/LS80/sequel/commit/580bde8f54af0c98e7c5163c392fce027c663c30>
>
> Any thoughts on a change like this?
>
> Regards,
> Lee
>

I don't think the implementation is thread-safe, as it uses a connection
after it has been checked back in to the connection pool.

Is there a reason you couldn't just do:

  DB.to_enum(:copy_table, :table).each_slice(10000)...
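
To make the #to_enum suggestion concrete, here is a minimal sketch that runs
without a database. FakeDB and write_parts are hypothetical names used only
for illustration; FakeDB is not Sequel's implementation, and the sketch
assumes the block form of copy_table yields one row at a time.

```ruby
require 'tmpdir'

# Hypothetical stand-in for DB (not Sequel). Its copy_table is assumed to
# yield one CSV row per call to the block.
class FakeDB
  def initialize(rows)
    @rows = rows
  end

  def copy_table(_table)
    @rows.each { |row| yield row }
  end
end

# Hypothetical helper: writes the table out in parts of slice_size rows and
# returns the paths of the files it created.
def write_parts(db, table, slice_size, dir)
  # to_enum wraps the block-taking method, so each_slice can be chained on it.
  parts = db.to_enum(:copy_table, table).each_slice(slice_size).with_index(1)
  parts.map do |part, part_count|
    path = File.join(dir, format('%s_%05d.csv', table, part_count))
    File.open(path, 'w') do |csv_part|
      part.each { |row| csv_part.puts(row) }
    end
    path
  end
end

Dir.mktmpdir do |dir|
  db = FakeDB.new((1..25).map { |i| "#{i},row#{i}" })
  write_parts(db, :table, 10, dir).each { |path| puts File.basename(path) }
end
```

Because the Enumerator drives copy_table with a block, the rows are consumed
while the method is still running, which is the property the block form
relies on.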

Personally, I'm not a fan of automatically returning an Enumerator when a
method that expects a block is called without a block. I know Ruby 1.9+
does that by default in many cases, but I don't think the trade-off (hiding
bugs vs nicer API) is a good one, considering you can always call #to_enum
manually if you want an Enumerator.
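
As a rough illustration of that trade-off, the sketch below compares two
styles on a hypothetical Rows class (not part of Sequel). The method names
each_row and each_row_auto are made up for the example.

```ruby
# Hypothetical class for illustration only.
class Rows
  def initialize(rows)
    @rows = rows
  end

  # Strict style: the method expects a block and yields unconditionally.
  def each_row
    @rows.each { |row| yield row }
  end

  # Auto-Enumerator style: returns an Enumerator when no block is given.
  def each_row_auto
    return to_enum(:each_row_auto) unless block_given?

    @rows.each { |row| yield row }
  end
end

rows = Rows.new(%w[a b c])

# Forgetting the block here goes unnoticed: an Enumerator is returned and
# no row is processed.
rows.each_row_auto

# Forgetting the block here fails at the first yield.
begin
  rows.each_row
rescue LocalJumpError => e
  warn "caught: #{e.class}"
end

# An Enumerator is still available on request with the strict style.
first_two = rows.to_enum(:each_row).first(2)
```

With the strict style a missing block surfaces as an error, while callers
who do want an Enumerator can still ask for one explicitly.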

Thanks,
Jeremy