Hello!

I am using Sequel for a plethora of ETL jobs at work. We've had a lot of 
success with PostgreSQL, MySQL, and MariaDB databases so far. We're slowly 
rolling all of our existing data engineering infrastructure into this 
project and have recently moved on to our SQL Server databases. That was 
going well too, until today, when we hit a snag.

We're using Sequel, TinyTDS, and Datasets for this work. We pipe 
Dataset#paged_each into Dataset#multi_insert, passing blocks between a read 
method on our Extract handler and a write method on our Load handler, and 
batching the rows we get back from Dataset#paged_each with 
Array#each_slice(1000). This has served us _very_ well in every other 
aspect of this application. Our ETLs are fast and have a ludicrously small 
memory footprint for the amount of data we are moving around (the 
background job processor stays steady at ~150MB despite moving gigabytes 
of data on a daily basis).
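To make the shape of the pipeline concrete, here is a trimmed-down sketch 
of the batching logic. The handler and method names are simplified 
stand-ins (not our actual classes), and the Sequel calls are stubbed with 
plain Ruby so only the streaming/batching structure is shown:

```ruby
BATCH_SIZE = 1000

# Stand-in for the Extract handler's read method, which in the real job
# wraps source_dataset.paged_each and yields one row hash at a time.
def read(rows, &block)
  rows.each(&block)
end

# Stand-in for the Load handler's write method, which in the real job
# wraps destination_dataset.multi_insert(batch). Returns the batch size
# here so the result is easy to inspect.
def write(batch)
  batch.size
end

def run_etl(rows)
  # Wrap the row stream in an Enumerator so each_slice can group the
  # streamed rows into fixed-size batches, keeping memory bounded
  # regardless of how many rows the source yields.
  stream = Enumerator.new { |y| read(rows) { |row| y << row } }
  stream.each_slice(BATCH_SIZE).map { |batch| write(batch) }
end
```

With real datasets this boils down to something like 
`source_ds.paged_each.each_slice(1000) { |batch| dest_ds.multi_insert(batch) }`.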

This is also serving us well with SQL Server tables that are lightweight. 
Where we hit the snag is a denormalized table, ~80 columns wide with just 
under 1.5 million rows, that we want to move between a data mart and a 
warehouse.

We're seeing TinyTDS connection timeouts from the read call that we send 
to Dataset#paged_each. I've tried some debugging but I'm not making much 
progress. Any help or thoughts are appreciated!

Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sequel-talk.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sequel-talk/7dd07add-340a-4c9d-a1c1-426e0f66bb28%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
