Well the approach is working already interfacing directly with ruby-oci8. 
I'm in more of a refactor all the things mode right now.

On of the biggest problems for me, is that my test db is an oracle instance 
over a slow VPN connecting to a underpowered / busy database in a lab god 
knows where. This gem/app is only ~700 lines of code and the test suite 
takes > 2 mins to run :(  So I have a couple of different reasons for 
looking into Sequel.  One of the biggies is that I'm hoping I can test 
against a local mysql db with the same schema 95% of the time, and then 
only test against oracle when I need to verify portability.

So I guess I should have been more specific.  I'm looking for a best 
practice as in, how would one of you guys write this code block?  This is 
what I'm doing with ruby-oci8 directly - 
https://gist.github.com/5be872c941c6e6a0ad28 , the piece of code that calls 
this bulk insert, slices the array.  So basically 
@all_the_things.slice(500) { |s| bulk_insert(s); these_failed(s) if 
things_go_boom }

The biggest hang up, is that with all the other stupid things I have to 
deal with, is that I need to batch insert, but I also HAVE to know what 
records failed and store info on those.  The later is what makes this a 
pain.

For instance.  If I bulk_insert(100,000) records and then commit.  It 
behaves in a sorta nice way in that it will return an error, a row count of 
how many where successful inserted, and "discard" the unsuccessful ones. 
 By discard, I mean it provides no information about what elements of the 
array/records failed.

This alone should be enough, just throwing away anything records that 
failed. But nooooooo.  The clients sending the data in are so fubar'd they 
consistently send duplicate data.  So this not only causes issues with the 
transactional db, but when it data gets moved to the warehouse.

So basically.  I'm wondering if I could refactor out a ton of code and with 
Sequel in my tool belt.  I'm looking at this right now - 
http://sequel.rubyforge.org/rdoc/classes/Sequel/Database.html#method-i-after_commit
 
and after_rollback

So given @stats = 50,000 "rows to be inserted"

How can I accomplish / can this be done?

DB[:stats].import(cols, @stats, :commit_every => 500)

While at the same time:

after_commit do |slice_of_500???|
after_rollback do |something_else|

On Saturday, January 5, 2013 9:59:23 PM UTC-6, Jeremy Evans wrote:
>
> On Friday, January 4, 2013 7:48:43 PM UTC-8, Brandt Lofton wrote: 
> > Alright, so I'm new to Sequel and I'm enjoying it quite a bit after 
> getting over having to make a connection before defining a model class. 
> > 
> > 
> > So here is the logic I need to use. 
> > 
> > 
> > Collect x rows of data, x = ~30k usually 
> > Loop through these records in batches of say, ~500 
> > Model import(those 500 records)Import Fail? Move these bad 500 to a low 
> priority queue so I can loop through them 1 by 1Do another 500 
> > I know what your asking... Why do you have to do this?  Because I didn't 
> design the part that the data comes from, or the database it ends up in. 
> > I cannot "detect" this bad data because it is violating duplicate key 
> constraints on the db side.  So I don't know if its bad until the database 
> tells me so. 
> > 
> > 
> > So anyways, I have been playing with the options Sequel has available. 
>  DB.transaction, Model.import(:slice => 1000) etc and trying to figure out 
> what's going to do the trick. 
> > 
> > 
> > But for once I'm going to take a step back and try some advice first. 
>  Anybody more experienced in this regard have any best practice advice for 
> me before I go much further? I'm over my wtf quota for today... 
>
> Your approach sounds reasonable.  If you are using MySQL, you might want 
> to use insert_ignore, which will skip duplicate records automatically. 
>
> Thanks, 
> Jeremy 
>

-- 
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/sequel-talk/-/8CjcVoeFy4gJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sequel-talk?hl=en.

Reply via email to