On Monday, July 6, 2015 at 3:53:04 PM UTC-7, [email protected] wrote: > > I have a Sequel application using PostgreSQL with some tables having > pre-computed, non-sequence primary keys. I want to just ignore duplicate > key inserts, so I currently use something like this (where MyDoc is a > Sequel Model): > > begin >> MyDoc.create( h ) >> rescue Sequel::UniqueConstraintViolation => e >> nil >> end >> > > This works, except that I've noticed I'm filling up my PostgreSQL > production logs with output of this form on each duplicate key: > > 2015-07-06 00:00:10.854 UTC >ERROR: duplicate key value violates unique >> constraint "my_docs_pkey" >> 2015-07-06 00:00:10.854 UTC >DETAIL: Key >> (ghash)=(HpeWX0-URGuE_cwYU0uH6eJ) already exists. >> 2015-07-06 00:00:10.854 UTC >STATEMENT: INSERT INTO "my_docs" ... >> > > ..and with the normal log rotation this is producing 2GB of extra log. The > disk I/O for the log output can't be a great thing. PostgreSQL > configuration doesn't support turning off *only* duplicate key errors. > I'd have to turn off all ERRORs by setting a global FATAL log level, and by > doing so risk missing other important ERRORs. The PostgreSQL devs > apparently feel its better to "fix" my SQL instead of supporting this kind > of log tuning. So I'd like to change my insert SQL from: > > INSERT INTO "my_docs" (columns...) VALUES (*values...*) RETURNING *; > > to: > > INSERT INTO "my_docs" > SELECT (*values...*) > WHERE NOT EXISTS (SELECT 1 FROM "my_docs" WHERE my_id = key) > RETURNING *; > > This is as described in http://stackoverflow.com/a/12388511. The > advantage here is its still one round trip for the insert and no error > produced for the common duplicate case. There is apparently still a slight > change of a race, but this should rare enough that I wouldn't mind logging > those duplicates. > > How might I best use Sequel to write a custom "create?" method with the > above "INSERT WHERE NOT EXISTS" syntax on MyDoc model? Also might support > for this this make sense as a Sequel extension? >
Unless performance is bottlenecked on this, I would just do: MyDoc.create(h) if MyDoc.where(:my_id=>h[:my_id]).empty? That causes an extra query per insert, though. Alternatively, you could add a before insert trigger that checked for duplicates and ignored them. Starting in PostgreSQL 9.5, you'll be able to use INSERT ON CONFLICT DO NOTHING (http://www.postgresql.org/docs/9.5/static/sql-insert.html), which Sequel will support before PostgreSQL 9.5 is officially released. In terms of actually doing this automatically, something like this should work: def dataset.insert_sql(*v) if v.length ==1 && v.first.is_a?(Hash) h = v.first cols, vals = [], []; h.each{|k, v| cols << k; vals << v} super(cols, db.select(*vals).exclude(select(1).where(:my_id=>h[:my_id]).exists)) else super end end Use that dataset as the dataset for the MyDoc model. Thanks, Jeremy -- You received this message because you are subscribed to the Google Groups "sequel-talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/sequel-talk. For more options, visit https://groups.google.com/d/optout.
