On Monday, July 6, 2015 at 3:53:04 PM UTC-7, [email protected] wrote:
>
> I have a Sequel application using PostgreSQL with some tables having 
> pre-computed, non-sequence primary keys.  I want to just ignore duplicate 
> key inserts, so I currently use something like this (where MyDoc is a 
> Sequel Model):
>
>     begin
>>       MyDoc.create( h )
>>     rescue Sequel::UniqueConstraintViolation => e
>>       nil
>>     end
>>
>
> This works, except that I've noticed I'm filling up my PostgreSQL 
> production logs with output of this form on each duplicate key:
>
>  2015-07-06 00:00:10.854 UTC >ERROR:  duplicate key value violates unique 
>> constraint "my_docs_pkey"
>>  2015-07-06 00:00:10.854 UTC >DETAIL:  Key 
>> (ghash)=(HpeWX0-URGuE_cwYU0uH6eJ) already exists.
>>  2015-07-06 00:00:10.854 UTC >STATEMENT:  INSERT INTO "my_docs" ...
>>
>
> ..and with the normal log rotation this is producing 2GB of extra log. The 
> disk I/O for the log output can't be a great thing. PostgreSQL 
> configuration doesn't support turning off *only* duplicate key errors. 
> I'd have to turn off all ERRORs by setting a global FATAL log level, and by 
> doing so risk missing other important ERRORs. The PostgreSQL devs 
> apparently feel its better to "fix" my SQL instead of supporting this kind 
> of log tuning.  So I'd like to change my insert SQL from:
>
> INSERT INTO "my_docs" (columns...) VALUES (*values...*) RETURNING *;
>
> to:
>
>          INSERT INTO "my_docs"
>          SELECT (*values...*)
>          WHERE NOT EXISTS (SELECT 1 FROM "my_docs" WHERE my_id = key)
>          RETURNING *;
>
> This is as described in http://stackoverflow.com/a/12388511.  The 
> advantage here is its still one round trip for the insert and no error 
> produced for the common duplicate case. There is apparently still a slight 
> change of a race, but this should rare enough that I wouldn't mind logging 
> those duplicates.  
>
> How might I best use Sequel to write a custom "create?" method with the 
> above "INSERT WHERE NOT EXISTS" syntax on MyDoc model?  Also might support 
> for this this make sense as a Sequel extension? 
>

Unless performance is bottlenecked on this, I would just do:

  MyDoc.create(h) if MyDoc.where(:my_id=>h[:my_id]).empty?

That causes an extra query per insert, though.

Alternatively, you could add a before insert trigger that checked for 
duplicates and ignored them.

Starting in PostgreSQL 9.5, you'll be able to use INSERT ON CONFLICT DO 
NOTHING (http://www.postgresql.org/docs/9.5/static/sql-insert.html), which 
Sequel will support before PostgreSQL 9.5 is officially released.

In terms of actually doing this automatically, something like this should 
work:

  def dataset.insert_sql(*v)
    if v.length ==1 && v.first.is_a?(Hash)
      h = v.first
      cols, vals = [], [];
      h.each{|k, v| cols << k; vals << v}
      super(cols, 
db.select(*vals).exclude(select(1).where(:my_id=>h[:my_id]).exists))
    else
      super
    end
  end

Use that dataset as the dataset for the MyDoc model.

Thanks,
Jeremy

-- 
You received this message because you are subscribed to the Google Groups 
"sequel-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sequel-talk.
For more options, visit https://groups.google.com/d/optout.

Reply via email to