On 5/4/05, Mischa Sandberg <[EMAIL PROTECTED]> wrote: > Quoting David Roussel <[EMAIL PROTECTED]>: > > > > COPY invokes all the same logic as INSERT on the server side > > > (rowexclusive locking, transaction log, updating indexes, rules). > > > The difference is that all the rows are inserted as a single > > > transaction. This reduces the number of fsync's on the xlog, > > > which may be a limiting factor for you. You'll want to crank > > > WAL_BUFFERS and CHECKPOINT_SEGMENTS to match, though. > > > One of my streams has 6K records; I run with WB=1000, CS=128. > > > > So what's the difference between a COPY and a batch of INSERT > > statements. Also, surely, fsyncs only occur at the end of a > > transaction, no need to fsync before a commit has been issued, > > right? > > Sorry, I was comparing granularities the other way araound. As far as > xlog is concerned, a COPY is ALWAYS one big txn, no matter how many > putline commands you use to feed the copy. With inserts, you can choose > whether to commit every row, every nth row, etc. > > Copy makes better use of the TCP connection for transmission. COPY uses > the TCP connection like a one-way pipe. INSERT is like an RPC: the > sender has to wait until the insert's return status roundtrips.
I have found even greater performance increases by using COPY FROM <filename> not COPY FROM STDIN. This allows the backend process to directly read the file, rather than shoving it over a pipe (thereby potentially hitting the CPU multiple times). My experience is that this is anywhere from 5-10x faster than INSERT statements on the whole, and sometimes 200x. Chris -- | Christopher Petrilli | [EMAIL PROTECTED] ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq