On 5/4/05, Mischa Sandberg <[EMAIL PROTECTED]> wrote:
Quoting David Roussel <[EMAIL PROTECTED]>:
COPY invokes all the same logic as INSERT on the server side (rowexclusive locking, transaction log, updating indexes, rules). The difference is that all the rows are inserted as a single transaction. This reduces the number of fsync's on the xlog, which may be a limiting factor for you. You'll want to crank WAL_BUFFERS and CHECKPOINT_SEGMENTS to match, though. One of my streams has 6K records; I run with WB=1000, CS=128.
So what's the difference between a COPY and a batch of INSERT statements. Also, surely, fsyncs only occur at the end of a transaction, no need to fsync before a commit has been issued, right?
Sorry, I was comparing granularities the other way araound. As far as xlog is concerned, a COPY is ALWAYS one big txn, no matter how many putline commands you use to feed the copy. With inserts, you can choose whether to commit every row, every nth row, etc.
Copy makes better use of the TCP connection for transmission. COPY uses the TCP connection like a one-way pipe. INSERT is like an RPC: the sender has to wait until the insert's return status roundtrips.
I have found even greater performance increases by using COPY FROM <filename> not COPY FROM STDIN. This allows the backend process to directly read the file, rather than shoving it over a pipe (thereby potentially hitting the CPU multiple times). My experience is that this is anywhere from 5-10x faster than INSERT statements on the whole, and sometimes 200x.
Chris
Unfortunately, COPY FROM '<file>' can only be done by a superuser. If you that option then that is great. If not...
-- Kind Regards, Keith
---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend