On Tue, Nov 13, 2012 at 1:27 PM, Heikki Linnakangas <hlinnakan...@vmware.com > wrote:
> On 13.11.2012 21:13, Jon Nelson wrote: > >> I was working on a data warehousing project where a fair number of files >> could be COPY'd more or less directly into tables. I have a somewhat nice >> machine to work with, and I ran on 75% of the cores I have (75% of 32 is >> 24). >> >> Performance was pretty bad. With 24 processes going, each backend (in >> COPY) >> spent 98% of it's time in semop (as identified by strace). I tried larger >> and smaller shared buffers, all sorts of other tweaks, until I tried >> reducing the number of concurrent processes from 24 to 4. >> >> Disk I/O went up (on average) at least 10X and strace reports that the top >> system calls are write (61%), recvfrom (25%), and lseek (14%) - pretty >> reasonable IMO. >> >> Given that each COPY is into it's own, newly-made table with no indices or >> foreign keys, etc, I would have expected the interaction among the >> backends >> to be minimal, but that doesn't appear to be the case. What is the likely >> cause of the semops? >> > > I'd guess it's lock contention on WALInsertLock. That means, the system is > experiencing lock contention on generating WAL records for the insertions. > If that theory is correct, you ought to get a big gain if you have > wal_level=minimal, and you create or truncate the table in the same > transaction with the COPY. That allows the system to skip WAL-logging the > COPY. > wal_level doesn't exist for 8.4, but I have archive_mode = "off" and I am creating the table in the same transaction as the COPY. > > Or you could upgrade to 9.2. The WAL-logging of bulk COPY was optimized in > 9.2, it should help precisely the scenario you're facing. > Unfortunately, that's what I was expecting. -- Jon