Re: [PERFORM] copy vs. C function

Craig Ringer Sat, 10 Dec 2011 18:33:23 -0800

On 12/11/2011 09:27 AM, Jon Nelson wrote:

> The first method involved writing a C program to parse a file, parse
> the lines and output newly-formatted lines in a format that
> postgresql's COPY function can use.
> End-to-end, this takes 15 seconds for about 250MB (read 250MB, parse,
> output new data to new file -- 4 seconds, COPY new file -- 10
> seconds).

Why not `COPY tablename FROM '/path/to/myfifo'`?

Just connect your import program to a named pipe (fifo) created with
`mknod myfifo p`, either by redirecting stdout or by open()ing the fifo
for write. Then have Pg read from the fifo. You'll save a round of disk
writes and reads.
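
A rough sketch of the writer side in C, in case it helps. This is
untested against your data and makes a few assumptions: the function
names (open_fifo_for_write, write_copy_field, write_copy_row) are made
up for illustration, the 0644 mode is a guess at what lets the server
user read the fifo, and the output targets COPY's default text format
(tab-delimited, \N for NULL, backslash escapes).

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create the named pipe if it does not exist yet, then open it for
 * writing. The fopen() call blocks until a reader (here, the backend
 * running COPY) opens the other end. The 0644 mode is an assumption:
 * the server's OS user needs read access to the fifo. */
FILE *open_fifo_for_write(const char *path)
{
    if (mkfifo(path, 0644) != 0 && errno != EEXIST)
        return NULL;
    return fopen(path, "w");
}

/* Write one field using COPY text-format escaping.
 * A NULL pointer is emitted as \N. */
void write_copy_field(FILE *out, const char *s)
{
    if (s == NULL) {
        fputs("\\N", out);
        return;
    }
    for (; *s != '\0'; s++) {
        switch (*s) {
        case '\\': fputs("\\\\", out); break;
        case '\t': fputs("\\t", out);  break;
        case '\n': fputs("\\n", out);  break;
        case '\r': fputs("\\r", out);  break;
        default:   fputc(*s, out);     break;
        }
    }
}

/* Write one row: fields separated by tabs, terminated by a newline. */
void write_copy_row(FILE *out, const char *const *fields, size_t nfields)
{
    for (size_t i = 0; i < nfields; i++) {
        if (i > 0)
            fputc('\t', out);
        write_copy_field(out, fields[i]);
    }
    fputc('\n', out);
}
```

The import program would call open_fifo_for_write() once, call
write_copy_row() for each parsed line, and fclose() the stream at the
end so that COPY sees end-of-file. Note that COPY ... FROM 'filename'
is read by the server process, so the fifo has to live on the database
host.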

> The next approach I took was to write a C function in postgresql to
> parse a single TEXT datum into an array of C strings, and then use
> BuildTupleFromCStrings. There are 8 columns involved.
> Eliding the time it takes to COPY the (raw) file into a temporary
> table, this method took 120 seconds, give or take.
>
> The difference was /quite/ a surprise to me. What is the probability
> that I am doing something very, very wrong?

Have a look at how COPY does it within the Pg sources and see if that's
any help. I don't know enough about Pg's innards to answer this one
beyond that suggestion, sorry.


--
Craig Ringer

--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
