I was doing some testing on "insert" compared to "select into". I inserted 100,000 rows (with 8 column values each) into a table, which took 14 seconds, compared to a select into, which took 0.8 seconds. (FYI, the inserts were batched, autocommit was turned off, and it all happened on the local machine.)

        Did you use prepared statements?
Did you use INSERT INTO ... VALUES () with a long list of values, or just 100K separate insert statements?

It's the time to parse statements, plan, execute, do roundtrips with the client, context switches, the time for your client library to escape and encode the data and for Postgres to decode it, etc. In a word: OVERHEAD.

        By the way, which language and client library are you using?

FYI, 14 s / 100K = 140 microseconds per individual SQL query. That ain't slow at all.

Does the select into translate into a specially optimised function in C that can cut corners which an insert cannot (e.g. lazy copying), or is there some other reason?

Yeah: instead of your client having to encode 100K * 8 values, send them over a socket, and Postgres having to decode them, INSERT INTO ... SELECT just takes the data and writes it. Same thing as writing a file one byte at a time versus using a big buffer.
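
As a minimal sketch (table and column names here are made up, just to show the shape of the two approaches):

    -- what 100K client-side statements boil down to, one round-trip each:
    INSERT INTO dest (a, b, c, d, e, f, g, h) VALUES (1, 2, 3, 4, 5, 6, 7, 8);

    -- the server-side forms: the data never leaves the backend
    SELECT * INTO dest2 FROM source;
    INSERT INTO dest SELECT * FROM source;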

The reason I am asking is that select into shows that a number of rows can be inserted into a table quite a lot faster than one would think was possible with ordinary SQL. If that is the case, it means that if I write my insert function in C instead of PL/pgSQL or plain SQL, then I can have my db perform an order of magnitude faster.

Fortunately this is already done for you: there is the PREPARE statement, which will remove the parsing overhead. If you must insert many rows, use VALUES (...), (...), (...).
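
For example (again with made-up table and column names), a prepared insert is parsed and planned once, and a multi-row VALUES list sends many rows in a single statement:

    PREPARE ins (int, text) AS
        INSERT INTO items (id, name) VALUES ($1, $2);
    EXECUTE ins(1, 'foo');   -- repeat per row: no re-parsing, no re-planning
    EXECUTE ins(2, 'bar');

    -- or batch many rows into one statement (PostgreSQL 8.2 and later):
    INSERT INTO items (id, name) VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');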
