PFC wrote:
Unfortunately it's not fast enough; it needs to be done in no more than
1-2 seconds (and in production it will be maybe 20-50 columns of
data, perhaps divided over 5-10 tables).
Additionally, it needs to scale to perhaps three times as many columns
and perhaps 2-3 times as many rows in some situations, within 1 second.
Furthermore, it needs to allow for about 20-50 clients reading much of
that data before the next batch of data arrives.
Wow. What is the application?
Geological surveys, where they perform real-time geophone/hydrophone
shots of areas about 10x10 km in size every 3-15 seconds.
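The test function below assumes a six-integer-column target table; a
hypothetical definition (none was shown in the thread) would be:

    test=> CREATE TABLE test (a INTEGER, b INTEGER, c INTEGER,
                              d INTEGER, e INTEGER, f INTEGER);
    CREATE TABLE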
test=> CREATE OR REPLACE FUNCTION test_insert( )
    RETURNS VOID
    LANGUAGE plpgsql
AS
$$
DECLARE
    _i INTEGER;
BEGIN
    -- insert 100,001 rows, varying only the last column
    FOR _i IN 0..100000 LOOP
        INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i);
    END LOOP;
END;
$$;
CREATE FUNCTION
Time: 51.948 ms
test=> SELECT test_insert();
 test_insert
-------------

(1 row)
Time: 1885.382 ms
I tested this one and it took 4 seconds, compared to the JDBC insert,
which took 14 seconds, so it's a lot faster, but not as fast as the
SELECT INTO.
I also tested an INSERT INTO ... SELECT, which took 1.8 seconds; now we
are starting to talk about real performance.
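A minimal sketch of that pattern, assuming an already-populated staging
table named staging with matching columns (the name is illustrative, not
from the tests above):

    -- one statement loads everything, so per-row parse/plan
    -- overhead is paid only once
    INSERT INTO test (a, b, c, d, e, f)
    SELECT a, b, c, d, e, f FROM staging;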
However, COPY is much faster still, because the parsing and de-escaping
of the data carry less overhead; COPY is optimized for throughput.
So, advice:
For optimum throughput, have your application build chunks of data
into text files and use COPY (see the sketch below). Or, if your client
library supports the copy interface, use it.
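A minimal sketch of the file-based approach, assuming a tab-separated
text file at /tmp/batch.txt (a hypothetical path), one row per line:

    -- server-side COPY: the file must be readable on the database host
    COPY test (a, b, c, d, e, f) FROM '/tmp/batch.txt';

    -- or, from psql, \copy streams the same file from the client machine
    \copy test (a, b, c, d, e, f) from '/tmp/batch.txt'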
I did test COPY, i.e. the JDBC COPY patch for pg 8.1; it runs in
approx 1.8 seconds :) That test was done with text input; I am going to
test it with binary input, which I expect will increase the performance
by 20-50%.
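For reference, on the server side the binary variant only changes the
COPY invocation; the client must then send rows in PostgreSQL's binary
COPY format instead of text (a sketch, not taken from the patch itself):

    -- text format (what the 1.8 s test above used)
    COPY test (a, b, c, d, e, f) FROM STDIN;

    -- binary format: skips text parsing and de-escaping entirely
    COPY test (a, b, c, d, e, f) FROM STDIN BINARY;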
All these tests have been performed on a laptop with a Kubuntu 6.10
build of pg 8.1, without any special pg performance tuning. So I expect
that compiling the latest pg, doing some tuning on it, and testing it on
a representative server will give an additional boost in performance.
The key here is that with an abundance of performance, I can experiment
with the solution in a completely different way than if I had any
"artificial" restrictions.
You will need a fast disk system with xlog and data on separate
disks, several CPU cores (one insert thread will max out one core; use
the others for selects), lots of RAM so index updates don't need to
seek, and tuning of the bgwriter and checkpoints to avoid load spikes.
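A sketch of the kind of postgresql.conf knobs meant here, for pg 8.1;
the values are illustrative assumptions, not recommendations:

    shared_buffers = 32768        # 8 kB pages, i.e. 256 MB of buffer cache
    wal_buffers = 64              # more WAL buffering for the insert stream
    checkpoint_segments = 16      # fewer, larger checkpoints
    checkpoint_timeout = 600      # seconds between forced checkpoints
    bgwriter_delay = 200          # ms between bgwriter rounds
    bgwriter_lru_percent = 2.0    # spread dirty-page writes out over time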
Will have a look at it.
regards
thomas