On 15.11.2013 07:47, David Rowley wrote:
On Fri, Nov 15, 2013 at 3:03 AM, Heikki Linnakangas <hlinnakan...@vmware.com
wrote

I think that means that we should just completely replace the list with
the hash table. The difference with a small N is lost in noise, so there's
no point in keeping the list as a fast path for small N. That'll make the
patch somewhat simpler.

Attached is a much more simple patch which gets rid of the initial linked
list.

Thanks, committed with minor copy-editing. I dialed down the initial size of the hash table from 1000 to 16, that ought to be enough.

I ran a quick performance test of my own, based on the script you sent. I modified it a bit to eliminate the PL/pgSQL overhead, making it more heavily bottlenecked by the nextval/currval overhead. Results:

nextval, 10000 seqs     36772   2426
currval, 1 seq          1176    1069
currval, 2 seqs         865     857
currval, 4 seqs         742     759
currval, 5 seqs         718     711
currval, 10 seqs        680     668
currval, 100 seqs       871     656
currval, 1000 seqs      3507    700
currval, 10000 seqs     34742   1224

The performance when you touch only a few sequences is unchanged. When you touch a lot of them, you gain. Just as you would expect.

Attached is the test script I used. After running the test, I realized that there's a little flaw in the test methodology, but I doesn't invalidates the results. I used the same backend for all the test runs, so even when currval() is called repeatedly for a single sequence, the linked list (or hash table, with the patch) nevertheless contains entries for all 10000 sequences. However, the sequences actually used by the test are always in the front of the list, because the nextval() calls were made in the same order. But with the unpatched version, if you called currval() on the lastly initialized sequence repeatedly, instead of the firstly initialized one, you would get much would get horrible performance, even when you touch only a single sequence.

Regarding the more grandiose ideas of using the relcache or rewriting the way sequences are stored altogether: this patch might become obsolete if we do any of that stuff, but that's ok. The immediate performance problem has been solved now, but those other ideas might be worth pursuing for other reasons.

- Heikki
CREATE or replace FUNCTION create_seq(n integer) RETURNS void
    LANGUAGE plpgsql
    AS $$
BEGIN
    drop table if exists testseqs;
    create table testseqs(seqoid oid, n int4);
    WHILE n > 0 LOOP
        EXECUTE 'CREATE SEQUENCE testseq' || n;
	insert into testseqs select oid, n from pg_class where relname=('testseq' || n);
      n := n - 1;
    END LOOP;
END
$$;

CREATE or replace FUNCTION drop_seq() RETURNS void language plpgsql as
$$
declare
   n int4;
begin
   for n in select testseqs.n from testseqs loop
     execute 'drop sequence testseq' || n;
   end loop;
end;
$$;


CREATE or replace FUNCTION nextval_seq(nseqs integer, niter integer) RETURNS bigint
    LANGUAGE sql
    AS $$
select count(*) from (
  select nextval(seqoid), n, g from testseqs, generate_series(1, niter) g where n <= nseqs
) foo;
$$;


CREATE or replace FUNCTION currval_seq(nseqs integer, niter integer) RETURNS bigint
    LANGUAGE sql
    AS $$
select count(*) from (
  select currval(seqoid), n, g from testseqs, generate_series(1, niter) g where n <= nseqs
) foo;
$$;


\timing on
--select create_seq(10000);

\echo warmup
select nextval_seq(10000, 100);

\echo calling nextval on 10000 distinct sequences, 100 times
select nextval_seq(10000, 100);

\echo calling currval on 1 sequence, 1000000 times
select currval_seq(1, 1000000);

\echo calling currval on 2 sequences, 500000 times
select currval_seq(2, 500000);

\echo calling currval on 4 sequences, 250000 times
select currval_seq(4, 250000);

\echo calling currval on 5 sequences, 200000 times
select currval_seq(5, 200000);

\echo calling currval on 10 sequences, 100000 times
select currval_seq(10, 100000);

\echo calling currval on 100 sequences, 10000 times
select currval_seq(100, 10000);

\echo calling currval on 1000 sequences, 1000 times
select currval_seq(1000, 1000);

\echo calling currval on 10000 distinct sequences, 100 times
select currval_seq(10000, 100);


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to