Hayato Kuroda (Fujitsu) <kuroda.hay...@fujitsu.com> wrote on Tue, 4 Jul
2023 at 08:42:
> > > But in the later patch the tablesync worker tries to reuse the slot
> > > during the synchronization, so in this case the application_name
> > > should be the same as the slot name.
> > >
> >
> > Fair enough. I am slightly afraid that if we can't show the benefits
> > with the later patches then we may need to drop them, but at this
> > stage I feel we need to investigate why they are not helping.
>
> Agreed. Now I'm planning to do performance testing independently. We can
> discuss based on that or on Melih's results.
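(On the application_name point above: a rough way to check what the sync
workers' connections look like on the publisher, and whether the name
matches the reused slot as discussed, is plain pg_stat_activity, nothing
patch-specific:

--- on publisher
SELECT pid, application_name, backend_start
FROM pg_stat_activity
WHERE backend_type = 'walsender';
)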

Attached below is the script I use for performance testing of this patch.

So far I have only benchmarked the patch set with connection reuse very
roughly, but it already seems to improve things quite significantly. For
example, syncing 100 empty tables took 611 ms with connection reuse versus
1782 ms without it, roughly a 2.9x speedup.
The first 3 patches from the set bring a good amount of that improvement;
I'm not sure about the later patches yet.

Amit Kapila <amit.kapil...@gmail.com> wrote on Mon, 3 Jul 2023 at 08:59:
> On thinking about this, I think the primary benefit we were expecting
> was from saving network round trips for slot drop/create, but now that
> we anyway need an extra round trip to establish a snapshot, such a
> benefit is not visible. This is just a theory, so we should validate
> it. Another idea, as discussed before [1], could be to try copying
> multiple tables in a single transaction. Now, keeping a transaction
> open for a longer time could have side-effects on the publisher node.
> So, we probably need to ensure that we don't perform multiple large
> syncs, and even for smaller tables (and later sequences) perform it
> only for some threshold number of tables, which we can figure out by
> some tests. Also, the other safety-check could be that anytime we need
> to perform streaming (sync with apply worker), we won't copy more
> tables in the same transaction.
>
> Thoughts?

Yeah, maybe going to the publisher to create only a snapshot instead of a
whole slot does not really make enough of a difference. I was hoping that
creating only a snapshot from an existing replication slot would help
performance. I guess I was either wrong or am missing something in the
implementation.
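For reference, the round trip under discussion looks like this today; a
minimal sketch over a replication connection (the slot name here is made
up for illustration, and I'm assuming the PG15+ walsender option syntax):

--- over a replication connection, e.g. psql "dbname=postgres replication=database"
CREATE_REPLICATION_SLOT mysub_slot_tmp TEMPORARY LOGICAL pgoutput (SNAPSHOT 'use');

The later patches aim to skip re-creating the slot and only establish a
snapshot on the existing one, but that still costs a round trip before
COPY can start, which may be why the win is small for empty tables.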

The tricky bit with keeping a transaction open to copy multiple tables
is deciding how many tables one transaction should copy.
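For the publisher-side concern above: a long-open copy transaction holds
back the xmin horizon and can delay vacuum there. A rough way to keep an
eye on that while experimenting (standard pg_stat_activity columns):

--- on publisher
SELECT pid, application_name, backend_xmin, now() - xact_start AS xact_age
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xact_age DESC;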

Thanks,
-- 
Melih Mutlu
Microsoft
--- on publisher
SELECT 'CREATE TABLE manytables_'||i||'(i int);' FROM generate_series(1, 100) g(i) \gexec
-- the subscription created below expects publication "mypub" to exist
CREATE PUBLICATION mypub FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('mysub_slot', 'pgoutput');

--- on subscriber
SELECT 'CREATE TABLE manytables_'||i||'(i int);' FROM generate_series(1, 100) g(i) \gexec

CREATE OR REPLACE PROCEDURE log_rep_test(max INTEGER) AS $$
DECLARE
    counter INTEGER := 1;
    total_duration INTERVAL := '0';
    avg_duration FLOAT := 0.0;
    start_time TIMESTAMP;
    end_time TIMESTAMP;
BEGIN
    WHILE counter <= max LOOP
        EXECUTE 'DROP SUBSCRIPTION IF EXISTS mysub;';

        start_time := clock_timestamp();
        EXECUTE 'CREATE SUBSCRIPTION mysub CONNECTION ''dbname=postgres port=5432'' PUBLICATION mypub WITH (create_slot=false, slot_name=''mysub_slot'');';
        COMMIT;

        -- Busy-wait until all tables reach the "ready" state. The COMMIT
        -- starts a new transaction, so each check sees a fresh snapshot.
        WHILE EXISTS (SELECT 1 FROM pg_subscription_rel WHERE srsubstate != 'r') LOOP
            COMMIT;
        END LOOP;

        end_time := clock_timestamp();

        -- Detach the slot so DROP SUBSCRIPTION doesn't drop it on the
        -- publisher and the next iteration can reuse it.
        EXECUTE 'ALTER SUBSCRIPTION mysub DISABLE;';
        EXECUTE 'ALTER SUBSCRIPTION mysub SET (slot_name = none);';

        total_duration := total_duration + (end_time - start_time);
        counter := counter + 1;
    END LOOP;

    IF max > 0 THEN
        avg_duration := EXTRACT(EPOCH FROM total_duration) / max * 1000;
    END IF;

    RAISE NOTICE 'average sync time: % ms', avg_duration;
END;
$$ LANGUAGE plpgsql;


CALL log_rep_test(5);
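While the procedure loops, sync progress can be eyeballed from another
subscriber session; this just counts tables per state ('r' = ready):

--- on subscriber, separate session
SELECT srsubstate, count(*) FROM pg_subscription_rel GROUP BY srsubstate;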
