On 5/5/25 11:30, Ruben Morais wrote:

HI,

Could be a hint but test with jit to off.
If not wrong as you change from 11 to 17, that could be a cause, just try it because in some cases plans changed when jit is on.

Not only JIT but also other extensions (such as timescale) could greatly affect the plan.

He could find if any GUC are set for the particular user :

select * from pg_db_role_setting where setrole = to_regrole('<unlucky_user>');


Regards,
*Rúben Morais*



On Mon, May 5, 2025 at 11:07 AM Achilleas Mantzios <a.mantz...@cloud.gatewaynet.com> wrote:


    On 5/5/25 11:00, Mladen Marinović wrote:


    On Mon, May 5, 2025 at 11:24 AM Achilleas Mantzios
    <a.mantz...@cloud.gatewaynet.com> wrote:


        On 5/5/25 09:52, Mladen Marinović wrote:
        Hi,

        We recently migrated our production instances from PG11 to
        PG17. While doing so we upgraded our pgBouncer instances
        from 1.12 to 1.24. As everything worked on the test servers
        we pushed this to production a few weeks ago. We did not
        notice any problems until a few days ago (but the problems
        were here from the start). The main manifestation of the
        problems is a service that runs a fixed query to get a
        backlog of unprocessed data (limited to a 1000 rows). When
        testing the query using pgAdmin connected directly to the
        database we get a result in cca. 20 seconds. The same query
        runs for 2 hours when using pgBouncer to connect to the same
        database.


        That's a huge jump, I hope you guys did extensive testing of
        your app. In which language is your app written? If java,
        then define prepareThreshold=0 in your jdbc and set
        max_prepared_statements = 0 in pgbouncer.

    Mainly python, but the problem was noticed in a java service.
    Prepare treshold was already set to 0. We changed the
    max_prepared_statements to 0 from the default (200) but no change
    was noticed.

        How about search paths ? any difference on those between the
        two runs ? Do you set search_path in pgbouncer ? what is
        "cca." btw ?


        The more interesting part is that when we issue an explain
        of the same query we get different plans. We did this a few
        seconds apart so there should be no difference in collected
        statistics. We ruled out prepared statements, as we
        suspected the generic plan might be the problem, but it is
        not. Is there any pgBouncer or PG17 parameter that might be
        the cause of this?


        Does this spawn any connections (such as dblink) ? are there
        limits per user/db pool_size in pgbouncer ?

    No additional connection nor dbling. Just plain SQL (CTE, SELECT,
    INSERT, UPDATE, DELETE,...) There are limits, but they are not
    hit. The query just uses a different plan and runs slower because
    of that.

        Pgbouncer, in contrast to its old friend PgPool-II is
        completely passive, just passes through SQL to the server as
        fast as possible as it can. But I am sure you know that. Good
        luck, keep us posted!

    Yes, that is what puzzles me.

    What is the pgbouncer's timeout in the server connections ?

    How about "idle in transaction" ? do you get any of those? What's
    the isolation level ?

    How about the user ? is this the same user doing pgadmin queries
    VS via the app ?

    Can you identify the user under which the problem is manifested and :

    ALTER user "unlucky_user" SET log_statement = 'all';

    ALTER user "unlucky_user" SET log_min_duration_statement = 0; --
    to help you debug the prepared statements .. just in case , and
    other stuff not printed by log_statement = all.

    Regards,
    Mladen Marinović

Reply via email to