On Mon, May 5, 2025 at 2:36 PM Achilleas Mantzios <
a.mantz...@cloud.gatewaynet.com> wrote:

>
> On 5/5/25 13:27, Mladen Marinović wrote:
>
>
>
> On Mon, May 5, 2025 at 12:07 PM Achilleas Mantzios <
> a.mantz...@cloud.gatewaynet.com> wrote:
>
>>
>> On 5/5/25 11:00, Mladen Marinović wrote:
>>
>>
>>
>> On Mon, May 5, 2025 at 11:24 AM Achilleas Mantzios <
>> a.mantz...@cloud.gatewaynet.com> wrote:
>>
>>>
>>> On 5/5/25 09:52, Mladen Marinović wrote:
>>>
>>> Hi,
>>>
>>> We recently migrated our production instances from PG11 to PG17. While
>>> doing so we upgraded our pgBouncer instances from 1.12 to 1.24. As
>>> everything worked on the test servers we pushed this to production a few
>>> weeks ago. We did not notice any problems until a few days ago (but the
>>> problems were here from the start). The main manifestation of the problems
>>> is a service that runs a fixed query to get a backlog of unprocessed data
>>> (limited to a 1000 rows). When testing the query using pgAdmin connected
>>> directly to the database we get a result in cca. 20 seconds. The same query
>>> runs for 2 hours when using pgBouncer to connect to the same database.
>>>
>>>
>>> That's a huge jump, I hope you guys did extensive testing of your app.
>>> In which language is your app written? If java, then define
>>> prepareThreshold=0 in your jdbc and set max_prepared_statements = 0 in
>>> pgbouncer.
>>>
>> Mainly python, but the problem was noticed in a java service.
>> Prepare treshold was already set to 0. We changed the  
>> max_prepared_statements
>> to 0 from the default (200) but no change was noticed.
>>
>>> How about search paths ? any difference on those between the two runs ?
>>> Do you set search_path in pgbouncer ? what is "cca." btw ?
>>>
>>>
>>> The more interesting part is that when we issue an explain of the same
>>> query we get different plans. We did this a few seconds apart so there
>>> should be no difference in collected statistics. We ruled out prepared
>>> statements, as we suspected the generic plan might be the problem, but it
>>> is not. Is there any pgBouncer or PG17 parameter that might be the cause of
>>> this?
>>>
>>>
>>> Does this spawn any connections (such as dblink) ? are there limits per
>>> user/db pool_size in pgbouncer ?
>>>
>> No additional connection nor dbling. Just plain SQL (CTE, SELECT, INSERT,
>> UPDATE, DELETE,...) There are limits, but they are not hit. The query just
>> uses a different plan and runs slower because of that.
>>
>>> Pgbouncer, in contrast to its old friend PgPool-II is completely
>>> passive, just passes through SQL to the server as fast as possible as it
>>> can. But I am sure you know that. Good luck, keep us posted!
>>>
>> Yes, that is what puzzles me.
>>
>> What is the pgbouncer's timeout in the server connections ?
>>
>> How about "idle in transaction" ? do you get any of those? What's the
>> isolation level ?
>>
>> How about the user ? is this the same user doing pgadmin queries VS via
>> the app ?
>>
>> Can you identify the user under which the problem is manifested and :
>>
>> ALTER user "unlucky_user" SET log_statement = 'all';
>>
>> ALTER user "unlucky_user" SET log_min_duration_statement = 0; -- to help
>> you debug the prepared statements .. just in case , and other stuff not
>> printed by log_statement = all.
>>
> None of those parameters should affect the fact that when issuing the
> explain select query (the statement is not prepared) from psql directly
> gives a different result than issuing it over the pgbouncer connection. The
> result is repeatable.
>
> We have rolled back pgbouncer to 1.12. and it seems the problem persists.
> This is one of the weirdest things I have ever seen with PostgreSQL.
>
>
> ok, this is something, at least one more extreme thought ruled out. How
> about search_path ?  is this the SAME user that is issuing the statements
> in pgadmin VS pgbouncer ?
>
The user is the same, the search path is the same:
show search_path ;
   search_path
-----------------
 "$user", public
(1 row)

There is only one schema (public) so there is no posibility that one
connection uses different tables. As I remember correctly the planner uses
the postgresql parameters (from the conf file visible in pg_settings) and
table/row statistics to choose a plan. It is unclear to me how those
numbers can be different when using pgbouncer (I have tried selecting the
configuration using both connections and it is the same).


> Is there a connect_query inside pgbouncer's conf ?
>
The connect_query parameter is not used.

> you have to show all configuration involved and also full logging on the
> backend for said user.
>
I can provide redacted explains if it would help. Full logging is not
feasible.

>
>
>> Regards,
>> Mladen Marinović
>>
>>

Reply via email to