Control: reassign -1 debci-collector
Control: retitle -1 debci-collector should handle a postgresql connection failure in a better way

On Wed, Jun 11, 2025 at 02:43:14PM +0200, Christoph Berg wrote:
> Re: Helmut Grohne
> > That said, I'm not super convinced that this is a good solution. Maybe
> > spending more effort on the debci side is warranted. In principle, debci
> > should work with a remote postgresql server and then no such
> > notification can happen.
> 
> You are not alone. Apps freaking out after database restarts is still
> widely seen.

I'm reassigning the bug to debci-collector as there is no useful thing
postgresql can do to support the use case. What follows is context for
debci maintainers.

When restarting postgresql (and thus closing existing connections),
debci-collector gets stuck. You get this:

| E, [2025-05-31T06:44:25.008155 #1704] ERROR -- #<Bunny::Session:0x938 
[email protected]:5671, vhost=/, addresses=[ci.example.com:5671]>: Uncaught 
exception from consumer #<Bunny::Consumer:1355380 @channel_id=1 
@queue=debci_results> @consumer_tag=bunny-1747892779000-472546353095>: 
#<ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() FATAL:  
terminating connection due to administrator command
| server closed the connection unexpectedly
|         This probably means the server terminated abnormally
|         before or while processing the request.
| > @ 
/usr/share/rubygems-integration/all/gems/activerecord-6.1.7.10/lib/active_record/connection_adapters/postgresql_adapter.rb:687:in
 `exec_prepared'

And then for every further result being processed, you get this:

| E, [2025-05-31T06:45:05.396999 #1704] ERROR -- #<Bunny::Session:0x938 
[email protected]:5671, vhost=/, addresses=[ci.example.com:5671]>: Uncaught 
exception from consumer #<Bunny::Consumer:1355380 @channel_id=1 
@queue=debci_results> @consumer_tag=bunny-1747892779000-472546353095>: 
#<ActiveRecord::StatementInvalid: PG::ConnectionBad: PQsocket() can't get 
socket descriptor> @ 
/usr/share/rubygems-integration/all/gems/activerecord-6.1.7.10/lib/active_record/connection_adapters/postgresql_adapter.rb:687:in
 `exec_prepared'

This is due to how debci in general uses ActiveRecord. Looking into
lib/debci/db.rb, we may see that the last line is:

| Debci::DB.establish_connection

I understand this as one connection being opened at program startup and
kept for the entire process lifetime. When it is closed, stuff just
fails.

It's not clear to me how to fix this, but ActiveRecord does have
ActiveRecord::Base.connection_pool.with_connection. I guess a first step
would be wrapping all database interactions with this such that
ActiveRecord can keep track of when connections are leased and released.
Then, we may request that the pool closes idle connections, but I
wouldn't know how.
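
A minimal sketch of that first step, assuming a pool object that behaves
like ActiveRecord::Base.connection_pool. The helper name
with_verified_connection is hypothetical and not part of debci. It
leases a connection for the duration of the block and calls verify! on
it, which checks whether the connection is still alive and reconnects
if it is not.

```ruby
# Hypothetical helper, not part of debci. `pool` is assumed to behave
# like ActiveRecord::Base.connection_pool.
def with_verified_connection(pool)
  pool.with_connection do |conn|
    # verify! checks whether the connection is still active and
    # reconnects to the server if it is not.
    conn.verify!
    # Hand the connection to the caller; the block's value is returned.
    yield conn
  end
end
```

A call site would pass ActiveRecord::Base.connection_pool and put all
queries for one result inside the block. Whether this is enough to
recover from a postgresql restart in debci-collector is untested.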

The key complaint in this bug report is the failure mode. I suggest that
debci-collector become resilient to connection failure, but another way
of dealing
with this is propagating the exception and terminating the
debci-collector process such that systemd can restart it. Solving it
that way would be a reasonable thing to do from my point of view.
Unfortunately, I did not figure out where that exception is caught and
logged rather than propagated.
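
The "Uncaught exception from consumer" wording in the log looks like
the message of Bunny's default handler for consumer exceptions, so the
catch is likely inside Bunny rather than in debci. A minimal sketch of
the fail-fast variant, assuming that Bunny::Channel#on_uncaught_exception
is the right hook and that the systemd unit restarts the service on
failure. The helper name exit_on_consumer_failure and its keyword
arguments are hypothetical.

```ruby
require "logger"

# Hypothetical helper, not part of debci. `channel` is assumed to be a
# Bunny::Channel (anything responding to on_uncaught_exception works).
def exit_on_consumer_failure(channel,
                             logger: Logger.new($stderr),
                             terminate: -> { Process.exit!(1) })
  channel.on_uncaught_exception do |error, _consumer|
    logger.fatal("uncaught consumer exception: " \
                 "#{error.class}: #{error.message}")
    # The handler runs on a worker thread. Process.exit! ends the whole
    # process immediately instead of raising SystemExit in that thread.
    terminate.call
  end
end
```

This would be called once on the channel before subscribing to the
queue. It terminates on every uncaught consumer exception, not only on
database errors, which may or may not be desirable.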

Any ideas on how to move forward here?

Helmut
