Re: [Koha-devel] Background jobs quirks

2023-10-19 Thread David Cook via Koha-devel
In the stacktrace, there is a reference to 
/usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm which seems to be 
responsible for the reconnection.

 

Starman is based on Net::Server::PreFork (/usr/share/perl5/Net/Server/PreFork.pm), and I see that it 
ignores SIGPIPE signals. 

 

I’ve added a line to ignore SIGPIPE in background_jobs_worker.pl, and that fixes 
this problem: the worker ignores the SIGPIPE and simply reconnects. 
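
For anyone curious, the change is essentially a one-liner; a minimal sketch, assuming it sits near the top of misc/background_jobs_worker.pl (the actual patch on bug 35111 may differ):

    # Minimal sketch (the actual patch on bug 35111 may differ): discard
    # SIGPIPE from a dead database/STOMP socket so DBIx::Class can notice
    # the broken handle and reconnect, instead of the whole worker dying.
    $SIG{PIPE} = 'IGNORE';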

 

A lot of the reports of people’s issues with Koha’s background jobs have been 
vague, and I wonder if this one-line fix will resolve a lot of them…

 

In any case, I’ve opened up this bug report: 
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35111

 

I’ll get patches out for this ASAP. 

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: Jonathan Druart  
Sent: Thursday, October 19, 2023 5:41 PM
To: David Cook 
Cc: Koha Devel ; Tomas Cohen Arazi 

Subject: Re: [Koha-devel] Background jobs quirks

 

Hum weird, isn't DBIC supposed to deal with reconnection?

You could add the find in a try, then find again in the catch... (for debugging 
purposes!)

Or force a new DBMS connection for each job (C4::Context->dbh({new => 1})), but 
that's not a long term solution.
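
A rough sketch of that debugging idea (Try::Tiny and the surrounding variable names are illustrative here, not the actual worker code):

    use Try::Tiny;
    use C4::Context;
    use Koha::BackgroundJobs;

    # Illustrative only: the job id normally comes from the decoded STOMP
    # frame; 12 is just the id seen in the stack trace below.
    my $args = { job_id => 12 };

    # Retry the lookup once after forcing a fresh database connection, to
    # see whether a stale handle is what breaks find().
    my $job = try {
        Koha::BackgroundJobs->find( $args->{job_id} );
    } catch {
        warn "find() failed ($_); retrying with a new DB connection";
        C4::Context->dbh( { new => 1 } );
        Koha::BackgroundJobs->find( $args->{job_id} );
    };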

 

On Thu, 19 Oct 2023 at 08:04, David Cook <dc...@prosentient.com.au> wrote:

Just one last one….

 

I put a Carp onto the SIGPIPE handler and, sure enough, it is the 
Koha::BackgroundJobs->find() causing the SIGPIPE to be raised. 
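
(The handler was along these lines; it is a throwaway debugging aid, not part of any eventual fix, and the exact code at line 62 of my copy of the worker may differ slightly:)

    use Carp;
    # Temporary debugging handler: print a full stack trace whenever the
    # worker receives SIGPIPE, to see which call hits the dead socket.
    $SIG{PIPE} = sub { Carp::cluck(@_) };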

 

Of course, ignoring the SIGPIPE isn’t going to help because the job won’t be 
fetched. 

 

I’m not sure what the answer here is… and unfortunately I’ve run out of time 
today. With my work task, I’ll probably abandon using the background jobs for 
the moment, as I need to do something more reliable. I’ll think about this one 
later…

 

at /kohadevbox/koha/misc/background_jobs_worker.pl line 62.
    main::__ANON__("PIPE") called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
    eval {...} called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
    DBIx::Class::Storage::DBI::connected(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 850
    DBIx::Class::Storage::DBI::__ANON__(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 190
    DBIx::Class::Storage::BlockRunner::__ANON__() called at /usr/share/perl5/Context/Preserve.pm line 38
    Context::Preserve::preserve_context(CODE(0x55b81fa01620), "replace", CODE(0x55b81fa016c8)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 213
    DBIx::Class::Storage::BlockRunner::_run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 105
    DBIx::Class::Storage::BlockRunner::run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 856
    DBIx::Class::Storage::DBI::dbh_do(undef, undef, "SELECT `me`.`id`, `me`.`status`, `me`.`progress`, `me`.`size`"..., ARRAY(0x55b81f453d30), ARRAY(0x55b81f81d380)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 1939
    DBIx::Class::Storage::DBI::_execute(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), "select", ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f6cc0b8)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2584
    DBIx::Class::Storage::DBI::_select(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2761
    DBIx::Class::Storage::DBI::select_single(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 1101
    DBIx::Class::ResultSet::single(DBIx::Class::ResultSet=HASH(0x55b81d89b180)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 910
    DBIx::Class::ResultSet::find(DBIx::Class::ResultSet=HASH(0x55b81f9d6b68), 12) called at /kohadevbox/koha/Koha/Objects.pm line 96
    Koha::Objects::find("Koha::BackgroundJobs", 12) called at /kohadevbox/koha/misc/background_jobs_worker.pl line 132

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: David Cook <dc...@prosentient.com.au>
Sent: Thursday, October 19, 2023 4:26 PM
To: 'Koha Devel' <koha-devel

Re: [Koha-devel] Background jobs quirks

2023-10-19 Thread Jonathan Druart via Koha-devel
Hum weird, isn't DBIC supposed to deal with reconnection?
You could add the find in a try, then find again in the catch... (for debugging
purposes!)
Or force a new DBMS connection for each job (C4::Context->dbh({new => 1})),
but that's not a long term solution.

On Thu, 19 Oct 2023 at 08:04, David Cook  wrote:

> Just one last one….
>
>
>
> I put a Carp on to the SIGPIPE and sure enough it is the
> Koha::BackgroundJobs->find() causing the SIGPIPE to be raised.
>
>
>
> Of course, ignoring the SIGPIPE isn’t going to help because the job won’t
> be fetched.
>
>
>
> I’m not sure what the answer here is… and unfortunately I’ve run out of
> time today. With my work task, I’ll probably abandon using the background
> jobs for the moment, as I need to do something more reliable. I’ll think
> about this one later…
>
>
>
> at /kohadevbox/koha/misc/background_jobs_worker.pl line 62.
>     main::__ANON__("PIPE") called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
>     eval {...} called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
>     DBIx::Class::Storage::DBI::connected(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 850
>     DBIx::Class::Storage::DBI::__ANON__(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 190
>     DBIx::Class::Storage::BlockRunner::__ANON__() called at /usr/share/perl5/Context/Preserve.pm line 38
>     Context::Preserve::preserve_context(CODE(0x55b81fa01620), "replace", CODE(0x55b81fa016c8)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 213
>     DBIx::Class::Storage::BlockRunner::_run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 105
>     DBIx::Class::Storage::BlockRunner::run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 856
>     DBIx::Class::Storage::DBI::dbh_do(undef, undef, "SELECT `me`.`id`, `me`.`status`, `me`.`progress`, `me`.`size`"..., ARRAY(0x55b81f453d30), ARRAY(0x55b81f81d380)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 1939
>     DBIx::Class::Storage::DBI::_execute(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), "select", ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f6cc0b8)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2584
>     DBIx::Class::Storage::DBI::_select(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2761
>     DBIx::Class::Storage::DBI::select_single(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 1101
>     DBIx::Class::ResultSet::single(DBIx::Class::ResultSet=HASH(0x55b81d89b180)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 910
>     DBIx::Class::ResultSet::find(DBIx::Class::ResultSet=HASH(0x55b81f9d6b68), 12) called at /kohadevbox/koha/Koha/Objects.pm line 96
>     Koha::Objects::find("Koha::BackgroundJobs", 12) called at /kohadevbox/koha/misc/background_jobs_worker.pl line 132
>
>
>
> David Cook
>
> Senior Software Engineer
>
> Prosentient Systems
>
> Suite 7.03
>
> 6a Glen St
>
> Milsons Point NSW 2061
>
> Australia
>
>
>
> Office: 02 9212 0899
>
> Online: 02 8005 0595
>
>
>
> *From:* David Cook 
> *Sent:* Thursday, October 19, 2023 4:26 PM
> *To:* 'Koha Devel' 
> *Cc:* 'Jonathan Druart' ; 'Tomas
> Cohen Arazi' 
> *Subject:* RE: [Koha-devel] Background jobs quirks
>
>
>
> Success!
>
>
>
> I created the persistent database connection between the background job
> worker and the database, and then I restarted the database.
>
>
>
> When I tried to enqueue a background job, it’s now stuck in a “New” status
> with a Progress of “null/1”, and /var/log/koha/kohadev/worker-error.log
> says the following:
>
> 20231019 05:02:14 kohadev-koha-worker-long_tasks: client (pid 23247)
> killed by signal 13, respawning
>
>
>
> Now Starman seems to have managed to re-establish a connection without any
> errors. It’s not clear why that might be. I suppose Starman/Plack might be
&

Re: [Koha-devel] Background jobs quirks

2023-10-19 Thread David Cook via Koha-devel
Just one last one..

 

I put a Carp on to the SIGPIPE and sure enough it is the
Koha::BackgroundJobs->find() causing the SIGPIPE to be raised. 

 

Of course, ignoring the SIGPIPE isn't going to help because the job won't be
fetched. 

 

I'm not sure what the answer here is... and unfortunately I've run out of time
today. With my work task, I'll probably abandon using the background jobs
for the moment, as I need to do something more reliable. I'll think about
this one later.

 

at /kohadevbox/koha/misc/background_jobs_worker.pl line 62.
    main::__ANON__("PIPE") called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
    eval {...} called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 932
    DBIx::Class::Storage::DBI::connected(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 850
    DBIx::Class::Storage::DBI::__ANON__(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 190
    DBIx::Class::Storage::BlockRunner::__ANON__() called at /usr/share/perl5/Context/Preserve.pm line 38
    Context::Preserve::preserve_context(CODE(0x55b81fa01620), "replace", CODE(0x55b81fa016c8)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 213
    DBIx::Class::Storage::BlockRunner::_run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/BlockRunner.pm line 105
    DBIx::Class::Storage::BlockRunner::run(DBIx::Class::Storage::BlockRunner=HASH(0x55b81f9d8d28), CODE(0x55b81f9d8ce0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 856
    DBIx::Class::Storage::DBI::dbh_do(undef, undef, "SELECT `me`.`id`, `me`.`status`, `me`.`progress`, `me`.`size`"..., ARRAY(0x55b81f453d30), ARRAY(0x55b81f81d380)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 1939
    DBIx::Class::Storage::DBI::_execute(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), "select", ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f6cc0b8)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2584
    DBIx::Class::Storage::DBI::_select(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/Storage/DBI.pm line 2761
    DBIx::Class::Storage::DBI::select_single(DBIx::Class::Storage::DBI::mysql=HASH(0x55b81d683d78), ARRAY(0x55b81f9d3278), ARRAY(0x55b81d7db5a8), HASH(0x55b81f16bd08), HASH(0x55b81f4ce6c0)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 1101
    DBIx::Class::ResultSet::single(DBIx::Class::ResultSet=HASH(0x55b81d89b180)) called at /usr/share/perl5/DBIx/Class/ResultSet.pm line 910
    DBIx::Class::ResultSet::find(DBIx::Class::ResultSet=HASH(0x55b81f9d6b68), 12) called at /kohadevbox/koha/Koha/Objects.pm line 96
    Koha::Objects::find("Koha::BackgroundJobs", 12) called at /kohadevbox/koha/misc/background_jobs_worker.pl line 132

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: David Cook  
Sent: Thursday, October 19, 2023 4:26 PM
To: 'Koha Devel' 
Cc: 'Jonathan Druart' ; 'Tomas
Cohen Arazi' 
Subject: RE: [Koha-devel] Background jobs quirks

 

Success!

 

I created the persistent database connection between the background job
worker and the database, and then I restarted the database.

 

When I tried to enqueue a background job, it's now stuck in a "New" status
with a Progress of "null/1", and /var/log/koha/kohadev/worker-error.log says
the following:

20231019 05:02:14 kohadev-koha-worker-long_tasks: client (pid 23247) killed
by signal 13, respawning

 

Now Starman seems to have managed to re-establish a connection without any
errors. It's not clear why that might be. I suppose Starman/Plack might be
suppressing SIGPIPE errors. We might have some subtle issues under the hood
without even realizing it. 

 

(That suddenly reminds me of that MYSQL_OPT_RECONNECT issue from August.
MySQL 8.0.34/8.1.0 deprecate automatic reconnection. I would hope that
DBD::mysql would take care of that, but as we might recall, that module isn't
really maintained.)
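
(For what it's worth, DBD::mysql exposes the client-side behaviour as the mysql_auto_reconnect handle attribute; a quick, illustrative check, not anything currently in Koha:)

    use C4::Context;
    # Illustrative check: DBD::mysql exposes client-side auto-reconnect as
    # the mysql_auto_reconnect attribute on the database handle.
    my $dbh = C4::Context->dbh;
    warn 'mysql_auto_reconnect is ',
         ( $dbh->{mysql_auto_reconnect} ? 'on' : 'off' ), "\n";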

 

I am testing on 22.11, so maybe it's a bit different in master... but it's
interesting.

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: David Cook <dc...@prosentient.com.au>
Sent: Thursday, October 19, 2023 3:55 PM
To: 'David Cook' <dc...@prosentient.com.au>; 'Koha Devel' <koha-devel@lists.koha-community.org>
Subject: RE: [Koha-devel] Background jobs quirks

 

On a standard Debian package install of Koha, 

Re: [Koha-devel] Background jobs quirks

2023-10-18 Thread David Cook via Koha-devel
Success!

 

I created the persistent database connection between the background job
worker and the database, and then I restarted the database.

 

When I tried to enqueue a background job, it's now stuck in a "New" status
with a Progress of "null/1", and /var/log/koha/kohadev/worker-error.log says
the following:

20231019 05:02:14 kohadev-koha-worker-long_tasks: client (pid 23247) killed
by signal 13, respawning

 

Now Starman seems to have managed to re-establish a connection without any
errors. It's not clear why that might be. I suppose Starman/Plack might be
suppressing SIGPIPE errors. We might have some subtle issues under the hood
without even realizing it. 

 

(That suddenly reminds me of that MYSQL_OPT_RECONNECT issue from August.
MySQL 8.0.34/8.1.0 deprecate automatic reconnection. I would hope that
DBD::mysql would take care of that, but as we might recall, that module isn't
really maintained.)

 

I am testing on 22.11, so maybe it's a bit different in master... but it's
interesting.

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: David Cook  
Sent: Thursday, October 19, 2023 3:55 PM
To: 'David Cook' ; 'Koha Devel'

Subject: RE: [Koha-devel] Background jobs quirks

 

On a standard Debian package install of Koha, it looks like there are 3
persistent connections to the database: 2x Starman workers and 1x Zebra
Indexer.

 

Then if you use the background jobs, the background job worker for that
queue will create a persistent connection as well. 

 

I had to work out the MySQL/MariaDB ID through trial and error since there
wasn't enough information in the processlist in the database, but I got the
ID, so I'll check the logs in the morning to see if that's the problem. 
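
(One way to avoid the trial and error would be for the worker to log its own connection ID at startup; just a sketch, not something the worker currently does:)

    use C4::Context;
    # Sketch: log the worker's own MariaDB/MySQL connection ID at startup
    # so it can be matched against "Aborted connection NNN" warnings later.
    my $dbh = C4::Context->dbh;
    my ($connection_id) = $dbh->selectrow_array('SELECT CONNECTION_ID()');
    warn "background worker database connection id: $connection_id\n";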

 

--

 

I'm losing some confidence in this theory as I'm looking at the
koha-testing-docker database instance and I'm seeing similar timeouts, but
it hasn't triggered the "killed by signal 13, respawning".

 

2023-10-19  3:06:09 44 [Warning] Aborted connection 44 to db: 'koha_kohadev'
user: 'koha_kohadev' host: '172.21.0.4' (Got an error reading communication
packets)

2023-10-19  3:06:09 32 [Warning] Aborted connection 32 to db: 'koha_kohadev'
user: 'koha_kohadev' host: '172.21.0.4' (Got an error reading communication
packets)

 

--

 

I'm thinking the best thing to do is just ignore SIGPIPE and let our regular
error handling take care of it, as we should then get a more nuanced error
message somewhere.

 

Looking at https://metacpan.org/dist/Net-Stomp/source/lib/Net/Stomp.pm, it
ignores SIGPIPE when it's sending frames... so it's probably not the
$conn->ack() doing it.
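
(The pattern there is roughly the following, paraphrased rather than copied from the Net::Stomp source: localise the handler around the socket write so only that write's SIGPIPE is suppressed.)

    # Paraphrased pattern (not the actual Net::Stomp code): suppress SIGPIPE
    # only for the duration of the socket write, so a dead broker connection
    # shows up as a failed write instead of killing the whole process.
    sub send_frame_sketch {
        my ( $socket, $bytes ) = @_;
        local $SIG{PIPE} = 'IGNORE';
        return print {$socket} $bytes;
    }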

 

I do wonder if it's "my $job = Koha::BackgroundJobs->find($args->{job_id});"

 

I suppose we'll see in the morning.

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: Koha-devel <koha-devel-boun...@lists.koha-community.org> On Behalf Of David
Cook via Koha-devel
Sent: Thursday, October 19, 2023 2:41 PM
To: 'Koha Devel' <koha-devel@lists.koha-community.org>
Subject: [Koha-devel] Background jobs quirks

 

Hi all,

 

Have you seen something like the following in your worker-error.log?

 

20231018 07:44:06 instance-koha-worker-long_tasks: client (pid 3949888)
killed by signal 13, respawning

20231019 07:21:26 instance-koha-worker-long_tasks: client (pid 4082852)
killed by signal 13, respawning

 

I found some discussion in a thread
https://www.mail-archive.com/koha@lists.katipo.co.nz/msg30046.html

 

Signal 13 would be a SIGPIPE, I believe. At first, I assumed it was the
connection between the background worker and RabbitMQ, but based on the
RabbitMQ logs[1] that seemed unlikely.
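
(Signal numbering is platform-dependent, so here is a quick, illustrative way to confirm the mapping from Perl; on Linux, number 13 is PIPE:)

    use Config;
    # Signal names indexed by number, starting at 0 (ZERO); on Linux,
    # entry 13 is PIPE.
    my @sig_names = split ' ', $Config{sig_name};
    print "signal 13 = SIG$sig_names[13]\n";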

 

MariaDB is another possibility. Looking at its logs [2], I see some timed-out
connections.

 

I might monitor the connections between the background workers and the
database tonight and see if it's related. It might also explain why it
happens on my MariaDB system but not my MySQL system.

 

Anyway, just putting out some feelers.

 

[1]

2023-10-18 07:44:06.492400+11:00 [info] <0.30487.131> accepting STOMP
connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)

2023-10-18 07:44:06.500534+11:00 [info] <0.30487.131> closing STOMP
connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)

2023-10-18 07:44:06.510426+11:00 [info] <0.20712.130> closing STOMP
connection <0.20712.130> (127.0.0.1:57508 -> 127.0.0.1:61613)

2023-10-18 07:44:08.703380+11:00 [info] <0.30518.131> accepting STOMP
connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)

 

2023-10-19 07:21:26.254065+11:00 [info] <0.27344.133> accepting STOMP
connection <0.27344.133> (127.0.0

Re: [Koha-devel] Background jobs quirks

2023-10-18 Thread David Cook via Koha-devel
On a standard Debian package install of Koha, it looks like there are 3
persistent connections to the database: 2x Starman workers and 1x Zebra
Indexer.

 

Then if you use the background jobs, the background job worker for that
queue will create a persistent connection as well. 

 

I had to work out the MySQL/MariaDB ID through trial and error since there
wasn't enough information in the processlist in the database, but I got the
ID, so I'll check the logs in the morning to see if that's the problem. 

 

--

 

I'm losing some confidence in this theory as I'm looking at the
koha-testing-docker database instance and I'm seeing similar timeouts, but
it hasn't triggered the "killed by signal 13, respawning".

 

2023-10-19  3:06:09 44 [Warning] Aborted connection 44 to db: 'koha_kohadev'
user: 'koha_kohadev' host: '172.21.0.4' (Got an error reading communication
packets)

2023-10-19  3:06:09 32 [Warning] Aborted connection 32 to db: 'koha_kohadev'
user: 'koha_kohadev' host: '172.21.0.4' (Got an error reading communication
packets)

 

--

 

I'm thinking the best thing to do is just ignore SIGPIPE and let our regular
error handling take care of it, as we should then get a more nuanced error
message somewhere.

 

Looking at https://metacpan.org/dist/Net-Stomp/source/lib/Net/Stomp.pm, it
ignores SIGPIPE when it's sending frames... so it's probably not the
$conn->ack() doing it.

 

I do wonder if it's "my $job = Koha::BackgroundJobs->find($args->{job_id});"

 

I suppose we'll see in the morning.

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: Koha-devel  On Behalf Of
David Cook via Koha-devel
Sent: Thursday, October 19, 2023 2:41 PM
To: 'Koha Devel' 
Subject: [Koha-devel] Background jobs quirks

 

Hi all,

 

Have you seen something like the following in your worker-error.log?

 

20231018 07:44:06 instance-koha-worker-long_tasks: client (pid 3949888)
killed by signal 13, respawning

20231019 07:21:26 instance-koha-worker-long_tasks: client (pid 4082852)
killed by signal 13, respawning

 

I found some discussion in a thread
https://www.mail-archive.com/koha@lists.katipo.co.nz/msg30046.html

 

Signal 13 would be a SIGPIPE, I believe. At first, I assumed it was the
connection between the background worker and RabbitMQ, but based on the
RabbitMQ logs[1] that seemed unlikely.

 

MariaDB is another possibility. Looking at its logs [2], I see some timed-out
connections.

 

I might monitor the connections between the background workers and the
database tonight and see if it's related. It might also explain why it
happens on my MariaDB system but not my MySQL system.

 

Anyway, just putting out some feelers.

 

[1]

2023-10-18 07:44:06.492400+11:00 [info] <0.30487.131> accepting STOMP
connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)

2023-10-18 07:44:06.500534+11:00 [info] <0.30487.131> closing STOMP
connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)

2023-10-18 07:44:06.510426+11:00 [info] <0.20712.130> closing STOMP
connection <0.20712.130> (127.0.0.1:57508 -> 127.0.0.1:61613)

2023-10-18 07:44:08.703380+11:00 [info] <0.30518.131> accepting STOMP
connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)

 

2023-10-19 07:21:26.254065+11:00 [info] <0.27344.133> accepting STOMP
connection <0.27344.133> (127.0.0.1:54116 -> 127.0.0.1:61613)

2023-10-19 07:21:26.259129+11:00 [info] <0.27344.133> closing STOMP
connection <0.27344.133> (127.0.0.1:54116 -> 127.0.0.1:61613)

2023-10-19 07:21:26.269652+11:00 [info] <0.30518.131> closing STOMP
connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)

2023-10-19 07:21:26.660604+11:00 [info] <0.27372.133> accepting STOMP
connection <0.27372.133> (127.0.0.1:46178 -> 127.0.0.1:61613)

 

[2]

Oct 18 02:08:28 awesome-host mariadbd[959]: 2023-10-18  2:08:28 6308
[Warning] Aborted connection 6308 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 18 02:15:58 awesome-host mariadbd[959]: 2023-10-18  2:15:58 6304
[Warning] Aborted connection 6304 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 18 02:16:06 awesome-host mariadbd[959]: 2023-10-18  2:16:06 6303
[Warning] Aborted connection 6303 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 18 21:10:08 awesome-host mariadbd[959]: 2023-10-18 21:10:08 6380
[Warning] Aborted connection 6380 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 19 02:21:17 awesome-host mariadbd[959]: 2023-10-19  2:21:17 6431
[Warning] Aborted connection 6431 to db: 'koha_instance' user:
'koha_i

[Koha-devel] Background jobs quirks

2023-10-18 Thread David Cook via Koha-devel
Hi all,

 

Have you seen something like the following in your worker-error.log?

 

20231018 07:44:06 instance-koha-worker-long_tasks: client (pid 3949888)
killed by signal 13, respawning

20231019 07:21:26 instance-koha-worker-long_tasks: client (pid 4082852)
killed by signal 13, respawning

 

I found some discussion in a thread
https://www.mail-archive.com/koha@lists.katipo.co.nz/msg30046.html

 

Signal 13 would be a SIGPIPE, I believe. At first, I assumed it was the
connection between the background worker and RabbitMQ, but based on the
RabbitMQ logs[1] that seemed unlikely.

 

MariaDB is another possibility. Looking at its logs [2], I see some timed-out
connections.

 

I might monitor the connections between the background workers and the
database tonight and see if it's related. It might also explain why it
happens on my MariaDB system but not my MySQL system.

 

Anyway, just putting out some feelers.

 

[1]

2023-10-18 07:44:06.492400+11:00 [info] <0.30487.131> accepting STOMP
connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)

2023-10-18 07:44:06.500534+11:00 [info] <0.30487.131> closing STOMP
connection <0.30487.131> (127.0.0.1:59120 -> 127.0.0.1:61613)

2023-10-18 07:44:06.510426+11:00 [info] <0.20712.130> closing STOMP
connection <0.20712.130> (127.0.0.1:57508 -> 127.0.0.1:61613)

2023-10-18 07:44:08.703380+11:00 [info] <0.30518.131> accepting STOMP
connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)

 

2023-10-19 07:21:26.254065+11:00 [info] <0.27344.133> accepting STOMP
connection <0.27344.133> (127.0.0.1:54116 -> 127.0.0.1:61613)

2023-10-19 07:21:26.259129+11:00 [info] <0.27344.133> closing STOMP
connection <0.27344.133> (127.0.0.1:54116 -> 127.0.0.1:61613)

2023-10-19 07:21:26.269652+11:00 [info] <0.30518.131> closing STOMP
connection <0.30518.131> (127.0.0.1:59882 -> 127.0.0.1:61613)

2023-10-19 07:21:26.660604+11:00 [info] <0.27372.133> accepting STOMP
connection <0.27372.133> (127.0.0.1:46178 -> 127.0.0.1:61613)

 

[2]

Oct 18 02:08:28 awesome-host mariadbd[959]: 2023-10-18  2:08:28 6308
[Warning] Aborted connection 6308 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 18 02:15:58 awesome-host mariadbd[959]: 2023-10-18  2:15:58 6304
[Warning] Aborted connection 6304 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 18 02:16:06 awesome-host mariadbd[959]: 2023-10-18  2:16:06 6303
[Warning] Aborted connection 6303 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 18 21:10:08 awesome-host mariadbd[959]: 2023-10-18 21:10:08 6380
[Warning] Aborted connection 6380 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 19 02:21:17 awesome-host mariadbd[959]: 2023-10-19  2:21:17 6431
[Warning] Aborted connection 6431 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

Oct 19 02:24:02 awesome-host mariadbd[959]: 2023-10-19  2:24:02 6432
[Warning] Aborted connection 6432 to db: 'koha_instance' user:
'koha_instance' host: 'localhost' (Got timeout reading communication
packets)

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

___
Koha-devel mailing list
Koha-devel@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : https://www.koha-community.org/
git : https://git.koha-community.org/
bugs : https://bugs.koha-community.org/